US20190300963A1 - miRNA expression signature in the classification of thyroid tumors
- Publication number
- US20190300963A1 (application US16/192,221)
- Authority
- US
- United States
- Prior art keywords
- hsa
- mir
- seq
- classifier
- thyroid
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Abandoned
- 230000014509 gene expression Effects 0.000 title claims abstract description 264
- 108091070501 miRNA Proteins 0.000 title claims description 65
- 208000024770 Thyroid neoplasm Diseases 0.000 title abstract description 58
- 208000013076 thyroid tumor Diseases 0.000 title abstract description 33
- 108700011259 MicroRNAs Proteins 0.000 claims abstract description 434
- 239000002679 microRNA Substances 0.000 claims abstract description 291
- 238000000034 method Methods 0.000 claims abstract description 116
- 238000004458 analytical method Methods 0.000 claims abstract description 107
- 238000011282 treatment Methods 0.000 claims abstract description 29
- 230000003211 malignant effect Effects 0.000 claims description 176
- 108091092238 Homo sapiens miR-146b stem-loop Proteins 0.000 claims description 126
- 108091067008 Homo sapiens miR-342 stem-loop Proteins 0.000 claims description 119
- 210000001685 thyroid gland Anatomy 0.000 claims description 113
- 108091069006 Homo sapiens miR-125b-1 stem-loop Proteins 0.000 claims description 108
- 108091069087 Homo sapiens miR-125b-2 stem-loop Proteins 0.000 claims description 108
- 125000003729 nucleotide group Chemical group 0.000 claims description 98
- 108091070395 Homo sapiens miR-31 stem-loop Proteins 0.000 claims description 87
- 239000002773 nucleotide Substances 0.000 claims description 79
- 108091067573 Homo sapiens miR-222 stem-loop Proteins 0.000 claims description 77
- 108091063728 Homo sapiens miR-551b stem-loop Proteins 0.000 claims description 67
- 230000003902 lesion Effects 0.000 claims description 65
- 108091069092 Homo sapiens miR-138-1 stem-loop Proteins 0.000 claims description 62
- 108091069015 Homo sapiens miR-138-2 stem-loop Proteins 0.000 claims description 61
- 108091059229 Homo sapiens miR-486-2 stem-loop Proteins 0.000 claims description 54
- 238000004422 calculation algorithm Methods 0.000 claims description 54
- 108091053840 Homo sapiens miR-486 stem-loop Proteins 0.000 claims description 53
- 108091032973 (ribonucleotides)n+m Proteins 0.000 claims description 52
- 108091067535 Homo sapiens miR-375 stem-loop Proteins 0.000 claims description 45
- 238000003753 real-time PCR Methods 0.000 claims description 28
- 108091068997 Homo sapiens miR-152 stem-loop Proteins 0.000 claims description 24
- 230000002441 reversible effect Effects 0.000 claims description 21
- 201000009030 Carcinoma Diseases 0.000 claims description 17
- 238000001356 surgical procedure Methods 0.000 claims description 16
- 108091063808 Homo sapiens miR-574 stem-loop Proteins 0.000 claims description 11
- 238000001959 radiotherapy Methods 0.000 claims description 9
- 238000002512 chemotherapy Methods 0.000 claims description 6
- 238000010801 machine learning Methods 0.000 claims description 6
- 230000002068 genetic effect Effects 0.000 claims description 5
- 238000001794 hormone therapy Methods 0.000 claims description 4
- 108091070492 Homo sapiens miR-23a stem-loop Proteins 0.000 claims description 2
- 238000003752 polymerase chain reaction Methods 0.000 claims 2
- 238000003745 diagnosis Methods 0.000 abstract description 68
- 238000002560 therapeutic procedure Methods 0.000 abstract description 5
- 238000005457 optimization Methods 0.000 abstract 1
- 239000000523 sample Substances 0.000 description 222
- 150000007523 nucleic acids Chemical class 0.000 description 166
- 230000035945 sensitivity Effects 0.000 description 140
- 102000039446 nucleic acids Human genes 0.000 description 132
- 108020004707 nucleic acids Proteins 0.000 description 132
- 210000004027 cell Anatomy 0.000 description 86
- 108091066023 Homo sapiens miR-200c stem-loop Proteins 0.000 description 85
- 206010028980 Neoplasm Diseases 0.000 description 75
- 201000011510 cancer Diseases 0.000 description 48
- 239000011159 matrix material Substances 0.000 description 42
- 230000000295 complement effect Effects 0.000 description 41
- 208000003200 Adenoma Diseases 0.000 description 40
- 230000003325 follicular Effects 0.000 description 40
- 201000010198 papillary carcinoma Diseases 0.000 description 38
- 201000004260 follicular adenoma Diseases 0.000 description 36
- 208000030878 follicular thyroid adenoma Diseases 0.000 description 36
- 210000001519 tissue Anatomy 0.000 description 34
- 238000003556 assay Methods 0.000 description 32
- 238000012549 training Methods 0.000 description 32
- 201000002510 thyroid cancer Diseases 0.000 description 31
- 208000030901 thyroid gland follicular carcinoma Diseases 0.000 description 27
- 238000010200 validation analysis Methods 0.000 description 27
- 108091067634 Homo sapiens miR-181c stem-loop Proteins 0.000 description 25
- 210000004369 blood Anatomy 0.000 description 25
- 239000008280 blood Substances 0.000 description 25
- microRNAs Chemical class 0.000 description 24
- 238000012360 testing method Methods 0.000 description 24
- LFQSCWFLJHTTHZ-UHFFFAOYSA-N Ethanol Chemical compound CCO LFQSCWFLJHTTHZ-UHFFFAOYSA-N 0.000 description 23
- 238000009396 hybridization Methods 0.000 description 23
- 239000003550 marker Substances 0.000 description 23
- 239000000203 mixture Substances 0.000 description 23
- 238000001574 biopsy Methods 0.000 description 21
- 239000010432 diamond Substances 0.000 description 21
- 108091028043 Nucleic acid sequence Proteins 0.000 description 20
- 239000013598 vector Substances 0.000 description 20
- 208000009453 Thyroid Nodule Diseases 0.000 description 19
- 210000005131 Hürthle cell Anatomy 0.000 description 16
- 230000003321 amplification Effects 0.000 description 16
- 239000000463 material Substances 0.000 description 16
- 208000023356 medullary thyroid gland carcinoma Diseases 0.000 description 16
- 238000003199 nucleic acid amplification method Methods 0.000 description 16
- 239000007787 solid Substances 0.000 description 16
- 238000001514 detection method Methods 0.000 description 15
- 239000002771 cell marker Substances 0.000 description 14
- 238000002493 microarray Methods 0.000 description 14
- 238000012706 support-vector machine Methods 0.000 description 14
- 108091066970 Homo sapiens miR-346 stem-loop Proteins 0.000 description 13
- 238000006243 chemical reaction Methods 0.000 description 13
- 208000007054 Medullary Carcinoma Diseases 0.000 description 12
- ISAKRJDGNUQOIC-UHFFFAOYSA-N Uracil Chemical compound O=C1C=CNC(=O)N1 ISAKRJDGNUQOIC-UHFFFAOYSA-N 0.000 description 12
- 208000016842 benign thyroid gland neoplasm Diseases 0.000 description 12
- 239000000975 dye Substances 0.000 description 12
- 238000012986 modification Methods 0.000 description 12
- 230000004048 modification Effects 0.000 description 12
- 238000010839 reverse transcription Methods 0.000 description 12
- RWQNBRDOKXIBIV-UHFFFAOYSA-N thymine Chemical compound CC1=CNC(=O)NC1=O RWQNBRDOKXIBIV-UHFFFAOYSA-N 0.000 description 12
- 238000000018 DNA microarray Methods 0.000 description 11
- 108091032108 Homo sapiens miR-424 stem-loop Proteins 0.000 description 11
- 210000002919 epithelial cell Anatomy 0.000 description 11
- 230000007170 pathology Effects 0.000 description 11
- 239000008194 pharmaceutical composition Substances 0.000 description 11
- 239000000047 product Substances 0.000 description 11
- 239000000758 substrate Substances 0.000 description 11
- 206010054107 Nodule Diseases 0.000 description 10
- 206010033701 Papillary thyroid cancer Diseases 0.000 description 10
- 230000027455 binding Effects 0.000 description 10
- 238000009739 binding Methods 0.000 description 10
- 239000002299 complementary DNA Substances 0.000 description 10
- 238000007481 next generation sequencing Methods 0.000 description 10
- 238000010606 normalization Methods 0.000 description 10
- 230000002380 cytological effect Effects 0.000 description 9
- 239000003814 drug Substances 0.000 description 9
- 210000000265 leukocyte Anatomy 0.000 description 9
- 108090000623 proteins and genes Proteins 0.000 description 9
- 238000012163 sequencing technique Methods 0.000 description 9
- 208000030045 thyroid gland papillary carcinoma Diseases 0.000 description 9
- HEDRZPFGACZZDS-UHFFFAOYSA-N Chloroform Chemical compound ClC(Cl)Cl HEDRZPFGACZZDS-UHFFFAOYSA-N 0.000 description 8
- 208000001204 Hashimoto Disease Diseases 0.000 description 8
- 208000030836 Hashimoto thyroiditis Diseases 0.000 description 8
- 108091067572 Homo sapiens miR-221 stem-loop Proteins 0.000 description 8
- ISWSIDIOOBJBQZ-UHFFFAOYSA-N Phenol Chemical compound OC1=CC=CC=C1 ISWSIDIOOBJBQZ-UHFFFAOYSA-N 0.000 description 8
- 238000000692 Student's t-test Methods 0.000 description 8
- 239000011230 binding agent Substances 0.000 description 8
- 238000002372 labelling Methods 0.000 description 8
- 238000007477 logistic regression Methods 0.000 description 8
- 102000004169 proteins and genes Human genes 0.000 description 8
- 238000003908 quality control method Methods 0.000 description 8
- 235000000346 sugar Nutrition 0.000 description 8
- 239000005495 thyroid hormone Substances 0.000 description 8
- 229940036555 thyroid hormone Drugs 0.000 description 8
- 108091032955 Bacterial small RNA Proteins 0.000 description 7
- 108091070493 Homo sapiens miR-21 stem-loop Proteins 0.000 description 7
- 108091008065 MIR21 Proteins 0.000 description 7
- AUYYCJSJGJYCDS-LBPRGKRZSA-N Thyrolar Chemical class IC1=CC(C[C@H](N)C(O)=O)=CC(I)=C1OC1=CC=C(O)C(I)=C1 AUYYCJSJGJYCDS-LBPRGKRZSA-N 0.000 description 7
- 239000012472 biological sample Substances 0.000 description 7
- 230000008859 change Effects 0.000 description 7
- 230000001419 dependent effect Effects 0.000 description 7
- 238000003786 synthesis reaction Methods 0.000 description 7
- CSCPPACGZOOCGX-UHFFFAOYSA-N Acetone Chemical compound CC(C)=O CSCPPACGZOOCGX-UHFFFAOYSA-N 0.000 description 6
- IAZDPXIOMUYVGZ-UHFFFAOYSA-N Dimethylsulphoxide Chemical compound CS(C)=O IAZDPXIOMUYVGZ-UHFFFAOYSA-N 0.000 description 6
- 108091069088 Homo sapiens miR-150 stem-loop Proteins 0.000 description 6
- 108091070489 Homo sapiens miR-17 stem-loop Proteins 0.000 description 6
- 108091065166 Homo sapiens miR-200a stem-loop Proteins 0.000 description 6
- 108091069457 Homo sapiens miR-200b stem-loop Proteins 0.000 description 6
- 108091072924 Homo sapiens miR-3074 stem-loop Proteins 0.000 description 6
- 108091066987 Homo sapiens miR-345 stem-loop Proteins 0.000 description 6
- 108091088903 Homo sapiens miR-5701-1 stem-loop Proteins 0.000 description 6
- 108091089437 Homo sapiens miR-5701-2 stem-loop Proteins 0.000 description 6
- OKKJLVBELUTLKV-UHFFFAOYSA-N Methanol Chemical compound OC OKKJLVBELUTLKV-UHFFFAOYSA-N 0.000 description 6
- 238000012179 MicroRNA sequencing Methods 0.000 description 6
- 239000004743 Polypropylene Substances 0.000 description 6
- 238000011948 assay development Methods 0.000 description 6
- 210000000601 blood cell Anatomy 0.000 description 6
- 239000000872 buffer Substances 0.000 description 6
- 208000035269 cancer or benign tumor Diseases 0.000 description 6
- 230000002596 correlated effect Effects 0.000 description 6
- 230000000875 corresponding effect Effects 0.000 description 6
- OPTASPLRGRRNAP-UHFFFAOYSA-N cytosine Chemical compound NC=1C=CNC(=O)N=1 OPTASPLRGRRNAP-UHFFFAOYSA-N 0.000 description 6
- 208000015799 differentiated thyroid carcinoma Diseases 0.000 description 6
- UYTPUPDQBNUYGX-UHFFFAOYSA-N guanine Chemical compound O=C1NC(N)=NC2=C1N=CN2 UYTPUPDQBNUYGX-UHFFFAOYSA-N 0.000 description 6
- FDGQSTZJBFJUBT-UHFFFAOYSA-N hypoxanthine Chemical compound O=C1NC=NC2=C1NC=N2 FDGQSTZJBFJUBT-UHFFFAOYSA-N 0.000 description 6
- 238000002360 preparation method Methods 0.000 description 6
- 230000001105 regulatory effect Effects 0.000 description 6
- 239000000243 solution Substances 0.000 description 6
- 229940035893 uracil Drugs 0.000 description 6
- MWWSFMDVAYGXBV-RUELKSSGSA-N Doxorubicin hydrochloride Chemical compound Cl.O([C@H]1C[C@@](O)(CC=2C(O)=C3C(=O)C=4C=CC=C(C=4C(=O)C3=C(O)C=21)OC)C(=O)CO)[C@H]1C[C@H](N)[C@H](O)[C@H](C)O1 MWWSFMDVAYGXBV-RUELKSSGSA-N 0.000 description 5
- 108010067770 Endopeptidase K Proteins 0.000 description 5
- 102000004190 Enzymes Human genes 0.000 description 5
- 108090000790 Enzymes Proteins 0.000 description 5
- 108091067692 Homo sapiens miR-199a-1 stem-loop Proteins 0.000 description 5
- 108091067467 Homo sapiens miR-199a-2 stem-loop Proteins 0.000 description 5
- 108091045523 Homo sapiens miR-5701-3 stem-loop Proteins 0.000 description 5
- FAPWRFPIFSIZLT-UHFFFAOYSA-M Sodium chloride Chemical compound [Na+].[Cl-] FAPWRFPIFSIZLT-UHFFFAOYSA-M 0.000 description 5
- 230000004663 cell proliferation Effects 0.000 description 5
- 239000003153 chemical reaction reagent Substances 0.000 description 5
- 238000009472 formulation Methods 0.000 description 5
- 238000012165 high-throughput sequencing Methods 0.000 description 5
- 206010020718 hyperplasia Diseases 0.000 description 5
- 238000010348 incorporation Methods 0.000 description 5
- 210000000056 organ Anatomy 0.000 description 5
- 239000012188 paraffin wax Substances 0.000 description 5
- 230000008569 process Effects 0.000 description 5
- 238000012545 processing Methods 0.000 description 5
- 125000002652 ribonucleotide group Chemical group 0.000 description 5
- 238000007619 statistical method Methods 0.000 description 5
- 239000000126 substance Substances 0.000 description 5
- 229940113082 thymine Drugs 0.000 description 5
- XLYOFNOQVPJJNP-UHFFFAOYSA-N water Substances O XLYOFNOQVPJJNP-UHFFFAOYSA-N 0.000 description 5
- YBJHBAHKTGYVGT-ZKWXMUAHSA-N (+)-Biotin Chemical compound N1C(=O)N[C@@H]2[C@H](CCCCC(=O)O)SC[C@@H]21 YBJHBAHKTGYVGT-ZKWXMUAHSA-N 0.000 description 4
- KDCGOANMDULRCW-UHFFFAOYSA-N 7H-purine Chemical compound N1=CNC2=NC=NC2=C1 KDCGOANMDULRCW-UHFFFAOYSA-N 0.000 description 4
- LRFVTYWOQMYALW-UHFFFAOYSA-N 9H-xanthine Chemical compound O=C1NC(=O)NC2=C1NC=N2 LRFVTYWOQMYALW-UHFFFAOYSA-N 0.000 description 4
- GFFGJBXGBJISGV-UHFFFAOYSA-N Adenine Chemical compound NC1=NC=NC2=C1N=CN2 GFFGJBXGBJISGV-UHFFFAOYSA-N 0.000 description 4
- WSFSSNUMVMOOMR-UHFFFAOYSA-N Formaldehyde Chemical compound O=C WSFSSNUMVMOOMR-UHFFFAOYSA-N 0.000 description 4
- ZHNUHDYFZUAESO-UHFFFAOYSA-N Formamide Chemical compound NC=O ZHNUHDYFZUAESO-UHFFFAOYSA-N 0.000 description 4
- 108091068838 Homo sapiens miR-103a-2 stem-loop Proteins 0.000 description 4
- 108091069004 Homo sapiens miR-125a stem-loop Proteins 0.000 description 4
- 108091069017 Homo sapiens miR-140 stem-loop Proteins 0.000 description 4
- 108091069089 Homo sapiens miR-146a stem-loop Proteins 0.000 description 4
- 108091070491 Homo sapiens miR-16-1 stem-loop Proteins 0.000 description 4
- 108091070490 Homo sapiens miR-18a stem-loop Proteins 0.000 description 4
- 108091067982 Homo sapiens miR-197 stem-loop Proteins 0.000 description 4
- 239000002118 L01XE12 - Vandetanib Substances 0.000 description 4
- BACYUWVYYTXETD-UHFFFAOYSA-N N-Lauroylsarcosine Chemical compound CCCCCCCCCCCC(=O)N(C)CC(O)=O BACYUWVYYTXETD-UHFFFAOYSA-N 0.000 description 4
- CTQNGGLPUBDAKN-UHFFFAOYSA-N O-Xylene Chemical compound CC1=CC=CC=C1C CTQNGGLPUBDAKN-UHFFFAOYSA-N 0.000 description 4
- 238000011529 RT qPCR Methods 0.000 description 4
- 108091028664 Ribonucleotide Proteins 0.000 description 4
- VYPSYNLAJGMNEJ-UHFFFAOYSA-N Silicium dioxide Chemical compound O=[Si]=O VYPSYNLAJGMNEJ-UHFFFAOYSA-N 0.000 description 4
- 108010090804 Streptavidin Proteins 0.000 description 4
- 208000037065 Subacute sclerosing leukoencephalitis Diseases 0.000 description 4
- 206010042297 Subacute sclerosing panencephalitis Diseases 0.000 description 4
- 102000011923 Thyrotropin Human genes 0.000 description 4
- 108010061174 Thyrotropin Proteins 0.000 description 4
- 229920004892 Triton X-102 Polymers 0.000 description 4
- 239000002253 acid Substances 0.000 description 4
- 238000005119 centrifugation Methods 0.000 description 4
- 150000001875 compounds Chemical class 0.000 description 4
- 238000002790 cross-validation Methods 0.000 description 4
- 238000012350 deep sequencing Methods 0.000 description 4
- 238000011161 development Methods 0.000 description 4
- 238000002405 diagnostic procedure Methods 0.000 description 4
- 239000003085 diluting agent Substances 0.000 description 4
- 238000009826 distribution Methods 0.000 description 4
- 238000005516 engineering process Methods 0.000 description 4
- 238000011156 evaluation Methods 0.000 description 4
- 239000012634 fragment Substances 0.000 description 4
- 125000000524 functional group Chemical group 0.000 description 4
- 229940088597 hormone Drugs 0.000 description 4
- 239000005556 hormone Substances 0.000 description 4
- DRAVOWXCEBXPTN-UHFFFAOYSA-N isoguanine Chemical compound NC1=NC(=O)NC2=C1NC=N2 DRAVOWXCEBXPTN-UHFFFAOYSA-N 0.000 description 4
- 230000000670 limiting effect Effects 0.000 description 4
- 239000007788 liquid Substances 0.000 description 4
- 230000036210 malignancy Effects 0.000 description 4
- 239000000546 pharmaceutical excipient Substances 0.000 description 4
- 238000004393 prognosis Methods 0.000 description 4
- 230000035755 proliferation Effects 0.000 description 4
- 239000002336 ribonucleotide Substances 0.000 description 4
- 108700004121 sarkosyl Proteins 0.000 description 4
- 238000011269 treatment regimen Methods 0.000 description 4
- UHTHHESEBZOYNR-UHFFFAOYSA-N vandetanib Chemical compound COC1=CC(C(/N=CN2)=N/C=3C(=CC(Br)=CC=3)F)=C2C=C1OCC1CCN(C)CC1 UHTHHESEBZOYNR-UHFFFAOYSA-N 0.000 description 4
- 239000008096 xylene Substances 0.000 description 4
- 102000040650 (ribonucleotides)n+m Human genes 0.000 description 3
- 229930024421 Adenine Natural products 0.000 description 3
- 208000001446 Anaplastic Thyroid Carcinoma Diseases 0.000 description 3
- 206010002240 Anaplastic thyroid cancer Diseases 0.000 description 3
- UHOVQNZJYSORNB-UHFFFAOYSA-N Benzene Chemical compound C1=CC=CC=C1 UHOVQNZJYSORNB-UHFFFAOYSA-N 0.000 description 3
- 238000000729 Fisher's exact test Methods 0.000 description 3
- 108091070521 Homo sapiens let-7a-1 stem-loop Proteins 0.000 description 3
- 108091070522 Homo sapiens let-7a-2 stem-loop Proteins 0.000 description 3
- 108091070513 Homo sapiens let-7a-3 stem-loop Proteins 0.000 description 3
- 108091070514 Homo sapiens let-7b stem-loop Proteins 0.000 description 3
- 108091070511 Homo sapiens let-7c stem-loop Proteins 0.000 description 3
- 108091070512 Homo sapiens let-7d stem-loop Proteins 0.000 description 3
- 108091070510 Homo sapiens let-7f-1 stem-loop Proteins 0.000 description 3
- 108091070526 Homo sapiens let-7f-2 stem-loop Proteins 0.000 description 3
- 108091069046 Homo sapiens let-7g stem-loop Proteins 0.000 description 3
- 108091069047 Homo sapiens let-7i stem-loop Proteins 0.000 description 3
- 108091068855 Homo sapiens miR-103a-1 stem-loop Proteins 0.000 description 3
- 108091068941 Homo sapiens miR-106a stem-loop Proteins 0.000 description 3
- 108091065165 Homo sapiens miR-106b stem-loop Proteins 0.000 description 3
- 108091068928 Homo sapiens miR-107 stem-loop Proteins 0.000 description 3
- 108091068993 Homo sapiens miR-142 stem-loop Proteins 0.000 description 3
- 108091068999 Homo sapiens miR-144 stem-loop Proteins 0.000 description 3
- 108091067009 Homo sapiens miR-148b stem-loop Proteins 0.000 description 3
- 108091067014 Homo sapiens miR-151a stem-loop Proteins 0.000 description 3
- 108091069045 Homo sapiens miR-15b stem-loop Proteins 0.000 description 3
- 108091068927 Homo sapiens miR-16-2 stem-loop Proteins 0.000 description 3
- 108091067469 Homo sapiens miR-181a-1 stem-loop Proteins 0.000 description 3
- 108091067618 Homo sapiens miR-181a-2 stem-loop Proteins 0.000 description 3
- 108091068954 Homo sapiens miR-185 stem-loop Proteins 0.000 description 3
- 108091068998 Homo sapiens miR-191 stem-loop Proteins 0.000 description 3
- 108091069034 Homo sapiens miR-193a stem-loop Proteins 0.000 description 3
- 108091070519 Homo sapiens miR-19b-1 stem-loop Proteins 0.000 description 3
- 108091070495 Homo sapiens miR-19b-2 stem-loop Proteins 0.000 description 3
- 108091070496 Homo sapiens miR-20a stem-loop Proteins 0.000 description 3
- 108091070371 Homo sapiens miR-25 stem-loop Proteins 0.000 description 3
- 108091070372 Homo sapiens miR-26a-1 stem-loop Proteins 0.000 description 3
- 108091065428 Homo sapiens miR-26a-2 stem-loop Proteins 0.000 description 3
- 108091070399 Homo sapiens miR-26b stem-loop Proteins 0.000 description 3
- 108091069021 Homo sapiens miR-30b stem-loop Proteins 0.000 description 3
- 108091065163 Homo sapiens miR-30c-1 stem-loop Proteins 0.000 description 3
- 108091067641 Homo sapiens miR-30c-2 stem-loop Proteins 0.000 description 3
- 108091067650 Homo sapiens miR-30d stem-loop Proteins 0.000 description 3
- 108091067619 Homo sapiens miR-34a stem-loop Proteins 0.000 description 3
- 108091032103 Homo sapiens miR-425 stem-loop Proteins 0.000 description 3
- 108091035183 Homo sapiens miR-4306 stem-loop Proteins 0.000 description 3
- 108091053854 Homo sapiens miR-484 stem-loop Proteins 0.000 description 3
- 108091070380 Homo sapiens miR-92a-1 stem-loop Proteins 0.000 description 3
- 108091070381 Homo sapiens miR-92a-2 stem-loop Proteins 0.000 description 3
- 108091070377 Homo sapiens miR-93 stem-loop Proteins 0.000 description 3
- UGQMRVRMYYASKQ-UHFFFAOYSA-N Hypoxanthine nucleoside Natural products OC1C(O)C(CO)OC1N1C(NC=NC2=O)=C2N=C1 UGQMRVRMYYASKQ-UHFFFAOYSA-N 0.000 description 3
- KFZMGEQAYNKOFK-UHFFFAOYSA-N Isopropanol Chemical compound CC(C)O KFZMGEQAYNKOFK-UHFFFAOYSA-N 0.000 description 3
- 208000037196 Medullary thyroid carcinoma Diseases 0.000 description 3
- 238000000636 Northern blotting Methods 0.000 description 3
- 108091034117 Oligonucleotide Proteins 0.000 description 3
- 108091093037 Peptide nucleic acid Proteins 0.000 description 3
- 101710086015 RNA ligase Proteins 0.000 description 3
- 238000003559 RNA-seq method Methods 0.000 description 3
- 230000001594 aberrant effect Effects 0.000 description 3
- 239000013543 active substance Substances 0.000 description 3
- 229960000643 adenine Drugs 0.000 description 3
- HFCFMRYTXDINDK-WNQIDUERSA-N cabozantinib malate Chemical compound OC(=O)[C@@H](O)CC(O)=O.C=12C=C(OC)C(OC)=CC2=NC=CC=1OC(C=C1)=CC=C1NC(=O)C1(C(=O)NC=2C=CC(F)=CC=2)CC1 HFCFMRYTXDINDK-WNQIDUERSA-N 0.000 description 3
- 238000004364 calculation method Methods 0.000 description 3
- 230000015556 catabolic process Effects 0.000 description 3
- 239000003795 chemical substances by application Substances 0.000 description 3
- 229940104302 cytosine Drugs 0.000 description 3
- 230000003247 decreasing effect Effects 0.000 description 3
- 208000037265 diseases, disorders, signs and symptoms Diseases 0.000 description 3
- 229960002918 doxorubicin hydrochloride Drugs 0.000 description 3
- 229940079593 drug Drugs 0.000 description 3
- 230000000694 effects Effects 0.000 description 3
- 238000000605 extraction Methods 0.000 description 3
- 230000006870 function Effects 0.000 description 3
- 239000011521 glass Substances 0.000 description 3
- PCHJSUWPFVWCPO-UHFFFAOYSA-N gold Chemical compound [Au] PCHJSUWPFVWCPO-UHFFFAOYSA-N 0.000 description 3
- 230000012010 growth Effects 0.000 description 3
- 230000036541 health Effects 0.000 description 3
- 238000007901 in situ hybridization Methods 0.000 description 3
- 229910052740 iodine Inorganic materials 0.000 description 3
- 210000001165 lymph node Anatomy 0.000 description 3
- 238000005259 measurement Methods 0.000 description 3
- 206010061289 metastatic neoplasm Diseases 0.000 description 3
- 125000002467 phosphate group Chemical group [H]OP(=O)(O[H])O[*] 0.000 description 3
- 239000004033 plastic Substances 0.000 description 3
- 229920003023 plastic Polymers 0.000 description 3
- 239000003755 preservative agent Substances 0.000 description 3
- UBQKCCHYAOITMY-UHFFFAOYSA-N pyridin-2-ol Chemical compound OC1=CC=CC=N1 UBQKCCHYAOITMY-UHFFFAOYSA-N 0.000 description 3
- 239000011780 sodium chloride Substances 0.000 description 3
- IVDHYUQIDRJSTI-UHFFFAOYSA-N sorafenib tosylate Chemical compound [H+].CC1=CC=C(S([O-])(=O)=O)C=C1.C1=NC(C(=O)NC)=CC(OC=2C=CC(NC(=O)NC=3C=C(C(Cl)=CC=3)C(F)(F)F)=CC=2)=C1 IVDHYUQIDRJSTI-UHFFFAOYSA-N 0.000 description 3
- 238000012353 t test Methods 0.000 description 3
- 238000002626 targeted therapy Methods 0.000 description 3
- 230000001225 therapeutic effect Effects 0.000 description 3
- 208000013818 thyroid gland medullary carcinoma Diseases 0.000 description 3
- 208000019179 thyroid gland undifferentiated (anaplastic) carcinoma Diseases 0.000 description 3
- 229940121358 tyrosine kinase inhibitor Drugs 0.000 description 3
- 239000005483 tyrosine kinase inhibitor Substances 0.000 description 3
- 150000004917 tyrosine kinase inhibitor derivatives Chemical class 0.000 description 3
- 229960000241 vandetanib Drugs 0.000 description 3
- 239000011534 wash buffer Substances 0.000 description 3
- XQCZBXHVTFVIFE-UHFFFAOYSA-N 2-amino-4-hydroxypyrimidine Chemical compound NC1=NC=CC(O)=N1 XQCZBXHVTFVIFE-UHFFFAOYSA-N 0.000 description 2
- AGFIRQJZCNVMCW-UAKXSSHOSA-N 5-bromouridine Chemical compound O[C@@H]1[C@H](O)[C@@H](CO)O[C@H]1N1C(=O)NC(=O)C(Br)=C1 AGFIRQJZCNVMCW-UAKXSSHOSA-N 0.000 description 2
- ZCYVEMRRCGMTRW-UHFFFAOYSA-N 7553-56-2 Chemical compound [I] ZCYVEMRRCGMTRW-UHFFFAOYSA-N 0.000 description 2
- HRPVXLWXLXDGHG-UHFFFAOYSA-N Acrylamide Chemical compound NC(=O)C=C HRPVXLWXLXDGHG-UHFFFAOYSA-N 0.000 description 2
- 206010060999 Benign neoplasm Diseases 0.000 description 2
- 208000006274 Brain Stem Neoplasms Diseases 0.000 description 2
- 210000003771 C cell Anatomy 0.000 description 2
- OYPRJOBELJOOCE-UHFFFAOYSA-N Calcium Chemical compound [Ca] OYPRJOBELJOOCE-UHFFFAOYSA-N 0.000 description 2
- 102000016911 Deoxyribonucleases Human genes 0.000 description 2
- 108010053770 Deoxyribonucleases Proteins 0.000 description 2
- KCXVZYZYPLLWCC-UHFFFAOYSA-N EDTA Chemical compound OC(=O)CN(CC(O)=O)CCN(CC(O)=O)CC(O)=O KCXVZYZYPLLWCC-UHFFFAOYSA-N 0.000 description 2
- 208000004463 Follicular Adenocarcinoma Diseases 0.000 description 2
- YLQBMQCUIZJEEH-UHFFFAOYSA-N Furan Chemical compound C=1C=COC=1 YLQBMQCUIZJEEH-UHFFFAOYSA-N 0.000 description 2
- 108091068853 Homo sapiens miR-100 stem-loop Proteins 0.000 description 2
- 108091067628 Homo sapiens miR-10a stem-loop Proteins 0.000 description 2
- 108091044954 Homo sapiens miR-1229 stem-loop Proteins 0.000 description 2
- 108091069085 Homo sapiens miR-126 stem-loop Proteins 0.000 description 2
- 108091068991 Homo sapiens miR-141 stem-loop Proteins 0.000 description 2
- 108091067654 Homo sapiens miR-148a stem-loop Proteins 0.000 description 2
- 108091065981 Homo sapiens miR-155 stem-loop Proteins 0.000 description 2
- 108091067602 Homo sapiens miR-181b-1 stem-loop Proteins 0.000 description 2
- 108091065989 Homo sapiens miR-181b-2 stem-loop Proteins 0.000 description 2
- 108091067627 Homo sapiens miR-182 stem-loop Proteins 0.000 description 2
- 108091067635 Homo sapiens miR-187 stem-loop Proteins 0.000 description 2
- 108091067995 Homo sapiens miR-192 stem-loop Proteins 0.000 description 2
- 108091068960 Homo sapiens miR-195 stem-loop Proteins 0.000 description 2
- 108091067484 Homo sapiens miR-199b stem-loop Proteins 0.000 description 2
- 108091067482 Homo sapiens miR-205 stem-loop Proteins 0.000 description 2
- 108091067468 Homo sapiens miR-210 stem-loop Proteins 0.000 description 2
- 108091067580 Homo sapiens miR-214 stem-loop Proteins 0.000 description 2
- 108091070494 Homo sapiens miR-22 stem-loop Proteins 0.000 description 2
- 108091069527 Homo sapiens miR-223 stem-loop Proteins 0.000 description 2
- 108091069517 Homo sapiens miR-224 stem-loop Proteins 0.000 description 2
- 108091055169 Homo sapiens miR-2392 stem-loop Proteins 0.000 description 2
- 108091065168 Homo sapiens miR-29c stem-loop Proteins 0.000 description 2
- 108091065436 Homo sapiens miR-30e stem-loop Proteins 0.000 description 2
- 108091072943 Homo sapiens miR-3141 stem-loop Proteins 0.000 description 2
- 108091072956 Homo sapiens miR-3162 stem-loop Proteins 0.000 description 2
- 108091056656 Homo sapiens miR-3648-1 stem-loop Proteins 0.000 description 2
- 108091045458 Homo sapiens miR-3648-2 stem-loop Proteins 0.000 description 2
- 108091056608 Homo sapiens miR-3679 stem-loop Proteins 0.000 description 2
- 108091067563 Homo sapiens miR-376a-1 stem-loop Proteins 0.000 description 2
- 108091063912 Homo sapiens miR-376a-2 stem-loop Proteins 0.000 description 2
- 108091054965 Homo sapiens miR-3940 stem-loop Proteins 0.000 description 2
- 108091035151 Homo sapiens miR-4270 stem-loop Proteins 0.000 description 2
- 108091034227 Homo sapiens miR-4284 stem-loop Proteins 0.000 description 2
- 108091032930 Homo sapiens miR-429 stem-loop Proteins 0.000 description 2
- 108091035072 Homo sapiens miR-4324 stem-loop Proteins 0.000 description 2
- 108091055493 Homo sapiens miR-4443 stem-loop Proteins 0.000 description 2
- 108091055528 Homo sapiens miR-4447 stem-loop Proteins 0.000 description 2
- 108091055376 Homo sapiens miR-4448 stem-loop Proteins 0.000 description 2
- 108091055444 Homo sapiens miR-4454 stem-loop Proteins 0.000 description 2
- 108091054145 Homo sapiens miR-4534 stem-loop Proteins 0.000 description 2
- 108091054155 Homo sapiens miR-4538 stem-loop Proteins 0.000 description 2
- 108091054141 Homo sapiens miR-4539 stem-loop Proteins 0.000 description 2
- 108091063813 Homo sapiens miR-455 stem-loop Proteins 0.000 description 2
- 108091023075 Homo sapiens miR-4689 stem-loop Proteins 0.000 description 2
- 108091023074 Homo sapiens miR-4690 stem-loop Proteins 0.000 description 2
- 108091093164 Homo sapiens miR-4739 stem-loop Proteins 0.000 description 2
- 108091053841 Homo sapiens miR-483 stem-loop Proteins 0.000 description 2
- 108091063895 Homo sapiens miR-487b stem-loop Proteins 0.000 description 2
- 108091092303 Homo sapiens miR-497 stem-loop Proteins 0.000 description 2
- 108091063646 Homo sapiens miR-5001 stem-loop Proteins 0.000 description 2
- 108091090409 Homo sapiens miR-5100 stem-loop Proteins 0.000 description 2
- 108091063565 Homo sapiens miR-532 stem-loop Proteins 0.000 description 2
- 108091061666 Homo sapiens miR-542 stem-loop Proteins 0.000 description 2
- 108091089161 Homo sapiens miR-5684 stem-loop Proteins 0.000 description 2
- 108091088947 Homo sapiens miR-5698 stem-loop Proteins 0.000 description 2
- 108091089754 Homo sapiens miR-5739 stem-loop Proteins 0.000 description 2
- 108091038972 Homo sapiens miR-6076 stem-loop Proteins 0.000 description 2
- 108091039970 Homo sapiens miR-6086 stem-loop Proteins 0.000 description 2
- 108091058611 Homo sapiens miR-6127 stem-loop Proteins 0.000 description 2
- 108091061649 Homo sapiens miR-625 stem-loop Proteins 0.000 description 2
- 108091061608 Homo sapiens miR-650 stem-loop Proteins 0.000 description 2
- 108091061674 Homo sapiens miR-658 stem-loop Proteins 0.000 description 2
- 108091089456 Homo sapiens miR-664b stem-loop Proteins 0.000 description 2
- 108091086460 Homo sapiens miR-708 stem-loop Proteins 0.000 description 2
- 108091087855 Homo sapiens miR-765 stem-loop Proteins 0.000 description 2
- 108091063740 Homo sapiens miR-92b stem-loop Proteins 0.000 description 2
- 108091070376 Homo sapiens miR-96 stem-loop Proteins 0.000 description 2
- 102000003960 Ligases Human genes 0.000 description 2
- 108090000364 Ligases Proteins 0.000 description 2
- 241000124008 Mammalia Species 0.000 description 2
- 108091030146 MiRBase Proteins 0.000 description 2
- 108020004711 Nucleic Acid Probes Proteins 0.000 description 2
- 108091005461 Nucleic proteins Proteins 0.000 description 2
- 238000012408 PCR amplification Methods 0.000 description 2
- 229910019142 PO4 Inorganic materials 0.000 description 2
- 101710124239 Poly(A) polymerase Proteins 0.000 description 2
- 238000002123 RNA extraction Methods 0.000 description 2
- 238000013381 RNA quantification Methods 0.000 description 2
- VMHLLURERBWHNL-UHFFFAOYSA-M Sodium acetate Chemical compound [Na+].CC([O-])=O VMHLLURERBWHNL-UHFFFAOYSA-M 0.000 description 2
- FKNQFGJONOIPTF-UHFFFAOYSA-N Sodium cation Chemical compound [Na+] FKNQFGJONOIPTF-UHFFFAOYSA-N 0.000 description 2
- PPBRXRYQALVLMV-UHFFFAOYSA-N Styrene Chemical compound C=CC1=CC=CC=C1 PPBRXRYQALVLMV-UHFFFAOYSA-N 0.000 description 2
- 102000004523 Sulfate Adenylyltransferase Human genes 0.000 description 2
- 108010022348 Sulfate adenylyltransferase Proteins 0.000 description 2
- 108700005078 Synthetic Genes Proteins 0.000 description 2
- 239000007983 Tris buffer Substances 0.000 description 2
- 239000002671 adjuvant Substances 0.000 description 2
- 238000003491 array Methods 0.000 description 2
- 125000003118 aryl group Chemical group 0.000 description 2
- 229960002685 biotin Drugs 0.000 description 2
- 235000020958 biotin Nutrition 0.000 description 2
- 239000011616 biotin Substances 0.000 description 2
- 229960002865 cabozantinib s-malate Drugs 0.000 description 2
- 229910052791 calcium Inorganic materials 0.000 description 2
- 239000011575 calcium Substances 0.000 description 2
- 239000002775 capsule Substances 0.000 description 2
- 239000000969 carrier Substances 0.000 description 2
- 238000012512 characterization method Methods 0.000 description 2
- 238000000546 chi-square test Methods 0.000 description 2
- HVYWMOMLDIMFJA-DPAQBDIFSA-N cholesterol Chemical compound C1C=C2C[C@@H](O)CC[C@]2(C)[C@@H]2[C@@H]1[C@@H]1CC[C@H]([C@H](C)CCCC(C)C)[C@@]1(C)CC2 HVYWMOMLDIMFJA-DPAQBDIFSA-N 0.000 description 2
- 238000010276 construction Methods 0.000 description 2
- 238000013480 data collection Methods 0.000 description 2
- 238000006731 degradation reaction Methods 0.000 description 2
- 230000004069 differentiation Effects 0.000 description 2
- 238000010790 dilution Methods 0.000 description 2
- 239000012895 dilution Substances 0.000 description 2
- XPPKVPWEQAFLFU-UHFFFAOYSA-J diphosphate(4-) Chemical compound [O-]P([O-])(=O)OP([O-])([O-])=O XPPKVPWEQAFLFU-UHFFFAOYSA-J 0.000 description 2
- 235000011180 diphosphates Nutrition 0.000 description 2
- 201000010099 disease Diseases 0.000 description 2
- 239000000834 fixative Substances 0.000 description 2
- 239000012530 fluid Substances 0.000 description 2
- 239000007850 fluorescent dye Substances 0.000 description 2
- 239000006260 foam Substances 0.000 description 2
- 125000005843 halogen group Chemical group 0.000 description 2
- 125000000623 heterocyclic group Chemical group 0.000 description 2
- 229910052739 hydrogen Inorganic materials 0.000 description 2
- 238000001727 in vivo Methods 0.000 description 2
- 238000011534 incubation Methods 0.000 description 2
- 239000004615 ingredient Substances 0.000 description 2
- 230000002401 inhibitory effect Effects 0.000 description 2
- 239000007924 injection Substances 0.000 description 2
- 238000002347 injection Methods 0.000 description 2
- 230000003993 interaction Effects 0.000 description 2
- 230000009545 invasion Effects 0.000 description 2
- 239000011630 iodine Substances 0.000 description 2
- 210000004698 lymphocyte Anatomy 0.000 description 2
- HQKMJHAJHXVSDF-UHFFFAOYSA-L magnesium stearate Chemical compound [Mg+2].CCCCCCCCCCCCCCCCCC([O-])=O.CCCCCCCCCCCCCCCCCC([O-])=O HQKMJHAJHXVSDF-UHFFFAOYSA-L 0.000 description 2
- 238000007726 management method Methods 0.000 description 2
- 238000002844 melting Methods 0.000 description 2
- 230000008018 melting Effects 0.000 description 2
- 230000004060 metabolic process Effects 0.000 description 2
- 208000011645 metastatic carcinoma Diseases 0.000 description 2
- 238000001531 micro-dissection Methods 0.000 description 2
- 230000002438 mitochondrial effect Effects 0.000 description 2
- 238000002156 mixing Methods 0.000 description 2
- 238000012544 monitoring process Methods 0.000 description 2
- 230000035772 mutation Effects 0.000 description 2
- 239000013642 negative control Substances 0.000 description 2
- IDBIFFKSXLYUOT-UHFFFAOYSA-N netropsin Chemical compound C1=C(C(=O)NCCC(N)=N)N(C)C=C1NC(=O)C1=CC(NC(=O)CN=C(N)N)=CN1C IDBIFFKSXLYUOT-UHFFFAOYSA-N 0.000 description 2
- 239000002853 nucleic acid probe Substances 0.000 description 2
- 239000002777 nucleoside Substances 0.000 description 2
- 150000003833 nucleoside derivatives Chemical class 0.000 description 2
- 239000008188 pellet Substances 0.000 description 2
- 125000001997 phenyl group Chemical group [H]C1=C([H])C([H])=C(*)C([H])=C1[H] 0.000 description 2
- NBIIXXVUZAFLBC-UHFFFAOYSA-K phosphate Chemical group [O-]P([O-])([O-])=O NBIIXXVUZAFLBC-UHFFFAOYSA-K 0.000 description 2
- 239000010452 phosphate Substances 0.000 description 2
- 239000013641 positive control Substances 0.000 description 2
- 239000000843 powder Substances 0.000 description 2
- ZSKGQVFRTSEPJT-UHFFFAOYSA-N pyrrole-2-carboxaldehyde Chemical compound O=CC1=CC=CN1 ZSKGQVFRTSEPJT-UHFFFAOYSA-N 0.000 description 2
- 238000001303 quality assessment method Methods 0.000 description 2
- 238000011002 quantification Methods 0.000 description 2
- 230000005855 radiation Effects 0.000 description 2
- 238000002271 resection Methods 0.000 description 2
- 230000004044 response Effects 0.000 description 2
- 238000012552 review Methods 0.000 description 2
- 150000003839 salts Chemical class 0.000 description 2
- 238000012216 screening Methods 0.000 description 2
- 238000007841 sequencing by ligation Methods 0.000 description 2
- 239000000377 silicon dioxide Substances 0.000 description 2
- 229910001415 sodium ion Inorganic materials 0.000 description 2
- 229960000487 sorafenib tosylate Drugs 0.000 description 2
- 238000013179 statistical model Methods 0.000 description 2
- 238000006467 substitution reaction Methods 0.000 description 2
- 239000000725 suspension Substances 0.000 description 2
- 208000024891 symptom Diseases 0.000 description 2
- 230000002194 synthesizing effect Effects 0.000 description 2
- 206010043778 thyroiditis Diseases 0.000 description 2
- 238000013518 transcription Methods 0.000 description 2
- 230000035897 transcription Effects 0.000 description 2
- LENZDBCJOHFCAS-UHFFFAOYSA-N tris Chemical compound OCC(N)(CO)CO LENZDBCJOHFCAS-UHFFFAOYSA-N 0.000 description 2
- 210000004881 tumor cell Anatomy 0.000 description 2
- 238000003260 vortexing Methods 0.000 description 2
- 229940075420 xanthine Drugs 0.000 description 2
- CALDMMCNNFPJSI-CRCLSJGQSA-N (3r,5s)-5-(hydroxymethyl)pyrrolidin-3-ol Chemical compound OC[C@@H]1C[C@@H](O)CN1 CALDMMCNNFPJSI-CRCLSJGQSA-N 0.000 description 1
- 125000004169 (C1-C6) alkyl group Chemical group 0.000 description 1
- FYADHXFMURLYQI-UHFFFAOYSA-N 1,2,4-triazine Chemical class C1=CN=NC=N1 FYADHXFMURLYQI-UHFFFAOYSA-N 0.000 description 1
- LKUDPHPHKOZXCD-UHFFFAOYSA-N 1,3,5-trimethoxybenzene Chemical compound COC1=CC(OC)=CC(OC)=C1 LKUDPHPHKOZXCD-UHFFFAOYSA-N 0.000 description 1
- GZEFTKHSACGIBG-UGKPPGOTSA-N 1-[(2r,3r,4s,5r)-3,4-dihydroxy-5-(hydroxymethyl)-2-propyloxolan-2-yl]pyrimidine-2,4-dione Chemical compound C1=CC(=O)NC(=O)N1[C@]1(CCC)O[C@H](CO)[C@@H](O)[C@H]1O GZEFTKHSACGIBG-UGKPPGOTSA-N 0.000 description 1
- SGKGZYGMLGVQHP-ZOQUXTDFSA-N 1-[(2r,3r,4s,5r)-3,4-dihydroxy-5-(hydroxymethyl)oxolan-2-yl]-6-methylpyrimidine-2,4-dione Chemical compound CC1=CC(=O)NC(=O)N1[C@H]1[C@H](O)[C@H](O)[C@@H](CO)O1 SGKGZYGMLGVQHP-ZOQUXTDFSA-N 0.000 description 1
- MWBWWFOAEOYUST-UHFFFAOYSA-N 2-aminopurine Chemical compound NC1=NC=C2N=CNC2=N1 MWBWWFOAEOYUST-UHFFFAOYSA-N 0.000 description 1
- FSVMNZBNUZWLMV-UHFFFAOYSA-N 3,6,7,8-tetrahydropyrrolo[3,2-e]indole-2-carboxylic acid Chemical compound C1=C2NC(C(=O)O)=CC2=C2CCNC2=C1 FSVMNZBNUZWLMV-UHFFFAOYSA-N 0.000 description 1
- VPLZGVOSFFCKFC-UHFFFAOYSA-N 3-methyluracil Chemical compound CN1C(=O)C=CNC1=O VPLZGVOSFFCKFC-UHFFFAOYSA-N 0.000 description 1
- UDGUGZTYGWUUSG-UHFFFAOYSA-N 4-[4-[[2,5-dimethoxy-4-[(4-nitrophenyl)diazenyl]phenyl]diazenyl]-n-methylanilino]butanoic acid Chemical compound COC=1C=C(N=NC=2C=CC(=CC=2)N(C)CCCC(O)=O)C(OC)=CC=1N=NC1=CC=C([N+]([O-])=O)C=C1 UDGUGZTYGWUUSG-UHFFFAOYSA-N 0.000 description 1
- PHAFOFIVSNSAPQ-UHFFFAOYSA-N 4-fluoro-6-methyl-1h-benzimidazole Chemical compound CC1=CC(F)=C2NC=NC2=C1 PHAFOFIVSNSAPQ-UHFFFAOYSA-N 0.000 description 1
- GCNTZFIIOFTKIY-UHFFFAOYSA-N 4-hydroxypyridine Chemical compound OC1=CC=NC=C1 GCNTZFIIOFTKIY-UHFFFAOYSA-N 0.000 description 1
- QCXGJTGMGJOYDP-UHFFFAOYSA-N 4-methyl-1h-benzimidazole Chemical compound CC1=CC=CC2=C1N=CN2 QCXGJTGMGJOYDP-UHFFFAOYSA-N 0.000 description 1
- ZAYHVCMSTBRABG-UHFFFAOYSA-N 5-Methylcytidine Natural products O=C1N=C(N)C(C)=CN1C1C(O)C(O)C(CO)O1 ZAYHVCMSTBRABG-UHFFFAOYSA-N 0.000 description 1
- LQLQRFGHAALLLE-UHFFFAOYSA-N 5-bromouracil Chemical compound BrC1=CNC(=O)NC1=O LQLQRFGHAALLLE-UHFFFAOYSA-N 0.000 description 1
- ZAYHVCMSTBRABG-JXOAFFINSA-N 5-methylcytidine Chemical compound O=C1N=C(N)C(C)=CN1[C@H]1[C@H](O)[C@H](O)[C@@H](CO)O1 ZAYHVCMSTBRABG-JXOAFFINSA-N 0.000 description 1
- 108020004565 5.8S Ribosomal RNA Proteins 0.000 description 1
- 108020005075 5S Ribosomal RNA Proteins 0.000 description 1
- BZTDTCNHAFUJOG-UHFFFAOYSA-N 6-carboxyfluorescein Chemical compound C12=CC=C(O)C=C2OC2=CC(O)=CC=C2C11OC(=O)C2=CC=C(C(=O)O)C=C21 BZTDTCNHAFUJOG-UHFFFAOYSA-N 0.000 description 1
- QCHLWIPINNRBSG-UHFFFAOYSA-N 6-thiophen-2-yl-7h-purin-2-amine Chemical compound C=12NC=NC2=NC(N)=NC=1C1=CC=CS1 QCHLWIPINNRBSG-UHFFFAOYSA-N 0.000 description 1
- WIPHNVWGXOHNEF-UHFFFAOYSA-N 7-thiophen-2-yl-1h-imidazo[4,5-b]pyridine Chemical compound C1=CSC(C=2C=3N=CNC=3N=CC=2)=C1 WIPHNVWGXOHNEF-UHFFFAOYSA-N 0.000 description 1
- ASUCSHXLTWZYBA-UMMCILCDSA-N 8-Bromoguanosine Chemical compound C1=2NC(N)=NC(=O)C=2N=C(Br)N1[C@@H]1O[C@H](CO)[C@@H](O)[C@H]1O ASUCSHXLTWZYBA-UMMCILCDSA-N 0.000 description 1
- HDZZVAMISRMYHH-UHFFFAOYSA-N 9beta-Ribofuranosyl-7-deazaadenin Natural products C1=CC=2C(N)=NC=NC=2N1C1OC(CO)C(O)C1O HDZZVAMISRMYHH-UHFFFAOYSA-N 0.000 description 1
- 206010001233 Adenoma benign Diseases 0.000 description 1
- 208000000884 Airway Obstruction Diseases 0.000 description 1
- 239000004382 Amylase Substances 0.000 description 1
- 102000013142 Amylases Human genes 0.000 description 1
- 108010065511 Amylases Proteins 0.000 description 1
- 108091023037 Aptamer Proteins 0.000 description 1
- 108010007730 Apyrase Proteins 0.000 description 1
- 102000007347 Apyrase Human genes 0.000 description 1
- 102100030356 Arginase-2, mitochondrial Human genes 0.000 description 1
- 241000894006 Bacteria Species 0.000 description 1
- 208000023328 Basedow disease Diseases 0.000 description 1
- DWRXFEITVBNRMK-UHFFFAOYSA-N Beta-D-1-Arabinofuranosylthymine Natural products O=C1NC(=O)C(C)=CN1C1C(O)C(O)C(CO)O1 DWRXFEITVBNRMK-UHFFFAOYSA-N 0.000 description 1
- QCMYYKRYFNMIEC-UHFFFAOYSA-N COP(O)=O Chemical class COP(O)=O QCMYYKRYFNMIEC-UHFFFAOYSA-N 0.000 description 1
- 102000055006 Calcitonin Human genes 0.000 description 1
- 108060001064 Calcitonin Proteins 0.000 description 1
- OKTJSMMVPCPJKN-UHFFFAOYSA-N Carbon Chemical compound [C] OKTJSMMVPCPJKN-UHFFFAOYSA-N 0.000 description 1
- 206010053567 Coagulopathies Diseases 0.000 description 1
- 108091035707 Consensus sequence Proteins 0.000 description 1
- 239000004971 Cross linker Substances 0.000 description 1
- IGXWBGJHJZYPQS-SSDOTTSWSA-N D-Luciferin Chemical compound OC(=O)[C@H]1CSC(C=2SC3=CC=C(O)C=C3N=2)=N1 IGXWBGJHJZYPQS-SSDOTTSWSA-N 0.000 description 1
- 108020004414 DNA Proteins 0.000 description 1
- 102000053602 DNA Human genes 0.000 description 1
- 238000001712 DNA sequencing Methods 0.000 description 1
- CYCGRDQQIOGCKX-UHFFFAOYSA-N Dehydro-luciferin Natural products OC(=O)C1=CSC(C=2SC3=CC(O)=CC=C3N=2)=N1 CYCGRDQQIOGCKX-UHFFFAOYSA-N 0.000 description 1
- SHIBSTMRCDJXLN-UHFFFAOYSA-N Digoxigenin Natural products C1CC(C2C(C3(C)CCC(O)CC3CC2)CC2O)(O)C2(C)C1C1=CC(=O)OC1 SHIBSTMRCDJXLN-UHFFFAOYSA-N 0.000 description 1
- 102100038002 Dolichyl-diphosphooligosaccharide-protein glycosyltransferase subunit STT3A Human genes 0.000 description 1
- 206010073306 Exposure to radiation Diseases 0.000 description 1
- BJGNCJDXODQBOB-UHFFFAOYSA-N Fivefly Luciferin Natural products OC(=O)C1CSC(C=2SC3=CC(O)=CC=C3N=2)=N1 BJGNCJDXODQBOB-UHFFFAOYSA-N 0.000 description 1
- 206010016935 Follicular thyroid cancer Diseases 0.000 description 1
- 241000233866 Fungi Species 0.000 description 1
- 102100024185 G1/S-specific cyclin-D2 Human genes 0.000 description 1
- 108010010803 Gelatin Proteins 0.000 description 1
- 206010018498 Goitre Diseases 0.000 description 1
- 208000015023 Graves' disease Diseases 0.000 description 1
- 102100040896 Growth/differentiation factor 15 Human genes 0.000 description 1
- NYHBQMYGNKIUIF-UUOKFMHZSA-N Guanosine Chemical class C1=NC=2C(=O)NC(N)=NC=2N1[C@@H]1O[C@H](CO)[C@@H](O)[C@H]1O NYHBQMYGNKIUIF-UUOKFMHZSA-N 0.000 description 1
- 101000792835 Homo sapiens Arginase-2, mitochondrial Proteins 0.000 description 1
- 101000661592 Homo sapiens Dolichyl-diphosphooligosaccharide-protein glycosyltransferase subunit STT3A Proteins 0.000 description 1
- 101000980741 Homo sapiens G1/S-specific cyclin-D2 Proteins 0.000 description 1
- 101000893549 Homo sapiens Growth/differentiation factor 15 Proteins 0.000 description 1
- 101000601394 Homo sapiens Neuroendocrine convertase 2 Proteins 0.000 description 1
- 101000979460 Homo sapiens Protein Niban 1 Proteins 0.000 description 1
- 101000835541 Homo sapiens Target of Nesh-SH3 Proteins 0.000 description 1
- 108091069022 Homo sapiens miR-130a stem-loop Proteins 0.000 description 1
- 108091069090 Homo sapiens miR-149 stem-loop Proteins 0.000 description 1
- 108091068837 Homo sapiens miR-29b-1 stem-loop Proteins 0.000 description 1
- 108091068845 Homo sapiens miR-29b-2 stem-loop Proteins 0.000 description 1
- 108091064366 Homo sapiens miR-513a-1 stem-loop Proteins 0.000 description 1
- 108091064370 Homo sapiens miR-513a-2 stem-loop Proteins 0.000 description 1
- 108091061644 Homo sapiens miR-624 stem-loop Proteins 0.000 description 1
- 108091067625 Homo sapiens miR-7-1 stem-loop Proteins 0.000 description 1
- 108091067630 Homo sapiens miR-7-2 stem-loop Proteins 0.000 description 1
- 108091067633 Homo sapiens miR-7-3 stem-loop Proteins 0.000 description 1
- 206010020850 Hyperthyroidism Diseases 0.000 description 1
- 206010062767 Hypophysitis Diseases 0.000 description 1
- 102000001706 Immunoglobulin Fab Fragments Human genes 0.000 description 1
- 108010054477 Immunoglobulin Fab Fragments Proteins 0.000 description 1
- 108010021625 Immunoglobulin Fragments Proteins 0.000 description 1
- 102000008394 Immunoglobulin Fragments Human genes 0.000 description 1
- UGQMRVRMYYASKQ-KQYNXXCUSA-N Inosine Chemical compound O[C@@H]1[C@H](O)[C@@H](CO)O[C@H]1N1C2=NC=NC(O)=C2N=C1 UGQMRVRMYYASKQ-KQYNXXCUSA-N 0.000 description 1
- 229930010555 Inosine Natural products 0.000 description 1
- 102100034343 Integrase Human genes 0.000 description 1
- 108090001025 Iron regulatory protein 1 Proteins 0.000 description 1
- 102000004901 Iron regulatory protein 1 Human genes 0.000 description 1
- 239000005511 L01XE05 - Sorafenib Substances 0.000 description 1
- 239000002176 L01XE26 - Cabozantinib Substances 0.000 description 1
- GUBGYTABKSRVRQ-QKKXKWKRSA-N Lactose Natural products OC[C@H]1O[C@@H](O[C@H]2[C@H](O)[C@@H](O)C(O)O[C@@H]2CO)[C@H](O)[C@@H](O)[C@H]1O GUBGYTABKSRVRQ-QKKXKWKRSA-N 0.000 description 1
- 108060001084 Luciferase Proteins 0.000 description 1
- 239000005089 Luciferase Substances 0.000 description 1
- DDWFXDSYGUXRAY-UHFFFAOYSA-N Luciferin Natural products CCc1c(C)c(CC2NC(=O)C(=C2C=C)C)[nH]c1Cc3[nH]c4C(=C5/NC(CC(=O)O)C(C)C5CC(=O)O)CC(=O)c4c3C DDWFXDSYGUXRAY-UHFFFAOYSA-N 0.000 description 1
- 206010025323 Lymphomas Diseases 0.000 description 1
- 108091007773 MIR100 Proteins 0.000 description 1
- 108091008060 MIR10A Proteins 0.000 description 1
- 108091007701 MIR3162 Proteins 0.000 description 1
- 108091007707 MIR4454 Proteins 0.000 description 1
- 208000009018 Medullary thyroid cancer Diseases 0.000 description 1
- 241001465754 Metazoa Species 0.000 description 1
- 108060004795 Methyltransferase Proteins 0.000 description 1
- 102000016397 Methyltransferase Human genes 0.000 description 1
- 108020005196 Mitochondrial DNA Proteins 0.000 description 1
- VQAYFKKCNSOZKM-IOSLPCCCSA-N N(6)-methyladenosine Chemical compound C1=NC=2C(NC)=NC=NC=2N1[C@@H]1O[C@H](CO)[C@@H](O)[C@H]1O VQAYFKKCNSOZKM-IOSLPCCCSA-N 0.000 description 1
- VQAYFKKCNSOZKM-UHFFFAOYSA-N NSC 29409 Natural products C1=NC=2C(NC)=NC=NC=2N1C1OC(CO)C(O)C1O VQAYFKKCNSOZKM-UHFFFAOYSA-N 0.000 description 1
- 206010061309 Neoplasm progression Diseases 0.000 description 1
- 206010029216 Nervousness Diseases 0.000 description 1
- 108010042309 Netropsin Proteins 0.000 description 1
- 102100037732 Neuroendocrine convertase 2 Human genes 0.000 description 1
- 239000000020 Nitrocellulose Substances 0.000 description 1
- 208000015914 Non-Hodgkin lymphomas Diseases 0.000 description 1
- 239000004677 Nylon Substances 0.000 description 1
- 108020005187 Oligonucleotide Probes Proteins 0.000 description 1
- 238000002944 PCR assay Methods 0.000 description 1
- 239000004698 Polyethylene Substances 0.000 description 1
- 239000004793 Polystyrene Substances 0.000 description 1
- 108010029485 Protein Isoforms Proteins 0.000 description 1
- 102000001708 Protein Isoforms Human genes 0.000 description 1
- 102100023076 Protein Niban 1 Human genes 0.000 description 1
- CZPWVGJYEJSRLH-UHFFFAOYSA-N Pyrimidine Chemical compound C1=CN=CN=C1 CZPWVGJYEJSRLH-UHFFFAOYSA-N 0.000 description 1
- 238000012341 Quantitative reverse-transcriptase PCR Methods 0.000 description 1
- 108010092799 RNA-directed DNA polymerase Proteins 0.000 description 1
- 238000001069 Raman spectroscopy Methods 0.000 description 1
- 102000006382 Ribonucleases Human genes 0.000 description 1
- 108010083644 Ribonucleases Proteins 0.000 description 1
- XUIMIQQOPSSXEZ-UHFFFAOYSA-N Silicon Chemical compound [Si] XUIMIQQOPSSXEZ-UHFFFAOYSA-N 0.000 description 1
- 102000039471 Small Nuclear RNA Human genes 0.000 description 1
- 238000012167 Small RNA sequencing Methods 0.000 description 1
- 229920002472 Starch Polymers 0.000 description 1
- 102100026544 Target of Nesh-SH3 Human genes 0.000 description 1
- RYYWUUFWQRZTIU-UHFFFAOYSA-N Thiophosphoric acid Chemical class OP(O)(S)=O RYYWUUFWQRZTIU-UHFFFAOYSA-N 0.000 description 1
- 208000033781 Thyroid carcinoma Diseases 0.000 description 1
- 208000024799 Thyroid disease Diseases 0.000 description 1
- 108010057666 Transcription Factor CHOP Proteins 0.000 description 1
- GLNADSQYFUSGOU-GPTZEZBUSA-J Trypan blue Chemical compound [Na+].[Na+].[Na+].[Na+].C1=C(S([O-])(=O)=O)C=C2C=C(S([O-])(=O)=O)C(/N=N/C3=CC=C(C=C3C)C=3C=C(C(=CC=3)\N=N\C=3C(=CC4=CC(=CC(N)=C4C=3O)S([O-])(=O)=O)S([O-])(=O)=O)C)=C(O)C2=C1N GLNADSQYFUSGOU-GPTZEZBUSA-J 0.000 description 1
- 229910052770 Uranium Inorganic materials 0.000 description 1
- 241000700605 Viruses Species 0.000 description 1
- 208000005248 Vocal Cord Paralysis Diseases 0.000 description 1
- JLCPHMBAVCMARE-UHFFFAOYSA-N [3-[[3-[[3-[[3-[[3-[[3-[[3-[[3-[[3-[[3-[[3-[[5-(2-amino-6-oxo-1H-purin-9-yl)-3-[[3-[[3-[[3-[[3-[[3-[[5-(2-amino-6-oxo-1H-purin-9-yl)-3-[[5-(2-amino-6-oxo-1H-purin-9-yl)-3-hydroxyoxolan-2-yl]methoxy-hydroxyphosphoryl]oxyoxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(5-methyl-2,4-dioxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxyoxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(5-methyl-2,4-dioxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(4-amino-2-oxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(5-methyl-2,4-dioxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(5-methyl-2,4-dioxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(4-amino-2-oxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(4-amino-2-oxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(4-amino-2-oxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(4-amino-2-oxopyrimidin-1-yl)oxolan-2-yl]methyl [5-(6-aminopurin-9-yl)-2-(hydroxymethyl)oxolan-3-yl] hydrogen phosphate Polymers 
Cc1cn(C2CC(OP(O)(=O)OCC3OC(CC3OP(O)(=O)OCC3OC(CC3O)n3cnc4c3nc(N)[nH]c4=O)n3cnc4c3nc(N)[nH]c4=O)C(COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3CO)n3cnc4c(N)ncnc34)n3ccc(N)nc3=O)n3cnc4c(N)ncnc34)n3ccc(N)nc3=O)n3ccc(N)nc3=O)n3ccc(N)nc3=O)n3cnc4c(N)ncnc34)n3cnc4c(N)ncnc34)n3cc(C)c(=O)[nH]c3=O)n3cc(C)c(=O)[nH]c3=O)n3ccc(N)nc3=O)n3cc(C)c(=O)[nH]c3=O)n3cnc4c3nc(N)[nH]c4=O)n3cnc4c(N)ncnc34)n3cnc4c(N)ncnc34)n3cnc4c(N)ncnc34)n3cnc4c(N)ncnc34)O2)c(=O)[nH]c1=O JLCPHMBAVCMARE-UHFFFAOYSA-N 0.000 description 1
- 230000002159 abnormal effect Effects 0.000 description 1
- 229920006397 acrylic thermoplastic Polymers 0.000 description 1
- 230000009471 action Effects 0.000 description 1
- 230000004913 activation Effects 0.000 description 1
- 230000006978 adaptation Effects 0.000 description 1
- IRLPACMLTUPBCL-FCIPNVEPSA-N adenosine-5'-phosphosulfate Chemical compound C1=NC=2C(N)=NC=NC=2N1[C@@H]1O[C@@H](CO[P@](O)(=O)OS(O)(=O)=O)[C@H](O)[C@H]1O IRLPACMLTUPBCL-FCIPNVEPSA-N 0.000 description 1
- 150000003838 adenosines Chemical class 0.000 description 1
- 230000002411 adverse Effects 0.000 description 1
- 239000000443 aerosol Substances 0.000 description 1
- 238000007605 air drying Methods 0.000 description 1
- 150000001298 alcohols Chemical class 0.000 description 1
- PPQRONHOSHZGFQ-LMVFSUKVSA-N aldehydo-D-ribose 5-phosphate Chemical group OP(=O)(O)OC[C@@H](O)[C@@H](O)[C@@H](O)C=O PPQRONHOSHZGFQ-LMVFSUKVSA-N 0.000 description 1
- 125000003342 alkenyl group Chemical group 0.000 description 1
- 125000000304 alkynyl group Chemical group 0.000 description 1
- 230000004075 alteration Effects 0.000 description 1
- 125000003277 amino group Chemical group 0.000 description 1
- 235000019418 amylase Nutrition 0.000 description 1
- 238000000137 annealing Methods 0.000 description 1
- 239000005557 antagonist Substances 0.000 description 1
- 125000002178 anthracenyl group Chemical group C1(=CC=CC2=CC3=CC=CC=C3C=C12)* 0.000 description 1
- 239000003242 anti bacterial agent Substances 0.000 description 1
- 229940088710 antibiotic agent Drugs 0.000 description 1
- 239000000427 antigen Substances 0.000 description 1
- 108091007433 antigens Proteins 0.000 description 1
- 102000036639 antigens Human genes 0.000 description 1
- 239000002246 antineoplastic agent Substances 0.000 description 1
- 229940041181 antineoplastic drug Drugs 0.000 description 1
- 230000006907 apoptotic process Effects 0.000 description 1
- 238000013528 artificial neural network Methods 0.000 description 1
- 238000002820 assay format Methods 0.000 description 1
- 238000011888 autopsy Methods 0.000 description 1
- HFACYLZERDEVSX-UHFFFAOYSA-N benzidine Chemical compound C1=CC(N)=CC=C1C1=CC=C(N)C=C1 HFACYLZERDEVSX-UHFFFAOYSA-N 0.000 description 1
- 125000002619 bicyclic group Chemical group 0.000 description 1
- 230000002146 bilateral effect Effects 0.000 description 1
- 238000010876 biochemical test Methods 0.000 description 1
- 230000031018 biological processes and functions Effects 0.000 description 1
- 239000000090 biomarker Substances 0.000 description 1
- 230000015572 biosynthetic process Effects 0.000 description 1
- 230000000740 bleeding effect Effects 0.000 description 1
- 239000002981 blocking agent Substances 0.000 description 1
- 238000009534 blood test Methods 0.000 description 1
- 210000001124 body fluid Anatomy 0.000 description 1
- 238000006664 bond formation reaction Methods 0.000 description 1
- 101150048834 braF gene Proteins 0.000 description 1
- 238000002725 brachytherapy Methods 0.000 description 1
- BBBFJLBPOGFECG-VJVYQDLKSA-N calcitonin Chemical compound N([C@H](C(=O)N[C@@H](CC(C)C)C(=O)NCC(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC=1NC=NC=1)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC=1C=CC(O)=CC=1)C(=O)N1[C@@H](CCC1)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)NCC(=O)N[C@@H](CO)C(=O)NCC(=O)N[C@@H]([C@@H](C)O)C(=O)N1[C@@H](CCC1)C(N)=O)C(C)C)C(=O)[C@@H]1CSSC[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)O)C(=O)N1 BBBFJLBPOGFECG-VJVYQDLKSA-N 0.000 description 1
- 229960004015 calcitonin Drugs 0.000 description 1
- 230000009702 cancer cell proliferation Effects 0.000 description 1
- 229940056434 caprelsa Drugs 0.000 description 1
- 229910052799 carbon Inorganic materials 0.000 description 1
- 125000003178 carboxy group Chemical group [H]OC(*)=O 0.000 description 1
- 231100000357 carcinogen Toxicity 0.000 description 1
- 239000003183 carcinogenic agent Substances 0.000 description 1
- 230000032823 cell division Effects 0.000 description 1
- 210000000170 cell membrane Anatomy 0.000 description 1
- 238000003320 cell separation method Methods 0.000 description 1
- 230000003833 cell viability Effects 0.000 description 1
- 230000001413 cellular effect Effects 0.000 description 1
- 230000019522 cellular metabolic process Effects 0.000 description 1
- 230000007960 cellular response to stress Effects 0.000 description 1
- 210000003850 cellular structure Anatomy 0.000 description 1
- 239000012829 chemotherapy agent Substances 0.000 description 1
- 235000012000 cholesterol Nutrition 0.000 description 1
- ZYVSOIYQKUDENJ-WKSBCEQHSA-N chromomycin A3 Chemical compound O([C@@H]1C[C@@H](O[C@H](C)[C@@H]1OC(C)=O)OC=1C=C2C=C3C[C@H]([C@@H](C(=O)C3=C(O)C2=C(O)C=1C)O[C@@H]1O[C@H](C)[C@@H](O)[C@H](O[C@@H]2O[C@H](C)[C@@H](O)[C@H](O[C@@H]3O[C@@H](C)[C@H](OC(C)=O)[C@@](C)(O)C3)C2)C1)[C@H](OC)C(=O)[C@@H](O)[C@@H](C)O)[C@@H]1C[C@@H](O)[C@@H](OC)[C@@H](C)O1 ZYVSOIYQKUDENJ-WKSBCEQHSA-N 0.000 description 1
- 230000008711 chromosomal rearrangement Effects 0.000 description 1
- 238000007635 classification algorithm Methods 0.000 description 1
- 238000009535 clinical urine test Methods 0.000 description 1
- 238000010367 cloning Methods 0.000 description 1
- 230000035602 clotting Effects 0.000 description 1
- 239000000084 colloidal system Substances 0.000 description 1
- 239000003086 colorant Substances 0.000 description 1
- 229940034568 cometriq Drugs 0.000 description 1
- 238000012875 competitive assay Methods 0.000 description 1
- 229920001577 copolymer Polymers 0.000 description 1
- 239000013601 cosmid vector Substances 0.000 description 1
- 210000002726 cyst fluid Anatomy 0.000 description 1
- 230000000093 cytochemical effect Effects 0.000 description 1
- 230000000120 cytopathologic effect Effects 0.000 description 1
- 210000000805 cytoplasm Anatomy 0.000 description 1
- 230000006378 damage Effects 0.000 description 1
- 238000007405 data analysis Methods 0.000 description 1
- 230000034994 death Effects 0.000 description 1
- 231100000517 death Toxicity 0.000 description 1
- 238000003066 decision tree Methods 0.000 description 1
- 239000007857 degradation product Substances 0.000 description 1
- 238000004925 denaturation Methods 0.000 description 1
- 230000036425 denaturation Effects 0.000 description 1
- CFCUWKMKBJTWLW-UHFFFAOYSA-N deoliosyl-3C-alpha-L-digitoxosyl-MTM Natural products CC=1C(O)=C2C(O)=C3C(=O)C(OC4OC(C)C(O)C(OC5OC(C)C(O)C(OC6OC(C)C(O)C(C)(O)C6)C5)C4)C(C(OC)C(=O)C(O)C(C)O)CC3=CC2=CC=1OC(OC(C)C1O)CC1OC1CC(O)C(O)C(C)O1 CFCUWKMKBJTWLW-UHFFFAOYSA-N 0.000 description 1
- 239000005547 deoxyribonucleotide Substances 0.000 description 1
- 125000002637 deoxyribonucleotide group Chemical group 0.000 description 1
- 230000000368 destabilizing effect Effects 0.000 description 1
- 230000009274 differential gene expression Effects 0.000 description 1
- 238000009792 diffusion process Methods 0.000 description 1
- QONQRTHLHBTMGP-UHFFFAOYSA-N digitoxigenin Natural products CC12CCC(C3(CCC(O)CC3CC3)C)C3C11OC1CC2C1=CC(=O)OC1 QONQRTHLHBTMGP-UHFFFAOYSA-N 0.000 description 1
- SHIBSTMRCDJXLN-KCZCNTNESA-N digoxigenin Chemical compound C1([C@@H]2[C@@]3([C@@](CC2)(O)[C@H]2[C@@H]([C@@]4(C)CC[C@H](O)C[C@H]4CC2)C[C@H]3O)C)=CC(=O)OC1 SHIBSTMRCDJXLN-KCZCNTNESA-N 0.000 description 1
- ZPTBLXKRQACLCR-XVFCMESISA-N dihydrouridine Chemical compound O[C@@H]1[C@H](O)[C@@H](CO)O[C@H]1N1C(=O)NC(=O)CC1 ZPTBLXKRQACLCR-XVFCMESISA-N 0.000 description 1
- XNYZHCFCZNMTFY-UHFFFAOYSA-N diminazene Chemical compound C1=CC(C(=N)N)=CC=C1N\N=N\C1=CC=C(C(N)=N)C=C1 XNYZHCFCZNMTFY-UHFFFAOYSA-N 0.000 description 1
- NAGJZTKCGNOGPW-UHFFFAOYSA-K dioxido-sulfanylidene-sulfido-$l^{5}-phosphane Chemical compound [O-]P([O-])([S-])=S NAGJZTKCGNOGPW-UHFFFAOYSA-K 0.000 description 1
- 238000007598 dipping method Methods 0.000 description 1
- 208000035475 disorder Diseases 0.000 description 1
- 239000012154 double-distilled water Substances 0.000 description 1
- 239000003937 drug carrier Substances 0.000 description 1
- 238000000835 electrochemical detection Methods 0.000 description 1
- 230000008030 elimination Effects 0.000 description 1
- 238000003379 elimination reaction Methods 0.000 description 1
- 230000001804 emulsifying effect Effects 0.000 description 1
- 239000000839 emulsion Substances 0.000 description 1
- 238000005538 encapsulation Methods 0.000 description 1
- 230000002121 endocytic effect Effects 0.000 description 1
- 230000007613 environmental effect Effects 0.000 description 1
- 230000002255 enzymatic effect Effects 0.000 description 1
- 230000009144 enzymatic modification Effects 0.000 description 1
- 210000003743 erythrocyte Anatomy 0.000 description 1
- 238000012869 ethanol precipitation Methods 0.000 description 1
- 238000007387 excisional biopsy Methods 0.000 description 1
- 238000002474 experimental method Methods 0.000 description 1
- 238000011347 external beam therapy Methods 0.000 description 1
- 238000001914 filtration Methods 0.000 description 1
- LIYGYAHYXQDGEP-UHFFFAOYSA-N firefly oxyluciferin Natural products Oc1csc(n1)-c1nc2ccc(O)cc2s1 LIYGYAHYXQDGEP-UHFFFAOYSA-N 0.000 description 1
- 239000000796 flavoring agent Substances 0.000 description 1
- GNBHRKFJIUUOQI-UHFFFAOYSA-N fluorescein Chemical compound O1C(=O)C2=CC=CC=C2C21C1=CC=C(O)C=C1OC1=CC(O)=CC=C21 GNBHRKFJIUUOQI-UHFFFAOYSA-N 0.000 description 1
- 235000013355 food flavoring agent Nutrition 0.000 description 1
- 230000004927 fusion Effects 0.000 description 1
- 230000005021 gait Effects 0.000 description 1
- 229920000159 gelatin Polymers 0.000 description 1
- 239000008273 gelatin Substances 0.000 description 1
- 235000019322 gelatine Nutrition 0.000 description 1
- 235000011852 gelatine desserts Nutrition 0.000 description 1
- 239000003292 glue Substances 0.000 description 1
- 150000004676 glycans Chemical class 0.000 description 1
- 150000002334 glycols Chemical class 0.000 description 1
- 201000003872 goiter Diseases 0.000 description 1
- 239000008187 granular material Substances 0.000 description 1
- 239000003979 granulating agent Substances 0.000 description 1
- 230000005484 gravity Effects 0.000 description 1
- 229910001385 heavy metal Inorganic materials 0.000 description 1
- 210000003494 hepatocyte Anatomy 0.000 description 1
- 235000003642 hunger Nutrition 0.000 description 1
- 239000001257 hydrogen Substances 0.000 description 1
- 230000002209 hydrophobic effect Effects 0.000 description 1
- 125000002887 hydroxy group Chemical group [H]O* 0.000 description 1
- 229920003063 hydroxymethyl cellulose Polymers 0.000 description 1
- 229940031574 hydroxymethyl cellulose Drugs 0.000 description 1
- 208000003532 hypothyroidism Diseases 0.000 description 1
- 230000002989 hypothyroidism Effects 0.000 description 1
- 238000003384 imaging method Methods 0.000 description 1
- 210000002865 immune cell Anatomy 0.000 description 1
- 230000000984 immunochemical effect Effects 0.000 description 1
- 230000006872 improvement Effects 0.000 description 1
- 238000012296 in situ hybridization assay Methods 0.000 description 1
- 238000000338 in vitro Methods 0.000 description 1
- 238000011065 in-situ storage Methods 0.000 description 1
- 230000002779 inactivation Effects 0.000 description 1
- 238000007386 incisional biopsy Methods 0.000 description 1
- 208000015181 infectious disease Diseases 0.000 description 1
- 229960003786 inosine Drugs 0.000 description 1
- 238000007918 intramuscular administration Methods 0.000 description 1
- 238000007912 intraperitoneal administration Methods 0.000 description 1
- 238000001990 intravenous administration Methods 0.000 description 1
- PNDPGZBMCMUPRI-UHFFFAOYSA-N iodine Chemical compound II PNDPGZBMCMUPRI-UHFFFAOYSA-N 0.000 description 1
- 230000001788 irregular Effects 0.000 description 1
- 239000008101 lactose Substances 0.000 description 1
- 238000012417 linear regression Methods 0.000 description 1
- 210000004185 liver Anatomy 0.000 description 1
- 239000000314 lubricant Substances 0.000 description 1
- 230000000527 lymphocytic effect Effects 0.000 description 1
- 210000002540 macrophage Anatomy 0.000 description 1
- 235000019359 magnesium stearate Nutrition 0.000 description 1
- 238000004519 manufacturing process Methods 0.000 description 1
- 238000013507 mapping Methods 0.000 description 1
- 230000001404 mediated effect Effects 0.000 description 1
- 108020004999 messenger RNA Proteins 0.000 description 1
- 229910052751 metal Inorganic materials 0.000 description 1
- 239000002184 metal Substances 0.000 description 1
- 150000002739 metals Chemical class 0.000 description 1
- 230000001394 metastastic effect Effects 0.000 description 1
- 108091039994 miR-486 stem-loop Proteins 0.000 description 1
- 238000010208 microarray analysis Methods 0.000 description 1
- CFCUWKMKBJTWLW-BKHRDMLASA-N mithramycin Chemical compound O([C@@H]1C[C@@H](O[C@H](C)[C@H]1O)OC=1C=C2C=C3C[C@H]([C@@H](C(=O)C3=C(O)C2=C(O)C=1C)O[C@@H]1O[C@H](C)[C@@H](O)[C@H](O[C@@H]2O[C@H](C)[C@H](O)[C@H](O[C@@H]3O[C@H](C)[C@@H](O)[C@@](C)(O)C3)C2)C1)[C@H](OC)C(=O)[C@@H](O)[C@@H](C)O)[C@H]1C[C@@H](O)[C@H](O)[C@@H](C)O1 CFCUWKMKBJTWLW-BKHRDMLASA-N 0.000 description 1
- 238000010369 molecular cloning Methods 0.000 description 1
- 230000000877 morphologic effect Effects 0.000 description 1
- 238000000491 multivariate analysis Methods 0.000 description 1
- UPBAOYRENQEPJO-UHFFFAOYSA-N n-[5-[[5-[(3-amino-3-iminopropyl)carbamoyl]-1-methylpyrrol-3-yl]carbamoyl]-1-methylpyrrol-3-yl]-4-formamido-1-methylpyrrole-2-carboxamide Chemical compound CN1C=C(NC=O)C=C1C(=O)NC1=CN(C)C(C(=O)NC2=CN(C)C(C(=O)NCCC(N)=N)=C2)=C1 UPBAOYRENQEPJO-UHFFFAOYSA-N 0.000 description 1
- 239000002105 nanoparticle Substances 0.000 description 1
- 125000001624 naphthyl group Chemical group 0.000 description 1
- 230000009826 neoplastic cell growth Effects 0.000 description 1
- 229940080607 nexavar Drugs 0.000 description 1
- 229920001220 nitrocellulos Polymers 0.000 description 1
- 201000008494 nodular goiter Diseases 0.000 description 1
- 108091027963 non-coding RNA Proteins 0.000 description 1
- 102000042567 non-coding RNA Human genes 0.000 description 1
- 238000003499 nucleic acid array Methods 0.000 description 1
- 229920001778 nylon Polymers 0.000 description 1
- 239000003921 oil Substances 0.000 description 1
- 239000002674 ointment Substances 0.000 description 1
- 239000002751 oligonucleotide probe Substances 0.000 description 1
- 230000000177 oncogenetic effect Effects 0.000 description 1
- 230000003287 optical effect Effects 0.000 description 1
- 238000011369 optimal treatment Methods 0.000 description 1
- 230000008520 organization Effects 0.000 description 1
- 230000002018 overexpression Effects 0.000 description 1
- 238000012261 overproduction Methods 0.000 description 1
- JJVOROULKOMTKG-UHFFFAOYSA-N oxidized Photinus luciferin Chemical compound S1C2=CC(O)=CC=C2N=C1C1=NC(=O)CS1 JJVOROULKOMTKG-UHFFFAOYSA-N 0.000 description 1
- 125000004043 oxo group Chemical group O=* 0.000 description 1
- 230000036961 partial effect Effects 0.000 description 1
- 230000001575 pathological effect Effects 0.000 description 1
- 239000013610 patient sample Substances 0.000 description 1
- 125000003933 pentacenyl group Chemical group C1(=CC=CC2=CC3=CC4=CC5=CC=CC=C5C=C4C=C3C=C12)* 0.000 description 1
- XDRYMKDFEDOLFX-UHFFFAOYSA-N pentamidine Chemical compound C1=CC(C(=N)N)=CC=C1OCCCCCOC1=CC=C(C(N)=N)C=C1 XDRYMKDFEDOLFX-UHFFFAOYSA-N 0.000 description 1
- 229960004448 pentamidine Drugs 0.000 description 1
- HMFHBZSHGGEWLO-UHFFFAOYSA-N pentofuranose Chemical group OCC1OC(O)C(O)C1O HMFHBZSHGGEWLO-UHFFFAOYSA-N 0.000 description 1
- 239000008177 pharmaceutical agent Substances 0.000 description 1
- PTMHPRAIXMAOOB-UHFFFAOYSA-L phosphoramidate Chemical compound NP([O-])([O-])=O PTMHPRAIXMAOOB-UHFFFAOYSA-L 0.000 description 1
- 150000008298 phosphoramidates Chemical class 0.000 description 1
- 238000000206 photolithography Methods 0.000 description 1
- INAAIJLSXJJHOZ-UHFFFAOYSA-N pibenzimol Chemical compound C1CN(C)CCN1C1=CC=C(N=C(N2)C=3C=C4NC(=NC4=CC=3)C=3C=CC(O)=CC=3)C2=C1 INAAIJLSXJJHOZ-UHFFFAOYSA-N 0.000 description 1
- 239000006187 pill Substances 0.000 description 1
- 210000003635 pituitary gland Anatomy 0.000 description 1
- 239000013600 plasmid vector Substances 0.000 description 1
- 229960003171 plicamycin Drugs 0.000 description 1
- 229920003229 poly(methyl methacrylate) Polymers 0.000 description 1
- 230000008488 polyadenylation Effects 0.000 description 1
- 229920001748 polybutylene Polymers 0.000 description 1
- 229920000573 polyethylene Polymers 0.000 description 1
- 229920001223 polyethylene glycol Polymers 0.000 description 1
- 229920000642 polymer Polymers 0.000 description 1
- 102000040430 polynucleotide Human genes 0.000 description 1
- 108091033319 polynucleotide Proteins 0.000 description 1
- 239000002157 polynucleotide Substances 0.000 description 1
- 229920001155 polypropylene Polymers 0.000 description 1
- 229920001282 polysaccharide Polymers 0.000 description 1
- 239000005017 polysaccharide Substances 0.000 description 1
- 229920002223 polystyrene Polymers 0.000 description 1
- 229920002635 polyurethane Polymers 0.000 description 1
- 239000004814 polyurethane Substances 0.000 description 1
- 229920000036 polyvinylpyrrolidone Polymers 0.000 description 1
- 239000001267 polyvinylpyrrolidone Substances 0.000 description 1
- 235000013855 polyvinylpyrrolidone Nutrition 0.000 description 1
- 239000002244 precipitate Substances 0.000 description 1
- 238000010882 preoperative diagnosis Methods 0.000 description 1
- 230000002335 preservative effect Effects 0.000 description 1
- 125000002924 primary amino group Chemical group [H]N([H])* 0.000 description 1
- 238000007639 printing Methods 0.000 description 1
- 108090000765 processed proteins & peptides Proteins 0.000 description 1
- 230000001737 promoting effect Effects 0.000 description 1
- MWWATHDPGQKSAR-UHFFFAOYSA-N propyne Chemical compound CC#C MWWATHDPGQKSAR-UHFFFAOYSA-N 0.000 description 1
- 125000001725 pyrenyl group Chemical group 0.000 description 1
- 125000000168 pyrrolyl group Chemical group 0.000 description 1
- UOWVMDUEMSNCAV-WYENRQIDSA-N rachelmycin Chemical compound C1([C@]23C[C@@H]2CN1C(=O)C=1NC=2C(OC)=C(O)C4=C(C=2C=1)CCN4C(=O)C1=CC=2C=4CCN(C=4C(O)=C(C=2N1)OC)C(N)=O)=CC(=O)C1=C3C(C)=CN1 UOWVMDUEMSNCAV-WYENRQIDSA-N 0.000 description 1
- 230000002285 radioactive effect Effects 0.000 description 1
- 239000012857 radioactive material Substances 0.000 description 1
- 239000000941 radioactive substance Substances 0.000 description 1
- 238000007637 random forest analysis Methods 0.000 description 1
- 239000011535 reaction buffer Substances 0.000 description 1
- 239000000018 receptor agonist Substances 0.000 description 1
- 229940044601 receptor agonist Drugs 0.000 description 1
- 238000010188 recombinant method Methods 0.000 description 1
- 238000011084 recovery Methods 0.000 description 1
- 238000009256 replacement therapy Methods 0.000 description 1
- 238000011160 research Methods 0.000 description 1
- 229920005989 resin Polymers 0.000 description 1
- 239000011347 resin Substances 0.000 description 1
- 238000012340 reverse transcriptase PCR Methods 0.000 description 1
- 238000003757 reverse transcription PCR Methods 0.000 description 1
- 150000003290 ribose derivatives Chemical group 0.000 description 1
- DWRXFEITVBNRMK-JXOAFFINSA-N ribothymidine Chemical compound O=C1NC(=O)C(C)=CN1[C@H]1[C@H](O)[C@H](O)[C@@H](CO)O1 DWRXFEITVBNRMK-JXOAFFINSA-N 0.000 description 1
- 239000012266 salt solution Substances 0.000 description 1
- 238000005070 sampling Methods 0.000 description 1
- 210000002966 serum Anatomy 0.000 description 1
- 230000019491 signal transduction Effects 0.000 description 1
- RMAQACBXLXPBSY-UHFFFAOYSA-N silicic acid Chemical compound O[Si](O)(O)O RMAQACBXLXPBSY-UHFFFAOYSA-N 0.000 description 1
- 229910052710 silicon Inorganic materials 0.000 description 1
- 239000010703 silicon Substances 0.000 description 1
- 150000003376 silicon Chemical class 0.000 description 1
- 235000012239 silicon dioxide Nutrition 0.000 description 1
- 150000003384 small molecules Chemical class 0.000 description 1
- 108091029842 small nuclear ribonucleic acid Proteins 0.000 description 1
- 239000007790 solid phase Substances 0.000 description 1
- 241000894007 species Species 0.000 description 1
- 238000009987 spinning Methods 0.000 description 1
- 239000007921 spray Substances 0.000 description 1
- 206010041823 squamous cell carcinoma Diseases 0.000 description 1
- 238000010186 staining Methods 0.000 description 1
- 108010042747 stallimycin Proteins 0.000 description 1
- 238000010561 standard procedure Methods 0.000 description 1
- 235000019698 starch Nutrition 0.000 description 1
- 238000000528 statistical test Methods 0.000 description 1
- 210000002536 stromal cell Anatomy 0.000 description 1
- 150000008163 sugars Chemical class 0.000 description 1
- 239000006228 supernatant Substances 0.000 description 1
- 239000000829 suppository Substances 0.000 description 1
- 238000011477 surgical intervention Methods 0.000 description 1
- 238000001308 synthesis method Methods 0.000 description 1
- 230000009885 systemic effect Effects 0.000 description 1
- 239000003826 tablet Substances 0.000 description 1
- 239000000454 talc Substances 0.000 description 1
- 229910052623 talc Inorganic materials 0.000 description 1
- 235000012222 talc Nutrition 0.000 description 1
- ISXSCDLOGDJUNJ-UHFFFAOYSA-N tert-butyl prop-2-enoate Chemical compound CC(C)(C)OC(=O)C=C ISXSCDLOGDJUNJ-UHFFFAOYSA-N 0.000 description 1
- 125000001935 tetracenyl group Chemical group C1(=CC=CC2=CC3=CC4=CC=CC=C4C=C3C=C12)* 0.000 description 1
- 229940126585 therapeutic drug Drugs 0.000 description 1
- 125000003396 thiol group Chemical group [H]S* 0.000 description 1
- RYYWUUFWQRZTIU-UHFFFAOYSA-K thiophosphate Chemical compound [O-]P([O-])([O-])=S RYYWUUFWQRZTIU-UHFFFAOYSA-K 0.000 description 1
- 208000013077 thyroid gland carcinoma Diseases 0.000 description 1
- 208000021510 thyroid gland disease Diseases 0.000 description 1
- 230000003867 tiredness Effects 0.000 description 1
- 208000016255 tiredness Diseases 0.000 description 1
- 230000001131 transforming effect Effects 0.000 description 1
- 238000013519 translation Methods 0.000 description 1
- 235000011178 triphosphate Nutrition 0.000 description 1
- 239000001226 triphosphate Substances 0.000 description 1
- 125000002264 triphosphate group Chemical class [H]OP(=O)(O[H])OP(=O)(O[H])OP(=O)(O[H])O* 0.000 description 1
- HDZZVAMISRMYHH-KCGFPETGSA-N tubercidin Chemical compound C1=CC=2C(N)=NC=NC=2N1[C@@H]1O[C@H](CO)[C@@H](O)[C@H]1O HDZZVAMISRMYHH-KCGFPETGSA-N 0.000 description 1
- 230000005751 tumor progression Effects 0.000 description 1
- 238000011311 validation assay Methods 0.000 description 1
- 229910052720 vanadium Inorganic materials 0.000 description 1
- 238000012800 visualization Methods 0.000 description 1
- 238000005406 washing Methods 0.000 description 1
- 230000004584 weight gain Effects 0.000 description 1
- 235000019786 weight gain Nutrition 0.000 description 1
- 230000004580 weight loss Effects 0.000 description 1
- 230000036642 wellbeing Effects 0.000 description 1
Classifications
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q1/00—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
- C12Q1/68—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
- C12Q1/6876—Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
- C12Q1/6883—Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material
- C12Q1/6886—Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material for cancer
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B20/00—ICT specially adapted for functional genomics or proteomics, e.g. genotype-phenotype associations
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B20/00—ICT specially adapted for functional genomics or proteomics, e.g. genotype-phenotype associations
- G16B20/20—Allele or variant detection, e.g. single nucleotide polymorphism [SNP] detection
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B25/00—ICT specially adapted for hybridisation; ICT specially adapted for gene or protein expression
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B25/00—ICT specially adapted for hybridisation; ICT specially adapted for gene or protein expression
- G16B25/10—Gene or protein expression profiling; Expression-ratio estimation or normalisation
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q2600/00—Oligonucleotides characterized by their use
- C12Q2600/112—Disease subtyping, staging or classification
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q2600/00—Oligonucleotides characterized by their use
- C12Q2600/158—Expression markers
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q2600/00—Oligonucleotides characterized by their use
- C12Q2600/178—Oligonucleotides characterized by their use miRNA, siRNA or ncRNA
Definitions
- The present invention relates to methods for classification of thyroid tumors. Specifically, the invention relates to microRNA molecules associated with specific thyroid tumors.
- MicroRNAs are an important class of regulatory RNAs, which have a profound impact on a wide array of biological processes. These small (typically 18-24 nucleotides long) non-coding RNA molecules can modulate protein expression patterns by promoting RNA degradation, inhibiting mRNA translation, and also by affecting gene transcription.
- miRs play pivotal roles in diverse processes such as development and differentiation, control of cell proliferation, stress response and metabolism.
- The expression of many miRs was found to be altered in numerous types of human cancer, in some cases suggesting that such alterations may play a causative role in tumor progression.
- The thyroid gland is formed of two main types of cells: the follicular cells and the C, or parafollicular, cells.
- Follicular cells produce thyroid hormones, which are regulators of human metabolism. Overproduction of thyroid hormone (hyperthyroidism) causes rapid or irregular heartbeat, trouble sleeping, nervousness, hunger, weight loss, and a feeling of being too warm. Conversely, hypothyroidism causes a metabolic slowdown, tiredness, and weight gain. Thyroid hormone release is regulated by thyroid-stimulating hormone (TSH), produced by the pituitary gland.
- TSH thyroid-stimulating hormone
- The C cells produce calcitonin, a hormone involved in the body's use of calcium. Lymphocytes and stromal cells are also found in the thyroid.
- Thyroid cancer is the eighth most common cancer in the United States and the most rapidly increasing cancer in the US, with more than 60,000 new cases diagnosed every year and about 1,800 deaths in 2014. Thyroid cancer usually presents as a palpable thyroid nodule. Different types of thyroid tumors develop from different cell types, which determines their severity and the optimal treatment. Most growths and tumors in the thyroid gland are benign (non-cancerous), but others are malignant (cancerous).
- DTC differentiated thyroid carcinomas
- PTC papillary thyroid carcinoma
- FTC follicular thyroid carcinoma
- FNA fine-needle aspiration
- U.S. Pat. No. 7,319,011 describes measuring the expression of any one of the genes DDIT3, ARG2, ITM1, C1orf24, TARSH, and ACO1 in a test follicular thyroid specimen for distinguishing follicular adenoma (FA) from follicular carcinoma (FC).
- FA follicular adenoma
- FC follicular carcinoma
- U.S. Pat. No. 7,670,775 describes the analysis of the expression of CCND2, PCSK2, and PLAB for identifying malignant thyroid tissue.
- U.S. Pat. No. 6,723,506 describes the molecular characterization of PAX8-PPAR1 molecules in connection with diagnosis and treatment of thyroid follicular carcinomas.
- U.S. Pat. No. 7,378,233 describes the occurrence of the T1796A mutation of the BRAF gene in 24 (69%) of papillary thyroid carcinomas.
- The present invention provides a method of classifying a thyroid lesion sample as malignant or benign, the method comprising:
- a. providing RNA extracted from a thyroid lesion sample obtained from a human subject
- step (b) or (c) further comprising a step of obtaining the ratio between the expression levels of at least one pair of nucleic acids; and wherein in step (d) said classifier algorithm may be applied to any one of the nucleic acid expression profile, said ratio of at least one pair of nucleic acids, or to a combination thereof.
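The ratio step can be illustrated with a short sketch. It assumes expression values are held on a log-like scale (such as a Ct-derived value), so that a ratio between two microRNAs becomes a difference between their values. The function name, the dictionary layout and the numeric values are hypothetical and are not taken from the specification.

```python
from typing import Dict, List, Tuple


def ratio_features(profile: Dict[str, float],
                   pairs: List[Tuple[str, str]]) -> Dict[str, float]:
    """Return one feature per (numerator, denominator) microRNA pair.

    Assumption: values are on a log-like scale, so the ratio of two
    expression levels is expressed as the difference of their values.
    """
    features = {}
    for numerator, denominator in pairs:
        features[f"{numerator}:{denominator}"] = profile[numerator] - profile[denominator]
    return features


# Invented values for one sample (not data from the patent).
profile = {
    "hsa-miR-146b-5p": 24.0,
    "hsa-miR-342-3p": 21.5,
    "hsa-miR-31-5p": 18.0,
}
pairs = [
    ("hsa-miR-146b-5p", "hsa-miR-342-3p"),
    ("hsa-miR-31-5p", "hsa-miR-342-3p"),
]

features = ratio_features(profile, pairs)

# A classifier may be given the profile, the ratios, or both combined.
combined = {**profile, **features}
```

Keeping the ratio features in the same dictionary as the individual expression levels makes it straightforward to pass either set, or their combination, to a downstream classifier.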
- said thyroid lesion sample is obtained by fine needle aspiration (FNA) biopsy.
- said sample is a smear from a FNA biopsy.
- said algorithm is a machine-learning algorithm.
- said algorithm further combines the microRNA expression profile with clinical or genetic data from said sample.
- following step (b), if at least one of said microRNA expression levels, or the ratio between the expression levels of at least one pair of microRNAs, is below or above a threshold for thyroid cells, said sample is discarded based on the expression level of said microRNA.
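A minimal sketch of such a threshold check follows. The marker names, the bounds and the sample values are hypothetical placeholders; the text above does not state which microRNAs or thresholds are used, so none are implied here.

```python
from typing import Dict, Optional, Tuple

# Each limit is (lower bound, upper bound); None means "no bound on that side".
Limits = Dict[str, Tuple[Optional[float], Optional[float]]]


def passes_threshold_check(profile: Dict[str, float], limits: Limits) -> bool:
    """Return False if any monitored value falls outside its allowed range."""
    for name, (low, high) in limits.items():
        value = profile[name]
        if low is not None and value < low:
            return False
        if high is not None and value > high:
            return False
    return True


# Hypothetical limits for one expression level and one ratio feature.
limits: Limits = {
    "marker_miR": (15.0, None),           # discard if below 15.0
    "marker_miR:other_miR": (None, 6.0),  # discard if above 6.0
}

kept_sample = {"marker_miR": 19.2, "marker_miR:other_miR": 2.1}
discarded_sample = {"marker_miR": 11.4, "marker_miR:other_miR": 2.1}

kept = passes_threshold_check(kept_sample, limits)
discarded = not passes_threshold_check(discarded_sample, limits)
```

Running the check before classification means a sample that does not look like thyroid material is never scored as malignant or benign.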
- said measuring is performed by hybridization, amplification or next generation sequencing method.
- the expression of microRNAs comprising SEQ ID NOS. 1-13, 17-22, 25, 26, 36 and 37 is determined.
- said hybridization comprises contacting the sample with probes, wherein the probes comprise (i) DNA equivalents of the microRNAs, (ii) the complements thereof, (iii) sequences at least 80% identical to (i) or (ii) or (iv) a nucleic acid sequence that hybridizes with at least eight contiguous nucleotides of any one of SEQ ID NOs 1-37.
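The probe options listed above can be illustrated with simple string operations. The sketch below is a simplification under stated assumptions: identity is computed position by position on equal-length sequences without alignment, and perfect Watson-Crick complementarity over a window stands in for hybridization. The RNA string is an arbitrary illustrative sequence, not one of SEQ ID NOs 1-37, and all function names are hypothetical.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def dna_equivalent(mirna: str) -> str:
    """DNA equivalent of an RNA sequence: U is replaced by T."""
    return mirna.upper().replace("U", "T")


def reverse_complement(dna: str) -> str:
    """Complementary strand, written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(dna))


def percent_identity(a: str, b: str) -> float:
    """Ungapped identity between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("this simplified check needs equal-length sequences")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def has_complementary_run(probe: str, target_dna: str, k: int = 8) -> bool:
    """True if the probe perfectly complements any k contiguous target bases."""
    for start in range(len(target_dna) - k + 1):
        window = target_dna[start:start + k]
        if reverse_complement(window) in probe:
            return True
    return False


mirna = "AUGGCUAACGUUAGCCUAGGAU"     # arbitrary example, 22 nucleotides
dna = dna_equivalent(mirna)          # option (i)
probe = reverse_complement(dna)      # option (ii)
variant = "CG" + dna[2:]             # two mismatches, a candidate for option (iii)

identity = percent_identity(dna, variant)
binds = has_complementary_run(probe, dna)
```

A real design workflow would use an alignment tool and melting-temperature estimates rather than exact string matching; the sketch only makes the four categories concrete.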
- said probes are attached to a solid substrate.
- said amplification is real-time polymerase chain reaction (RT-PCR), said RT-PCR amplification method comprising forward and reverse primers, and optionally further comprising hybridization with a probe.
- RT-PCR real-time polymerase chain reaction
- Amplification by RT-PCR comprises contacting the RNA with forward and reverse primers for each of the miRNAs, wherein each forward primer comprises 15-21 nucleotides identical to one of the miRNAs.
- Forward primers are as defined in Table 8.
- said method further comprises the step of administering a differential treatment to said subject if said thyroid lesion is benign or malignant.
- said lesion is malignant and said treatment is any one of surgery, chemotherapy, radiotherapy, hormone therapy, or any other recommended treatment.
- said probe is a general probe. In another embodiment said probe is a microRNA sequence-specific probe.
- The present invention provides an isolated nucleic acid, said nucleic acid comprising at least 12 contiguous nucleotides at least 80% identical to the sequence of any one of SEQ ID NOs. 27-29, 33, 34, 139, 140, 307 and 308.
- The present invention provides a pharmaceutical composition comprising as active agent the isolated nucleic acids described herein, and optionally adjuvants, carriers, diluents and excipients.
- said nucleic acid molecules may be comprised as an active agent in a pharmaceutical composition, a formulation or a medicament.
- The present invention provides a vector comprising the isolated nucleic acid described herein.
- The present invention provides a probe comprising the isolated nucleic acid described herein.
- The present invention provides a biochip comprising the isolated nucleic acid described herein.
- The present invention provides the use of an isolated nucleic acid as described herein in the preparation of a medicament.
- FIGS. 1A-1B Novel microRNAs detected by next generation sequencing.
- FIG. 1A shows the predicted secondary structure of two novel microRNAs, MD2-495 (top) and MD2-437 (bottom) detected in thyroid tissue.
- FIG. 1B shows the expression of the two novel microRNAs in each one of 11 resected thyroid samples.
- FIGS. 2A-2B Hurthle cell marker.
- the dashed factor line ⁇ 0.6.
- FIG. 3 Profiling of malignant and benign samples with Thyroid assay set of microRNAs.
- The numbers refer to (50 - normalized Ct value).
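The displayed scale can be sketched in a few lines. Because a lower Ct means the target was detected after fewer amplification cycles, subtracting the normalized Ct from 50 gives a number that rises with expression. The function name and the example values are hypothetical, and the normalization itself is assumed to have been done already.

```python
def display_value(normalized_ct: float, offset: float = 50.0) -> float:
    """Convert a normalized Ct value to the (50 - normalized Ct) scale."""
    return offset - normalized_ct


# Invented normalized Ct values for two microRNAs in one sample.
abundant = display_value(28.5)
scarce = display_value(33.0)
```

On this scale the more abundant microRNA receives the larger number, which makes plots of the assay easier to read.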
- FIGS. 4A-4C A Discriminant Analysis classifier was used to classify samples from the malignant+benign cohort as malignant (diamonds, M) or benign (squares, B).
- FIG. 4A Classifier with two microRNAs (hsa-miR-551b-3p and hsa-miR-146b-5p), presented sensitivity of 84.8% and specificity of 68.9%. The grey shaded area marks the space in which a sample is classified as malignant, as determined by the classifier.
- FIG. 4C Classifier with eight microRNAs (hsa-miR-551b-3p; hsa-miR-146b-5p; hsa-miR-31-5p; hsa-miR-222-3p; hsa-miR-375; hsa-miR-125b-5p; hsa-miR-152-3p; hsa-miR-181c-5p), presenting sensitivity of 83.5% and specificity of 81.5%.
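To make the terms in these legends concrete, the following sketch fits a two-feature discriminant and computes sensitivity and specificity. It assumes a linear discriminant with a pooled covariance and equal priors, which is one common form of discriminant analysis and may differ from the classifier actually used. The data points are invented and do not reproduce the figures; all names are hypothetical.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]
Model = Tuple[float, float, float]


def centroid(points: Sequence[Point]) -> Point:
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)


def fit_lda(malignant: Sequence[Point], benign: Sequence[Point]) -> Model:
    """Fit a two-feature linear discriminant (pooled covariance, equal priors)."""
    m_mal, m_ben = centroid(malignant), centroid(benign)
    sxx = sxy = syy = 0.0
    for points, (mx, my) in ((malignant, m_mal), (benign, m_ben)):
        for x, y in points:
            sxx += (x - mx) ** 2
            sxy += (x - mx) * (y - my)
            syy += (y - my) ** 2
    dof = len(malignant) + len(benign) - 2
    sxx, sxy, syy = sxx / dof, sxy / dof, syy / dof
    det = sxx * syy - sxy * sxy
    if det == 0:
        raise ValueError("pooled covariance is singular")
    # Weights are the inverse covariance applied to the difference of centroids.
    dx, dy = m_mal[0] - m_ben[0], m_mal[1] - m_ben[1]
    wx = (syy * dx - sxy * dy) / det
    wy = (sxx * dy - sxy * dx) / det
    # With equal priors the boundary passes through the midpoint of the centroids.
    mid_x, mid_y = (m_mal[0] + m_ben[0]) / 2, (m_mal[1] + m_ben[1]) / 2
    bias = -(wx * mid_x + wy * mid_y)
    return wx, wy, bias


def predict(model: Model, point: Point) -> str:
    wx, wy, bias = model
    return "M" if wx * point[0] + wy * point[1] + bias > 0 else "B"


def sensitivity_specificity(truth: List[str], predicted: List[str]) -> Tuple[float, float]:
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    tp = sum(t == "M" and p == "M" for t, p in zip(truth, predicted))
    fn = sum(t == "M" and p == "B" for t, p in zip(truth, predicted))
    tn = sum(t == "B" and p == "B" for t, p in zip(truth, predicted))
    fp = sum(t == "B" and p == "M" for t, p in zip(truth, predicted))
    return tp / (tp + fn), tn / (tn + fp)


# Invented two-feature samples (two expression values per sample).
malignant = [(24.0, 18.0), (25.0, 17.0), (23.0, 19.0), (26.0, 18.0)]
benign = [(20.0, 22.0), (19.0, 21.0), (21.0, 23.0), (20.0, 20.0)]

model = fit_lda(malignant, benign)
truth = ["M"] * len(malignant) + ["B"] * len(benign)
predicted = [predict(model, p) for p in malignant + benign]
sens, spec = sensitivity_specificity(truth, predicted)
```

The half-plane where `predict` returns "M" corresponds to the kind of shaded region described in the legends. The toy data are cleanly separable, so the perfect scores here say nothing about real performance, which would be estimated on held-out samples.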
- FIGS. 5A-5C A Discriminant Analysis classifier was used to classify samples from the malignant+benign cohort as malignant (diamonds, M) or benign (squares, B).
- FIG. 5A Classifier with two microRNA ratios (hsa-miR-146b-5p:hsa-miR-342-3p and hsa-miR-31-5p:hsa-miR-342-3p), presented sensitivity of 78% and specificity of 79.5%.
- the grey shaded area marks the space in which a sample is classified as malignant, as determined by the classifier.
- FIGS. 6A-6C A Discriminant Analysis classifier was used to classify samples from the malignant+benign cohort as malignant (diamonds, M) or benign (squares, B), based on a combination of microRNAs and microRNA ratios.
- FIG. 6A Classifier of one microRNA ratio and one microRNA (hsa-miR-146b-5p:hsa-miR-342-3p; hsa-miR-551b-3p) presented sensitivity of 82.9% and specificity of 82.8%.
- the grey shaded area marks the space in which a sample is classified as malignant, as determined by the classifier.
- FIGS. 7A-7C A K-nearest neighbor (KNN) classifier was used to classify samples from the malignant+benign cohort as malignant (M) or benign (B).
- FIG. 7A Classifier using six microRNAs (hsa-miR-551b-3p; hsa-miR-146b-5p; hsa-miR-31-5p; hsa-miR-222-3p; hsa-miR-375; hsa-miR-125b-5p) is shown as a confusion matrix where the x-axis shows the classifier answer (Clas.
- FIG. 7B Classifier of eight microRNAs (hsa-miR-551b-3p; hsa-miR-146b-5p; hsa-miR-31-5p; hsa-miR-222-3p; hsa-miR-375; hsa-miR-125b-5p; hsa-miR-152-3p; hsa-miR-181c-5p) is shown in a confusion matrix where the x-axis shows the classifier answer (Clas.
- FIG. 7C Classifier using twelve microRNAs (hsa-m
- FIGS. 8A-8B A KNN classifier was used to classify samples from the malignant+benign cohort as malignant (M) or benign (B).
- FIG. 8A Classifier using six microRNA ratios (hsa-miR-146b-5p:hsa-miR-342-3p; hsa-miR-31-5p:hsa-miR-342-3p; hsa-miR-125b-5p:hsa-miR-138-5p; hsa-miR-125b-5p:hsa-miR-200c-3p; hsa-miR-222-3p:hsa-miR-486-5p; hsa-miR-200c-3p:hsa-miR-486-5p) as the features for the classification, is shown as a confusion matrix where the x-axis represents the classifier answer (Clas.
- FIG. 8B Classifier using eight microRNA ratios (hsa-miR-146b-5p:hsa-miR-342-3p; hsa-miR-31-5p:hsa-miR-342-3p; hsa-miR-125b-5p:hsa-miR-138-5p; hsa-miR-125b-5p:hsa-miR-200c-3p; hsa-miR-222-3p:hsa-miR-486-5p; hsa-miR-200c-3p:hsa-miR-486-5p; MID-16582:hsa-miR-200c-3p; MID-16582:hsa-miR-138-5p) as the features for the classification,
- FIGS. 9A-9C A KNN classifier was used to classify samples from the malignant+benign cohort as malignant (M) or benign (B).
- FIG. 9A Classifier using four microRNAs and two microRNA ratios (hsa-miR-31-5p; hsa-miR-222-3p; hsa-miR-551b-3p; hsa-miR-146b-5p; hsa-miR-146b-5p:hsa-miR-342-3p; hsa-miR-31-5p:hsa-miR-342-3p) as the features for the classification, is shown as a confusion matrix where the x-axis represents the classifier answer (Clas.
- FIG. 9B Classifier using five microRNAs and three microRNA ratios (hsa-miR-551b-3p; hsa-miR-146b-5p; hsa-miR-375; hsa-miR-222-3p; hsa-miR-146b-5p:hsa-miR-342-3p; hsa-miR-31-5p:hsa-miR-342-3p; hsa-miR-31-5p; hsa-miR-125b-5p:hsa-miR-138-5p) as the features for the classification, is shown as a confusion matrix where the x-axis represents the classifier answer (Clas.
- FIG. 9C Classifier using seven microRNAs and five microRNA ratios (hsa-miR-375; hsa-miR-551b-3p; hsa-miR-146b-5p; hsa-miR-152-3p; hsa-miR-125b-5p; hsa-miR-222-3p; hsa-miR-31-5p; hsa-miR-146b-5p:hsa-miR-342-3p; hsa-miR-31-5p:hsa-miR-342-3p; hsa-miR-31-5p:hsa-miR-342-3p; hsa-miR-125b-5p:hsa-miR-138-5p; hsa-miR-125b-5p:hsa-miR-200c
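The KNN legends above describe results as confusion matrices. The sketch below shows one way such a matrix could be produced. It assumes Euclidean distance, k = 3, majority voting and leave-one-out evaluation, none of which is stated in the legends. The three-feature samples are invented and all names are hypothetical.

```python
import math
from collections import Counter
from typing import Dict, List, Sequence, Tuple

Sample = Tuple[Tuple[float, ...], str]


def knn_predict(train: Sequence[Sample], query: Tuple[float, ...], k: int = 3) -> str:
    """Label a query by majority vote among its k nearest training samples."""
    nearest = sorted(train, key=lambda s: math.dist(s[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def leave_one_out(samples: List[Sample], k: int = 3) -> List[str]:
    """Classify each sample using all the other samples as training data."""
    predictions = []
    for i, (features, _) in enumerate(samples):
        rest = samples[:i] + samples[i + 1:]
        predictions.append(knn_predict(rest, features, k))
    return predictions


def confusion_matrix(truth: List[str], predicted: List[str],
                     labels: Tuple[str, ...] = ("M", "B")) -> Dict[Tuple[str, str], int]:
    """Counts keyed by (true label, classifier answer)."""
    counts = {(t, p): 0 for t in labels for p in labels}
    for t, p in zip(truth, predicted):
        counts[(t, p)] += 1
    return counts


# Invented samples: three feature values each (expression levels or ratios).
samples: List[Sample] = [
    ((24.0, 18.0, 10.0), "M"), ((25.0, 17.0, 11.0), "M"),
    ((23.0, 19.0, 10.0), "M"), ((26.0, 18.0, 12.0), "M"),
    ((20.0, 22.0, 14.0), "B"), ((19.0, 21.0, 15.0), "B"),
    ((21.0, 23.0, 14.0), "B"), ((20.0, 20.0, 13.0), "B"),
]

truth = [label for _, label in samples]
predicted = leave_one_out(samples, k=3)
matrix = confusion_matrix(truth, predicted)
```

An odd k avoids tied votes when there are only two classes. The same functions accept any number of features, so a feature tuple may mix individual microRNA values with ratio values.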
- FIGS. 10A-10C A Discriminant Analysis classifier was used to classify samples from the Indeterminate sub-cohort as malignant (diamonds, M) or benign (squares, B).
- FIG. 10A A classifier using two microRNAs (hsa-miR-146b-5p; hsa-miR-551b-3p) showed sensitivity of 80% and specificity of 56.3%. The grey shaded area marks the space in which a sample is classified as malignant, as determined by the classifier.
- FIG. 10B A classifier of three microRNAs (hsa-miR-146b-5p; hsa-miR-551b-3p; hsa-miR-222-3p) showed sensitivity of 82.6% and specificity of 59.5%. Misclassified samples (miscl.) are represented by a dot.
- FIGS. 11A-11C A Discriminant Analysis classifier was used to classify samples from the Indeterminate sub-cohort as malignant (diamonds, M) or benign (squares, B).
- FIG. 11A A classifier using two microRNA ratios (hsa-miR-146b-5p:hsa-miR-342-3p; hsa-miR-31-5p:hsa-miR-342-3p) as the features for the classification presented sensitivity of 80% and specificity of 72.2%.
- the grey shaded area marks the space in which a sample is classified as malignant.
- FIG. 11B A classifier using three microRNA ratios (hsa-miR-146b-5p:hsa-miR-342-3p; hsa-miR-31-5p:hsa-miR-342-3p; hsa-miR-125b-5p:hsa-miR-200c-3p) as the features for the classification presented sensitivity of 80% and specificity of 69%. Misclassified samples (miscl.) are represented by a dot.
- FIGS. 12A-12C A Discriminant Analysis classifier was used to classify samples from the Indeterminate sub-cohort as malignant (diamonds, M) or benign (squares, B), using a combination of microRNAs and microRNA ratios.
- FIG. 12A A classifier using one microRNA and one microRNA ratio (hsa-miR-146b-5p; hsa-miR-146b-5p:hsa-miR-342-3p) as the features for the classification presented sensitivity of 80% and specificity of 73.8%.
- the grey shaded area marks the space in which a sample is classified as malignant, as determined by the classifier.
- FIG. 12B A classifier using two microRNAs and one microRNA ratio (hsa-miR-146b-5p; hsa-miR-551b-3p; hsa-miR-146b-5p:hsa-miR-342-3p) as the features for the classification presented sensitivity of 79.1% and specificity of 73%.
- FIG. 13A-13C A KNN classifier was used to classify samples from the Indeterminate sub-cohort as malignant (M) or benign (B)
- FIG. 13A A classifier of six microRNAs (hsa-miR-146b-5p; hsa-miR-551b-3p; hsa-miR-222-3p; hsa-miR-125b-5p; hsa-miR-31-5p; hsa-miR-375) is shown as a confusion matrix where the x-axis shows the classifier answer (Clas.
- FIG. 13B A classifier using eight microRNAs (hsa-miR-146b-5p; hsa-miR-551b-3p; hsa-miR-222-3p; hsa-miR-125b-5p; hsa-miR-31-5p; hsa-miR-375; hsa-miR-152-3p; hsa-miR-181c-5p) as the features for classification is shown as a confusion matrix where the x-axis shows the classifier answer (Clas.
- FIG. 13C A classifier using 12 microRNAs (hsa-miR-551b-3p; hsa-miR-146b-5p; hsa-miR-222-3p; hsa-miR-125b-5p; hsa-miR-31-5p; hsa-miR-375; hsa-miR-152-3p; hsa-miR-181c-5p; hsa-miR-424-3p; hsa-miR-486-5p; hsa-miR-200c-3p; hsa-miR-346) as the features for the classification is shown as a confusion matrix where the x-axis shows the classifier answer (Clas. Ans.), while the y-axis shows the
- FIG. 14A-14B A KNN classifier was used to classify samples from the Indeterminate sub-cohort as malignant (M) or benign (B), using microRNA ratios.
- FIG. 14A A classifier using six microRNA ratios (hsa-miR-146b-5p:hsa-miR-342-3p; hsa-miR-31-5p:hsa-miR-342-3p; hsa-miR-125b-5p:hsa-miR-200c-3p; hsa-miR-125b-5p:hsa-miR-138-5p; hsa-miR-222-3p:hsa-miR-486-5p; MID-16582:hsa-miR-200c-3p) as the features for classification is shown as a confusion matrix where the x-axis shows the classifier answer (Clas.
- FIG. 14B A classifier using eight microRNA ratios (hsa-miR-146b-5p:hsa-miR-342-3p; hsa-miR-31-5p:hsa-miR-342-3p; hsa-miR-125b-5p:hsa-miR-200c-3p; hsa-miR-125b-5p:hsa-miR-138-5p; hsa-miR-222-3p:hsa-miR-486-5p; MID-16582:hsa-miR-200c-3p; MID-16582:hsa-miR-138-5p; hsa-miR-200c-3p:hsa-miR-486-5p) as the features for classification is
- FIG. 15A-15C A KNN classifier was used to classify samples from the Indeterminate sub-cohort as malignant (M) or benign (B) using microRNAs and microRNA ratios.
- FIG. 15A A classifier using three microRNAs and three microRNA ratios (hsa-miR-146b-5p; hsa-miR-551b-3p; hsa-miR-222-3p; hsa-miR-146b-5p:hsa-miR-342-3p; hsa-miR-31-5p:hsa-miR-342-3p; hsa-miR-125b-5p:hsa-miR-200c-3p) as the features for classification is shown as a confusion matrix where the x-axis shows the classifier answer (Clas.
- FIG. 15B A classifier using five microRNAs and three microRNA ratios (hsa-miR-125b-5p; hsa-miR-31-5p; hsa-miR-551b-3p; hsa-miR-146b-5p; hsa-miR-222-3p; hsa-miR-146b-5p:hsa-miR-342-3p; hsa-miR-31-5p:hsa-miR-342-3p; hsa-miR-125b-5p:hsa-miR-200c-3p) as the features for classification is shown as a confusion matrix where the x-axis shows the classifier answer (Clas. Ans.), while the y-axis shows the true diagnosis (
- FIG. 15C A classifier using seven microRNAs and five microRNA ratios (hsa-miR-146b-5p; hsa-miR-551b-3p; hsa-miR-222-3p; hsa-miR-125b-5p; hsa-miR-31-5p; hsa-miR-375; hsa-miR-152-3p; hsa-miR-146b-5p:hsa-miR-342-3p; hsa-miR-31-5p:hsa-miR-342-3p; hsa-miR-125b-5p:hsa-miR-200c-3p; hsa-miR-125b-5p:hsa-miR-138-5p; hsa-miR-222-3p:hsa-miR-486-5p;) as the features for classification is shown as a confusion matrix where
- FIG. 16A-16C A SVM classifier was used to classify samples from the Indeterminate sub-cohort as malignant (diamonds, M) or benign (squares, B)
- FIG. 16A A classifier using three microRNAs (hsa-miR-146b-5p; hsa-miR-551b-3p; hsa-miR-222-3p) as the features for classification presented sensitivity of 82.6% and specificity of 54.8%. Misclassified samples (miscl.) are represented by a dot.
- FIG. 17A-17C A SVM classifier was used to classify samples from the Indeterminate sub-cohort as malignant (diamonds, M) or benign (squares, B) using microRNA ratios.
- FIG. 17A A classifier using three microRNA ratios (hsa-miR-146b-5p:hsa-miR-342-3p; hsa-miR-31-5p:hsa-miR-342-3p; hsa-miR-125b-5p:hsa-miR-200c-3p) as the features for the classification presented sensitivity of 81.7% and specificity of 67.5%. Misclassified samples (miscl.) are represented by a dot.
- FIG. 17B A classifier using six microRNA ratios (hsa-miR-146b-5p:hsa-miR-342-3p; hsa-miR-31-5p:hsa-miR-342-3p; hsa-miR-125b-5p:hsa-miR-200c-3p; hsa-miR-125b-5p:hsa-miR-138-5p; hsa-miR-222-3p:hsa-miR-486-5p; MID-16582:hsa-miR-200c-3p) as the features for the classification is shown as a confusion matrix where the x-axis shows the classifier answer (Clas.
- FIG. 17C A classifier using eight microRNA ratios (hsa-miR-146b-5p:hsa-miR-342-3p; hsa-miR-31-5p:hsa-miR-342-3p; hsa-miR-125b-5p:hsa-miR-200c-3p; hsa-miR-125b-5p:hsa-miR-138-5p; hsa-miR-222-3p:hsa-miR-486-5p; MID-16582:hsa-miR-200c-3p; MID-16582:hsa-miR-138-5p; hsa-miR-200c-3p:hsa-miR-486-5p) as the features for
- FIG. 18A-18C A SVM classifier was used to classify samples from the Indeterminate sub-cohort as malignant (diamonds, M) or benign (squares, B) samples using the combination of microRNAs and microRNA ratios.
- FIG. 18A A classifier using two microRNAs and one microRNA ratio (hsa-miR-146b-5p; hsa-miR-551b-3p; hsa-miR-146b-5p:hsa-miR-342-3p) as the features for the classification presented sensitivity of 80% and specificity of 71.4%.
- FIG. 18B A classifier using three microRNAs and three microRNA ratios (hsa-miR-146b-5p; hsa-miR-551b-3p; hsa-miR-222-3p; hsa-miR-146b-5p:hsa-miR-342-3p; hsa-miR-31-5p:hsa-miR-342-3p; hsa-miR-125b-5p:hsa-miR-200c-3p) as the features for the classification is shown as a confusion matrix where the x-axis shows the classifier answer (Clas.
- FIG. 18C A classifier using five microRNAs and three microRNA ratios (hsa-miR-551b-3p; hsa-miR-146b-5p; hsa-miR-222-3p; hsa-miR-125b-5p; hsa-miR-31-5p; hsa-miR-146b-5p:hsa-miR-342-3p; hsa-miR-31-5p:hsa-miR-342-3p; hsa-miR-125b-5p:hsa-miR-200c-3p) as the features for the classification is shown as a confusion matrix where the x-axis shows the classifier answer (Clas. Ans.), while the y-axis shows the true diagnosis (
- FIG. 19A-19C A Discriminant analysis ensemble classifier was used to classify samples from the Indeterminate sub-cohort as malignant (diamonds, M) or benign (squares, B).
- FIG. 19A A classifier using two microRNAs (hsa-miR-146b-5p; hsa-miR-551b-3p) as the features for the classification presented sensitivity of 85.2% and specificity of 45.2%.
- the grey shaded area marks the space in which a sample is classified as malignant, as determined by the classifier.
- FIG. 19B A classifier using three microRNAs (hsa-miR-551b-3p; hsa-miR-146b-5p; hsa-miR-222-3p) as the features for the classification presented sensitivity of 84.3% and specificity of 45.2%. Misclassified samples (miscl.) are represented by a dot.
- FIG. 20A-20C A Discriminant analysis ensemble classifier was used to classify samples from the Indeterminate sub-cohort as malignant (diamonds, M) or benign (squares, B)
- FIG. 20A A classifier using two microRNA ratios (hsa-miR-146b-5p:hsa-miR-342-3p; hsa-miR-31-5p:hsa-miR-342-3p) as the features for the classification presented sensitivity of 86.1% and specificity of 61.1%.
- the grey shaded area marks the space in which a sample is classified as malignant, as determined by the classifier.
- FIG. 20B A classifier using three microRNA ratios (hsa-miR-146b-5p:hsa-miR-342-3p; hsa-miR-31-5p:hsa-miR-342-3p; hsa-miR-125b-5p:hsa-miR-200c-3p) as the features for the classification presented sensitivity of 87% and specificity of 57.1%. Misclassified samples (miscl.) are represented by a dot.
- FIG. 21A-21C A Discriminant analysis ensemble classifier was used to classify samples from the Indeterminate sub-cohort as malignant (diamonds, M) or benign (squares, B) using a combination of microRNAs and microRNA ratios.
- FIG. 21A A classifier using one microRNA and one microRNA ratio (hsa-miR-146b-5p:hsa-miR-342-3p; hsa-miR-146b-5p) as the features for the classification presented sensitivity of 83.5% and specificity of 58.7%.
- the grey shaded area marks the space in which a sample is classified as malignant, as determined by the classifier.
- FIG. 21B A classifier using two microRNAs and one microRNA ratio (hsa-miR-146b-5p; hsa-miR-551b-3p; hsa-miR-146b-5p:hsa-miR-342-3p) as the features for the classification presented sensitivity of 85.2% and specificity of 65.9%. Misclassified samples (miscl.) are represented by a dot.
- FIG. 22A-22C A Discriminant analysis classifier was used to classify samples from the Bethesda IV sub-cohort as malignant (diamonds, M) or benign (squares, B).
- FIG. 22A A classifier using two microRNAs (hsa-miR-125b-5p; hsa-miR-551b-3p) as the features for the classification presented sensitivity of 91.5% and specificity of 42.9%.
- the grey shaded area marks the space in which a sample is classified as malignant, as determined by the classifier.
- FIG. 22B A classifier using three microRNAs (hsa-miR-125b-5p; hsa-miR-551b-3p; hsa-miR-222-3p) as the features for the classification presented sensitivity of 91.5% and specificity of 39.7%. Misclassified samples (miscl.) are represented by a dot.
- FIG. 22C A classifier using eight microRNAs (hsa-miR-125b-5p; hsa-miR-551b-3p; hsa-miR-222-3p; hsa-miR-146b-5p; hsa-miR-375; hsa-miR-181c-5p; hsa-miR-31-5p; hsa-miR-138-5p) as the features for the classification is shown as a confusion matrix where the x-axis shows the classifier answer (Clas. Ans.), while the y-axis shows the true diagnosis (Real class-re.cl), and presented sensitivity of 89.4% and specificity of 47.6%.
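The sensitivity and specificity quoted for the confusion-matrix figures follow directly from the four cell counts of the matrix. The sketch below shows that arithmetic, treating malignant as the positive class. The function name and the counts are assumptions for illustration: the counts are not reported in the text and were chosen only so that the rounded percentages match the 89.4% and 47.6% quoted above.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from confusion-matrix counts.

    Malignant is treated as the positive class:
      tp: malignant samples classified as malignant
      fn: malignant samples classified as benign
      tn: benign samples classified as benign
      fp: benign samples classified as malignant
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity, specificity


# Illustrative counts only (hypothetical, not taken from the text).
sens, spec = sensitivity_specificity(tp=42, fn=5, tn=30, fp=33)
print(f"sensitivity = {sens:.1%}, specificity = {spec:.1%}")
```

Sensitivity depends only on the malignant row of the matrix and specificity only on the benign row, which is why the two can move independently as features are added.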
- FIG. 23A-23C A Discriminant analysis classifier was used to classify the samples from the Bethesda IV sub-cohort as malignant (diamonds, M) or benign (squares, B).
- FIG. 23A Classifier using two microRNA ratios (hsa-miR-125b-5p:hsa-miR-200c-3p; hsa-miR-146b-5p:hsa-miR-342-3p) as the features for the classification presented sensitivity of 89.4% and specificity of 28.6%.
- the grey shaded area marks the space in which a sample is classified as malignant, as determined by the classifier.
- FIG. 24A-24C A Discriminant analysis classifier was used to classify samples from the Bethesda IV sub-cohort as malignant (diamonds, M) or benign (squares, B) using microRNAs and microRNA ratios.
- FIG. 24A Classifier using one microRNA and one microRNA ratio (hsa-miR-125b-5p; hsa-miR-125b-5p:hsa-miR-200c-3p) as the features for the classification presented sensitivity of 93.6% and specificity of 33.3%.
- the grey shaded area marks the space in which a sample is classified as malignant, as determined by the classifier.
- FIG. 25A-25C A KNN classifier was used to classify samples from the Bethesda IV sub-cohort as malignant or benign.
- FIG. 25A Classifier using six microRNAs (hsa-miR-125b-5p; hsa-miR-551b-3p; hsa-miR-222-3p; hsa-miR-146b-5p; hsa-miR-375; hsa-miR-181c-5p) as the features for the classification presented sensitivity of 72.3% and specificity of 39.7%.
- FIG. 25B Classifier using eight microRNAs (hsa-miR-125b-5p; hsa-miR-551b-3p; hsa-miR-222-3p; hsa-miR-146b-5p; hsa-miR-375; hsa-miR-181c-5p; hsa-miR-31-5p; hsa-miR-138-5p) as the features for the classification presented sensitivity of 66% and specificity of 61.9%.
- FIG. 25C Classifier using twelve microRNAs (hsa-miR-125b-5p; hsa-miR-551b-3p; hsa-miR-222-3p; hsa-miR-146b-5p; hsa-miR-375; hsa-miR-181c-5p; hsa-miR-31-5p; hsa-miR-138-5p; hsa-miR-200c-3p; MID-16582; hsa-miR-346; hsa-miR-152-3p) as the features for the classification presented sensitivity of 66% and specificity of 61.9%.
- FIG. 26A-26B A KNN classifier was used to classify samples from the Bethesda IV sub-cohort as malignant or benign.
- FIG. 26A Classifier using six microRNA ratios (hsa-miR-125b-5p:hsa-miR-200c-3p; hsa-miR-146b-5p:hsa-miR-342-3p; hsa-miR-31-5p:hsa-miR-342-3p; MID-16582:hsa-miR-138-5p; hsa-miR-222-3p:hsa-miR-486-5p; MID-16582:hsa-miR-200c-3p) as the features for the classification presented sensitivity of 78.7% and specificity of 61.9%.
- FIG. 26B Classifier using eight microRNA ratios (hsa-miR-125b-5p:hsa-miR-200c-3p; hsa-miR-146b-5p:hsa-miR-342-3p; hsa-miR-31-5p:hsa-miR-342-3p; MID-16582:hsa-miR-138-5p; hsa-miR-222-3p:hsa-miR-486-5p; MID-16582:hsa-miR-200c-3p; hsa-miR-125b-5p:hsa-miR-138-5p; hsa-miR-200c-3p:hsa-miR-486-5p) as the features for the classification presented sensitivity of 80.9% and specificity of 50.8%.
- FIG. 27A-27C A KNN classifier was used to classify samples from the Bethesda IV sub-cohort as malignant or benign using expression of microRNAs and microRNA ratios.
- FIG. 27A Classifier using four microRNAs and two microRNA ratios (hsa-miR-551b-3p; hsa-miR-222-3p; hsa-miR-146b-5p; hsa-miR-125b-5p; hsa-miR-125b-5p:hsa-miR-200c-3p; hsa-miR-146b-5p:hsa-miR-342-3p) as the features for the classification, presented sensitivity of 63.8% and specificity of 46%.
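A K-nearest-neighbours (KNN) classifier of the kind named in these figures assigns a sample the majority label among its closest training samples in feature space. The following is a minimal sketch of that idea and not the classifier used in the figures: the function name, the Euclidean distance, the value of k, and all feature values are assumptions for illustration.

```python
from collections import Counter
import math


def knn_classify(sample, training, k=3):
    """Classify `sample` by majority vote among its k nearest training points.

    training: list of (feature_vector, label) pairs, label "M" or "B".
    Distance is Euclidean; k is assumed odd so a two-class vote cannot tie.
    """
    by_distance = sorted(training, key=lambda item: math.dist(sample, item[0]))
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


# Hypothetical two-feature training data (e.g. one microRNA level and one ratio).
training = [
    ((8.0, 2.5), "M"), ((7.5, 2.0), "M"), ((8.2, 3.0), "M"),
    ((4.0, -1.0), "B"), ((4.5, -0.5), "B"), ((3.8, -1.2), "B"),
]
print(knn_classify((7.8, 2.2), training))   # the three nearest points are all "M"
print(knn_classify((4.2, -0.8), training))  # the three nearest points are all "B"
```

Because KNN has no explicit decision boundary, its results are naturally reported as a confusion matrix rather than as a shaded region in a scatter plot.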
- FIG. 28 The normalized expression (Exp.) levels of hsa-miR-146b-5p are shown as a dot plot for Bethesda IV non-medullary malignant (“Mal.”) and for benign (“Ben.”) samples. Lines represent the median values for each group. Within each group, dots are randomly distributed along the x-axis.
- FIG. 29 The normalized expression (Exp.) levels of the microRNA ratio hsa-miR-146b-5p:hsa-miR-342-3p are shown as a dot plot for Bethesda IV non-medullary malignant (“Mal.”) and for benign (“Ben.”) samples. Lines represent the median values for each group. Within each group, dots are randomly distributed along the x-axis.
- FIG. 30 A Discriminant Analysis classifier was used to classify malignant (diamonds, M) from benign (squares, B) samples, wherein the malignant group included samples of medullary tumor.
- the normalized values of two microRNAs (hsa-miR-222-3p; hsa-miR-551b-3p) were used as the features for the classification.
- the sensitivity of this classifier is 85.2% and the specificity is 53.6%.
- the grey shaded area marks the space in which a sample is classified as malignant, as determined by the classifier.
- FIG. 31 A Discriminant Analysis classifier was used to classify malignant (diamonds, M) from benign (squares, B) samples, wherein the malignant group included samples of medullary tumor.
- the values of two microRNA ratios (hsa-miR-125b-5p:hsa-miR-138-5p; hsa-miR-146b-5p:hsa-miR-342-3p) were used as the features for the classification.
- the sensitivity of this classifier is 84.7% and the specificity is 80.8%.
- the grey shaded area marks the space in which a sample is classified as malignant, as determined by the classifier.
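For a two-feature discriminant analysis classifier, the shaded "malignant" region in such plots is the half-plane on one side of a straight decision boundary. The sketch below fits a two-class linear discriminant in plain Python to illustrate this. It is an illustration under stated assumptions, not the classifier used for the figures: equal class priors and a pooled scatter matrix are assumed, and the function names and data points are hypothetical.

```python
def fit_lda_2d(malignant, benign):
    """Fit a two-class linear discriminant on 2-D points.

    Returns (w, c) such that a sample x is called malignant when
    w[0]*x[0] + w[1]*x[1] > c. Equal class priors and a pooled
    within-class scatter matrix are assumed.
    """
    def mean(points):
        n = len(points)
        return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)

    mu_m, mu_b = mean(malignant), mean(benign)

    # Pooled within-class scatter matrix [[sxx, sxy], [sxy, syy]].
    sxx = sxy = syy = 0.0
    for points, mu in ((malignant, mu_m), (benign, mu_b)):
        for x, y in points:
            dx, dy = x - mu[0], y - mu[1]
            sxx += dx * dx
            sxy += dx * dy
            syy += dy * dy

    det = sxx * syy - sxy * sxy
    if det == 0:
        raise ValueError("singular scatter matrix; features are collinear")

    # w = S^-1 (mu_m - mu_b), using the closed-form inverse of a 2x2 matrix.
    d0, d1 = mu_m[0] - mu_b[0], mu_m[1] - mu_b[1]
    w = ((syy * d0 - sxy * d1) / det, (-sxy * d0 + sxx * d1) / det)

    # Threshold: projection of the midpoint between the two class means.
    c = w[0] * (mu_m[0] + mu_b[0]) / 2 + w[1] * (mu_m[1] + mu_b[1]) / 2
    return w, c


def classify(x, w, c):
    """Return "M" if x falls on the malignant side of the boundary, else "B"."""
    return "M" if w[0] * x[0] + w[1] * x[1] > c else "B"


# Hypothetical feature values (e.g. two microRNA ratios per sample).
malignant = [(3.0, 2.0), (4.0, 3.5), (3.5, 2.5), (4.5, 3.0)]
benign = [(1.0, 0.5), (0.5, 1.5), (1.5, 1.0), (2.0, 0.0)]
w, c = fit_lda_2d(malignant, benign)
print(classify((3.8, 2.9), w, c), classify((1.2, 0.6), w, c))
```

Every point with `w[0]*x[0] + w[1]*x[1] > c` lies in the region that would be shaded in a plot of the two features.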
- FIG. 32 Expression pattern of hsa-miR-486-5p and hsa-miR-200c-3p is determinant for the quality of the sample.
- Four samples of blood smears (BS) were analyzed for the expression of hsa-miR-486-5p (SEQ ID NO: 22) and hsa-miR-200c-3p (SEQ ID NO: 23 or 24) in comparison with their expression in malignant (M) and benign (B) thyroid samples. Normalized values for the two miRs are shown (normalized using all normalizers).
- FIG. 33 Sub-typing of Benign Thyroid Tumors.
- Each cross represents a microRNA or a microRNA ratio.
- the ratio hsa-miR-125b-5p:hsa-miR-200c-3p correlated to FA, while expression of hsa-miR-342-3p and hsa-miR-31-5p correlated with Hashimoto.
- Diamonds represent any one of the microRNAs of SEQ ID NOs. 26-37.
- Significant microRNAs (p-value for t-test < 0.05) are represented by circles.
- FIG. 34 Sub-typing of Malignant Thyroid Tumors.
- Each cross represents a microRNA or a microRNA ratio.
- Diamonds are any one of the microRNAs of SEQ ID NOs. 26-37.
- Significant microRNAs (p-value for t-test < 0.05) are represented by circles.
- Only normalized microRNA values are labeled. Unlabeled circles represent significant ratios.
- FIG. 35 Flowchart representing the protocol for diagnosis of indeterminate thyroid nodule samples obtained through FNA.
- FIG. 36 Medullary carcinoma, Linear Discriminant Analysis step based on the expression of hsa-miR-375. All of the training medullary carcinoma stained smear samples and two of the three medullary validation samples demonstrate over-expression of hsa-miR-375 (x-axis shows hsa-miR-375 expression; y-axis shows hsa-miR-146b expression). Diamonds: malignant non-medullary training samples; squares, benign training samples; circles, medullary carcinoma training samples; stars, medullary carcinoma validation samples.
- FIG. 37 Negative Predictive Value (NPV) and Positive Predictive Value (PPV) for varying prevalence values.
- NPV and PPV were calculated, based on the sensitivity and specificity of the entire validation set (85.2% and 71.9%, respectively; dashed lines) and the validation agreement set (97.5% and 78.2%, respectively; solid lines), for varying prevalence (Prev.) values.
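The dependence of NPV and PPV on prevalence follows from Bayes' rule once sensitivity and specificity are fixed. The sketch below shows the calculation using the sensitivity and specificity quoted above for the entire validation set. The function name and the prevalence values are assumptions chosen for illustration and are not taken from the figure.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a given prevalence, via Bayes' rule."""
    tp = sensitivity * prevalence              # true-positive fraction
    fn = (1 - sensitivity) * prevalence        # false-negative fraction
    tn = specificity * (1 - prevalence)        # true-negative fraction
    fp = (1 - specificity) * (1 - prevalence)  # false-positive fraction
    return tp / (tp + fp), tn / (tn + fn)


# Sensitivity and specificity of the entire validation set, as quoted above.
SENS, SPEC = 0.852, 0.719

# Illustrative prevalence values (hypothetical, not read from the figure).
for prev in (0.10, 0.25, 0.40):
    ppv, npv = predictive_values(SENS, SPEC, prev)
    print(f"prevalence {prev:.0%}: PPV = {ppv:.1%}, NPV = {npv:.1%}")
```

As prevalence rises, PPV increases and NPV decreases, which is the trade-off that a plot of both values against prevalence displays.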
- the present inventors have developed a first-of-its-kind assay utilizing microRNA expression in FNA smears for distinguishing benign from malignant thyroid nodules. This assay offers a valuable tool for the classification of pre-operative thyroid samples, including those that are presently indeterminate to cytological evaluation.
- the present invention provides a sensitive, specific and accurate methodology for distinguishing between malignant and benign thyroid tumors, as well as particular subtypes of thyroid tumors. Distinguishing between different subtypes of thyroid tumors is essential for providing the patient with the best and most suitable treatment.
- the present invention provides a significant improvement of the technologies currently available in the field of thyroid tumor classification and diagnosis.
- the present inventors have developed an integrative platform for the classification of thyroid lesions, by profiling and characterizing microRNA expression in thyroid clinical samples obtained by FNA biopsies, while also overcoming hindrances such as low number of cells in the sample and the amount of blood in the sample by microRNA profiling.
- This technological platform was applied to stratify thyroid lesions into benign or malignant neoplasms, as well as subtypes of thyroid tumors, as an adjunctive tool in the pre-operative management of thyroid nodules.
- the inventors have developed a method for classification of benign and malignant thyroid lesions, and of specific subtypes of thyroid cancer and follicular lesions, which integrates steps for filtering out sub-optimal samples by implementing specific algorithms based on microRNA profiling.
- the method is part of an overall protocol, in which existing or available clinical cytological slides having smears from FNA samples may be used, without the need to generate or collect additional material from the patients.
- the present method further incorporates the analysis of microRNAs in minute amounts of RNA material from cytological samples. Once an FNA sample is collected, between one and several passes of material are smeared onto slides. Currently available methods usually require the use of several passes for having enough material for analysis. The present inventors developed a method in which even only one FNA slide provides sufficient material for microRNA detection. In addition, the method developed by the inventors allows for the analysis of samples having very small amounts of cells, such as samples having 50 cells, up to 120 cells and over.
- the present method includes steps for eliminating or disqualifying samples that lack thyroid cells and/or in which non-thyroid cells, such as blood cells, are over-represented.
- the assay developed by the present inventors accurately differentiates benign from malignant thyroid nodules in indeterminate FNA smears.
- the assay is advantageous over previous diagnostic methods for thyroid nodules, particularly in regard to its performance for cases in which all pathologists are in full agreement, showing a sensitivity of 98% (95% confidence interval [CI], 87-100%); a specificity of 78% (CI, 69-85%); a Negative Predictive Value (NPV) of 99% (CI, 94-100%); and a Positive Predictive Value (PPV) of 62% (CI, 49-74%).
- Performance for the overall validation sample set was a sensitivity of 85% (CI, 74-93%); a specificity of 72% (CI, 63-79%); an NPV of 91% (CI, 84-96%); and a PPV of 59% (CI, 48-69%).
- microRNA expression signature for thyroid lesions through profiling the expression of the microRNAs denoted by SEQ ID NOs.1-308.
- a microRNA expression signature for classification of thyroid lesions is also obtained through profiling the expression of at least ten of the microRNAs denoted by SEQ ID NOs.1-37.
- a microRNA expression signature for classification of thyroid lesions is obtained through profiling the expression of at least seven of the microRNAs denoted by SEQ ID NOs.1-37.
- the present inventors have developed a platform for classification of thyroid clinical samples based on the levels of expression of a set of microRNAs, comprising at least two microRNAs, selected from the group consisting of hsa-miR-31-5p (SEQ ID NO: 5, 6, or 7), hsa-miR-424-3p (SEQ ID NO: 16), hsa-miR-222-3p (SEQ ID NO: 1 or 2), hsa-miR-146b-5p (SEQ ID NO: 10 or 11), hsa-miR-346 (SEQ ID NO: 14), MID-16582 (SEQ ID NO: 25), hsa-miR-342-3p (SEQ ID NO: 17 or 18), hsa-miR-181c-5p (SEQ ID NO: 15), hsa-miR-125b-5p (SEQ ID NO: 9), hsa-miR-375 (SEQ ID NO: 8), hsa
- Variations of said platform for classification of thyroid clinical samples include determining the levels of expression of at least ten (10) microRNAs comprised in the group consisting of hsa-miR-375 (SEQ ID NO: 8), hsa-miR-146b-5p (SEQ ID NO: 10 or 11), hsa-miR-342-3p (SEQ ID NO: 17 or 18), hsa-miR-551b-3p (SEQ ID NO: 3 or 4), hsa-miR-31-5p (SEQ ID NO: 5, 6, or 7), hsa-miR-125b-5p (SEQ ID NO: 9), hsa-miR-138-5p (SEQ ID NO: 19, 20, or 21), hsa-miR-152-3p (SEQ ID NO: 12 or 13), hsa-miR-574-3p (SEQ ID NO: 36 or 37), hs
- the classification of thyroid clinical samples includes determining the levels of expression of eleven microRNAs comprised in the group consisting of hsa-miR-375 (SEQ ID NO: 8), hsa-miR-146b-5p (SEQ ID NO: 10 or 11), hsa-miR-342-3p (SEQ ID NO: 17 or 18), hsa-miR-551b-3p (SEQ ID NO: 3 or 4), hsa-miR-31-5p (SEQ ID NO: 5 or 7), hsa-miR-125b-5p (SEQ ID NO: 9), hsa-miR-138-5p (SEQ ID NO: 19, 20, or 21), hsa-miR-152-3p (SEQ ID NO: 12 or 13), hsa-miR-574-3p (SEQ ID NO: 36 or 37), hsa-miR-222
- the present invention is particularly useful for the 25% of the cases in which FNA specimens present inconclusive results in cytopathology, usually referred to as “indeterminate”, and which include thyroid lesion samples classified in Bethesda categories III, IV and V.
- patients with specimens falling within this category undergo repeat FNA procedure, or surgery, including lobectomy and thyroidectomy.
- the present invention provides a method of classification for thyroid lesion samples that fall into the “indeterminate” cases, classified in categories III, IV and V of the Bethesda System (described further herein).
- the present invention provides a method of classification for thyroid lesion samples classified in category IV of the Bethesda System, which relates to “Follicular Neoplasm” or “Suspicious of a Follicular Neoplasm”, which is known to be the most difficult category to be classified.
- the present invention presents primarily a protocol for management of thyroid lesion samples which failed to be classified by cytopathological analysis.
- Particular samples that are of interest are those obtained by FNA.
- routine smears from FNA samples are used.
- FNA samples in preservative solutions may be used.
- Total RNA is extracted from the FNA samples, and the expression of microRNAs is measured.
- the expression of about 2200 microRNAs is measured.
- the expression of 182 microRNAs, comprising the sequences of SEQ ID NO: 1-182 is measured.
- the expression of the microRNAs comprising the sequences of SEQ ID NO: 1-37 is measured.
- classification of the thyroid sample as malignant or benign comprises measuring the expression levels of hsa-miR-222-3p (SEQ ID NO: 1 or 2), hsa-miR-551b-3p (SEQ ID NO: 3 or 4), hsa-miR-31-5p (SEQ ID NO: 5, 6, or 7), hsa-miR-375 (SEQ ID NO: 8), hsa-miR-125b-5p (SEQ ID NO: 9), hsa-miR-146b-5p (SEQ ID NO: 10 or 11), hsa-miR-152-3p (SEQ ID NO: 12 or 13), hsa-miR-346 (SEQ ID NO: 14), hsa-miR-181c-5p (SEQ ID NO: 15), hsa-miR-424-3p (SEQ ID NO: 16), hsa-miR-342-3p (SEQ ID NO: 17 or 18),
- the present invention provides a method for distinguishing between malignant and benign thyroid tumor lesions in a subject in need, said method comprising obtaining a thyroid tumor lesion sample from said subject, or providing a biological sample obtained from said subject; determining an expression profile in said sample of one or more, or at least four, at least eight, at least eleven, or at least thirteen microRNAs comprising SEQ ID NOS: 1-37, or a sequence at least 80%, at least 85%, at least 90%, or at least 95% identical thereto, or any combination of said microRNAs, by hybridization or by amplification; comparing said expression profile to a reference threshold value by using a classifier algorithm; and determining whether the thyroid lesion is malignant or benign.
- the method of the invention is for distinguishing sub-types of malignant or benign thyroid tumor lesions.
- the method of the invention comprises measuring the expression of at least four of the microRNAs comprising SEQ ID NOS: 1-37, obtaining the microRNA expression profile value of said sample, and using a classifier to establish, based on said value, whether the thyroid lesion is malignant or benign, and optionally further classifying the sample into one of the malignant or benign subtypes.
- said determining an expression profile by hybridization comprises contacting the sample with probes that hybridize to each of SEQ ID NOS: 1-37, or to a sequence at least 80%, at least 85%, or at least 90% identical thereto. In another embodiment, said determining an expression profile by hybridization comprises contacting the sample with probes that hybridize with at least eight, at least ten, at least twelve, at least fourteen, or at least sixteen contiguous nucleotides of said microRNA comprising SEQ ID NOS: 1-37.
- the present invention further provides a method of classifying a sample as malignant or benign, and/or sub-typing said sample, whereby, further to measuring the expression levels of microRNAs in the sample, obtaining an expression profile and optionally calculating microRNA ratios, a multi-step analysis of the expression data is applied.
- Said multi-step analysis comprises applying one or more algorithms, in parallel or sequentially, to at least one of the microRNA expression profiles, microRNA ratios, or a combination thereof.
- Said multi-step analysis may also further include analyzing the expression of one or more single microRNA levels which may be indicative of the overall quality of the sample.
- criteria examples include the expression of non-malignant cell markers, the expression of microRNAs that correlate with a specific sub-type of thyroid tumor, and the like.
- one step may be examining whether the expression of non-thyroid cell markers is higher or lower than the threshold established in the data set, e.g. the training data set, in which case the sample may be disqualified.
- Another further step may be examining the expression of a microRNA or microRNA ratio that correlates with a thyroid tumor sub-type, e.g. if the expression of hsa-miR-342-3p (SEQ ID NO: 17 or 18) is very high compared to the threshold established in the data set, e.g. the training data set, the sample may be classified as benign, and further sub-typed as being Hashimoto.
- Alternatively, if the expression of hsa-miR-342-3p (SEQ ID NO: 17 or 18) is very high compared to the threshold established in the data set, e.g. the training data set, the sample may be disqualified for lack of sufficient thyroid cells.
- Another further optional step may relate to the level of expression of MID-16582 (SEQ ID NO: 25), which may be used to determine whether the sample is discarded or is analyzed using a classifier specific for samples in which MID-16582 (SEQ ID NO: 25) is high (compared to the threshold established in the training set).
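The sequential checks described above can be sketched as a small decision function that runs before the main classifier. This is a simplified illustration, not the protocol of the invention: the function name, the threshold keys, the threshold values, and the toy classifier are all hypothetical, and only two of the described checks (a blood-cell marker and MID-16582) are included.

```python
def multi_step_analysis(expr, thresholds, classify, classify_high_mid=None):
    """Run sequential checks before handing a sample to a classifier.

    expr: dict mapping microRNA name -> normalized expression level.
    thresholds: dict of cut-offs, assumed to have been set on a training set.
    classify / classify_high_mid: callables returning "M" or "B"
        (placeholders for whichever trained classifier is used).
    """
    # Step 1: disqualify samples in which a blood-cell marker is over-represented.
    if expr["hsa-miR-486-5p"] > thresholds["blood_marker_max"]:
        return "disqualified"

    # Step 2: samples with high MID-16582 are either discarded or routed
    # to a classifier dedicated to such samples, if one is supplied.
    if expr["MID-16582"] > thresholds["mid_16582_max"]:
        if classify_high_mid is None:
            return "disqualified"
        return classify_high_mid(expr)

    # Step 3: ordinary malignant/benign classification.
    return classify(expr)


# Hypothetical thresholds and a toy single-feature classifier.
thresholds = {"blood_marker_max": 10.0, "mid_16582_max": 8.0}
toy_classifier = lambda e: "M" if e["hsa-miR-146b-5p"] > 5.0 else "B"

sample = {"hsa-miR-486-5p": 6.0, "MID-16582": 3.0, "hsa-miR-146b-5p": 7.2}
print(multi_step_analysis(sample, thresholds, toy_classifier))
```

Keeping the quality checks ahead of the classifier means a sub-optimal sample returns an explicit "disqualified" result instead of a possibly misleading malignant or benign call.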
- said non-thyroid cell marker is a blood cell marker.
- said cell marker is an epithelial cell marker.
- said cell marker is a blood cell marker, a white blood cell marker or an epithelial cell marker.
- blood cell markers are hsa-miR-486-5p (SEQ ID NO: 22), hsa-miR-320a (SEQ ID NO: 173), hsa-miR-106a-5p (SEQ ID NO: 150), hsa-miR-93-5p (SEQ ID NO: 182), hsa-miR-17-3p (SEQ ID NO: 160), hsa-let-7d-5p (SEQ ID NO: 144), hsa-miR-107 (SEQ ID NO: 152), hsa-miR-103a-3p (SEQ ID NO: 149), hsa-miR-17-5p (SEQ ID NO: 161), hsa-miR-191-5p (SEQ ID NO: 163), hsa-miR-25-3p (SEQ ID NO: 22).
- white blood cell markers are hsa-miR-342-3p (SEQ ID NO: 17 or 18), hsa-miR-146a-5p and hsa-miR-150-5p (SEQ ID NO: 59).
- epithelial markers are hsa-miR-200c-3p (SEQ ID NO: 23 or 24), hsa-miR-138-5p (SEQ ID NO: 19, 20, or 21), hsa-miR-3648 (SEQ ID NO: 174), hsa-miR-125b-5p (SEQ ID NO: 9), hsa-miR-125a-5p (SEQ ID NO: 153), hsa-miR-192-3p (SEQ ID NO: 164), hsa-miR-4324 (SEQ ID NO: 178), hsa-miR-376a-3p (SEQ ID NO: 175).
- quality assessment (QA) or quality control (QC) of the sample may be one of the steps of the multi-step analysis of the expression data, or one of the steps of the method of the invention.
- QC relates to the expression profile of microRNAs of any one of SEQ ID NOS. 1-37, or to the ratio of the expression level of at least one pair of microRNAs of any one of SEQ ID NOS. 1-37. Therefore, the expression profile or the ratio of the expression levels of microRNAs is optionally used as an indicator of sample quality.
- said microRNA ratio is the ratio between the normalized or non-normalized expression level of a pair of microRNAs, wherein the expression level of one microRNA is used as the numerator and the expression level of a second microRNA is the denominator.
- pairs of microRNAs whose expression level ratios are used in the classifier are hsa-miR-125b-5p:hsa-miR-138-5p; hsa-miR-146b-5p:hsa-miR-342-3p; hsa-miR-125b-5p:hsa-miR-200c-3p; hsa-miR-375:hsa-miR-146b; hsa-miR-551b-3p:hsa-miR-23a-3p; hsa-miR-31-5p:hsa-miR-342-3p; hsa-miR-222-3p:hsa-miR-486-5p; hsa-miR-200c-3p:hsa-miR-486-5p; MID-16582:hsa-miR-200c-3p; MID-16582:hsa-miR
- said determining an expression profile comprises contacting the sample with RT-PCR reagents, including forward and reverse primers as exemplified herein in the Examples, and generating RT-PCR products.
- said method comprises contacting RT-PCR products with specific or general probes, or a combination thereof, as exemplified herein in the Examples, detecting and measuring the PCR products.
- said determining an expression profile comprises measuring microRNA expression by hybridization, using microarrays and the like. In another further embodiment, said determining an expression profile comprises measuring microRNA expression by next-generation sequencing.
- the method of the invention further comprises optionally determining the expression profile of at least one microRNA to be used as normalizer.
- any microRNA as described in Table 1 may be used as a normalizer.
- any of the microRNAs comprising SEQ ID NO: 26-37, or a sequence at least 80%, 85%, 90%, or 95% identical thereto, are used as normalizers.
- Said markers may be any one of malignant markers, secondary markers and cell-type markers, or any combination thereof, comprising SEQ ID NOS: 1-25, or a sequence at least 80%, 85%, 90%, or 95% identical thereto.
- the full set of markers may be used.
- any combination of malignant, secondary and cell-type markers may be used.
- the method may comprise at least one malignant marker, in association with at least one secondary marker and/or at least one cell-type marker.
- each of the cell type markers may be used in the form of raw or normalized signals.
- the cell type markers may be used as a preliminary test prior to performing the classification, in order to determine whether the sample has sufficient relevant material to perform classification, or whether the sample should be discarded.
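A minimal sketch of such a preliminary test is given below. It assumes expression values in which a higher number means higher abundance; the function name, the choice of two blood cell markers, and the threshold values are hypothetical placeholders and are not taken from any training set described herein.

```python
# Hypothetical thresholds on normalized expression (higher = more abundant).
# These numbers are placeholders, not values from any training set.
BLOOD_MARKER_THRESHOLDS = {
    "hsa-miR-486-5p": 12.0,
    "hsa-miR-320a": 11.0,
}


def has_sufficient_thyroid_material(expression, thresholds=BLOOD_MARKER_THRESHOLDS):
    """Return False when any non-thyroid cell marker exceeds its threshold.

    A very high blood cell marker signal is taken here as a sign that the
    sample holds too few thyroid cells to be classified.
    """
    for marker, limit in thresholds.items():
        if expression.get(marker, 0.0) > limit:
            return False
    return True


sample_ok = {"hsa-miR-486-5p": 9.5, "hsa-miR-320a": 8.0}
sample_bloody = {"hsa-miR-486-5p": 14.2, "hsa-miR-320a": 8.0}
print(has_sufficient_thyroid_material(sample_ok))      # True
print(has_sufficient_thyroid_material(sample_bloody))  # False
```

A sample that fails this check would be discarded before the classifier is run; a sample that passes would proceed to classification.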
- the cell-type markers may be used as part of the final classifier, where the signal of the cell type marker is used by the classifier.
- the cell-type markers may be used as the denominator of a miR ratio optionally used by the classifier. For example, the expression level of a malignant or a secondary marker may be divided by the expression level of a cell-type marker, and the resulting miR ratio used in the classifier.
- said classifier may be any one of a single classifier, a multi-step classifier, a classifier which uses all the malignant markers, a classifier which uses a subset of the malignant markers, a classifier which uses all the malignant markers and the secondary markers, a classifier which uses a subset of the malignant markers and a subset of the secondary markers, a classifier which uses all the malignant markers and the secondary markers and the cell type markers, a classifier which employs a subset of all the malignant markers and the secondary markers and the cell type markers, a classifier which uses all or a subset of the malignant markers and all or a subset of the cell type markers.
- the performance of the classification may be improved by further combining the result from the algorithm classifier with additional clinical or molecular data available for the thyroid sample being analyzed.
- Additional data available may be related to the thyroid lesion, such as the size of the nodule, the number of nodules; it may relate to other clinical information available for the subject from whom the sample was obtained, such as molecular test results, like the expression of other molecular markers, genetic markers, biochemical test results, blood test results, urine test results, recurrence, prognosis data, family history, patient medical history, and the like.
- Other data that may also be combined is thyroid genetic data, such as mutation analysis, gene fusions, chromosomal rearrangements, gene expression, protein expression, and the like.
- Therapeutic indications may vary according to the diagnostic obtained with the method or protocol of the invention. Typically there are five types of therapy that may be administered to a thyroid cancer patient: surgery, radiation therapy, chemotherapy, thyroid hormone therapy and targeted therapy.
- Lobectomy: Removal of the lobe in which thyroid cancer is found. Biopsies of lymph nodes in the area may be done to see if they contain cancer.
- Total thyroidectomy: Removal of the whole thyroid.
- Lymphadenectomy: Removal of lymph nodes in the neck that contain cancer.
- Thyroidectomy is a surgical procedure that has several potential complications or sequelae, including temporary or permanent change in voice, temporary or permanently low calcium, need for lifelong thyroid hormone replacement, bleeding, infection, and the remote possibility of airway obstruction due to bilateral vocal cord paralysis. Therefore, an accurate diagnosis that prevents the unnecessary removal of the thyroid gland is very desirable.
- Radioactive iodine (RAI) is administered orally and collects in any remaining thyroid tissue, including thyroid cancer cells that have spread to other places in the body.
- Since only thyroid tissue takes up iodine, the RAI destroys thyroid tissue and thyroid cancer cells without harming other tissues. Before a full treatment dose of RAI is given, a small test dose is given to see if the tumor takes up the iodine.
- Chemotherapy is another option for thyroid cancer treatment.
- Chemotherapy may be administered orally or by injection, intravenous or intramuscular.
- Chemotherapy may also be administered directly into the cancer affected area instead of systemically. The choice of administration will depend on the type and stage of the cancer.
- a few examples of drugs that have been approved for thyroid cancer treatment are: Adriamycin PFS (Doxorubicin Hydrochloride), Adriamycin RDF (Doxorubicin Hydrochloride), Cabozantinib-S-Malate, Caprelsa (Vandetanib), Cometriq (Cabozantinib-S-Malate), Doxorubicin Hydrochloride, Nexavar (Sorafenib Tosylate), Sorafenib Tosylate and Vandetanib.
- drugs may be given to prevent thyroid-stimulating hormone (TSH) production, so that the hormone does not induce the growth or recurrence of the thyroid cancer.
- Because thyroid cancer treatment specifically targets thyroid cells, the thyroid is not able to make enough thyroid hormone. Patients are therefore given thyroid hormone replacement pills.
- Targeted therapy uses drugs or other substances to identify and attack specific cancer cells without harming normal cells.
- Tyrosine kinase inhibitor (TKI) therapy blocks signal transduction in thyroid cancer cells, inhibiting their growth.
- Vandetanib is a TKI used to treat thyroid cancer.
- Dosage and duration of any therapy will depend on individual evaluation of the patient and on standard practice known by the health care provider.
- the duration of treatment is the period of time during which doses of a pharmaceutical agent or pharmaceutical composition are administered.
- the identification and differentiation of the thyroid tumor, firstly as benign or malignant, and subsequently its classification into the various subtypes through the analysis of differentially expressed microRNAs, can provide further clues to the biological differences between the subtypes, their diverging oncogenetic processes and possible new targets for type-specific targeted therapy.
- the present invention provides diagnostic assays and methods, both quantitative and qualitative, for detecting, diagnosing, monitoring, staging and prognosticating thyroid cancers by comparing levels of the specific microRNA molecules as described herein. Such levels are measured in a patient sample, which may be from a biopsy, tumor samples, cells, tissues and/or bodily fluids.
- the method of the invention is particularly useful for discriminating between different subtypes of malignant thyroid tumors, such types being follicular carcinoma, papillary carcinoma, follicular variant of papillary carcinoma (FVPC or FVPTC), encapsulated FVPC (or encapsulated FVPTC), medullary carcinoma, anaplastic thyroid cancer, poorly differentiated thyroid cancer, and for determining the therapeutic course to be followed after diagnosis.
- the present invention provides a method for classifying sub-types of benign thyroid tumor, e.g. follicular adenoma, Hashimoto thyroiditis, hyperplasia (Goiter).
- the present invention also provides a method of treatment of thyroid cancer, said method comprising the method of distinguishing between benign or malignant thyroid tumor as described herein, optionally subtyping the thyroid tumor type, and administering the treatment according to the diagnosis provided by the present method.
- All the methods of the present invention may optionally further include measuring levels of other cancer markers.
- Other cancer markers in addition to said microRNA molecules useful in the present invention, will depend on the cancer being tested and are known to those of skill in the art.
- Assay techniques that can be used to determine levels of gene expression, such as the nucleic acid sequence of the present invention, in a sample derived from a patient are well known to those of skill in the art.
- Such assay methods include, but are not limited to, reverse transcriptase PCR (RT-PCR) assays, in situ hybridization assays, competitive-binding assays, Northern blot analyses, nucleic acid microarrays and biochip analysis.
- An arbitrary threshold on the expression level of one or more nucleic acid sequences can be set for assigning a sample or tumor sample to one of two groups.
- expression levels of one or more nucleic acid sequences of the invention are combined by taking ratios of expression levels of two nucleic acid sequences and/or by a method such as logistic regression to define a metric which is then compared to previously measured samples or to a threshold.
- the threshold for assignment is treated as a parameter, which can be used to quantify the confidence with which samples are assigned to each class.
- the threshold for assignment can be scaled to favor sensitivity or specificity, depending on the clinical scenario.
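One way to scale the threshold toward sensitivity can be sketched as follows, under the assumption that each sample has a single continuous score and is called malignant when the score is at or above the cut-off. The function names and the example scores are hypothetical.

```python
import math


def threshold_for_sensitivity(malignant_scores, target_sensitivity):
    """Highest cut-off at which at least the target fraction of known
    malignant samples scores at or above the cut-off."""
    ranked = sorted(malignant_scores, reverse=True)
    # Small epsilon guards against floating-point overshoot in the product.
    needed = max(1, math.ceil(target_sensitivity * len(ranked) - 1e-9))
    return ranked[needed - 1]


def sensitivity_specificity(malignant_scores, benign_scores, threshold):
    """A sample is called malignant when its score is >= threshold."""
    true_pos = sum(s >= threshold for s in malignant_scores)
    true_neg = sum(s < threshold for s in benign_scores)
    return true_pos / len(malignant_scores), true_neg / len(benign_scores)


malignant = [0.9, 0.8, 0.7, 0.4, 0.2]  # illustrative scores only
benign = [0.5, 0.3, 0.2, 0.1]
cutoff = threshold_for_sensitivity(malignant, 0.8)
print(cutoff, sensitivity_specificity(malignant, benign, cutoff))
```

Lowering the target sensitivity raises the cut-off and, on these illustrative scores, raises specificity; the trade-off chosen would depend on the clinical scenario.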
- the correlation value to the reference data generates a continuous score that can be scaled and provides diagnostic information on the likelihood that a sample belongs to a certain class of thyroid subtype. In multivariate analysis, the microRNA signature provides a high level of prognostic information.
- the present invention also provides novel microRNA molecules, comprising nucleic acids denoted by SEQ ID NOS.27-29, 33, 34, 139, 140, 307 and 308. It is to be understood, that the cDNA, complement sequence, and anti-miR corresponding to any one of SEQ ID NOS.27-29, 33, 34, 139, 140, 307 and 308 are also encompassed by the present invention.
- compositions, formulations and medicaments comprising the microRNAs described herein.
- the present invention provides compositions, formulations and medicaments comprising as an active agent the microRNA comprising any one of SEQ ID NOS.27-29, 33, 34, 139, 140, 307 and 308, variants thereof, or a sequence at least 80%, at least 85%, or at least 90% identical thereto.
- Said compositions, formulations and medicaments may further optionally comprise any one of adjuvants, carriers, diluents and excipients.
- microRNAs described herein can be formulated into compositions, formulations and medicaments by combination with appropriate, pharmaceutically acceptable carriers or diluents, and can be formulated into preparations in solid, semi-solid, liquid or gaseous forms, such as tablets, capsules, powders, granules, ointments, solutions, suppositories, injections, inhalants and aerosols.
- administration of the microRNA or a pharmaceutical composition comprising thereof can be achieved in various ways, including oral, buccal, rectal, parenteral, intraperitoneal, intradermal, transdermal, intratracheal, etc.
- compositions of the present invention comprise one or more nucleic acids of the invention and one or more excipients.
- excipients are selected from water, salt solutions, alcohol, polyethylene glycols, gelatin, lactose, amylase, magnesium stearate, talc, silicic acid, viscous paraffin, hydroxymethylcellulose and polyvinylpyrrolidone.
- a pharmaceutical composition of the present invention is prepared using known techniques, including, but not limited to mixing, dissolving, granulating, dragee-making, levigating, emulsifying, encapsulating, entrapping or tabletting processes. Methods for the preparation of pharmaceutical compositions may be found in the literature, e.g. in Gennaro, A. R. (2000) Remington: The Science and Practice of Pharmacy, 20th ed.
- a pharmaceutical composition of the present invention is a liquid (e.g., a suspension, elixir and/or solution).
- a liquid pharmaceutical composition is prepared using ingredients known in the art, including, but not limited to, water, glycols, oils, alcohols, flavoring agents, preservatives, and coloring agents.
- a pharmaceutical composition of the present invention is a solid (e.g., a powder, tablet, and/or capsule).
- a solid pharmaceutical composition comprising one or more nucleic acids of the invention is prepared using ingredients known in the art, including, but not limited to, starches, sugars, diluents, granulating agents, lubricants, binders, and disintegrating agents.
- the present application provides vectors and probes comprising the compounds (the nucleic acids) disclosed herein.
- the present application provides vectors and probes comprising nucleic acids denoted by SEQ ID NOS.27-29, 33, 34, 139, 140, 307 and 308, variants thereof or a sequence at least 80%, at least 85%, or at least 90% identical thereto.
- For the recitation of numeric ranges herein, each intervening number therebetween with the same degree of precision is explicitly contemplated.
- For example, for the range 6-9, the numbers 7 and 8 are contemplated in addition to 6 and 9, and for the range 6.0-7.0, the numbers 6.0, 6.1, 6.2, 6.3, 6.4, 6.5, 6.6, 6.7, 6.8, 6.9 and 7.0 are explicitly contemplated.
- aberrant proliferation means cell proliferation that deviates from the normal, proper, or expected course.
- Aberrant cell proliferation may include cell proliferation whose characteristics are associated with an indication caused by, mediated by, or resulting in inappropriately high levels of cell division, inappropriately low levels of apoptosis, or both.
- Such indications may be characterized, for example, by single or multiple local abnormal proliferations of cells, groups of cells, or tissue(s), whether cancerous or non-cancerous, benign or malignant.
- Aberrant proliferation is one of the main features of cancer.
- “Attached” or “immobilized”, as used herein to refer to a probe and a solid support, means that the binding between the probe and the solid support is sufficient to be stable under conditions of binding, washing, analysis, and removal.
- the binding may be covalent or non-covalent. Covalent bonds may be formed directly between the probe and the solid support or may be formed by a cross linker or by inclusion of a specific reactive group on either the solid support or the probe or both molecules.
- Non-covalent binding may be one or more of electrostatic, hydrophilic, and hydrophobic interactions. Included in non-covalent binding is the covalent attachment of a molecule, such as streptavidin, to the support and the non-covalent binding of a biotinylated probe to the streptavidin. Immobilization may also involve a combination of covalent and non-covalent interactions.
- “Biological sample” or “sample”, as used herein, means a sample of biological tissue or fluid that comprises nucleic acids, microRNA in particular. Such samples include, but are not limited to, tissue or fluid isolated from subjects. Biological samples also include sections of tissues such as biopsy and autopsy samples, fine-needle aspiration (FNA) samples, frozen sections taken for histological purposes, blood, blood fraction, plasma, serum, and the like. A biological sample may be provided by removing a sample of cells from a subject, but can also be provided by using previously isolated cells (e.g., isolated by another person, at another time, and/or for another purpose), which may then be cultured or not. Archival tissues, such as those having treatment or outcome history, may also be used.
- the FNA biopsy is prepared as a smear.
- classification refers to a procedure and/or algorithm in which individual items are placed into groups or classes based on quantitative information on one or more characteristics inherent in the items (referred to as traits, variables, characters, features, etc.) and based on a statistical model and/or a training set of previously labeled items.
- the term “classifying thyroid tumors” refers to the identification of one or more properties of a thyroid tissue sample (e.g., including but not limited to, the presence of microRNAs expressed in cancerous tissue, the presence of microRNAs expressed in pre-cancerous tissue that is likely to become cancerous, and the presence of microRNAs expressed in cancerous tissue that is likely to metastasize).
- classifier refers to an algorithm used to classify, distinguish or identify thyroid tumors (or lesions) as benign or malignant, or to classify, distinguish or identify sub-types of thyroid tumor.
- the algorithm to be used in the method or protocol of the invention is a machine-learning algorithm.
- machine-learning algorithms are discriminant analysis, K-nearest neighbor classifier (KNN), Support Vector Machine (SVM) classifier, logistic regression classifier, neural network classifier, Gaussian mixture model (GMM), nearest centroid classifier, linear regression classifier, decision tree classifier, and random forest classifier, ensemble of classifiers, or any combination thereof.
- the classifier may comprise a combination of algorithms, such as for example a discriminant analysis and a KNN classifier, a KNN classifier and a SVM classifier, a discriminant analysis and a logistic regression classifier, and so forth with any of the machine-learning algorithms.
- the discriminant may be any one of a linear, quadratic, a diagonal of the linear covariance matrix, diagonals of the quadratic covariance matrices, pseudoinverse of the linear covariance matrix, and pseudoinverse of the quadratic covariance matrices.
- the k may be altered and the distance metric can be any of Pearson correlation, Spearman correlation, Euclidean or cityblock (Manhattan) distance.
- a KNN classifier uses a number of neighboring samples for the classification of each sample, which may be 5 neighboring samples, 6 neighboring samples, 7 neighboring samples, 8 neighboring samples, 9 neighboring samples, 10 neighboring samples, and the like.
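The KNN step described above can be sketched in plain Python as follows. This is an illustrative sketch only: the function names and the two-feature toy data are hypothetical, only the Euclidean and cityblock metrics are shown (the correlation-based metrics are omitted), and a majority vote among the k nearest training samples is assumed.

```python
import math
from collections import Counter


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cityblock(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))


def knn_classify(train, labels, query, k=5, distance=euclidean):
    """Label a query sample by majority vote among its k nearest
    training samples under the chosen distance metric."""
    ranked = sorted(range(len(train)), key=lambda i: distance(train[i], query))
    votes = Counter(labels[i] for i in ranked[:k])
    return votes.most_common(1)[0][0]


# Toy training set: each sample is a pair of made-up expression values.
train = [(1, 1), (1, 2), (2, 1), (2, 2), (6, 6), (6, 7), (7, 6), (7, 7)]
labels = ["benign"] * 4 + ["malignant"] * 4
print(knn_classify(train, labels, (6.5, 6.5), k=5))
print(knn_classify(train, labels, (1.5, 1.5), k=5, distance=cityblock))
```

With two classes, an odd k avoids tied votes.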
- the kernel may be linear, Gaussian or polynomial.
- an ensemble method classifier usually applies algorithms such as classification trees, KNN or discriminant analysis classifiers.
- the ensembles can be either created using boosting or bagging algorithms and the number of ensemble learning cycles can range from two up to a few thousand.
- One such combination of algorithms may comprise one linear discriminant analysis (LDA) and one KNN, two LDA algorithms and one KNN, three LDA algorithms and one KNN, four LDA algorithms and one KNN, or five LDA algorithms and one KNN.
- a classifier for the classification of thyroid tumor samples and its use comprising a combination of algorithms, particularly machine-learning algorithms, which utilize (as an input) the expression levels of nucleic acids of SEQ ID NOS:1-37, or the ratios between pairs of expression levels of said nucleic acids of SEQ ID NOS:1-37, in a series of steps which result in the classification of a thyroid tumor sample as benign or malignant (as the output).
- “confusion matrix” refers to a specific table layout that allows visualization of the performance of an algorithm, typically a supervised learning one.
- a “confusion matrix” may also be referred to as a contingency table or an error matrix.
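As an illustration, a two-class confusion matrix can be tallied from paired lists of true and predicted labels, and sensitivity and specificity read off its cells. The labels and function name below are made up for the example.

```python
from collections import Counter


def confusion_matrix(actual, predicted):
    """Count how often each (actual, predicted) label pair occurs."""
    return Counter(zip(actual, predicted))


actual = ["malignant", "malignant", "benign", "benign", "benign", "malignant"]
predicted = ["malignant", "benign", "benign", "malignant", "benign", "malignant"]
cm = confusion_matrix(actual, predicted)

tp = cm[("malignant", "malignant")]  # malignant called malignant
fn = cm[("malignant", "benign")]     # malignant called benign
tn = cm[("benign", "benign")]        # benign called benign
fp = cm[("benign", "malignant")]     # benign called malignant

sensitivity = tp / (tp + fn)
specificity = tn / (tn + fp)
print(tp, fn, tn, fp)
```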
- “Complement” or “complementary”, as used herein to refer to a nucleic acid may mean Watson-Crick (e.g., A-T/U and C-G) or Hoogsteen base pairing between nucleotides or nucleotide analogs of nucleic acid molecules.
- a full complement or fully complementary means 100% complementary base pairing between nucleotides or nucleotide analogs of nucleic acid molecules.
- the complementary sequence has a reverse orientation (5′-3′).
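For illustration, the fully complementary sequence of an RNA, read 5′ to 3′, can be derived by complementing each base (A-U, C-G) and reversing the order. The helper name is hypothetical and the input is a toy sequence, not one of the sequences listed herein.

```python
RNA_COMPLEMENT = {"A": "U", "U": "A", "C": "G", "G": "C"}


def reverse_complement_rna(seq):
    """Return the Watson-Crick complement of an RNA sequence, written 5'->3'."""
    return "".join(RNA_COMPLEMENT[base] for base in reversed(seq.upper()))


print(reverse_complement_rna("AUGC"))  # GCAU
```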
- the present invention also provides the complement of the nucleic acids denoted by SEQ ID NOS. 7-29, 33, 34, 139, and 140.
- C_T signals represent the first cycle of PCR where amplification crosses a threshold (cycle threshold) of fluorescence. Accordingly, low values of C_T represent high abundance or expression levels of the microRNA.
- the PCR C_T signal is normalized such that the normalized C_T is inversely related to the expression level. In other embodiments the PCR C_T signal may be normalized and then inverted such that a low normalized-inverted C_T represents low abundance or expression levels of the microRNA.
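A possible reading of the normalize-then-invert step is sketched below. It assumes, for illustration only, that normalization subtracts the mean cycle threshold of the normalizer microRNAs and that inversion subtracts the result from a fixed constant; the constant 50 and the function names are hypothetical.

```python
from statistics import mean


def normalize_ct(ct, normalizer_cts):
    """Subtract the mean cycle threshold of the normalizers (an assumption)."""
    return ct - mean(normalizer_cts)


def invert_ct(normalized_ct, offset=50.0):
    """Flip the scale so that a higher value means higher abundance.
    The offset of 50 is a placeholder constant."""
    return offset - normalized_ct


normalizers = [20.0, 22.0]
abundant = invert_ct(normalize_ct(25.0, normalizers))
scarce = invert_ct(normalize_ct(30.0, normalizers))
print(abundant, scarce)  # 46.0 41.0
```

After inversion, the less abundant microRNA (the one with the higher raw cycle threshold) receives the lower value.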
- a “data processing routine” refers to a process that can be embodied in software that determines the biological significance of acquired data (i.e., the ultimate results of an assay or analysis) with respect to one or more samples. For example, the data processing routine can make determination of whether a thyroid lesion from which a sample was collected or obtained is benign or malignant, or of a specific sub-type, based upon the data collected. In the systems and methods herein, the data processing routine can also control the data collection routine based upon the results determined. The data processing routine and the data collection routines can be integrated and provide feedback to operate the data acquisition, and hence provide assay-based judging methods.
- Detection means detecting the presence of a component in a sample. Detection also means detecting the absence of a component. Detection also means determining the level of a component, either quantitatively or qualitatively.
- “Differential expression” or a “difference in expression levels” means qualitative or quantitative differences in the microRNA expression patterns in thyroid samples.
- a differentially expressed microRNA may qualitatively have its expression altered, including an activation or inactivation, in, e.g., normal versus diseased thyroid tissue.
- a qualitatively regulated microRNA may exhibit an expression pattern within a thyroid sample or cell type which may be detectable by standard techniques. Some microRNAs may be expressed in one thyroid sample or cell type, and not in other, or expressed at different levels between different cell types or different samples.
- the difference in expression may be quantitative, e.g., in that expression is modulated, up-regulated, resulting in an increased amount of microRNA, or down-regulated, resulting in a decreased amount of microRNA.
- the degree to which expression differs needs only be large enough to quantify via standard characterization techniques such as expression arrays, next generation sequencing (NGS), quantitative reverse transcriptase PCR, northern blot analysis, real-time PCR, in situ hybridization and RNase
- expression profile is used broadly to include a genomic expression profile, as well as an expression profile of microRNAs, for example.
- expression profile means the set of data obtained for the nucleic acid (or microRNA) expression. It may refer to the raw data or to the normalized expression values.
- Expression profiles may be generated by any convenient means for determining a level of a nucleic acid sequence, e.g. quantitative hybridization of microRNA, labeled microRNA, amplified microRNA, cDNA, etc., quantitative PCR, and the like. Further to measuring nucleic acid sequence levels, the data obtained may be normalized; normalization of data is discussed elsewhere in this application. Expression profiles allow the analysis of differential gene expression between two or more samples, as well as between samples and thresholds.
- classifiers may be applied to expression profiles in order to obtain information about the sample, such as classification, diagnosis, sub-typing of the sample, and the like.
- Nucleic acid sequences of interest are nucleic acid sequences that are found to be predictive, including the nucleic acid sequences provided herein in Table 1, where the expression profile may include expression data for 5, 10, 20, 25, 50, 100 or more of, including all of the listed nucleic acid sequences.
- the term “expression profile” means measuring the abundance of the nucleic acid sequences in the measured samples.
- microRNA expression profiles are characterized in each thyroid sample.
- “Expression ratio”, as used herein, refers to relative expression levels of two or more nucleic acids, i.e. microRNAs, as determined by detecting the relative expression levels of the corresponding nucleic acids in a biological sample, such as a thyroid sample. Since microRNA expression levels are expressed as C_T values, which are obtained in log scale, in practice expression ratios are obtained by subtraction of the C_T values, rather than by division.
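The subtraction can be illustrated with a short sketch. It assumes ideal amplification, i.e. product doubling at every PCR cycle, so that one cycle corresponds to a two-fold difference; the function name is hypothetical.

```python
def log2_expression_ratio(ct_numerator, ct_denominator):
    """log2 of (numerator expression / denominator expression).

    A lower cycle threshold means higher abundance, so the numerator's
    value is subtracted from the denominator's.
    """
    return ct_denominator - ct_numerator


log_ratio = log2_expression_ratio(24.0, 27.0)
fold = 2 ** log_ratio
print(log_ratio, fold)  # 3.0 8.0
```

Here the numerator microRNA crosses the threshold three cycles earlier than the denominator, which under the doubling assumption corresponds to an eight-fold higher level.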
- FDR or “False Discovery Rate” is a statistical method used in multiple hypothesis testing to correct for multiple comparisons.
- When performing multiple statistical tests, for example in comparing the signal between two groups in multiple data features, there is an increasingly high probability of obtaining false positive results, by random differences between the groups that can reach levels that would otherwise be considered as statistically significant.
- statistical significance is defined only for data features in which the differences reached a p-value (by two-sided t-test) below a threshold, which is dependent on the number of tests performed and the distribution of p-values obtained in these tests.
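A p-value threshold that depends on the number of tests and on the distribution of the p-values is characteristic of the Benjamini-Hochberg step-up procedure. The sketch below assumes that procedure for illustration; the text does not name the exact method used, and the function name, FDR level and p-values are hypothetical.

```python
def benjamini_hochberg(p_values, fdr=0.05):
    """Return a list of booleans marking which tests are significant
    under the Benjamini-Hochberg step-up procedure."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank whose p-value is below its rank-scaled threshold.
    cutoff_rank = 0
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= fdr * rank / m:
            cutoff_rank = rank
    significant = [False] * m
    for rank, i in enumerate(order, start=1):
        if rank <= cutoff_rank:
            significant[i] = True
    return significant


p = [0.001, 0.008, 0.039, 0.041, 0.6]  # illustrative p-values
print(benjamini_hochberg(p, fdr=0.05))
```

In this example the third and fourth features have p-values below 0.05 but are not declared significant, because their rank-scaled thresholds (0.03 and 0.04) are stricter.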
- Fine-needle aspiration biopsy (FNAB, FNA or NAB)
- fine-needle aspiration cytology (FNAC)
- a biopsy is collected by inserting a thin, hollow needle into the mass for sampling of cells that, after being stained, will be examined under a microscope. There could be cytology exam of aspirate (cell specimen evaluation, FNAC) or histological (biopsy—tissue specimen evaluation, FNAB).
- FNA is a popular biopsy method used for thyroid nodules since a major surgical (excisional or open) biopsy can be avoided by performing a needle aspiration biopsy instead.
- a detailed description of specimen collection and preparation may be found in “Atlas of Fine Needle Aspiration Cytology” by Henryk A. Domanski (2014), the contents of which are incorporated herein by reference.
- the preparation of aspiration specimens has been well described in the art. Usually, a suitable amount of aspirate (usually about one drop) is spread thinly and evenly over a microscopic slide which is then stained and mounted. FNA specimens prepared in this manner are also referred to as “smears”. The result should be comparable to a sectioned histological slide with regard to specimen thickness and evenness.
- Fixation of FNA smears is usually by air drying (generally referred to as “routine air dried FNAB”) or wet fixing using either 95% ethanol or cyto-spray as fixative.
- suitable liquid fixatives are methanol, acetone, isopropyl alcohol, acetone/methanol and the like.
- FNA samples may be added to or mixed with preservatives in a tube.
- a “follicular” lesion may be any one of follicular adenoma (FA), follicular carcinoma (FC) and follicular variant of papillary thyroid carcinoma (FVPTC).
- “Fragment” is used herein to indicate a non-full-length part of a nucleic acid. Thus, a fragment is itself also a nucleic acid.
- “Groove binder” and/or “minor groove binder” (MGB), as used herein, may be used interchangeably and refer to small molecules that fit into the minor groove of double-stranded DNA, typically in a sequence-specific manner.
- Minor groove binders may be long, flat molecules that can adopt a crescent-like shape and thus, fit snugly into the minor groove of a double helix, often displacing water.
- Minor groove binding molecules may typically comprise several aromatic rings connected by bonds with torsional freedom such as furan, benzene, or pyrrole rings.
- Minor groove binders may be antibiotics such as netropsin, distamycin, berenil, pentamidine and other aromatic diamidines, Hoechst 33258, SN 6999, aureolic anti-tumor drugs such as chromomycin and mithramycin, CC-1065, dihydrocyclopyrroloindole tripeptide (DPI3), 1,2-dihydro-(3H)-pyrrolo[3,2-e]indole-7-carboxylate (CDPI3), and related compounds and analogues, including those described in Nucleic Acids in Chemistry and Biology, 2d ed., Blackburn and Gait, eds., Oxford University Press, 1996, and PCT Published Application No.
- a minor groove binder may be a component of a primer, a probe, a hybridization tag complement, or combinations thereof. Minor groove binders may increase the Tm of the primer or probe to which they are attached, allowing such primers or probes to effectively hybridize at higher temperatures.
- the residues of single sequence are included in the denominator but not the numerator of the calculation.
- thymine (T) and uracil (U) may be considered equivalent.
- Identity may be performed manually or by using a computer sequence algorithm such as BLAST, BLAST 2.0, and the like.
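A simplified, manual version of the identity calculation is sketched below. It assumes ungapped sequences aligned from their first residue, treats T and U as equivalent, and counts unpaired residues of the longer sequence in the denominator only. The function name and toy sequences are hypothetical, and tools such as BLAST perform a full alignment rather than this positional comparison.

```python
def percent_identity(seq_a, seq_b):
    """Percent of positions that match between two ungapped sequences
    aligned at their first residue; T and U count as the same base."""
    a = seq_a.upper().replace("U", "T")
    b = seq_b.upper().replace("U", "T")
    matches = sum(x == y for x, y in zip(a, b))
    # Overhanging residues of the longer sequence enter the denominator only.
    return 100.0 * matches / max(len(a), len(b))


print(percent_identity("ACGU", "ACGT"))  # 100.0
print(percent_identity("AAAA", "AAAT"))  # 75.0
```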
- “In situ detection”, as used herein, means the detection of expression or expression levels in the original site, here meaning in a tissue sample such as a biopsy.
- Label means a composition detectable by spectroscopic, photochemical, biochemical, immunochemical, chemical, or other physical means.
- the label may be any entity that does not naturally occur in a protein or nucleic acid and allows the nucleic acid or protein to be detectable.
- useful labels include 32P, fluorescent dyes, electron-dense reagents, enzymes, biotin, digoxigenin, or haptens and other entities which can be made detectable, and the like.
- a label may be incorporated into nucleic acids and proteins at any position.
- Logistic regression is part of a category of statistical models called generalized linear models. Logistic regression allows one to predict a discrete outcome, such as group membership, from a set of variables that may be continuous, discrete, dichotomous, or a mix of any of these. The dependent or response variable can be dichotomous, for example, one of two possible types of cancer. Logistic regression models the natural log of the odds, i.e. the ratio of the probability of belonging to the first group (P) over the probability of belonging to the second group (1-P), as a linear combination of the different expression levels (in log-space). The logistic regression output can be used as a classifier by prescribing that a case or sample will be classified into the first type if P is greater than 0.5 or 50%. Alternatively, the calculated probability P can be used as a variable in other contexts such as a 1D or 2D threshold classifier.
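The relationship between the linear combination, the probability P and the 0.5 cut-off can be sketched as follows. The intercept and coefficient values are hypothetical placeholders, not fitted values, and the function names are chosen for the example.

```python
import math


def logistic_probability(features, intercept, coefficients):
    """P(first group), where ln(P / (1 - P)) is a linear combination
    of the (log-space) expression levels."""
    log_odds = intercept + sum(c * x for c, x in zip(coefficients, features))
    return 1.0 / (1.0 + math.exp(-log_odds))


def classify(features, intercept, coefficients, cutoff=0.5):
    """Assign the first group when P exceeds the cutoff."""
    p = logistic_probability(features, intercept, coefficients)
    return "first group" if p > cutoff else "second group"


p = logistic_probability([2.0], intercept=0.0, coefficients=[1.0])
print(round(p, 4), classify([2.0], 0.0, [1.0]))
print(classify([0.0], 0.0, [1.0]))  # P is exactly 0.5, not greater than it
```

Taking the natural log of P/(1-P) for the first example recovers the linear combination (2.0), which is the quantity the model is linear in.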
- the term “prior” refers to a probability assigned to each class, e.g., the likelihood that a sample is malignant or benign, without any additional knowledge regarding the expression profile of the sample, and used in a classification.
- Priors may be set at different ratios, such as for example 80%-20% malignant-benign, 75%-25% malignant-benign, 70%-30% malignant-benign, 65%-35% malignant-benign, 60%-40% malignant-benign, 50%-50% malignant-benign (i.e., uniform).
- priors may be empirical, i.e., based on the distribution of the samples in training cohort. Priors may be adjusted in order to achieve a predetermined sensitivity or specificity.
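How priors might enter a classification can be illustrated with a short sketch. It assumes, as an illustration only, that a classifier supplies a per-class likelihood for a sample and that priors are combined with those likelihoods by Bayes' rule; the source does not specify this mechanism. The function names and the numbers in the usage note are hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable


def empirical_priors(labels: Iterable[str]) -> Dict[str, float]:
    """Priors based on the class distribution in a training cohort."""
    counts = Counter(labels)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no training labels given")
    return {cls: n / total for cls, n in counts.items()}


def posterior(likelihoods: Dict[str, float],
              priors: Dict[str, float]) -> Dict[str, float]:
    """Combine class likelihoods with priors (Bayes' rule, assumed here)."""
    unnormalized = {cls: likelihoods[cls] * priors[cls] for cls in priors}
    z = sum(unnormalized.values())
    if z == 0:
        raise ValueError("all class probabilities are zero")
    return {cls: v / z for cls, v in unnormalized.items()}
```

With equal likelihoods for both classes, an 80%-20% malignant-benign prior yields an 80% posterior for malignant, which shows how shifting the prior moves the operating point toward higher sensitivity or specificity.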
- a “marker” is a microRNA, or a nucleic acid sequence, whose presence and abundance is measured in a sample. A “marker” further provides an indication of the status of the sample.
- malignant marker is a microRNA, or a nucleic acid sequence which is present at higher levels in malignant samples versus benign samples.
- a malignant marker may or may not be present in test samples.
- secondary marker is a microRNA, or a nucleic acid sequence, which is used to differentiate between malignant and benign samples, and for which the difference, or the ratio, in the expression levels of said secondary marker in malignant and benign samples is less than the difference, or the ratio, in the expression levels of malignant markers.
- a secondary marker may or may not be present in test samples.
- cell type marker refers to a microRNA, or nucleic acid sequence, whose expression correlates with certain cell types. Said cell types may generally be found in a sample, e.g. blood cells, white blood cells, red blood cells, epithelial cells, Hurthle cells, mitochondrial-rich cells, lymphocytes, follicular cells, parafollicular cells (C cells), metastatic cells, immune cells, macrophages and the like. Other markers included as “cell type markers” may be species-specific markers, such as markers from bacteria, fungi, and the like.
- Normalizer means a microRNA or a nucleic acid sequence whose signal (i.e., level of expression) is used in order to normalize each sample.
- a normalizer may be used alone (one microRNA as normalizer), or as part of a set of normalizers (more than one microRNA as normalizer, for example two, three, four, five, six, seven, eight, nine, ten, eleven, twelve, thirteen, fourteen, sixteen or seventeen microRNAs may be used as normalizers in a set).
- any microRNA detected in the sample may be used as a normalizer.
- any microRNA may be used as a normalizer.
- MicroRNAs denoted by any one of SEQ ID NOs 1-182 may be used as normalizers.
- MicroRNAs denoted by any one of SEQ ID NOs. 1-37 may be used as normalizers.
- Particular examples of microRNAs that may be used as normalizers are hsa-miR-23a-3p, MID-20094, MID-50969, hsa-miR-345-5p, hsa-miR-3074-5p, MID-50976, MID-50971, hsa-miR-5701 and hsa-miR-574-3p.
- Normalization of data values refers to mapping the original data range into another scale. Normalization may be done by subtracting the mean expression of the set of normalizers, subtracting the median expression of the set of normalizers, or fitting the expression values of the normalizers to a reference set of values (using a polynomial fit) and applying this fit to all signals. All the normalizers, or a subset of the normalizers, may be used.
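A minimal sketch of the first two normalization options (subtracting the mean or the median expression of the normalizer set) is given here. The polynomial-fit option is not shown. The function name, the dictionary layout and the expression values in the usage note are assumptions for illustration; expression values are taken to be in log-space so that subtraction is meaningful.

```python
from statistics import mean, median
from typing import Dict, Iterable


def normalize_sample(expr: Dict[str, float],
                     normalizers: Iterable[str],
                     method: str = "mean") -> Dict[str, float]:
    """Subtract the mean or median normalizer expression from every signal.

    `expr` maps microRNA name to its (log-space) expression in one sample.
    `normalizers` may be the full normalizer set or a subset of it.
    """
    values = [expr[name] for name in normalizers]
    if not values:
        raise ValueError("at least one normalizer is required")
    if method == "mean":
        offset = mean(values)
    elif method == "median":
        offset = median(values)
    else:
        raise ValueError("method must be 'mean' or 'median'")
    return {name: value - offset for name, value in expr.items()}
```

For a hypothetical sample with hsa-miR-23a-3p at 10.0 and hsa-miR-345-5p at 12.0 used as normalizers, the offset is 11.0, so a marker measured at 15.0 becomes 4.0 after normalization.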
- “Nucleic acid” or “oligonucleotide” or “polynucleotide”, as used herein, means at least two nucleotides covalently linked together.
- the depiction of a single strand also defines the sequence of the complementary strand.
- a nucleic acid also encompasses the complementary strand of a depicted single strand.
- Many variants of a nucleic acid may be used for the same purpose as a given nucleic acid.
- a nucleic acid also encompasses substantially identical nucleic acids and complements thereof.
- a single strand may provide a probe that hybridizes to a target sequence under stringent hybridization conditions.
- a nucleic acid also encompasses a probe that hybridizes under stringent hybridization conditions.
- Nucleic acids may be single-stranded or double-stranded, or may contain portions of both double-stranded and single-stranded sequences.
- the nucleic acid may be DNA, both genomic and cDNA, RNA, or a hybrid, where the nucleic acid may contain combinations of deoxyribo- and ribo-nucleotides, and combinations of bases including uracil, adenine, thymine, cytosine, guanine, inosine, xanthine, hypoxanthine, isocytosine and isoguanine.
- Nucleic acids may be obtained by chemical synthesis methods or by recombinant methods.
- a nucleic acid will generally contain phosphodiester bonds, although nucleic acid analogs may be included.
- the analog may include a non-naturally occurring linkage, backbone, or nucleotide.
- the analog may have at least one different linkage, e.g., phosphoramidate, phosphorothioate, phosphorodithioate, or O-methylphosphoroamidite linkages and peptide nucleic acid backbones and linkages.
- Other analog nucleic acids include those with positive backbones; non-ionic backbones, and non-ribose backbones, including those described in U.S. Pat. Nos. 5,235,033 and 5,034,506, which are incorporated herein by reference.
- Nucleic acids containing one or more non-naturally occurring or modified nucleotides are also included within one definition of nucleic acids.
- the modified nucleotide analog may be located for example at the 5′-end and/or the 3′-end of the nucleic acid molecule.
- Representative examples of nucleotide analogs may be selected from sugar- or backbone-modified ribonucleotides.
- nucleobase-modified ribonucleotides i.e., ribonucleotides containing a non-naturally occurring nucleobase instead of a naturally occurring nucleobase such as uridines or cytidines modified at the 5-position, e.g., 5-(2-amino) propyl uridine, 5-bromo uridine; adenosines and guanosines modified at the 8-position, e.g., 8-bromo guanosine; deaza nucleotides, e.g., 7-deaza-adenosine; O- and N-alkylated nucleotides, e.g., N6-methyl adenosine are suitable.
- the 2′-OH-group may be replaced by a group selected from H, OR, R, halo, SH, SR, NH2, NHR, NR2 or CN, wherein R is C1-C6 alkyl, alkenyl or alkynyl and halo is F, Cl, Br or I.
- Modified nucleotides also include nucleotides conjugated with cholesterol through, e.g., a hydroxyprolinol linkage as described in Krutzfeldt et al. (Nature 2005; 438:685-689), Soutschek et al. (Nature 2004; 432:173-178), and WO 2005/079397, which are incorporated herein by reference.
- Modifications of the ribose-phosphate backbone may be done for a variety of reasons, e.g., to increase the stability and half-life of such molecules in physiological environments, to enhance diffusion across cell membranes, or as probes on a biochip.
- the backbone modification may also enhance resistance to degradation, such as in the harsh endocytic environment of cells.
- the backbone modification may also reduce nucleic acid clearance by hepatocytes, such as in the liver and thyroid. Mixtures of naturally occurring nucleic acids and analogs may be made. Alternatively, mixtures of different nucleic acid analogs may be made.
- nucleic acids provided herein may be non-naturally occurring, synthesized nucleic acids.
- the nucleic acid provided herein may be a synthetic nucleic acid. Methods of synthesizing nucleic acids are known to the man skilled in the art, and are described, e.g., in U.S. Pat. No. 7,579,451, the contents of which are incorporated herein by reference.
- the nucleic acids may comprise at least one of the sequences of SEQ ID NOS: 1-308 or a variant thereof. In one embodiment, the nucleic acids comprise at least one of the sequences of SEQ ID NOS: 1-182.
- the variant may be a complement of the referenced nucleotide sequence.
- the variant may be a nucleotide sequence that is 70%, 75%, 80%, 85%, 90% or 95% identical to the referenced nucleotide sequence or the complement thereof.
- the variant may be a nucleotide sequence which hybridizes under stringent conditions to the referenced nucleotide sequence, complements thereof, or nucleotide sequences substantially identical thereto.
- a nucleic acid as described herein may have a length of from about 10 to about 250 nucleotides.
- the nucleic acid may have a length of at least 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 35, 40, 45, 50, 60, 70, 80, 90, 100, 125, 150, 175, 200 or 250 nucleotides.
- the nucleic acid may be synthesized or expressed in a cell (in vitro or in vivo) using a synthetic gene.
- the nucleic acid may be synthesized as a single strand molecule and hybridized to a substantially complementary nucleic acid to form a duplex.
- the nucleic acid may be introduced to a cell, tissue or organ in a single- or double-stranded form or capable of being expressed by a synthetic gene using methods well known to those skilled in the art, including as described in U.S. Pat. No. 6,506,559, the contents of which are incorporated by reference herein.
- the nucleic acid may comprise a microRNA sequence shown in Table 1, or a variant thereof. In some instances, variants of the same microRNA are also provided in Table 1. It is to be noted that SEQ ID NOs.1-180 in Table 1 present the cDNA corresponding to the sequence of the naturally occurring microRNA, i.e., the sequences present thymine (T) instead of uracil (U).
- nucleic acid refers to deoxyribonucleotides, ribonucleotides, or modified nucleotides, and polymers thereof in single- or double-stranded form.
- the term encompasses nucleic acids containing known nucleotide analogs or modified backbone residues or linkages, which are synthetic, naturally occurring, and non-naturally occurring, which have similar binding properties as the reference nucleic acid, and which are metabolized in a manner similar to the reference nucleotides.
- Examples of such analogs include, without limitation, phosphorothioates, phosphoramidates, methyl phosphonates, chiral-methyl phosphonates, 2-O-methyl ribonucleotides, peptide-nucleic acids (PNAs) and unlocked nucleic acids (UNAs; see, e.g., Jensen et al. Nucleic Acids Symposium Series 52: 133-4), and derivatives thereof.
- Nucleotide is used as recognized in the art, to include those with natural bases (standard), and modified bases well known in the art. Such bases are generally located at the 1′ position of a nucleotide sugar moiety. Nucleotides generally comprise a base, sugar and a phosphate group. The nucleotides can be unmodified or modified at the sugar, phosphate and/or base moiety, also referred to interchangeably as nucleotide analogs, modified nucleotides, non-natural nucleotides, non-standard nucleotides and other (see, e.g., WO 92/07065; WO 93/15187; the contents of which are incorporated herein by reference).
- There are several examples of modified nucleic acid bases known in the art, as summarized by Limbach et al., Nucleic Acids Res. 22:2183, 1994. Some non-limiting examples of base modifications that can be introduced into nucleic acid molecules include hypoxanthine, purine, pyridin-4-one, pyridin-2-one, phenyl, pseudouracil, 2,4,6-trimethoxy benzene, 3-methyl uracil, dihydrouridine, naphthyl, aminophenyl, 5-alkylcytidines (e.g., 5-methylcytidine), 5-alkyluridines (e.g., ribothymidine), 5-halouridine (e.g., 5-bromouridine) or 6-azapyrimidines or 6-alkylpyrimidines (e.g.
- By “modified bases” in this aspect is meant nucleotide bases other than adenine, guanine, cytosine and uracil at the 1′ position, or their equivalents.
- Modified nucleotide refers to a nucleotide that has one or more modifications to the nucleoside, the nucleobase, pentose ring, or phosphate group. Modifications include those naturally occurring that result from modification by enzymes that modify nucleotides, such as methyltransferases. Modified nucleotides also include synthetic or non-naturally occurring nucleotides.
- Synthetic or non-naturally occurring modifications in nucleotides include those with 2′ modifications, e.g., 2′-methoxyethoxy, 2′-fluoro, 2′-allyl, 2′-O-[2-(methylamino)-2-oxoethyl], 4′-thio, 4′-CH2-O-2′-bridge, 4′-(CH2)2-O-2′-bridge, 2′-LNA or other bicyclic or “bridged” nucleoside analog, and 2′-O-(N-methylcarbamate) or those comprising base analogs.
- By “amino” is meant 2′-NH2 or 2′-O-NH2, which can be modified or unmodified. Such modified groups are described, e.g., in U.S. Pat. Nos. 5,672,695 and 6,248,878. “Modified nucleotides” of the instant invention can also include nucleotide analogs as described above.
- base analog refers to a heterocyclic moiety which is located at the 1′ position of a nucleotide sugar moiety in a modified nucleotide that can be incorporated into a nucleic acid duplex (or the equivalent position in a nucleotide sugar moiety substitution that can be incorporated into a nucleic acid duplex).
- a base analog is generally a purine or a pyrimidine base, excluding the common bases guanine (G), cytosine (C), adenine (A), thymine (T), and uracil (U). Base analogs can duplex with other bases or base analogs in dsRNAs.
- Base analogs include those useful in the compounds and methods of the invention, e.g., those disclosed in U.S. Pat. Nos. 5,432,272, 6,001,983 and 7,579,451, which are herein incorporated by reference.
- bases include hypoxanthine (I), xanthine (X), 3β-D-ribofuranosyl-(2,6-diaminopyrimidine) (K), 3-gamma-D-ribofuranosyl-(1-methyl-pyrazolo[4,3-d]pyrimidine-5,7(4H,6H)-dione) (P), iso-cytosine (iso-C), iso-guanine (iso-G), 1-gamma-D-ribofuranosyl-(5-nitroindole), 1-gamma-D-ribofuranosyl-(3-nitropyrrole), 5-bromouracil, 2-aminopurine, 4-thio-dT, 7-(2-thio-
- Base analogs may also be a universal base.
- Universal base refers to a heterocyclic moiety located at the 1′ position of a nucleotide sugar moiety in a modified nucleotide, or the equivalent position in a nucleotide sugar moiety substitution, that, when present in a nucleic acid duplex, can be positioned opposite more than one type of base without altering the double helical structure (e.g., the structure of the phosphate backbone). Additionally, the universal base does not destroy the ability of the single stranded nucleic acid in which it resides to duplex to a target nucleic acid.
- microRNAs of the invention

| miR name | SEQ ID NO. | Sequence |
|---|---|---|
| hsa-miR-222-3p | 1 | AGCTACATCTGGCTACTGGGT |
| | 2 | AGCTACATCTGGCTACTGGGTCTC |
| hsa-miR-551b-3p | 3 | GACCCATACTTGGTTTCAGAGG |
| | 4 | GCGACCCATACTTGGTTTCAG |
| hsa-miR-31-5p | 5 | AGGCAAGATGCTGGCATAGCT |
| | 6 | AGGCAAGATGCTGGCATAGCTGT |
| | 7 | GGCAAGATGCTGGCATAGCTG |
| hsa-miR-375 | 8 | TTTGTTCGTTCGGCTCGCGTGA |
| hsa-miR-125b-5p | 9 | TCCCTGAGACCCTAACTTGTGA |
| hsa-miR-146b-5p | 10 | TGAGAACTGAATTCCATAGGCT |
| | 11 | TGAGAACTGAATTCCATAGGCTGT |
| hsa-miR-152-3p | 12 | TCAGTGCATGACAGAACTTGG |
| | 13 | TCAGTGCATGACA |
- miR name is the miRBase registry name (release 20), except for the miR names represented by MID-[numeral] or MD2-[numeral].
- MID-00078, MID-00321, MID-00387, MID-00671, MID-00672, MID-00690, MID-15965, MID-16318, MID-17144, MID-17866, MID-18468, MID-19433, MID-19434, MID-23168, MID-23794, MID-24496, MID-24705, MD2-495 and MD2-437 are putative microRNAs, which were predicted and/or cloned at Rosetta Genomics.
- the nucleic acid may also comprise a miR hairpin sequence shown in Table 2, or a variant thereof.
- Hairpins of the microRNAs of the invention

| miR name | Hairpin SEQ ID NO. | Hairpin Sequence |
|---|---|---|
| hsa-mir-7 | 183 | GTGGACCGGCTGGCCCCATCTGGAAGACTAGTGATTTTGTTGTTGTCTTACTGCGCTCAACAACAAATCCCAGTCTACCTAATGGTGCCAGCCATCGC |
| hsa-mir-10a | 184 | GTCTTCTGTATATACCCTGTAGATCCGAATTTGTGTAAGGAATTTTGTGGTCACAAATTCGTATCTAGGGGAATATGTAGTTGAC |
| hsa-mir-18a | 185 | GTTCTAAGGTGCATCTAGTGCAGATAGTGAAGTAGATTAGCATCTACTGCCCTAAGTGCTCCTTCTGGC |
| hsa-mir-21 | 186 | GTACCACCTTGTCGGGTAGCTTATCAGACTGATGTTGACTGTTGAATCTCATGGCAACACCAGTCGATGGGCTGTCTGACATTTTGGTAT |
| hsa-mir-23a | 187 | GGCCGGCTGGTTCCTGGG |
- SEQ ID NOs.183-306 in Table 2 present the cDNA corresponding to the sequence of the naturally occurring pre-miR, i.e., the sequences present thymine (T) instead of uracil (U).
- the nucleic acid may be in the form of a nucleic acid complex, and may further comprise one or more of the following: a peptide, a protein, a RNA-DNA hybrid, an antibody, an antibody fragment, a Fab fragment, or an aptamer.
- the nucleic acid may also comprise a sequence of a pre-miRNA or a variant thereof.
- the pre-miRNA sequence may comprise from 45-90, 60-80 or 60-70 nucleotides.
- the sequence of the pre-miRNA may comprise a miRNA and a miRNA* as set forth herein.
- the sequence of the pre-miRNA may comprise the sequence of SEQ ID NOS: 183-308 or variants thereof.
- the nucleic acid may be at least 70%, 75%, 80%, 85%, 90%, 95%, 97%, 98% or 99% identical to the nucleic acid sequences in Tables 1 or 2 (with increments of 1% from 80 to 99%), over a region of 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 30, 35, 40, 45, 50 or more nucleotides.
- the nucleic acid may also comprise a sequence of a microRNA (including a miRNA*) or a variant thereof, including those putative microRNAs represented by MID-[numeral].
- microRNAs include those miRs which have been listed in the miRBase registry name (release 20), as well as putative microRNAs which have been predicted and/or cloned by Rosetta Genomics and which are represented by MID-[numeral].
- the microRNA sequence may comprise from 13-33, 18-24 or 21-23 nucleotides.
- the microRNA may also comprise a total of at least 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39 or 40 nucleotides.
- the sequence of the microRNA may be the first 13-33 nucleotides of the pre-miRNA.
- the sequence of the microRNA may also be the last 13-33 nucleotides of the pre-miRNA.
- the sequence of the microRNA may comprise the sequence of any one of SEQ ID NOS: 1-182 or a variant thereof.
- the present invention employs microRNAs for the identification, classification and diagnosis of thyroid nodules.
- “Variant”, as used herein referring to a nucleic acid, means (i) a portion of a referenced nucleotide sequence; (ii) the complement of a referenced nucleotide sequence or portion thereof; (iii) a nucleic acid that differs from the referenced nucleotide sequence by a point-mutation or the complement thereof; (iv) a naturally-occurring variant of the referenced nucleotide sequence present in the general population or the complement thereof; or (v) a nucleic acid that hybridizes under stringent conditions to the referenced nucleic acid, or the complement thereof.
- Probe means an oligonucleotide capable of binding to a target nucleic acid of complementary sequence through one or more types of chemical bonds, usually through complementary base pairing, usually through hydrogen bond formation. Probes may bind target sequences lacking complete complementarity with the probe sequence depending upon the stringency of the hybridization conditions. For example, for hybridization assays, the probe may be complementary to at least 8, at least 9, at least 10, at least 11, at least 12, at least 13, at least 14, at least 15, at least 16, at least 17, at least 18, at least 19, at least 20 contiguous nucleotides of the sequence of the microRNA being detected.
- the probe may be complementary to at least 8, at least 9, at least 10, at least 11, at least 12, at least 13, at least 14, at least 15, at least 16, at least 17, at least 18, at least 19, at least 20 contiguous nucleotides of the sequence of the PCR product being detected.
- a probe may be complementary to, or may hybridize to at least 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 97%, 98% or 99% of its target nucleic acid.
- a probe may be single-stranded or partially single- and partially double-stranded.
- the strandedness of the probe is dictated by the structure, composition and properties of the target sequence.
- Probes may include a label, an attachment, or a nucleotide sequence that does not naturally occur in a nucleic acid described herein. Probes may be directly labeled or indirectly labeled such as with biotin to which a streptavidin complex may bind.
- Probe may be an agent for detecting the nucleic acid sequences described herein.
- Probe may be a labeled nucleic acid probe capable of hybridizing to a portion of the nucleic acid sequence of the invention, or amplification products derived therefrom.
- the nucleic acid probe is a reverse-complementary nucleic acid molecule of the nucleic acid sequence disclosed herein.
- a probe may be a nucleic acid sequence which sufficiently specifically hybridizes under stringent conditions to the nucleic acid disclosed herein.
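As an illustration of a reverse-complementary probe, the sketch below derives the DNA reverse complement of a target sequence. The function name is hypothetical, and the choice to read uracil (U) as thymine (T) and to emit a DNA probe is an assumption for the example, not a requirement stated in the source.

```python
# Complement table for DNA output; U in the input is read as T.
_COMPLEMENT = str.maketrans("ACGTUacgtu", "TGCAAtgcaa")


def reverse_complement(seq: str) -> str:
    """Return the DNA reverse complement of a DNA or RNA sequence."""
    unexpected = set(seq) - set("ACGTUacgtu")
    if unexpected:
        raise ValueError(f"unexpected characters: {sorted(unexpected)}")
    # Complement every base, then reverse the string.
    return seq.translate(_COMPLEMENT)[::-1]
```

Applied to a cDNA sequence such as those listed in Table 1, the result is the strand that would base-pair with that sequence along its full length; applying the function twice to a DNA sequence returns the original.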
- a probe is optionally labeled with a fluorescent molecule such as a fluorescein, e.g. 6-carboxyfluorescein (FAM), an indocarbocyanine, e.g. QUASAR-670, a hexafluorocine, such as 6-carboxyhexafluorescein (HEX), or other fluorophore molecules, and optionally a quencher.
- a quencher is appreciated to be matched to a fluorophore.
- Illustrative examples of a quencher include the black hole quenchers BHQ1, and BHQ2, or minor groove binders (MGB), e.g. dihydrocyclopyrroloindole tripeptide.
- Other fluorophores and quenchers are known in the art and are similarly operable herein.
- the present invention also provides a probe, said probe comprising the novel nucleic acid sequences described herein, defined by any one of SEQ ID NOs. 27-29, 33, 34, 139, 140, 307 and 308, or variants thereof.
- Probes may be used for screening and diagnostic methods.
- the probe may be attached or immobilized to a solid substrate, such as a biochip.
- the probe may have a length of from 8 to 500, 10 to 100 or 20 to 60 nucleotides.
- the probe may have a length of at least 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 35, 40, 45, 50, 60, 70, 80, 90, 100, 120, 140, 160, 180, 200, 220, 240, 260, 280 or 300 nucleotides.
- the probe may further comprise a linker sequence of from 10-60 nucleotides.
- the probe may further comprise a linker.
- the linker may comprise a sequence that does not occur naturally in a nucleic acid described herein.
- the linker may be 10-60 nucleotides in length.
- the linker may be 20-27 nucleotides in length.
- the linker may be of sufficient length to allow the probe to be a total length of 45-60 nucleotides.
- the linker may not be capable of forming a stable secondary structure, or may not be capable of folding on itself, or may not be capable of folding on a non-linker portion of a nucleic acid contained in the probe.
- the sequence of the linker is heterogeneous, and it may not appear in the genome of the animal from which the probe non-linker nucleic acid is derived.
- the term “reference value” means a value that statistically correlates to a particular outcome when compared to an assay result.
- the reference value is determined from statistical analysis of studies that compare microRNA expression with known clinical outcomes.
- the reference value may vary according to the classifier (i.e. the algorithm) used.
- the reference value may be the expression levels (or values) of all the microRNAs in the training data.
- the reference value may be one or more thresholds established by the classifier.
- the reference value may further be a coefficient or set of coefficients. Essentially the reference value refers to any parameter needed or used by the algorithm.
- “Sensitivity”, as used herein, may mean a statistical measure of how well a classification test correctly identifies a condition or conditions, for example, how frequently it correctly classifies a cancer into the correct type out of two or more possible types.
- sensitivity means how frequently the test correctly classifies a cancer into the correct type out of two possible types.
- the sensitivity for class A is the proportion of cases that are determined to belong to class “A” by the test out of the cases that are in class “A”, as determined by some absolute or gold standard.
- “Smear”, as used herein, refers to a sample of thyroid tissue spread thinly on a microscope slide for examination, typically for medical diagnosis. Smears from FNAs usually have very small amounts of cells, which results in small amounts of RNA, which may range from 1-1000 ng, 1-100 ng, 1-50 ng, 1-40 ng, 5-100 ng, 5-50 ng, 5-40 ng, accordingly. Smears may be stained with any stain known to the man skilled in the art of cytology, histology or pathology, such as any stain used to differentiate cells in pathologic specimens.
- stains are multichromatic stains, like Papanicolaou, which are a combination of nuclear stain and cytoplasm stain; cellular structure stains such as Wright, Giemsa, Romanowsky and the like; nuclear stains, such as Hoechst stains and the like; cell viability stains, such as Trypan blue, and the like; and enzyme activity stains, such as benzidine for HRP to form a visible precipitate, and the like.
- Specificity may mean a statistical measure of how well a binary classification test correctly identifies cases that do not have a specific condition, for example, how frequently it correctly classifies a sample as non-cancer when indeed it is a non-cancerous sample.
- the specificity for class A is the proportion of cases that are determined to belong to class “not A” by the test out of the cases that are in class “not A”, as determined by some absolute or gold standard.
- Specificity may mean a statistical measure of how well a classification test correctly identifies cases that do not have a specific condition.
- the specificity for class A is the proportion of cases that are determined by the test not to belong to class A out of the cases that are not in class A, as determined by some absolute or gold standard.
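The definitions of sensitivity and specificity for a class A can be sketched as a small calculation. The function name and the label values are assumptions for illustration; `truth` stands for the gold-standard labels and `predicted` for the test results.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(truth: Sequence[str],
                            predicted: Sequence[str],
                            class_a: str) -> Tuple[float, float]:
    """Sensitivity and specificity of a test for class `class_a`.

    Sensitivity: cases called A by the test, out of the cases truly in A.
    Specificity: cases called not-A by the test, out of the cases truly not in A.
    """
    if len(truth) != len(predicted):
        raise ValueError("truth and predicted must have equal length")
    tp = fn = tn = fp = 0
    for t, p in zip(truth, predicted):
        if t == class_a:
            if p == class_a:
                tp += 1
            else:
                fn += 1
        else:
            if p == class_a:
                fp += 1
            else:
                tn += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one case in A and one not in A")
    return tp / (tp + fn), tn / (tn + fp)
```

For instance, if two of three true class A cases are called A and one of two non-A cases is called not-A, the sensitivity is 2/3 and the specificity is 1/2.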
- stage of cancer refers to a numerical measurement of the level of advancement of a cancer. Criteria used to determine the stage of a cancer include, but are not limited to, the size of the tumor, whether the tumor has spread to other parts of the body and where the cancer has spread (e.g., within the same organ or region of the body or to another organ).
- Stringent hybridization conditions mean conditions under which a first nucleic acid sequence (e.g., probe) will hybridize to a second nucleic acid sequence (e.g., target), such as in a complex mixture of nucleic acids. Stringent conditions are sequence-dependent and will be different in different circumstances. Stringent conditions may be selected to be about 5-10° C. lower than the thermal melting point (Tm) for the specific sequence at a defined ionic strength and pH. The Tm may be the temperature (under defined ionic strength, pH and nucleic acid concentration) at which 50% of the probes complementary to the target hybridize to the target sequence at equilibrium (as the target sequences are present in excess, at Tm, 50% of the probes are occupied at equilibrium).
- Stringent conditions may be those in which the salt concentration is less than about 1.0 M sodium ion, such as about 0.01-1.0 M sodium ion concentration (or other salts) at pH 7.0 to 8.3 and the temperature is at least about 30° C. for short probes (e.g., about 10-50 nucleotides) and at least about 60° C. for long probes (e.g., greater than about 50 nucleotides). Stringent conditions may also be achieved with the addition of destabilizing agents such as formamide. For selective or specific hybridization, a positive signal may be at least 2 to 10 times background hybridization.
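The numeric guidelines above can be written as a simple check. This sketch takes the stated bounds literally (the source qualifies them with "about"), and the function and parameter names are hypothetical. It does not compute Tm and does not model destabilizing agents such as formamide.

```python
def meets_stringency_guidelines(sodium_molar: float,
                                ph: float,
                                temp_c: float,
                                probe_len: int) -> bool:
    """Check conditions against the stated stringency guidelines.

    Thresholds are read literally from the text: sodium below 1.0 M,
    pH 7.0 to 8.3, at least 30 C for short probes (up to 50 nt) and
    at least 60 C for long probes (more than 50 nt).
    """
    if probe_len <= 0:
        raise ValueError("probe length must be positive")
    salt_ok = sodium_molar < 1.0
    ph_ok = 7.0 <= ph <= 8.3
    min_temp_c = 30.0 if probe_len <= 50 else 60.0
    return salt_ok and ph_ok and temp_c >= min_temp_c
```

Under these literal thresholds, a 22-nucleotide probe at 42° C. in 0.5 M sodium at pH 7.5 passes, while an 80-nucleotide probe under the same conditions does not, because long probes call for at least about 60° C.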
- Exemplary stringent hybridization conditions include the following: 50% formamide, 5×SSC, and 1% SDS, incubating at 42° C., or, 5×SSC, 1% SDS, incubating at 65° C., with wash in 0.2×SSC, and 0.1% SDS at 65° C., DMSO, 6×SSPE+0.005% N-Lauroylsarcosine+0.005% Triton X-102, 0.06×SSPE+0.005% N-Lauroylsarcosine+0.005% Triton X-102.
- the term “subject” refers to a mammal, including both human and other mammals.
- the methods of the present invention are preferably applied to human subjects.
- subtype of cancer refers to different types of cancer that affect the same organ (e.g., papillary, follicular carcinoma and follicular variant papillary carcinoma of the thyroid).
- thyroid lesion may mean a thyroid tumor, including sub-types of thyroid tumors, such as Hashimoto disease, follicular carcinoma, papillary carcinoma, follicular variant of papillary carcinoma (FVPC or FVPTC), encapsulated FVPC (or encapsulated FVPTC), non-encapsulated (infiltrative/diffuse) FVPC or FVPTC, medullary carcinoma, anaplastic thyroid cancer, or poorly differentiated thyroid cancer.
- threshold expression profile refers to a criterion expression profile to which measured values are compared in order to classify a tumor.
- tissue sample is tissue obtained from a tissue biopsy using methods well known to those of ordinary skill in the related medical arts.
- the phrase “suspected of being cancerous”, as used herein, means a cancer tissue sample believed by one of ordinary skill in the medical arts to contain cancerous cells. Methods for obtaining the sample from the biopsy include gross apportioning of a mass, microdissection, laser-based microdissection, or other art-known cell-separation methods.
- Tumor refers to all neoplastic cell growth and proliferation, whether malignant or benign, and all pre-cancerous and cancerous cells and tissues.
- the cytological classification of the thyroid lesions or tumor samples used herein is based on “The Bethesda System for Reporting Thyroid Cytopathology”, the “BSRTC” (Syed Z. Ali and Edmund S. Cibas, eds.; DOI 10.1007/978-0-387-87666-5_1; Springer Science+Business Media, LLC 2010).
- the BSRTC recommends that each thyroid FNA report be accompanied by a general diagnostic category, in which each category has an implied cancer risk.
- Indeterminate refers to thyroid lesions or tumor samples examined for cytology and classified according to the Bethesda classification in categories III, IV and V.
- the present invention further provides a method for identifying subtypes of thyroid lesions in a subject, said subtypes of thyroid lesions being subtypes of malignant or benign thyroid tumors.
- Subtype is any one of follicular carcinoma, papillary carcinoma, follicular variant of papillary carcinoma (FVPC or FVPTC), encapsulated FVPC (or encapsulated FVPTC), non-encapsulated FVPC (or non-encapsulated FVPTC), medullary carcinoma, anaplastic thyroid cancer or poorly differentiated thyroid cancer.
- said subtype is any one of Hashimoto thyroiditis, follicular adenoma or hyperplasia.
- said subtype is Hurthle cell carcinoma.
- the present invention provides a method for distinguishing between follicular adenoma and follicular carcinoma.
- the present invention provides a method for distinguishing follicular adenoma from papillary carcinoma.
- the present invention provides a method for distinguishing follicular adenoma from follicular variant of papillary carcinoma.
- the present invention provides a method for distinguishing non-encapsulated follicular variant of papillary carcinoma from benign lesions.
- the present invention provides a method for distinguishing between papillary carcinoma and Hashimoto thyroiditis.
- Vector refers to any known vector such as a plasmid vector, a phage vector, a phagemid vector, a cosmid vector, or a virus vector.
- the nucleic acid described herein may be comprised in a vector.
- the vector may be used for delivery of the nucleic acid.
- the vector preferably contains at least a promoter that enhances expression of the nucleic acid carried, and in this case the nucleic acid is preferably operably linked to such a promoter.
- the vector may or may not be replicable in a host cell, and the transcription of a gene may be carried out either outside the nucleus or within the nucleus of a host cell. In the latter case, the nucleic acid may be incorporated into the genome of a host cell.
- a vector may be a DNA or RNA vector.
- a vector may be either a self-replicating extrachromosomal vector or a vector that integrates into a host genome.
- the levels of microRNAs are measured by reverse transcription polymerase chain reaction (RT-PCR).
- Target sequences of a cDNA are generated by reverse transcription of a target RNA, which may be a nucleic acid described herein (comprising a sequence provided in Tables 1 and 2).
- Known methods for generating cDNA involve reverse transcribing either polyadenylated RNA or alternatively, RNA with a ligated adaptor sequence.
- RNA may be ligated to an adaptor sequence prior to reverse transcription.
- a ligation reaction may be performed by T4 RNA ligase to ligate an adaptor sequence at the 3′ end of the RNA.
- Reverse transcription (RT) reaction may then be performed using a primer comprising a sequence that is complementary to the 3′ end of the adaptor sequence.
- polyadenylated RNA may be used in a reverse transcription (RT) reaction using a poly(T) primer comprising a 5′ adaptor sequence.
- the poly(T) sequence may comprise 8, 9, 10, 11, 12, 13, or 14 consecutive thymines.
- the reverse transcript of the RNA may then be amplified by real-time PCR, using a specific forward primer comprising at least 15 nucleic acids complementary to the target nucleic acid and a 5′ tail sequence; a reverse primer that is complementary to the 3′ end of the adaptor sequence; and a probe comprising at least 8 nucleic acids complementary to the target nucleic acid.
- the probe may be partially complementary to the 5′ end of the adaptor sequence.
- the amplification of the reverse transcripts of the target nucleic acids may be by PCR or the like.
- the first cycles of the PCR reaction may have an annealing temperature of 56° C., 57° C., 58° C., 59° C., or 60° C.
- the first cycles may comprise 1-10 cycles.
- the remaining cycles of the PCR reaction may have an annealing temperature of 60° C.
- the remaining cycles may comprise 2-40 cycles.
- the PCR reaction comprises a forward primer.
- the forward primer may comprise 15, 16, 17, 18, 19, 20, or 21 nucleotides identical to the target nucleic acid.
- the 3′ end of the forward primer may be sensitive to differences in sequence between a target nucleic acid and highly similar sequences.
- the forward primer may also comprise a 5′ overhanging tail.
- the 5′ tail may increase the melting temperature of the forward primer.
- the sequence of the 5′ tail may comprise a sequence that is non-identical to the target nucleic acid.
- the sequence of the 5′ tail may also be synthetic.
- the 5′ tail may comprise 8, 9, 10, 11, 12, 13, 14, 15, or 16 nucleotides. Examples of forward primers used in the invention are provided in Table 8.
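The forward-primer layout described above (a 5′ tail of 8 to 16 nucleotides followed by 15 to 21 nucleotides identical to the target) can be sketched as follows. This is only an illustration: the tail sequence, the example microRNA sequence, the choice to take the identical stretch from the 5′ end of the target, and the function names are all assumptions, not sequences or methods taken from Table 8.

```python
# Hypothetical 12-nt synthetic 5' tail; NOT a sequence from the source.
HYPOTHETICAL_TAIL = "CAGTCATTTGGC"


def rna_to_dna(seq: str) -> str:
    """Return the DNA equivalent of an RNA sequence (U -> T)."""
    return seq.upper().replace("U", "T")


def forward_primer(mirna: str, match_len: int = 18,
                   tail: str = HYPOTHETICAL_TAIL) -> str:
    """Build tail + first `match_len` nucleotides of the target (assumed layout)."""
    if not 15 <= match_len <= 21:
        raise ValueError("identical stretch must be 15-21 nucleotides")
    if not 8 <= len(tail) <= 16:
        raise ValueError("5' tail must be 8-16 nucleotides")
    target = rna_to_dna(mirna)
    if match_len > len(target):
        raise ValueError("target is shorter than the requested stretch")
    return tail + target[:match_len]


# Hypothetical 22-nt microRNA used purely as an example input.
example_mirna = "ACGUACGUUAGCCGAUAGGCUA"
example_primer = forward_primer(example_mirna)
```

Because the 3′ end of the primer is the part described as sensitive to sequence differences, a real design would likely tune `match_len` per target; the fixed default here is only for demonstration.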
- the PCR reaction comprises a reverse primer.
- the reverse primer may be complementary to a target nucleic acid.
- the reverse primer may also comprise a sequence complementary to an adaptor sequence. Examples of reverse primers used in the invention are provided in Example 8.
- the probes used to detect products of RT-PCR amplification may be general probes or sequence-specific probes.
- General probes are designed to detect (or hybridize with) RT-PCR amplification products in a non-sequence specific manner. Said probes are between 16 and 20 nucleotides long, preferably 18 nucleotides long, and comprise a sequence which is the reverse complement of the RT primer, including 4 adenines (As) at the 5′ end.
- Sequence-specific probes are designed to detect (or hybridize with) RT-PCR amplification products based on total or partial complementarity between the sequence of the probe and the sequence of the RT-PCR product.
- Said probes are between 20 and 28 nucleotides long, preferably 24 nucleotides long, and comprise at the 5′ end three nucleotides, of which at least two are complementary to the RT primer, followed by between 10 and 14, preferably 12, thymines (Ts), followed by between 6 and 10, preferably 8, contiguous nucleotides which correspond to the reverse complementary sequence of the specific corresponding microRNA.
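The sequence-specific probe layout can be illustrated with a short sketch: three leading nucleotides, a run of thymines, then a stretch taken from the reverse complement of the microRNA. The leading triplet, the example microRNA, and the decision to use the first nucleotides of the reverse complement (that is, the region matching the 3′ end of the microRNA) are assumptions made for illustration. Note that the preferred component lengths (3 + 12 + 8) give 23 nucleotides, which falls inside the stated 20 to 28 range.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Reverse complement of a DNA sequence."""
    return dna.translate(_COMPLEMENT)[::-1]


def sequence_specific_probe(mirna: str, lead: str = "CCG",
                            n_thymines: int = 12, n_specific: int = 8) -> str:
    """Assemble lead + poly(T) + microRNA-specific part (assumed layout).

    `lead` is a hypothetical triplet standing in for the nucleotides that
    are complementary to the RT primer.
    """
    if len(lead) != 3:
        raise ValueError("lead must be three nucleotides")
    if not 10 <= n_thymines <= 14:
        raise ValueError("thymine run must be 10-14 nucleotides")
    if not 6 <= n_specific <= 10:
        raise ValueError("specific part must be 6-10 nucleotides")
    dna = mirna.upper().replace("U", "T")
    probe = lead + "T" * n_thymines + reverse_complement(dna)[:n_specific]
    if not 20 <= len(probe) <= 28:
        raise ValueError("probe length outside the 20-28 nucleotide range")
    return probe


# Hypothetical microRNA sequence, not one of the SEQ ID NOs.
example_probe = sequence_specific_probe("ACGUACGUUAGCCGAUAGGCUA")
```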
- a biochip comprising novel nucleic acids described herein is provided.
- the biochip may comprise probes that recognize the novel nucleic acids described herein.
- Said nucleic acids are isolated nucleic acids comprising at least 12 contiguous nucleotides at least 80% identical to the sequence of any one of SEQ ID NOs. 27-29, 33, 34, 139, 140, 307 and 308.
- said isolated nucleic acid comprises at least 13, at least 14, at least 15, at least 16, at least 17, at least 18, at least 19, or at least 20 contiguous nucleotides identical to the sequence of any one of SEQ ID NOs. 27-29, 33, 34, 139, 140, 307 and 308.
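One possible reading of the "at least 12 contiguous nucleotides at least 80% identical" criterion is an ungapped comparison of every 12-nucleotide window of a candidate against every 12-nucleotide window of a reference sequence. The sketch below implements that reading only; the source does not specify an alignment method, and the function name and example sequences are hypothetical.

```python
def best_window_identity(candidate: str, reference: str, window: int = 12) -> float:
    """Highest ungapped identity between any `window`-nt stretch of
    `candidate` and any `window`-nt stretch of `reference`."""
    best = 0.0
    for i in range(len(candidate) - window + 1):
        cand = candidate[i:i + window]
        for j in range(len(reference) - window + 1):
            ref = reference[j:j + window]
            matches = sum(a == b for a, b in zip(cand, ref))
            best = max(best, matches / window)
    return best


def meets_identity_criterion(candidate: str, reference: str,
                             window: int = 12, threshold: float = 0.80) -> bool:
    """True if some window reaches the identity threshold (assumed reading)."""
    return best_window_identity(candidate, reference, window) >= threshold


# Hypothetical sequences for demonstration only.
reference_seq = "ACGTACGTTAGCCGATAGGCTA"
candidate_seq = "GGGACGAACGTTAGG"  # contains a 12-mer with two mismatches
```

A gapped alignment (for example a local alignment) could give different results for sequences with insertions or deletions, so this brute-force scan should be taken as a simplification.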
- the biochip may comprise a solid substrate comprising an attached nucleic acid, probe or plurality of probes described herein.
- the probes may be capable of hybridizing to a target sequence under stringent hybridization conditions.
- the probes may be attached at spatially defined addresses on the substrate. More than one probe per target sequence may be used, with either overlapping probes or probes to different sections of a particular target sequence.
- the probes may be capable of hybridizing to target sequences associated with a single disorder appreciated by those in the art.
- the probes may either be synthesized first, with subsequent attachment to the biochip, or may be directly synthesized on the biochip.
- the solid substrate may be a material that may be modified to contain discrete individual sites appropriate for the attachment or association of the probes and is amenable to at least one detection method.
- substrates include glass and modified or functionalized glass, plastics (including acrylics, polystyrene and copolymers of styrene and other materials, polypropylene, polyethylene, polybutylene, polyurethanes, Teflon, etc.), polysaccharides, nylon or nitrocellulose, resins, silica or silica-based materials including silicon and modified silicon, carbon, metals, inorganic glasses and plastics.
- the substrates may allow optical detection without appreciably fluorescing.
- the substrate may be planar, although other configurations of substrates may be used as well. For example, probes may be placed on the inside surface of a tube, for flow-through sample analysis to minimize sample volume.
- the substrate may be flexible, such as flexible foam, including closed cell foams made of particular plastics.
- the biochip and the probe may be derivatized with chemical functional groups for subsequent attachment of the two.
- the biochip may be derivatized with a chemical functional group including, but not limited to, amino groups, carboxyl groups, oxo groups or thiol groups.
- the probes may be attached using functional groups on the probes either directly or indirectly using a linker.
- the probes may be attached to the solid support by either the 5′ terminus, 3′ terminus, or via an internal nucleotide.
- the probe may also be attached to the solid support non-covalently.
- biotinylated oligonucleotides can be made, which may bind to surfaces covalently coated with streptavidin, resulting in attachment.
- probes may be synthesized on the surface using techniques such as photopolymerization and photolithography.
- measuring the microRNAs for classification of thyroid lesions may be effected by high throughput sequencing.
- High throughput sequencing can involve sequencing-by-synthesis, sequencing-by-ligation, and ultra-deep sequencing. Sequence-by-synthesis can be initiated using sequencing primers complementary to the sequencing element on the nucleic acid tags.
- the method involves detecting the identity of each nucleotide immediately after (substantially real-time) or upon (real-time) the incorporation of a labeled nucleotide or nucleotide analog into a growing strand of a complementary nucleic acid sequence in a polymerase reaction. After the successful incorporation of a label nucleotide, a signal is measured and then nulled by methods known in the art.
- sequence-by-synthesis methods are known in the art, and are described for example in U.S. Pat. Nos. 7,056,676, 8,802,368 and 7,169,560, the contents of which are incorporated herein by reference.
- labels that can be used to label nucleotide or nucleotide analogs for sequencing-by-synthesis include, but are not limited to, chromophores, fluorescent moieties, enzymes, antigens, heavy metal, magnetic probes, dyes, phosphorescent groups, radioactive materials, chemiluminescent moieties, scattering or fluorescent nanoparticles, Raman signal generating moieties, and electrochemical detection moieties.
- Sequencing-by-synthesis can generate at least 1,000, at least 5,000, at least 10,000, at least 20,000, at least 30,000, at least 40,000, at least 50,000, at least 100,000 or at least 500,000 reads per hour.
- Such reads can have at least 40, at least 45, at least 50, at least 60, at least 70, at least 80, at least 90, at least 100, at least 120 or at least 150 bases per read.
- Sequencing-by-synthesis may be performed on a solid surface (or a chip) using fold-back PCR and anchored primers. Since microRNAs occur as small nucleic acid fragments, adaptors are added to the 5′ and 3′ ends of the fragments. Nucleic acid fragments that are attached to the surface of flow cell channels are extended and bridge amplified. The fragments become double stranded, and the double stranded molecules are denatured. Multiple cycles of the solid-phase amplification followed by denaturation can create several million clusters of approximately 1,000 copies of single-stranded nucleic acid molecules of the same template in each channel of the flow cell.
- Primers, polymerase and four fluorophore-labeled, reversibly terminating nucleotides are used to perform sequential sequencing. After nucleotide incorporation, a laser is used to excite the fluorophores, an image is captured and the identity of the first base is recorded. The 3′ terminators and fluorophores from each incorporated base are removed and the incorporation, detection and identification steps are repeated. This technology is used, for example, in the Illumina® sequencing platform.
- Another sequencing method involves hybridizing the amplified regions to a primer complementary to the sequence element in an LST (a file listing the names of fasta files).
- This hybridization complex is incubated with a polymerase, ATP sulfurylase, luciferase, apyrase, and the substrates luciferin and adenosine 5′ phosphosulfate.
- deoxynucleotide triphosphates corresponding to the bases A, C, G, and T (U) are added sequentially.
- Each base incorporation is accompanied by release of pyrophosphate, converted to ATP by sulfurylase, which drives synthesis of oxyluciferin and the release of visible light.
- Since pyrophosphate release is equimolar with the number of incorporated bases, the light given off is proportional to the number of nucleotides added in any one step.
- the process is repeated until the entire sequence is determined.
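The proportionality between light signal and the number of nucleotides incorporated per step can be illustrated with an idealized flowgram. In this sketch each nucleotide flow yields a signal equal to the length of the homopolymer run it extends; the A, C, G, T flow order, the noise-free integer signals, and the function names are assumptions for illustration, and real instruments report noisy intensities.

```python
def flowgram(synthesized: str, flow_order: str = "ACGT",
             max_flows: int = 400) -> list[tuple[str, int]]:
    """Idealized signal per nucleotide flow for the strand being synthesized.

    Each entry is (dispensed base, number of bases incorporated), which
    stands in for the light intensity of that flow.
    """
    signals = []
    pos = 0
    flow = 0
    while pos < len(synthesized) and flow < max_flows:
        base = flow_order[flow % len(flow_order)]
        run = 0
        # Incorporate as many consecutive copies of the dispensed base as fit.
        while pos < len(synthesized) and synthesized[pos] == base:
            run += 1
            pos += 1
        signals.append((base, run))
        flow += 1
    return signals


def decode(signals: list[tuple[str, int]]) -> str:
    """Rebuild the sequence from idealized flow signals."""
    return "".join(base * count for base, count in signals)


example_signals = flowgram("AAGTC")
```

A flow with signal 0 simply means the dispensed base was not the next one in the template, which is why the sequence can be read back by concatenating the non-zero runs.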
- Yet another sequencing method involves a four-color sequencing by ligation scheme (degenerate ligation), which involves hybridizing an anchor primer to one of four positions. Then an enzymatic ligation reaction of the anchor primer to a population of degenerate nonamers that are labeled with fluorescent dyes is performed. At any given cycle, the population of nonamers that is used is structured such that the identity of one of its positions is correlated with the identity of the fluorophore attached to that nonamer.
- the fluorescent signal allows the inference of the identity of the base.
- the anchor primer:nonamer complexes are stripped and a new cycle begins.
- Methods to image sequence information after performing ligation are known in the art. In some cases, high throughput sequencing involves the use of ultra-deep sequencing, such as described in Margulies et al., Nature 437 (7057): 376-80 (2005).
- MicroRNA sequencing (miRNA-seq) is a type of RNA Sequencing (RNA-Seq) which uses next-generation sequencing or massively parallel high-throughput DNA sequencing to sequence microRNAs. miRNA-seq differs from other forms of RNA-Seq in that input material is often enriched for small RNAs. miRNA-seq provides tissue-specific expression patterns, which may reveal disease associations and microRNA isoforms. miRNA-seq is also used for the discovery of previously uncharacterized microRNAs, such as the nucleic acid sequences denoted by SEQ ID NOs 139-140 and 307-308.
- diagnosis refers to classifying pathology, or a symptom, determining a severity of the pathology (grade or stage), monitoring pathology progression, forecasting an outcome of pathology and/or prospects of recovery.
- the phrase “subject in need thereof” refers to a human subject who is known to have cancer, who is at risk of having cancer (e.g., a genetically predisposed subject, a subject with medical and/or family history of cancer, a subject who has been exposed to carcinogens, occupational hazard, environmental hazard) and/or who exhibits suspicious clinical signs of cancer (e.g., nodules in the thyroid). Additionally or alternatively, the subject in need thereof can be a healthy human subject undergoing a routine well-being check-up.
- Analyzing presence of malignant or pre-malignant cells can be effected in vivo or ex vivo, whereby a biological sample (e.g., biopsy) is retrieved.
- Such biopsy samples comprise cells and may be an incisional or excisional biopsy.
- the sample may be retrieved from the thyroid of the subject, and may be retrieved using FNA. Alternatively the cells may be retrieved from a complete resection.
- treatment regimen refers to a treatment plan that specifies the type of treatment, dosage, schedule and/or duration of a treatment provided to a subject in need thereof (e.g., a subject diagnosed with a pathology).
- the selected treatment regimen can be an aggressive one which is expected to result in the best clinical outcome (e.g., complete cure of the pathology) or a more moderate one which may relieve symptoms of the pathology yet results in incomplete cure of the pathology. It will be appreciated that in certain cases the treatment regimen may be associated with some discomfort to the subject or adverse side effects (e.g., damage to healthy cells or tissue).
- the type of treatment can include a surgical intervention (e.g., removal of lesion, diseased cells, tissue, or organ), a cell replacement therapy, an administration of a therapeutic drug (e.g., receptor agonists, antagonists, hormones, chemotherapy agents) in a local or a systemic mode, an exposure to radiation therapy using an external source (e.g., external beam) and/or an internal source (e.g., brachytherapy) and/or any combination thereof.
- the dosage, schedule and duration of treatment can vary, depending on the severity of pathology and the selected type of treatment, and those
- a method of diagnosis comprises detecting an expression level of a specific cancer-associated nucleic acid in a biological sample. Diagnosis of a specific cancer state in a patient may allow for prognosis and selection of therapeutic strategy. Further, the developmental stage of cells may be classified by determining temporally expressed specific cancer-associated nucleic acids.
- In situ hybridization of labeled probes to tissue sections or FNA smears may be performed.
- the skilled artisan can make a diagnosis, a prognosis, or a prediction based on the findings. It is further understood that the nucleic acid sequences which indicate the diagnosis may differ from those which indicate the prognosis, and that molecular profiling of the condition of the cells may lead to distinctions between responsive or refractory conditions or may be predictive of outcomes.
- kits may comprise a nucleic acid described herein together with any or all of the following: assay reagents, buffers, probes and/or primers, and sterile saline or another pharmaceutically acceptable emulsion and suspension base.
- the kits may include instructional materials containing directions (e.g., protocols) for the practice of the methods described herein.
- the kit may further comprise a software package for data analysis of expression profiles.
- the kit may be a kit for the amplification, detection, identification or quantification of a target nucleic acid sequence.
- the kit may comprise a poly (T) primer, a forward primer, a reverse primer, and a probe.
- a kit for thyroid tumor classification said kit comprising: (a) probes for performing thyroid tumor classification, wherein said probes comprise any one of (i) DNA equivalents of microRNAs comprising at least one of SEQ ID NOs 1-37, (ii) the complements thereof, (iii) sequences at least 80% identical to (i) or (ii), (iv) a nucleic acid sequence that hybridizes with at least eight contiguous nucleotides of any one of SEQ ID NOs 1-37, or (v) a nucleic acid sequence that hybridizes with RT-PCR products; and optionally (b) an instruction manual for using said probes.
- kits may comprise, in suitable container means, an enzyme for labeling the microRNA by incorporating labeled nucleotide or unlabeled nucleotides that are subsequently labeled. It may also include one or more buffers, such as reaction buffer, labeling buffer, washing buffer, or a hybridization buffer, compounds for preparing the microRNA probes, components for in situ hybridization and components for isolating microRNA.
- Other kits of the invention may include components for making a nucleic acid array comprising microRNA, and thus, may include, for example, a solid support.
- RNA expression assays e.g., microarray analysis, RT-PCR, high throughput sequencing (next generation sequencing), cloning, and quantitative real time polymerase chain reaction (qRT-PCR).
- Proteins were degraded by proteinase K solution (5-12 µl Proteinase K (e.g., Sigma or ABI) in 500 µl of Buffer B (10 mM NaCl, 500 mM Tris pH 7.5, 20 mM EDTA pH 8, 1% SDS)) at 45° C. for a few hours (about 16 hours). Proteinase K was inactivated by incubation at 95° C. for 7 minutes. After the tubes were chilled, 10 µl of RNA synthetic spikes were added (e.g., 2 spikes of 0.15 fmol/µl). RNA was extracted using an equal volume of acid phenol/chloroform, vortexing, followed by centrifugation at 4° C. for 15 minutes at 12000 g.
- RNA was then precipitated using 8 µl linear acrylamide, 0.1 volumes of 3M NaOAc pH 5.2, and 3 volumes of absolute (100%) ethanol, for 30 minutes to 16 hours, followed by centrifugation at 4° C. for at least 40 minutes at 20000 g (14,000 rpm). The pellet was washed by adding 1 ml of cold 85% ethanol. DNase was introduced at 37° C. for 60 minutes to digest DNA (e.g., 10 µl Turbo™ DNase), followed by extraction using acid phenol/chloroform and ethanol precipitation as described above.
- DDW double-distilled water
- RNA synthetic spikes (e.g., 2 spikes of 0.15 fmol/µl) were added. RNA was extracted using an equal volume of acid phenol/chloroform, vortexing, and spinning down at 4° C.
- RNA quantification was performed by fluorospectrometry in a NanoDrop 3300 (ND3300) fluorospectrometer using the RiboGreen® dye (Thermo Fisher Scientific®, Wilmington, Del.).
- the ND3300 RNA detection range is of 25 ng/ml-1000 ng/ml when using a high concentration of RiboGreen® dye (1:200 dilution), and 5 ng/ml-50 ng/ml when using a 1:2000 dilution of RiboGreen® dye.
- the RNA amounts which were determined by ND3300 were highly correlated to the detected expressed microRNA.
- Custom microarrays were generated by printing DNA oligonucleotide probes to: 2172 miRs sequences, 17 negative controls, 23 spikes, and 10 positive controls (total of 2222 probes).
- Negative spikes and positive probes were printed from 3 to 200 times. Seventeen (17) negative control probes were designed using sequences that do not match the genome.
- Two groups of positive control probes were designed to hybridize to the microRNA array: (i) synthetic small RNAs were spiked to the RNA before labeling to verify the labeling efficiency; and (ii) probes for abundant small RNA, e.g., small nuclear RNAs (U43, U24, Z30, U6, U48, U44), 5.8s and 5s ribosomal RNA were spotted on the array to verify RNA quality.
- RNA (20-1000 ng) was labeled by ligation (Thomson et al. Nature Methods 2004; 1:47-53) with an RNA linker, p-rCrU-Cy/dye or several sequential Cys (BioSpring GmbH, IBA GmbH or equivalent), to the 3′ end with Cy3 or Cy5.
- the labeling reaction contained total RNA, spikes (0.1-100 fmoles), 250-400 ng RNA-linker-dye, 15% DMSO, 1× ligase buffer and 20 units of T4 RNA ligase (NEB or equivalent), and proceeded at 4° C. for 1 hour, followed by 1 hour at 37° C., followed by 4° C. up to 40 minutes.
- the labeled RNA was mixed with 30 µl hybridization mixture (mixture of 45 µL of the 10× GE Agilent Blocking Agent and 246 µL of 2× Hi-RPM Hybridization).
- the labeling mixture was incubated at 100° C. for 5 minutes followed by ice incubation in water bath for 5 minutes. Slides were hybridized at 54-55° C. for 16-20 hours, followed by two washes.
- the first wash was conducted at room temperature with Agilent GE Wash Buffer 1 (e.g., 6× SSPE+0.005% N-Lauroylsarcosine+0.005% Triton X-102) for 5 minutes, followed by a second wash with Agilent GE Wash Buffer 2 at 37° C. for 5 minutes (e.g., 0.06× SSPE+0.005% N-Lauroylsarcosine+0.005% Triton X-102).
- Arrays were scanned using a microarray scanner (Agilent Microarray Scanner Bundle G2565BA, resolution of 5 µm at XDR Hi 100%, XDR Lo 10%). Array images were analyzed using appropriate software (Feature Extraction 10.7 software, Agilent).
- poly (A) polymerase Poly (A) Polymerase NEB-M0276L
- ATP an oligodT primer harboring a consensus sequence
- reverse transcriptase SuperScript® II RT, Invitrogen, Carlsbad, Calif.
- the amplification reaction included a microRNA-specific forward primer, a TaqMan® (MGB) probe complementary to the 3′ end of the specific microRNA sequence and/or to part of the polyA adaptor sequence, and a universal reverse primer complementary to the consensus 3′ sequence of the oligodT tail.
- RT-PCR methodology may be found in publication WO 2008/029295, the contents of which are incorporated herein by reference.
- the cycle threshold (CT, the PCR cycle at which probe signal reaches the threshold) was determined for each microRNA.
- each value obtained by RT-PCR was subtracted from 50 (50-CT).
- the 50-CT expression value for each microRNA for each patient was compared with the signal obtained by the microarray method.
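The 50-CT transformation turns cycle thresholds into values that rise with abundance, which makes them directly comparable in direction to array signals. The sketch below shows the transformation together with a Pearson correlation as one possible way of comparing the two platforms; the source does not state which comparison statistic was used, and all numeric values and names here are hypothetical.

```python
from math import sqrt


def inverted_ct(ct: float, offset: float = 50.0) -> float:
    """Convert a cycle threshold to the 50-CT scale (higher = more abundant)."""
    return offset - ct


def pearson(x: list[float], y: list[float]) -> float:
    """Pearson correlation coefficient of two equal-length lists."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Hypothetical CT values and log-scale array signals for four microRNAs.
ct_values = [22.0, 25.5, 30.0, 34.0]
array_log_signal = [14.0, 12.1, 9.8, 7.9]

pcr_expression = [inverted_ct(ct) for ct in ct_values]
agreement = pearson(pcr_expression, array_log_signal)
```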
- the initial data set consisted of signals measured for multiple probes for every sample. For the analysis, signals were used only for probes that were designed to measure the expression levels of known or validated human microRNAs.
- Triplicate spots were combined into one signal by taking the logarithmic mean of the reliable spots. All data was log-transformed and the analysis was performed in log-space. A reference data vector for normalization, R, was calculated by taking the mean expression level for each probe in two representative samples, one from each tumor type.
- For each sample k with data vector S_k, a 2nd degree polynomial F_k was found so as to provide the best fit between the sample data and the reference data, such that R ≈ F_k(S_k).
- Remote data points (“outliers”) were not used for fitting the polynomials F_k.
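The polynomial normalization step can be sketched as an ordinary least-squares fit of a quadratic that maps one sample's log-space signals onto the reference vector R. This is a minimal illustration under stated assumptions: the fit is solved through the normal equations in plain Python, no outlier exclusion is implemented because the source does not give the outlier criterion, and the data and function names are hypothetical.

```python
def fit_quadratic(sample: list[float], reference: list[float]) -> list[float]:
    """Least-squares coefficients [c0, c1, c2] with
    reference ≈ c0 + c1*s + c2*s**2 for each sample value s."""
    power_sums = [sum(s ** p for s in sample) for p in range(5)]
    rhs = [sum(r * s ** p for s, r in zip(sample, reference)) for p in range(3)]
    # Augmented 3x3 normal-equation matrix.
    a = [[power_sums[i + j] for j in range(3)] + [rhs[i]] for i in range(3)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda row: abs(a[row][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for row in range(3):
            if row != col:
                factor = a[row][col] / a[col][col]
                a[row] = [rv - factor * cv for rv, cv in zip(a[row], a[col])]
    return [a[i][3] / a[i][i] for i in range(3)]


def normalize(sample: list[float], coeffs: list[float]) -> list[float]:
    """Apply the fitted polynomial to every probe signal of the sample."""
    c0, c1, c2 = coeffs
    return [c0 + c1 * s + c2 * s * s for s in sample]


# Hypothetical log-space signals; the reference is an exact quadratic of the sample.
sample_k = [6.0, 7.0, 8.0, 9.0, 10.0, 11.0, 12.0]
reference_r = [0.5 + 0.9 * s + 0.01 * s * s for s in sample_k]
normalized_k = normalize(sample_k, fit_quadratic(sample_k, reference_r))
```

In practice the fit would be restricted to non-outlier probes and then applied to all probes of the sample.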
- Statistical analysis is performed in log-space. For presentation and calculation of fold-change, data is translated back to linear-space by taking the exponent.
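Translating a log-space difference back to a linear fold-change can be written in a few lines. The sketch assumes base-2 logarithms, which the source does not state, and uses hypothetical values; with log-space means, the result corresponds to a ratio of geometric means in linear space.

```python
from statistics import mean


def fold_change(log_group_a: list[float], log_group_b: list[float],
                base: float = 2.0) -> float:
    """Linear-space fold-change of group A over group B, computed from
    log-space signals by exponentiating the difference of means."""
    return base ** (mean(log_group_a) - mean(log_group_b))


# Hypothetical log2 signals for one microRNA in two groups.
malignant_log2 = [10.0, 11.0, 12.0]
benign_log2 = [9.0, 10.0, 11.0]
example_fold_change = fold_change(malignant_log2, benign_log2)
```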
- Sequence library construction may be performed using a variety of different kits depending on the high-throughput sequencing platform being employed. However, there are several common steps for small RNA sequencing preparation.
- the ligation step adds DNA adaptors to both ends of the small RNAs, which act as primer binding sites during reverse transcription and PCR amplification.
- An adenylated single strand DNA 3′ adaptor followed by a 5′ adaptor is ligated to the small RNAs using a ligating enzyme such as T4 RNA ligase, or the 5′ adaptor is added using a 5′ RACE reaction.
- the adaptors are also designed to capture small RNAs with a 5′ phosphate group, characteristic of microRNAs, rather than RNA degradation products with a 5′ hydroxyl group.
- Reverse transcription and PCR amplification steps convert the small adaptor ligated RNAs into cDNA clones used in the sequencing reaction. PCR is then carried out to amplify the pool of cDNA sequences. Primers designed with unique nucleotide tags may also be used in this step to create ID tags in pooled library multiplex sequencing.
- For RNA deep sequencing, 500 ng of RNA from each FFPE sample was used for small RNA deep sequencing (miRSeq). Libraries were loaded on two lanes of the sequence analyzer (Illumina® HiSeq™ 2000). An average of about 6.3 million reads per library was obtained. To find novel microRNAs, sequence analysis software (miRDeep2, Friedlander M R et al. Nucleic Acids Res. 2012 January; 40(1):37-52) was applied on the raw sequencing data (primer-adapter sequences were trimmed).
- P-values were calculated using a two-sided (unpaired) Student's t-test on the log-transformed normalized fluorescence signal.
- the threshold for significant differences was determined by setting a false discovery rate (FDR) of 0.05 to 0.1, to correct for effects of multiple hypothesis testing, resulting in p-value cutoffs in the range of 0.01-0.06.
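The way an FDR level translates into a p-value cutoff can be illustrated with the Benjamini-Hochberg step-up procedure. The source states the FDR levels and resulting cutoffs but does not name the procedure, so Benjamini-Hochberg is an assumption here, and the p-values and function name are hypothetical.

```python
from typing import Optional


def benjamini_hochberg_cutoff(p_values: list[float],
                              fdr: float = 0.1) -> Optional[float]:
    """Largest p-value p(k) satisfying p(k) <= fdr * k / m, where k is the
    rank of the p-value and m the number of tests; None if no test passes.
    Every p-value at or below the returned cutoff is called significant."""
    m = len(p_values)
    cutoff = None
    for rank, p in enumerate(sorted(p_values), start=1):
        if p <= fdr * rank / m:
            cutoff = p
    return cutoff


# Hypothetical p-values from ten t-tests.
example_p_values = [0.001, 0.008, 0.039, 0.041, 0.042,
                    0.06, 0.074, 0.205, 0.212, 0.216]
example_cutoff = benjamini_hochberg_cutoff(example_p_values, fdr=0.05)
```

Because the cutoff depends on the whole distribution of p-values, different comparisons at the same FDR level can end up with different p-value cutoffs, which is consistent with a range of cutoffs being reported.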
- AUC area under curve
- Three sets of miRs were excluded from the statistical analysis: (a) miRs that were previously found to be highly expressed in blood samples (due to high percentages of blood in FNA samples), (b) miRs whose level of expression did not correlate with decreasing amounts of RNA, i.e., miRs that did not show a linear decrease in signal with decreasing measured RNA amounts, and (c) miRs whose level of expression correlated with miRs in set (b).
- microRNA profiling was conducted in a few Papanicolaou, Giemsa and Diff-Quick stained smears from ex-vivo FNA biopsy samples in order to ensure feasibility of the methodology. Since FNA smears often have very few cells, providing a minuscule amount of RNA for analysis, e.g. 1-1000 ng, it was first necessary to evaluate whether microRNA would be detectable at such low RNA amounts. Thus, expression levels of about 2200 individual microRNAs were measured in Giemsa-stained papillary carcinoma and non-papillary carcinoma smears.
- Five microRNAs (hsa-miR-146b-5p, hsa-miR-31-5p, hsa-miR-222-3p, hsa-miR-221-3p, and hsa-miR-21-5p) previously shown to correlate with papillary carcinoma were found to be over-represented (upregulated) in the papillary carcinoma smears.
- the cohort of samples used in the experimental analysis is shown in Table 7 as cohort II.
- the histological diagnosis assessed ultimately the malignancy or benignity of the thyroid lesions.
- the cytological classification was based on “The Bethesda System for Reporting Thyroid Cytopathology” (Syed, Z. Ali and Edmund S. Cibas, eds.; DOI 10.1007/978-0-387-87666-5_1; Springer Science+Business Media, LLC 2010).
- the study protocol was approved by the Institutional Review Board (IRB, equivalent to Ethical Review Board) of the contributing institution.
- Tumor classification was based on the World Health Organization (WHO) guidelines.
- An additional cohort consisted of 13 thyroid ex-vivo FNA smears, prepared after thyroidectomy, and obtained from the University Milano-Bicocca (Milan, Italy).
- Total RNA (at least 10 ng) was extracted from these samples, and microRNA expression was profiled using custom microarrays containing about 2200 miRs. The results exhibited a significant difference in the expression pattern between benign and malignant lesions of several miRs listed in Table 3 (upregulated or downregulated in malignant versus benign).
- a classification algorithm for differentiating between malignant and benign thyroid tumor was developed based on miRNA expression in 35 benign and 38 malignant FNA samples.
- a logistic regression classifier was trained to distinguish between malignant and benign thyroid lesions, based on eight miRs (hsa-miR-125b-5p, hsa-miR-21-5p, hsa-miR-222-3p, hsa-miR-221-3p, hsa-miR-146b-5p, hsa-miR-181a-5p, hsa-miR-138-5p, and MID-23794) that were found to be differentially expressed in these conditions, either between benign or malignant or between specific thyroid tumor subtypes (data not shown).
- the classifier reached 89% accuracy with sensitivity of 87% and specificity of 91% for identifying malignant samples.
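Accuracy, sensitivity and specificity follow directly from confusion-matrix counts. The source reports only the percentages, so the counts below are an assumption: one set of counts that, for 38 malignant and 35 benign samples, reproduces the reported figures after rounding.

```python
def classifier_metrics(tp: int, fn: int, tn: int, fp: int) -> tuple[float, float, float]:
    """Return (sensitivity, specificity, accuracy), with malignant as the
    positive class."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy


# Assumed counts (not given in the source): 33 of 38 malignant and
# 32 of 35 benign samples classified correctly.
sens, spec, acc = classifier_metrics(tp=33, fn=5, tn=32, fp=3)
```

With these assumed counts the three values round to 87%, 91% and 89%, matching the reported performance; other count combinations could not be ruled out from the percentages alone.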
- hsa-miR-125b-5p, hsa-miR-21-5p, hsa-miR-222-3p, hsa-miR-221-3p, hsa-miR-146b-5p and hsa-miR-181a-5p exhibited higher expression in malignant lesions, while hsa-miR-138-5p and MID-23794 exhibited higher expression in benign lesions (data not shown).
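A logistic regression classifier of the kind described can be sketched with plain gradient descent. This is a toy illustration, not the trained classifier: it uses two hypothetical features (one higher in malignant samples, one higher in benign samples) instead of the eight miRs, and the data, learning rate, epoch count and function names are all assumptions.

```python
from math import exp


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + exp(-z))


def train_logistic(features: list[list[float]], labels: list[int],
                   learning_rate: float = 0.1, epochs: int = 2000):
    """Fit weights and bias by batch gradient descent on the log-loss."""
    n_features = len(features[0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(features, labels):
            error = sigmoid(bias + sum(w * v for w, v in zip(weights, x))) - y
            grad_b += error
            for j, v in enumerate(x):
                grad_w[j] += error * v
        bias -= learning_rate * grad_b / len(features)
        weights = [w - learning_rate * g / len(features)
                   for w, g in zip(weights, grad_w)]
    return weights, bias


def predict_malignant(weights: list[float], bias: float, x: list[float],
                      threshold: float = 0.5) -> bool:
    """Call a sample malignant when the predicted probability reaches the threshold."""
    return sigmoid(bias + sum(w * v for w, v in zip(weights, x))) >= threshold


# Hypothetical centered log-expression values: [miR up in malignant, miR up in benign].
toy_features = [[1.0, -0.8], [1.5, -1.2], [0.8, -0.5],
                [-1.0, 0.9], [-1.3, 1.1], [-0.7, 0.6]]
toy_labels = [1, 1, 1, 0, 0, 0]  # 1 = malignant, 0 = benign
toy_weights, toy_bias = train_logistic(toy_features, toy_labels)
```

The sign of each fitted weight mirrors the direction of differential expression: a positive weight for a microRNA that is higher in malignant lesions and a negative weight for one that is higher in benign lesions.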
- Example 3 Distinguishing Different Sub-Types of Malignant and Benign Thyroid Lesions
- microRNAs that were upregulated or downregulated in follicular adenoma relative to follicular carcinoma are presented in Table 4.
- microRNAs that were upregulated or downregulated in follicular variant of papillary carcinoma relative to follicular adenoma are presented in Table 5.
- Expression levels of miRs were compared in 8 Hashimoto thyroiditis samples and 9 (non-follicular) papillary carcinoma samples.
- microRNAs that were upregulated or downregulated in papillary carcinoma relative to Hashimoto thyroiditis are presented in Table 6.
- the miRs that are the best candidates for the profile signature for comparing these two thyroid lesions are hsa-miR-146b-5p, hsa-miR-200a-3p and MID-23794.
- Formalin-Fixed Paraffin-Embedded (FFPE) thyroid resection samples (obtained from surgical biopsies, fixed in formalin and preserved in paraffin) from follicular lesions were obtained from the Department of Pathology at Rabin Medical Center. The specimens included 6 follicular adenomas and 5 follicular carcinomas. Tumor cellular content was higher than 50% in all the samples.
- FIG. 1A shows the secondary structures of the two novel microRNAs, predicted by sequence analysis software.
- FIG. 1B shows the expression of the two novel microRNAs (normalized number of reads) in each of the 11 samples. The color-coded bar on the right represents a scale for expression.
- Example 5 Specific microRNAs are Differentially Expressed Between Benign and Malignant Thyroid Lesions
- RNA including the microRNA fraction
- FFPE and cytological (FNA) samples were profiled by custom printed microarrays measuring over 2000 microRNAs to identify differentially expressed microRNAs and to develop a classifier.
- Differential expression of microRNAs was found between benign and malignant neoplasms. Classification of malignant vs. benign smears based on two microRNAs, hsa-miR-146b-5p and hsa-miR-375, results in over 85% accuracy (based on the median of ten 10-fold cross-validation runs, data not shown).
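- The evaluation scheme named above (the median accuracy over ten runs of 10-fold cross-validation) can be sketched as follows. This is a minimal illustration under stated assumptions: the classifier is a nearest-centroid stand-in, since the text does not say which classifier was used in this example; the folds are not stratified; and the two-feature values are synthetic placeholders, not data from the study.

```python
import random
from statistics import median


def fit_centroids(X, y):
    """Stand-in classifier (an assumption): one mean vector per class."""
    centroids = {}
    for label in set(y):
        rows = [x for x, lab in zip(X, y) if lab == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return centroids


def predict(centroids, x):
    """Return the label of the closest centroid (squared Euclidean distance)."""
    def sq_dist(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))
    return min(centroids, key=lambda label: sq_dist(centroids[label]))


def kfold_accuracy(X, y, n_folds, rng):
    """Accuracy of one shuffled k-fold run; every sample is tested exactly once."""
    idx = list(range(len(X)))
    rng.shuffle(idx)
    folds = [idx[i::n_folds] for i in range(n_folds)]
    correct = 0
    for test_idx in folds:
        held_out = set(test_idx)
        train = [i for i in idx if i not in held_out]
        model = fit_centroids([X[i] for i in train], [y[i] for i in train])
        correct += sum(predict(model, X[i]) == y[i] for i in test_idx)
    return correct / len(X)


def median_repeated_cv_accuracy(X, y, n_runs=10, n_folds=10, seed=0):
    """Median accuracy over n_runs independent k-fold runs."""
    rng = random.Random(seed)
    return median(kfold_accuracy(X, y, n_folds, rng) for _ in range(n_runs))


# Synthetic, well-separated toy values for two features (placeholders only).
benign = [[10 + 0.1 * i, 12 - 0.1 * i] for i in range(20)]
malignant = [[16 + 0.1 * i, 15 + 0.1 * i] for i in range(20)]
X = benign + malignant
y = ["benign"] * 20 + ["malignant"] * 20
print(median_repeated_cv_accuracy(X, y))
```

- Taking the median over repeated runs makes the estimate less sensitive to any single random partition of a small cohort.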
- Hsa-miR-375 is a Significant Marker for Medullary Thyroid Carcinoma in FNA Samples
- hsa-miR-375 is a significant marker for medullary thyroid carcinoma.
- Example 7 Stained Thyroid Smears can be Used for microRNA Profiling
- MicroRNA expression level in samples stained with different dyes was compared in order to evaluate microRNA stability and reproducibility of the microRNA level detection upon staining (data not shown).
- a total of 143 smears from FNA cohort I were stained as follows: 60 with May-Grünwald Giemsa (MGG), 64 with DiffQuik and 19 with Papanicolaou.
- MicroRNA expression levels in duplicates of the same sample stained with different dyes showed significant correlation (more than expected).
- the normalized expression level of hsa-miR-146b-5p (SEQ ID NO: 10 or 11) is similar when the same sample is stained with different dyes. Therefore, different cytological dyes used in the clinical setting (Papanicolaou; May-Grünwald Giemsa; and DiffQuik) do not affect the detection and quantification of microRNA expression.
- A total of twenty-four (24) microRNAs overall were chosen for establishing the status of thyroid samples as malignant versus benign. MicroRNA expression was measured by RT-PCR as described above. The list of miRs and their respective forward primers are provided in Table 8. First-strand generation was done using the polyT adaptor presented below. Forward primers were sequence-specific while the reverse primer was universal.
- Detection of the RT-PCR products was done with the universal MGB probe for miRs hsa-miR-31-5p, hsa-miR-5701, hsa-miR-424-3p (SEQ ID NO: 16), MID-50971, MID-20094, MID-50976, hsa-miR-3074-5p, hsa-miR-222-3p, MID-50969, hsa-miR-146b-5p, hsa-miR-346, MID-16582, or with probes specific for the miRs as provided in Table 9.
- Reverse primer (SEQ ID NO: 309) GCGAGCACAGAATTAATACGAC; PolyT adaptor (SEQ ID NO: 310) GCGAGCACAGAATTAATACGACTCACTATCGGTTTTTTTTTTVN, where “V” may be any one of A, G or C; and “N” may be any one of G, C, A or U/T; Universal MGB probe (SEQ ID NO: 311) AAAACCGATAGTGAGTCG.
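- The three universal oligonucleotides listed above are related to one another, which a few lines of code can make explicit. The sketch below only inspects the sequences as printed: it expands the degenerate 3' end of the polyT adaptor ("V" and "N" as defined in the text, with T written for U/T), and checks how the reverse primer and the MGB probe map onto the adaptor. The helper names are illustrative and not part of the source.

```python
from itertools import product

REVERSE_PRIMER = "GCGAGCACAGAATTAATACGAC"                        # SEQ ID NO: 309
POLYT_ADAPTOR = "GCGAGCACAGAATTAATACGACTCACTATCGGTTTTTTTTTTVN"   # SEQ ID NO: 310
MGB_PROBE = "AAAACCGATAGTGAGTCG"                                 # SEQ ID NO: 311

# Degenerate codes exactly as defined in the text (T stands for U/T).
DEGENERATE = {"V": "AGC", "N": "GCAT"}


def reverse_complement(seq):
    """Reverse complement of a plain A/C/G/T sequence."""
    comp = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(comp[base] for base in reversed(seq))


def expand_degenerate(seq):
    """List every concrete sequence encoded by a degenerate sequence."""
    choices = [DEGENERATE.get(base, base) for base in seq]
    return ["".join(combo) for combo in product(*choices)]


adaptor_variants = expand_degenerate(POLYT_ADAPTOR)
print(len(adaptor_variants))                         # 3 choices for V x 4 for N
print(POLYT_ADAPTOR.startswith(REVERSE_PRIMER))      # primer equals the adaptor 5' end
print(reverse_complement(MGB_PROBE) in POLYT_ADAPTOR)  # probe targets the adaptor tag
```

- On the sequences as listed, the reverse primer is identical to the 5' end of the adaptor and the probe is the reverse complement of an internal adaptor segment, which is consistent with both being universal across miRs.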
- Marker microRNAs were selected based on their patterns of expression in several preliminary studies performed by the inventors (data not shown), and provided the reasoning for classifying the same as “malignant”, “cell type” or alternatively, to be used as normalizers.
- Cell type markers hsa-miR-486-5p, hsa-miR-342-3p, hsa-miR-138-5p, hsa-miR-200c-3p, and MID-16582 were chosen by the inventors according to their pattern of expression as exemplified below.
- hsa-miR-486-5p (SEQ ID NO: 22) was found enriched in whole blood relative to thyroid epithelial cells. Along with other microRNAs (data not shown), it was found to be associated with the amount of blood in thyroid FNA samples. Thus, hsa-miR-486-5p is one example of whole blood marker.
- microRNAs were detected in high correlation (>0.85) with miR-486-5p, and may also be considered blood markers, including hsa-miR-320a, hsa-miR-106a-5p, hsa-miR-93-5p, hsa-miR-17-3p, hsa-let-7d-5p, hsa-miR-107, hsa-miR-103a-3p, hsa-miR-17-5p, hsa-miR-191-5p, hsa-miR-25-3p, hsa-miR-106b-5p, hsa-miR-20a-5p, hsa-miR-18a-5p, hsa-miR-144-3p, hsa-miR-140-3p, hsa-miR-15b-5p, hsa-miR-16-5p, hsa-miR-92
- hsa-miR-342-3p (SEQ ID NO: 17 or 18) was one of the microRNAs, amongst others, which was enriched in white blood cells, and may therefore be considered an example of white blood cell marker.
- hsa-miR-342-3p was shown to be expressed in correlation with hsa-miR-150-5p, suggesting that hsa-miR-150-5p is also a white blood cell marker.
- hsa-miR-146a-5p was also shown to be expressed in white blood cells (data not shown).
- hsa-miR-200c-3p SEQ ID NO: 23 or 24
- hsa-miR-138-5p SEQ ID NO: 19, 20, or 21
- smears were generated with blood in the absence of thyroid tissue material, and compared with smears from thyroid tissue.
- Both hsa-miR-200c-3p and hsa-miR-138-5p were found to be expressed at much higher levels in the thyroid smears (both benign and malignant) compared to blood smears (data not shown).
- Other microRNAs were also found enriched in epithelial cells (data not shown).
- hsa-miR-200c-3p and hsa-miR-138-5p are examples of epithelial cell markers.
- the inventors found that the expression of hsa-miR-138-5p correlated with the presence of epithelial cells, and in certain subsets of the data hsa-miR-138-5p was found to be upregulated in benign samples (data not shown).
- MID-16582 (SEQ ID NO: 25) was found at higher expression levels in Hurthle cells.
- this microRNA is upregulated in follicular adenoma presenting Hurthle cells versus follicular adenomas not indicated to have Hurthle cells ( FIGS. 2A-2B ). This result may be attributed to the mitochondrial enrichment found in Hurthle cells.
- the present inventors have found that the sequence of MID-16582, as well as other nucleic acid sequences found in Hurthle cells, can be mapped to mitochondrial DNA (data not shown). Thus, MID-16582 is an example of Hurthle cell marker.
- the assay development set included about 360 distinct samples. Most of the samples were stained FNA smears (Papanicolaou, May-Grünwald Giemsa or Diff-Quik). Forty-five (45) FNA samples were in cell blocks. The samples were collected from medical centers in Israel, Europe and USA. Thirty-three of the samples came from thyroid nodules that were less than 1 cm in size. The smallest nodule size was 0.1 cm. Samples of medullary carcinoma were excluded from most of the analyses, unless otherwise indicated. Table 10 provides the distribution of the samples per category.
- results of the assay development in a sub-set of samples are shown in FIG. 3 .
- Expression of microRNAs hsa-miR-222-3p (SEQ ID NO: 1 or 2), hsa-miR-551b-3p (SEQ ID NO: 3 or 4), hsa-miR-31-5p (SEQ ID NO: 5, 6, or 7), hsa-miR-125b-5p (SEQ ID NO: 9), hsa-miR-146b-5p (SEQ ID NO: 10 or 11), hsa-miR-346 (SEQ ID NO: 14), hsa-miR-181c-5p (SEQ ID NO: 15), and hsa-miR-375 (SEQ ID NO: 8) above the threshold are found in correlation with malignant samples.
- the expression levels shown in FIG. 3 were obtained by the following formula: [50 - normalized Ct of each marker].
- the normalization was done by subtracting the mean signal of the normalizers.
- the value of the mean signal of the normalizers over all the samples used, was added to all the expression values detected, in order to bring the values to a range more manageable for calculation.
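- The normalization described above can be sketched as follows, assuming the steps are applied in this order: subtract the sample's own mean normalizer Ct, add back the mean normalizer Ct over the whole cohort, then subtract the result from 50. The function name, the choice of two normalizers and all numeric values are illustrative assumptions.

```python
from statistics import mean


def normalized_expression(sample_ct, normalizer_names, cohort_normalizer_mean):
    """Convert the raw Ct values of one sample into expression values.

    sample_ct: dict mapping microRNA name to raw Ct in this sample.
    normalizer_names: names of the normalizer microRNAs measured in the sample.
    cohort_normalizer_mean: mean normalizer Ct over all samples (a constant).
    """
    sample_normalizer_mean = mean(sample_ct[name] for name in normalizer_names)
    # Subtract the per-sample normalizer mean, add back the cohort-wide mean.
    shift = cohort_normalizer_mean - sample_normalizer_mean
    # 50 - normalized Ct, so that a lower Ct gives a higher expression value.
    return {name: 50 - (ct + shift) for name, ct in sample_ct.items()}


# Hypothetical Ct values for one sample (not taken from the text).
sample = {"hsa-miR-146b-5p": 28.0, "hsa-miR-23a-3p": 24.0, "hsa-miR-345-5p": 30.0}
expr = normalized_expression(
    sample,
    normalizer_names=["hsa-miR-23a-3p", "hsa-miR-345-5p"],
    cohort_normalizer_mean=26.0,
)
print(expr["hsa-miR-146b-5p"])
```

- Adding the cohort-wide mean back only shifts all values by the same constant; it does not change the differences between samples.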
- expression levels of hsa-miR-125a-5p correlate with that of hsa-miR-125b-5p.
- MicroRNA Ratios were obtained from pairs of microRNAs in an attempt to subtract certain factors from the classifier.
- a ratio of hsa-miR-31-5p:hsa-miR-342-3p makes it possible to reduce the contribution of white blood cells (through the expression of hsa-miR-342-3p, the denominator) to the expression of hsa-miR-31-5p (the numerator).
- Because Ct values are in log scale, ratios were created by subtracting one miR expression value from the other. Each ratio was further normalized by adding a constant, in order for the ratios to be within the same range as the microRNA normalized values.
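- Since the values are on a log scale, a microRNA "ratio" is a difference plus a constant. A minimal sketch, in which the offset value and the expression values are hypothetical (the text says only that a constant was added):

```python
def log_ratio(expression, numerator, denominator, offset):
    """Log-scale ratio numerator:denominator, shifted by a constant offset."""
    return expression[numerator] - expression[denominator] + offset


# Hypothetical normalized expression values for one sample.
expression = {"hsa-miR-31-5p": 21.5, "hsa-miR-342-3p": 19.0}
ratio = log_ratio(expression, "hsa-miR-31-5p", "hsa-miR-342-3p", offset=20.0)
print(ratio)
```

- A sample with more white blood cells has a higher hsa-miR-342-3p value and therefore a lower ratio, which is the correction the ratio is meant to provide.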
- microRNA normalized values were obtained by dividing microRNA expression levels by the expression levels of hsa-miR-23a-3p, MID-20094, MID-50969, hsa-miR-345-5p, hsa-miR-3074-5p, MID-50976, MID-50971, hsa-miR-5701 or hsa-miR-574-3p; and were further subtracted from 50, in order for lower Ct values to be associated with higher expression values.
- LDA linear discriminant type of discriminant analysis
- microRNA expression levels e.g. two (e.g. hsa-miR-551b-3p and hsa-miR-146b-5p), three (e.g. hsa-miR-551b-3p, hsa-miR-146b-5p, and hsa-miR-31-5p), or eight microRNAs (e.g.
- hsa-miR-146b-5p:hsa-miR-342-3p and hsa-miR-31-5p:hsa-miR-342-3p] [e.g. hsa-miR-146b-5p:hsa-miR-342-3p; hsa-miR-31-5p:hsa-miR-342-3p; hsa-miR-125b-5p:hsa-miR-138-5p], or [e.g.
- microRNA ratios e.g. two, three (hsa-miR-146b-5; hsa-miR-551b-3p; hsa-miR-222-3p), or eight microRNAs resulted in a sensitivity of between 80-82.3%, and a specificity of between 56-71.4%; or (ii) microRNA ratios [e.g.
- hsa-miR-146b-5p:hsa-miR-342-3p; hsa-miR-31-5p:hsa-miR-342-3p; hsa-miR-125b-5p:hsa-miR-200c-3p] resulted in a sensitivity of around 80% and a specificity of between 66-72.2%.
- a combination of microRNAs and microRNA ratios was used, resulting in a sensitivity of between 79-88% and a specificity of between 67.5-73.8% ( FIG. 10A-10C , FIG. 11A-11C , FIG. 12A-12C ).
- microRNA expression levels achieved sensitivity of between 81.1-82.9% and specificity of up to 74.2%.
- Analysis of six or eight microRNA ratios achieved sensitivity of between 78-80.5% and specificity of up to 65.6%.
- Analysis of a combination of microRNA expression levels and microRNA ratios achieved sensitivity of between 83.5-85.4% and specificity of up to 70.9% ( FIG. 7A-7C , FIG. 8A-8B , FIG. 9A-9C ).
- microRNA expression levels achieved sensitivity of between 73.9-82.6% and specificity of up to 73%.
- Analysis of a combination of microRNA expression levels and microRNA ratios achieved sensitivity of between 76.5-80.9% and specificity of up to 67.5% ( FIG. 13A-13C , FIG. 14A-14B , FIG. 15A-15C ).
- a third analysis was performed applying SVM (Support vector machine) as the algorithm, in which linear kernel was used.
- SVM Support vector machine
- the analysis with the SVM algorithm was applied to the three sets of samples as mentioned above (malignant+benign, indeterminate and Bethesda IV), using as features either different combinations of microRNA expression levels, microRNA ratios, or a combination of microRNA expression levels and microRNA ratios, respectively. The results are described below.
- microRNA expression levels achieved sensitivity of between 82.3-86% and specificity of up to 75.5%.
- Analysis of three, six or eight microRNA ratios achieved sensitivity of about 83% and specificity of up to 80.8%.
- Analysis of a combination of microRNA expression levels and microRNA ratios achieved sensitivity of between 82.9-86.6% and specificity of up to 83.4% (data not shown).
- microRNA expression levels achieved sensitivity of between 82.6-90.4% and specificity of up to 60.3%.
- Analysis of a combination of microRNA expression levels and microRNA ratios achieved sensitivity of between 80-89.9% and specificity of up to 71.4% ( FIG. 16A-16C , FIG. 17A-17C , FIG. 18A-18C ).
- a fourth analysis was performed applying Ensemble methods as the algorithm.
- An ensemble of up to 100 discriminant analysis classifiers was created using AdaBoost and applied to the data.
- the analysis with the Ensemble algorithm was applied to three sets of samples as mentioned above (malignant+benign, indeterminate and Bethesda IV), using as features either different combinations of microRNA expression levels, microRNA ratios, or a combination of microRNA expression levels and microRNA ratios. The results are described below.
- For malignant+benign samples, analysis of two, three or eight microRNA expression levels achieved sensitivity of about 85% and specificity of up to 74.8%. Analysis of three, six or eight microRNA ratios achieved sensitivity of about 83.5-86% and specificity of up to 79.5%. Analysis of a combination of microRNA expression levels and microRNA ratios achieved sensitivity of about 86% and specificity of up to 82.8% (data not shown).
- microRNA expression levels achieved sensitivity of between 84.3-88.7% and specificity of up to 64.3%.
- Analysis of two, three or eight microRNA ratios achieved sensitivity of between 86.1-89.7% and specificity of up to 65.1%.
- Analysis of a combination of microRNA expression levels and microRNA ratios achieved sensitivity of between 83.5-87.8% and specificity of up to 65.9% ( FIG. 19A-19C , FIG. 20A-20C , FIG. 21A-21C ).
- Example 10 A Classifier for Malignant Samples Including Medullary
- The same sample set used in Example 9, but including medullary malignant samples, was used for establishing a classifier. All classifiers (LDA, KNN, SVM and Discriminant Analysis Ensemble) were applied, and a representative set of results from the discriminant analysis algorithm showed that when normalized values of two microRNA ratios (e.g. hsa-miR-125b-5p:hsa-miR-138-5p; and hsa-miR-146b-5p:hsa-miR-342-3p) were used as the features for the classification, the sensitivity of the classifier was 84.7% and the specificity, 80.8% ( FIG. 30-31 ). When the normalized values of two microRNAs (e.g. hsa-miR-222-3p and hsa-miR-551b-3p) were used as the features for the classification, the sensitivity was 85.2% and the specificity, 53.6% ( FIG. 30-31 ).
- hsa-miR-486-5p SEQ ID NO: 22
- hsa-miR-200c-3p SEQ ID NO: 23 or 24
- hsa-miR-138-5p (SEQ ID NO: 19, 20, or 21) has also been shown to be low, compared to the threshold, in blood smears (data not shown). Samples with this profile are eligible to be disqualified and/or discarded from the protocol for classification of thyroid lesion samples.
- hsa-miR-342-3p SEQ ID NO: 17 or 18
- hsa-miR-342-3p correlates with white blood cells (data not shown).
- high expression of hsa-miR-342-3p compared to the threshold indicated lack of sufficient thyroid cells, and samples with this profile are eligible to be disqualified and/or discarded from the protocol for classification of thyroid lesion samples.
- hsa-miR-200c-3p is an indicator of the presence of epithelial cells in general, and specifically thyroid cells (data not shown and FIG. 32 ).
- the expression of hsa-miR-200c-3p above a threshold may be used as an indicator of sufficiency of thyroid cells in the sample.
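- The sample-sufficiency checks described above amount to a small quality-control gate ahead of classification. The sketch below is one way to express it; the threshold values are hypothetical parameters, since the text gives no numeric cut-offs for these markers, and the function name is illustrative.

```python
def passes_cell_content_qc(expr, epithelial_min, wbc_max):
    """Return (ok, reasons) for one sample.

    expr: dict mapping microRNA name to normalized expression value.
    epithelial_min: hypothetical minimum for hsa-miR-200c-3p (thyroid/epithelial cells).
    wbc_max: hypothetical maximum for hsa-miR-342-3p (white blood cells).
    """
    reasons = []
    if expr["hsa-miR-200c-3p"] < epithelial_min:
        reasons.append("low epithelial marker hsa-miR-200c-3p")
    if expr["hsa-miR-342-3p"] > wbc_max:
        reasons.append("high white blood cell marker hsa-miR-342-3p")
    return (not reasons, reasons)


# Hypothetical values: one adequate sample and one blood-dominated sample.
adequate = {"hsa-miR-200c-3p": 22.0, "hsa-miR-342-3p": 15.0}
bloody = {"hsa-miR-200c-3p": 12.0, "hsa-miR-342-3p": 24.0}
print(passes_cell_content_qc(adequate, epithelial_min=18.0, wbc_max=20.0))
print(passes_cell_content_qc(bloody, epithelial_min=18.0, wbc_max=20.0))
```

- Returning the reasons alongside the verdict keeps a record of why a sample was disqualified from classification.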
- the inventors also tested microRNA ratios for sub-typing benign thyroid tumors.
- the miR ratio of hsa-miR-125b-5p:hsa-miR-200c-3p was significant for classifying follicular adenoma (FA) versus Hashimoto samples (data not shown).
- 146b-5p, 222-3p, 31-5p, 125b-5p, 551-3p and 375 were found to be highly expressed in papillary carcinoma, while MID-16582 was found to be highly expressed in follicular carcinoma ( FIG. 34 ).
- the ratios of the following miR pairs were significant for classifying Papillary Carcinoma (PC) versus Follicular Carcinoma samples: hsa-miR-146b-5p:hsa-miR-342-3p, hsa-miR-125b-5p:hsa-miR-200c-3p, hsa-miR-222-3p:hsa-miR-486-5p, hsa-miR-31-5p:hsa-miR-342-3p, MID-16582:hsa-miR-200c-3p, MID-16582:hsa-miR-138-5p (data not shown).
- malignant thyroid tumor sub-typing may be performed using miR ratios, particularly miR ratios where the denominator is a cell marker microRNA, such as hsa-miR-486-5p, hsa-miR-200c-3p, hsa-miR-138-5p, and hsa-miR-342-3p.
- Example 13 Protocol for the Classification of Thyroid Nodules as Malignant or Benign
- a flowchart with a protocol for thyroid nodule sample analysis, from collection of FNA samples to laboratory analysis and diagnostic ( FIG. 35 ).
- FNA samples are collected from patients having thyroid nodules, and are routinely processed. Smears are prepared from the FNA samples.
- a specialist in cytopathology examines the FNA sample and provides an analysis. In cases where the analysis is inconclusive, particularly in samples classified as Bethesda III, IV, or V, i.e. so-called “indeterminate”, the sample is sent to Rosetta Genomics' laboratories to undergo microRNA profiling and conclusive diagnostic.
- Total RNA is extracted from the sample, which undergoes microRNA profiling.
- MicroRNA profiling may be performed by amplification (RT-PCR or NGS) or hybridization (microarray), as shown in the Examples above.
- the protocol may include any one of the following:
- One or more algorithms may be used during classification, and will be applied on data comprising single microRNAs expression, microRNA ratios, or a combination thereof.
- Samples wherein the hsa-miR-375 expression level is above a specific threshold may be determined as malignant, e.g. a threshold of at least 10, or a threshold of at least 18 (data not shown), when the expression was analyzed by array or PCR, respectively.
- the threshold is dependent on the normalization of the samples, as well as on the methodology used for measuring the microRNAs.
- the threshold may also be a function of the target sensitivity and specificity.
- Samples wherein the hsa-miR-146b-5p expression level is above a specific threshold will be determined as malignant (e.g. a threshold of at least 16; FIG. 28 and data not shown).
- the threshold is dependent on the normalization of the samples, as well as on the methodology used for measuring the microRNAs.
- the threshold may also be a function of the target sensitivity and specificity.
- Samples wherein the ratio hsa-miR-146b-5p:hsa-miR-342-3p, further to normalization, is above a specific threshold will be determined as malignant, e.g. a threshold of at least 16 ( FIG. 29 and data not shown).
- the threshold is dependent on the normalization of the samples, as well as on the methodology used for measuring the microRNAs.
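- The single-marker rules listed above can be sketched as a short decision function. The numeric thresholds are the example values quoted in the text; treating "a threshold of at least 10" as a greater-than-or-equal comparison is an assumption, as is the order in which the rules are tried. As the text notes, real thresholds depend on the normalization, the measurement platform, and the target sensitivity and specificity.

```python
# Example thresholds quoted in the text; they are not universal constants.
MIR375_THRESHOLD = {"array": 10, "pcr": 18}
MIR146B_THRESHOLD = 16
RATIO_146B_342_THRESHOLD = 16


def malignant_by_threshold(expr, ratio_146b_342=None, platform="pcr"):
    """Return the name of the first rule that calls the sample malignant, else None.

    expr: dict mapping microRNA name to normalized expression value.
    ratio_146b_342: normalized hsa-miR-146b-5p:hsa-miR-342-3p ratio, if available.
    platform: "array" or "pcr"; selects the hsa-miR-375 threshold.
    """
    missing = float("-inf")
    if expr.get("hsa-miR-375", missing) >= MIR375_THRESHOLD[platform]:
        return "hsa-miR-375"
    if expr.get("hsa-miR-146b-5p", missing) >= MIR146B_THRESHOLD:
        return "hsa-miR-146b-5p"
    if ratio_146b_342 is not None and ratio_146b_342 >= RATIO_146B_342_THRESHOLD:
        return "hsa-miR-146b-5p:hsa-miR-342-3p"
    return None


print(malignant_by_threshold({"hsa-miR-375": 12.0}, platform="array"))
print(malignant_by_threshold({"hsa-miR-375": 12.0}, platform="pcr"))
```

- A return value of None means only that none of these rules fired; such a sample would still go through the classifier rather than being called benign.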
- the level of expression of the normalizers may be used as an indicator for discarding samples, due to insufficient tumor-derived material. Thus, samples presenting low levels of any of the normalizers, or of the minimal, median or maximal value of expression for the normalizers, may be discarded. For example, samples with low levels of hsa-miR-23a-3p (compared to the overall levels of hsa-miR-23a-3p expression in the cohort) are likely to be misclassified. Conversely, high levels of hsa-miR-23a-3p improve the classification by improving sensitivity and specificity (data not shown).
- where the results permit, i.e. where they include the expression of microRNAs that may be associated with thyroid tumor sub-types, as shown in FIGS. 33 and 34 , for example, the sample is further classified according to its thyroid tumor subtype.
- Example 14 A Classifier for Diagnosing Indeterminate Thyroid Nodules as Benign or Malignant
- a training set of 375 FNA smears (Table 11) was used to develop a classifier for diagnosing indeterminate thyroid nodules as benign or malignant according to microRNA profiling, using the set of 24 miRNAs established and described in Example 8 above. From the total 375 FNA smear samples, 252 samples were profiled in the RG-IL (Rosetta Genomics, Rehovot, Israel) laboratory and 123 samples were profiled in the RGL-US (Rosetta Genomics, Philadelphia, US) laboratory.
- the classifier developed combines several linear discriminant analysis (LDA) steps and a KNN-based classifier.
- LDA linear discriminant analysis
- the KNN classifier uses a Pearson correlation distance metric over the Ct values of six of the measured miRNAs, and uses a training set of 314 samples. Samples with at least four benign neighbors are classified as benign by the KNN classifier. Only a subset of the training samples was used to determine the classifier. For example, samples with low microRNA expression in the KNN step were not included in the KNN classifier.
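- The KNN step can be sketched as follows, assuming that the Pearson correlation distance is 1 minus the Pearson correlation coefficient. The number of neighbors k is a hypothetical parameter (the text states only the rule of at least four benign neighbors), and the six-value Ct profiles are synthetic placeholders. The function reports only whether the KNN step calls a sample benign, because the final call in the full classifier also depends on the LDA steps.

```python
from math import sqrt


def pearson_distance(a, b):
    """1 - Pearson correlation of two equal-length profiles (must not be constant)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return 1 - cov / sqrt(var_a * var_b)


def knn_calls_benign(query, training, k, min_benign=4):
    """True if at least min_benign of the k nearest training samples are benign.

    training: list of (ct_profile, label) pairs, label "benign" or "malignant".
    k: number of neighbors considered (hypothetical; not stated in the text).
    """
    ranked = sorted(training, key=lambda item: pearson_distance(query, item[0]))
    benign = sum(1 for _, label in ranked[:k] if label == "benign")
    return benign >= min_benign


# Synthetic six-miRNA Ct profiles (placeholders, not study data).
training = [
    ([20, 21, 22, 23, 24, 25], "benign"),
    ([20, 21, 23, 23, 24, 26], "benign"),
    ([19, 21, 22, 24, 24, 25], "benign"),
    ([21, 21, 22, 23, 25, 25], "benign"),
    ([25, 24, 23, 22, 21, 20], "malignant"),
    ([26, 24, 23, 22, 21, 19], "malignant"),
    ([25, 25, 23, 22, 20, 20], "malignant"),
]
print(knn_calls_benign([18, 19, 20, 21, 22, 23], training, k=5))
print(knn_calls_benign([30, 29, 28, 27, 26, 25], training, k=5))
```

- A correlation-based distance compares the shape of the profile across the six miRNAs rather than its absolute level, so two samples that differ only by a uniform Ct shift are treated as identical.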
- Table 12 shows the result of the classification according to the classifier (malignant or benign), as well as sensitivity and specificity in each sub-group of samples, Indeterminate (all sub-types), Indeterminate (subtypes III and IV) and Determinates.
- Sensitivity of the classifier on indeterminate samples (Bethesda III-V) was estimated to be 86%, based on the training set, and the specificity was estimated to be 75%.
- the performance of the classifier on determinate samples (Bethesda II and VI) was estimated to be higher, with a sensitivity of 96% and a specificity of 82% for these samples.
- Table 13 shows the performance of the assay per histological type, and indicates the number of samples and the percentage of agreement between the classification obtained with the classifier and the cytopathologist classification.
- the classifier was also assessed on a set of 48 FNA cell blocks (used also in Examples 2 and 5), which were run in the final assay format.
- the performance of the classifier on the indeterminate samples (72% sensitivity and 79% specificity) was similar to the performance of the FNA smears, as can be seen in Table 14, with a slightly lower sensitivity and a slightly higher specificity.
- the sensitivity of the malignant Bethesda VI smears was 89% and the specificity of the benign Bethesda II samples was 63%.
- Example 15 Clinical Validation of a Classifier for the Diagnosis of Indeterminate Thyroid Nodules as Benign or Malignant
- the retrospective validation cohort is an independent set of 201 consecutive, indeterminate FNA samples (Table 11) from five centers, which was tested blindly in the clinical validation assay.
- This set of 201 samples included only samples for which at least two of the three pathologists (the original pathologist and at least one of the additional pathologists) agreed on the final diagnosis of the excised H&E stained nodule.
- samples whose diagnosis was determined to be medullary carcinoma, according to the original pathologist were included (regardless of the diagnosis of the additional two pathologists). Only twelve of the 201 samples (6%) failed during processing or QC steps, with the most common reason being low miRNA expression.
- the performance of the validation cohort was found to be very similar to the performance estimates of the training cohort, as can be seen in Tables 12 and 13.
- the NPV of the entire set was 91%, the sensitivity was 85%, the specificity was 72% and the PPV was 59% ( FIG. 37 ).
- the sensitivity was 84% and the specificity was 72%.
- Seventy of the validation samples from the US passed QC steps, which demonstrated a sensitivity of 82%, and a specificity of 71%.
- the nine misclassified malignant samples (Table 15) included samples from all three indeterminate Bethesda classes, both Giemsa and Papanicolaou stains, and were from three medical centers.
- the misclassified FC sample was described as having minimal capsular invasion, according to the original pathologist, as were the other two, correctly classified FC samples.
- the Hashimoto samples showed a lower correct classification rate, relative to both the training performance and to the other benign samples. However, this difference may be due to the small number of Hashimoto samples in the validation set.
- A subset of the validation set (“Agreement set”) was compiled. This set comprised 160 samples (80% of the validation set) for which all three pathologists agreed on the final diagnosis. This set demonstrated very high performance (Table 16).
- the NPV of the Agreement set was 99% (only one malignant sample was misclassified as benign), with a sensitivity of 98%, a specificity of 78%, and a PPV of 62%. If the NPV and PPV are calculated based on the sensitivity, specificity and prevalence, the NPV would remain over 95%, even if the rate of malignancy was above 60%.
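- The dependence of NPV and PPV on prevalence mentioned above follows from Bayes' rule. The sketch below uses the Agreement-set sensitivity (98%) and specificity (78%) from the text and evaluates them at a 60% rate of malignancy; the function name is illustrative.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a test applied at a given malignancy prevalence."""
    true_pos = sensitivity * prevalence
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)
    false_pos = (1 - specificity) * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


ppv, npv = predictive_values(sensitivity=0.98, specificity=0.78, prevalence=0.60)
print(f"PPV={ppv:.3f} NPV={npv:.3f}")  # NPV is about 0.963 at 60% prevalence
```

- At 60% prevalence the NPV works out to roughly 96%, in line with the statement that it would remain over 95% at that rate of malignancy.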
- the performance of the assay is influenced by the accuracy of the diagnosis. Therefore, the level of agreement between the pathologists for the different histological types (Table 16) was examined.
Abstract
The present invention provides a method for classification of thyroid tumors through the analysis of the expression patterns of specific microRNAs in fine needle aspiration samples. Thyroid tumor classification according to a microRNA expression signature allows optimization of diagnosis and treatment, as well as determination of signature-specific therapy.
Description
- The present invention relates to methods for classification of thyroid tumors. Specifically the invention relates to microRNA molecules associated with specific thyroid tumors.
- The accurate diagnosis of thyroid nodules continues to challenge physicians managing patients with thyroid disease. Patients with cytologically indeterminate nodules are often referred for diagnostic surgery, though most of these nodules prove post-surgery to be benign. This limitation of FNA cytology in the pre-operative diagnosis leads to a clinical need for reliable pre-operative molecular markers to distinguish benign from malignant thyroid nodules. MicroRNAs (miRs) are an important class of regulatory RNAs, which have a profound impact on a wide array of biological processes. These small (typically 18-24 nucleotides long) non-coding RNA molecules can modulate protein expression patterns by promoting RNA degradation, inhibiting mRNA translation, and also by affecting gene transcription. miRs play pivotal roles in diverse processes such as development and differentiation, control of cell proliferation, stress response and metabolism. The expression of many miRs was found to be altered in numerous types of human cancer, in some cases suggesting that such alterations may play a causative role in tumor progression.
- The thyroid gland is formed of two main types of cells: the follicular cells and the C or parafollicular cells. Follicular cells produce thyroid hormones, which are regulators of human metabolism. Overproduction of thyroid hormone (hyperthyroidism) causes rapid or irregular heartbeat, trouble sleeping, nervousness, hunger, weight loss, and a feeling of being too warm. In counterpart, hypothyroidism causes metabolism slowdown, tiredness, and weight gain. Thyroid hormone release is regulated by the thyroid-stimulating hormone (TSH), produced by the pituitary gland. The C cells produce calcitonin, a hormone responsible for use of calcium. Lymphocytes and stromal cells are also found in the thyroid.
- Thyroid cancer is the eighth most common cancer in the United States, and the most rapidly increasing cancer in the US, with more than 60,000 new cases diagnosed every year, and being the cause of about 1,800 deaths in 2014. Thyroid cancer usually presents itself as a palpable thyroid nodule. Different types of thyroid tumors develop from different cell types, which is a determinant for the gravity and the optimal treatment administered. Most of the growths and tumors in the thyroid gland are benign (non-cancerous) but others are malignant (cancerous).
- Approximately 95% of thyroid cancers are differentiated thyroid carcinomas (DTC) that arise from thyroid follicular cells. There are two histological subtypes of DTC: papillary thyroid carcinoma (PTC) type (90-95%) and follicular thyroid carcinoma (FTC) type (5-10%).
- The most commonly used method for thyroid cancer diagnosis is biopsy by fine-needle aspiration (FNA). FNA samples are routinely examined for cytology to determine whether the nodules are benign or cancerous. The sensitivity and specificity of the cytological examination of an FNA sample range from 68% to 98%, and 72% to 100%, respectively, depending on institutions and doctors. Unfortunately, in at least 25% of the cases the FNA specimens collected are either inadequate for diagnosis or indeterminable by cytology. In current medical practice, most patients with indeterminate results undergo surgery, and are subject to all risks and consequences of the surgical procedure. Follow-up results show that only 25% of the patients operated on are diagnosed with cancer, meaning that 75% of the patients underwent an unnecessary surgical procedure. Surgery entails significant cost and morbidity. One study has shown that adding molecular testing could have an overall positive impact on healthcare cost and patients' quality of life, reaching up to 74% fewer surgeries for benign nodules with no greater number of untreated cancers. Over a 5-year period, the study estimated a savings of almost $1,500 per patient [Li et al. 2011 J Clin Endocrinol. Metab 96(11): E1719-E1726].
- When examining cytochemical or genetic markers, there is no unique marker that on its own is able to provide reliable results in order to replace the morphologic diagnosis of thyroid lesions. U.S. Pat. No. 7,319,011 describes measuring the expression of any one of the genes DDIT3, ARG2, ITM1, C1orf24, TARSH, and ACO1 in a test follicular thyroid specimen for distinguishing follicular adenoma (FA) from follicular carcinoma (FC). U.S. Pat. No. 7,670,775 describes the analysis of the expression of CCND2, PCSK2, and PLAB for identifying malignant thyroid tissue. U.S. Pat. No. 6,723,506 describes the molecular characterization of PAX8-PPAR1 molecules in connection with diagnosis and treatment of thyroid follicular carcinomas. U.S. Pat. No. 7,378,233 describes the occurrence of the T1796A mutation of the BRAF gene in 24 (69%) of papillary thyroid carcinomas.
- Accumulated efforts have been invested in finding a molecular diagnostic test which will overcome the uncertainty of indeterminate cytology, and ultimately eliminate unnecessary surgery for non-cancer patients [Chen, Y. T. et. al. (2008) Mod. Pathol. 21, 1139-1146; He, H. et al. (2005) Proc. Natl Acad. Sci. USA 102, 19075-19080; Nikiforova, M. N. et al. (2009) Endocr. Pathol. 20, 85-91; Pallante, P. et al. (2006) Endocr. Relat. Cancer 13, 497-508; Nikiforova, M. N. et al. (2008) J. Clin. Endocrinol. Metab. 93, 1600-1608; Visone, R. et al. (2007) Endocr. Relat. Cancer 14(3):791-8; US 2014/0030714 A1; U.S. Pat. No. 8,541,170; US 2012/0220474 A1; U.S. Pat. Nos. 8,465,914; 7,598,052; 8,202,692; WO 2013/066678; WO 2012/129378; US 2013/0237590; EP 2772 550 A1; Pallante et al. (2010) Endocrine-Related Cancer 17 F91-F104; Dettmer et al. (2014) J Mol Endocrinol. March 6; 52(2):181-9].
- Nonetheless, numerous are the challenges that remain. It is of great necessity to develop a molecular assay with not only high sensitivity and specificity, but also that is able to deal with samples that failed the cytology analysis and that fall under the category of indeterminate samples. The present invention provides solutions for this challenge.
- Thus, in a first aspect, the present invention provides a method of classifying a thyroid lesion sample as malignant or benign, the method comprising:
- a. providing RNA extracted from a thyroid lesion sample obtained from a human subject;
- b. obtaining an expression profile comprising expression levels of microRNAs comprising SEQ ID NOS. 1-37;
- c. applying a classifier algorithm to the expression profile; wherein the classifier algorithm compares the expression profile to a reference value; and
- d. classifying said thyroid lesion as benign or malignant, or of a sub-type of benign or malignant tumor based on the result from the classifier algorithm.
- In one embodiment, the method of the invention further comprises, following step (b) or (c), a step of obtaining the ratio between the expression levels of at least one pair of nucleic acids; and wherein in step (d) said classifier algorithm may be applied to any one of the nucleic acid expression profile, said ratio of at least one pair of nucleic acids, or to a combination thereof.
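As an illustration of the ratio step, the following minimal sketch assumes that expression levels are reported on a log2-like scale (for example, 50 minus the normalized Ct value, as in FIG. 3), so that the ratio of a pair reduces to a difference of levels. The function name `pair_ratios`, the dictionary layout, and all numeric values are hypothetical and are not taken from the disclosure.

```python
def pair_ratios(profile, pairs):
    """Return log-scale ratios for the requested microRNA pairs.

    profile: dict mapping microRNA name -> expression level on a log2-like
             scale (an assumption of this sketch).
    pairs:   iterable of (numerator, denominator) microRNA names.
    """
    ratios = {}
    for num, den in pairs:
        # On a log scale, the ratio num/den is the difference of the levels.
        ratios[f"{num}:{den}"] = profile[num] - profile[den]
    return ratios


# Hypothetical expression levels, for illustration only.
profile = {
    "hsa-miR-146b-5p": 30.0,
    "hsa-miR-342-3p": 27.5,
    "hsa-miR-31-5p": 25.0,
}
pairs = [
    ("hsa-miR-146b-5p", "hsa-miR-342-3p"),
    ("hsa-miR-31-5p", "hsa-miR-342-3p"),
]
ratios = pair_ratios(profile, pairs)

# The classifier may be given the levels, the ratios, or both combined.
features = {**profile, **ratios}
```

Working with differences on a log scale is only one possible convention; if linear expression values were used instead, the ratio would be computed by division.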
- In a further embodiment of the method of the invention, said thyroid lesion sample is obtained by fine needle aspiration (FNA) biopsy. In one particular embodiment, said sample is a smear from a FNA biopsy.
- In another further embodiment of the method of the invention, said algorithm is a machine-learning algorithm. In one particular embodiment of said method of the invention, said algorithm further combines the microRNA expression profile with clinical or genetic data from said sample.
- In another further embodiment of the method of the invention, following step (b) if at least one of said microRNA expression levels, or the ratio between the expression levels of at least one pair of microRNAs is below or above a threshold for thyroid cells, said sample is discarded based on the expression level of said microRNA.
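The sample-rejection step described above can be sketched as a simple range check. This is a minimal sketch under stated assumptions: the marker names, the limits, and the function name `passes_qc` are hypothetical, and the disclosure does not specify here which microRNAs or threshold values are used.

```python
def passes_qc(values, limits):
    """Return True if every checked value lies within its allowed range.

    values: dict mapping a marker (a microRNA or a ratio) -> measured value.
    limits: dict mapping marker -> (lower, upper); None disables a bound.
    """
    for marker, (lower, upper) in limits.items():
        value = values[marker]
        if lower is not None and value < lower:
            return False  # below the threshold for thyroid cells
        if upper is not None and value > upper:
            return False  # above the threshold for thyroid cells
    return True


# Hypothetical markers and limits, for illustration only.
limits = {"marker_a": (20.0, None), "marker_b": (None, 32.0)}
accepted = {"marker_a": 24.0, "marker_b": 28.0}
rejected = {"marker_a": 18.0, "marker_b": 28.0}

keep_first = passes_qc(accepted, limits)
keep_second = passes_qc(rejected, limits)
```

A sample for which `passes_qc` returns False would be discarded before the classifier is applied.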
- In another further embodiment of the method of the invention, said measuring is performed by hybridization, amplification or a next generation sequencing method.
- In particular, an expression profile of microRNAs comprising SEQ ID NOS. 1-13, 17-22, 25, 26, 36 and 37 is determined.
- In one particular embodiment of the method of the invention, said hybridization comprises contacting the sample with probes, wherein the probes comprise (i) DNA equivalents of the microRNAs, (ii) the complements thereof, (iii) sequences at least 80% identical to (i) or (ii) or (iv) a nucleic acid sequence that hybridizes with at least eight contiguous nucleotides of any one of SEQ ID NOs 1-37. In another particular embodiment of the invention, said probes are attached to a solid substrate.
- In another further particular embodiment of the method of the invention, said amplification is real-time polymerase chain reaction (RT-PCR), said RT-PCR amplification method comprising forward and reverse primers, and optionally further comprising hybridization with a probe.
- Amplification by RT-PCR comprises contacting the RNA with forward and reverse primers for each of the miRNAs, wherein each forward primer comprises 15-21 nucleotides identical to one of the miRNAs. Forward primers are as defined in Table 8.
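The forward-primer criterion can be checked computationally as sketched below. The sketch interprets "15-21 nucleotides identical to one of the miRNAs" as a contiguous stretch shared between the primer and the DNA equivalent of the miRNA, which is an assumption. The function names and the sequences are hypothetical; they are not the primers of Table 8.

```python
def longest_shared_stretch(primer, mirna_rna):
    """Length of the longest contiguous stretch shared by primer and miRNA."""
    target = mirna_rna.upper().replace("U", "T")  # DNA equivalent
    primer = primer.upper()
    best = 0
    # prev[j] holds the length of the common run ending at
    # primer[i - 2] and target[j - 1] from the previous row.
    prev = [0] * (len(target) + 1)
    for i in range(1, len(primer) + 1):
        curr = [0] * (len(target) + 1)
        for j in range(1, len(target) + 1):
            if primer[i - 1] == target[j - 1]:
                curr[j] = prev[j - 1] + 1
                best = max(best, curr[j])
        prev = curr
    return best


def primer_matches(primer, mirna_rna, min_len=15, max_len=21):
    """True if the shared stretch falls within the 15-21 nucleotide range."""
    return min_len <= longest_shared_stretch(primer, mirna_rna) <= max_len


# Arbitrary illustrative sequences, not sequences from the disclosure.
mirna = "ACGUUGCAACGUUGCAACGUAG"
primer = "GGC" + "ACGTTGCAACGTTGCAA"  # 3-nt tail plus 17 matching nucleotides
short_primer = "ACGTTGCAAC"           # only 10 matching nucleotides
```

With these inputs, `primer_matches(primer, mirna)` evaluates the 17-nucleotide shared stretch against the 15-21 range, while `short_primer` falls below it.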
- In another further embodiment, said method further comprises the step of administering a differential treatment to said subject if said thyroid lesion is benign or malignant.
- In another further particular embodiment of the method of the invention, said lesion is malignant and said treatment is any one of surgery, chemotherapy, radiotherapy, hormone therapy, or any other recommended treatment.
- In one embodiment, said probe is a general probe. In another embodiment said probe is a microRNA sequence-specific probe.
- In another further aspect, the present invention provides an isolated nucleic acid, said nucleic acid comprising at least 12 contiguous nucleotides at least 80% identical to the sequence of any one of SEQ ID NOs. 27-29, 33, 34, 139, 140, 307 and 308.
- In another further aspect, the present invention provides a pharmaceutical composition comprising as active agent the isolated nucleic acids described herein, and optionally adjuvants, carriers, diluents and excipients. Thus, said nucleic acid molecules may be comprised as an active agent in a pharmaceutical composition, a formulation or a medicament.
- In another further aspect, the present invention provides a vector comprising the isolated nucleic acid described herein.
- In another further aspect, the present invention provides a probe comprising the isolated nucleic acid described herein.
- In another further aspect, the present invention provides a biochip comprising the isolated nucleic acid described herein.
- In another further aspect, the present invention provides the use of an isolated nucleic acid as described herein in the preparation of a medicament.
FIGS. 1A-1B: Novel microRNAs detected by next generation sequencing. FIG. 1A shows the predicted secondary structure of two novel microRNAs, MD2-495 (top) and MD2-437 (bottom), detected in thyroid tissue. FIG. 1B shows the expression of the two novel microRNAs in each one of 11 resected thyroid samples.
FIGS. 2A-2B: Hurthle cell marker. The plots show higher expression of MID-16582 in follicular adenomas presenting Hurthle cells versus follicular adenomas with no indication of Hurthle cells. Sign.=significant; Diff.=differential; f-ch=fold change; Bl.=blood; NT, not tested. FIG. 2A: The y and x axes show the median array expression levels of the miRs in FA (follicular adenoma) samples not documented as having Hurthle cells (n=22) versus FA samples with Hurthle cells (n=9). The dashed factor line=×1.5. FIG. 2B: The y and x axes show the median PCR expression levels of the miRs in FA samples with no indication of Hurthle cells (n=21) versus FA samples with Hurthle cells (n=9). The dashed factor line=±0.6.
FIG. 3: Profiling of malignant and benign samples with the Thyroid assay set of microRNAs. The x and y axes show the expression levels of the miRs in benign (B) (n=166) versus malignant (M) (n=187) samples, respectively. The microRNA median expression levels for hsa-miR-222-3p, hsa-miR-551b-3p, hsa-miR-31-5p, hsa-miR-125b-5p, hsa-miR-146b-5p, hsa-miR-152-3p, hsa-miR-346, hsa-miR-181c-5p, hsa-miR-424-3p, and hsa-miR-375 are highlighted. The numbers refer to (50 minus the normalized Ct value). Diamonds (♦) represent any one of the microRNAs of SEQ ID NOs. 26-37. Sign.=significant; Diff.=differential; f-ch=fold change. The dashed factor line=±0.6.
FIGS. 4A-4C: A Discriminant Analysis classifier was used to classify samples from the malignant+benign cohort as malignant (diamonds, M) or benign (squares, B). FIG. 4A: A classifier with two microRNAs (hsa-miR-551b-3p and hsa-miR-146b-5p) presented sensitivity of 84.8% and specificity of 68.9%. The grey shaded area marks the space in which a sample is classified as malignant, as determined by the classifier. FIG. 4B: A classifier with three microRNAs (hsa-miR-551b-3p, hsa-miR-146b-5p, and hsa-miR-31-5p) presented sensitivity of 82.9% and specificity of 72.2%. Misclassified samples (miscl.) are represented by a dot. FIG. 4C: A classifier with eight microRNAs (hsa-miR-551b-3p; hsa-miR-146b-5p; hsa-miR-31-5p; hsa-miR-222-3p; hsa-miR-375; hsa-miR-125b-5p; hsa-miR-152-3p; hsa-miR-181c-5p) presented sensitivity of 83.5% and specificity of 81.5%. The figure shows a confusion matrix where the x-axis shows the classifier answer (Clas. Ans.) while the y-axis shows the true diagnosis (Real class=re.cl.).
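The sensitivity and specificity values quoted in the figure descriptions follow from the confusion matrix of classifier answer versus true diagnosis. The following minimal sketch shows the calculation, treating malignant (M) as the positive class. The function name and the example labels are hypothetical and do not reproduce the cohort data.

```python
def sensitivity_specificity(true_labels, predicted_labels, positive="M"):
    """Compute (sensitivity, specificity) from paired label sequences."""
    tp = fn = tn = fp = 0
    for truth, answer in zip(true_labels, predicted_labels):
        if truth == positive:
            if answer == positive:
                tp += 1  # malignant sample classified as malignant
            else:
                fn += 1  # malignant sample classified as benign
        else:
            if answer == positive:
                fp += 1  # benign sample classified as malignant
            else:
                tn += 1  # benign sample classified as benign
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity, specificity


# Hypothetical labels, for illustration only.
real_class = ["M", "M", "M", "M", "M", "B", "B", "B", "B"]
classifier_answer = ["M", "M", "M", "M", "B", "B", "B", "B", "M"]
sens, spec = sensitivity_specificity(real_class, classifier_answer)
```

Sensitivity is the fraction of truly malignant samples that the classifier calls malignant, and specificity is the fraction of truly benign samples that it calls benign.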
FIGS. 5A-5C: A Discriminant Analysis classifier was used to classify samples from the malignant+benign cohort as malignant (diamonds, M) or benign (squares, B). FIG. 5A: A classifier with two microRNA ratios (hsa-miR-146b-5p:hsa-miR-342-3p and hsa-miR-31-5p:hsa-miR-342-3p) presented sensitivity of 78% and specificity of 79.5%. The grey shaded area marks the space in which a sample is classified as malignant, as determined by the classifier. FIG. 5B: A classifier with three microRNA ratios (hsa-miR-146b-5p:hsa-miR-342-3p; hsa-miR-31-5p:hsa-miR-342-3p; hsa-miR-125b-5p:hsa-miR-138-5p) presented sensitivity of 81.1% and specificity of 82.1%. Misclassified samples (miscl.) are represented by a dot. FIG. 5C: A classifier with eight microRNA ratios (hsa-miR-146b-5p:hsa-miR-342-3p; hsa-miR-31-5p:hsa-miR-342-3p; hsa-miR-125b-5p:hsa-miR-138-5p; hsa-miR-125b-5p:hsa-miR-200c-3p; hsa-miR-222-3p:hsa-miR-486-5p; hsa-miR-200c-3p:hsa-miR-486-5p; MID-16582:hsa-miR-200c-3p; MID-16582:hsa-miR-138-5p) is shown as a confusion matrix where the x-axis shows the classifier answer (Clas. Ans.), while the y-axis shows the true diagnosis (Real class=re.cl.), and presented sensitivity of 74.4% and specificity of 84.1%.
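For two features, a discriminant analysis classifier of the kind referred to above can be sketched as a linear discriminant with a pooled within-class scatter matrix and equal class priors. This particular variant is an assumption; the figure descriptions do not state which form of discriminant analysis was used. The function names and the training points are hypothetical.

```python
def fit_lda_2d(points, labels, positive="M"):
    """Fit a two-feature linear discriminant; return (weights, threshold)."""
    pos = [p for p, lab in zip(points, labels) if lab == positive]
    neg = [p for p, lab in zip(points, labels) if lab != positive]

    def mean(group):
        n = len(group)
        return (sum(p[0] for p in group) / n, sum(p[1] for p in group) / n)

    mu_p, mu_n = mean(pos), mean(neg)

    # Pooled within-class scatter matrix [[sxx, sxy], [sxy, syy]].
    sxx = sxy = syy = 0.0
    for group, mu in ((pos, mu_p), (neg, mu_n)):
        for x, y in group:
            dx, dy = x - mu[0], y - mu[1]
            sxx += dx * dx
            sxy += dx * dy
            syy += dy * dy
    det = sxx * syy - sxy * sxy
    if det == 0:
        raise ValueError("singular within-class scatter matrix")

    # w = S^-1 (mu_p - mu_n), using the closed-form inverse of a 2x2 matrix.
    dx, dy = mu_p[0] - mu_n[0], mu_p[1] - mu_n[1]
    w = ((syy * dx - sxy * dy) / det, (-sxy * dx + sxx * dy) / det)

    # Equal priors: the boundary passes through the midpoint of the means.
    mid = ((mu_p[0] + mu_n[0]) / 2, (mu_p[1] + mu_n[1]) / 2)
    threshold = w[0] * mid[0] + w[1] * mid[1]
    return w, threshold


def predict_lda_2d(model, point, positive="M", negative="B"):
    """Classify a point by the side of the linear boundary it falls on."""
    w, threshold = model
    score = w[0] * point[0] + w[1] * point[1]
    return positive if score > threshold else negative


# Hypothetical two-feature training data (e.g. two microRNA ratios).
train = [(3, 2), (4, 3), (3, 3), (4, 2), (0, 0), (1, 1), (0, 1), (1, 0)]
labels = ["M", "M", "M", "M", "B", "B", "B", "B"]
model = fit_lda_2d(train, labels)
```

The region where `predict_lda_2d` returns "M" corresponds to the grey shaded area of the two-feature panels, in which a sample is classified as malignant.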
FIGS. 6A-6C: A Discriminant Analysis classifier was used to classify samples from the malignant+benign cohort as malignant (diamonds, M) or benign (squares, B), based on a combination of microRNAs and microRNA ratios. FIG. 6A: A classifier of one microRNA ratio and one microRNA (hsa-miR-146b-5p:hsa-miR-342-3p; hsa-miR-551b-3p) presented sensitivity of 82.9% and specificity of 82.8%. The grey shaded area marks the space in which a sample is classified as malignant, as determined by the classifier. FIG. 6B: A classifier of one microRNA ratio and two microRNAs (hsa-miR-146b-5p:hsa-miR-342-3p; hsa-miR-551b-3p; hsa-miR-146b-5p) presented sensitivity of 82.9% and specificity of 82.8%. FIG. 6C: A classifier of five microRNAs and three microRNA ratios (hsa-miR-146b-5p:hsa-miR-342-3p; hsa-miR-551b-3p; hsa-miR-146b-5p; hsa-miR-31-5p:hsa-miR-342-3p; hsa-miR-31-5p; hsa-miR-222-3p; hsa-miR-125b-5p:hsa-miR-138-5p; hsa-miR-375) as the features for the classification is shown as a confusion matrix where the x-axis shows the classifier answer (Clas. Ans.), while the y-axis shows the true diagnosis (Real class=re.cl.), and presented sensitivity of 93.3% and specificity of 42.4%.
FIGS. 7A-7C: A K-nearest neighbor (KNN) classifier was used to classify samples from the malignant+benign cohort as malignant (M) or benign (B). FIG. 7A: A classifier using six microRNAs (hsa-miR-551b-3p; hsa-miR-146b-5p; hsa-miR-31-5p; hsa-miR-222-3p; hsa-miR-375; hsa-miR-125b-5p) is shown as a confusion matrix where the x-axis shows the classifier answer (Clas. Ans.), while the y-axis shows the true diagnosis (Real class=re.cl.), and presented sensitivity of 82.3% and specificity of 68.2%. FIG. 7B: A classifier of eight microRNAs (hsa-miR-551b-3p; hsa-miR-146b-5p; hsa-miR-31-5p; hsa-miR-222-3p; hsa-miR-375; hsa-miR-125b-5p; hsa-miR-152-3p; hsa-miR-181c-5p) is shown as a confusion matrix where the x-axis shows the classifier answer (Clas. Ans.), while the y-axis shows the true diagnosis (Real class=re.cl.), and presented sensitivity of 82.9% and specificity of 74.2%. FIG. 7C: A classifier using twelve microRNAs (hsa-miR-551b-3p; hsa-miR-146b-5p; hsa-miR-31-5p; hsa-miR-222-3p; hsa-miR-375; hsa-miR-125b-5p; hsa-miR-152-3p; hsa-miR-181c-5p; hsa-miR-486-5p; hsa-miR-424-3p; hsa-miR-200c-3p; hsa-miR-346) is shown as a confusion matrix where the x-axis shows the classifier answer (Clas. Ans.), while the y-axis shows the true diagnosis (Real class=re.cl.), and presented sensitivity of 81.1% and specificity of 68.9%.
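A K-nearest neighbor classifier of the kind referred to above assigns a sample the majority label among its closest training samples. The sketch below assumes Euclidean distance and k=3; neither the distance metric nor the value of k is stated in the figure descriptions. The function name and the training data are hypothetical.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Return the majority label among the k nearest training points."""
    # Pair each training point's Euclidean distance to the query with its label.
    distances = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


# Hypothetical two-feature training data; real profiles would have one
# coordinate per microRNA or microRNA ratio used as a feature.
train_points = [(3, 2), (4, 3), (3, 3), (0, 0), (1, 1), (0, 1)]
train_labels = ["M", "M", "M", "B", "B", "B"]

answer_high = knn_predict(train_points, train_labels, (3.5, 2.5))
answer_low = knn_predict(train_points, train_labels, (0.2, 0.4))
```

Because KNN relies on distances, features measured on different scales are usually standardized first; that step is omitted here for brevity.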
FIGS. 8A-8B: A KNN classifier was used to classify samples from the malignant+benign cohort as malignant (M) or benign (B). FIG. 8A: A classifier using six microRNA ratios (hsa-miR-146b-5p:hsa-miR-342-3p; hsa-miR-31-5p:hsa-miR-342-3p; hsa-miR-125b-5p:hsa-miR-138-5p; hsa-miR-125b-5p:hsa-miR-200c-3p; hsa-miR-222-3p:hsa-miR-486-5p; hsa-miR-200c-3p:hsa-miR-486-5p) as the features for the classification is shown as a confusion matrix where the x-axis represents the classifier answer (Clas. Ans.), and the y-axis represents the true diagnosis (Real class=re.cl.), and presented sensitivity of 78% and specificity of 58.9%. FIG. 8B: A classifier using eight microRNA ratios (hsa-miR-146b-5p:hsa-miR-342-3p; hsa-miR-31-5p:hsa-miR-342-3p; hsa-miR-125b-5p:hsa-miR-138-5p; hsa-miR-125b-5p:hsa-miR-200c-3p; hsa-miR-222-3p:hsa-miR-486-5p; hsa-miR-200c-3p:hsa-miR-486-5p; MID-16582:hsa-miR-200c-3p; MID-16582:hsa-miR-138-5p) as the features for the classification is shown as a confusion matrix where the x-axis represents the classifier answer (Clas. Ans.), and the y-axis represents the true diagnosis (Real class=re.cl.), and presented sensitivity of 80.5% and specificity of 65.6%.
FIGS. 9A-9C: A KNN classifier was used to classify samples from the malignant+benign cohort as malignant (M) or benign (B). FIG. 9A: A classifier using four microRNAs and two microRNA ratios (hsa-miR-31-5p; hsa-miR-222-3p; hsa-miR-551b-3p; hsa-miR-146b-5p; hsa-miR-146b-5p:hsa-miR-342-3p; hsa-miR-31-5p:hsa-miR-342-3p) as the features for the classification is shown as a confusion matrix where the x-axis represents the classifier answer (Clas. Ans.), while the y-axis represents the true diagnosis (Real class=re.cl.), and presented sensitivity of 85.4% and specificity of 66.9%. FIG. 9B: A classifier using five microRNAs and three microRNA ratios (hsa-miR-551b-3p; hsa-miR-146b-5p; hsa-miR-375; hsa-miR-222-3p; hsa-miR-146b-5p:hsa-miR-342-3p; hsa-miR-31-5p:hsa-miR-342-3p; hsa-miR-31-5p; hsa-miR-125b-5p:hsa-miR-138-5p) as the features for the classification is shown as a confusion matrix where the x-axis represents the classifier answer (Clas. Ans.), while the y-axis represents the true diagnosis (Real class=re.cl.), and presented sensitivity of 83.5% and specificity of 70.9%. FIG. 9C: A classifier using seven microRNAs and five microRNA ratios (hsa-miR-375; hsa-miR-551b-3p; hsa-miR-146b-5p; hsa-miR-152-3p; hsa-miR-125b-5p; hsa-miR-222-3p; hsa-miR-31-5p; hsa-miR-146b-5p:hsa-miR-342-3p; hsa-miR-31-5p:hsa-miR-342-3p; hsa-miR-125b-5p:hsa-miR-138-5p; hsa-miR-125b-5p:hsa-miR-200c-3p; hsa-miR-222-3p:hsa-miR-486-5p) as the features for the classification is shown as a confusion matrix where the x-axis represents the classifier answer (Clas. Ans.), while the y-axis represents the true diagnosis (Real class=re.cl.), and presented sensitivity of 83.5% and specificity of 66.9%.
FIGS. 10A-10C: A Discriminant Analysis classifier was used to classify samples from the Indeterminate sub-cohort as malignant (diamonds, M) or benign (squares, B). FIG. 10A: A classifier using two microRNAs (hsa-miR-146b-5p; hsa-miR-551b-3p) showed sensitivity of 80% and specificity of 56.3%. The grey shaded area marks the space in which a sample is classified as malignant, as determined by the classifier. FIG. 10B: A classifier of three microRNAs (hsa-miR-146b-5p; hsa-miR-551b-3p; hsa-miR-222-3p) showed sensitivity of 82.6% and specificity of 59.5%. Misclassified samples (miscl.) are represented by a dot. FIG. 10C: A classifier of eight microRNAs (hsa-miR-146b-5p; hsa-miR-551b-3p; hsa-miR-222-3p; hsa-miR-125b-5p; hsa-miR-31-5p; hsa-miR-375; hsa-miR-152-3p; hsa-miR-181c-5p) is shown as a confusion matrix where the x-axis shows the classifier answer (Clas. Ans.), and the y-axis shows the true diagnosis (Real class=re.cl.). The sensitivity of this classifier is 81.7% and the specificity is 71.4%.
FIGS. 11A-11C: A Discriminant Analysis classifier was used to classify samples from the Indeterminate sub-cohort as malignant (diamonds, M) or benign (squares, B). FIG. 11A: A classifier using two microRNA ratios (hsa-miR-146b-5p:hsa-miR-342-3p; hsa-miR-31-5p:hsa-miR-342-3p) as the features for the classification presented sensitivity of 80% and specificity of 72.2%. The grey shaded area marks the space in which a sample is classified as malignant. FIG. 11B: A classifier using three microRNA ratios (hsa-miR-146b-5p:hsa-miR-342-3p; hsa-miR-31-5p:hsa-miR-342-3p; hsa-miR-125b-5p:hsa-miR-200c-3p) as the features for the classification presented sensitivity of 80% and specificity of 69%. Misclassified samples (miscl.) are represented by a dot. FIG. 11C: A classifier using eight microRNA ratios (hsa-miR-146b-5p:hsa-miR-342-3p; hsa-miR-31-5p:hsa-miR-342-3p; hsa-miR-125b-5p:hsa-miR-200c-3p; hsa-miR-125b-5p:hsa-miR-138-5p; hsa-miR-222-3p:hsa-miR-486-5p; MID-16582:hsa-miR-200c-3p; MID-16582:hsa-miR-138-5p; hsa-miR-200c-3p:hsa-miR-486-5p) as the features for the classification is shown as a confusion matrix where the x-axis shows the classifier answer (Clas. Ans.), and the y-axis shows the true diagnosis (Real class=re.cl.), and presented sensitivity of 80% and specificity of 66.7%.
FIGS. 12A-12C: A Discriminant Analysis classifier was used to classify samples from the Indeterminate sub-cohort as malignant (diamonds, M) or benign (squares, B), using a combination of microRNAs and microRNA ratios. FIG. 12A: A classifier using one microRNA and one microRNA ratio (hsa-miR-146b-5p; hsa-miR-146b-5p:hsa-miR-342-3p) as the features for the classification presented sensitivity of 80% and specificity of 73.8%. The grey shaded area marks the space in which a sample is classified as malignant, as determined by the classifier. FIG. 12B: A classifier using two microRNAs and one microRNA ratio (hsa-miR-146b-5p; hsa-miR-551b-3p; hsa-miR-146b-5p:hsa-miR-342-3p) as the features for the classification presented sensitivity of 79.1% and specificity of 73%. FIG. 12C: A classifier using five microRNAs and three microRNA ratios (hsa-miR-146b-5p; hsa-miR-551b-3p; hsa-miR-222-3p; hsa-miR-125b-5p; hsa-miR-31-5p; hsa-miR-146b-5p:hsa-miR-342-3p; hsa-miR-31-5p:hsa-miR-342-3p; hsa-miR-125b-5p:hsa-miR-200c-3p) as the features for the classification presented sensitivity of 87.8% and specificity of 67.5%, and it is shown as a confusion matrix where the x-axis shows the classifier answer (Clas. Ans.), while the y-axis shows the true diagnosis (Real class=re.cl.).
FIGS. 13A-13C: A KNN classifier was used to classify samples from the Indeterminate sub-cohort as malignant (M) or benign (B). FIG. 13A: A classifier of six microRNAs (hsa-miR-146b-5p; hsa-miR-551b-3p; hsa-miR-222-3p; hsa-miR-125b-5p; hsa-miR-31-5p; hsa-miR-375) is shown as a confusion matrix where the x-axis shows the classifier answer (Clas. Ans.), while the y-axis shows the true diagnosis (Real class=re.cl.), and presented sensitivity of 78.3% and specificity of 65.9%. FIG. 13B: A classifier using eight microRNAs (hsa-miR-146b-5p; hsa-miR-551b-3p; hsa-miR-222-3p; hsa-miR-125b-5p; hsa-miR-31-5p; hsa-miR-375; hsa-miR-152-3p; hsa-miR-181c-5p) as the features for classification is shown as a confusion matrix where the x-axis shows the classifier answer (Clas. Ans.), while the y-axis shows the true diagnosis (Real class=re.cl.), and presented sensitivity of 82.6% and specificity of 73%. FIG. 13C: A classifier using 12 microRNAs (hsa-miR-551b-3p; hsa-miR-146b-5p; hsa-miR-222-3p; hsa-miR-125b-5p; hsa-miR-31-5p; hsa-miR-375; hsa-miR-152-3p; hsa-miR-181c-5p; hsa-miR-424-3p; hsa-miR-486-5p; hsa-miR-200c-3p; hsa-miR-346) as the features for the classification is shown as a confusion matrix where the x-axis shows the classifier answer (Clas. Ans.), while the y-axis shows the true diagnosis (Real class=re.cl.), and presented sensitivity of 73.9% and specificity of 68.3%.
FIGS. 14A-14B: A KNN classifier was used to classify samples from the Indeterminate sub-cohort as malignant (M) or benign (B), using microRNA ratios. FIG. 14A: A classifier using six microRNA ratios (hsa-miR-146b-5p:hsa-miR-342-3p; hsa-miR-31-5p:hsa-miR-342-3p; hsa-miR-125b-5p:hsa-miR-200c-3p; hsa-miR-125b-5p:hsa-miR-138-5p; hsa-miR-222-3p:hsa-miR-486-5p; MID-16582:hsa-miR-200c-3p) as the features for classification is shown as a confusion matrix where the x-axis shows the classifier answer (Clas. Ans.), while the y-axis shows the true diagnosis (Real class=re.cl.), and presented sensitivity of 80.9% and specificity of 65.9%. FIG. 14B: A classifier using eight microRNA ratios (hsa-miR-146b-5p:hsa-miR-342-3p; hsa-miR-31-5p:hsa-miR-342-3p; hsa-miR-125b-5p:hsa-miR-200c-3p; hsa-miR-125b-5p:hsa-miR-138-5p; hsa-miR-222-3p:hsa-miR-486-5p; MID-16582:hsa-miR-200c-3p; MID-16582:hsa-miR-138-5p; hsa-miR-200c-3p:hsa-miR-486-5p) as the features for classification is shown as a confusion matrix where the x-axis shows the classifier answer (Clas. Ans.), while the y-axis shows the true diagnosis (Real class=re.cl.), and presented sensitivity of 76.5% and specificity of 62.7%.
FIGS. 15A-15C: A KNN classifier was used to classify samples from the Indeterminate sub-cohort as malignant (M) or benign (B) using microRNAs and microRNA ratios. FIG. 15A: A classifier using three microRNAs and three microRNA ratios (hsa-miR-146b-5p; hsa-miR-551b-3p; hsa-miR-222-3p; hsa-miR-146b-5p:hsa-miR-342-3p; hsa-miR-31-5p:hsa-miR-342-3p; hsa-miR-125b-5p:hsa-miR-200c-3p) as the features for classification is shown as a confusion matrix where the x-axis shows the classifier answer (Clas. Ans.), while the y-axis shows the true diagnosis (Real class=re.cl.), and presented sensitivity of 76.5% and specificity of 57.9%. FIG. 15B: A classifier using five microRNAs and three microRNA ratios (hsa-miR-125b-5p; hsa-miR-31-5p; hsa-miR-551b-3p; hsa-miR-146b-5p; hsa-miR-222-3p; hsa-miR-146b-5p:hsa-miR-342-3p; hsa-miR-31-5p:hsa-miR-342-3p; hsa-miR-125b-5p:hsa-miR-200c-3p) as the features for classification is shown as a confusion matrix where the x-axis shows the classifier answer (Clas. Ans.), while the y-axis shows the true diagnosis (Real class=re.cl.), and presented sensitivity of 78.3% and specificity of 64.3%. FIG. 15C: A classifier using seven microRNAs and five microRNA ratios (hsa-miR-146b-5p; hsa-miR-551b-3p; hsa-miR-222-3p; hsa-miR-125b-5p; hsa-miR-31-5p; hsa-miR-375; hsa-miR-152-3p; hsa-miR-146b-5p:hsa-miR-342-3p; hsa-miR-31-5p:hsa-miR-342-3p; hsa-miR-125b-5p:hsa-miR-200c-3p; hsa-miR-125b-5p:hsa-miR-138-5p; hsa-miR-222-3p:hsa-miR-486-5p) as the features for classification is shown as a confusion matrix where the x-axis shows the classifier answer (Clas. Ans.), while the y-axis shows the true diagnosis (Real class=re.cl.), and presented sensitivity of 80.9% and specificity of 67.5%.
FIGS. 16A-16C: An SVM classifier was used to classify samples from the Indeterminate sub-cohort as malignant (diamonds, M) or benign (squares, B). FIG. 16A: A classifier using three microRNAs (hsa-miR-146b-5p; hsa-miR-551b-3p; hsa-miR-222-3p) as the features for classification presented sensitivity of 82.6% and specificity of 54.8%. Misclassified samples (miscl.) are represented by a dot. FIG. 16B: A classifier using six microRNAs (hsa-miR-146b-5p; hsa-miR-551b-3p; hsa-miR-222-3p; hsa-miR-125b-5p; hsa-miR-31-5p; hsa-miR-375) as the features for the classification is shown as a confusion matrix where the x-axis shows the classifier answer (Clas. Ans.), while the y-axis shows the true diagnosis (Real class=re.cl.), and presented sensitivity of 82.6% and specificity of 59.5%. FIG. 16C: A classifier using eight microRNAs (hsa-miR-146b-5p; hsa-miR-551b-3p; hsa-miR-222-3p; hsa-miR-125b-5p; hsa-miR-31-5p; hsa-miR-375; hsa-miR-152-3p; hsa-miR-181c-5p) as the features for the classification is shown as a confusion matrix where the x-axis shows the classifier answer (Clas. Ans.), while the y-axis shows the true diagnosis (Real class=re.cl.), and presented sensitivity of 90.4% and specificity of 60.3%.
FIGS. 17A-17C: An SVM classifier was used to classify samples from the Indeterminate sub-cohort as malignant (diamonds, M) or benign (squares, B) using microRNA ratios. FIG. 17A: A classifier using three microRNA ratios (hsa-miR-146b-5p:hsa-miR-342-3p; hsa-miR-31-5p:hsa-miR-342-3p; hsa-miR-125b-5p:hsa-miR-200c-3p) as the features for the classification presented sensitivity of 81.7% and specificity of 67.5%. Misclassified samples (miscl.) are represented by a dot. FIG. 17B: A classifier using six microRNA ratios (hsa-miR-146b-5p:hsa-miR-342-3p; hsa-miR-31-5p:hsa-miR-342-3p; hsa-miR-125b-5p:hsa-miR-200c-3p; hsa-miR-125b-5p:hsa-miR-138-5p; hsa-miR-222-3p:hsa-miR-486-5p; MID-16582:hsa-miR-200c-3p) as the features for the classification is shown as a confusion matrix where the x-axis shows the classifier answer (Clas. Ans.), while the y-axis shows the true diagnosis (Real class=re.cl.), and presented sensitivity of 88.7% and specificity of 63.5%. FIG. 17C: A classifier using eight microRNA ratios (hsa-miR-146b-5p:hsa-miR-342-3p; hsa-miR-31-5p:hsa-miR-342-3p; hsa-miR-125b-5p:hsa-miR-200c-3p; hsa-miR-125b-5p:hsa-miR-138-5p; hsa-miR-222-3p:hsa-miR-486-5p; MID-16582:hsa-miR-200c-3p; MID-16582:hsa-miR-138-5p; hsa-miR-200c-3p:hsa-miR-486-5p) as the features for the classification is shown as a confusion matrix where the x-axis shows the classifier answer (Clas. Ans.), while the y-axis shows the true diagnosis (Real class=re.cl.), and presented sensitivity of 87.8% and specificity of 58.7%.
FIGS. 18A-18C: An SVM classifier was used to classify samples from the Indeterminate sub-cohort as malignant (diamonds, M) or benign (squares, B) using the combination of microRNAs and microRNA ratios. FIG. 18A: A classifier using two microRNAs and one microRNA ratio (hsa-miR-146b-5p; hsa-miR-551b-3p; hsa-miR-146b-5p:hsa-miR-342-3p) as the features for the classification presented sensitivity of 80% and specificity of 71.4%. FIG. 18B: A classifier using three microRNAs and three microRNA ratios (hsa-miR-146b-5p; hsa-miR-551b-3p; hsa-miR-222-3p; hsa-miR-146b-5p:hsa-miR-342-3p; hsa-miR-31-5p:hsa-miR-342-3p; hsa-miR-125b-5p:hsa-miR-200c-3p) as the features for the classification is shown as a confusion matrix where the x-axis shows the classifier answer (Clas. Ans.), while the y-axis shows the true diagnosis (Real class=re.cl.), and presented sensitivity of 89.9% and specificity of 51.6%. FIG. 18C: A classifier using five microRNAs and three microRNA ratios (hsa-miR-551b-3p; hsa-miR-146b-5p; hsa-miR-222-3p; hsa-miR-125b-5p; hsa-miR-31-5p; hsa-miR-146b-5p:hsa-miR-342-3p; hsa-miR-31-5p:hsa-miR-342-3p; hsa-miR-125b-5p:hsa-miR-200c-3p) as the features for the classification is shown as a confusion matrix where the x-axis shows the classifier answer (Clas. Ans.), while the y-axis shows the true diagnosis (Real class=re.cl.), and presented sensitivity of 84.3% and specificity of 68.3%.
FIGS. 19A-19C: A Discriminant analysis ensemble classifier was used to classify samples from the Indeterminate sub-cohort as malignant (diamonds, M) or benign (squares, B). FIG. 19A: A classifier using two microRNAs (hsa-miR-146b-5p; hsa-miR-551b-3p) as the features for the classification presented sensitivity of 85.2% and specificity of 45.2%. The grey shaded area marks the space in which a sample is classified as malignant, as determined by the classifier. FIG. 19B: A classifier using three microRNAs (hsa-miR-551b-3p; hsa-miR-146b-5p; hsa-miR-222-3p) as the features for the classification presented sensitivity of 84.3% and specificity of 45.2%. Misclassified samples (miscl.) are represented by a dot. FIG. 19C: A classifier using eight microRNAs (hsa-miR-146b-5p; hsa-miR-551b-3p; hsa-miR-222-3p; hsa-miR-125b-5p; hsa-miR-31-5p; hsa-miR-375; hsa-miR-152-3p; hsa-miR-181c-5p) as the features for the classification is shown as a confusion matrix where the x-axis shows the classifier answer (Clas. Ans.), while the y-axis shows the true diagnosis (Real class=re.cl.), and presented sensitivity of 88.7% and specificity of 64.3%.
FIGS. 20A-20C: A Discriminant analysis ensemble classifier was used to classify samples from the Indeterminate sub-cohort as malignant (diamonds, M) or benign (squares, B). FIG. 20A: A classifier using two microRNA ratios (hsa-miR-146b-5p:hsa-miR-342-3p; hsa-miR-31-5p:hsa-miR-342-3p) as the features for the classification presented sensitivity of 86.1% and specificity of 61.1%. The grey shaded area marks the space in which a sample is classified as malignant, as determined by the classifier. FIG. 20B: A classifier using three microRNA ratios (hsa-miR-146b-5p:hsa-miR-342-3p; hsa-miR-31-5p:hsa-miR-342-3p; hsa-miR-125b-5p:hsa-miR-200c-3p) as the features for the classification presented sensitivity of 87% and specificity of 57.1%. Misclassified samples (miscl.) are represented by a dot. FIG. 20C: A classifier using eight microRNA ratios (hsa-miR-146b-5p:hsa-miR-342-3p; hsa-miR-31-5p:hsa-miR-342-3p; hsa-miR-125b-5p:hsa-miR-200c-3p; hsa-miR-125b-5p:hsa-miR-138-5p; hsa-miR-222-3p:hsa-miR-486-5p; MID-16582:hsa-miR-200c-3p; MID-16582:hsa-miR-138-5p; hsa-miR-200c-3p:hsa-miR-486-5p) as the features for the classification is shown as a confusion matrix where the x-axis shows the classifier answer (Clas. Ans.), while the y-axis shows the true diagnosis (Real class=re.cl.), and presented sensitivity of 89.6% and specificity of 65.1%.
FIGS. 21A-21C: A Discriminant analysis ensemble classifier was used to classify samples from the Indeterminate sub-cohort as malignant (diamonds, M) or benign (squares, B) using a combination of microRNAs and microRNA ratios. FIG. 21A: A classifier using one microRNA and one microRNA ratio (hsa-miR-146b-5p:hsa-miR-342-3p; hsa-miR-146b-5p) as the features for the classification presented sensitivity of 83.5% and specificity of 58.7%. The grey shaded area marks the space in which a sample is classified as malignant, as determined by the classifier. FIG. 21B: A classifier using two microRNAs and one microRNA ratio (hsa-miR-146b-5p; hsa-miR-551b-3p; hsa-miR-146b-5p:hsa-miR-342-3p) as the features for the classification presented sensitivity of 85.2% and specificity of 65.9%. Misclassified samples (miscl.) are represented by a dot. FIG. 21C: A classifier using five microRNAs and three microRNA ratios (hsa-miR-146b-5p; hsa-miR-551b-3p; hsa-miR-222-3p; hsa-miR-125b-5p; hsa-miR-31-5p; hsa-miR-146b-5p:hsa-miR-342-3p; hsa-miR-31-5p:hsa-miR-342-3p; hsa-miR-125b-5p:hsa-miR-200c-3p) as the features for the classification is shown as a confusion matrix where the x-axis shows the classifier answer (Clas. Ans.), while the y-axis shows the true diagnosis (Real class=re.cl.), and presented sensitivity of 87.8% and specificity of 62.7%.
FIGS. 22A-22C: A Discriminant analysis classifier was used to classify samples from the Bethesda IV sub-cohort as malignant (diamonds, M) or benign (squares, B). FIG. 22A: A classifier using two microRNAs (hsa-miR-125b-5p; hsa-miR-551b-3p) as the features for the classification presented sensitivity of 91.5% and specificity of 42.9%. The grey shaded area marks the space in which a sample is classified as malignant, as determined by the classifier. FIG. 22B: A classifier using three microRNAs (hsa-miR-125b-5p; hsa-miR-551b-3p; hsa-miR-222-3p) as the features for the classification presented sensitivity of 91.5% and specificity of 39.7%. Misclassified samples (miscl.) are represented by a dot. FIG. 22C: A classifier using eight microRNAs (hsa-miR-125b-5p; hsa-miR-551b-3p; hsa-miR-222-3p; hsa-miR-146b-5p; hsa-miR-375; hsa-miR-181c-5p; hsa-miR-31-5p; hsa-miR-138-5p) as the features for the classification is shown as a confusion matrix where the x-axis shows the classifier answer (Clas. Ans.), while the y-axis shows the true diagnosis (Real class=re.cl.), and presented sensitivity of 89.4% and specificity of 47.6%.
FIGS. 23A-23C: A Discriminant analysis classifier was used to classify the samples from the Bethesda IV sub-cohort as malignant (diamonds, M) or benign (squares, B). FIG. 23A: A classifier using two microRNA ratios (hsa-miR-125b-5p:hsa-miR-200c-3p; hsa-miR-146b-5p:hsa-miR-342-3p) as the features for the classification presented sensitivity of 89.4% and specificity of 28.6%. The grey shaded area marks the space in which a sample is classified as malignant, as determined by the classifier. FIG. 23B: A classifier using three microRNA ratios (hsa-miR-125b-5p:hsa-miR-200c-3p; hsa-miR-146b-5p:hsa-miR-342-3p; hsa-miR-31-5p:hsa-miR-342-3p) as the features for the classification presented sensitivity of 91.5% and specificity of 30.2%. Misclassified samples (miscl.) are represented by a dot. FIG. 23C: A classifier using eight microRNA ratios (hsa-miR-125b-5p:hsa-miR-200c-3p; hsa-miR-146b-5p:hsa-miR-342-3p; hsa-miR-31-5p:hsa-miR-342-3p; MID-16582:hsa-miR-138-5p; hsa-miR-222-3p:hsa-miR-486-5p; MID-16582:hsa-miR-200c-3p; hsa-miR-125b-5p:hsa-miR-138-5p; hsa-miR-200c-3p:hsa-miR-486-5p) as the features for the classification is shown as a confusion matrix where the x-axis shows the classifier answer (Clas. Ans.), while the y-axis shows the true diagnosis (Real class=re.cl.), and presented sensitivity of 80.9% and specificity of 57.1%.
FIG. 24A-24C : A Discriminant analysis classifier was used to classify samples from the Bethesda IV sub-cohort as malignant (diamonds, M) or benign (squares, B) using microRNAs and microRNA ratios. FIG. 24A : Classifier using one microRNA and one microRNA ratio (hsa-miR-125b-5p; hsa-miR-125b-5p:hsa-miR-200c-3p) as the features for the classification presented sensitivity of 93.6% and specificity of 33.3%. The grey shaded area marks the space in which a sample is classified as malignant, as determined by the classifier. FIG. 24B : Classifier using one microRNA and two microRNA ratios (hsa-miR-125b-5p; hsa-miR-125b-5p:hsa-miR-200c-3p; hsa-miR-146b-5p:hsa-miR-342-3p) as the features for the classification presented sensitivity of 89.4% and specificity of 41.3%. Misclassified samples (miscl.) are represented by a dot. FIG. 24C : Classifier using four microRNAs and four microRNA ratios (hsa-miR-125b-5p; hsa-miR-551b-3p; hsa-miR-222-3p; hsa-miR-146b-5p; hsa-miR-125b-5p:hsa-miR-200c-3p; hsa-miR-146b-5p:hsa-miR-342-3p; hsa-miR-31-5p:hsa-miR-342-3p; MID-16582:hsa-miR-138-5p) as the features for the classification is shown as a confusion matrix where the x-axis shows the classifier answer (Clas. Ans.), while the y-axis shows the true diagnosis (Real class=re.cl.), and presented sensitivity of 87.2% and specificity of 46%. -
FIG. 25A-25C : A KNN classifier was used to classify samples from the Bethesda IV sub-cohort as malignant or benign. The figures show a confusion matrix where the x-axis shows the classifier answer (Clas. Ans.), while the y-axis shows the true diagnosis (Real class=re.cl.). FIG. 25A : Classifier using six microRNAs (hsa-miR-125b-5p; hsa-miR-551b-3p; hsa-miR-222-3p; hsa-miR-146b-5p; hsa-miR-375; hsa-miR-181c-5p) as the features for the classification presented sensitivity of 72.3% and specificity of 39.7%. FIG. 25B : Classifier using eight microRNAs (hsa-miR-125b-5p; hsa-miR-551b-3p; hsa-miR-222-3p; hsa-miR-146b-5p; hsa-miR-375; hsa-miR-181c-5p; hsa-miR-31-5p; hsa-miR-138-5p) as the features for the classification presented sensitivity of 66% and specificity of 61.9%. FIG. 25C : Classifier using twelve microRNAs (hsa-miR-125b-5p; hsa-miR-551b-3p; hsa-miR-222-3p; hsa-miR-146b-5p; hsa-miR-375; hsa-miR-181c-5p; hsa-miR-31-5p; hsa-miR-138-5p; hsa-miR-200c-3p; MID-16582; hsa-miR-346; hsa-miR-152-3p) as the features for the classification presented sensitivity of 66% and specificity of 61.9%. -
FIG. 26A-26B : A KNN classifier was used to classify samples from the Bethesda IV sub-cohort as malignant or benign. The figures show a confusion matrix where the x-axis shows the classifier answer (Clas. Ans.), while the y-axis shows the true diagnosis (Real class=re.cl.). FIG. 26A : Classifier using six microRNA ratios (hsa-miR-125b-5p:hsa-miR-200c-3p; hsa-miR-146b-5p:hsa-miR-342-3p; hsa-miR-31-5p:hsa-miR-342-3p; MID-16582:hsa-miR-138-5p; hsa-miR-222-3p:hsa-miR-486-5p; MID-16582:hsa-miR-200c-3p) as the features for the classification presented sensitivity of 78.7% and specificity of 61.9%. FIG. 26B : Classifier using eight microRNA ratios (hsa-miR-125b-5p:hsa-miR-200c-3p; hsa-miR-146b-5p:hsa-miR-342-3p; hsa-miR-31-5p:hsa-miR-342-3p; MID-16582:hsa-miR-138-5p; hsa-miR-222-3p:hsa-miR-486-5p; MID-16582:hsa-miR-200c-3p; hsa-miR-125b-5p:hsa-miR-138-5p; hsa-miR-200c-3p:hsa-miR-486-5p) as the features for the classification presented sensitivity of 80.9% and specificity of 50.8%. -
FIG. 27A-27C : A KNN classifier was used to classify samples from the Bethesda IV sub-cohort as malignant or benign using expression of microRNAs and microRNA ratios. The figures show a confusion matrix where the x-axis shows the classifier answer (Clas. Ans.), while the y-axis shows the true diagnosis (Real class=re.cl.). FIG. 27A : Classifier using four microRNAs and two microRNA ratios (hsa-miR-551b-3p; hsa-miR-222-3p; hsa-miR-146b-5p; hsa-miR-125b-5p; hsa-miR-125b-5p:hsa-miR-200c-3p; hsa-miR-146b-5p:hsa-miR-342-3p) as the features for the classification presented sensitivity of 63.8% and specificity of 46%. FIG. 27B : Classifier using four microRNAs and four microRNA ratios (hsa-miR-125b-5p; hsa-miR-551b-3p; hsa-miR-222-3p; hsa-miR-146b-5p; hsa-miR-125b-5p:hsa-miR-200c-3p; hsa-miR-146b-5p:hsa-miR-342-3p; hsa-miR-31-5p:hsa-miR-342-3p; MID-16582:hsa-miR-138-5p) as the features for the classification presented sensitivity of 68.1% and specificity of 49.2%. FIG. 27C : Classifier using six microRNAs and six microRNA ratios (hsa-miR-146b-5p; hsa-miR-125b-5p; hsa-miR-551b-3p; hsa-miR-375; hsa-miR-222-3p; hsa-miR-181c-5p; hsa-miR-125b-5p:hsa-miR-200c-3p; hsa-miR-146b-5p:hsa-miR-342-3p; hsa-miR-31-5p:hsa-miR-342-3p; MID-16582:hsa-miR-138-5p; hsa-miR-222-3p:hsa-miR-486-5p; MID-16582:hsa-miR-200c-3p) as the features for the classification presented sensitivity of 74.5% and specificity of 58.7%. -
FIG. 28 : The normalized expression (Exp.) levels of hsa-miR-146b-5p are shown as a dot plot for Bethesda IV non-medullary malignant (“Mal.”) and for benign (“Ben.”) samples. Lines represent the median values for each group. Within each group, dots are randomly distributed along the x-axis. -
FIG. 29 : The normalized expression (Exp.) levels of the microRNA ratio hsa-miR-146b-5p:hsa-miR-342-3p are shown as a dot plot for Bethesda IV non-medullary malignant (“Mal.”) and for benign (“Ben.”) samples. Lines represent the median values for each group. Within each group, dots are randomly distributed along the x-axis. -
FIG. 30 : A Discriminant Analysis classifier was used to classify malignant (diamonds, M) from benign (squares, B) samples, wherein the malignant group included samples of medullary tumor. The normalized values of two microRNAs (hsa-miR-222-3p; hsa-miR-551b-3p) were used as features for the classification. The sensitivity of this classifier is 85.2% and the specificity is 53.6%. The grey shaded area marks the space in which a sample is classified as malignant, as determined by the classifier. -
FIG. 31 : A Discriminant Analysis classifier was used to classify malignant (diamonds, M) from benign (squares, B) samples, wherein the malignant group included samples of medullary tumor. The values of two microRNA ratios (hsa-miR-125b-5p:hsa-miR-138-5p; hsa-miR-146b-5p:hsa-miR-342-3p) were used as the features for the classification. The sensitivity of this classifier is 84.7% and the specificity is 80.8%. The grey shaded area marks the space in which a sample is classified as malignant, as determined by the classifier. -
FIG. 32 : Expression pattern of hsa-miR-486-5p and hsa-miR-200c-3p is determinant for the quality of the sample. Four samples of blood smears (BS) were analyzed for the expression of hsa-miR-486-5p (SEQ ID NO: 22) and hsa-miR-200c-3p (SEQ ID NO: 23 or 24) in comparison with their expression in malignant (M) and benign (B) thyroid samples. Normalized values for the two miRs are shown (normalized using all normalizers). -
FIG. 33 : Sub-typing of Benign Thyroid Tumors. microRNA expression profile (median) was established for two sub-types of benign tumors, Follicular Adenoma (FA, y axis, n=81) and Hashimoto (Hash., x axis, n=6). Each cross represents a microRNA or a microRNA ratio. The ratio hsa-miR-125b-5p:hsa-miR-200c-3p correlated to FA, while expression of hsa-miR-342-3p and hsa-miR-31-5p correlated with Hashimoto. Diamonds represent any one of the microRNAs of SEQ ID NOs. 26-37. Significant microRNAs (p-value for t-test<0.05) are represented by circles. -
FIG. 34 : Sub-typing of Malignant Thyroid Tumors. microRNA expression profile was established for two sub-types of malignant thyroid tumors, papillary carcinoma (Pap.; y-axis, n=161) and follicular carcinoma (FC; x-axis, n=16). Each cross represents a microRNA or a microRNA ratio. Diamonds are any one of the microRNAs of SEQ ID NOs. 26-37. Significant microRNAs (p-value for t-test<0.05) are encircled. Only normalized microRNA values are labeled. Unlabeled circles represent significant ratios. -
FIG. 35 : Flowchart representing the protocol for diagnosis of indeterminate thyroid nodule samples obtained through FNA. -
FIG. 36 : Medullary carcinoma, Linear Discriminant Analysis step based on the expression of hsa-miR-375. All of the training medullary carcinoma stained smear samples and two of the three medullary validation samples demonstrate over-expression of hsa-miR-375 (x-axis shows hsa-miR-375 expression; y-axis shows hsa-miR-146b expression). Diamonds: malignant non-medullary training samples; squares, benign training samples; circles, medullary carcinoma training samples; stars, medullary carcinoma validation samples. -
FIG. 37 : Negative Predictive Value (NPV) and Positive Predictive Value (PPV) for varying prevalence values. NPV and PPV were calculated, based on the sensitivity and specificity of the entire validation set (85.2% and 71.9%, respectively; dashed lines) and the validation agreement set (97.5% and 78.2%, respectively; solid lines), for varying prevalence (Prev.) values. Calculated NPV: lines start at 1 on the left-hand side. Calculated PPV: lines start at 1 on the right-hand side. Dotted, thick line: NPV=95%.
- Despite accumulated efforts in the search for accurate diagnosis of thyroid lesions, a great number of technical problems remain with no solution in sight. As a result of the quality of the material obtained, the diagnosis of thyroid lesions in fine needle aspiration (FNA) samples is still challenging. The low number of cells, the amount of blood, and the ratio between thyroid tumor cells and non-thyroid tumor cells in the sample make it challenging to extract enough material that will provide conclusive results.
- The present inventors have developed a first-of-its-kind assay utilizing microRNA expression in FNA smears for distinguishing benign from malignant thyroid nodules. This assay offers a valuable tool for the classification of pre-operative thyroid samples, including those that are presently indeterminate to cytological evaluation.
- The present invention provides a sensitive, specific and accurate methodology for distinguishing between malignant and benign thyroid tumors, as well as particular subtypes of thyroid tumors. Distinguishing between different subtypes of thyroid tumors is essential for providing the patient with the best and most suitable treatment. The present invention provides a significant improvement of the technologies currently available in the field of thyroid tumor classification and diagnosis.
- The present inventors have developed an integrative platform for the classification of thyroid lesions, by profiling and characterizing microRNA expression in thyroid clinical samples obtained by FNA biopsies, while also overcoming hindrances such as the low number of cells and the amount of blood in the sample. This technological platform was applied to stratify thyroid lesions into benign or malignant neoplasms, as well as subtypes of thyroid tumors, as an adjunctive tool in the pre-operative management of thyroid nodules. The inventors have developed a method for classification of benign and malignant thyroid lesions, and specific subtypes of thyroid cancer and follicular lesions, while integrating steps for filtering out sub-optimal samples, by implementing specific algorithms based on microRNA profiling. The method is part of an overall protocol, in which existing or available clinical cytological slides having smears from FNA samples may be used, without the need to generate or collect additional material from the patients.
- The present method further incorporates the analysis of microRNAs in minute amounts of RNA material from cytological samples. Once an FNA sample is collected, between one and several passes of material are smeared onto slides. Currently available methods usually require several passes to obtain enough material for analysis. The present inventors developed a method in which even a single FNA slide provides sufficient material for microRNA detection. In addition, the method developed by the inventors allows for the analysis of samples having very small amounts of cells, such as samples having 50 cells, up to 120 cells and over.
- The present method includes steps for eliminating or disqualifying samples that lack thyroid cells and/or in which non-thyroid cells, such as blood cells, are over-represented.
- The assay developed by the present inventors accurately differentiates benign from malignant thyroid nodules in indeterminate FNA smears. The assay is advantageous over previous diagnostic methods for thyroid nodules, particularly in its performance for cases in which all pathologists are in full agreement, showing a sensitivity of 98% (95% confidence interval [CI], 87-100%); a specificity of 78% (CI, 69-85%); a Negative Predictive Value (NPV) of 99% (CI, 94-100%); and a Positive Predictive Value (PPV) of 62% (CI, 49-74%). Performance for the overall validation sample set was a sensitivity of 85% (CI, 74-93%); a specificity of 72% (CI, 63-79%); an NPV of 91% (CI, 84-96%); and a PPV of 59% (CI, 48-69%).
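- The dependence of NPV and PPV on prevalence described for FIG. 37 follows from Bayes' rule. The following is a minimal sketch, not the inventors' own calculation; the function name `predictive_values` and the prevalence values in the loop are illustrative assumptions, while the sensitivity and specificity figures are those reported for the validation agreement set.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (ppv, npv) for a test applied at a given disease prevalence."""
    if not 0.0 < prevalence < 1.0:
        raise ValueError("prevalence must be strictly between 0 and 1")
    # Expected fractions of the tested population in each outcome cell.
    true_pos = sensitivity * prevalence
    false_neg = (1.0 - sensitivity) * prevalence
    true_neg = specificity * (1.0 - prevalence)
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Illustrative prevalence values (assumed, not taken from the study).
for prev in (0.10, 0.25, 0.40):
    ppv, npv = predictive_values(0.975, 0.782, prev)
    print(f"prevalence={prev:.2f}  PPV={ppv:.3f}  NPV={npv:.3f}")
```

As prevalence rises, the NPV falls and the PPV rises, which is why the two predictive values are reported together with the prevalence at which they were evaluated.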
- Thus the present inventors have identified a unique microRNA expression signature for thyroid lesions through profiling the expression of the microRNAs denoted by SEQ ID NOs.1-308. A microRNA expression signature for classification of thyroid lesions is also obtained through profiling the expression of at least ten of the microRNAs denoted by SEQ ID NOs.1-37. Alternatively, a microRNA expression signature for classification of thyroid lesions is obtained through profiling the expression of at least seven of the microRNAs denoted by SEQ ID NOs.1-37.
- More specifically, the present inventors have developed a platform for classification of thyroid clinical samples based on the levels of expression of a set of microRNAs, comprising at least two microRNAs, selected from the group consisting of hsa-miR-31-5p (SEQ ID NO: 5, 6, or 7), hsa-miR-424-3p (SEQ ID NO: 16), hsa-miR-222-3p (SEQ ID NO: 1 or 2), hsa-miR-146b-5p (SEQ ID NO: 10 or 11), hsa-miR-346 (SEQ ID NO: 14), MID-16582 (SEQ ID NO: 25), hsa-miR-342-3p (SEQ ID NO: 17 or 18), hsa-miR-181c-5p (SEQ ID NO: 15), hsa-miR-125b-5p (SEQ ID NO: 9), hsa-miR-375 (SEQ ID NO: 8), hsa-miR-486-5p (SEQ ID NO: 22), hsa-miR-551b-3p (SEQ ID NO: 3 or 4), hsa-miR-152-3p (SEQ ID NO: 12 or 13), hsa-miR-200c-3p (SEQ ID NO: 23 or 24) and hsa-miR-138-5p (SEQ ID NO: 19, 20, or 21); or a sequence at least 80%, at least 85%, or at least 90% identical thereto. The platform was established based on a training study with a robust cohort, which also included the optional measurement of additional microRNAs that served as normalizers.
- Variations of said platform for classification of thyroid clinical samples, particularly clinical samples classified as Bethesda III, IV and/or V, include determining the levels of expression of at least ten (10) microRNAs comprised in the group consisting of hsa-miR-375 (SEQ ID NO: 8), hsa-miR-146b-5p (SEQ ID NO: 10 or 11), hsa-miR-342-3p (SEQ ID NO: 17 or 18), hsa-miR-551b-3p (SEQ ID NO: 3 or 4), hsa-miR-31-5p (SEQ ID NO: 5, 6, or 7), hsa-miR-125b-5p (SEQ ID NO: 9), hsa-miR-138-5p (SEQ ID NO: 19, 20, or 21), hsa-miR-152-3p (SEQ ID NO: 12 or 13), hsa-miR-574-3p (SEQ ID NO: 36 or 37), hsa-miR-222-3p (SEQ ID NO: 1 or 2), hsa-miR-23a-3p (SEQ ID NO: 26), MID-16582 (SEQ ID NO: 25), and hsa-miR-486-5p (SEQ ID NO: 22); or a sequence at least 80%, at least 85%, at least 90%, or at least 95% identical thereto.
- Alternatively, the classification of thyroid clinical samples, particularly FNA samples classified as Bethesda III, IV and/or V, includes determining the levels of expression of eleven microRNAs comprised in the group consisting of hsa-miR-375 (SEQ ID NO: 8), hsa-miR-146b-5p (SEQ ID NO: 10 or 11), hsa-miR-342-3p (SEQ ID NO: 17 or 18), hsa-miR-551b-3p (SEQ ID NO: 3 or 4), hsa-miR-31-5p (SEQ ID NO: 5 or 7), hsa-miR-125b-5p (SEQ ID NO: 9), hsa-miR-138-5p (SEQ ID NO: 19, 20, or 21), hsa-miR-152-3p (SEQ ID NO: 12 or 13), hsa-miR-574-3p (SEQ ID NO: 36 or 37), hsa-miR-222-3p (SEQ ID NO: 1 or 2), hsa-miR-23a-3p (SEQ ID NO: 26), MID-16582 (SEQ ID NO: 25), and hsa-miR-486-5p (SEQ ID NO: 22); or a sequence at least 80%, at least 85%, at least 90%, or at least 95% identical thereto. Further to determining the level of expression of at least ten (10) microRNAs as denoted above, the method may optionally comprise determining the level of expression of two additional microRNAs as denoted above.
- The present invention is particularly useful for the 25% of the cases in which FNA specimens present inconclusive results in cytopathology, usually referred to as “indeterminate”, and which include thyroid lesion samples classified in Bethesda categories III, IV and V. In current medical practice, patients with specimens falling within this category undergo repeat FNA procedure, or surgery, including lobectomy and thyroidectomy.
- Thus, in one embodiment, the present invention provides a method of classification for thyroid lesion samples that fall into the “indeterminate” cases, classified in categories III, IV and V of the Bethesda System (described further herein). In one particular embodiment, the present invention provides a method of classification for thyroid lesion samples classified in category IV of the Bethesda System, which relates to “Follicular Neoplasm” or “Suspicious for a Follicular Neoplasm”, and which is known to be the most difficult category to classify.
- Thus, the present invention presents primarily a protocol for management of thyroid lesion samples which failed to be classified by cytopathological analysis. Particular samples that are of interest are those obtained by FNA. In one embodiment, routine smears from FNA samples are used. In another embodiment, FNA samples in preservative solutions may be used. Total RNA is extracted from the FNA samples, and the expression of microRNAs is measured. In one embodiment, the expression of about 2200 microRNAs is measured. In another embodiment, the expression of 182 microRNAs, comprising the sequences of SEQ ID NO: 1-182 is measured. In a further embodiment, the expression of the microRNAs comprising the sequences of SEQ ID NO: 1-37 is measured. In another further embodiment, at least three, at least four, at least five, at least six, at least seven, at least eight, at least nine, at least ten, at least eleven, at least twelve, at least thirteen, at least fourteen, or all microRNAs from the group selected from hsa-miR-31-5p (SEQ ID NO: 5, 6, or 7), hsa-miR-424-3p (SEQ ID NO: 16), hsa-miR-222-3p (SEQ ID NO: 1 or 2), hsa-miR-146b-5p (SEQ ID NO: 10 or 11), hsa-miR-346 (SEQ ID NO: 14), MID-16582 (SEQ ID NO: 25), hsa-miR-342-3p (SEQ ID NO: 17 or 18), hsa-miR-181c-5p (SEQ ID NO: 15), hsa-miR-125b-5p (SEQ ID NO: 9), hsa-miR-375 (SEQ ID NO: 8), hsa-miR-486-5p (SEQ ID NO: 22), hsa-miR-551b-3p (SEQ ID NO: 3 or 4), hsa-miR-152-3p (SEQ ID NO: 12 or 13), hsa-miR-200c-3p (SEQ ID NO: 23 or 24), hsa-miR-138-5p (SEQ ID NO: 19, 20, or 21), hsa-miR-23a-3p (SEQ ID NO: 26), and hsa-miR-574-3p (SEQ ID NO: 36 or 37), or a sequence at least 80%, at least 85%, or at least 90% identical thereto, are measured and used in the classification.
- In a further embodiment, classification of the thyroid sample as malignant or benign comprises measuring the expression levels of hsa-miR-222-3p (SEQ ID NO: 1 or 2), hsa-miR-551b-3p (SEQ ID NO: 3 or 4), hsa-miR-31-5p (SEQ ID NO: 5, 6, or 7), hsa-miR-375 (SEQ ID NO: 8), hsa-miR-125b-5p (SEQ ID NO: 9), hsa-miR-146b-5p (SEQ ID NO: 10 or 11), hsa-miR-152-3p (SEQ ID NO: 12 or 13), hsa-miR-346 (SEQ ID NO: 14), hsa-miR-181c-5p (SEQ ID NO: 15), hsa-miR-424-3p (SEQ ID NO: 16), hsa-miR-342-3p (SEQ ID NO: 17 or 18), hsa-miR-138-5p (SEQ ID NO: 19, 20, or 21), hsa-miR-486-5p (SEQ ID NO: 22), hsa-miR-200c-3p (SEQ ID NO: 23 or 24), MID-16582 (SEQ ID NO: 25), or any combination thereof, or a sequence at least 80%, at least 85%, or at least 90% identical thereto, providing the levels of expression to a classifier which analyzes and classifies the sample as malignant or benign.
- Thus, the present invention provides a method for distinguishing between malignant and benign thyroid tumor lesions in a subject in need, said method comprising obtaining a thyroid tumor lesion sample from said subject, or providing a biological sample obtained from said subject, determining an expression profile in said sample of one or more, or at least four, at least eight, at least eleven, or at least thirteen microRNAs comprising SEQ ID NOS: 1-37, or a sequence at least 80%, at least 85%, at least 90%, or at least 95% identical thereto, or any combination of said microRNAs, by hybridization or by amplification, comparing said expression profile to a reference threshold value by using a classifier algorithm; and determining whether the thyroid lesion is malignant or benign. In one particular embodiment, the method of the invention is for distinguishing sub-types of malignant or benign thyroid tumor lesions.
- In one embodiment, the method of the invention comprises measuring the expression of at least four of the microRNAs comprising SEQ ID NOS: 1-37, obtaining the microRNA expression profile value of said sample, and using a classifier to establish, based on said value, whether the thyroid lesion is malignant or benign, and optionally further classifying the sample into one of the malignant or benign subtypes.
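- As an illustration of how a classifier may turn an expression profile into a malignant or benign call, and how sensitivity and specificity are then derived, the following is a minimal k-nearest-neighbour sketch. It is not the classifier trained by the inventors: the function names, the choice of k, the Euclidean distance, and the toy two-feature data are all assumptions made for illustration.

```python
import math
from collections import Counter


def knn_classify(train, sample, k=3):
    """Label `sample` by majority vote among its k nearest training samples.

    `train` is a list of (feature_vector, label) pairs with labels "M"
    (malignant) or "B" (benign). Use an odd k to avoid tied votes.
    """
    neighbours = sorted(train, key=lambda item: math.dist(item[0], sample))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


def sensitivity_specificity(true_labels, predicted_labels):
    """Compute (sensitivity, specificity), treating "M" as the positive class."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(t == "M" and p == "M" for t, p in pairs)
    fn = sum(t == "M" and p == "B" for t, p in pairs)
    tn = sum(t == "B" and p == "B" for t, p in pairs)
    fp = sum(t == "B" and p == "M" for t, p in pairs)
    return tp / (tp + fn), tn / (tn + fp)


# Toy training data: two hypothetical normalized features per sample.
train = [
    ((1.0, 1.2), "B"), ((0.8, 1.0), "B"), ((1.1, 0.9), "B"),
    ((3.0, 3.1), "M"), ((2.9, 3.3), "M"), ((3.2, 2.8), "M"),
]
print(knn_classify(train, (3.0, 3.0)))
```

The four counts in `sensitivity_specificity` correspond to the cells of a confusion matrix of true diagnosis against classifier answer.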
- In one particular embodiment, said determining an expression profile by hybridization comprises contacting the sample with probes that hybridize to each of SEQ ID NOS: 1-37, or to a sequence at least 80%, at least 85%, or at least 90% identical thereto. In another embodiment, said determining an expression profile by hybridization comprises contacting the sample with probes that hybridize with at least eight, at least ten, at least twelve, at least fourteen, or at least sixteen contiguous nucleotides of said microRNA comprising SEQ ID NOS: 1-37.
- The present invention further provides a method of classifying a sample as malignant or benign, and/or sub-typing said sample, whereby, further to measuring the expression levels of microRNAs in the sample, obtaining an expression profile and optionally calculating microRNA ratios, a multi-step analysis of the expression data is applied. Said multi-step analysis comprises applying one or more algorithms, in parallel or sequentially, to at least one of the microRNA expression profiles, microRNA ratios, or a combination thereof. Said multi-step analysis may also further include analyzing the expression of one or more single microRNA levels which may be indicative of the overall quality of the sample.
- Examples of criteria that may be included in the multi-step analysis, in any order and in any combination, are: the expression of non-malignant cell markers, the expression of microRNAs that correlate with a specific sub-type of thyroid tumor, and the like. Thus, for example, one step may be examining whether the expression of non-thyroid cell markers is higher or lower than the threshold established in the data set, e.g. the training data set, in which case the sample may be disqualified. Another further step may be examining the expression of a microRNA or microRNA ratio that correlates with a thyroid tumor sub-type, e.g. if the expression of hsa-miR-342-3p (SEQ ID NO: 17 or 18) is very high compared to the threshold established in the data set, e.g. the training data set, the sample may be classified as benign, and further sub-typed as being Hashimoto. Alternatively, if the expression of hsa-miR-342-3p (SEQ ID NO: 17 or 18) is very high compared to the threshold established in the data set, e.g. the training data set, the sample may be disqualified for lack of sufficient thyroid cells. Another further optional step may relate to the level of expression of MID-16582 (SEQ ID NO: 25), which may be used to determine whether the sample may be discarded, or analyzed using a classifier specific for samples in which MID-16582 (SEQ ID NO: 25) is high (compared to the threshold established in the training set).
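- One possible ordering of such steps can be sketched as follows. This is an illustrative sketch only: the function name, the threshold keys, the direction of the blood-marker comparison, and the choice to sub-type rather than disqualify on high hsa-miR-342-3p are assumptions, and real thresholds would be established on a training data set.

```python
def multi_step_call(expr, thresholds, classify, classify_high_mid):
    """Return a call for one sample from its normalized expression values.

    expr: dict mapping microRNA name -> normalized expression level.
    thresholds: dict of cut-offs (hypothetical keys, assumed to come from
        a training set).
    classify, classify_high_mid: callables taking `expr` and returning
        "malignant" or "benign"; the second is used for high MID-16582.
    """
    # Step 1: quality check on a blood cell marker (assumed direction:
    # too much signal means blood cells are over-represented).
    if expr["hsa-miR-486-5p"] > thresholds["blood_max"]:
        return "disqualified"
    # Step 2: very high hsa-miR-342-3p, here taken as benign (Hashimoto).
    if expr["hsa-miR-342-3p"] > thresholds["miR-342_high"]:
        return "benign (Hashimoto)"
    # Step 3: samples with high MID-16582 go to a dedicated classifier.
    if expr["MID-16582"] > thresholds["MID-16582_high"]:
        return classify_high_mid(expr)
    # Step 4: all remaining samples go to the main classifier.
    return classify(expr)
```

Because the text allows the steps in any order and combination, the sequence above is only one arrangement; the alternative of disqualifying a sample on high hsa-miR-342-3p would replace the return value in step 2.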
- In one particular embodiment of the invention, said non-thyroid cell marker is a blood cell marker.
- In another particular embodiment of the invention, said cell marker is an epithelial cell marker.
- In a further particular embodiment of the invention, said cell marker is a blood cell marker, a white blood cell marker or an epithelial cell marker. Examples of blood cell markers are hsa-miR-486-5p (SEQ ID NO: 22), hsa-miR-320a (SEQ ID NO: 173), hsa-miR-106a-5p (SEQ ID NO: 150), hsa-miR-93-5p (SEQ ID NO: 182), hsa-miR-17-3p (SEQ ID NO: 160), hsa-let-7d-5p (SEQ ID NO: 144), hsa-miR-107 (SEQ ID NO: 152), hsa-miR-103a-3p (SEQ ID NO: 149), hsa-miR-17-5p (SEQ ID NO: 161), hsa-miR-191-5p (SEQ ID NO: 163), hsa-miR-25-3p (SEQ ID NO: 167), hsa-miR-106b-5p (SEQ ID NO: 151), hsa-miR-20a-5p (SEQ ID NO: 166), hsa-miR-18a-5p (SEQ ID NO: 40), hsa-miR-144-3p (SEQ ID NO: 154), hsa-miR-140-3p (SEQ ID NO: 51), hsa-miR-15b-5p (SEQ ID NO: 157), hsa-miR-16-5p (SEQ ID NO: 159), hsa-miR-92a-3p (SEQ ID NO: 181), hsa-miR-484 (SEQ ID NO: 179), hsa-miR-151a-5p (SEQ ID NO: 156), hsa-let-7f-5p (SEQ ID NO: 145), hsa-let-7a-5p (SEQ ID NO: 141), hsa-let-7c-5p (SEQ ID NO: 143), hsa-let-7b-5p (SEQ ID NO: 142), hsa-let-7g-5p (SEQ ID NO: 146), hsa-let-7i-5p (SEQ ID NO: 147), hsa-miR-185-5p (SEQ ID NO: 162), hsa-miR-30d-5p (SEQ ID NO: 172), hsa-miR-30b-5p (SEQ ID NO: 170), hsa-miR-30c-5p (SEQ ID NO: 171), hsa-miR-19b-3p, hsa-miR-26a-5p (SEQ ID NO: 168), hsa-miR-26b-5p (SEQ ID NO: 169), hsa-miR-425-5p (SEQ ID NO: 176), MID-19433 (SEQ ID NO: 133), and hsa-miR-4306 (SEQ ID NO: 177). Examples of white blood cell markers are hsa-miR-342-3p (SEQ ID NO: 17 or 18), hsa-miR-146a-5p and hsa-miR-150-5p (SEQ ID NO: 59). Examples of epithelial markers are hsa-miR-200c-3p (SEQ ID NO: 23 or 24), hsa-miR-138-5p (SEQ ID NO: 19, 20, or 21), hsa-miR-3648 (SEQ ID NO: 174), hsa-miR-125b-5p (SEQ ID NO: 9), hsa-miR-125a-5p (SEQ ID NO: 153), hsa-miR-192-3p (SEQ ID NO: 164), hsa-miR-4324 (SEQ ID NO: 178), hsa-miR-376a-3p (SEQ ID NO: 175).
- Thus, quality assessment (QA) or quality control (QC) of the sample may be one of the steps of the multi-step analysis of the expression data, or one of the steps of the method of the invention. In the context of the method of the invention, QC relates to the expression profile of microRNAs of any one of SEQ ID NOS. 1-37, or to the ratio of the expression level of at least one pair of microRNAs of any one of SEQ ID NOS. 1-37. Therefore, the expression profile or the ratio of the expression levels of microRNAs are optionally used as indicators of sample quality.
- As referred to herein, said microRNA ratio is the ratio between the normalized or non-normalized expression level of a pair of microRNAs, wherein the expression level of one microRNA is used as the numerator and the expression level of a second microRNA is the denominator.
- Examples of pairs of microRNAs whose expression level ratios are used in the classifier are hsa-miR-125b-5p:hsa-miR-138-5p; hsa-miR-146b-5p:hsa-miR-342-3p; hsa-miR-125b-5p:hsa-miR-200c-3p; hsa-miR-375:hsa-miR-146b; hsa-miR-551b-3p:hsa-miR-23a-3p; hsa-miR-31-5p:hsa-miR-342-3p; hsa-miR-222-3p:hsa-miR-486-5p; hsa-miR-200c-3p:hsa-miR-486-5p; MID-16582:hsa-miR-200c-3p; MID-16582:hsa-miR-138-5p; hsa-miR-222-3p:MID-16582; and the like.
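- The ratio features can be computed directly from a table of expression levels. The sketch below assumes linear-scale expression values, so that the ratio is a plain quotient (for log-scale signals the equivalent operation would be a difference); the function name `microrna_ratios` and the example values are illustrative, and only three of the pairs listed above are included.

```python
# Numerator/denominator pairs taken from the examples listed above.
RATIO_PAIRS = [
    ("hsa-miR-125b-5p", "hsa-miR-138-5p"),
    ("hsa-miR-146b-5p", "hsa-miR-342-3p"),
    ("hsa-miR-125b-5p", "hsa-miR-200c-3p"),
]


def microrna_ratios(expr, pairs=RATIO_PAIRS):
    """Return {"numerator:denominator": ratio} for each measurable pair.

    Pairs with a missing microRNA or a zero denominator are skipped.
    """
    ratios = {}
    for numerator, denominator in pairs:
        if numerator in expr and denominator in expr and expr[denominator] != 0:
            ratios[f"{numerator}:{denominator}"] = expr[numerator] / expr[denominator]
    return ratios


# Hypothetical linear-scale expression levels for one sample.
sample = {
    "hsa-miR-125b-5p": 8.0,
    "hsa-miR-138-5p": 2.0,
    "hsa-miR-146b-5p": 3.0,
    "hsa-miR-342-3p": 6.0,
}
print(microrna_ratios(sample))
```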
- In another particular embodiment, said determining an expression profile comprises contacting the sample with RT-PCR reagents, including forward and reverse primers as exemplified herein in the Examples, and generating RT-PCR products.
- In a further particular embodiment, said method comprises contacting RT-PCR products with specific or general probes, or a combination thereof, as exemplified herein in the Examples, detecting and measuring the PCR products.
- In a further embodiment, said determining an expression profile comprises measuring microRNA expression by hybridization, using microarrays and the like. In another further embodiment, said determining an expression profile comprises measuring microRNA expression by next-generation sequencing.
- The method of the invention further comprises optionally determining the expression profile of at least one microRNA to be used as normalizer. In this context, any microRNA as described in Table 1 may be used as a normalizer. In one particular embodiment, any of the microRNAs comprising SEQ ID NO: 26-37, or a sequence at least 80%, 85%, 90%, or 95% identical thereto, are used as normalizers.
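- One common way of applying normalizers, shown here only as a hedged sketch, is to subtract the mean normalizer signal from each remaining microRNA signal on a log scale. The specification does not state the normalization formula at this point, so the approach, the function name `normalize`, and the example values are assumptions; the two normalizer names are microRNAs that the text associates with SEQ ID NO: 26-37.

```python
from statistics import mean

# Two microRNAs from the SEQ ID NO: 26-37 group, used here as normalizers.
NORMALIZERS = ("hsa-miR-23a-3p", "hsa-miR-574-3p")


def normalize(expr, normalizers=NORMALIZERS):
    """Subtract the mean normalizer signal from every other microRNA signal.

    Assumes `expr` holds log-scale signals, so that subtraction corresponds
    to division on the linear scale. Normalizers are dropped from the output.
    """
    measured = [expr[name] for name in normalizers if name in expr]
    if not measured:
        raise ValueError("no normalizer microRNA measured in this sample")
    offset = mean(measured)
    return {name: value - offset
            for name, value in expr.items() if name not in normalizers}


# Hypothetical log-scale signals for one sample.
raw = {"hsa-miR-375": 10.0, "hsa-miR-23a-3p": 6.0, "hsa-miR-574-3p": 4.0}
print(normalize(raw))
```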
- The present inventors have surprisingly found that the classification of a thyroid tumor sample is improved when a number of markers, from different categories as defined and exemplified herein are used. Said markers may be any one of malignant markers, secondary markers and cell-type markers, or any combination thereof, comprising SEQ ID NOS: 1-25, or a sequence at least 80%, 85%, 90%, or 95% identical thereto. In order to perform the method of the invention, the full set of markers may be used. Alternatively, any combination of malignant, secondary and cell-type markers may be used. Thus, the method may comprise at least one malignant marker, in association with at least one secondary marker and/or at least one cell-type marker.
- Depending on the analysis of the data, each of the cell type markers may be used in the form of raw or normalized signals. Alternatively, the cell type markers may be used as a preliminary test prior to performing the classification, in order to determine whether the sample has sufficient relevant material to perform classification, or whether the sample should be discarded. Yet another option is to use the cell-type markers as part of the final classifier, where the signal of the cell type marker is used by the classifier. A further option is to use the cell-type markers as the denominator of a miR ratio optionally used by the classifier. For example, the expression level of a malignant or a secondary marker may be divided by the expression level of a cell-type marker, and the resulting miR ratio used in the classifier.
- Thus, in a further embodiment of the method for distinguishing between malignant and benign thyroid tumor lesions in a subject in need, said classifier may be any one of a single classifier, a multi-step classifier, a classifier which uses all the malignant markers, a classifier which uses a subset of the malignant markers, a classifier which uses all the malignant markers and the secondary markers, a classifier which uses a subset of the malignant markers and a subset of the secondary markers, a classifier which uses all the malignant markers and the secondary markers and the cell type markers, a classifier which employs a subset of all the malignant markers and the secondary markers and the cell type markers, a classifier which uses all or a subset of the malignant markers and all or a subset of the cell type markers.
- In another further embodiment of the method or the protocol of the invention, the performance of the classification may be improved by further combining the result from the algorithm classifier with additional clinical or molecular data available for the thyroid sample being analyzed. Additional data available may be related to the thyroid lesion, such as the size of the nodule, the number of nodules; it may relate to other clinical information available for the subject from whom the sample was obtained, such as molecular test results, like the expression of other molecular markers, genetic markers, biochemical test results, blood test results, urine test results, recurrence, prognosis data, family history, patient medical history, and the like. Other data that may also be combined is thyroid genetic data, such as mutation analysis, gene fusions, chromosomal rearrangements, gene expression, protein expression, and the like.
- Therapeutic indications may vary according to the diagnosis obtained with the method or protocol of the invention. Typically there are five types of therapy that may be administered to a thyroid cancer patient: surgery, radiation therapy, chemotherapy, thyroid hormone therapy and targeted therapy.
- Surgery is the most common treatment of thyroid cancer. One of the following procedures may be used:
- Lobectomy: Removal of the lobe in which thyroid cancer is found. Biopsies of lymph nodes in the area may be done to see if they contain cancer.
- Near-total thyroidectomy: Removal of all but a very small part of the thyroid.
- Total thyroidectomy: Removal of the whole thyroid.
- Lymphadenectomy: Removal of lymph nodes in the neck that contain cancer.
- Thyroidectomy is a surgical procedure that has several potential complications or sequelae, including: temporary or permanent change in voice, temporarily or permanently low calcium, need for lifelong thyroid hormone replacement, bleeding, infection, and the remote possibility of airway obstruction due to bilateral vocal cord paralysis. Therefore, an accurate diagnosis that would prevent the unnecessary removal of the thyroid gland is very desirable.
- Radiation therapy uses high-energy x-rays or other types of radiation to eliminate cancer cells or inhibit their proliferation. There are two types of radiation therapy. External radiation therapy uses a machine outside the body to send radiation toward the cancer. Internal radiation therapy uses a radioactive substance sealed in needles, seeds, wires, or catheters that are placed directly into or near the cancer. The radiation therapy of choice will be dependent on the type and stage of the thyroid cancer. Radiation therapy may be supplementary to surgery in order to eliminate cancer cells that were not successfully removed. Follicular and papillary thyroid cancers may be treated with radioactive iodine (RAI) therapy. RAI is administered orally and collects in any remaining thyroid tissue, including thyroid cancer cells that have spread to other places in the body. Since only thyroid tissue takes up iodine, the RAI destroys thyroid tissue and thyroid cancer cells without harming other tissues. Before a full treatment dose of RAI is given, a small test-dose is given to see if the tumor takes up the iodine.
- Chemotherapy is another option for thyroid cancer treatment. Chemotherapy may be administered orally or by injection, intravenous or intramuscular. Chemotherapy may also be administered directly into the cancer affected area instead of systemically. The choice of administration will depend on the type and stage of the cancer. A few examples of drugs that have been approved for thyroid cancer treatment are: Adriamycin PFS (Doxorubicin Hydrochloride), Adriamycin RDF (Doxorubicin Hydrochloride), Cabozantinib-S-Malate, Caprelsa (Vandetanib), Cometriq (Cabozantinib-S-Malate), Doxorubicin Hydrochloride, Nexavar (Sorafenib Tosylate), Sorafenib Tosylate and Vandetanib.
- Thyroid hormone therapy is a cancer treatment that removes hormones or blocks their action and inhibits cancer cell proliferation. In the treatment of thyroid cancer, drugs may be given to prevent thyroid-stimulating hormone (TSH) production, so that the hormone does not induce the growth or recurrence of the thyroid cancer.
- Also, because thyroid cancer treatment specifically targets thyroid cells, the thyroid is not able to make enough thyroid hormone. Patients are given thyroid hormone replacement pills.
- Targeted therapy uses drugs or other substances to identify and attack specific cancer cells without harming normal cells. Tyrosine kinase inhibitor (TKI) therapy blocks signal transduction in thyroid cancer cells, inhibiting their growth. Vandetanib is a TKI used to treat thyroid cancer.
- Dosage and duration of any therapy will depend on individual evaluation of the patient and on standard practice known by the health care provider. The duration of treatment is the period of time during which doses of a pharmaceutical agent or pharmaceutical composition are administered.
- The identification and differentiation of the thyroid tumor, firstly as benign or malignant, and subsequently its classification into the various subtypes through the analysis of differentially expressed microRNAs can provide further clues to the biological differences between the subtypes, their diverging oncogenetic processes and possible new targets for type-specific targeted therapy.
- The present invention provides diagnostic assays and methods, both quantitative and qualitative, for detecting, diagnosing, monitoring, staging and prognosticating thyroid cancers by comparing levels of the specific microRNA molecules as described herein. Such levels are measured in a patient sample, which may be from a biopsy, tumor samples, cells, tissues and/or bodily fluids.
- Thus, the method of the invention is particularly useful for discriminating between different subtypes of malignant thyroid tumors, such types being follicular carcinoma, papillary carcinoma, follicular variant of papillary carcinoma (FVPC or FVPTC), encapsulated FVPC (or encapsulated FVPTC), medullary carcinoma, anaplastic thyroid cancer, poorly differentiated thyroid cancer, and for determining the therapeutic course to be followed after diagnosis. In a further embodiment, the present invention provides a method for classifying sub-types of benign thyroid tumor, e.g. follicular adenoma, Hashimoto thyroiditis, hyperplasia (Goiter).
- The present invention also provides a method of treatment of thyroid cancer, said method comprising the method of distinguishing between benign or malignant thyroid tumor as described herein, optionally subtyping the thyroid tumor type, and administering the treatment according to the diagnosis provided by the present method.
- All the methods of the present invention may optionally further include measuring levels of other cancer markers. Other cancer markers, in addition to said microRNA molecules useful in the present invention, will depend on the cancer being tested and are known to those of skill in the art.
- Assay techniques that can be used to determine levels of gene expression, such as the nucleic acid sequence of the present invention, in a sample derived from a patient are well known to those of skill in the art. Such assay methods include, but are not limited to, reverse transcriptase PCR (RT-PCR) assays, in situ hybridization assays, competitive-binding assays, Northern blot analyses, nucleic acid microarrays and biochip analysis.
- An arbitrary threshold on the expression level of one or more nucleic acid sequences can be set for assigning a sample or tumor sample to one of two groups. Alternatively, in a preferred embodiment, expression levels of one or more nucleic acid sequences of the invention are combined by taking ratios of expression levels of two nucleic acid sequences and/or by a method such as logistic regression to define a metric which is then compared to previously measured samples or to a threshold. The threshold for assignment is treated as a parameter, which can be used to quantify the confidence with which samples are assigned to each class. The threshold for assignment can be scaled to favor sensitivity or specificity, depending on the clinical scenario. The correlation value to the reference data generates a continuous score that can be scaled and provides diagnostic information on the likelihood that a sample belongs to a certain class of thyroid subtype. In multivariate analysis, the microRNA signature provides a high level of prognostic information.
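- As an illustration of how the assignment threshold can be scaled to favor sensitivity or specificity, the following minimal sketch evaluates two thresholds on the same set of scores. The scores, labels and threshold values are hypothetical and are not data from this application; the score is assumed to be any continuous metric (for example a logistic regression output) for which higher values indicate malignancy.

```python
def sensitivity_specificity(scores, labels, threshold):
    """Call a sample malignant (1) when its score is >= threshold.

    Returns (sensitivity, specificity) against the known labels.
    """
    pairs = list(zip(scores, labels))
    tp = sum(1 for s, y in pairs if y == 1 and s >= threshold)
    fn = sum(1 for s, y in pairs if y == 1 and s < threshold)
    tn = sum(1 for s, y in pairs if y == 0 and s < threshold)
    fp = sum(1 for s, y in pairs if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical continuous scores and labels (1 = malignant, 0 = benign).
scores = [0.95, 0.80, 0.60, 0.40, 0.55, 0.30, 0.20, 0.10]
labels = [1, 1, 1, 1, 0, 0, 0, 0]

# A low threshold favors sensitivity; a high threshold favors specificity.
low = sensitivity_specificity(scores, labels, 0.35)
high = sensitivity_specificity(scores, labels, 0.70)
print("threshold 0.35:", low)
print("threshold 0.70:", high)
```

In this toy data set the lower threshold detects every malignant sample at the cost of one false positive, whereas the higher threshold removes the false positive but misses two malignant samples.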
- The present invention also provides novel microRNA molecules, comprising nucleic acids denoted by SEQ ID NOS.27-29, 33, 34, 139, 140, 307 and 308. It is to be understood, that the cDNA, complement sequence, and anti-miR corresponding to any one of SEQ ID NOS.27-29, 33, 34, 139, 140, 307 and 308 are also encompassed by the present invention.
- Further, the present application provides compositions, formulations and medicaments comprising the microRNAs described herein. In one particular embodiment, the present invention provides compositions, formulations and medicaments comprising as an active agent the microRNA comprising any one of SEQ ID NOS.27-29, 33, 34, 139, 140, 307 and 308, variants thereof, or a sequence at least 80%, at least 85%, or at least 90% identical thereto. Said compositions, formulations and medicaments may further optionally comprise any one of adjuvants, carriers, diluents and excipients. The microRNAs described herein can be formulated into compositions, formulations and medicaments by combination with appropriate, pharmaceutically acceptable carriers or diluents, and can be formulated into preparations in solid, semi-solid, liquid or gaseous forms, such as tablets, capsules, powders, granules, ointments, solutions, suppositories, injections, inhalants and aerosols. As such, administration of the microRNA or of a pharmaceutical composition comprising it can be achieved in various ways, including oral, buccal, rectal, parenteral, intraperitoneal, intradermal, transdermal, intratracheal, etc.
- In certain embodiments, pharmaceutical compositions of the present invention comprise one or more nucleic acids of the invention and one or more excipients. In certain such embodiments, excipients are selected from water, salt solutions, alcohol, polyethylene glycols, gelatin, lactose, amylase, magnesium stearate, talc, silicic acid, viscous paraffin, hydroxymethylcellulose and polyvinylpyrrolidone.
- In certain embodiments, a pharmaceutical composition of the present invention is prepared using known techniques, including, but not limited to mixing, dissolving, granulating, dragee-making, levigating, emulsifying, encapsulating, entrapping or tabletting processes. Methods for the preparation of pharmaceutical compositions may be found in the literature, e.g. in Gennaro, A. R. (2000) Remington: The Science and Practice of Pharmacy, 20th ed.
- In certain embodiments, a pharmaceutical composition of the present invention is a liquid (e.g., a suspension, elixir and/or solution). In certain of such embodiments, a liquid pharmaceutical composition is prepared using ingredients known in the art, including, but not limited to, water, glycols, oils, alcohols, flavoring agents, preservatives, and coloring agents.
- In certain embodiments, a pharmaceutical composition of the present invention is a solid (e.g., a powder, tablet, and/or capsule). In certain of such embodiments, a solid pharmaceutical composition comprising one or more nucleic acids of the invention is prepared using ingredients known in the art, including, but not limited to, starches, sugars, diluents, granulating agents, lubricants, binders, and disintegrating agents.
- Further, the present application provides vectors and probes comprising the compounds (the nucleic acids) disclosed herein. In one particular embodiment, the present application provides vectors and probes comprising nucleic acids denoted by SEQ ID NOS.27-29, 33, 34, 139, 140, 307 and 308, variants thereof or a sequence at least 80%, at least 85%, or at least 90% identical thereto.
- It is to be understood that the terminology used herein is for the purpose of describing particular embodiments only and it is not intended to be limiting. It must be noted that, as used in the specification and the appended claims, the singular forms “a,” “an” and “the” include plural referents unless the context clearly dictates otherwise.
- For the recitation of numeric ranges herein, each intervening number there between with the same degree of precision is explicitly contemplated. For example, for the range of 6-9, the numbers 7 and 8 are contemplated in addition to 6 and 9, and for the range 6.0-7.0 for example, the numbers 6.0, 6.1, 6.2, 6.3, 6.4, 6.5, 6.6, 6.7, 6.8, 6.9 and 7.0 are explicitly contemplated.
- As used herein, the term “aberrant proliferation” means cell proliferation that deviates from the normal, proper, or expected course. Aberrant cell proliferation may include cell proliferation whose characteristics are associated with an indication caused by, mediated by, or resulting in inappropriately high levels of cell division, inappropriately low levels of apoptosis, or both. Such indications may be characterized, for example, by single or multiple local abnormal proliferations of cells, groups of cells, or tissue(s), whether cancerous or non-cancerous, benign or malignant. Aberrant proliferation is one of the main features of cancer.
- As used herein, the term “about” refers to +/−10%.
- “Attached” or “immobilized”, as used herein to refer to a probe and a solid support, means that the binding between the probe and the solid support is sufficient to be stable under conditions of binding, washing, analysis, and removal. The binding may be covalent or non-covalent. Covalent bonds may be formed directly between the probe and the solid support or may be formed by a cross linker or by inclusion of a specific reactive group on either the solid support or the probe or both molecules. Non-covalent binding may be one or more of electrostatic, hydrophilic, and hydrophobic interactions. Included in non-covalent binding is the covalent attachment of a molecule, such as streptavidin, to the support and the non-covalent binding of a biotinylated probe to the streptavidin. Immobilization may also involve a combination of covalent and non-covalent interactions.
- “Biological sample” or “sample”, as used herein, means a sample of biological tissue or fluid that comprises nucleic acids, microRNA in particular. Such samples include, but are not limited to, tissue or fluid isolated from subjects. Biological samples also include sections of tissues such as biopsy and autopsy samples, fine-needle aspiration (FNA) samples, frozen sections taken for histological purposes, blood, blood fraction, plasma, serum, and the like. A biological sample may be provided by removing a sample of cells from a subject, but can also be accomplished by using previously isolated cells (e.g., isolated by another person, at another time, and/or for another purpose), which may then be cultured or not. Archival tissues, such as those having treatment or outcome history, may also be used.
- In another embodiment of the invention, the FNA biopsy is prepared as a smear.
- The term “classification” refers to a procedure and/or algorithm in which individual items are placed into groups or classes based on quantitative information on one or more characteristics inherent in the items (referred to as traits, variables, characters, features, etc.) and based on a statistical model and/or a training set of previously labeled items.
- As used herein, the term “classifying thyroid tumors” refers to the identification of one or more properties of a thyroid tissue sample (e.g., including but not limited to, the presence of microRNAs expressed in cancerous tissue, the presence of microRNAs expressed in pre-cancerous tissue that is likely to become cancerous, and the presence of microRNAs expressed in cancerous tissue that is likely to metastasize).
- The term “classifier” as used herein refers to an algorithm used to classify, distinguish or identify thyroid tumors (or lesions) as benign or malignant, or to classify, distinguish or identify sub-types of thyroid tumor. Once the microRNA expression profile of the samples of any study cohort is acquired, for example from the training cohort, a database is generated in which the expression levels of all the microRNAs in the samples of the cohorts are stored. This database is also referred to as “the training data” and it is used to choose an optimal algorithm for classification. Nucleic acid (or microRNA) ratios, alone or in combination with nucleic acid (or microRNA) levels may also be used by the algorithm for the classification of thyroid samples.
- In one embodiment, the algorithm to be used in the method or protocol of the invention is a machine-learning algorithm. Examples of machine-learning algorithms are discriminant analysis, K-nearest neighbor classifier (KNN), Support Vector Machine (SVM) classifier, logistic regression classifier, neural network classifier, Gaussian mixture model (GMM), nearest centroid classifier, linear regression classifier, decision tree classifier, random forest classifier, ensemble of classifiers, or any combination thereof.
- Thus, the classifier may comprise a combination of algorithms, such as for example a discriminant analysis and a KNN classifier, a KNN classifier and a SVM classifier, a discriminant analysis and a logistic regression classifier, and so forth with any of the machine-learning algorithms.
- When a discriminant analysis classifier is used, the discriminant may be any one of linear, quadratic, a diagonal of the linear covariance matrix, diagonals of the quadratic covariance matrices, pseudoinverse of the linear covariance matrix, and pseudoinverse of the quadratic covariance matrices. When a KNN classifier is used, the k may be altered and the distance metric can be Pearson correlation, Spearman correlation, Euclidean distance or cityblock (Manhattan) distance. A KNN classifier uses a number of neighboring samples for the classification of each sample, which may be 5 neighboring samples, 6 neighboring samples, 7 neighboring samples, 8 neighboring samples, 9 neighboring samples, 10 neighboring samples, and the like.
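- The KNN scheme described above, with a configurable k and distance metric, can be sketched in a few lines. This is a minimal illustration only: the two-feature training values, the class labels and the function names are hypothetical, and only the Euclidean and cityblock metrics are shown.

```python
from collections import Counter


def cityblock(a, b):
    """Cityblock (Manhattan) distance between two feature vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def euclidean(a, b):
    """Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def knn_predict(train_x, train_y, sample, k=5, distance=cityblock):
    """Return the majority label among the k training samples nearest to `sample`."""
    ranked = sorted(zip(train_x, train_y), key=lambda pair: distance(pair[0], sample))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical normalized expression values for two microRNAs per sample.
train_x = [(1.0, 1.2), (0.8, 1.0), (1.1, 0.9), (0.9, 1.1),
           (3.0, 3.2), (3.1, 2.9), (2.8, 3.0), (3.2, 3.1)]
train_y = ["benign"] * 4 + ["malignant"] * 4

call_a = knn_predict(train_x, train_y, (2.9, 3.0), k=5, distance=cityblock)
call_b = knn_predict(train_x, train_y, (1.0, 1.0), k=5, distance=euclidean)
print(call_a, call_b)
```

An odd k is used here so that a two-class vote cannot tie; with an even k a tie-breaking rule would have to be chosen.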
- When an SVM classifier is used, the kernel may be linear, Gaussian or polynomial. When an ensemble method classifier is used, it usually applies algorithms such as classification trees, KNN or discriminant analysis classifiers. The ensembles can be created using either boosting or bagging algorithms, and the number of ensemble learning cycles can range from two up to a few thousand.
- One such combination of algorithms may comprise one linear discriminant analysis (LDA) and one KNN, two LDA algorithms and one KNN, three LDA algorithms and one KNN, four LDA algorithms and one KNN, or five LDA algorithms and one KNN.
- Thus, a classifier for the classification of thyroid tumor samples and its use are provided herein, said classifier comprising a combination of algorithms, particularly machine-learning algorithms, which utilize (as an input) the expression levels of nucleic acids of SEQ ID NOS:1-37, or the ratios between pairs of expression levels of said nucleic acids of SEQ ID NOS:1-37, in a series of steps which result in the classification of a thyroid tumor sample as benign or malignant (as the output).
- As used herein, “confusion matrix” refers to a specific table layout that allows visualization of the performance of an algorithm, typically a supervised learning one. A “confusion matrix” may also be referred to as a contingency table or an error matrix.
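- By way of example, a confusion matrix for a two-class benign/malignant call can be tabulated as follows. The true labels and classifier calls are hypothetical; rows are taken to be the true class and columns the predicted class, which is one common layout and is an assumption here.

```python
from collections import Counter


def confusion_matrix(true_labels, predicted_labels, classes):
    """Rows index the true class, columns the predicted class."""
    counts = Counter(zip(true_labels, predicted_labels))
    return [[counts[(t, p)] for p in classes] for t in classes]


classes = ["benign", "malignant"]
# Hypothetical reference diagnoses and classifier calls for six samples.
truth = ["benign", "benign", "benign", "malignant", "malignant", "malignant"]
calls = ["benign", "benign", "malignant", "malignant", "malignant", "benign"]

matrix = confusion_matrix(truth, calls, classes)
for name, row in zip(classes, matrix):
    print(name, row)
```

The diagonal holds the correctly classified samples; the off-diagonal entries are the false positives (benign called malignant) and false negatives (malignant called benign).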
- “Complement” or “complementary”, as used herein to refer to a nucleic acid, may mean Watson-Crick (e.g., A-T/U and C-G) or Hoogsteen base pairing between nucleotides or nucleotide analogs of nucleic acid molecules. A full complement or fully complementary means 100% complementary base pairing between nucleotides or nucleotide analogs of nucleic acid molecules. In some embodiments, the complementary sequence has a reverse orientation (5′-3′). The present invention also provides the complement of the nucleic acids denoted by SEQ ID NOS. 7-29, 33, 34, 139, and 140.
- As used herein, “CT signals” or “CT” represent the first cycle of PCR where amplification crosses a threshold (cycle threshold) of fluorescence. Accordingly, low values of CT represent high abundance or expression levels of the microRNA. In some embodiments the PCR CT signal is normalized such that the normalized CT is inversely related to the expression level. In other embodiments the PCR CT signal may be normalized and then inverted such that a low normalized-inverted CT represents low abundance or expression levels of the microRNA.
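- The inversion of a CT signal can be illustrated with a simple subtraction from a fixed offset, so that a higher inverted value corresponds to higher abundance. The offset of 50 and the CT values below are hypothetical choices made for this sketch; the application does not specify the inversion constant here.

```python
INVERSION_OFFSET = 50.0  # hypothetical constant, not taken from the application


def invert_ct(ct, offset=INVERSION_OFFSET):
    """Invert a CT so that a larger value means higher abundance."""
    return offset - ct


# Hypothetical CTs: the abundant microRNA crosses the threshold earlier.
cts = {"abundant": 22.0, "rare": 34.0}
inverted = {name: invert_ct(ct) for name, ct in cts.items()}
print(inverted)
```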
- As used herein, a “data processing routine” refers to a process that can be embodied in software that determines the biological significance of acquired data (i.e., the ultimate results of an assay or analysis) with respect to one or more samples. For example, the data processing routine can make a determination of whether a thyroid lesion from which a sample was collected or obtained is benign or malignant, or of a specific sub-type, based upon the data collected. In the systems and methods herein, the data processing routine can also control the data collection routine based upon the results determined. The data processing routine and the data collection routines can be integrated and provide feedback to operate the data acquisition, and hence provide assay-based judging methods.
- “Detection” means detecting the presence of a component in a sample. Detection also means detecting the absence of a component. Detection also means determining the level of a component, either quantitatively or qualitatively.
- “Differential expression” or a “difference in expression levels” means qualitative or quantitative differences in the microRNA expression patterns in thyroid samples. Thus, a differentially expressed microRNA may qualitatively have its expression altered, including an activation or inactivation, in, e.g., normal versus diseased thyroid tissue. A qualitatively regulated microRNA may exhibit an expression pattern within a thyroid sample or cell type which may be detectable by standard techniques. Some microRNAs may be expressed in one thyroid sample or cell type and not in others, or expressed at different levels between different cell types or different samples. Thus, the difference in expression may be quantitative, e.g., in that expression is modulated, up-regulated, resulting in an increased amount of microRNA, or down-regulated, resulting in a decreased amount of microRNA. The degree to which expression differs need only be large enough to quantify via standard characterization techniques such as expression arrays, next generation sequencing (NGS), quantitative reverse transcriptase PCR, northern blot analysis, real-time PCR, in situ hybridization and RNase protection.
- The term “expression profile” is used broadly to include a genomic expression profile, as well as an expression profile of microRNAs, for example. As used herein, expression profile means the set of data obtained for the nucleic acid (or microRNA) expression. It may refer to the raw data or to the normalized expression values. Expression profiles may be generated by any convenient means for determining a level of a nucleic acid sequence, e.g. quantitative hybridization of microRNA, labeled microRNA, amplified microRNA, cDNA, etc., quantitative PCR, and the like. Further to measuring nucleic acid sequence levels, the data obtained may be normalized; normalization of data is discussed elsewhere in this application. Expression profiles allow the analysis of differential gene expression between two or more samples, as well as between samples and thresholds. Further, classifiers may be applied to expression profiles in order to obtain information about the sample, such as classification, diagnosis, sub-typing of the sample, and the like. Nucleic acid sequences of interest are nucleic acid sequences that are found to be predictive, including the nucleic acid sequences provided herein in Table 1, where the expression profile may include expression data for 5, 10, 20, 25, 50, 100 or more of the listed nucleic acid sequences, including all of them. According to some embodiments, the term “expression profile” means measuring the abundance of the nucleic acid sequences in the measured samples. In a specific embodiment, microRNA expression profiles are characterized in each thyroid sample.
- “Expression ratio”, as used herein, refers to relative expression levels of two or more nucleic acids, i.e. microRNAs, as determined by detecting the relative expression levels of the corresponding nucleic acids in a biological sample, such as a thyroid sample. Since microRNA expression levels are expressed as CTs, which are obtained in log scale, in practice expression ratios are obtained by subtraction of the CTs, rather than by division.
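- The subtraction described above can be written out explicitly. In the sketch below the CT values are hypothetical, and the conversion to a linear fold ratio assumes ideal PCR efficiency (a doubling of product per cycle), which is an assumption of this illustration rather than a statement of the application.

```python
def log2_expression_ratio(ct_a, ct_b):
    """Log2 ratio of microRNA A to microRNA B.

    A lower CT means higher abundance, so the log ratio A/B is ct_b - ct_a.
    """
    return ct_b - ct_a


def fold_ratio(ct_a, ct_b):
    """Linear ratio A/B, assuming product doubles every PCR cycle."""
    return 2.0 ** log2_expression_ratio(ct_a, ct_b)


# Hypothetical CTs: A crosses the threshold three cycles before B.
ct_a, ct_b = 24.0, 27.0
print(log2_expression_ratio(ct_a, ct_b))  # difference in log scale
print(fold_ratio(ct_a, ct_b))             # equivalent linear ratio
```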
- As used herein, “FDR” or “False Discovery Rate”, is a statistical method used in multiple hypothesis testing to correct for multiple comparisons. When performing multiple statistical tests, for example in comparing the signal between two groups in multiple data features, there is an increasingly high probability of obtaining false positive results, by random differences between the groups that can reach levels that would otherwise be considered as statistically significant. In order to limit the proportion of such false discoveries, statistical significance is defined only for data features in which the differences reached a p-value (by two-sided t-test) below a threshold, which is dependent on the number of tests performed and the distribution of p-values obtained in these tests.
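- One widely used procedure whose p-value threshold depends on the number of tests and on the distribution of the p-values is the Benjamini-Hochberg step-up procedure. The application does not name the specific procedure, so its use here is an assumption, and the p-values below are hypothetical.

```python
def benjamini_hochberg(p_values, fdr=0.05):
    """Return a list of booleans marking which tests pass at the given FDR.

    The i-th smallest p-value (rank i of m) is compared with (i / m) * fdr;
    every test ranked at or below the largest passing rank is significant.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * fdr:
            cutoff_rank = rank
    significant = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= cutoff_rank:
            significant[idx] = True
    return significant


# Hypothetical p-values from two-sided t-tests on six microRNAs.
p_values = [0.001, 0.008, 0.039, 0.041, 0.20, 0.74]
flags = benjamini_hochberg(p_values, fdr=0.05)
print(flags)
```

In this example the third and fourth p-values fall below 0.05 when considered individually but do not pass once the multiple comparisons are taken into account.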
- As used herein, “FNA” relates to “fine needle aspiration”. Fine-needle aspiration biopsy (FNAB, FNA or NAB), or fine-needle aspiration cytology (FNAC), is a diagnostic procedure used to investigate superficial (just under the skin) lumps or masses, and it is particularly useful for thyroid lesion biopsies. A biopsy is collected by inserting a thin, hollow needle into the mass for sampling of cells that, after being stained, will be examined under a microscope. The aspirate may undergo cytological examination (cell specimen evaluation, FNAC) or histological examination (biopsy, i.e., tissue specimen evaluation, FNAB). FNA is a popular biopsy method used for thyroid nodules since a major surgical (excisional or open) biopsy can be avoided by performing a needle aspiration biopsy instead. A detailed description of specimen collection and preparation may be found in “Atlas of Fine Needle Aspiration Cytology” by Henryk A. Domanski (2014), the contents of which are incorporated herein by reference. The preparation of aspiration specimens has been well described in the art. Usually, a suitable amount of aspirate (usually about one drop) is spread thinly and evenly over a microscopic slide which is then stained and mounted. FNA specimens prepared in this manner are also referred to as “smears”. The result should be comparable to a sectioned histological slide with regard to specimen thickness and evenness. Fixation of FNA smears is usually by air drying (generally referred to as “routine air dried FNAB”) or wet fixing using either 95% ethanol or cyto-spray as fixative. Other suitable liquid fixatives are methanol, acetone, isopropyl alcohol, acetone/methanol and the like. Alternatively, FNA samples may be added to or mixed with preservatives in a tube.
- As referred to herein, a “follicular” lesion may be any one of follicular adenoma (FA), follicular carcinoma (FC) and follicular variant of papillary thyroid carcinoma (FVPTC).
- “Fragment” is used herein to indicate a non-full-length part of a nucleic acid. Thus, a fragment is itself also a nucleic acid.
- “Groove binder” and/or “minor groove binder” (MGB), as used herein, may be used interchangeably and refer to small molecules that fit into the minor groove of double-stranded DNA, typically in a sequence-specific manner. Minor groove binders may be long, flat molecules that can adopt a crescent-like shape and thus, fit snugly into the minor groove of a double helix, often displacing water. Minor groove binding molecules may typically comprise several aromatic rings connected by bonds with torsional freedom such as furan, benzene, or pyrrole rings. Minor groove binders may be antibiotics such as netropsin, distamycin, berenil, pentamidine and other aromatic diamidines, Hoechst 33258, SN 6999, aureolic anti-tumor drugs such as chromomycin and mithramycin, CC-1065, dihydrocyclopyrroloindole tripeptide (DPI3), 1,2-dihydro-(3H)-pyrrolo[3,2-e]indole-7-carboxylate (CDPI3), and related compounds and analogues, including those described in Nucleic Acids in Chemistry and Biology, 2d ed., Blackburn and Gait, eds., Oxford University Press, 1996, and PCT Published Application No. WO 03/078450, the contents of which are incorporated herein by reference. A minor groove binder may be a component of a primer, a probe, a hybridization tag complement, or combinations thereof. Minor groove binders may increase the Tm of the primer or a probe to which they are attached, allowing such primers or probes to effectively hybridize at higher temperatures.
- “Identical” or “identity”, as used herein in the context of two or more nucleic acid sequences, means that the sequences have a specified percentage of residues that are the same over a specified region. The percentage may be calculated by optimally aligning the two sequences, comparing the two sequences over the specified region, determining the number of positions at which the identical residue occurs in both sequences to yield the number of matched positions, dividing the number of matched positions by the total number of positions in the specified region, and multiplying the result by 100 to yield the percentage of sequence identity. In cases where the two sequences are of different lengths or the alignment produces one or more staggered ends and the specified region of comparison includes only a single sequence, the residues of the single sequence are included in the denominator but not the numerator of the calculation. When comparing DNA and RNA sequences, thymine (T) and uracil (U) may be considered equivalent. Identity may be determined manually or by using a computer sequence algorithm such as BLAST, BLAST 2.0, and the like.
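- The percentage calculation described above can be sketched for two sequences that have already been aligned. The sketch assumes the alignment is supplied as two equal-length strings in which “-” marks a position occupied by only one sequence; the sequences shown are hypothetical and the alignment step itself (e.g., by BLAST) is not reproduced.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity over an aligned region.

    Positions where only one sequence has a residue ('-' in the other) count
    in the denominator but not the numerator. T and U are treated as equivalent.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("aligned region is empty")

    def normalize(seq):
        return seq.upper().replace("U", "T")

    a, b = normalize(aligned_a), normalize(aligned_b)
    matches = sum(1 for x, y in zip(a, b) if x == y and x != "-")
    return 100.0 * matches / len(a)


# Hypothetical 10-position region: an RNA against a DNA one residue shorter.
identity = percent_identity("ACGUACGUAC", "ACGTACGTA-")
print(identity)
```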
- “In situ detection”, as used herein, means the detection of expression or expression levels at the original site, i.e., in a tissue sample such as a biopsy.
- “Label”, as used herein, means a composition detectable by spectroscopic, photochemical, biochemical, immunochemical, chemical, or other physical means. The label may be any entity that does not naturally occur in a protein or nucleic acid and allows the nucleic acid or protein to be detectable. For example, useful labels include 32P, fluorescent dyes, electron-dense reagents, enzymes, biotin, digoxigenin, or haptens and other entities which can be made detectable, and the like. A label may be incorporated into nucleic acids and proteins at any position.
- “Logistic regression” is part of a category of statistical models called generalized linear models. Logistic regression allows one to predict a discrete outcome, such as group membership, from a set of variables that may be continuous, discrete, dichotomous, or a mix of any of these. The dependent or response variable can be dichotomous, for example, one of two possible types of cancer. Logistic regression models the natural log of the odds, i.e. the ratio of the probability of belonging to the first group (P) over the probability of belonging to the second group (1-P), as a linear combination of the different expression levels (in log-space). The logistic regression output can be used as a classifier by prescribing that a case or sample will be classified into the first type if P is greater than 0.5 or 50%. Alternatively, the calculated probability P can be used as a variable in other contexts such as a 1D or 2D threshold classifier.
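- The relationship between the linear combination, the log odds and the probability P can be made concrete as follows. The weights, intercept and expression values are hypothetical placeholders, not fitted coefficients from this application.

```python
import math


def logistic_probability(features, weights, intercept):
    """P of belonging to the first group, where ln(P / (1 - P)) is linear in the features."""
    log_odds = intercept + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-log_odds))


def classify(p, cutoff=0.5):
    """Assign the first group when P exceeds the cutoff."""
    return "first group" if p > cutoff else "second group"


# Hypothetical coefficients for two expression levels measured in log space.
weights, intercept = [0.8, -0.5], -1.0

p_high = logistic_probability([3.0, 1.0], weights, intercept)  # log odds = 0.9
p_low = logistic_probability([0.0, 0.0], weights, intercept)   # log odds = -1.0
print(round(p_high, 3), classify(p_high))
print(round(p_low, 3), classify(p_low))
```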
- As used herein, the term “prior” refers to the probability assigned to each class in a classification, i.e., the likelihood that a sample is malignant or benign without any additional knowledge regarding the expression profile of the sample. Priors may be set at different ratios, such as for example 80%-20% malignant-benign, 75%-25% malignant-benign, 70%-30% malignant-benign, 65%-35% malignant-benign, 60%-40% malignant-benign, 50%-50% malignant-benign (i.e., uniform). In addition, priors may be empirical, i.e., based on the distribution of the samples in the training cohort. Priors may be adjusted in order to achieve a predetermined sensitivity or specificity.
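- The effect of the prior on a call can be illustrated with Bayes' rule, which combines the prior for each class with the likelihood of the observed expression profile under that class. The likelihood values below are hypothetical, and the use of Bayes' rule in this form is an illustrative assumption rather than a description of a specific classifier of the application.

```python
def posterior_malignant(likelihood_malignant, likelihood_benign, prior_malignant):
    """Posterior probability of malignancy from class likelihoods and a prior."""
    prior_benign = 1.0 - prior_malignant
    weighted_malignant = prior_malignant * likelihood_malignant
    weighted_benign = prior_benign * likelihood_benign
    return weighted_malignant / (weighted_malignant + weighted_benign)


# Hypothetical likelihoods of one expression profile under each class.
l_malignant, l_benign = 0.2, 0.4

uniform = posterior_malignant(l_malignant, l_benign, prior_malignant=0.5)
skewed = posterior_malignant(l_malignant, l_benign, prior_malignant=0.8)
print(round(uniform, 3), round(skewed, 3))
```

With a uniform prior this profile is more likely benign, whereas an 80%-20% malignant-benign prior moves the same profile above 0.5, which shows how adjusting priors trades specificity for sensitivity.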
- As used herein, a “marker” is a microRNA, or a nucleic acid sequence, whose presence and abundance is measured in a sample. A “marker” further provides an indication of the status of the sample.
- As used herein, “malignant marker” is a microRNA, or a nucleic acid sequence which is present at higher levels in malignant samples versus benign samples. A malignant marker may or may not be present in test samples.
- As used herein, “secondary marker” is a microRNA, or a nucleic acid sequence, which is used to differentiate between malignant and benign samples, and for which the difference, or the ratio, in the expression levels of said secondary marker in malignant and benign samples is less than the difference, or the ratio, in the expression levels of malignant markers. A secondary marker may or may not be present in test samples.
- As used herein, “cell type marker” refers to a microRNA, or nucleic acid sequence, whose expression correlates with certain cell types. Said cell types may generally be found in a sample, e.g. blood cells, white blood cells, red blood cells, epithelial cells, Hurthle cells, mitochondrial-rich cells, lymphocytes, follicular cells, parafollicular cells (C cells), metastatic cells, immune cells, macrophages and the like. Other markers included as “cell type markers” may be species-specific markers, such as markers from bacteria, fungi, and the like.
- “Normalizer”, as used herein, means a microRNA or a nucleic acid sequence whose signal (i.e., level of expression) is used in order to normalize each sample. A normalizer may be used alone (one microRNA as normalizer), or as part of a set of normalizers (more than one microRNA as normalizer, for example two, three, four, five, six, seven, eight, nine, ten, eleven, twelve, thirteen, fourteen, sixteen or seventeen microRNAs may be used as normalizers in a set). As referred to herein, any microRNA detected in the sample may be used as a normalizer. Essentially, any microRNA may be used as a normalizer. MicroRNAs denoted by any one of SEQ ID NOs. 1-182 may be used as normalizers. MicroRNAs denoted by any one of SEQ ID NOs. 1-37 may be used as normalizers. Particular examples of microRNAs that may be used as normalizers are hsa-miR-23a-3p, MID-20094, MID-50969, hsa-miR-345-5p, hsa-miR-3074-5p, MID-50976, MID-50971, hsa-miR-5701 and hsa-miR-574-3p.
- “Normalization” of data values refers to mapping the original data range into another scale. Normalization may be done by subtracting the mean expression of the set of normalizers, by subtracting the median expression of the set of normalizers, or by fitting the expression values of the normalizers to a reference set of values (using a polynomial fit) and applying this fit to all signals. All the normalizers, or a subset of the normalizers, may be used.
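The normalization options listed above can be sketched as follows. This is a minimal illustration under stated assumptions: the signal values and reference values are hypothetical, and the polynomial fit is reduced to its simplest case, a degree-1 (linear) least-squares fit; the text does not specify the degree.

```python
from statistics import mean, median

def normalize_by_offset(signals, normalizers, center=mean):
    """Subtract the mean (or median) normalizer expression from every signal."""
    offset = center([signals[name] for name in normalizers])
    return {name: value - offset for name, value in signals.items()}

def fit_line(x, y):
    """Least-squares degree-1 polynomial mapping x onto y."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx

def normalize_by_fit(signals, normalizers, reference):
    """Fit normalizer signals to reference values, then apply the fit to all."""
    x = [signals[name] for name in normalizers]
    y = [reference[name] for name in normalizers]
    slope, intercept = fit_line(x, y)
    return {name: slope * v + intercept for name, v in signals.items()}

# Hypothetical log-space signals; the two normalizer names appear in the text.
signals = {"hsa-miR-23a-3p": 10.0, "hsa-miR-574-3p": 12.0, "hsa-miR-222-3p": 15.0}
normalizers = ["hsa-miR-23a-3p", "hsa-miR-574-3p"]
reference = {"hsa-miR-23a-3p": 9.0, "hsa-miR-574-3p": 11.0}

by_mean = normalize_by_offset(signals, normalizers, center=mean)
by_median = normalize_by_offset(signals, normalizers, center=median)
by_fit = normalize_by_fit(signals, normalizers, reference)
print(by_mean["hsa-miR-222-3p"], by_fit["hsa-miR-222-3p"])
```

Passing a shorter `normalizers` list corresponds to using a subset of the normalizers.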
- “Nucleic acid” or “oligonucleotide” or “polynucleotide”, as used herein, means at least two nucleotides covalently linked together. The depiction of a single strand also defines the sequence of the complementary strand. Thus, a nucleic acid also encompasses the complementary strand of a depicted single strand. Many variants of a nucleic acid may be used for the same purpose as a given nucleic acid. Thus, a nucleic acid also encompasses substantially identical nucleic acids and complements thereof. A single strand may provide a probe that hybridizes to a target sequence under stringent hybridization conditions. Thus, a nucleic acid also encompasses a probe that hybridizes under stringent hybridization conditions.
- Nucleic acids may be single-stranded or double-stranded, or may contain portions of both double-stranded and single-stranded sequences. The nucleic acid may be DNA, both genomic and cDNA, RNA, or a hybrid, where the nucleic acid may contain combinations of deoxyribo- and ribo-nucleotides, and combinations of bases including uracil, adenine, thymine, cytosine, guanine, inosine, xanthine, hypoxanthine, isocytosine and isoguanine. Nucleic acids may be obtained by chemical synthesis methods or by recombinant methods.
- A nucleic acid will generally contain phosphodiester bonds, although nucleic acid analogs may be included. The analog may include a non-naturally occurring linkage, backbone, or nucleotide. The analog may have at least one different linkage, e.g., phosphoramidate, phosphorothioate, phosphorodithioate, or O-methylphosphoroamidite linkages and peptide nucleic acid backbones and linkages. Other analog nucleic acids include those with positive backbones; non-ionic backbones, and non-ribose backbones, including those described in U.S. Pat. Nos. 5,235,033 and 5,034,506, which are incorporated herein by reference. Nucleic acids containing one or more non-naturally occurring or modified nucleotides are also included within one definition of nucleic acids. The modified nucleotide analog may be located for example at the 5′-end and/or the 3′-end of the nucleic acid molecule. Representative examples of nucleotide analogs may be selected from sugar- or backbone-modified ribonucleotides. It should be noted, however, that also nucleobase-modified ribonucleotides, i.e., ribonucleotides containing a non-naturally occurring nucleobase instead of a naturally occurring nucleobase such as uridines or cytidines modified at the 5-position, e.g., 5-(2-amino) propyl uridine, 5-bromo uridine; adenosines and guanosines modified at the 8-position, e.g., 8-bromo guanosine; deaza nucleotides, e.g., 7-deaza-adenosine; O- and N-alkylated nucleotides, e.g., N6-methyl adenosine are suitable. The 2′-OH-group may be replaced by a group selected from H, OR, R, halo, SH, SR, NH2, NHR, NR2 or CN, wherein R is C1-C6 alkyl, alkenyl or alkynyl and halo is F, Cl, Br or I. Modified nucleotides also include nucleotides conjugated with cholesterol through, e.g., a hydroxyprolinol linkage as described in Krutzfeldt et al. (Nature 2005; 438:685-689), Soutschek et al. (Nature 2004; 432:173-178), and WO 2005/079397, which are incorporated herein by reference. 
Modifications of the ribose-phosphate backbone may be done for a variety of reasons, e.g., to increase the stability and half-life of such molecules in physiological environments, to enhance diffusion across cell membranes, or as probes on a biochip. The backbone modification may also enhance resistance to degradation, such as in the harsh endocytic environment of cells. The backbone modification may also reduce nucleic acid clearance by hepatocytes, such as in the liver and thyroid. Mixtures of naturally occurring nucleic acids and analogs may be made. Alternatively, mixtures of different nucleic acid analogs may be made.
- Thus, novel isolated nucleic acids are provided herein. The nucleic acids provided herein may be non-naturally occurring, synthesized nucleic acids. Thus, the nucleic acid provided herein may be a synthetic nucleic acid. Methods of synthesizing nucleic acids are known to the man skilled in the art, and are described, e.g., in U.S. Pat. No. 7,579,451, the contents of which are incorporated herein by reference. The nucleic acids may comprise at least one of the sequences of SEQ ID NOS: 1-308 or a variant thereof. In one embodiment, the nucleic acids comprise at least one of the sequences of SEQ ID NOS: 1-182. The variant may be a complement of the referenced nucleotide sequence. The variant may be a nucleotide sequence that is 70%, 75%, 80%, 85%, 90% or 95% identical to the referenced nucleotide sequence or the complement thereof. The variant may be a nucleotide sequence which hybridizes under stringent conditions to the referenced nucleotide sequence, complements thereof, or nucleotide sequences substantially identical thereto.
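Two of the variant relationships named above, the complement and percent identity, can be illustrated with a short sketch. It rests on simplifying assumptions that are not from the text: the complement is written as the reverse complement read 5′ to 3′, and identity is computed position by position on equal-length sequences without alignment, whereas practical comparisons would normally use a sequence alignment tool.

```python
# DNA complement table (A<->T, C<->G); cDNA alphabet assumed.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the complementary strand, read 5' to 3'."""
    return seq.translate(_COMPLEMENT)[::-1]

def percent_identity(a, b):
    """Ungapped identity between two equal-length sequences, in percent."""
    if len(a) != len(b):
        raise ValueError("this sketch compares equal-length sequences only")
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return 100.0 * matches / len(a)

# SEQ ID NO: 8 (hsa-miR-375) from Table 1, and a hypothetical variant
# that differs at the last position.
reference = "TTTGTTCGTTCGGCTCGCGTGA"
variant = "TTTGTTCGTTCGGCTCGCGTGC"

print(reverse_complement(reference))
print(percent_identity(reference, variant))
```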
- A nucleic acid as described herein may have a length of from about 10 to about 250 nucleotides. The nucleic acid may have a length of at least 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 35, 40, 45, 50, 60, 70, 80, 90, 100, 125, 150, 175, 200 or 250 nucleotides. The nucleic acid may be synthesized or expressed in a cell (in vitro or in vivo) using a synthetic gene. The nucleic acid may be synthesized as a single strand molecule and hybridized to a substantially complementary nucleic acid to form a duplex. The nucleic acid may be introduced to a cell, tissue or organ in a single- or double-stranded form or capable of being expressed by a synthetic gene using methods well known to those skilled in the art, including as described in U.S. Pat. No. 6,506,559, the contents of which are incorporated by reference herein.
- The nucleic acid may comprise a microRNA sequence shown in Table 1, or a variant thereof. In some instances, variants of the same microRNA are also provided in Table 1. It is to be noted that SEQ ID NOs. 1-180 in Table 1 present the cDNA corresponding to the sequence of the naturally occurring microRNA, i.e., the sequences present thymine (T) instead of uracil (U).
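Because the listed sequences are given as cDNA, the corresponding RNA sequence is obtained by replacing each T with U. A minimal sketch follows; the function name is illustrative.

```python
def cdna_to_rna(cdna):
    """Convert a cDNA-style sequence (T) to its RNA form (U)."""
    return cdna.upper().replace("T", "U")

# SEQ ID NO: 8 (hsa-miR-375) as listed in Table 1.
rna = cdna_to_rna("TTTGTTCGTTCGGCTCGCGTGA")
print(rna)
```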
- It is to be understood that nucleic acid refers to deoxyribonucleotides, ribonucleotides, or modified nucleotides, and polymers thereof in single- or double-stranded form. The term encompasses nucleic acids containing known nucleotide analogs or modified backbone residues or linkages, which are synthetic, naturally occurring, and non-naturally occurring, which have similar binding properties as the reference nucleic acid, and which are metabolized in a manner similar to the reference nucleotides. Examples of such analogs include, without limitation, phosphorothioates, phosphoramidates, methyl phosphonates, chiral-methyl phosphonates, 2-O-methyl ribonucleotides, peptide-nucleic acids (PNAs) and unlocked nucleic acids (UNAs; see, e.g., Jensen et al. Nucleic Acids Symposium Series 52: 133-4), and derivatives thereof.
- Nucleotide is used as recognized in the art, to include those with natural bases (standard), and modified bases well known in the art. Such bases are generally located at the 1′ position of a nucleotide sugar moiety. Nucleotides generally comprise a base, sugar and a phosphate group. The nucleotides can be unmodified or modified at the sugar, phosphate and/or base moiety, also referred to interchangeably as nucleotide analogs, modified nucleotides, non-natural nucleotides, non-standard nucleotides and others (see, e.g., WO 92/07065; WO 93/15187; the contents of which are incorporated herein by reference). There are several examples of modified nucleic acid bases known in the art as summarized by Limbach et al., Nucleic Acids Res. 22:2183, 1994. Some of the non-limiting examples of base modifications that can be introduced into nucleic acid molecules include hypoxanthine, purine, pyridin-4-one, pyridin-2-one, phenyl, pseudouracil, 2,4,6-trimethoxy benzene, 3-methyl uracil, dihydrouridine, naphthyl, aminophenyl, 5-alkylcytidines (e.g., 5-methylcytidine), 5-alkyluridines (e.g., ribothymidine), 5-halouridine (e.g., 5-bromouridine) or 6-azapyrimidines or 6-alkylpyrimidines (e.g. 6-methyluridine), propyne, and others (Burgin, et al., Biochemistry 35:14090, 1996). By “modified bases” in this aspect is meant nucleotide bases other than adenine, guanine, cytosine and uracil at the 1′ position or their equivalents.
- Modified nucleotide refers to a nucleotide that has one or more modifications to the nucleoside, the nucleobase, pentose ring, or phosphate group. Modifications include those naturally occurring that result from modification by enzymes that modify nucleotides, such as methyltransferases. Modified nucleotides also include synthetic or non-naturally occurring nucleotides. Synthetic or non-naturally occurring modifications in nucleotides include those with 2′ modifications, e.g., 2′-methoxyethoxy, 2′-fluoro, 2′-allyl, 2′-O-[2-(methylamino)-2-oxoethyl], 4′-thio, 4′-CH2—O-2′-bridge, 4′-(CH2)2—O-2′-bridge, 2′-LNA or other bicyclic or “bridged” nucleoside analog, and 2′-O—(N-methylcarbamate) or those comprising base analogs. In connection with 2′-modified nucleotides as described for the present disclosure, by “amino” is meant 2′-NH2 or 2′-O—NH2, which can be modified or unmodified. Such modified groups are described, e.g., in U.S. Pat. Nos. 5,672,695 and 6,248,878. “Modified nucleotides” of the instant invention can also include nucleotide analogs as described above.
- As used herein, “base analog” refers to a heterocyclic moiety which is located at the 1′ position of a nucleotide sugar moiety in a modified nucleotide that can be incorporated into a nucleic acid duplex (or the equivalent position in a nucleotide sugar moiety substitution that can be incorporated into a nucleic acid duplex). A base analog may be generally a purine or a pyrimidine base, excluding the common bases guanine (G), cytosine (C), adenine (A), thymine (T), and uracil (U). Base analogs can duplex with other bases or base analogs in dsRNAs. Base analogs include those useful in the compounds and methods of the invention, e.g., those disclosed in U.S. Pat. Nos. 5,432,272, 6,001,983 and 7,579,451, which are herein incorporated by reference. Non-limiting examples of bases include hypoxanthine (I), xanthine (X), 3-β-D-ribofuranosyl-(2,6-diaminopyrimidine) (K), 3-gamma-D-ribofuranosyl-(1-methyl-pyrazolo[4,3-d]pyrimidine-5,7(4H,6H)-dione) (P), iso-cytosine (iso-C), iso-guanine (iso-G), 1-gamma-D-ribofuranosyl-(5-nitroindole), 1-gamma-D-ribofuranosyl-(3-nitropyrrole), 5-bromouracil, 2-aminopurine, 4-thio-dT, 7-(2-thienyl)-imidazo[4,5-b]pyridine (Ds) and pyrrole-2-carbaldehyde (Pa), 2-amino-6-(2-thienyl)purine (S), 2-oxopyridine (Y), difluorotolyl, 4-fluoro-6-methylbenzimidazole, 4-methylbenzimidazole, 3-methyl isocarbostyrilyl, 5-methyl isocarbostyrilyl, and 3-methyl-7-propynyl isocarbostyrilyl, 7-azaindolyl, 6-methyl-7-azaindolyl, imidizopyridinyl, 9-methyl-imidizopyridinyl, pyrrolopyrizinyl, isocarbostyrilyl, 7-propynyl isocarbostyrilyl, propynyl-7-azaindolyl, 2,4,5-trimethylphenyl, 4-methylindolyl, 4,6-dimethylindolyl, phenyl, napthalenyl, anthracenyl, phenanthracenyl, pyrenyl, stilbenzyl, tetracenyl, pentacenyl, and structural derivatives thereof (Schweitzer et al., J. Org. Chem., 59:7238-7242 (1994); Berger et al., Nucleic Acids Research, 28(15):2911-2914 (2000); Moran et al., J. Am. Chem. Soc., 119:2056-2057 (1997); Morales et al., J. Am. Chem. 
Soc., 121:2323-2324 (1999); Guckian et al., J. Am. Chem. Soc., 118:8182-8183 (1996); Morales et al., J. Am. Chem. Soc., 122(6):1001-1007 (2000); McMinn et al., J. Am. Chem. Soc., 121:11585-11586 (1999); Guckian et al., J. Org. Chem., 63:9652-9656 (1998); Moran et al., Proc. Natl. Acad. Sci., 94:10506-10511 (1997); Das et al., J. Chem. Soc., Perkin Trans., 1:197-206 (2002); Shibata et al., J. Chem. Soc., Perkin Trans., 1:1605-1611 (2001); Wu et al., J. Am. Chem. Soc., 122(32):7621-7632 (2000); O'Neill et al., J. Org. Chem., 67:5869-5875 (2002); Chaudhuri et al., J. Am. Chem. Soc., 117:10434-10442 (1995); and U.S. Pat. No. 6,218,108). A base analog may also be a universal base.
- “Universal base” refers to a heterocyclic moiety located at the 1′ position of a nucleotide sugar moiety in a modified nucleotide, or the equivalent position in a nucleotide sugar moiety substitution, that, when present in a nucleic acid duplex, can be positioned opposite more than one type of base without altering the double helical structure (e.g., the structure of the phosphate backbone). Additionally, the universal base does not destroy the ability of the single stranded nucleic acid in which it resides to duplex to a target nucleic acid.
-
TABLE 1
The microRNAs of the invention

miR name          SEQ ID NO.  Sequence
hsa-miR-222-3p    1           AGCTACATCTGGCTACTGGGT
                  2           AGCTACATCTGGCTACTGGGTCTC
hsa-miR-551b-3p   3           GACCCATACTTGGTTTCAGAGG
                  4           GCGACCCATACTTGGTTTCAG
hsa-miR-31-5p     5           AGGCAAGATGCTGGCATAGCT
                  6           AGGCAAGATGCTGGCATAGCTGT
                  7           GGCAAGATGCTGGCATAGCTG
hsa-miR-375       8           TTTGTTCGTTCGGCTCGCGTGA
hsa-miR-125b-5p   9           TCCCTGAGACCCTAACTTGTGA
hsa-miR-146b-5p   10          TGAGAACTGAATTCCATAGGCT
                  11          TGAGAACTGAATTCCATAGGCTGT
hsa-miR-152-3p    12          TCAGTGCATGACAGAACTTGG
                  13          TCAGTGCATGACAGAACTTGGG
hsa-miR-346       14          TGTCTGCCCGCATGCCTGCCTCT
hsa-miR-181c-5p   15          AACATTCAACCTGTCGGTGAGT
hsa-miR-424-3p    16          CAAAACGTGAGGCGCTGCTAT
hsa-miR-342-3p    17          TCTCACACAGAAATCGCACCCGT
                  18          TCTCACACAGAAATCGCACCCGTC
hsa-miR-138-5p    19          AGCTGGTGTTGTGAATC
                  20          AGCTGGTGTTGTGAATCAGGCCG
                  21          AGCTGGTGTTGTGAATCAGGCCGT
hsa-miR-486-5p    22          TCCTGTACTGAGCTGCCCCGAG
hsa-miR-200c-3p   23          TAATACTGCCGGGTAATGATGG
                  24          TAATACTGCCGGGTAATGATGGA
MID-16582         25          AGTGAAGCATTGGACTGTA
hsa-miR-23a-3p    26          ATCACATTGCCAGGGATTTCC
MID-20094         27          TAAGCCAGTTTCTGTCTGATA
                  28          TTTCTAAGCCAGTTTCTGTCTGATA
MID-50969         29          ATGACAGATTGACATGGACAATT
hsa-miR-345-5p    30          GCTGACTCCTAGTCCAGGGCTC
                  31          TGCTGACTCCTAGTCCAGGGC
hsa-miR-3074-5p   32          GTTCCTGCTGAACTGAGCCAG
MID-50976         33          CTGTCTGAGCGCCGCTC
MID-50971         34          ATACTCTGGTTTCTTTTC
hsa-miR-5701      35          TTATTGTCACGTTCTGATT
hsa-miR-574-3p    36          CACGCTCATGCACACACCCAC
                  37          CACGCTCATGCACACACCCACA
hsa-miR-7-5p      38          TGGAAGACTAGTGATTTTGTTGT
hsa-miR-10a-5p    39          TACCCTGTAGATCCGAATTTGTG
hsa-miR-18a-5p    40          TAAGGTGCATCTAGTGCAGATAG
hsa-miR-21-3p     41          CAACACCAGTCGATGGGCTGT
hsa-miR-21-5p     42          TAGCTTATCAGACTGATGTTGA
hsa-miR-30e-5p    43          TGTAAACATCCTTGACTGGAAG
hsa-miR-31-3p     44          TGCTATGCCAACATATTGCCAT
hsa-miR-34a-5p    45          TGGCAGTGTCTTAGCTGGTTGTT
hsa-miR-92b-5p    46          AGGGACGGGACGCGGTGCAGTG
hsa-miR-96-5p     47          TTTGGCACTAGCACATTTTTGCT
hsa-miR-100-5p    48          AACCCGTAGATCCGAACTTGTG
hsa-miR-126-3p    49          TCGTACCGTGAGTAATAATGCG
hsa-miR-138-1-3p  50          GCTACTTCACAACACCAGGGCC
hsa-miR-140-3p    51          TACCACAGGGTAGAACCACGG
hsa-miR-141-3p    52          TAACACTGTCTGGTAAAGATGG
hsa-miR-142-3p    53          TGTAGTGTTTCCTACTTTATGGA
hsa-miR-142-5p    54          CATAAAGTAGAAAGCACTACT
hsa-miR-146b-3p   55          TGCCCTGTGGACTCAGTTCTGG
hsa-miR-146a-5p   56          TGAGAACTGAATTCCATGGGTT
hsa-miR-148a-3p   57          TCAGTGCACTACAGAACTTTGT
hsa-miR-150-3p    58          CTGGTACAGGCCTGGGGGACAG
hsa-miR-150-5p    59          TCTCCCAACCCTTGTACCAGTG
hsa-miR-155-5p    60          TTAATGCTAATCGTGATAGGGGT
hsa-miR-181a-5p   61          AACATTCAACGCTGTCGGTGAGT
hsa-miR-181b-5p   62          AACATTCATTGCTGTCGGTGGGT
hsa-miR-182-5p    63          TTTGGCAATGGTAGAACTCACACT
hsa-miR-187-3p    64          TCGTGTCTTGTGTTGCAGCCGG
hsa-miR-193a-3p   65          AACTGGCCTACAAAGTCCCAGT
hsa-miR-195-5p    66          TAGCAGCACAGAAATATTGGC
hsa-miR-197-5p    67          CGGGTAGAGAGGGCAGTGGGAGG
hsa-miR-199a-3p   68          ACAGTAGTCTGCACATTGGTTA
hsa-miR-200a-3p   69          TAACACTGTCTGGTAACGATGTT
hsa-miR-200b-3p   70          TAATACTGCCTGGTAATGATGA
hsa-miR-199a-5p   71          CCCAGTGTTCAGACTACCTGTTC
hsa-miR-199b-5p   72          CCCAGTGTTTAGACTATCTGTTC
hsa-miR-205-5p    73          TCCTTCATTCCACCGGAGTCTG
hsa-miR-210-3p    74          CTGTGCGTGTGACAGCGGCTGA
hsa-miR-214-3p    75          ACAGCAGGCACAGACAGGCAGT
hsa-miR-221-3p    76          AGCTACATTGTCTGCTGGGTTTC
hsa-miR-221-5p    77          ACCTGGCATACAATGTAGATTT
hsa-miR-223-3p    78          TGTCAGTTTGTCAAATACCCCA
hsa-miR-222-5p    79          CTCAGTAGCCAGTGTAGATCCT
hsa-miR-224-5p    80          CAAGTCACTAGTGGTTCCGTTTAG
hsa-miR-342-5p    81          AGGGGTGCTATCTGTGATTGA
hsa-miR-429       82          TAATACTGTCTGGTAAAACCGT
hsa-miR-455-3p    83          GCAGTCCATGGGCATATACAC
hsa-miR-483-5p    84          AAGACGGGAGGAAAGAAGGGAG
hsa-miR-487b-3p   85          AATCGTACAGGGTCATCCACTT
hsa-miR-497-5p    86          CAGCAGCACACTGTGGTTTGT
hsa-miR-513a-5p   87          TTCACAGGGAGGTGTCATTTAT
hsa-miR-542-5p    88          TCGGGGATCATCATGTCACGAGA
hsa-miR-625-5p    89          AGGGGGAAAGTTCTATAGTCC
hsa-miR-650       90          AGGAGGCAGCGCTCTCAGGAC
hsa-miR-658       91          GGCGGAGGGAAGTAGGTCCGTTGGT
hsa-miR-664b-5p   92          TGGGCTAAGGGAGATGATTGGGTA
hsa-miR-708-5p    93          AAGGAGCTTACAATCTAGCTGGG
hsa-miR-765       94          TGGAGGAGAAGGAAGGTGATG
hsa-miR-1229-5p   95          GTGGGTAGGGTTTGGGGGAGAGCG
hsa-miR-2392      96          TAGGATGGGGGTGAGAGGTG
hsa-miR-3141      97          GAGGGCGGGTGGAGGAGGA
hsa-miR-3162-5p   98          TTAGGGAGTAGAAGGGTGGGGAG
hsa-miR-3679-5p   99          TGAGGATATGGCAGGGAAGGGGA
hsa-miR-3687      100         CCCGGACAGGCGTTCGTGCGACGT
hsa-miR-3940-5p   101         GTGGGTTGGGGCGGGCTCTG
hsa-miR-4270      102         TCAGGGAGTCAGGGGAGGGC
hsa-miR-4284      103         GGGCTCACATCACCCCAT
hsa-miR-4443      104         TTGGAGGCGTGGGTTTT
hsa-miR-4447      105         GGTGGGGGCTGTTGTTT
hsa-miR-4448      106         GGCTCCTTGGTCTAGGGGTA
hsa-miR-4454      107         GGATCCGAGTCACGGCACCA
hsa-miR-4534      108         GGATGGAGGAGGGGTCT
hsa-miR-4538      109         GAGCTTGGATGAGCTGGGCTGA
hsa-miR-4539      110         GCTGAACTGGGCTGAGCTGGGC
hsa-miR-4689      111         TTGAGGAGACATGGTGGGGGCC
hsa-miR-4690-5p   112         GAGCAGGCGAGGCTGGGCTGAA
hsa-miR-4739      113         AAGGGAGGAGGAGCGGAGGGGCCCT
hsa-miR-5001-5p   114         AGGGCTGGACTCAGCGGCGGAGCT
hsa-miR-5100      115         TTCAGATCCCAGCGGTGCCTCT
hsa-miR-5684      116         AACTCTAGCCTGAGCAACAG
hsa-miR-5698      117         TGGGGGAGTGCAGTGATTGTGG
hsa-miR-5739      118         GCGGAGAGAGAATGGGGAGC
hsa-miR-6076      119         AGCATGACAGAGGAGAGGTGG
hsa-miR-6086      120         GGAGGTTGGGAAGGGCAGAG
hsa-miR-6127      121         TGAGGGAGTGGGTGGGAGG
MID-00078         122         AAGTGATTGGAGGTGGGTGGGG
MID-00321         123         CCTGTCTGAGCGACGCT
MID-00387         124         GAGACTCTCCTGTGCAG
MID-00671         125         TGCAGATTGTGGGTGGGAGGAC
MID-00672         126         TGCAGCTGGTGGAGTCTGGGGG
MID-00690         127         TGGAGAAGACTGGAGAGGGTAT
MID-15965         128         ACTACCCCAGGATGCCAGCATAGTT
MID-16318         129         AGCTGGTTTGATGGGGAGCCAT
MID-17144         130         CACTGATTATCGAGGCGATTCT
MID-17866         131         CGCCTGTGAATAGTCACTGCAC
MID-18468         132         GACGTGAGGGGGTGCTACATAC
MID-19433         133         GGCTGGTCCGAAGGTAGTGAGTT
MID-19434         134         GGCTGGTCCGAGTGCAGTGGTGTTT
MID-23168         135         TGTCCAAAGTAAACGCCCTGACGCA
MID-23794         136         TTCCCGGCCAATGCATTA
MID-24496         137         TTTGGAGGGGCCGTGACAGATG
MID-24705         138         CTCCCACTGCTTCACTTGACTA
MD2-495           139         NGGGCCGAGGGAGCGAGAG (1)
MD2-437           140         AGUGCUUGGCUGAGGAGCU
hsa-let-7a-5p     141         TGAGGTAGTAGGTTGTATAGTT
hsa-let-7b-5p     142         TGAGGTAGTAGGTTGTGTGGTT
hsa-let-7c-5p     143         TGAGGTAGTAGGTTGTATGGTT
hsa-let-7d-5p     144         AGAGGTAGTAGGTTGCATAGTT
hsa-let-7f-5p     145         TGAGGTAGTAGATTGTATAGTT
hsa-let-7g-5p     146         TGAGGTAGTAGTTTGTACAGTT
hsa-let-7i-5p     147         TGAGGTAGTAGTTTGTGCTGTT
hsa-miR-103a-2-5p 148         AGCTTCTTTACAGTGCTGCCTTG
hsa-miR-103a-3p   149         AGCAGCATTGTACAGGGCTATGA
hsa-miR-106a-5p   150         AAAAGTGCTTACAGTGCAGGTAGC
hsa-miR-106b-5p   151         TAAAGTGCTGACAGTGCAGAT
hsa-miR-107       152         AGCAGCATTGTACAGGGCTATCA
hsa-miR-125a-5p   153         TCCCTGAGACCCTTTAACCTGTGA
hsa-miR-144-3p    154         TACAGTATAGATGATGTACT
hsa-miR-149-5p    155         TCTGGCTCCGTGTCTTCACTCCC
hsa-miR-151a-5p   156         TCGAGGAGCTCACAGTCTAGTA
hsa-miR-15b-5p    157         TAGCAGCACATCATGGTTTACA
hsa-miR-16-1-3p   158         CCAGTATTAACTGTGCTGCTGA
hsa-miR-16-5p     159         TAGCAGCACGTAAATATTGGCG
hsa-miR-17-3p     160         ACTGCAGTGAAGGCACTTGTAG
hsa-miR-17-5p     161         CAAAGTGCTTACAGTGCAGGTAGT
hsa-miR-185-5p    162         TGGAGAGAAAGGCAGTTCCTGA
hsa-miR-191-5p    163         CAACGGAATCCCAAAAGCAGCTG
hsa-miR-192-3p    164         CTGCCAATTCCATAGGTCACAG
hsa-miR-19b-3p    165         TGTGCAAATCCATGCAAAACTGA
hsa-miR-20a-5p    166         TAAAGTGCTTATAGTGCAGGTAG
hsa-miR-25-3p     167         CATTGCACTTGTCTCGGTCTGA
hsa-miR-26a-5p    168         TTCAAGTAATCCAGGATAGGCT
hsa-miR-26b-5p    169         TTCAAGTAATTCAGGATAGGT
hsa-miR-30b-5p    170         TGTAAACATCCTACACTCAGCT
hsa-miR-30c-5p    171         TGTAAACATCCTACACTCTCAGC
hsa-miR-30d-5p    172         TGTAAACATCCCCGACTGGAAG
hsa-miR-320a      173         AAAAGCTGGGTTGAGAGGGCGAA
hsa-miR-3648      174         AGCCGCGGGGATCGCCGAGGG
hsa-miR-376a-3p   175         ATCATAGAGGAAAATCCACGT
hsa-miR-425-5p    176         AATGACACGATCACTCCCGTTGA
hsa-miR-4306      177         TGGAGAGAAAGGCAGTA
hsa-miR-4324      178         CCCTGAGACCCTAACCTTAA
hsa-miR-484       179         TCAGGCTCAGTCCCCTCCCGAT
hsa-miR-624-5p    180         TAGTACCAGTACCTTGTGTTCA
hsa-miR-92a-3p    181         TATTGCACTTGTCCCGGCCTGT
hsa-miR-93-5p     182         CAAAGTGCTGTTCGTGCAGGTAG

(1) “N” may be any one of G, C, A, T/U.
miR name is the miRBase registry name (release 20), except for the miR names represented by MID-[numeral] or MD2-[numeral].
MID-00078, MID-00321, MID-00387, MID-00671, MID-00672, MID-00690, MID-15965, MID-16318, MID-17144, MID-17866, MID-18468, MID-19433, MID-19434, MID-23168, MID-23794, MID-24496, MID-24705, MD2-495 and MD2-437 are putative microRNAs, which were predicted and/or cloned at Rosetta Genomics.
- The nucleic acid may also comprise a miR hairpin sequence shown in Table 2, or a variant thereof.
-
TABLE 2 Hairpins of the microRNAs of the invention Hairpin SEQ ID miR name NO. Hairpin Sequence hsa-mir-7 183 GTGGACCGGCTGGCCCCATCTGGAAGACTAGTGATTTTGTTGTTGTCTTA CTGCGCTCAACAACAAATCCCAGTCTACCTAATGGTGCCAGCCATCGC hsa-mir-10a 184 GTCTTCTGTATATACCCTGTAGATCCGAATTTGTGTAAGGAATTTTGTGG TCACAAATTCGTATCTAGGGGAATATGTAGTTGAC hsa-mir-18a 185 GTTCTAAGGTGCATCTAGTGCAGATAGTGAAGTAGATTAGCATCTACTGC CCTAAGTGCTCCTTCTGGC hsa-mir-21 186 GTACCACCTTGTCGGGTAGCTTATCAGACTGATGTTGACTGTTGAATCTC ATGGCAACACCAGTCGATGGGCTGTCTGACATTTTGGTAT hsa-mir-23a 187 GGCCGGCTGGGGTTCCTGGGGATGGGATTTGCTTCCTGTCACAAATCACA TTGCCAGGGATTTCCAACCGACC hsa-mir-30e 188 GGCAGTCTTTGCTACTGTAAACATCCTTGACTGGAAGCTGTAAGGTGTTC AGAGGAGCTTTCAGTCGGATGTTTACAGCGGCAGGCTGCC hsa-mir-31 189 GGAGAGGAGGCAAGATGCTGGCATAGCTGTTGAACTGGGAACCTGCTATG CCAACATATTGCCATCTTTCC hsa-mir-34a 190 GTGAGTGTTTCTTTGGCAGTGTCTTAGCTGGTTGTTGTGAGCAATAGTAA GGAAGCAATCAGCAAGTATACTGCCCTAGAAGTGCTGCAC hsa-mir-92b 191 GGGGAGCGGGATCCCGGGCCCCGGGCGGGCGGGAGGGACGGGACGCGGTG CAGTGTTGTTTTTTCCCCCGCCAATATTGCACTCGTCCCGGCCTCCGGCC CCCCCGGCCCCCCGGCCTCCCCGCTACCCC hsa-mir-96 192 TCTGCTTGGCCGATTTTGGCACTAGCACATTTTTGCTTGTGTCTCTCCGC TCTGAGCAATCATGTGCAGTGCCAATATGGGAAAAGCAGG hsa-mir-100 193 GCCTGTTGCCACAAACCCGTAGATCCGAACTTGTGGTATTAGTCCGCACA AGCTTGTATCTATAGGTATGTGTCTGTTAGGC hsa-mir-126 194 GCTGGCGACGGGACATTATTACTTTTGGTACGCGCTGTGACACTTCAAAC TCGTACCGTGAGTAATAATGCGCCGTCCACGGC hsa-mir-125b-1 195 TGCGCTCCTCTCAGTCCCTGAGACCCTAACTTGTGATGTTTACCGTTTAA ATCCACGGGTTAGGCTCTTGGGAGCTGCGAGTCGTGCT hsa-mir-125b-2 196 ACCAGACTTTTCCTAGTCCCTGAGACCCTAACTTGTGAGGTATTTTAGTA ACATCACAAGTCAGGCTCTTGGGACCTAGGCGGAGGGGA hsa-mir-138-1 197 TGGTGTGGTGGGGCAGCTGGTGTTGTGAATCAGGCCGTTGCCAATCAGAG AACGGCTACTTCACAACACCAGGGCCACACCACACTA 198 CCCTGGCATGGTGTGGTGGGGCAGCTGGTGTTGTGAATCAGGCCGTTGCC AATCAGAGAACGGCTACTTCACAACACCAGGGCCACACCACACTACAGG hsa-mir-138-2 199 CGTTGCTGCAGCTGGTGTTGTGAATCAGGCCGACGAGCAGCGCATCCTCT TACCCGGCTATTTCACGACACCAGGGTTGCATCA 200 GAGGAAGCCGGCGGAGTTCTGGTATCGTTGCTGCAGCTGGTGTTGTGAAT CAGGCCGACGAGCAGCGCATCCTCTTACCCGGCTATTTCACGACACCAGG 
GTTGCATCATACCCATCCTCTCCAGGCGAGCCTC hsa-mir-140 201 GCGCCCTGTGTGTGTCTCTCTCTGTGTCCTGCCAGTGGTTTTACCCTATG GTAGGTTACGTCATGCTGTTCTACCACAGGGTAGAACCACGGACAGGATA CCGGGGCACCCTCTGCGT hsa-mir-141 202 GTCGGCCGGCCCTGGGTCCATCTTCCAGTACAGTGTTGGATGGTCTAATT GTGAAGCTCCTAACACTGTCTGGTAAAGATGGCTCCCGGGTGGGTTCTCT CGGC hsa-mir-142 203 ACAGTGCAGTCACCCATAAAGTAGAAAGCACTACTAACAGCACTGGAGGG TGTAGTGTTTCCTACTTTATGGATGAGTGTACTGT hsa-mir-146b 204 CCTGGCACTGAGAACTGAATTCCATAGGCTGTGAGCTCTAGCAATGCCCT GTGGACTCAGTTCTGGTGCCCGG hsa-mir-146a 205 GTATCCTCAGCTTTGAGAACTGAATTCCATGGGTTGTGTCAGTGTCAGAC CTCTGAAATTCAGTTCTTCAGCTGGGATAT hsa-mir-148a 206 GGTCTTTTGAGGCAAAGTTCTGAGACACTCCGACTCTGAGTATGATAGAA GTCAGTGCACTACAGAACTTTGTCTCTAGAGGCT hsa-mir-150 207 TCCCCATGGCCCTGTCTCCCAACCCTTGTACCAGTGCTGGGCTCAGACCC TGGTACAGGCCTGGGGGACAGGGACCTGGGGA hsa-mir-152 208 GTCCCCCCCGGCCCAGGTTCTGTGATACACTCCGACTCGGGCTCTGGAGC AGTCAGTGCATGACAGAACTTGGGCCCGGAAGGAC 209 TGTCCCCCCCGGCCCAGGTTCTGTGATACACTCCGACTCGGGCTCTGGAG CAGTCAGTGCATGACAGAACTTGGGCCCGGAAGGACC hsa-mir-155 210 TAGGCTGTATGCTGTTAATGCTAATCGTGATAGGGGTTTTTGCCTCCAAC TGACTCCTACATATTAGCATTAACAGTGTATGATGCCTG hsa-mir-181a 211 GGTTGCTTCAGTGAACATTCAACGCTGTCGGTGAGTTTGGAATTAAAATC AAAACCATCGACCGTTGATTGTACCCTATGGCTAACC hsa-mir-181b 212 GGTCACAATCAACATTCATTGCTGTCGGTGGGTTGAACTGTGTGGACAAG CTCACTGAACAATGAATGCAACTGTGGCC hsa-mir-181c 213 CGGAAAATTTGCCAAGGGTTTGGGGGAACATTCAACCTGTCGGTGAGTTT GGGCAGCTCAGGCAAACCATCGACCGTTGAGTGGACCCTGAGGCCTGGAA TTGCCATCCT hsa-mir-182 214 CCTCCCCCCGTTTTTGGCAATGGTAGAACTCACACTGGTGAGGTAACAGG ATCCGGTGGTTCTAGACTTGCCAACTATGGGGCGAGG hsa-mir-187 215 CCTCGGGCTACAACACAGGACCCGGGCGCTGCTCTGACCCCTCGTGTCTT GTGTTGCAGCCGGAGG hsa-mir-193a 216 GGGAGCTGAGGGCTGGGTCTTTGCGGGCGAGATGAGGGTGTCGGATCAAC TGGCCTACAAAGTCCCAGTTCTCGGCCCC hsa-mir-195 217 CCTGGCTCTAGCAGCACAGAAATATTGGCACAGGGAAGCGAGTCTGCCAA TATTGGCTGTGCTGCTCCAGG hsa-mir-197 218 TGTGCTCTGGGGGCTGTGCCGGGTAGAGAGGGCAGTGGGAGGTAAGAGCT CTTCACCCTTCACCACCTTCTCCACCCAGCATGGCCGGCACA hsa-mir-199a 219 GGCCCCGCCAACCCAGTGTTCAGACTACCTGTTCAGGAGGCTCTCAATGT 
GTACAGTAGTCTGCACATTGGTTAGGCTGGGCT hsa-mir-200a 220 GAGCATCTTACCGGACAGTGCTGGATTTCCCAGCTTGACTCTAACACTGT CTGGTAACGATGTTC hsa-mir-200b 221 GCTCGGGCAGCCGTGGCCATCTTACTGGGCAGCATTGGATGGAGTCAGGT CTCTAATACTGCCTGGTAATGATGACGGCGGAGCCCTGC hsa-mir-200c 222 GGGCGGGGGCCCTCGTCTTACCCAGCAGTGTTTGGGTGCGGTTGGGAGTC TCTAATACTGCCGGGTAATGATGGAGGCCCCTGTCC 223 CCCTCGTCTTACCCAGCAGTGTTTGGGTGCGGTTGGGAGTCTCTAATACT GCCGGGTAATGATGGAGG hsa-mir-199a 224 GGCCCCGCCAACCCAGTGTTCAGACTACCTGTTCAGGAGGCTCTCAATGT GTACAGTAGTCTGCACATTGGTTAGGCTGGGCT hsa-mir-199b 225 GTCTACCCAGTGTTTAGACTATCTGTTCAGGACTCCCAAATTGTACAGTA GTCTGCACATTGGTTAGGC hsa-mir-205 226 TCCATGTGCTTCTCTTGTCCTTCATTCCACCGGAGTCTGTCTCATACCCA ACCAGATTTCAGTGGAGTGAAGTTCAGGAGGCATGGA hsa-mir-210 227 CCAGGCGCAGGGCAGCCCCTGCCCACCGCACACTGCGCTGCCCCAGACCC ACTGTGCGTGTGACAGCGGCTGATCTGTGCCTGG hsa-mir-214 228 GGCTGGACAGAGTTGTCATGTGTCTGCCTGTCTACACTTGCTGTGCAGAA CATCCGCTCACCTGTACAGCAGGCACAGACAGGCAGTCACATGACAACCC AGCC hsa-mir-221 229 GAACATCCAGGTCTGGGGCATGAACCTGGCATACAATGTAGATTTCTGTG TTCGTTAGGCAACAGCTACATTGTCTGCTGGGTTTCAGGCTACCTGGAAA CATGTTC hsa-mir-222 230 CAGCTGCTGGAAGGTGTAGGTACCCTCAATGGCTCAGTAGCCAGTGTAGA TCCTGTCTTTCGTAATCAGCAGCTACATCTGGCTACTGGGTCTCTGATGG CATCTTCTAGCTTCTG 231 GCTGCTGGAAGGTGTAGGTACCCTCAATGGCTCAGTAGCCAGTGTAGATC CTGTCTTTCGTAATCAGCAGCTACATCTGGCTACTGGGTCTCTGATGGCA TCTTCTAGCT hsa-mir-223 232 GCTCTTGGCCTGGCCTCCTGCAGTGCCACGCTCCGTGTATTTGACAAGCT GAGTTGGACACTCCATGTGGTAGAGTGTCAGTTTGTCAAATACCCCAAGT GCGGCACATGCTTACCAGCTCTAGGCCAGGGC hsa-mir-224 233 GGGGCTTTCAAGTCACTAGTGGTTCCGTTTAGTAGATGATTGTGCATTGT TTCAAAATGGTGCCCTAGTGACTACAAAGCCCC hsa-mir-342 234 GTGAAACTGGGCTCAAGGTGAGGGGTGCTATCTGTGATTGAGGGACATGG TTAATGGAATTGTCTCACACAGAAATCGCACCCGTCACCTTGGCCTACTT ATCAC 235 GAAACTGGGCTCAAGGTGAGGGGTGCTATCTGTGATTGAGGGACATGGTT AATGGAATTGTCTCACACAGAAATCGCACCCGTCACCTTGGCCTACTTA hsa-mir-345 236 ACCCAAACCCTAGGTCTGCTGACTCCTAGTCCAGGGCTCGTGATGGCTGG TGGGCCCTGAACGAGGGGTCTGGAGGCCTGGGTTTGAATATCGACAGC hsa-mir-346 237 GGTCTCTGTGTTGGGCGTCTGTCTGCCCGCATGCCTGCCTCTCTGTTGCT 
CTGAAGGAGGCAGGGGCTGGGCCTGCAGCTGCCTGGGCAGAGCGG hsa-mir-375 238 CGCTCCCGCCCCGCGACGAGCCCCTCGCACAAACCGGACCTGAGCGTTTT GTTCGTTCGGCTCGCGTGAGGCAGGGGCG 239 CCCCGCGACGAGCCCCTCGCACAAACCGGACCTGAGCGTTTTGTTCGTTC GGCTCGCGTGAGGC hsa-mir-424 240 CGAGGGGATACAGCAGCAATTCATGTTTTGAAGTGTTCTAAATGGTTCAA AACGTGAGGCGCTGCTATACCCCCTCGTGGGGAAGGTAGAAGGTGGGG hsa-mir-429 241 GATGGGCGTCTTACCAGACATGGTTAGACCTGGCCCTCTGTCTAATACTG TCTGGTAAAACCGTCCATC hsa-mir-455 242 GGCGTGAGGGTATGTGCCTTTGGACTACATCGTGGAAGCCAGCACCATGC AGTCCATGGGCATATACACTTGCCTCAAGGCC hsa-mir-483 243 ACCCCAAGGTGGAGCCCCCAGCGACCTTCCCCTTCCAGCTGAGCATTGCT GTGGGGGAGAGGGGGAAGACGGGAGGAAAGAAGGGAGTGGTTCCATCACG CCTCCTCACTCCTCTCCTCCCGTCTTCTCCTCTCCTGCCCTTGTCTCCCT GTCTCAGCAGCTCCAGGGGTGGTGTGGGCCCCTCCAGCCTCCTAGGTGGT hsa-mir-487b 244 GTGCTAACCTTTGGTACTTGGAGAGTGGTTATCCCTGTCCTGTTCGTTTT GCTCATGTCGAATCGTACAGGGTCATCCACTTTTTCAGTATCAAGAGCGC hsa-mir-486 245 CTGATCTCCATCCTCCCTGGGGCATCCTGTACTGAGCTGCCCCGAGGCCC TTCATGCTGCCCAGCTCGGGGCAGCTCAGTACAGGATACTCGGGGTGGGA GTCAGCAGGAGGTGAG 246 GCATCCTGTACTGAGCTGCCCCGAGGCCCTTCATGCTGCCCAGCTCGGGG CAGCTCAGTACAGGATAC hsa-mir-486-2 247 TCCTGTACTGAGCTGCCCCGAGCTGGGCAGCATGAAGGGCCTCGGGGCAG CTCAGTACAGGATG hsa-mir-497 248 CGGTCCTGCTCCCGCCCCAGCAGCACACTGTGGTTTGTACGGCACTGTGG CCACGTCCAAACCACACTGTGGTGTTAGAGCGAGGGTGGGGGAGGCACCG hsa-mir-513a 249 GGGATGCCACATTCAGCCATTCAGCGTACAGTGCCTTTCACAGGGAGGTG TCATTTATGTGAACTAAAATATAAATTTCACCTTTCTGAGAAGGGTAATG TACAGCATGCACTGCATATGTGGTGTCCC hsa-mir-542 250 GGATGCACAGATCTCAGACATCTCGGGGATCATCATGTCACGAGATACCA GTGTGCACTTGTGACAGATTGATAACTGAAAGGTCTGGGAGCCACTCATCT hsa-mir-551b 251 TGCCAGATGTGCTCTCCTGGCCCATGAAATCAAGCGTGGGTGAGACCTGG TGCAGAACGGGAAGGCGACCCATACTTGGTTTCAGAGGCTGTGAGAATAA CTGCA 252 AGATGTGCTCTCCTGGCCCATGAAATCAAGCGTGGGTGAGACCTGGTGCA GAACGGGAAGGCGACCCATACTTGGTTTCAGAGGCTGTGAGAATAA hsa-mir-574 253 GGGACCTGCGTGGGTGCGGGCGTGTGAGTGTGTGTGTGTGAGTGTGTGTC GCTCCGGGTCCACGCTCATGCACACACCCACACGCCCACACTCAGG hsa-mir-625 254 TGGTAAGGGTAGAGGGATGAGGGGGAAAGTTCTATAGTCCTGTAATTAGA TCTCAGGACTATAGAACTTTCCCCCTCATCCCTCTGCCCTCTACCA 
hsa-mir-650 255 TCTCAGGAGGCAGCGCTCTCAGGACGTCACCACCATGGCCTGGGCTCTGC TCCTCCTCA hsa-mir-658 256 CTCGGTTGCCGTGGTTGCGGGCCCTGCCCGCCCGCCAGCTCGCTGACAGC ACGACTCAGGGCGGAGGGAAGTAGGTCCGTTGGTCGGTCGGGAACGAG hsa-mir-664b 257 GTTCAGTCCAGGGCAGCTTCCCTGTTCTGTTAATTAAACTTTGGGACATT AAAATGGGCTAAGGGAGATGATTGGGTAGAAAGTATTATTCTATTCATTT GCCTCCCAGCCTACAAAAATGCCTGCTTGGGGTCTAATACTTCAACGGTT AAAGATGCCTGGAAGAGGGC hsa-mir-708 258 GGTAACTGCCCTCAAGGAGCTTACAATCTAGCTGGGGGTAAATGACTTGC ACATGAACACAACTAGACTGTGAGCTTCTAGAGGGCAGGGACC hsa-mir-765 259 TTAGGCGCTGATGAAAGTGGAGTTCAGTAGACAGCCCTTTTCAAGCCCTA CGAGAAACTGGGGTTTCTGGAGGAGAAGGAAGGTGATGAAGGATCTGTTC TCGTGAGCCTGA hsa-mir-1229 260 GTGGGTAGGGTTTGGGGGAGAGCGTGGGCTGGGGTTCAGGGACACCCTCT CACCACTGCCCTCCCACAG hsa-mir-2392 261 TGGTCCCTCCCAATCCAGCCATTCCTCAGACCAGGTGGCTCCCGAGCCAC CCCAGGCTGTAGGATGGGGGTGAGAGGTGCTA hsa-mir-3074 262 GCTCGACTCCTGTTCCTGCTGAACTGAGCCAGTGTGTAAAATGAGAACTG ATATCAGCTCAGTAGGCACCGGAGGGCGGGT hsa-mir-3141 263 CCCGGTGAGGGCGGGTGGAGGAGGAGGGTCCCCACCATCAGCCTTCACTG GGACGGG hsa-mir-3162 264 AAGTTAATTTTGAAGCTGACTTTTTTAGGGAGTAGAAGGGTGGGGAGCAT GAACAATGTTTCTCACTCCCTACCCCTCCACTCCCCAAAAAAGTCAGCTT CTCTTGTTAACTT hsa-mir-3679 265 GGCCCCACGTGGTGAGGATATGGCAGGGAAGGGGAGTTTCCCTCTATTCC CTTCCCCCCAGTAATCTTCATCATGCGGTGTC hsa-mir-3687 266 GCGCGTGCGCCCGAGCGCGGCCCGGTGGTCCCTCCCGGACAGGCGTTCGT GCGACGTGT hsa-mir-3940 267 GAGGAAAAGATCGAGGTGGGTTGGGGCGGGCTCTGGGGATTTGGTCTCAC AGCCCGGATCCCAGCCCACTTACCTTGGTTACTCTCCTT hsa-mir-4270 268 CAAATAGCTTCAGGGAGTCAGGGGAGGGCAGAAATAGATGGCCTTCCCCT GCTGGGAAGAAAGTG hsa-mir-4284 269 TTCTGTGAGGGGCTCACATCACCCCATCAAAGTGGGGACTCATGGGGAGA GGGGGTAGTTAGGAGCTTTGATAGAG hsa-mir-4443 270 GGTGGGGGTTGGAGGCGTGGGTTTTAGAACCTATCCCTTTCTAGCCCTGA GCA hsa-mir-4447 271 GTTCTAGAGCATGGTTTCTCATCATTTGCACTACTGATACTTGGGGTCAG ATAATTGTTTGTGGTGGGGGCTGTTGTTTGCATTGTAGGAT hsa-mir-4448 272 GGAGTGACCAAAAGACAAGAGTGCGAGCCTTCTATTATGCCCAGACAGGG CCACCAGAGGGCTCCTTGGTCTAGGGGTAATGCC hsa-mir-4454 273 CCGGATCCGAGTCACGGCACCAAATTTCATGCGTGTCCGTGTGAAGAGAC CACCA hsa-mir-4534 274 
GTGAATGACCCCCTTCCAGAGCCAAAATCACCAGGGATGGAGGAGGGGTC TTGGGTAC hsa-mir-4538 275 AACTGGGCTGGGCTGAACTGGGCTGGGCTGAGCTGAGCTTGGATGAGCTG GGCTGAACTGGGCTGGGTTGAGCTGGGCTGGGCTGAGTTGAGCCAGGCTG ATCTGGGCTGAGCCGAGCTGGGTTAAGCCGAGCTGGGTT hsa-mir-4539 276 GGCTGGGCTGGGCTGGGCTCTGCTGTGCTGTGCTGAACAGGGCTGAGCTG AACTGAGCTGAGCTGGGCTGAGCTGGGCTCTGCTGTGCTGTGCTGAGCAG GGCTGAGCTGAACTGGGCTGAGCTGGGCTGAGCTGGGCTGAGTTGAGCAG AGCTGGGTTGAGCAGAGCTGGGCTGGGCTGGGCTGAGTTGAGCC hsa-mir-4689 277 CGGTTTCTCCTTGAGGAGACATGGTGGGGGCCGGTCAGGCAGCCCATGCC ATGTGTCCTCATGGAGAGGCCG hsa-mir-4690 278 GGCAGGTGAGCAGGCGAGGCTGGGCTGAACCCGTGGGTGAGGAGTGCAGC CCAGCTGAGGCCTCTGCTGTCTTATCTGTC hsa-mir-4739 279 GTGGGCAGGGGAGGAAGAAGGGAGGAGGAGCGGAGGGGCCCTTGTCTTCC CAGAGCCTCTCCCTTCCTCCCCTCCCCCTCCCTCTGCTCAT hsa-mir-5001 280 GGGCGGCTGCGCAGAGGGCTGGACTCAGCGGCGGAGCTGGCTGCTGGCCT CAGTTCTGCCTCTGTCCAGGTCCTTGTGACCCGCCC hsa-mir-5100 281 CTGGGGGTAGGAGCGTGGCTTCTGGAGCTAGACCACATGGGTTCAGATCC CAGCGGTGCCTCTAACTG hsa-mir-5684 282 GAGCTATGATTGTGTAGCTGAACTCTAGCCTGAGCAACAGAGTGAGATGG TCTTGTTTTGTTGCCCAGGCTGGAGTCCAGTGTCAAGATCATGGCTC hsa-mir-5698 283 GAGCTCCAAATCTGTGCACCTGGGGGAGTGCAGTGATTGTGGAATGCAAA GTCCCACAATCACTGTACTCCCCAGGTGCACAGATTCTCTCTC hsa-mir-5701-1 284 GATTGGACTTTATTGTCACGTTCTGATTGGTTAGCCTAAGACTTGTTCTG ATCCAATCAGAACATGAAAATAACGTCCAATC hsa-mir-5701-2 285 GATTGGACTTTATTGTCACGTTCTGATTGGTTAGCCTAAGACTTGTTCTG ATCCAATCAGAACATGAAAATAACGTCCAATC hsa-mir-5739 286 TTGGCTATAACTATCATTTCCAAGGTTGTGCTTTTAGGAAATGTTGGCTG TCCTGCGGAGAGAGAATGGGGAGCCAG hsa-mir-6076 287 AGCATGACAGAGGAGAGGTGGAGGTAGGCGAGAGTAATATAATTTCTCCA GGAGAACATCTGAGAGGGGAAGTTGCTTTCCTGCCCTGGCCCTTTCACCC TCCTGAGTTTGGG hsa-mir-6086 288 AGGAGGTTGGGAAGGGCAGAGATGAGCATAAAGTTTTTGCCTTGTTTTTC TTTTT hsa-mir-6127 289 AAGATGAGGGAGTGGGTGGGAGGTGGGAAGGCTGCCCCAAATGGCCTCTA ACATCCCTTCCAGTCTCCTCCTCCTCCTCCTCCTTCTTCTT MID-00078 290 TATGTACCCGGAGCCAAAAGTGATTGGAGGTGGGTGGGGTTAATGAATAG ACAAGTGTTAAAACTAAAAGTCACGTCTCTCTCTCCTTCCTCCTCAGTTT TGGCTTGATTTTTCATG MID-00321 291 CTTACCTAGAAATTGTTGCCTGTCTGAGCGACGCTTCAAACTCAGCTTCA GCAGGTCTGCAGGGACATCAGGTAGG 
MID-00387 292 GTGTCTCTGTGTTTGCAGGTGTCCAGTGTGAGGTGCAGCTGGTGGAGTCT GGGGGAGGCTTGGTACAGCCTGGGGGATCCCTGAGACTCTCCTGTGCAGC CTCTGGATTCACCTTCAGTAACAGTGACAT MID-00671 293 GTCAGCCTGCAATTAGTGAAATGGAGGCACACATGCTGGTTTGCAGATTG TGGGTGGGAGGAC MID-00672 294 GTGTCTCTGTGTTTGCAGGTGTCCAGTGTGAGGTGCAGCTGGTGGAGTCT GGGGGAGGCTTGGTACAGCCTGGGGGATCCCTGAGACTCTCCTGTGCAGC CTCTGGATTCACCTTCAGTAACAGTGACAT MID-00690 295 GGCCTTGGATGGAGAAGACTGGAGAGGGTATGGAAGTGCTTGGACGTAGG ACATCTGCCTCTCTGGTCTTTGTCCATCCCACAGGGCC MID-15965 296 AGCTGGTTGGCATTCTGGCCCTGGTTCATGCCAACTCTTGTGTTGACTAC CCCAGGATGCCAGCATAGTTG MID-16318 297 CTGCCAAAGAGCAGCAAGATGAGCTGGTTTGATGGGGAGCCATCCCTTGA TGAGGAGAACCCTTCCCACTCTCACTCAGCCTCACCCAGCTGCCCTGAGG CAG MID-17144 298 GCTCAGAAGTGATGAATTGATCAGATAGACGAGGCCGGGCTTGTCCCCGG CCACTGATTATCGAGGCGATTCTGATCTGGGC MID-17866 299 GCTGGGTGCAGTAGCTTATGTCTGTAGTCCCAGCTACTTGGGAGGCTGAG GTGGGAGGATCACCTGAGGTCAGGAGTTTGGGTCTGCCGTGAGCTGTGAT TGCGCCTGTGAATAGTCACTGCACTCCAGC MID-18468 300 GACGTGAGGGGGTGCTACATACAGCAGCTGTGTGTAGTATGTGCCTTTCT CTGTT MID-19433 301 TAGGAATTCTGGACCAGGCTTAAAAGACTGGGATGAGGCTGGTCCGAAGG TAGTGAGTTATCTCCATTGATAGTTCAGTCTGTAACAGATCAAACTCCTT GTTCTACTCTTTTTTTTTTTTTTAGACAGA MID-19434 302 TGGGCTGGTCCGAGTGCAGTGGTGTTTACAAGTATTTGATTATAACTAGT TACAGATTTCTTTGTTTCCTTCTCCACTCCCACTGCCTCACTTGACTGGC CTA MID-23168 303 GCTCTGTCCAAAGTAAACGCCCTGACGCACTGTGGGAAGGGTGAGATGGG CACCGC MID-23794 304 GTGAGTGGGAGGGGGGCTGCAGCCCAAAGAGGCAACAAAGGCCCTTCCCG GCCAATGCATTAC MID-24496 305 TGTCCTCAGGCCTGCTACTGATCCTGCAGCCAGAAGTTCCAGAAAGTGAA GGGATTTGGAGGGGCCGTGACAGATGCAGGTGCCCTCAACATCCTTGCCC TGTCACCCCCTGCCCAGAATTTGCTACTTAAATGGTACTTCTCTGAAGAA GATGAGGAGGAAGGGGACA MID-24705 306 ACAGAATTCCTCTTCTCCCTTCTCCTATAACCTGTTTTATTTAATTAATT AATTTTTTAGGCTAGTCAAGTGAAGCAGTGGGAGTGGAAGGAACAAAGAA ATCTGT MD2-495 307 UGAGCUCUGCGGCGCCAAGGGACCGAGGGGCCGAGGGAGCGAGAG MD2-437 308 AGUGCUUGGCUGAGGAGCUGGGGCCAAGGGGGAACACAAAUAUGGUCCUG ACCCUACAUUCCCAGCCCUGCCUCU - It is to be noted that SEQ ID NOs.183-306 in Table 2 present the cDNA corresponding to the sequence of the naturally 
occurring pre-miR, i.e., the sequences present thymine (T) instead of uracil (U).
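- As an illustration of the note above, the following minimal Python sketch converts a cDNA-style entry of Table 2 into the corresponding RNA sequence. The function name `cdna_to_rna` is a hypothetical helper introduced here for illustration only; the example sequence is hsa-mir-4443 (SEQ ID NO: 270) as printed in Table 2.

```python
def cdna_to_rna(cdna: str) -> str:
    """Convert a cDNA-style sequence (with T) to its RNA form (with U)."""
    # Drop the spaces that the table uses to group bases, and normalize case.
    seq = "".join(cdna.split()).upper()
    unexpected = set(seq) - set("ACGT")
    if unexpected:
        raise ValueError(f"unexpected characters: {sorted(unexpected)}")
    return seq.replace("T", "U")


# hsa-mir-4443 (SEQ ID NO: 270) as printed in Table 2
pre_mir_4443 = "GGTGGGGGTTGGAGGCGTGGGTTTTAGAACCTATCCCTTTCTAGCCCTGA GCA"
print(cdna_to_rna(pre_mir_4443))
```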
- The nucleic acid may be in the form of a nucleic acid complex, and may further comprise one or more of the following: a peptide, a protein, a RNA-DNA hybrid, an antibody, an antibody fragment, a Fab fragment, or an aptamer.
- The nucleic acid may also comprise a sequence of a pre-miRNA or a variant thereof. The pre-miRNA sequence may comprise from 45-90, 60-80 or 60-70 nucleotides. The sequence of the pre-miRNA may comprise a miRNA and a miRNA* as set forth herein. The sequence of the pre-miRNA may comprise the sequence of SEQ ID NOS: 183-308 or variants thereof.
- As described herein, the nucleic acid may be at least 70%, 75%, 80%, 85%, 90%, 95%, 97%, 98% or 99% identical to the nucleic acid sequences in Tables 1 or 2 (with increments of 1% from 80 to 99%), over a region of 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 30, 35, 40, 45, 50 or more nucleotides.
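- The identity criterion above can be sketched as follows. This is a simplified, ungapped comparison written for illustration; the helper names `percent_identity` and `best_window_identity` are assumptions introduced here, and in practice a sequence alignment program would normally be used instead.

```python
def percent_identity(a: str, b: str) -> float:
    """Percentage of identical positions between two equal-length sequences."""
    if not a or len(a) != len(b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a.upper(), b.upper()))
    return 100.0 * matches / len(a)


def best_window_identity(query: str, reference: str, window: int) -> float:
    """Highest ungapped identity between any `window`-long region of the
    query and any `window`-long region of the reference."""
    if window < 1 or window > min(len(query), len(reference)):
        raise ValueError("window must fit inside both sequences")
    best = 0.0
    for i in range(len(query) - window + 1):
        for j in range(len(reference) - window + 1):
            identity = percent_identity(query[i:i + window],
                                        reference[j:j + window])
            best = max(best, identity)
    return best


# Two made-up 10-nucleotide sequences that differ at the last position.
print(percent_identity("ACGTACGTAC", "ACGTACGTAA"))         # 9 of 10 match
print(best_window_identity("ACGTACGTAC", "ACGTACGTAA", 8))  # best 8-nt region
```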
- The nucleic acid may also comprise a sequence of a microRNA (including a miRNA*) or a variant thereof, including those putative microRNAs represented by MID-[numeral]. As referred to herein, microRNAs include those miRs which have been listed in the miRBase registry (release 20), as well as putative microRNAs which have been predicted and/or cloned by Rosetta Genomics and which are represented by MID-[numeral]. The microRNA sequence may comprise from 13-33, 18-24 or 21-23 nucleotides. The microRNA may also comprise a total of at least 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39 or 40 nucleotides. The sequence of the microRNA may be the first 13-33 nucleotides of the pre-miRNA. The sequence of the microRNA may also be the last 13-33 nucleotides of the pre-miRNA. The sequence of the microRNA may comprise the sequence of any one of SEQ ID NOS: 1-182 or a variant thereof. The present invention employs microRNAs for the identification, classification and diagnosis of thyroid nodules.
- “Variant”, as used herein referring to a nucleic acid, means (i) a portion of a referenced nucleotide sequence; (ii) the complement of a referenced nucleotide sequence or portion thereof; (iii) a nucleic acid that differs from the referenced nucleotide sequence by a point-mutation or the complement thereof; (iv) a naturally-occurring variant of the referenced nucleotide sequence present in the general population or the complement thereof; or (v) a nucleic acid that hybridizes under stringent conditions to the referenced nucleic acid, or the complement thereof.
- “Probe”, as used herein, means an oligonucleotide capable of binding to a target nucleic acid of complementary sequence through one or more types of chemical bonds, usually through complementary base pairing, usually through hydrogen bond formation. Probes may bind target sequences lacking complete complementarity with the probe sequence depending upon the stringency of the hybridization conditions. For example, for hybridization assays, the probe may be complementary to at least 8, at least 9, at least 10, at least 11, at least 12, at least 13, at least 14, at least 15, at least 16, at least 17, at least 18, at least 19, at least 20 contiguous nucleotides of the sequence of the microRNA being detected. Alternatively, for PCR assays, the probe may be complementary to at least 8, at least 9, at least 10, at least 11, at least 12, at least 13, at least 14, at least 15, at least 16, at least 17, at least 18, at least 19, at least 20 contiguous nucleotides of the sequence of the PCR product being detected.
- Thus, a probe may be complementary to, or may hybridize to at least 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 97%, 98% or 99% of its target nucleic acid.
- A probe may be single-stranded or partially single- and partially double-stranded. The strandedness of the probe is dictated by the structure, composition and properties of the target sequence. Probes may include a label, an attachment, or a nucleotide sequence that does not naturally occur in a nucleic acid described herein. Probes may be directly labeled or indirectly labeled such as with biotin to which a streptavidin complex may bind.
- “Probe” may be an agent for detecting the nucleic acid sequences described herein. A probe may be a labeled nucleic acid probe capable of hybridizing to a portion of the nucleic acid sequence of the invention, or amplification products derived therefrom. In some embodiments, the nucleic acid probe is a reverse complementary nucleic acid molecule of the nucleic acid sequence disclosed herein. A probe may be a nucleic acid sequence which sufficiently specifically hybridizes under stringent conditions to the nucleic acid disclosed herein. A probe is optionally labeled with a fluorescent molecule such as a fluorescein, e.g. 6-carboxyfluorescein (FAM), an indocarbocyanine, e.g. QUASAR-670 (QUA), a hexafluorocine, such as 6-carboxyhexafluorescein (HEX), or other fluorophore molecules and optionally a quencher. A quencher is appreciated to be matched to a fluorophore. Illustrative examples of a quencher include the black hole quenchers BHQ1 and BHQ2, or minor groove binders (MGB), e.g. dihydrocyclopyrroloindole tripeptide. Other fluorophores and quenchers are known in the art and are similarly operable herein.
- Thus, the present invention also provides a probe, said probe comprising the novel nucleic acid sequences described herein, defined by any one of SEQ ID NOs. 27-29, 33, 34, 139, 140, 307 and 308, or variants thereof. Probes may be used for screening and diagnostic methods. The probe may be attached or immobilized to a solid substrate, such as a biochip. The probe may have a length of from 8 to 500, 10 to 100 or 20 to 60 nucleotides. The probe may have a length of at least 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 35, 40, 45, 50, 60, 70, 80, 90, 100, 120, 140, 160, 180, 200, 220, 240, 260, 280 or 300 nucleotides. The probe may further comprise a linker sequence of from 10-60 nucleotides. The probe may further comprise a linker. The linker may comprise a sequence that does not occur naturally in a nucleic acid described herein. The linker may be 10-60 nucleotides in length. The linker may be 20-27 nucleotides in length. The linker may be of sufficient length to allow the probe to be a total length of 45-60 nucleotides. The linker may not be capable of forming a stable secondary structure, or may not be capable of folding on itself, or may not be capable of folding on a non-linker portion of a nucleic acid contained in the probe. The sequence of the linker is heterogeneous, and it may not appear in the genome of the animal from which the probe non-linker nucleic acid is derived.
- As used herein, the term “reference value” means a value that statistically correlates to a particular outcome when compared to an assay result. In one embodiment, the reference value is determined from statistical analysis of studies that compare microRNA expression with known clinical outcomes. In another embodiment, the reference value may vary according to the classifier (i.e. the algorithm) used. Hence, the reference value may be the expression levels (or values) of all the microRNAs in the training data. The reference value may be one or more thresholds established by the classifier. The reference value may further be a coefficient or set of coefficients. Essentially the reference value refers to any parameter needed or used by the algorithm.
- “Sensitivity”, as used herein, may mean a statistical measure of how well a classification test correctly identifies a condition or conditions, for example, how frequently it correctly classifies a cancer into the correct type out of two or more possible types. In a binary classification test, sensitivity means how frequently the test correctly classifies a cancer into the correct type out of two possible types. The sensitivity for class A is the proportion of cases that are determined to belong to class “A” by the test out of the cases that are in class “A”, as determined by some absolute or gold standard.
- “Smear”, as used herein, refers to a sample of thyroid tissue spread thinly on a microscope slide for examination, typically for medical diagnosis. Smears from FNAs usually have very small amounts of cells, which results in small amounts of RNA, which may range from 1-1000 ng, 1-100 ng, 1-50 ng, 1-40 ng, 5-100 ng, 5-50 ng, or 5-40 ng. Smears may be stained with any stain known to the man skilled in the art of cytology, histology or pathology, such as any stain used to differentiate cells in pathologic specimens. Examples of stains are multichromatic stains, like Papanicolaou, which are a combination of nuclear stain and cytoplasm stain; cellular structure stains, such as Wright, Giemsa, Romanowsky and the like; nuclear stains, such as Hoechst stains and the like; cell viability stains, such as Trypan blue and the like; and enzyme activity stains, such as benzidine for HRP, which forms a visible precipitate, and the like.
- “Specificity”, as used herein, may mean a statistical measure of how well a binary classification test correctly identifies cases that do not have a specific condition, for example, how frequently it correctly classifies a sample as non-cancer when indeed it is a non-cancerous sample. The specificity for class A is the proportion of cases that are determined to belong to class “not A” by the test out of the cases that are in class “not A”, as determined by some absolute or gold standard.
- “Specificity”, as used herein, may mean a statistical measure of how well a classification test correctly identifies cases that do not have a specific condition. The specificity for class A is the proportion of cases that are determined by the test not to belong to class A out of the cases that are not in class A, as determined by some absolute or gold standard.
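- The definitions of sensitivity and specificity given above can be sketched as a short computation. The function name `sensitivity_specificity` and the class labels in the example are assumptions introduced for illustration; the counts in the example are made up and are not results of the invention.

```python
def sensitivity_specificity(truth, predicted, positive):
    """Return (sensitivity, specificity) for the class `positive`.

    `truth` holds the gold-standard labels and `predicted` the labels
    assigned by the test; every other label counts as "not positive".
    """
    tp = fn = tn = fp = 0
    for actual, called in zip(truth, predicted):
        if actual == positive:
            if called == positive:
                tp += 1  # case in class A, called A
            else:
                fn += 1  # case in class A, called not A
        else:
            if called == positive:
                fp += 1  # case not in class A, called A
            else:
                tn += 1  # case not in class A, called not A
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one case in each class")
    return tp / (tp + fn), tn / (tn + fp)


# Made-up example: 4 malignant and 6 benign cases by the gold standard.
truth = ["malignant"] * 4 + ["benign"] * 6
predicted = (["malignant"] * 3 + ["benign"]        # one malignant case missed
             + ["benign"] * 5 + ["malignant"])     # one benign case over-called
sens, spec = sensitivity_specificity(truth, predicted, "malignant")
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```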
- As used herein, the term “stage of cancer” refers to a numerical measurement of the level of advancement of a cancer. Criteria used to determine the stage of a cancer include, but are not limited to, the size of the tumor, whether the tumor has spread to other parts of the body and where the cancer has spread (e.g., within the same organ or region of the body or to another organ).
- “Stringent hybridization conditions”, as used herein, mean conditions under which a first nucleic acid sequence (e.g., probe) will hybridize to a second nucleic acid sequence (e.g., target), such as in a complex mixture of nucleic acids. Stringent conditions are sequence-dependent and will be different in different circumstances. Stringent conditions may be selected to be about 5-10° C. lower than the thermal melting point (Tm) for the specific sequence at a defined ionic strength and pH. The Tm may be the temperature (under defined ionic strength, pH and nucleic acid concentration) at which 50% of the probes complementary to the target hybridize to the target sequence at equilibrium (as the target sequences are present in excess, at Tm, 50% of the probes are occupied at equilibrium). Stringent conditions may be those in which the salt concentration is less than about 1.0 M sodium ion, such as about 0.01-1.0 M sodium ion concentration (or other salts) at pH 7.0 to 8.3 and the temperature is at least about 30° C. for short probes (e.g., about 10-50 nucleotides) and at least about 60° C. for long probes (e.g., greater than about 50 nucleotides). Stringent conditions may also be achieved with the addition of destabilizing agents such as formamide. For selective or specific hybridization, a positive signal may be at least 2 to 10 times background hybridization. Exemplary stringent hybridization conditions include the following: 50% formamide, 5×SSC, and 1% SDS, incubating at 42° C., or, 5×SSC, 1% SDS, incubating at 65° C., with wash in 0.2×SSC, and 0.1% SDS at 65° C., DMSO, 6×SSPE+0.005% N-Lauroylsarcosine+0.005% Triton X-102, 0.06×SSPE+0.005% N-Lauroylsarcosine+0.005% Triton X-102.
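- The relationship between Tm and a stringent hybridization temperature can be sketched as follows. The Tm estimate uses the Wallace rule, Tm ≈ 2(A+T) + 4(G+C) in °C, which is an assumption brought in here for illustration: it is a rough approximation for short oligonucleotides, it is not taken from the present disclosure, and it ignores ionic strength, pH and formamide. The function names are hypothetical.

```python
def wallace_tm(oligo: str) -> float:
    """Rough melting temperature (deg C) of a short oligonucleotide,
    using the Wallace rule: 2 deg per A/T(U) and 4 deg per G/C."""
    seq = oligo.upper()
    at = seq.count("A") + seq.count("T") + seq.count("U")
    gc = seq.count("G") + seq.count("C")
    if not seq or at + gc != len(seq):
        raise ValueError("sequence must contain only A, C, G, T or U")
    return 2.0 * at + 4.0 * gc


def stringent_window(tm: float, low_offset: float = 5.0,
                     high_offset: float = 10.0) -> tuple:
    """Temperature range lying about 5-10 deg C below the given Tm."""
    return (tm - high_offset, tm - low_offset)


# Made-up 16-nucleotide probe with 8 A/T and 8 G/C positions.
tm = wallace_tm("ACGTACGTACGTACGT")
print(tm, stringent_window(tm))
```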
- As used herein, the term “subject” refers to a mammal, including both human and other mammals. The methods of the present invention are preferably applied to human subjects.
- As used herein, the term “subtype of cancer” refers to different types of cancer that affect the same organ (e.g., papillary, follicular carcinoma and follicular variant papillary carcinoma of the thyroid).
- “Thyroid lesion” as used herein, may mean a thyroid tumor, including sub-types of thyroid tumors, such as Hashimoto disease, follicular carcinoma, papillary carcinoma, follicular variant of papillary carcinoma (FVPC or FVPTC), encapsulated FVPC (or encapsulated FVPTC), non-encapsulated (infiltrative/diffuse) FVPC or FVPTC, medullary carcinoma, anaplastic thyroid cancer, or poorly differentiated thyroid cancer.
- As used herein, the phrase “threshold expression profile” refers to a criterion expression profile to which measured values are compared in order to classify a tumor.
- As used herein, a tissue sample is tissue obtained from a tissue biopsy using methods well known to those of ordinary skill in the related medical arts. The phrase “suspected of being cancerous”, as used herein, means a cancer tissue sample believed by one of ordinary skill in the medical arts to contain cancerous cells. Methods for obtaining the sample from the biopsy include gross apportioning of a mass, microdissection, laser-based microdissection, or other art-known cell-separation methods.
- “Tumor”, as used herein, refers to all neoplastic cell growth and proliferation, whether malignant or benign, and all pre-cancerous and cancerous cells and tissues. The cytological classification of the thyroid lesions or tumor samples used herein is based on “The Bethesda System for Reporting Thyroid Cytopathology”, the “BSRTC” (Syed, Z. Ali and Edmund S. Cibas, eds.; DOI 10.1007/978-0-387-87666-5_1; Springer Science+Business Media, LLC 2010). The BSRTC recommends that each thyroid FNA report be accompanied by a general diagnostic category, in which each category has an implied cancer risk.
- The recommended nomenclature for the Bethesda categories is as follows:
- I. Nondiagnostic or Unsatisfactory
- Cyst fluid only
- Virtually acellular specimen
- Other (obscuring blood, clotting artifact, etc.)
- II. Benign
- Consistent with a benign follicular nodule (includes adenomatoid nodule, colloid nodule, etc.)
- Consistent with lymphocytic (Hashimoto) thyroiditis in the proper clinical context
- Consistent with granulomatous (subacute) thyroiditis
- Other
- III. Atypia of Undetermined Significance or Follicular Lesion of Undetermined Significance
- IV. Follicular Neoplasm or Suspicious for a Follicular Neoplasm
- Specify if Hurthle cell (oncocytic) type
- V. Suspicious for Malignancy
- Suspicious for papillary carcinoma
- Suspicious for medullary carcinoma
- Suspicious for metastatic carcinoma
- Suspicious for lymphoma
- Other
- VI. Malignant
- Papillary thyroid carcinoma
- Poorly differentiated carcinoma
- Medullary thyroid carcinoma
- Undifferentiated (anaplastic) carcinoma
- Squamous cell carcinoma
- Carcinoma with mixed features
- Metastatic carcinoma
- Non-Hodgkin lymphoma
- Other
- As used herein, “Indeterminate” refers to thyroid lesions or tumor samples examined for cytology and classified according to the Bethesda classification in categories III, IV and V.
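- The definition of “Indeterminate” above amounts to a simple membership test on the Bethesda category, which can be sketched as follows. The names `BETHESDA_CATEGORIES`, `INDETERMINATE_CATEGORIES` and `is_indeterminate` are hypothetical and introduced for illustration only.

```python
# The six Bethesda categories, identified by their Roman numerals.
BETHESDA_CATEGORIES = ("I", "II", "III", "IV", "V", "VI")

# Categories defined above as "Indeterminate".
INDETERMINATE_CATEGORIES = frozenset({"III", "IV", "V"})


def is_indeterminate(category: str) -> bool:
    """Return True if a Bethesda category is III, IV or V."""
    numeral = category.strip().upper()
    if numeral not in BETHESDA_CATEGORIES:
        raise ValueError(f"not a Bethesda category: {category!r}")
    return numeral in INDETERMINATE_CATEGORIES


for numeral in BETHESDA_CATEGORIES:
    print(numeral, is_indeterminate(numeral))
```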
- The present invention further provides a method for identifying subtypes of thyroid lesions in a subject, said subtypes of thyroid lesions being said subtypes of malignant or benign thyroid tumor. Subtype is any one of follicular carcinoma, papillary carcinoma, follicular variant of papillary carcinoma (FVPC or FVPTC), encapsulated FVPC (or encapsulated FVPTC), non-encapsulated FVPC (or non-encapsulated FVPTC), medullary carcinoma, anaplastic thyroid cancer or poorly differentiated thyroid cancer.
- In another further embodiment, said subtype is any one of Hashimoto thyroiditis, follicular adenoma or hyperplasia.
- In another further embodiment, said subtype is Hurthle cell carcinoma.
- In another aspect, the present invention provides a method for distinguishing between follicular adenoma and follicular carcinoma.
- In another further aspect, the present invention provides a method for distinguishing follicular adenoma from papillary carcinoma.
- In another further aspect, the present invention provides a method for distinguishing follicular adenoma from follicular variant of papillary carcinoma.
- In another further aspect, the present invention provides a method for distinguishing non-encapsulated follicular variant of papillary carcinoma from benign lesions.
- In another further aspect, the present invention provides a method for distinguishing between papillary carcinoma and Hashimoto thyroiditis.
- “Vector” refers to any known vector such as a plasmid vector, a phage vector, a phagemid vector, a cosmid vector, or a virus vector. The nucleic acid described herein may be comprised in a vector. The vector may be used for delivery of the nucleic acid. The vector preferably contains at least a promoter that enhances expression of the nucleic acid carried, and in this case the nucleic acid is preferably operably linked to such a promoter. The vector may or may not be replicable in a host cell, and the transcription of a gene may be carried out either outside the nucleus or within the nucleus of a host cell. In the latter case, the nucleic acid may be incorporated into the genome of a host cell. A vector may be a DNA or RNA vector. A vector may be either a self-replicating extrachromosomal vector or a vector that integrates into a host genome.
- In one embodiment of the method or protocol of the invention, the levels of microRNAs are measured by reverse transcription polymerase chain reaction (RT-PCR). Target sequences of a cDNA are generated by reverse transcription of a target RNA, which may be a nucleic acid described herein (comprising a sequence provided in Tables 1 and 2). Known methods for generating cDNA involve reverse transcribing either polyadenylated RNA or alternatively, RNA with a ligated adaptor sequence.
- RNA may be ligated to an adaptor sequence prior to reverse transcription. A ligation reaction may be performed by T4 RNA ligase to ligate an adaptor sequence at the 3′ end of the RNA. Reverse transcription (RT) reaction may then be performed using a primer comprising a sequence that is complementary to the 3′ end of the adaptor sequence.
- Alternatively, polyadenylated RNA may be used in a reverse transcription (RT) reaction using a poly(T) primer comprising a 5′ adaptor sequence. The poly(T) sequence may comprise 8, 9, 10, 11, 12, 13, or 14 consecutive thymines.
- The reverse transcript of the RNA may then be amplified by real-time PCR, using a specific forward primer comprising at least 15 nucleic acids complementary to the target nucleic acid and a 5′ tail sequence; a reverse primer that is complementary to the 3′ end of the adaptor sequence; and a probe comprising at least 8 nucleic acids complementary to the target nucleic acid. The probe may be partially complementary to the 5′ end of the adaptor sequence.
- The amplification of the reverse transcripts of the target nucleic acids (microRNAs, including herein described putative microRNAs) may be by PCR or the like. The first cycles of the PCR reaction may have an annealing temperature of 56° C., 57° C., 58° C., 59° C., or 60° C. The first cycles may comprise 1-10 cycles. The remaining cycles of the PCR reaction may have an annealing temperature of 60° C. The remaining cycles may comprise 2-40 cycles.
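- The two-phase cycling described above can be sketched as a per-cycle list of annealing temperatures. The function name `annealing_schedule` and its parameters are assumptions introduced for illustration; the permitted ranges are those stated above, and the 60° C. value of the remaining cycles is read here as an annealing temperature.

```python
def annealing_schedule(first_temp: int, first_cycles: int,
                       remaining_cycles: int,
                       remaining_temp: int = 60) -> list:
    """Return the annealing temperature (deg C) of each PCR cycle."""
    if first_temp not in (56, 57, 58, 59, 60):
        raise ValueError("first-phase annealing temperature must be 56-60")
    if not 1 <= first_cycles <= 10:
        raise ValueError("first phase must comprise 1-10 cycles")
    if not 2 <= remaining_cycles <= 40:
        raise ValueError("remaining phase must comprise 2-40 cycles")
    return [first_temp] * first_cycles + [remaining_temp] * remaining_cycles


# Illustrative choice: 5 cycles at 58 deg C, then 35 cycles at 60 deg C.
schedule = annealing_schedule(first_temp=58, first_cycles=5,
                              remaining_cycles=35)
print(len(schedule), schedule[:6])
```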
- The PCR reaction comprises a forward primer. In one embodiment, the forward primer may comprise 15, 16, 17, 18, 19, 20, or 21 nucleotides identical to the target nucleic acid. The 3′ end of the forward primer may be sensitive to differences in sequence between a target nucleic acid and highly similar sequences.
- The forward primer may also comprise a 5′ overhanging tail. The 5′ tail may increase the melting temperature of the forward primer. The sequence of the 5′ tail may comprise a sequence that is non-identical to the target nucleic acid. The sequence of the 5′ tail may also be synthetic. The 5′ tail may comprise 8, 9, 10, 11, 12, 13, 14, 15, or 16 nucleotides. Examples of forward primers used in the invention are provided in Table 8.
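- A forward primer of the kind described above can be sketched as a synthetic 5′ tail followed by a stretch identical to the target. In this sketch the identical stretch is taken from the 5′ end of the microRNA, which is an assumption; the function name `forward_primer`, the microRNA sequence and the tail sequence are all made up for illustration and are not sequences of the invention.

```python
def forward_primer(mirna: str, tail: str, identical_len: int = 18) -> str:
    """Build a DNA forward primer: 5' tail + target-identical stretch."""
    target_dna = mirna.upper().replace("U", "T")  # primers are DNA
    if not 15 <= identical_len <= 21:
        raise ValueError("identical stretch must be 15-21 nucleotides")
    if identical_len > len(target_dna):
        raise ValueError("identical stretch is longer than the target")
    if not 8 <= len(tail) <= 16:
        raise ValueError("5' tail must be 8-16 nucleotides")
    # Assumption: the identical stretch starts at the 5' end of the target.
    return tail.upper() + target_dna[:identical_len]


# Made-up 22-nt microRNA and made-up 12-nt synthetic tail.
primer = forward_primer("ACGUUGCAUCGGAUCCAUGGCA", "CAGTCATTTGGC")
print(primer, len(primer))
```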
- The PCR reaction comprises a reverse primer. The reverse primer may be complementary to a target nucleic acid. The reverse primer may also comprise a sequence complementary to an adaptor sequence. Examples of reverse primers used in the invention are provided in Example 8.
- The probes used to detect products of RT-PCR amplification may be general probes or sequence-specific probes. General probes are designed to detect (or hybridize with) RT-PCR amplification products in a non-sequence-specific manner. Said probes are between 16 and 20 nucleotides long, preferably 18 nucleotides long, and comprise a sequence which is the reverse complement of the RT primer, including 4 adenines (As) at the 5′ end. Sequence-specific probes are designed to detect (or hybridize with) RT-PCR amplification products based on total or partial complementarity between the sequence of the probe and the sequence of the RT-PCR product. Said probes are between 20 and 28 nucleotides long, preferably 24 nucleotides long, and comprise at the 5′ end three nucleotides, of which at least two are complementary to the RT primer, followed by between 10 and 14, preferably 12, thymines (Ts), followed by between 6 and 10, preferably 8, contiguous nucleotides which correspond to the reverse complementary sequence of the specific corresponding microRNA.
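- The layout of a sequence-specific probe described above (three 5′ nucleotides, a run of thymines, and a stretch reverse-complementary to the microRNA) can be sketched as follows. The sketch assumes that the reverse-complementary stretch is taken from the 3′ end of the microRNA and that the three 5′ nucleotides are supplied by the caller; both are assumptions, and the function names and the example sequences are made up for illustration.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(dna: str) -> str:
    """Reverse complement of a DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(dna.upper()))


def sequence_specific_probe(mirna: str, prefix: str,
                            n_thymines: int = 12, n_specific: int = 8) -> str:
    """Assemble prefix + poly(T) + reverse complement of the microRNA 3' end."""
    if len(prefix) != 3:
        raise ValueError("the 5' prefix must be three nucleotides")
    if not 10 <= n_thymines <= 14:
        raise ValueError("the thymine run must be 10-14 nucleotides")
    if not 6 <= n_specific <= 10:
        raise ValueError("the specific stretch must be 6-10 nucleotides")
    mirna_dna = mirna.upper().replace("U", "T")
    # Assumption: the specific stretch covers the 3'-terminal nucleotides.
    specific = reverse_complement(mirna_dna[-n_specific:])
    return prefix.upper() + "T" * n_thymines + specific


# Made-up 22-nt microRNA and made-up three-nucleotide prefix.
probe = sequence_specific_probe("ACGUUGCAUCGGAUCCAUGGCA", "GCA")
print(probe, len(probe))
```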
- A biochip comprising novel nucleic acids described herein is provided. In one embodiment, the biochip may comprise probes that recognize the novel nucleic acids described herein. Said nucleic acids are isolated nucleic acids comprising at least 12 contiguous nucleotides at least 80% identical to the sequence of any one of SEQ ID NOs. 27-29, 33, 34, 139, 140, 307 and 308. In one embodiment, said isolated nucleic acid comprises at least 13, at least 14, at least 15, at least 16, at least 17, at least 18, at least 19, or at least 20 contiguous nucleotides identical to the sequence of any one of SEQ ID NOs. 27-29, 33, 34, 139, 140, 307 and 308. The biochip may comprise a solid substrate comprising an attached nucleic acid, probe or plurality of probes described herein. The probes may be capable of hybridizing to a target sequence under stringent hybridization conditions. The probes may be attached at spatially defined addresses on the substrate. More than one probe per target sequence may be used, with either overlapping probes or probes to different sections of a particular target sequence. The probes may be capable of hybridizing to target sequences associated with a single disorder appreciated by those in the art. The probes may either be synthesized first, with subsequent attachment to the biochip, or may be directly synthesized on the biochip.
- The solid substrate may be a material that may be modified to contain discrete individual sites appropriate for the attachment or association of the probes and is amenable to at least one detection method. Representative examples of substrates include glass and modified or functionalized glass, plastics (including acrylics, polystyrene and copolymers of styrene and other materials, polypropylene, polyethylene, polybutylene, polyurethanes, Teflon™, etc.), polysaccharides, nylon or nitrocellulose, resins, silica or silica-based materials including silicon and modified silicon, carbon, metals, inorganic glasses and plastics. The substrates may allow optical detection without appreciably fluorescing.
- The substrate may be planar, although other configurations of substrates may be used as well. For example, probes may be placed on the inside surface of a tube, for flow-through sample analysis to minimize sample volume. Similarly, the substrate may be flexible, such as flexible foam, including closed cell foams made of particular plastics.
- The biochip and the probe may be derivatized with chemical functional groups for subsequent attachment of the two. For example, the biochip may be derivatized with a chemical functional group including, but not limited to, amino groups, carboxyl groups, oxo groups or thiol groups. Using these functional groups, the probes may be attached using functional groups on the probes either directly or indirectly using a linker. The probes may be attached to the solid support by either the 5′ terminus, 3′ terminus, or via an internal nucleotide.
- The probe may also be attached to the solid support non-covalently. For example, biotinylated oligonucleotides can be made, which may bind to surfaces covalently coated with streptavidin, resulting in attachment. Alternatively, probes may be synthesized on the surface using techniques such as photopolymerization and photolithography.
- In a further embodiment of the invention, measuring the microRNAs for classification of thyroid lesions may be effected by high throughput sequencing. High throughput sequencing can involve sequencing-by-synthesis, sequencing-by-ligation, and ultra-deep sequencing. Sequencing-by-synthesis can be initiated using sequencing primers complementary to the sequencing element on the nucleic acid tags. The method involves detecting the identity of each nucleotide immediately after (substantially real-time) or upon (real-time) the incorporation of a labeled nucleotide or nucleotide analog into a growing strand of a complementary nucleic acid sequence in a polymerase reaction. After the successful incorporation of a labeled nucleotide, a signal is measured and then nulled by methods known in the art. Examples of sequencing-by-synthesis methods are known in the art, and are described for example in U.S. Pat. Nos. 7,056,676, 8,802,368 and 7,169,560, the contents of which are incorporated herein by reference. Examples of labels that can be used to label nucleotide or nucleotide analogs for sequencing-by-synthesis include, but are not limited to, chromophores, fluorescent moieties, enzymes, antigens, heavy metal, magnetic probes, dyes, phosphorescent groups, radioactive materials, chemiluminescent moieties, scattering or fluorescent nanoparticles, Raman signal generating moieties, and electrochemical detection moieties. Sequencing-by-synthesis can generate at least 1,000, at least 5,000, at least 10,000, at least 20,000, at least 30,000, at least 40,000, at least 50,000, at least 100,000 or at least 500,000 reads per hour. Such reads can have at least 40, at least 45, at least 50, at least 60, at least 70, at least 80, at least 90, at least 100, at least 120 or at least 150 bases per read.
- Sequencing-by-synthesis may be performed on a solid surface (or a chip) using fold-back PCR and anchored primers. Since microRNAs occur as small nucleic acid fragments, adaptors are added to the 5′ and 3′ ends of the fragments. Nucleic acid fragments that are attached to the surface of flow cell channels are extended and bridge amplified. The fragments become double stranded, and the double stranded molecules are denatured. Multiple cycles of the solid-phase amplification followed by denaturation can create several million clusters of approximately 1,000 copies of single-stranded nucleic acid molecules of the same template in each channel of the flow cell. Primers, polymerase and four fluorophore-labeled, reversibly terminating nucleotides are used to perform sequential sequencing. After nucleotide incorporation, a laser is used to excite the fluorophores, an image is captured and the identity of the first base is recorded. The 3′ terminators and fluorophores from each incorporated base are removed and the incorporation, detection and identification steps are repeated. This technology is used, for example, in the Illumina® sequencing platform.
- Another sequencing method involves hybridizing the amplified regions to a primer complementary to the sequence element in an LST (a file listing the names of fasta files). This hybridization complex is incubated with a polymerase, ATP sulfurylase, luciferase, apyrase, and the substrates luciferin and adenosine 5′ phosphosulfate. Next, deoxynucleotide triphosphates corresponding to the bases A, C, G, and T (U) are added sequentially. Each base incorporation is accompanied by release of pyrophosphate, converted to ATP by sulfurylase, which drives synthesis of oxyluciferin and the release of visible light. Since pyrophosphate release is equimolar with the number of incorporated bases, the light given off is proportional to the number of nucleotides adding in any one step. The process is repeated until the entire sequence is determined. Yet another sequencing method involves a four-color sequencing by ligation scheme (degenerate ligation), which involves hybridizing an anchor primer to one of four positions. Then an enzymatic ligation reaction of the anchor primer to a population of degenerate nonamers that are labeled with fluorescent dyes is performed. At any given cycle, the population of nonamers that is used is structure such that the identity of one of its positions is correlated with the identity of the fluorophore attached to that nonamer. To the extent that the ligase discriminates for complementarily at that queried position, the fluorescent signal allows the inference of the identity of the base. After performing the ligation and four-color imaging, the anchor primer:nonamer complexes are stripped and a new cycle begins. Methods to image sequence information after performing ligation are known in the art. In some cases, high throughput sequencing involves the use of ultra-deep sequencing, such as described in Marguiles et al., Nature 437 (7057): 376-80 (2005).
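- The light-based method described at the start of the preceding paragraph, in which nucleotides are added sequentially and the signal is proportional to the number of bases incorporated in a step, can be sketched as an idealized simulation. The sketch is noise-free and works directly on the strand being synthesized; the function names `flowgram` and `decode` and the A, C, G, T flow order are assumptions introduced for illustration.

```python
def flowgram(synthesized: str, flow_order: str = "ACGT",
             max_cycles: int = 100) -> list:
    """Idealized signal per nucleotide flow: (base, bases incorporated)."""
    seq = synthesized.upper()
    if not seq or set(seq) - set(flow_order):
        raise ValueError("sequence must be non-empty and use only flowed bases")
    signals, pos = [], 0
    for _ in range(max_cycles):
        for base in flow_order:
            run = 0
            # A whole homopolymer run is incorporated in a single flow,
            # so the signal equals the length of the run.
            while pos < len(seq) and seq[pos] == base:
                run += 1
                pos += 1
            signals.append((base, run))
            if pos == len(seq):
                return signals
    raise RuntimeError("sequence not completed within max_cycles")


def decode(signals: list) -> str:
    """Rebuild the sequence from the per-flow signals."""
    return "".join(base * count for base, count in signals)


signals = flowgram("AAGT")
print(signals, decode(signals))
```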
- MicroRNA sequencing (miRNA-seq) is a type of RNA Sequencing (RNA-Seq) which uses next-generation sequencing or massively parallel high-throughput DNA sequencing to sequence microRNAs. miRNA-seq differs from other forms of RNA-Seq in that input material is often enriched for small RNAs. miRNA-seq provides tissue specific expression patterns, which may lead to disease associations and microRNA isoforms. miRNA-seq is also used for the discovery of previously uncharacterized microRNAs, such as the nucleic acid sequences denoted by SEQ ID NOs 139-140 and 307-308.
- As used herein, the term “diagnosing” refers to classifying pathology, or a symptom, determining a severity of the pathology (grade or stage), monitoring pathology progression, forecasting an outcome of pathology and/or prospects of recovery.
- As used herein, the phrase “subject in need thereof” refers to a human subject who is known to have cancer, is at risk of having cancer (e.g., a genetically predisposed subject, a subject with medical and/or family history of cancer, a subject who has been exposed to carcinogens, occupational hazard, environmental hazard) and/or a subject who exhibits suspicious clinical signs of cancer (e.g., nodules in the thyroid). Additionally or alternatively, the subject in need thereof can be a healthy human subject undergoing a routine well-being check-up.
- Analyzing presence of malignant or pre-malignant cells can be effected in vivo or ex vivo, whereby a biological sample (e.g., biopsy) is retrieved. Such biopsy samples comprise cells and may be an incisional or excisional biopsy. The sample may be retrieved from the thyroid of the subject, and may be retrieved using FNA. Alternatively the cells may be retrieved from a complete resection.
- While employing the present teachings, additional information may be gleaned pertaining to the determination of treatment regimen, treatment course and/or to the measurement of the severity of the disease.
- As used herein, the phrase “treatment regimen” refers to a treatment plan that specifies the type of treatment, dosage, schedule and/or duration of a treatment provided to a subject in need thereof (e.g., a subject diagnosed with a pathology). The selected treatment regimen can be an aggressive one which is expected to result in the best clinical outcome (e.g., complete cure of the pathology) or a more moderate one which may relieve symptoms of the pathology yet results in incomplete cure of the pathology. It will be appreciated that in certain cases the treatment regimen may be associated with some discomfort to the subject or adverse side effects (e.g., damage to healthy cells or tissue). The type of treatment can include a surgical intervention (e.g., removal of lesion, diseased cells, tissue, or organ), a cell replacement therapy, an administration of a therapeutic drug (e.g., receptor agonists, antagonists, hormones, chemotherapy agents) in a local or a systemic mode, an exposure to radiation therapy using an external source (e.g., external beam) and/or an internal source (e.g., brachytherapy) and/or any combination thereof. The dosage, schedule and duration of treatment can vary, depending on the severity of pathology and the selected type of treatment, and those of skill in the art are capable of adjusting the type of treatment with the dosage, schedule and duration of treatment.
- A method of diagnosis is also provided. The method comprises detecting an expression level of a specific cancer-associated nucleic acid in a biological sample. Diagnosis of a specific cancer state in a patient may allow for prognosis and selection of therapeutic strategy. Further, the developmental stage of cells may be classified by determining temporally expressed specific cancer-associated nucleic acids.
- In situ hybridization of labeled probes to tissue sections or FNA smears may be performed. When comparing the fingerprints between individual samples, the skilled artisan can make a diagnosis, a prognosis, or a prediction based on the findings. It is further understood that the nucleic acid sequences which indicate the diagnosis may differ from those which indicate the prognosis, and molecular profiling of the condition of the cells may lead to distinctions between responsive or refractory conditions or may be predictive of outcomes.
- A kit is also provided and may comprise a nucleic acid described herein together with any or all of the following: assay reagents, buffers, probes and/or primers, and sterile saline or another pharmaceutically acceptable emulsion and suspension base. In addition, the kits may include instructional materials containing directions (e.g., protocols) for the practice of the methods described herein. The kit may further comprise a software package for data analysis of expression profiles.
- For example, the kit may be a kit for the amplification, detection, identification or quantification of a target nucleic acid sequence. The kit may comprise a poly (T) primer, a forward primer, a reverse primer, and a probe. Specifically, provided is a kit for thyroid tumor classification, said kit comprising: (a) probes for performing thyroid tumor classification, wherein said probes comprise any one of (i) DNA equivalents of microRNAs comprising at least one of SEQ ID NOs 1-37, (ii) the complements thereof, (iii) sequences at least 80% identical to (i) or (ii), (iv) a nucleic acid sequence that hybridizes with at least eight contiguous nucleotides of any one of SEQ ID NOs 1-37, or (v) a nucleic acid sequence that hybridizes with RT-PCR products; and optionally (b) an instruction manual for using said probes.
- Any of the compositions described herein may be comprised in a kit. In a non-limiting example, reagents for isolating microRNA, labeling microRNA, and/or evaluating a microRNA population using an array are included in a kit. The kit may further include reagents for creating or synthesizing microRNA probes. The kits will thus comprise, in suitable container means, an enzyme for labeling the microRNA by incorporating labeled nucleotide or unlabeled nucleotides that are subsequently labeled. It may also include one or more buffers, such as reaction buffer, labeling buffer, washing buffer, or a hybridization buffer, compounds for preparing the microRNA probes, components for in situ hybridization and components for isolating microRNA. Other kits of the invention may include components for making a nucleic acid array comprising microRNA, and thus, may include, for example, a solid support.
- The following examples are presented in order to more fully illustrate some embodiments of the invention. They should in no way be construed, however, as limiting the broad scope of the invention.
- 1. microRNA Analysis
- The presence and/or level of microRNAs in thyroid tumor samples may be evaluated using methods known in the art, e.g., Northern blot, RNA expression assays, e.g., microarray analysis, RT-PCR, high throughput sequencing (next generation sequencing), cloning, and quantitative real time polymerase chain reaction (qRT-PCR). Analytical techniques to determine RNA expression are known in the art, see e.g. Sambrook et al., Molecular Cloning: A Laboratory Manual, 3rd Ed., Cold Spring Harbor Press, Cold Spring Harbor, N.Y. (2001). Examples of specific methods used herein are described in more detail below.
- Total RNA was isolated from seven to ten 10 μm-thick tissue sections. Sections were incubated a few times (1-3 times) in xylene at 57° C. for 5 minutes in order to remove excess paraffin, followed by centrifugation at ambient temperature for 2 minutes at 10,000 g. The specimen was then washed several times (about 3 times) with 1 ml 100% ethanol in order to wash the xylene out of the tissue, followed by centrifugation at ambient temperature for 10 minutes at 10,000 g. The supernatant was discarded and the tissue dried at 65° C. for 5 minutes. Proteins were degraded by proteinase K solution (5-12 μl Proteinase K (e.g., Sigma or ABI) in 500 μl of Buffer B (10 mM NaCl, 500 mM Tris pH 7.5, 20 mM EDTA pH 8, 1% SDS)) at 45° C. for a few hours (about 16 hours). Proteinase K was inactivated by incubation at 95° C. for 7 minutes. After the tubes were chilled, 10 μl of RNA synthetic spikes was added (e.g., 2 spikes of 0.15 fmol/μl). RNA was extracted using an equal volume of acid phenol/chloroform, vortexing, followed by centrifugation at 4° C. for 15 minutes at 12000 g. RNA was then precipitated using 8 μl linear acrylamide, 0.1 volumes of 3M NaOAc pH 5.2, and 3 volumes of absolute ethanol, for 30 minutes to 16 hours, followed by centrifugation at 4° C. for at least 40 minutes at 20000 g (14,000 rpm). The pellet was washed by adding 1 ml cold 85% ethanol. DNAses were introduced at 37° C. for 60 minutes to digest DNA (e.g. 10 μl Turbo™ DNase), followed by extraction using acid phenol/chloroform and ethanol precipitation as described above.
- Total RNA was isolated from FNA smear samples in slides, either non-stained or stained (e.g. by Papanicolaou, Giemsa or Diff-Quick) after removal of the coverslip (when present) by dipping the slides for several hours (about 2-20 hours, usually about 16 hours) in xylene at ambient temperature, in order to remove excess paraffin or glue. Further the slides were washed several times (about 3 times) with 100% ethanol in order to wash the xylene out. Slides were dipped for 1 minute in double-distilled water (DDW). The cells were scraped from the slide using a scalpel. The slide was then washed with 500 μl buffer B (10 mM NaCl, 500 mM Tris pH 7.5, 20 mM EDTA pH 8, 1% SDS), and transferred to a 1.7 ml tube. Proteins were degraded by proteinase K (e.g., 5-12 μl Sigma or ABI) at 45° C. for a few hours (about 16 hours). Proteinase K was inactivated by incubating the tubes at 95° C. for 7 minutes. After chilling the tubes, 10 μl of RNA synthetic spikes (e.g., 2 spikes of 0.15 fmol/μl) was added. RNA was extracted using acid phenol/chloroform equal volume, vortexing, spinning down at 4° C. for 15 minutes at 12000 g. RNA was then precipitated using 8 μl linear acrylamide, 0.1 volumes of 3M NaOAc pH 5.2, and 3 volumes of absolute ethanol from 30 minutes to 16 hours. The tubes were then spun down at 4° C. for at least 40 minutes at 20000 g (14,000 rpm). The pellet was washed with about 1 ml 85% cold ethanol. DNAses were introduced at 37° C. for 60 minutes to digest DNA (e.g. 10 μl Turbo™ DNase, Ambion, Life Technologies), followed by extraction using acid phenol/chloroform and ethanol precipitation as described above.
- Total RNA quantification was performed by fluorospectrometry in a NanoDrop 3300 (ND3300) fluorospectrometer using the RiboGreen® dye (Thermo Fisher Scientific®, Wilmington, Del.). The ND3300 RNA detection range is 25 ng/ml-1000 ng/ml when using a high concentration of RiboGreen® dye (1:200 dilution), and 5 ng/ml-50 ng/ml when using a 1:2000 dilution of RiboGreen® dye. The RNA amounts which were determined by ND3300 were highly correlated to the detected expressed microRNA.
- Custom microarrays (Agilent Technologies, Santa Clara, Calif.) were generated by printing DNA oligonucleotide probes to: 2172 miRs sequences, 17 negative controls, 23 spikes, and 10 positive controls (total of 2222 probes). Each microRNA probe, printed in triplicate, carried up to 28-nucleotide (nt) linker at the 3′ end of the microRNAs' complement sequence. Negative spikes and positive probes were printed from 3 to 200 times. Seventeen (17) negative control probes were designed using sequences that do not match the genome. Two groups of positive control probes were designed to hybridize to the microRNA array: (i) synthetic small RNAs were spiked to the RNA before labeling to verify the labeling efficiency; and (ii) probes for abundant small RNA, e.g., small nuclear RNAs (U43, U24, Z30, U6, U48, U44), 5.8s and 5s ribosomal RNA were spotted on the array to verify RNA quality.
- 5. Cy-Dye Labeling of microRNA for Microarray
- Total RNA (20-1000 ng) was labeled by ligation (Thomson et al. Nature Methods 2004; 1:47-53) with an RNA linker, p-rCrU-Cy/dye or several sequential Cys (BioSpring GmbH, IBA GmbH or equivalent), to the 3′ end with Cy3 or Cy5. The labeling reaction contained total RNA, spikes (0.1-100 fmoles), 250-400 ng RNA-linker-dye, 15% DMSO, 1× ligase buffer and 20 units of T4 RNA ligase (NEB or equivalent), and proceeded at 4° C. for 1 hour, followed by 1 hour at 37° C., followed by 4° C. up to 40 minutes.
- The labeled RNA was mixed with 30 μl hybridization mixture (mixture of 45 μL of the 10× GE Agilent Blocking Agent and 246 μL of 2× Hi-RPM Hybridization). The labeling mixture was incubated at 100° C. for 5 minutes followed by ice incubation in water bath for 5 minutes. Slides were hybridized at 54-55° C. for 16-20 hours, followed by two washes. The first wash was conducted at room temperature with Agilent GE Wash Buffer 1 (e.g. 6×SSPE+0.005% N-Lauroylsarcosine+0.005% Triton X-102,) for 5 minutes followed by a second wash with Agilent GE Wash Buffer 2 at 37° C. for 5 minutes (e.g. 0.06×SSPE+0.005% N-Lauroylsarcosine+0.005% Triton X-102).
- Arrays were scanned using a microarray scanner (Agilent Microarray Scanner Bundle G2565BA, resolution of 5 μm at XDR Hi 100%, XDR Lo 10%). Array images were analyzed using appropriate software (Feature Extraction 10.7 software, Agilent).
- Poly-adenylation and reverse transcription were performed on 1-500 ng of total RNA. RNA was incubated in the presence of poly (A) polymerase (Poly (A) Polymerase NEB-M0276L), ATP, an oligodT primer harboring a consensus sequence and reverse transcriptase (SuperScript® II RT, Invitrogen, Carlsbad, Calif.) for 1 hour at 37° C. Next, the cDNA was amplified by RT-PCR. The amplification reaction included a microRNA-specific forward primer, a TaqMan® (MGB) probe complementary to the 3′ of the specific microRNA sequence and/or to part of the polyA adaptor sequence, and a universal reverse primer complementary to the consensus 3′ sequence of the oligodT tail. Detailed description of the RT-PCR methodology may be found in publication WO 2008/029295, the contents of which are incorporated herein by reference.
- The cycle threshold (CT, the PCR cycle at which probe signal reaches the threshold) was determined for each microRNA.
- In order to allow comparison between microRNA expression results from RT-PCR with microRNA expression results from microarray, each value obtained by RT-PCR was subtracted from 50 (50-CT). The 50-CT expression for each microRNA for each patient was compared with the signal obtained by the microarray method.
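- The 50-CT conversion can be sketched in Python as follows. The function name and the CT values are hypothetical and serve only to show the arithmetic; they are not data from the study.

```python
def ct_to_expression(ct, offset=50.0):
    """Convert an RT-PCR cycle threshold into a '50-CT' expression value.

    A lower CT means more template, so subtracting CT from a constant gives
    a score that rises with expression, like a log-scale microarray signal.
    """
    return offset - ct


# Hypothetical CT values for one sample (illustration only).
ct_values = {"hsa-miR-146b-5p": 24.5, "hsa-miR-375": 31.0}
expression = {mir: ct_to_expression(ct) for mir, ct in ct_values.items()}
```

Because the transformed value grows with abundance, it can be compared directly, probe by probe, with the log-space microarray signal of the same patient.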
- The initial data set consisted of signals measured for multiple probes for every sample. For the analysis, signals were used only for probes that were designed to measure the expression levels of known or validated human microRNAs.
- Triplicate spots were combined into one signal by taking the logarithmic mean of the reliable spots. All data was log-transformed and the analysis was performed in log-space. A reference data vector for normalization, R, was calculated by taking the mean expression level for each probe in two representative samples, one from each tumor type.
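- A minimal Python sketch of the spot-combination and reference-vector steps follows. The base-2 logarithm and the boolean reliability flags are assumptions, since neither the logarithm base nor the reliability criterion is specified here; the function names are hypothetical.

```python
import math


def combine_spots(signals, reliable):
    """Mean of the log2 signals over the spots flagged as reliable.

    Returns None when no spot is reliable. Both the log base and the way
    reliability is decided are assumptions made for this illustration.
    """
    logs = [math.log2(s) for s, ok in zip(signals, reliable) if ok]
    return sum(logs) / len(logs) if logs else None


def reference_vector(sample_a, sample_b):
    """Per-probe mean of two representative samples (log-space values)."""
    return [(a + b) / 2 for a, b in zip(sample_a, sample_b)]


# Hypothetical triplicate for one probe; the third spot is flagged unreliable.
probe_signal = combine_spots([256, 1024, 4096], [True, True, False])
```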
- For each sample k with data vector $S^k$, a second-degree polynomial $F^k$ was found so as to provide the best fit between the sample data and the reference data, such that $R \approx F^k(S^k)$. Remote data points (“outliers”) were not used for fitting the polynomials $F^k$. For each probe i in the sample (element $S_i^k$ of the vector $S^k$), the normalized value in log-space, $M_i^k$, is calculated from the initial value $S_i^k$ by transforming it with the polynomial function $F^k$, so that $M_i^k = F^k(S_i^k)$. Statistical analysis is performed in log-space. For presentation and calculation of fold-change, data is translated back to linear-space by taking the exponent.
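- The polynomial normalization can be sketched in Python as below. This is an illustration under stated assumptions, not the implementation used in the study: the fit is an ordinary least-squares fit, and the outlier rule (refit after dropping points whose residual exceeds `max_resid`) is an assumption, because the criterion for remote data points is not given. All function and parameter names are hypothetical.

```python
def fit_quadratic(x, y):
    """Least-squares coefficients (c0, c1, c2) of y ~ c0 + c1*x + c2*x**2."""
    # Build the 3x3 normal equations from power sums, as an augmented matrix.
    s = [sum(xi ** p for xi in x) for p in range(5)]
    b = [sum(yi * xi ** p for xi, yi in zip(x, y)) for p in range(3)]
    a = [[s[i + j] for j in range(3)] + [b[i]] for i in range(3)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(3):
            if r != col:
                f = a[r][col] / a[col][col]
                a[r] = [v - f * w for v, w in zip(a[r], a[col])]
    return tuple(a[i][3] / a[i][i] for i in range(3))


def apply_poly(coef, v):
    """Evaluate the fitted quadratic at v."""
    return coef[0] + coef[1] * v + coef[2] * v * v


def normalize(sample, reference, max_resid=None):
    """Map a log-space sample vector onto the reference with a quadratic."""
    coef = fit_quadratic(sample, reference)
    if max_resid is not None:
        # Assumed outlier rule: refit without points far from the first fit.
        keep = [(s, r) for s, r in zip(sample, reference)
                if abs(apply_poly(coef, s) - r) <= max_resid]
        coef = fit_quadratic(*zip(*keep))
    return [apply_poly(coef, s) for s in sample]
```

Every probe of the sample is transformed, including any that were left out of the fit, which mirrors the statement that outliers were excluded only from fitting.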
- 8. miRNA-Seq Sequence Library Construction
- Sequence library construction may be performed using a variety of different kits depending on the high-throughput sequencing platform being employed. However, there are several common steps for small RNA sequencing preparation. The ligation step adds DNA adaptors to both ends of the small RNAs, which act as primer binding sites during reverse transcription and PCR amplification. An adenylated single strand DNA 3′ adaptor followed by a 5′ adaptor is ligated to the small RNAs using a ligating enzyme such as T4 RNA ligase, or the 5′ adaptor is added using a 5′ RACE reaction. The adaptors are also designed to capture small RNAs with a 5′ phosphate group, characteristic of microRNAs, rather than RNA degradation products with a 5′ hydroxyl group. Reverse transcription and PCR amplification steps convert the small adaptor ligated RNAs into cDNA clones used in the sequencing reaction. PCR is then carried out to amplify the pool of cDNA sequences. Primers designed with unique nucleotide tags may also be used in this step to create ID tags in pooled library multiplex sequencing.
- 500 ng of RNA from each FFPE sample were used for small RNA deep sequencing (miRSeq). Libraries were loaded on two lanes of the sequence analyzer (Illumina® HiSeq™ 2000 DNA). An average of about 6.3 million reads per library were obtained. To find novel microRNAs, sequence analysis software (miRDeep2, Friedlander M R et al. Nucleic Acids Res. 2012 January; 40(1):37-52) was applied on the raw sequencing data (primer-adapter sequences were trimmed).
- P-values were calculated using a two-sided (unpaired) Student's t-test on the log-transformed normalized fluorescence signal. The threshold for significant differences was determined by setting a false discovery rate (FDR) of 0.05 to 0.1, to correct for effects of multiple hypothesis testing, resulting in p-value cutoffs in the range of 0.01-0.06. For each differentially expressed microRNA, the fold-difference (ratio of the median normalized fluorescence) and the area under curve (AUC) of the receiver operating characteristic (ROC) curve were calculated. Three sets of miRs were excluded from the statistical analysis: (a) miRs that were previously found as highly expressed in blood samples (due to high percentages of blood in FNA samples), (b) miRs whose level of expression did not correlate with decreasing amounts of RNA, i.e., these miRs did not show linear decrease in signal in association with decreasing measured RNA amounts, and (c) miRs whose level of expression correlated with miRs in set (b).
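- The three summary quantities named above (an FDR-derived p-value cutoff, the fold-difference of medians and the ROC AUC) can be sketched in Python as follows. The Benjamini-Hochberg procedure is an assumption, since the FDR method is not named here; the function names are hypothetical, and the p-values in the example are made up for illustration.

```python
from statistics import median


def bh_cutoff(pvalues, fdr=0.1):
    """Largest p-value passing Benjamini-Hochberg at the given FDR, or None."""
    ranked = sorted(pvalues)
    m = len(ranked)
    passing = [p for i, p in enumerate(ranked, start=1) if p <= fdr * i / m]
    return max(passing) if passing else None


def fold_change(group_a, group_b):
    """Ratio of medians of linear-scale signals, returned as (ratio, sign)."""
    ma, mb = median(group_a), median(group_b)
    return (ma / mb, "+") if ma >= mb else (mb / ma, "-")


def auc(pos, neg):
    """ROC AUC as the chance that a positive outranks a negative (ties 0.5)."""
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical p-values for five probes (illustration only).
cutoff = bh_cutoff([0.001, 0.02, 0.04, 0.3, 0.9], fdr=0.1)
```

The cutoff depends on how many probes are tested and how their p-values are distributed, which is consistent with a range of cutoffs arising from a fixed FDR setting.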
- For Examples 14 and 15 (final classifier and the validation set), 95% confidence intervals were calculated using the Clopper-Pearson method. Sensitivity and specificity estimates for the training cohort were calculated as the mean of ten (10) 10-fold cross-validation runs. Only non-medullary samples which had expression levels of assay miRNA above defined thresholds were included in the training performance estimates. For 2×2 contingency tables, either the χ² test or Fisher's exact test (in case the conditions for performing a χ² test were not met) was used.
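- For reference, a Clopper-Pearson (exact binomial) interval can be computed with the standard library alone, as in the Python sketch below. The bounds are found by bisection on the binomial tail probabilities; the function names are hypothetical, and this is not claimed to be the software used for the reported intervals.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(k + 1))


def _bisect(f, lo=0.0, hi=1.0, iters=100):
    """Root of a decreasing function that is positive at lo, negative at hi."""
    for _ in range(iters):
        mid = (lo + hi) / 2
        if f(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def clopper_pearson(k, n, alpha=0.05):
    """Exact two-sided (1 - alpha) confidence interval for k successes in n."""
    # Lower bound: p at which P(X >= k) equals alpha/2 (0 when k == 0).
    lower = 0.0 if k == 0 else _bisect(
        lambda p: alpha / 2 - (1 - binom_cdf(k - 1, n, p)))
    # Upper bound: p at which P(X <= k) equals alpha/2 (1 when k == n).
    upper = 1.0 if k == n else _bisect(
        lambda p: binom_cdf(k, n, p) - alpha / 2)
    return lower, upper
```

The interval is conservative by construction, so for small groups it can be wide even when the point estimate is high.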
- A pilot study of microRNA profiling was conducted in a few Papanicolaou, Giemsa and Diff-Quick stained smears from ex-vivo FNA biopsy samples in order to ensure feasibility of the methodology. Since FNA smears often have very few cells, providing a minuscule amount of RNA for analysis, e.g. 1-1000 ng, it was first necessary to evaluate whether microRNA would be detectable at such low RNA amounts. Thus, expression levels of about 2200 individual microRNAs were measured in Giemsa-stained papillary carcinoma and non-papillary carcinoma smears. Five microRNAs (hsa-miR-146b-5p, hsa-miR-31-5p, hsa-miR-222-3p, hsa-miR-221-3p, and hsa-miR-21-5p), previously shown to correlate with papillary carcinoma, were found over-represented (upregulated) in the papillary carcinoma smears. The comparison of microRNA expression between Giemsa-stained papillary carcinoma and non-papillary carcinoma samples thus showed that the highly up-regulated microRNA markers of papillary carcinoma were detected. These results strongly suggested that microRNA profiles can be successfully determined in FNA smears stained with any dye.
- The cohort of samples used in the experimental analysis is shown in Table 7 as cohort II. The histological diagnosis ultimately determined the malignancy or benignity of the thyroid lesions. The cytological classification was based on “The Bethesda System for Reporting Thyroid Cytopathology” (Syed Z. Ali and Edmund S. Cibas, eds.; DOI 10.1007/978-0-387-87666-5_1; Springer Science+Business Media, LLC 2010). The study protocol was approved by the Institutional Review Board (IRB, equivalent to Ethical Review Board) of the contributing institution. Tumor classification was based on the World Health Organization (WHO) guidelines. An additional cohort consisted of 13 thyroid ex-vivo FNA smears, prepared after thyroidectomy, and obtained from the University Milano-Bicocca (Milan, Italy).
- Total RNA (at least 10 ng) was extracted from these samples, and microRNA expression was profiled using custom microarrays containing about 2200 miRs. The results exhibited a significant difference in the expression pattern between benign and malignant lesions of several miRs listed in Table 3 (upregulated or downregulated in malignant versus benign).
TABLE 3 miRNAs up- or downregulated in malignant versus benign thyroid tumor

| miR name | p-value | fold-change | AUC | median, malignant | median, benign |
|---|---|---|---|---|---|
| hsa-miR-146b-5p | 3.80E−05 | 2.57 (+) | 0.77 | 5.70E+02 | 2.20E+02 |
| hsa-miR-222-3p | 1.80E−03 | 2.20 (+) | 0.71 | 4.70E+03 | 2.10E+03 |
| hsa-miR-221-3p | 1.80E−03 | 2.09 (+) | 0.71 | 4.10E+03 | 2.00E+03 |
| hsa-miR-181b-5p | 2.50E−02 | 1.38 (+) | 0.65 | 5.00E+02 | 3.60E+02 |
| hsa-miR-29b-3p | 9.50E−03 | 1.32 (+) | 0.64 | 2.10E+03 | 1.60E+03 |
| hsa-miR-200b-3p | 2.60E−02 | 1.27 (+) | 0.65 | 3.10E+02 | 2.40E+02 |
| hsa-miR-200a-3p | 3.90E−02 | 1.27 (+) | 0.64 | 3.00E+02 | 2.40E+02 |
| hsa-miR-29c-3p | 8.80E−03 | 1.22 (+) | 0.64 | 1.40E+03 | 1.10E+03 |
| hsa-miR-130a-3p | 3.30E−02 | 1.20 (+) | 0.64 | 1.00E+03 | 8.70E+02 |
| hsa-miR-148b-3p | 3.60E−02 | 1.13 (+) | 0.64 | 5.00E+02 | 4.50E+02 |
| MID-23794 | 2.60E−05 | 2.34 (−) | 0.78 | 6.00E+02 | 1.40E+03 |
| hsa-miR-197-5p | 2.20E−03 | 1.90 (−) | 0.74 | 3.40E+02 | 6.60E+02 |
| hsa-miR-486-3p | 3.60E−05 | 1.73 (−) | 0.79 | 2.00E+02 | 3.50E+02 |
| hsa-miR-574-3p | 1.40E−02 | 1.44 (−) | 0.68 | 2.30E+02 | 3.30E+02 |
| hsa-miR-532-3p | 4.80E−03 | 1.30 (−) | 0.71 | 4.50E+02 | 5.80E+02 |
| hsa-miR-199a-5p | 2.50E−03 | 1.25 (−) | 0.73 | 3.90E+02 | 4.80E+02 |
| hsa-miR-22-3p | 3.90E−02 | 1.11 (−) | 0.62 | 3.40E+03 | 3.70E+03 |

p-values were calculated using a two-sided (unpaired) Student's t-test. The fold-change represents the ratio between the median values of each group. AUC: Area under the curve when using the miRNAs to classify the two groups. Median: median of expression values (rounded).

- A classification algorithm for differentiating between malignant and benign thyroid tumor was developed based on miRNA expression in 35 benign and 38 malignant FNA samples. A logistic regression classifier was trained to distinguish between malignant and benign thyroid lesions, based on eight miRs (hsa-miR-125b-5p, hsa-miR-21-5p, hsa-miR-222-3p, hsa-miR-221-3p, hsa-miR-146b-5p, hsa-miR-181a-5p, hsa-miR-138-5p, and MID-23794) that were found to be differentially expressed in these conditions, either between benign and malignant or between specific thyroid tumor subtypes (data not shown). 
The classifier reached 89% accuracy with sensitivity of 87% and specificity of 91% for identifying malignant samples. hsa-miR-125b-5p, hsa-miR-21-5p, hsa-miR-222-3p, hsa-miR-221-3p, hsa-miR-146b-5p and hsa-miR-181a-5p exhibited higher expression in malignant lesions, while hsa-miR-138-5p and MID-23794 exhibited higher expression in benign lesions (data not shown).
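- The reported performance figures are mutually consistent with the cohort sizes, as the following Python sketch shows. The confusion-matrix counts are assumptions inferred from the rounded percentages and the 38 malignant and 35 benign samples; they are not reported in the text, and the actual estimates may have been obtained by cross-validation.

```python
def performance(tp, fn, tn, fp):
    """Accuracy, sensitivity and specificity from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return accuracy, sensitivity, specificity


# Assumed counts: 33 of 38 malignant and 32 of 35 benign samples called
# correctly. Inferred from the rounded percentages, not reported values.
acc, sens, spec = performance(tp=33, fn=5, tn=32, fp=3)
print(f"accuracy={acc:.0%} sensitivity={sens:.0%} specificity={spec:.0%}")
```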
- Expression levels of miRs were compared in 18 follicular adenoma samples and 10 follicular carcinoma samples. microRNAs that were upregulated or downregulated in follicular adenoma relative to follicular carcinoma are presented in Table 4.
TABLE 4 miRNAs up- or downregulated in follicular adenoma versus follicular carcinoma

| miR name | p-value | fold-change | AUC | median, follicular adenoma | median, follicular carcinoma |
|---|---|---|---|---|---|
| hsa-miR-486-3p | 2.80E−02 | 2.04 (+) | 0.77 | 4.80E+02 | 2.40E+02 |
| MID-01141 | 5.50E−02 | 1.91 (+) | 0.73 | 3.50E+02 | 1.80E+02 |
| hsa-miR-193a-3p | 2.70E−02 | 1.45 (+) | 0.76 | 3.10E+02 | 2.20E+02 |
| hsa-miR-148b-3p | 3.90E−02 | 1.25 (−) | 0.71 | 4.50E+02 | 5.60E+02 |

p-values were calculated using a two-sided (unpaired) Student's t-test. The fold-change represents the ratio between the median values of each group. AUC: Area under the curve when using the miRNAs to classify the two groups. Median: median of expression values (rounded).

- Expression levels of miRs were compared in 18 follicular adenoma samples versus 9 papillary carcinoma (non-follicular variant) samples, and a classifier was generated for distinguishing between follicular adenoma and papillary carcinoma samples using the expression levels of hsa-miR-146b-5p and hsa-miR-21-5p, with 100% accuracy (data not shown).
- Expression levels of miRs were compared in 18 follicular adenoma samples versus 19 follicular variant of papillary carcinoma samples. microRNAs that were upregulated or downregulated in follicular variant of papillary carcinoma relative to follicular adenoma are presented in Table 5.
TABLE 5 miRNAs up- or downregulated in follicular variant papillary carcinoma (FVPC) versus follicular adenoma (FA)

| miR name | p-value | fold-change | AUC | median, FVPC | median, FA |
|---|---|---|---|---|---|
| hsa-miR-146b-5p | 4.00E−02 | 2.36 (+) | 0.71 | 5.40E+02 | 2.30E+02 |
| hsa-miR-29c-3p | 2.00E−03 | 1.66 (+) | 0.76 | 1.40E+03 | 8.30E+02 |
| hsa-miR-200a-3p | 2.50E−02 | 1.65 (+) | 0.73 | 3.00E+02 | 1.80E+02 |
| hsa-miR-200b-3p | 1.70E−02 | 1.56 (+) | 0.73 | 3.10E+02 | 2.00E+02 |
| hsa-miR-125a-5p | 3.30E−02 | 1.42 (+) | 0.69 | 1.70E+03 | 1.20E+03 |
| hsa-miR-148b-3p | 2.10E−02 | 1.20 (+) | 0.70 | 5.40E+02 | 4.50E+02 |
| hsa-miR-199a-3p | 4.10E−02 | 1.09 (+) | 0.70 | 3.30E+02 | 3.10E+02 |
| hsa-miR-197-5p | 5.60E−05 | 3.73 (−) | 0.89 | 2.70E+02 | 1.00E+03 |
| MID-23794 | 6.50E−05 | 2.39 (−) | 0.84 | 7.70E+02 | 1.80E+03 |
| hsa-miR-486-3p | 2.00E−05 | 2.34 (−) | 0.89 | 2.10E+02 | 4.80E+02 |
| hsa-miR-532-3p | 8.50E−04 | 1.70 (−) | 0.82 | 4.40E+02 | 7.60E+02 |
| hsa-miR-22-3p | 8.10E−03 | 1.33 (−) | 0.75 | 3.40E+03 | 4.50E+03 |
| hsa-miR-199a-5p | 5.80E−03 | 1.30 (−) | 0.76 | 3.70E+02 | 4.80E+02 |
| hsa-miR-23a-3p | 4.50E−02 | 1.26 (−) | 0.68 | 2.60E+03 | 3.30E+03 |
| hsa-miR-34a-5p | 4.10E−02 | 1.09 (−) | 0.63 | 6.00E+02 | 6.60E+02 |

p-values were calculated using a two-sided (unpaired) Student's t-test. The fold-change represents the ratio between the median values of each group. AUC: Area under the curve when using the miRNAs to classify the two groups. Median: median of expression values (rounded).

- Expression levels of miRs were compared in 6 non-encapsulated follicular variant of papillary carcinoma samples versus 35 benign samples, and a classifier was generated using the expression levels of hsa-miR-221-3p and hsa-miR-200b-3p, with 98% accuracy, 83% sensitivity and 100% specificity (data not shown).
- Expression levels of miRs were compared in 8 Hashimoto thyroiditis samples and 9 (non-follicular) papillary carcinoma samples. microRNAs that were upregulated or downregulated in papillary carcinoma relative to Hashimoto thyroiditis are presented in Table 6. The miRs that are the best candidates for the profile signature for comparing these two thyroid lesions are hsa-miR-146b-5p, hsa-miR-200a-3p and MID-23794.
TABLE 6 miRNAs upregulated or downregulated in papillary carcinoma (PC) versus Hashimoto thyroiditis (Ht)

| miR name | p-value | fold-change | AUC | median, PC | median, Ht |
|---|---|---|---|---|---|
| hsa-miR-146b-5p | 2.20E−02 | 2.46 (+) | 0.75 | 7.90E+02 | 3.20E+02 |
| hsa-miR-200a-3p | 2.30E−02 | 2.46 (+) | 0.75 | 4.50E+02 | 1.80E+02 |
| hsa-miR-200b-3p | 3.40E−02 | 2.13 (+) | 0.76 | 4.30E+02 | 2.00E+02 |
| MID-23794 | 4.10E−05 | 4.85 (−) | 0.88 | 4.80E+02 | 2.30E+03 |
| MID-00387 | 8.70E−07 | 4.18 (−) | 0.92 | 7.70E+01 | 3.20E+02 |
| hsa-miR-486-3p | 5.30E−04 | 2.03 (−) | 0.80 | 1.80E+02 | 3.70E+02 |

p-values were calculated using a two-sided (unpaired) Student's t-test. The fold-change represents the ratio between the median values of each group. AUC: Area under the curve when using the miRNAs to classify the two groups. Median: median of expression values (rounded).

- Eleven (11) FFPE (Formalin Fixed Paraffin Embedded) thyroid resection samples (obtained from surgical biopsies and fixed in formalin and preserved in paraffin) from follicular lesions were obtained from the Department of Pathology at Rabin Medical Center. The specimens included 6 follicular adenomas and 5 follicular carcinomas. Tumor cellular content was higher than 50% in all the samples.
- A total of 386 novel candidate microRNAs were found with sequence analysis software, and 27 of those were selected for validation, performed by qPCR. Two novel microRNAs are disclosed herein, MD2-495 and MD2-437, and their sequences are presented in Table 1, and their respective hairpins are shown in Table 2.
FIG. 1A shows the secondary structures of the two novel microRNAs, predicted by sequence analysis software. FIG. 1B shows the expression of the two novel microRNAs (normalized number of reads) in each of the 11 samples. The color-coded bar on the right represents a scale for expression.
- Stained thyroid FNA smears were obtained from a medical center in Israel (Cohort I); and thyroid FNA cell blocks were obtained from a medical center in the USA (Cohort II). For both cohorts, thyroid lesions were ultimately classified as malignant or benign based on histological diagnosis of the resected tumor. A summary of the breakdown of the samples from the two cohorts is shown in Table 7.
TABLE 7 FNA Samples - Cohorts I and II

| FNA Sample Description | Cohort I | Cohort II |
|---|---|---|
| Number of lesions (# patients)¹ | 81 (65) | 73 (73) |
| Nodular hyperplasia (nodular goiter) | 13 | 9 |
| Follicular adenoma | 27 | 18 |
| Graves' disease | 3 | 0 |
| Hashimoto thyroiditis | 3 | 8 |
| Total Benign Nodules | 46 | 35 |
| Papillary carcinoma | 10 | 9 |
| Follicular variant of papillary carcinoma | 13 | 19 |
| Follicular carcinoma | 4 | 10 |
| Medullary carcinoma | 6 | 0 |
| Thyroid carcinoma (Mix histology) | 2 | 0 |
| Total Malignant Nodules | 35 | 38 |
| Bethesda² class II, VI | 33 | 0 |
| Bethesda² class III, IV, V | 48 | 73 |

¹ Some patients had more than one lesion.
² The Bethesda System for Reporting Thyroid Cytopathology (BSRTC) resulted from a conference held at the National Institutes of Health in 2007 (Cibas ES, Ali SZ. The Bethesda System for Reporting Thyroid Cytopathology. Am J Clin Pathol 2009; 132: 658-65). The system led to standardization of FNA reports based on six diagnostic categories: DC I = non-diagnostic, DC II = benign, DC III = atypia/follicular lesion of undetermined significance (AUS/FLUS), DC IV = follicular neoplasm/suspicion for a follicular neoplasm (FN/SFN), DC V = suspicious for malignancy, and DC VI = malignant.

- Highly purified RNA, including the microRNA fraction, was extracted from samples using in-house developed protocols as described above. FFPE and cytological (FNA) samples were profiled by custom printed microarrays measuring over 2000 microRNAs to identify differentially expressed microRNAs and to develop a classifier.
- Over 150 thyroid FNA samples (Table 7) were profiled by custom-printed microarrays measuring over 2000 microRNAs, and 96 microRNAs were additionally profiled by qPCR.
FIGS. 3A (cohort I) and 3B (cohort II) show the median microRNA expression levels on microarrays in patients with malignant nodules (y-axis) and in patients with benign nodules (x-axis). For each microRNA, the values in the two groups were compared by Mann-Whitney test with FDR=0.1.
- Differential expression of microRNAs was found between benign and malignant neoplasms. Classification of malignant vs. benign smears based on two microRNAs, hsa-miR-146b-5p and hsa-miR-375, results in over 85% accuracy (based on the median of ten 10-fold cross-validation runs; data not shown).
- Expression level of hsa-miR-375 (SEQ ID NO: 8) in FNA cohort I was compared between medullary thyroid cancer samples (n=6) and samples from other thyroid nodules (n=75), and it was significantly higher in medullary lesions compared to malignant non-medullary and benign samples combined, with a fold-change of 201.4 (data not shown). Thus, hsa-miR-375 is a significant marker for medullary thyroid carcinoma.
- MicroRNA expression level in samples stained with different dyes was compared in order to evaluate microRNA stability and reproducibility of the microRNA level detection upon staining (data not shown). A total of 143 smears from FNA cohort I were stained as follows: 60 with May-Grünwald Giemsa (MGG), 64 with DiffQuik and 19 with Papanicolaou. MicroRNA expression levels in duplicates of the same sample stained with different dyes showed significant correlation (more than expected). The normalized expression level of hsa-miR-146b-5p (SEQ ID NO: 10 or 11) is similar when the same sample is stained with different dyes. Therefore, different cytological dyes used in the clinical setting (Papanicolaou; May-Grünwald Giemsa; and DiffQuik) do not affect the detection and quantification of microRNA expression.
- A total of twenty-four (24) microRNAs overall were chosen for establishing the status of thyroid samples as malignant versus benign. MicroRNA expression was measured by RT-PCR as described above. The list of miRs and their respective forward primers are provided in Table 8. First-strand generation was done using polyT adaptor presented below. Forward primers were sequence-specific while the reverse primer was universal. Detection of the RT-PCR products was done with the universal MGB probe for miRs hsa-miR-31-5p, hsa-miR-5701, hsa-miR-424-3p (SEQ ID NO: 16), MID-50971, MID-20094, MID-50976, hsa-miR-3074-5p, hsa-miR-222-3p, MID-50969, hsa-miR-146b-5p, hsa-miR-346, MID-16582, or with probes specific for the miRs as provided in Table 9.
- The sequences of the reverse primer, the polyT adaptor and the MGB probe are provided below:
-
Reverse primer (SEQ ID NO: 309) GCGAGCACAGAATTAATACGAC; PolyT adaptor (SEQ ID NO: 310) GCGAGCACAGAATTAATACGACTCACTATCGGTTTTTTTTTTTTVN, where “V” may be any one of A, G or C; and “N” may be any one of G, C, A or U/T; Universal MGB probe (SEQ ID NO: 311) AAAACCGATAGTGAGTCG. -
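The relationship between the three universal oligonucleotides listed above can be checked with a short script. This is an illustrative sketch only: the helper names are hypothetical, "N" is expanded with T rather than U because the adaptor is written as DNA, and the checks describe the sequences as printed here rather than any statement by the inventors about assay design.

```python
REVERSE_PRIMER = "GCGAGCACAGAATTAATACGAC"  # SEQ ID NO: 309
POLYT_ADAPTOR = "GCGAGCACAGAATTAATACGACTCACTATCGGTTTTTTTTTTTTVN"  # SEQ ID NO: 310
MGB_PROBE = "AAAACCGATAGTGAGTCG"  # SEQ ID NO: 311

# Degenerate positions as defined in the text: V = A, G or C; N = G, C, A or T.
DEGENERATE = {"V": "AGC", "N": "GCAT"}
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Reverse complement of an unambiguous DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def expand_degenerate(seq):
    """Enumerate every concrete sequence encoded by a degenerate sequence."""
    variants = [""]
    for base in seq:
        choices = DEGENERATE.get(base, base)
        variants = [v + c for v in variants for c in choices]
    return variants


adaptor_variants = expand_degenerate(POLYT_ADAPTOR)
# The reverse primer is the 5' end of the adaptor as printed.
primer_is_adaptor_prefix = POLYT_ADAPTOR.startswith(REVERSE_PRIMER)
# The probe's reverse complement lies within the adaptor as printed.
probe_target_in_adaptor = reverse_complement(MGB_PROBE) in POLYT_ADAPTOR
```

With the sequences as printed, the "VN" anchor yields twelve concrete adaptor variants (three choices for V times four for N).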
TABLE 8 Assay Development - MicroRNAs and forward primers SEQ ID microRNA NO. Forward primer SEQ ID NO. hsa-miR-222-3p 1, 2 GCAGCTACATCTGGCTACTGGGT 312 hsa-miR-551b-3p 3, 4 CAGTCATTTGGCGCGACCCATACTTGGT 313 hsa-miR-31-5p 5, 6, 7 AGGCAAGATGCTGGCATAGCT 314 hsa-miR-375 8 CAGTCATTTGGGTTTGTTCGTTCGGCTC 315 hsa-miR-125b-5p 9 CAGTCATTTGGGTCCCTGAGACCCTAAC 316 hsa-miR-146b-5p 10, 11 TGGCTGAGAACTGAATTCCATAGGCT 317 hsa-miR-152-3p 12, 13 CAGTCATTTGGCTCAGTGCATGACAGAA 318 hsa-miR-346 14 TGTCTGCCCGCATGCCTGCCTCT 319 hsa-miR-181c-5p 15 CAGTCATTTGGCAACATTCAACCTGTCG 320 hsa-miR-424-3p 16 CAAAACGTGAGGCGCTGCTAT 321 hsa-miR-342-3p 17, 18 CAGTCATTTGGGTCTCACACAGAAATCG 322 hsa-miR-138-5p 19, 20, 21 CAGTCATTTGGCAGCTGGTGTTGTGAAT 323 hsa-miR-486-5p 22 CAGTCATTTGGCTCCTGTACTGAGCTGC 324 hsa-miR-200c-3p 23, 24 CAGTCATTTGGGTAATACTGCCGGGTAA 325 MID-16582 25 TTGGCAGTGAAGCATTGGACTGTA 326 hsa-miR-23a-3p 26 CAGTCATTTGGCATCACATTGCCAGGGA 327 MID-20094 27, 28 CATTTGGCTAAGCCAGTTTCTGTCTGATA 328 MID-50969 29 TGGCATGACAGATTGACATGGACAATT 329 hsa-miR-345-5p 30, 31 CAGTCATTTGGCGCTGACTCCTAGTCCA 330 hsa-miR-3074-5p 32 CGTTCCTGCTGAACTGAGCCAG 331 MID-50976 33 CCTGTCTGAGCGCCGCTC 332 MID-50971 34 CAGTCATTTGGCATACTCTGGTTTCTTTTC 333 hsa-miR-5701 35 AGTCATTTGGCTTATTGTCACGTTCTGATT 334 hsa-miR-574-3p 36, 37 CAGTCATTTGGCCACGCTCATGCACACA 335 -
TABLE 9 Assay Development - MicroRNA Specific probes SEQ microRNA Specific probe sequence ID NO. hsa-miR-342-3p CCGTTTTTTTTTTTTACGGGTGC 336 hsa-miR-181c-5p CCGTTTTTTTTTTTTACTCACCG 337 hsa-miR-125b-5p CCGTTTTTTTTTTTTCACAAGTT 338 hsa-miR-375 CCGTTTTTTTTTTTTCACGCGAG 339 hsa-miR-486-5p CCGTTTTTTTTTTTTCTCGGGGC 340 hsa-miR-551b-3p CCGTTTTTTTTTTTTCTGAAACC 341 hsa-miR-23a-3p CCGTTTTTTTTTTTTGGAAATCC 342 hsa-miR-574-3p CCGTTTTTTTTTTTTGTGGGTGT 343 hsa-miR-152-3p CGTTTTTTTTTTTTCCAAGTTC 344 hsa-miR-200c-3p CGTTTTTTTTTTTTCCATCATT 345 hsa-miR-138-5p CGTTTTTTTTTTTTCGGCCTGA 346 hsa-miR-345-5p CGTTTTTTTTTTTTGAGCCCTG 347 - Marker microRNAs were selected based on their patterns of expression in several preliminary studies performed by the inventors (data not shown), and provided the reasoning for classifying the same as “malignant”, “cell type” or alternatively, to be used as normalizers.
- “Malignant markers” hsa-miR-222-3p, hsa-miR-551b-3p, hsa-miR-31-5p, hsa-miR-375, hsa-miR-125b-5p, hsa-miR-152-3p, hsa-miR-346, hsa-miR-181c-5p, hsa-miR-424-3p and hsa-miR-146b-5p were established according to the level of expression of these microRNAs in malignant samples when compared with their expression in benign samples.
- “Cell type” markers hsa-miR-486-5p, hsa-miR-342-3p, hsa-miR-138-5p, hsa-miR-200c-3p, and MID-16582 were chosen by the inventors according to their pattern of expression as exemplified below.
- hsa-miR-486-5p (SEQ ID NO: 22) was found enriched in whole blood relative to thyroid epithelial cells. Along with other microRNAs (data not shown), it was found to be associated with the amount of blood in thyroid FNA samples. Thus, hsa-miR-486-5p is one example of whole blood marker. Several microRNAs were detected in high correlation (>0.85) with miR-486-5p, and may also be considered blood markers, including hsa-miR-320a, hsa-miR-106a-5p, hsa-miR-93-5p, hsa-miR-17-3p, hsa-let-7d-5p, hsa-miR-107, hsa-miR-103a-3p, hsa-miR-17-5p, hsa-miR-191-5p, hsa-miR-25-3p, hsa-miR-106b-5p, hsa-miR-20a-5p, hsa-miR-18a-5p, hsa-miR-144-3p, hsa-miR-140-3p, hsa-miR-15b-5p, hsa-miR-16-5p, hsa-miR-92a-3p, hsa-miR-484, hsa-miR-151a-5p, hsa-let-7f-5p, hsa-let-7a-5p, hsa-let-7c-5p, hsa-let-7b-5p, hsa-let-7g-5p, hsa-let-7i-5p, hsa-miR-185-5p, hsa-miR-30d-5p, hsa-miR-30b-5p, hsa-miR-30c-5p, hsa-miR-19b-3p, hsa-miR-26a-5p, hsa-miR-26b-5p, hsa-miR-425-5p, MID-19433, and hsa-miR-4306.
- The inventors observed, upon measuring the microRNA profile of the blood compartments, that a number of microRNAs were elevated in different blood cell types (data not shown). Thus, hsa-miR-342-3p (SEQ ID NO: 17 or 18) was one of the microRNAs, amongst others, which was enriched in white blood cells, and may therefore be considered an example of a white blood cell marker. Interestingly, hsa-miR-342-3p was shown to be expressed in correlation with hsa-miR-150-5p, suggesting that hsa-miR-150-5p is also a white blood cell marker. In addition, hsa-miR-146a-5p was also shown to be expressed in white blood cells (data not shown).
- hsa-miR-200c-3p (SEQ ID NO: 23 or 24) and hsa-miR-138-5p (SEQ ID NO: 19, 20, or 21) were found enriched in epithelial cells. In a preliminary experiment, smears were generated with blood in the absence of thyroid tissue material, and compared with smears from thyroid tissue. Both hsa-miR-200c-3p and hsa-miR-138-5p were found to be expressed at much higher levels in the thyroid smears (both benign and malignant) compared to blood smears (data not shown). Other microRNAs were also found enriched in epithelial cells (data not shown). Thus, hsa-miR-200c-3p and hsa-miR-138-5p are examples of epithelial cell markers. Interestingly, the inventors found that the expression of hsa-miR-138-5p correlated with the presence of epithelial cells, and in certain subsets of the data hsa-miR-138-5p was found to be upregulated in benign samples (data not shown).
- MID-16582 (SEQ ID NO: 25) was found at higher expression levels in Hurthle cells. In preliminary studies, the inventors have surprisingly found that this microRNA is upregulated in follicular adenoma presenting Hurthle cells versus follicular adenomas not indicated to have Hurthle cells (
FIGS. 2A-2B). This result may be attributed to the mitochondrial enrichment found in Hurthle cells. The present inventors have found that the sequence of MID-16582, as well as other nucleic acid sequences found in Hurthle cells, can be mapped to mitochondrial DNA (data not shown). Thus, MID-16582 is an example of a Hurthle cell marker. - The assay development set included about 360 distinct samples. Most of the samples were stained FNA smears (Papanicolaou, May-Grünwald Giemsa or Diff-Quik). Forty-five (45) FNA samples were in cell blocks. The samples were collected from medical centers in Israel, Europe and the USA. Thirty-three of the samples came from thyroid nodules that were less than 1 cm in size. The smallest nodule size was 0.1 cm. Samples of medullary carcinoma were excluded from most of the analyses, unless otherwise indicated. Table 10 provides the distribution of the samples per category.
-
TABLE 10 Assay Development Cohort Composition and Bethesda distribution No. Histological type Papillary carcinoma 84 Papillary carcinoma, follicular variant 77 Follicular carcinoma 16 Unspecified carcinoma 6 Medullary 14 Nodular hyperplasia 65 Follicular adenoma 81 Hashimoto 6 Graves 3 Total Malignant 197 Total Benign 155 inconclusive 4 Bethesda I 0 II (“determinate”) 38 III (“indeterminate”) 71 IV (“indeterminate”) 113 V (“indeterminate”) 74 VI (“determinate”) 60 unknown 98 Determinate total 258 Indeterminate total 84 - Samples from FNA smears routinely prepared as well as cell blocks were used for total RNA extraction and RT-PCR amplification. All the samples were tested with a panel of 24 microRNAs.
- Results of the assay development in a sub-set of samples (n=353) are shown in
FIG. 3. Expression of microRNAs hsa-miR-222-3p (SEQ ID NO: 1 or 2), hsa-miR-551b-3p (SEQ ID NO: 3 or 4), hsa-miR-31-5p (SEQ ID NO: 5, 6, or 7), hsa-miR-125b-5p (SEQ ID NO: 9), hsa-miR-146b-5p (SEQ ID NO: 10 or 11), hsa-miR-346 (SEQ ID NO: 14), hsa-miR-181c-5p (SEQ ID NO: 15), and hsa-miR-375 (SEQ ID NO: 8) above the threshold was found to correlate with malignant samples. The expression levels shown in FIG. 3 were obtained by the following formula: [50 - normalized Ct of each marker]. The normalization was done by subtracting the mean signal of the normalizers. The value of the mean signal of the normalizers over all the samples used was added to all the expression values detected, in order to bring the values to a range more manageable for calculation. Interestingly, expression levels of hsa-miR-125a-5p correlate with those of hsa-miR-125b-5p. - Four algorithms were used in order to establish the best classifier to be implemented in the thyroid assay: Discriminant Analysis, K-nearest neighbor (KNN), support vector machine (SVM) and Ensemble of discriminant analysis classifiers (Discriminant Analysis Ensemble).
- The following parameters were established a priori:
- Priors: For all the algorithms used, priors were set to 70% for the malignant and 30% for the benign samples.
- Sample Set: In this example, three sample sets were analyzed. One sample set included malignant (n=183) plus benign (n=155) samples, which excludes the malignant medullary samples; referred to below and in the Figures as “malignant+benign”. Another sample set included all “indeterminate” samples, which includes all samples classified as Bethesda III, IV and V, referred to below and in the Figures as “indeterminate”. A third sample set included samples classified as Bethesda IV only, referred to below and in the Figures as “Bethesda”. Samples from thyroid lesions classified as Bethesda IV are usually difficult to classify by cytological parameters. Therefore, it is important to establish a classifier that is based on this sub-group of samples. In addition, specific samples that presented technical problems due to a variety of reasons (e.g. malignant samples with Bethesda II; sample taken from lymph nodes) were excluded.
- Medullary samples were excluded from the classification. Therefore, in this Example, when referring to malignant samples it means non-medullary malignant.
- MicroRNA Ratios: Ratios were obtained from pairs of microRNAs in an attempt to subtract certain factors from the classifier. Thus, for example, a ratio of hsa-miR-31-5p:hsa-miR-342-3p makes it possible to reduce the contribution of white blood cells (through the expression of hsa-miR-342-3p, the denominator) to the expression of hsa-miR-31-5p (the numerator). Since Ct values are in log-scale, ratios were created by subtracting one miR expression value from the other. Each ratio was further normalized by adding a constant, in order for the ratios to be within the same range as the microRNA normalized values.
- In this Example, microRNA normalized values were obtained by dividing microRNA expression levels by the expression levels of hsa-miR-23a-3p, MID-20094, MID-50969, hsa-miR-345-5p, hsa-miR-3074-5p, MID-50976, MID-50971, hsa-miR-5701 or hsa-miR-574-3p (in log-scale Ct terms, by subtracting the normalizer signal); the result was further subtracted from 50, so that lower Ct values are associated with higher expression values.
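The normalization and ratio steps described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the inventors' implementation: the function names are hypothetical, the normalizer signal is taken as the mean Ct of the chosen normalizers, the additive constant for ratios (here called `offset`) is not given in the text, and all Ct values are made up.

```python
def normalized_expression(sample_ct, normalizers, cohort_normalizer_mean):
    """Convert the raw Ct values of one sample to expression values.

    sample_ct: dict of microRNA name -> raw Ct in this sample.
    normalizers: names of the normalizer microRNAs to average.
    cohort_normalizer_mean: mean normalizer Ct over all samples.
    """
    sample_norm_mean = sum(sample_ct[n] for n in normalizers) / len(normalizers)
    expression = {}
    for mir, ct in sample_ct.items():
        # Subtract this sample's normalizer mean, then add the cohort-wide
        # normalizer mean back to keep values in a convenient range.
        normalized_ct = ct - sample_norm_mean + cohort_normalizer_mean
        # Subtract from 50 so that a lower Ct maps to a higher expression value.
        expression[mir] = 50 - normalized_ct
    return expression


def log_ratio(expression, numerator, denominator, offset=0.0):
    """Ratio of two microRNAs in log scale: a difference plus a constant."""
    return expression[numerator] - expression[denominator] + offset


# Hypothetical Ct values for one sample (illustrative, not study data).
sample = {"hsa-miR-31-5p": 30.0, "hsa-miR-342-3p": 26.0,
          "hsa-miR-23a-3p": 24.0, "hsa-miR-574-3p": 26.0}
expr = normalized_expression(sample, ["hsa-miR-23a-3p", "hsa-miR-574-3p"], 27.0)
ratio = log_ratio(expr, "hsa-miR-31-5p", "hsa-miR-342-3p", offset=20.0)
```

Because the values are in log scale, the hsa-miR-31-5p:hsa-miR-342-3p "ratio" is a simple difference, which is why a shared contribution such as white blood cell content cancels out of it.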
- When discriminant analysis was used as the algorithm, a linear discriminant type of discriminant analysis (LDA) was applied, in three sets of samples as mentioned above, using as features either different combinations of microRNA expression levels (
FIG. 4A-4C ,FIG. 10A-10C andFIG. 22A-22C ), microRNA ratios (FIG. 5A-5C ,FIG. 11A-11C andFIG. 23A-23C ), or a combination of microRNA expression levels and microRNA ratios (FIG. 6A-6C ,FIG. 12A-12C andFIG. 24A-24C ). - Analysis of the malignant+benign samples using (i) different combinations of microRNA expression levels, e.g. two (e.g. hsa-miR-551b-3p and hsa-miR-146b-5p), three (e.g. hsa-miR-551b-3p, hsa-miR-146b-5p, and hsa-miR-31-5p), or eight microRNAs (e.g. hsa-miR-551b-3p; hsa-miR-146b-5p; hsa-miR-31-5p; hsa-miR-222-3p; hsa-miR-375; hsa-miR-125b-5p; hsa-miR-152-3p; hsa-miR-181c-5p), resulted in a sensitivity of between 82-85%, and a specificity of between 68-81.5%; or (ii) microRNA ratios [e.g. hsa-miR-146b-5p:hsa-miR-342-3p and hsa-miR-31-5p:hsa-miR-342-3p], [e.g. hsa-miR-146b-5p:hsa-miR-342-3p; hsa-miR-31-5p:hsa-miR-342-3p; hsa-miR-125b-5p:hsa-miR-138-5p], or [e.g. hsa-miR-146b-5p:hsa-miR-342-3p; hsa-miR-31-5p:hsa-miR-342-3p; hsa-miR-125b-5p:hsa-miR-138-5p; hsa-miR-125b-5p:hsa-miR-200c-3p; hsa-miR-222-3p:hsa-miR-486-5p; hsa-miR-200c-3p:hsa-miR-486-5p; MID-16582:hsa-miR-200c-3p; MID-16582:hsa-miR-138-5p] resulted in a sensitivity of between 74-81.1% and a specificity of between 79.5-84.1%. In an alternative type of analysis a combination of microRNAs and microRNA ratios was used, resulting in a sensitivity of between 83-93.3% and a specificity of between 42.4-82.2% (
FIG. 4A-4C, FIG. 5A-5C, FIG. 6A-6C). - Analysis of the indeterminate samples using (i) different combinations of microRNA expression levels, e.g. two, three (hsa-miR-146b-5p; hsa-miR-551b-3p; hsa-miR-222-3p), or eight microRNAs resulted in a sensitivity of between 80-82.3%, and a specificity of between 56-71.4%; or (ii) microRNA ratios [e.g. hsa-miR-146b-5p:hsa-miR-342-3p; hsa-miR-31-5p:hsa-miR-342-3p; hsa-miR-125b-5p:hsa-miR-200c-3p] resulted in a sensitivity of around 80% and a specificity of between 66-72.2%. In an alternative type of analysis a combination of microRNAs and microRNA ratios was used, resulting in a sensitivity of between 79-88% and a specificity of between 67.5-73.8% (
FIG. 10A-10C ,FIG. 11A-11C ,FIG. 12A-12C ). - Analysis of the Bethesda IV samples using (i) different combinations of microRNA expression levels, e.g. two (hsa-miR-125b-5p; hsa-miR-551b-3p), three (hsa-miR-125b-5p; hsa-miR-551b-3p; hsa-miR-222-3p), or eight (hsa-miR-125b-5p; hsa-miR-551b-3p; hsa-miR-222-3p; hsa-miR-146b-5p; hsa-miR-375; hsa-miR-181c-5p; hsa-miR-31-5p; hsa-miR-138-5p), resulted in a sensitivity of between 89-91.5%, and a specificity of between 39-47%; or (ii) microRNA ratios [e.g.] resulted in a sensitivity of between 80-91.5% and a specificity of up to 57%. In an alternative type of analysis a combination of microRNAs and microRNA ratios was used, resulting in a sensitivity of between 87-93.6% and a specificity of up to 46% (
FIG. 22A-22C ,FIG. 23A-23C ,FIG. 24A-24C ). - One analysis was performed using KNN (k-nearest neighbors) as the algorithm, in which k=5 was used with a distance metric of Pearson correlation. The analysis with the KNN algorithm was applied to three sets of samples as mentioned above (malignant+benign, indeterminate and Bethesda IV) using as features either different combinations of microRNA expression levels (
FIG. 7A-7C ,FIG. 13A-13C andFIG. 25A-25C ), microRNA ratios (FIG. 8A-8B ,FIG. 14A-14B andFIG. 26A-26B ), or a combination of microRNA expression levels and microRNA ratios (FIG. 9A-9C ,FIG. 15A-15C andFIG. 27A-27C ). - For malignant+benign samples, analysis of six, eight or twelve microRNA expression levels achieved sensitivity of between 81.1-82.9% and specificity of up to 74.2%. Analysis of six or eight microRNA ratios, achieved sensitivity of between 78-80.5% and specificity of up to 65.6%. Analysis of a combination of microRNA expression levels and microRNA ratios achieved sensitivity of between 83.5-85.4% and specificity of up to 70.9% (
FIG. 7A-7C ,FIG. 8A-8B ,FIG. 9A-9C ). - For indeterminate samples, analysis of six, eight or twelve microRNA expression levels achieved sensitivity of between 73.9-82.6% and specificity of up to 73%. Analysis of six or eight microRNA ratios, achieved sensitivity of between 76.5-80.9% and specificity of up to 65.9%. Analysis of a combination of microRNA expression levels and microRNA ratios achieved sensitivity of between 76.5-80.9% and specificity of up to 67.5% (
FIG. 13A-13C ,FIG. 14A-14B ,FIG. 15A-15C ). - For Bethesda IV samples, analysis of six, eight or twelve microRNA expression levels achieved sensitivity of between 66-72.3% and specificity of up to 61.9%. Analysis of six or eight microRNA ratios, achieved sensitivity of between 78.7-80.9% and specificity of up to 61.9%. Analysis of a combination of microRNA expression levels and microRNA ratios achieved sensitivity of between 63.8-74.5% and specificity of up to 58.7% (
FIG. 25A-25C, FIG. 26A-26B, FIG. 27A-27C). - A third analysis was performed applying SVM (support vector machine) as the algorithm, in which a linear kernel was used. The analysis with the SVM algorithm was applied to the three sets of samples as mentioned above (malignant+benign, indeterminate and Bethesda IV), using as features either different combinations of microRNA expression levels, microRNA ratios, or a combination of microRNA expression levels and microRNA ratios, respectively. The results are described below.
- For malignant+benign samples, analysis of three, six or eight microRNA expression levels achieved sensitivity of between 82.3-86% and specificity of up to 75.5%. Analysis of three, six or eight microRNA ratios, achieved sensitivity of about 83% and specificity of up to 80.8%. Analysis of a combination of microRNA expression levels and microRNA ratios achieved sensitivity of between 82.9-86.6% and specificity of up to 83.4% (data not shown).
- For indeterminate samples, analysis of six, eight or twelve microRNA expression levels achieved sensitivity of between 82.6-90.4% and specificity of up to 60.3%. Analysis of three, six or eight microRNA ratios, achieved sensitivity of between 81.7-88.7% and specificity of up to 67.5%. Analysis of a combination of microRNA expression levels and microRNA ratios achieved sensitivity of between 80-89.9% and specificity of up to 71.4% (
FIG. 16A-16C ,FIG. 17A-17C ,FIG. 18A-18C ). - For Bethesda IV samples, analysis of three, six or eight microRNA expression levels achieved sensitivity of between 89.4-97.9% and specificity of up to 55.6%. Analysis of three, six or eight microRNA ratios, achieved sensitivity of between 93.6-100%. Analysis of a combination of microRNA expression levels and microRNA ratios achieved sensitivity of between 91.5-95.7% (data not shown).
- A fourth analysis was performed applying Ensemble methods as the algorithm. An ensemble of up to 100 discriminant analysis classifiers was created using AdaBoost and applied to the data. The analysis with the Ensemble algorithm was applied to three sets of samples as mentioned above (malignant+benign, indeterminate and Bethesda IV), using as features either different combinations of microRNA expression levels, microRNA ratios, or a combination of microRNA expression levels and microRNA ratios. The results are described below.
- For malignant+benign samples, analysis of two, three or eight microRNA expression levels achieved sensitivity of about 85% and specificity of up to 74.8%. Analysis of three, six or eight microRNA ratios, achieved sensitivity of about 83.5-86% and specificity of up to 79.5%. Analysis of a combination of microRNA expression levels and microRNA ratios achieved sensitivity of about 86% and specificity of up to 82.8% (data not shown).
- For indeterminate samples, analysis of two, three or eight microRNA expression levels achieved sensitivity of between 84.3-88.7% and specificity of up to 64.3%. Analysis of two, three or eight microRNA ratios, achieved sensitivity of between 86.1-89.7% and specificity of up to 65.1%. Analysis of a combination of microRNA expression levels and microRNA ratios achieved sensitivity of between 83.5-87.8% and specificity of up to 65.9% (
FIG. 19A-19C ,FIG. 20A-20C ,FIG. 21A-21C ). - For Bethesda IV samples, analysis of two, three or eight microRNA expression levels achieved sensitivity of between 89.4-93.6% and specificity of up to 46%. Analysis of two, three or eight microRNA ratios, achieved sensitivity of between 89.4-93.6% and specificity of up to 44.4%. Analysis of a combination of microRNA expression levels and microRNA ratios achieved sensitivity of around 91.5% (data not shown).
- The same sample set used in Example 9, but including medullary malignant samples was used for establishing a classifier. All classifiers (LDA, KNN, SVM and Discriminant Analysis Ensemble) were applied, and a representative set of results from the discriminant analysis algorithm showed that when normalized values of two microRNA ratios (e.g. hsa-miR-125b-5p:hsa-miR-138-5p; and hsa-miR-146b-5p:hsa-miR-342-3p) were used as the features for the classification, the sensitivity of the classifier was 84.7% and the specificity, 80.8% (
FIG. 30-31). When the normalized values of two microRNAs (e.g. hsa-miR-222-3p and hsa-miR-551b-3p) were used as the features for the classification, the sensitivity was 85.2% and the specificity, 53.6% (FIG. 30-31). - One important consideration throughout this study was the accuracy of the result that is to be provided to a patient who has had an FNA sample collected. Laboratories tend to err on the side of caution in order to avoid false-negative results. On the other hand, in the analysis of FNA specimens, a suspicious diagnosis will send the patient to surgery, which in more than 25% of the cases turns out to be unnecessary. For example, at least one report in the literature described that thyroid tumor samples with large amounts of blood, or even pure blood, are misdiagnosed as suspicious in 7 out of 9 cases (Walsh et al. (2012) J Clin Endocrin Metab. doi:10.1210/jc.2012-1923).
- With this goal in mind, the present inventors searched for microRNAs that could be used as cell type markers and aid in the screening of the quality of the specimen examined. Thus, the expression of hsa-miR-486-5p (SEQ ID NO: 22) and hsa-miR-200c-3p (SEQ ID NO: 23 or 24) was evaluated in the development assay cohort, including cell blocks, having samples from benign and malignant (non-medullary) thyroid lesions, as well as four samples of blood only (slides of blood smears were generated for this purpose, and RNA extracted as described herein). The results showed that the blood microRNA marker, hsa-miR-486-5p is very high and the epithelial marker, hsa-miR-200c-3p, is very low, compared to the threshold established in the development assay set (
FIG. 32). The blood smear samples were therefore filtered out using these markers. This expression pattern indicates that these samples do not have enough epithelial cells (for lack of the epithelial cell marker) to continue the test. In a test situation, these four samples of blood smears would be disqualified and discarded. Expression of hsa-miR-138-5p (SEQ ID NO: 19, 20, or 21) has also been shown to be low, compared to the threshold, in blood smears (data not shown). Samples with this profile are eligible to be disqualified and/or discarded from the protocol for classification of thyroid lesion samples. - The inventors had previously established that expression of hsa-miR-342-3p (SEQ ID NO: 17 or 18) correlates with white blood cells (data not shown). Hence, high expression of hsa-miR-342-3p compared to the threshold indicates a lack of sufficient thyroid cells, and samples with this profile are eligible to be disqualified and/or discarded from the protocol for classification of thyroid lesion samples.
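The sample-quality screen described above (blood marker high, epithelial marker low) can be sketched as a simple filter. The cut-off values are hypothetical, since the text refers to thresholds from the development set without giving them, and combining the two conditions with a logical AND is an assumption made for this illustration.

```python
def passes_cell_content_qc(expr, blood_max, epithelial_min):
    """Return False for samples that look like blood with too few epithelial cells.

    expr: dict of microRNA name -> normalized expression value.
    blood_max, epithelial_min: hypothetical cut-offs; the source does not
    state the threshold values used in the development set.
    """
    too_much_blood = expr["hsa-miR-486-5p"] > blood_max
    too_few_epithelial = expr["hsa-miR-200c-3p"] < epithelial_min
    # Assumption: a sample is disqualified only when both conditions hold,
    # mirroring the blood-smear profile described in the text.
    return not (too_much_blood and too_few_epithelial)


# Illustrative values only, not study data.
blood_smear = {"hsa-miR-486-5p": 28.0, "hsa-miR-200c-3p": 9.0}
thyroid_smear = {"hsa-miR-486-5p": 22.0, "hsa-miR-200c-3p": 17.0}
blood_ok = passes_cell_content_qc(blood_smear, blood_max=25.0, epithelial_min=12.0)
thyroid_ok = passes_cell_content_qc(thyroid_smear, blood_max=25.0, epithelial_min=12.0)
```

A stricter variant could reject a sample on low hsa-miR-200c-3p alone, which would match the use of that marker as an indicator of thyroid cell sufficiency.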
- In parallel, high expression of hsa-miR-200c-3p is an indicator of the presence of epithelial cells in general, and specifically thyroid cells (data not shown and
FIG. 32 ). Hence, the expression of hsa-miR-200c-3p above a threshold may be used as an indicator of sufficiency of thyroid cells in the sample. - Classification of benign thyroid tumor sub-types was done using samples from Hashimoto (n=6) and follicular adenoma (FA; n=81), from the development assay cohort. Expression of hsa-miR-342-3p and hsa-miR-31-5p in Hashimoto samples was high compared to the threshold established in the assay set (
FIG. 33 ). Thus, high expression of hsa-miR-342-3p alone or in combination with hsa-miR-31-5p may be used for the classification of samples as benign, and further sub-typing as Hashimoto. - Further, the inventors also tested microRNA ratios for sub-typing benign thyroid tumors. In this context, the miR ratio of hsa-miR-125b-5p:hsa-miR-200c-3p was significant for classifying follicular adenoma (FA) versus Hashimoto samples (data not shown).
- Classification of malignant thyroid tumor sub-types was done using a subset of samples (n=177) of the assay cohort. In one example of an analysis, hsa-miR-146b-5p, hsa-miR-222-3p, hsa-miR-31-5p, hsa-miR-125b-5p, hsa-miR-551b-3p and hsa-miR-375 were found to be highly expressed in papillary carcinoma, while MID-16582 was found to be highly expressed in follicular carcinoma (
FIG. 34 ). - The ratios of the following miR pairs were significant for classifying Papillary Carcinoma (PC) versus Follicular Carcinoma samples: hsa-miR-146b-5p:hsa-miR-342-3p, hsa-miR-125b-5p:hsa-miR-200c-3p, hsa-miR-222-3p:hsa-miR-486-5p, hsa-miR-31-5p:hsa-miR-342-3p, MID-16582:hsa-miR-200c-3p, MID-16582:hsa-miR-138-5p (data not shown).
- Therefore, the inventors have demonstrated that malignant thyroid tumor sub-typing may be performed using miR ratios, particularly miR ratios where the denominator is a cell marker microRNA, such as hsa-miR-486-5p, hsa-miR-200c-3p, hsa-miR-138-5p, and hsa-miR-342-3p.
- A flowchart of the protocol for thyroid nodule sample analysis, from collection of FNA samples to laboratory analysis and diagnosis, is shown in FIG. 35. FNA samples are collected from patients having thyroid nodules, and are routinely processed. Smears are prepared from the FNA samples. As a first step, a specialist in cytopathology examines the FNA sample and provides an analysis. In cases where the analysis is inconclusive, particularly in samples classified as Bethesda III, IV, or V, i.e. so-called “indeterminate”, the sample is sent to Rosetta Genomics' laboratories to undergo microRNA profiling and obtain a conclusive diagnosis. Total RNA is extracted from the sample, which undergoes microRNA profiling. MicroRNA profiling may be performed by amplification (RT-PCR or NGS) or hybridization (microarray), as shown in the Examples above. - The protocol may include any one of the following:
- One or more algorithms may be used during classification, and will be applied to data comprising single microRNA expression levels, microRNA ratios, or a combination thereof.
- Samples wherein the hsa-miR-375 expression level is above a specific threshold may be determined as malignant, e.g. a threshold of at least 10, or a threshold of at least 18 (data not shown), when the expression was analyzed by array or PCR, respectively. The threshold is dependent on the normalization of the samples, as well as on the methodology used for measuring the microRNAs. The threshold may also be a function of the target sensitivity and specificity.
- Samples wherein the hsa-miR-146b-5p expression level is above a specific threshold will be determined as malignant (e.g. a threshold of at least 16;
FIG. 28 and data not shown). The threshold is dependent on the normalization of the samples, as well as on the methodology used for measuring the microRNAs. The threshold may also be a function of the target sensitivity and specificity. - Samples wherein the ratio hsa-miR-146b-5p:hsa-miR-342-3p, further to normalization, is above a specific threshold will be determined as malignant, e.g. a threshold of at least 16 (
FIG. 29 and data not shown). The threshold is dependent on the normalization of the samples, as well as on the methodology used for measuring the microRNAs. - The level of expression of the normalizers may be used as an indicator for discarding samples, due to insufficient tumor-derived material. Thus, samples presenting low levels of any of the normalizers, or of the minimal, median or maximal value of expression for the normalizers, may be discarded. For example, samples with low levels of hsa-miR-23a-3p (compared to the overall levels of hsa-miR-23a-3p expression in the cohort) are likely to be misclassified. Conversely, high levels of hsa-miR-23a-3p improve the classification by improving sensitivity and specificity (data not shown).
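The single-marker threshold rules listed above can be sketched as a small rule set. The numeric cut-offs are the example values quoted in the text (18 for hsa-miR-375 by PCR, 16 for hsa-miR-146b-5p and for the normalized hsa-miR-146b-5p:hsa-miR-342-3p ratio); the text stresses that they depend on normalization and measurement method, so they are placeholders here, and the function and constant names are hypothetical.

```python
# Example cut-offs quoted in the text; they depend on normalization and on
# the measurement platform, so treat them as placeholders.
MIR375_MIN = 18          # quoted for PCR-based measurement
MIR146B_MIN = 16         # platform not stated in the text
RATIO_146B_342_MIN = 16  # applies to the normalized ratio


def fired_malignancy_rules(expr, ratio_146b_342):
    """Return the names of the threshold rules that flag a sample as malignant.

    expr: dict of microRNA name -> normalized expression value.
    ratio_146b_342: normalized hsa-miR-146b-5p:hsa-miR-342-3p ratio.
    """
    fired = []
    if expr["hsa-miR-375"] > MIR375_MIN:
        fired.append("hsa-miR-375")
    if expr["hsa-miR-146b-5p"] > MIR146B_MIN:
        fired.append("hsa-miR-146b-5p")
    if ratio_146b_342 > RATIO_146B_342_MIN:
        fired.append("hsa-miR-146b-5p:hsa-miR-342-3p")
    return fired


# Illustrative values only, not study data.
calls = fired_malignancy_rules({"hsa-miR-375": 20.0, "hsa-miR-146b-5p": 10.0}, 12.0)
```

Since the protocol may include any one of the listed steps, an empty result here would simply mean that none of these particular rules applied, not that the sample is benign.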
- Analysis of the microRNA profiling data leads to a diagnosis of the thyroid nodule as benign or malignant. Results permitting, which include the expression of microRNAs that may be associated with thyroid tumor sub-types, as shown in
FIGS. 33 and 34 , for example, the sample is further classified according to its thyroid tumor subtype. - A training set of 375 FNA smears (Table 11) was used to develop a classifier for diagnosing indeterminate thyroid nodules as benign or malignant according to microRNA profiling, using the set of 24 miRNAs established and described in Example 8 above. From the total 375 FNA smear samples, 252 samples were profiled in the RG-IL (Rosetta Genomics, Rehovot, Israel) laboratory and 123 samples were profiled in the RGL-US (Rosetta Genomics, Philadelphia, US) laboratory.
- The classifier developed combines several linear discriminant analysis (LDA) steps and a KNN-based classifier. The LDA step which classifies medullary samples is based on the expression of hsa-miR-375 (an indicator of medullary carcinoma) (
FIG. 36). Samples classified in this step receive a final classification of malignant, and are marked as being positive for expression of the medullary marker. The KNN classifier (K=9) uses a Pearson correlation distance metric over the CT values of six of the measured miRNAs, and uses a training set of 314 samples. Samples with at least four benign neighbors are classified as benign by the KNN classifier. Only a subset of the training samples was used to determine the classifier. For example, samples with low microRNA expression in the KNN step were not included in the KNN classifier. - The performance of the classifier on the training set is summarized in Table 12 and Table 13.
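The KNN step described above (K=9, Pearson correlation distance, benign when at least four neighbors are benign) can be sketched in plain Python. This is an illustration under assumptions: the distance is taken as one minus the Pearson correlation, the function names are hypothetical, and profiles with zero variance are not handled.

```python
from math import sqrt


def pearson_distance(x, y):
    """One minus the Pearson correlation between two equal-length profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    # Note: a constant profile would give a zero denominator (not handled).
    return 1.0 - cov / (sd_x * sd_y)


def knn_classify(profile, training, k=9, min_benign=4):
    """Classify one profile against labeled training profiles.

    training: list of (profile, label) pairs, label 'benign' or 'malignant'.
    A sample is called benign when at least min_benign of its k nearest
    neighbors are benign, otherwise malignant.
    """
    nearest = sorted(training, key=lambda item: pearson_distance(profile, item[0]))[:k]
    benign_votes = sum(1 for _, label in nearest if label == "benign")
    return "benign" if benign_votes >= min_benign else "malignant"
```

Requiring only four benign neighbors out of nine is not a simple majority vote; the choice of cut-off shifts the balance between sensitivity and specificity.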
- Table 12 shows the result of the classification according to the classifier (malignant or benign), as well as sensitivity and specificity in each sub-group of samples, Indeterminate (all sub-types), Indeterminate (subtypes III and IV) and Determinates. Sensitivity of the classifier on indeterminate samples (Bethesda III-V) was estimated to be 86%, based on the training set, and the specificity was estimated to be 75%. The performance of the classifier on determinate samples (Bethesda II and VI) was estimated to be higher, with a sensitivity of 96% and a specificity of 82% for these samples.
- Table 13 shows the performance of the assay per histological type, and indicates the number of samples and the percentage of agreement between the classification obtained with the classifier and the cytopathologist classification.
-
TABLE 11 Tumor samples used for the final classifier and for the validation Traininga Validation Cohort #Samples 375 201 #Patients 357 201 % Malignant 49 30 Age (median) 54 53 % Females 73 80 Cytology #Giemsa 212 90 #Diff-Quik 95 21 #Papanicolaou 62 90 #BethesdaII 27 0 #BethesdaIII 80 29 #BethesdaIV 142 131 #BethesdaV 77 41 #BethesdaVI 49 0 aFor 64 training samples, the age information was missing and for 10 training samples, the gender was unknown. Three training samples were created by mixing more than one slide (with different stains), two were unstained, and for one the stain was unknown. -
TABLE 12 Assay performance for the final classifier training set and validation Indeterminate Indeterminate Bethesda (all sub-types)c (III, IV)c II and VIc Traininga Malignant 115 59 40 Benign 147 137 26 Sensitivity 86 [78-92] 78 [65-88] 96 [85-100] Specificity 75 [67-81] 76 [68-83] 82 [62-94] Validationb, Malignant 61 31 0 entire set Benign 128 119 0 Sensitivity 85 [74-93] 74 [55-88] NA Specificity 72 [63-79] 74 [65-82] NA NPV 91 [84-96] 92 [84-96] NA PPV 59 [48-69] 43 [29-57] NA Validationb, Malignant 40 14 0 Agreement Benign 110 102 0 set Sensitivity 98 [87-100] 100 [77-100] NA Specificity 78 [69-85] 80 [71-88] NA NPV 99 [94-100] 100 [96-100] NA PPV 62 [49-74] 41 [25-59] NA aIn the training set, estimates are based on the mean of ten 10-fold cross-validation runs. Samples with very low expression in any of the classification steps, as well as medullary samples are not included. bSamples that failed QC are not included. c95% Confidence Intervals are in square brackets [ ]. NA = Non-Applicable. -
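The performance measures reported in Table 12 follow from the standard confusion-matrix definitions, sketched below. The counts in the example are assumptions: the table gives 61 malignant and 128 benign indeterminate validation samples, and the true-positive and true-negative counts used here (52 and 92) were back-calculated from the reported percentages rather than taken from the source.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Sensitivity, specificity, PPV and NPV as percentages."""
    return {
        "sensitivity": 100.0 * tp / (tp + fn),  # malignant called malignant
        "specificity": 100.0 * tn / (tn + fp),  # benign called benign
        "ppv": 100.0 * tp / (tp + fp),          # malignant calls that are correct
        "npv": 100.0 * tn / (tn + fn),          # benign calls that are correct
    }


# Assumed counts for the entire indeterminate validation set:
# 61 malignant (52 detected, 9 missed) and 128 benign (92 correct, 36 not).
validation = diagnostic_metrics(tp=52, fn=9, tn=92, fp=36)
rounded = {name: round(value) for name, value in validation.items()}
```

Under these assumed counts the rounded values agree with the sensitivity, specificity, PPV and NPV listed for the entire validation set; note that PPV and NPV also depend on the proportion of malignant samples in the cohort.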
TABLE 13 Performance of assay per histological type
| Histological type | Training (a): Samples (b) | Training (a): % Correct (c) | Validation: Samples (b) | Validation: % Correct (c) | Validation Full-agreement Set: Samples (b)(d) | Validation Full-agreement Set: % Correct (c) |
|---|---|---|---|---|---|---|
| Medullary | 5 | 100 [48-100] | 3 | 100 [29-100] | 1 (33.3%) | 100 [3-100] |
| PTC Classic | 48 | 94 [83-100] | 17 | 88 [64-99] | 15 (88.2%) | 100 [78-100] |
| FVPTC | 40 | 81 [65-92] | 37 | 84 [68-94] | 23 (62.2%) | 96 [78-100] |
| FC | 16 | 56 [30-80] | 3 | 67 [9-99] | 1 (33.3%) | 100 [3-100] |
| FA | 90 | 76 [66-84] | 95 | 76 [66-84] | 82 (86.3%) | 82 [72-89] |
| Nodular Hyperplasia | 48 | 75 [60-86] | 28 | 64 [44-81] | 23 (82.1%) | 74 [52-90] |
| Hashimoto | 9 | 82 [44-99] | 5 | 40 [5-85] | 5 (100.0%) | 40 [5-85] |
| PDC (e) | 5 | 100 [48-100] | 1 | 100 [3-100] | 0 (0%) | NA |
| Papillary, Other | 6 | 88 [54-100] | 0 | NA | 0 | NA |
| Total | 267 | — | 189 | — | 150 | — |
(a) Only indeterminate training samples are listed in the Table. Estimates are based on the mean of ten 10-fold cross-validation runs. Samples with very low expression in any of the classification steps, as well as medullary samples, are not included.
(b) Number of samples includes only those that passed QC.
(c) 95% Confidence Intervals are in square brackets.
(d) Numbers in parentheses signify the percentage of validation samples in the agreement set.
(e) PDC = Poorly Differentiated Carcinoma.
(f) Other Malignant = Non-classic Papillary and non-FVPTC samples. NA = Not Applicable.
- As an additional proof-of-concept of its performance, the classifier was also assessed on a set of 48 FNA cell blocks (used also in Examples 2 and 5), which were run in the final assay format. The performance of the classifier on the indeterminate samples (72% sensitivity and 79% specificity) was similar to the performance on the FNA smears, as can be seen in Table 14, with a slightly lower sensitivity and a slightly higher specificity. The sensitivity of the malignant Bethesda VI smears was 89% and the specificity of the benign Bethesda II samples was 63%.
TABLE 14 Performance of the assay on cell blocks
| | Malignant | Benign | Sensitivity (%) | Specificity (%) |
|---|---|---|---|---|
| All | 23 | 25 | 78 [56-93] | 80 [59-93] |
| Indeterminate (Bethesda III, IV and V) | 18 | 24 | 72 [47-90] | 79 [58-93] |
| Determinate (Bethesda II and VI) | 5 | 1 | 100 [48-100] | 100 [3-100] |
95% Confidence Intervals are in square brackets.
- Validation of the classifier described in Example 14 was done on a blinded retrospective cohort of 199 FNA smears.
- The retrospective validation cohort is an independent set of 201 consecutive, indeterminate FNA samples (Table 11) from five centers, which was tested blindly in the clinical validation assay. This set of 201 samples included only samples for which at least two of the three pathologists (the original pathologist and at least one of the additional pathologists) agreed on the final diagnosis of the excised H&E stained nodule. In addition, samples whose diagnosis was determined to be medullary carcinoma, according to the original pathologist, were included (regardless of the diagnosis of the additional two pathologists). Only twelve of the 201 samples (6%) failed during processing or QC steps, with the most common reason being low miRNA expression. The performance of the validation cohort was found to be very similar to the performance estimates of the training cohort, as can be seen in Tables 12 and 13.
- The NPV of the entire set was 91%, the sensitivity was 85%, the specificity was 72% and the PPV was 59% (FIG. 37). When the 23 samples with tumor size <1 cm were excluded, i.e., when only samples with a tumor size of 1 cm or larger were considered, the sensitivity was 84% and the specificity was 72%. Seventy validation samples from the US passed QC steps, and these demonstrated a sensitivity of 82% and a specificity of 71%. The nine misclassified malignant samples (Table 15) included samples from all three indeterminate Bethesda classes, both Giemsa and Papanicolaou stains, and were from three medical centers. The misclassified FC sample was described as having minimal capsular invasion, according to the original pathologist, as were the other two, correctly classified FC samples. The Hashimoto samples showed a lower correct classification rate, relative to both the training performance and to the other benign samples. However, this difference may be due to the small number of Hashimoto samples in the validation set.
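The four headline figures for the entire validation set follow directly from a 2x2 confusion table. The sketch below is illustrative only: the 61 malignant and 128 benign samples and the nine false negatives come from the text and Table 12, while the count of 36 false positives is an assumption back-calculated from the reported 72% specificity.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Sensitivity, specificity, PPV and NPV from confusion-table counts."""
    return {
        "sensitivity": tp / (tp + fn),  # malignant samples called malignant
        "specificity": tn / (tn + fp),  # benign samples called benign
        "ppv": tp / (tp + fp),          # malignant calls that are truly malignant
        "npv": tn / (tn + fn),          # benign calls that are truly benign
    }


# Entire validation set after QC: 61 malignant (9 called benign), 128 benign.
# fp=36 is an assumed value inferred from the reported specificity, not stated directly.
metrics = diagnostic_metrics(tp=52, fn=9, tn=92, fp=36)
for name, value in metrics.items():
    print(f"{name}: {100 * value:.0f}%")
```

Under these counts the rounded values agree with the reported 85% sensitivity, 72% specificity, 59% PPV and 91% NPV.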
TABLE 15 The misclassified malignant validation samples
| Bethesda | Stain | Extracted Amount (ng) | Gender | Histological Type | Histological Subtype | In Agreement Set? |
|---|---|---|---|---|---|---|
| V | MGG | 294 | Female | Papillary carcinoma | FVNE | Yes |
| IV | MGG | 4716 | Female | Papillary carcinoma | Classic variant | No |
| IV | PP | 138 | Male | Papillary carcinoma | FVE | No |
| III | PP | 115 | Female | Papillary carcinoma | FVE | No |
| IV | PP | 103 | Female | Papillary carcinoma | FVE | No |
| IV | MGG | 51 | Female | Papillary carcinoma | FVE | No |
| IV | PP | 1242 | Female | Papillary carcinoma | FVE | No |
| IV | MGG | 249 | Female | Follicular carcinoma | MCI | No |
| IV | MGG | 451 | Male | Papillary carcinoma | Classic variant | No |
MGG = May-Grünwald Giemsa; PP = Papanicolaou; FVNE = Follicular Variant, Non-encapsulated; CV = Classic variant; MCI = Minimal capsular invasion; FVE = Follicular Variant Encapsulated
- In order to test the assay on a set of samples for which the final diagnosis has a high degree of certainty, a subset of the validation set (“Agreement set”) was compiled. This set comprised 160 samples (80% of the validation set) for which all three pathologists agreed on the final diagnosis. This set demonstrated very high performance (Table 16). The NPV of the Agreement set was 99% (only one malignant sample was misclassified as benign), with a sensitivity of 98%, a specificity of 78%, and a PPV of 62%. If the NPV and PPV are calculated based on the sensitivity, specificity and prevalence, the NPV would remain over 95%, even if the rate of malignancy was above 60%. The samples in the Agreement set (Table 16) had a much higher correct classification rate when compared with samples not in the Agreement set: of the samples that passed QC steps, 150 were in the Agreement set and 25 (17%) of these were misclassified, whereas 39 samples were not in the Agreement set and 20 (51%) of these were misclassified (p<6.14e-06, χ2 test).
The same conclusion is reached when focusing on the malignant samples, with malignant samples in the Agreement set having a higher correct classification rate: 40 malignant samples were in the Agreement set and one (2.5%) was misclassified, whereas 21 malignant samples were not in the Agreement set and eight (38%) were misclassified (p<5e-4, Fisher's exact test).
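The statement that the NPV would stay above 95% even at a malignancy rate of 60% can be checked with Bayes' rule. This is a minimal sketch using the rounded Agreement-set sensitivity (98%) and specificity (78%) from the text; the function name and the list of prevalence values are illustrative choices.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a test applied at a given prevalence of malignancy."""
    tp = sensitivity * prevalence               # fraction of all samples: true positives
    fn = (1 - sensitivity) * prevalence         # false negatives
    tn = specificity * (1 - prevalence)         # true negatives
    fp = (1 - specificity) * (1 - prevalence)   # false positives
    return tp / (tp + fp), tn / (tn + fn)


# Rounded Agreement-set performance; prevalence values are arbitrary illustration points.
for prevalence in (0.2, 0.4, 0.6):
    ppv, npv = predictive_values(0.98, 0.78, prevalence)
    print(f"prevalence {prevalence:.0%}: PPV {ppv:.1%}, NPV {npv:.1%}")
```

The NPV falls as prevalence rises, but with a sensitivity this high the decline is slow, which is the point the text makes.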
- The performance of the assay is influenced by the accuracy of the diagnosis. Therefore, the level of agreement between the pathologists for the different histological types (Table 16) was examined.
TABLE 16 The malignant histological types in the Agreement set
| Histological type | Agreement set: Total | Agreement set: #Misclassified (b) | Not in Agreement set: Total | Not in Agreement set: #Misclassified (b) |
|---|---|---|---|---|
| Medullary | 1 | 0 | 2 | 0 |
| Papillary classic | 15 | 0 | 2 | 2 |
| FVPTC, encapsulated | 12 | 0 | 14 | 5 |
| FVPTC, non-encapsulated | 10 | 1 | 0 | 0 |
| FC | 1 | 0 | 2 | 1 |
| PDC | 0 | 0 | 1 | 0 |
| Total (a) | 39 | 1 | 21 | 8 |
FVPTC = Follicular Variant of Papillary Thyroid Carcinoma
(a) One FVPTC sample (in the full-agreement set and correctly classified) is not included in the Table, since there was no information available regarding the encapsulation status.
(b) Misclassified as benign.
- Of the 17 classic/conventional papillary thyroid carcinoma (PTC) samples in the entire validation set, only two were not included in the agreement set; these two samples were the only PTC samples misclassified by the assay as benign. There was a large number of encapsulated Follicular Variant of Papillary Carcinoma (FVPTC) samples in the entire validation set that were not included in the agreement set. This higher proportion of encapsulated FVPTC in the subset of samples for which there was no agreement was statistically significant when compared with the proportion of non-encapsulated FVPTC (p<0.0029, Fisher's exact test). Furthermore, of the 26 encapsulated FVPTC samples in the entire validation set, five were misclassified as benign; none of these five cases were in the agreement set. It can also be noted that of the three FC samples, only one was included in the agreement set and this one was correctly classified as malignant. These details highlight the importance of working with a high-quality reference diagnosis set and demonstrate the high performance of the assay on samples for which the reference diagnosis is more reliable.
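The significance tests quoted in this example can be re-derived from the published counts. The sketch below makes two assumptions that the text does not state: the χ2 test is taken to be a Pearson test without continuity correction, and Fisher's exact test is taken to be two-sided (summing all tables no more probable than the observed one). Function names are illustrative.

```python
from math import comb, erfc, sqrt


def chi2_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    stat = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Upper-tail probability of a chi-square variable with one degree of freedom.
    return stat, erfc(sqrt(stat / 2))


def fisher_exact_2x2(a, b, c, d):
    """Two-sided Fisher's exact p-value for the table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def prob(k):
        # Hypergeometric probability of k in the top-left cell with fixed margins.
        return comb(row1, k) * comb(row2, col1 - k) / comb(n, col1)

    observed = prob(a)
    k_values = range(max(0, col1 - row2), min(row1, col1) + 1)
    return sum(prob(k) for k in k_values if prob(k) <= observed * (1 + 1e-9))


# All samples: Agreement set (25 misclassified, 125 correct) vs. not (20, 19).
print(chi2_2x2(25, 125, 20, 19))
# Malignant samples: Agreement set (1 misclassified, 39 correct) vs. not (8, 13).
print(fisher_exact_2x2(1, 39, 8, 13))
# FVPTC from Table 16: encapsulated (12 in Agreement set, 14 not) vs. non-encapsulated (10, 0).
print(fisher_exact_2x2(12, 14, 10, 0))
```

Under these assumptions the three p-values fall just below the thresholds reported in the text (6.14e-06, 5e-4 and 0.0029 respectively).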
- The foregoing description of the specific embodiments so fully reveals the general nature of the invention that others can, by applying current knowledge, readily modify and/or adapt for various applications such specific embodiments without undue experimentation and without departing from the generic concept, and, therefore, such adaptations and modifications should and are intended to be comprehended within the meaning and range of equivalents of the disclosed embodiments. Although the invention has been described in conjunction with specific embodiments thereof, it is evident that many alternatives, modifications and variations will be apparent to those skilled in the art. Accordingly, it is intended to embrace all such alternatives, modifications and variations that fall within the broad scope of the appended claims.
- It should be understood that the detailed description and specific examples, while indicating preferred embodiments of the invention, are given by way of illustration only, since various changes and modifications within the scope of the invention will become apparent to those skilled in the art from this detailed description.
Claims (12)
1. A method of classifying a thyroid lesion sample as malignant or benign, comprising:
a. providing RNA extracted from a thyroid lesion sample obtained from a human subject using fine needle aspiration (FNA);
b. obtaining by real time polymerase chain reaction (PCR) performed on the RNA an expression profile comprising expression levels of miRNAs comprising hsa-miR-31-5p (SEQ ID NO: 5, 6, or 7), hsa-miR-222-3p (SEQ ID NO: 1 or 2), hsa-miR-146b-5p (SEQ ID NO: 10 or 11), MID-16582 (SEQ ID NO: 25), hsa-miR-342-3p (SEQ ID NO: 17 or 18), hsa-miR-125b-5p (SEQ ID NO: 9), hsa-miR-375 (SEQ ID NO: 8), hsa-miR-486-5p (SEQ ID NO: 22), hsa-miR-551b-3p (SEQ ID NO: 3 or 4), hsa-miR-152-3p (SEQ ID NO: 12 or 13), hsa-miR-138-5p (SEQ ID NO: 19, 20, or 21), hsa-miR-23a-3p (SEQ ID NO: 26), and hsa-miR-574-3p (SEQ ID NO: 36 or 37); wherein the PCR comprises contacting the RNA with forward and reverse primers for each of the miRNAs, wherein each forward primer comprises 15-21 nucleotides identical to one of the miRNAs; and wherein the forward primers comprise SEQ ID NO: 317;
c. applying a classifier algorithm to the expression profile; wherein the classifier algorithm compares the expression profile to a reference value; and
d. classifying the thyroid lesion as benign or malignant based on the result from the classifier algorithm.
2. The method of claim 1 , wherein the thyroid lesion has been classified as Bethesda III, IV or V according to the Bethesda system.
3. The method of claim 1 , wherein said classifier algorithm is a machine-learning algorithm.
4. The method of claim 1 , wherein said classifier algorithm is a multi-step classifier.
5. The method of claim 4 , wherein the classifier algorithm comprises at least one linear discriminant analysis (LDA) classifier.
6. The method of claim 5 , wherein the classifier algorithm comprises at least one LDA classifier combined with a KNN classifier.
7. The method of claim 1 , wherein following step (b), the method further comprises a step of obtaining a ratio between the expression levels of at least one pair of microRNAs; and wherein in step (c) said classifier algorithm is applied to any one of the microRNA expression profile, said ratio of at least one pair of microRNAs, or to a combination thereof.
8. The method of claim 1 , wherein said algorithm further combines at least one of clinical or genetic data from said sample.
9. The method of claim 1 , further comprising the step of administering a differential treatment to said subject if said thyroid lesion is classified as benign or malignant.
10. The method of claim 9 , wherein said lesion is classified as malignant and said treatment is any one of surgery, chemotherapy, radiotherapy, hormone therapy, or any other recommended treatment.
11. The method of claim 1 , wherein said classifying further includes a step of eliminating a sample classified as medullary malignant carcinoma.
12. The method of claim 1 , wherein said classification has a negative predictive value of between 84 and 96%.
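As a purely illustrative sketch of the ideas in claims 6 and 7 (a ratio between the expression levels of a pair of microRNAs, and a KNN classifier), the toy example below derives log2 ratios from qPCR Ct values and applies a k-nearest-neighbour vote. Everything here is hypothetical: the Ct values, the training points, the choice of k and the function names are assumptions for illustration and are not the classifier described in the specification, and the LDA step is omitted.

```python
from collections import Counter


def log2_ratio(ct_a, ct_b):
    """log2 of the expression ratio a/b, assuming expression is proportional to 2**(-Ct)."""
    return ct_b - ct_a


def knn_label(query, training, k=3):
    """Majority label among the k training points nearest to query (squared Euclidean)."""
    ranked = sorted(
        training,
        key=lambda item: sum((q - t) ** 2 for q, t in zip(query, item[0])),
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical training data: (log2 ratio of pair 1, log2 ratio of pair 2) -> label.
TRAINING = [
    ((3.1, -0.5), "malignant"), ((2.7, -0.2), "malignant"), ((3.4, 0.1), "malignant"),
    ((0.2, 1.8), "benign"), ((-0.4, 2.2), "benign"), ((0.5, 1.5), "benign"),
]

# Hypothetical Ct values for two microRNA pairs measured in one sample.
features = (log2_ratio(24.0, 27.0), log2_ratio(26.0, 25.8))
print(knn_label(features, TRAINING))
```

Working with Ct differences keeps the features on a log scale, so a ratio between two microRNAs becomes a simple subtraction.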
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US16/192,221 US20190300963A1 (en) | 2014-05-13 | 2018-11-15 | Mirna expression signature in the classification of thyroid tumors |
Applications Claiming Priority (9)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US201461992531P | 2014-05-13 | 2014-05-13 | |
| US201461992756P | 2014-05-13 | 2014-05-13 | |
| US201462069353P | 2014-10-28 | 2014-10-28 | |
| US201562139066P | 2015-03-27 | 2015-03-27 | |
| PCT/US2015/030564 WO2015175660A1 (en) | 2014-05-13 | 2015-05-13 | Mirna expression signature in the classification of thyroid tumors |
| US201662321498P | 2016-04-12 | 2016-04-12 | |
| US15/237,364 US9708667B2 (en) | 2014-05-13 | 2016-08-15 | MiRNA expression signature in the classification of thyroid tumors |
| US15/625,645 US20170356055A1 (en) | 2014-05-13 | 2017-06-16 | Mirna expression signature in the classification of thyroid tumors |
| US16/192,221 US20190300963A1 (en) | 2014-05-13 | 2018-11-15 | Mirna expression signature in the classification of thyroid tumors |
Related Parent Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| US15/625,645 Continuation US20170356055A1 (en) | 2014-05-13 | 2017-06-16 | Mirna expression signature in the classification of thyroid tumors |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| US20190300963A1 true US20190300963A1 (en) | 2019-10-03 |
Family
ID=57775629
Family Applications (3)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| US15/237,364 Expired - Fee Related US9708667B2 (en) | 2014-05-13 | 2016-08-15 | MiRNA expression signature in the classification of thyroid tumors |
| US15/625,645 Abandoned US20170356055A1 (en) | 2014-05-13 | 2017-06-16 | Mirna expression signature in the classification of thyroid tumors |
| US16/192,221 Abandoned US20190300963A1 (en) | 2014-05-13 | 2018-11-15 | Mirna expression signature in the classification of thyroid tumors |
Country Status (1)
| Country | Link |
|---|---|
| US (3) | US9708667B2 (en) |
Families Citing this family (38)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US8338109B2 (en) | 2006-11-02 | 2012-12-25 | Mayo Foundation For Medical Education And Research | Predicting cancer outcome |
| EP2806054A1 (en) | 2008-05-28 | 2014-11-26 | Genomedx Biosciences Inc. | Systems and methods for expression-based discrimination of distinct clinical disease states in prostate cancer |
| US10407731B2 (en) | 2008-05-30 | 2019-09-10 | Mayo Foundation For Medical Education And Research | Biomarker panels for predicting prostate cancer outcomes |
| US9495515B1 (en) | 2009-12-09 | 2016-11-15 | Veracyte, Inc. | Algorithms for disease diagnostics |
| US10236078B2 (en) | 2008-11-17 | 2019-03-19 | Veracyte, Inc. | Methods for processing or analyzing a sample of thyroid tissue |
| US9074258B2 (en) | 2009-03-04 | 2015-07-07 | Genomedx Biosciences Inc. | Compositions and methods for classifying thyroid nodule disease |
| US8669057B2 (en) | 2009-05-07 | 2014-03-11 | Veracyte, Inc. | Methods and compositions for diagnosis of thyroid conditions |
| US10446272B2 (en) | 2009-12-09 | 2019-10-15 | Veracyte, Inc. | Methods and compositions for classification of samples |
| US20130267443A1 (en) | 2010-11-19 | 2013-10-10 | The Regents Of The University Of Michigan | ncRNA AND USES THEREOF |
| CN105861712B (en) | 2011-04-18 | 2021-05-14 | 迪阿米尔有限责任公司 | Methods for early detection and monitoring of Mild Cognitive Impairment (MCI) and Alzheimer's Disease (AD) using miRNA from bodily fluids |
| EP2791359B1 (en) | 2011-12-13 | 2020-01-15 | Decipher Biosciences, Inc. | Cancer diagnostics using non-coding transcripts |
| CA2881627A1 (en) | 2012-08-16 | 2014-02-20 | Genomedx Biosciences Inc. | Cancer diagnostics using biomarkers |
| US10526655B2 (en) | 2013-03-14 | 2020-01-07 | Veracyte, Inc. | Methods for evaluating COPD status |
| US11976329B2 (en) | 2013-03-15 | 2024-05-07 | Veracyte, Inc. | Methods and systems for detecting usual interstitial pneumonia |
| CA2931082C (en) | 2013-11-18 | 2024-01-23 | Diamir, Llc | Methods of using mirnas from bodily fluids for detection and monitoring of parkinson's disease (pd) |
| US12297505B2 (en) | 2014-07-14 | 2025-05-13 | Veracyte, Inc. | Algorithms for disease diagnostics |
| US20170335396A1 (en) | 2014-11-05 | 2017-11-23 | Veracyte, Inc. | Systems and methods of diagnosing idiopathic pulmonary fibrosis on transbronchial biopsies using machine learning and high dimensional transcriptional data |
| US10975436B2 (en) | 2016-01-05 | 2021-04-13 | Diamir, Llc | Methods of using miRNA from bodily fluids for diagnosis and monitoring of neurodevelopmental disorders |
| ES2993025T3 (en) | 2016-03-21 | 2024-12-20 | Diamir Llc | Methods of using mirnas from bodily fluids for detection and differentiation of neurodegenerative diseases |
| AU2017315425B2 (en) | 2016-08-24 | 2023-11-09 | The Regents Of The University Of Michigan | Use of genomic signatures to predict responsiveness of patients with prostate cancer to post-operative radiation therapy |
| EP3571322B9 (en) | 2017-01-20 | 2023-10-04 | VERACYTE SD, Inc. | Molecular subtyping, prognosis, and treatment of bladder cancer |
| EP3593140A4 (en) | 2017-03-09 | 2021-01-06 | Decipher Biosciences, Inc. | SUBTYPING PROSTATE CANCER TO PREDICT RESPONSE TO HORMONE THERAPY |
| US11078542B2 (en) | 2017-05-12 | 2021-08-03 | Decipher Biosciences, Inc. | Genetic signatures to predict prostate cancer metastasis and identify tumor aggressiveness |
| CN107025311A (en) * | 2017-05-18 | 2017-08-08 | 北京大学 | A kind of Bayes's personalized recommendation method and device based on k nearest neighbor |
| US11217329B1 (en) | 2017-06-23 | 2022-01-04 | Veracyte, Inc. | Methods and systems for determining biological sample integrity |
| US10781487B2 (en) | 2017-07-24 | 2020-09-22 | Diamir, Llc | miRNA-based methods for detecting and monitoring aging |
| GB2581584A (en) * | 2017-07-27 | 2020-08-26 | Veracyte Inc | Genomic sequencing classifier |
| EP3701050B1 (en) * | 2017-10-24 | 2024-03-06 | Université Paris Cité | Diagnosis and/or prognosis of her2-dependent cancer using one or more mirna as a biomarker |
| KR20210052389A (en) | 2018-08-27 | 2021-05-10 | 리제너론 파마슈티칼스 인코포레이티드 | Use of Raman spectroscopy in downstream purification |
| TW202018727A (en) | 2018-11-09 | 2020-05-16 | 財團法人工業技術研究院 | Ensemble learning predicting method and system |
| CN109872773A (en) * | 2019-02-26 | 2019-06-11 | 哈尔滨工业大学 | Mirco-RNA precursor identification method based on fusion of Adaboost, BP neural network and random forest |
| CN110184338B (en) * | 2019-05-31 | 2023-04-07 | 南方医科大学第三附属医院(广东省骨科研究院) | Application of cerebrospinal fluid exosome miRNA in MMD diagnosis and treatment |
| CN110484620B (en) * | 2019-08-09 | 2023-09-19 | 深圳市第二人民医院 | Biomarker and application thereof in preparation of product for diagnosing PTMC (ptm-associated tumor cell) |
| CN110499367B (en) * | 2019-08-09 | 2022-11-22 | 深圳市第二人民医院 | Biomarkers and their applications |
| IT201900024009A1 (en) * | 2019-12-13 | 2021-06-13 | Univ Degli Studi Roma La Sapienza | NEW PROCEDURE FOR THE DIAGNOSIS OF A THYROID CANCER AND RELATED KIT |
| CA3221494A1 (en) * | 2021-06-09 | 2022-12-15 | Andrew Zhang | Cancer detection method, kit, and system |
| EP4303324A1 (en) | 2022-07-05 | 2024-01-10 | Narodowy Instytut Onkologii im. Marii Sklodowskiej-Curie Panstwowy Instytut Oddzial w Gliwicach | A method of distinguishing between benign and malignant thyroid nodules |
| CN116987791B (en) * | 2023-09-22 | 2023-12-22 | 润安医学科技(苏州)有限公司 | Application of plasma markers in identification of benign and malignant thyroid nodule |
Family Cites Families (1)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US8945829B2 (en) * | 2011-03-22 | 2015-02-03 | Cornell University | Distinguishing benign and malignant indeterminate thyroid lesions |
Non-Patent Citations (1)
| Title |
|---|
| BUSK. (Jan.2014) "A tool for design of primers for microRNA-specific quantitative RT-qPCR". BMC BIOINFORMATICS, V.15:29 (9 pages). (Year: 2014) * |
Cited By (3)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20200320701A1 (en) * | 2018-03-27 | 2020-10-08 | Tencent Technology (Shenzhen) Company Limited | Image processing method and apparatus and neural network model training method |
| US11501431B2 (en) * | 2018-03-27 | 2022-11-15 | Tencent Technology (Shenzhen) Company Ltd | Image processing method and apparatus and neural network model training method |
| CN110890132A (en) * | 2019-11-19 | 2020-03-17 | 湖南大学 | Cancer mutation cluster identification method based on adaptive Gaussian mixture model |
Also Published As
| Publication number | Publication date |
|---|---|
| US20170016076A1 (en) | 2017-01-19 |
| US20170356055A1 (en) | 2017-12-14 |
| US9708667B2 (en) | 2017-07-18 |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | STPP | Information on status: patent application and granting procedure in general | Free format text: FINAL REJECTION MAILED |
| | STCB | Information on status: application discontinuation | Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION |