CN112608925B - Pathogenic gene COL2A1 mutation of bone dysplasia disease and detection reagent thereof
- Publication number
- CN112608925B (application number CN202011551982.5A)
- Authority
- CN
- China
- Prior art keywords
- gly
- pro
- mutation
- ala
- gene
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Fee Related
Links
- 230000035772 mutation Effects 0.000 title claims abstract description 109
- 238000001514 detection method Methods 0.000 title claims abstract description 43
- 208000037265 diseases, disorders, signs and symptoms Diseases 0.000 title claims abstract description 30
- 239000003153 chemical reaction reagent Substances 0.000 title claims abstract description 22
- 230000001717 pathogenic effect Effects 0.000 title abstract description 27
- 101000771163 Homo sapiens Collagen alpha-1(II) chain Proteins 0.000 title abstract description 21
- 206010072610 Skeletal dysplasia Diseases 0.000 title description 11
- 208000013558 Developmental Bone disease Diseases 0.000 title description 8
- 101150082216 COL2A1 gene Proteins 0.000 claims abstract description 29
- 206010031243 Osteogenesis imperfecta Diseases 0.000 claims abstract description 11
- 108090000623 proteins and genes Proteins 0.000 claims description 58
- 239000000523 sample Substances 0.000 claims description 12
- 239000002773 nucleotide Substances 0.000 claims description 10
- 125000003729 nucleotide group Chemical group 0.000 claims description 10
- 201000010099 disease Diseases 0.000 abstract description 21
- 102100029136 Collagen alpha-1(II) chain Human genes 0.000 abstract description 15
- 238000003745 diagnosis Methods 0.000 abstract description 14
- 150000001413 amino acids Chemical class 0.000 abstract description 10
- 108010041390 Collagen Type II Proteins 0.000 abstract description 4
- 102000000503 Collagen Type II Human genes 0.000 abstract description 4
- 210000001188 articular cartilage Anatomy 0.000 abstract 1
- KZNQNBZMBZJQJO-UHFFFAOYSA-N N-glycyl-L-proline Natural products NCC(=O)N1CCCC1C(O)=O KZNQNBZMBZJQJO-UHFFFAOYSA-N 0.000 description 22
- 108010014614 prolyl-glycyl-proline Proteins 0.000 description 22
- 241000252212 Danio rerio Species 0.000 description 21
- 108020004414 DNA Proteins 0.000 description 16
- 108010047495 alanylglycine Proteins 0.000 description 16
- BRPMXFSTKXXNHF-IUCAKERBSA-N (2s)-1-[2-[[(2s)-pyrrolidine-2-carbonyl]amino]acetyl]pyrrolidine-2-carboxylic acid Chemical compound OC(=O)[C@@H]1CCCN1C(=O)CNC(=O)[C@H]1NCCC1 BRPMXFSTKXXNHF-IUCAKERBSA-N 0.000 description 15
- VPZXBVLAVMBEQI-UHFFFAOYSA-N glycyl-DL-alpha-alanine Natural products OC(=O)C(C)NC(=O)CN VPZXBVLAVMBEQI-UHFFFAOYSA-N 0.000 description 15
- 238000000034 method Methods 0.000 description 13
- 108010029020 prolylglycine Proteins 0.000 description 13
- 238000012163 sequencing technique Methods 0.000 description 13
- 108010087846 prolyl-prolyl-glycine Proteins 0.000 description 12
- LEIKGVHQTKHOLM-IUCAKERBSA-N Pro-Pro-Gly Chemical compound OC(=O)CNC(=O)[C@@H]1CCCN1C(=O)[C@H]1NCCC1 LEIKGVHQTKHOLM-IUCAKERBSA-N 0.000 description 11
- 210000003754 fetus Anatomy 0.000 description 11
- 239000013612 plasmid Substances 0.000 description 11
- 108010069020 alanyl-prolyl-glycine Proteins 0.000 description 10
- 230000001605 fetal effect Effects 0.000 description 10
- 241000282414 Homo sapiens Species 0.000 description 9
- 108010078144 glutaminyl-glycine Proteins 0.000 description 9
- JYPCXBJRLBHWME-UHFFFAOYSA-N glycyl-L-prolyl-L-arginine Natural products NCC(=O)N1CCCC1C(=O)NC(CCCN=C(N)N)C(O)=O JYPCXBJRLBHWME-UHFFFAOYSA-N 0.000 description 9
- 108010064235 lysylglycine Proteins 0.000 description 9
- 238000007480 sanger sequencing Methods 0.000 description 9
- 208000026350 Inborn Genetic disease Diseases 0.000 description 8
- 238000004458 analytical method Methods 0.000 description 8
- 238000007481 next generation sequencing Methods 0.000 description 8
- OOCFXNOVSLSHAB-IUCAKERBSA-N Gly-Pro-Pro Chemical compound NCC(=O)N1CCC[C@H]1C(=O)N1[C@H](C(O)=O)CCC1 OOCFXNOVSLSHAB-IUCAKERBSA-N 0.000 description 7
- UGTHTQWIQKEDEH-BQBZGAKWSA-N L-alanyl-L-prolylglycine zwitterion Chemical compound C[C@H](N)C(=O)N1CCC[C@H]1C(=O)NCC(O)=O UGTHTQWIQKEDEH-BQBZGAKWSA-N 0.000 description 7
- 108010079364 N-glycylalanine Proteins 0.000 description 7
- 235000001014 amino acid Nutrition 0.000 description 7
- 210000000988 bone and bone Anatomy 0.000 description 7
- 238000011161 development Methods 0.000 description 7
- 230000018109 developmental process Effects 0.000 description 7
- 208000016361 genetic disease Diseases 0.000 description 7
- 238000012165 high-throughput sequencing Methods 0.000 description 7
- 235000018102 proteins Nutrition 0.000 description 7
- 102000004169 proteins and genes Human genes 0.000 description 7
- 239000000243 solution Substances 0.000 description 7
- 108010002311 N-glycylglutamic acid Proteins 0.000 description 6
- ZPPVJIJMIKTERM-YUMQZZPRSA-N Pro-Gln-Gly Chemical compound OC(=O)CNC(=O)[C@H](CCC(=O)N)NC(=O)[C@@H]1CCCN1 ZPPVJIJMIKTERM-YUMQZZPRSA-N 0.000 description 6
- CLNJSLSHKJECME-BQBZGAKWSA-N Pro-Gly-Ala Chemical compound OC(=O)[C@H](C)NC(=O)CNC(=O)[C@@H]1CCCN1 CLNJSLSHKJECME-BQBZGAKWSA-N 0.000 description 6
- BGWKULMLUIUPKY-BQBZGAKWSA-N Pro-Ser-Gly Chemical compound OC(=O)CNC(=O)[C@H](CO)NC(=O)[C@@H]1CCCN1 BGWKULMLUIUPKY-BQBZGAKWSA-N 0.000 description 6
- 238000005516 engineering process Methods 0.000 description 6
- 108010077515 glycylproline Proteins 0.000 description 6
- SCAKQYSGEIHPLV-IUCAKERBSA-N (4S)-4-[(2-aminoacetyl)amino]-5-[(2S)-2-(carboxymethylcarbamoyl)pyrrolidin-1-yl]-5-oxopentanoic acid Chemical compound NCC(=O)N[C@@H](CCC(O)=O)C(=O)N1CCC[C@H]1C(=O)NCC(O)=O SCAKQYSGEIHPLV-IUCAKERBSA-N 0.000 description 5
- QXPRJQPCFXMCIY-NKWVEPMBSA-N Gly-Ala-Pro Chemical compound C[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)CN QXPRJQPCFXMCIY-NKWVEPMBSA-N 0.000 description 5
- UUYBFNKHOCJCHT-VHSXEESVSA-N Gly-Leu-Pro Chemical compound CC(C)C[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)CN UUYBFNKHOCJCHT-VHSXEESVSA-N 0.000 description 5
- IEGFSKKANYKBDU-QWHCGFSZSA-N Gly-Phe-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC2=CC=CC=C2)NC(=O)CN)C(=O)O IEGFSKKANYKBDU-QWHCGFSZSA-N 0.000 description 5
- CAVKXZMMDNOZJU-UHFFFAOYSA-N Gly-Pro-Ala-Gly-Pro Natural products C1CCC(C(O)=O)N1C(=O)CNC(=O)C(C)NC(=O)C1CCCN1C(=O)CN CAVKXZMMDNOZJU-UHFFFAOYSA-N 0.000 description 5
- FKLSMYYLJHYPHH-UWVGGRQHSA-N Pro-Gly-Leu Chemical compound [H]N1CCC[C@H]1C(=O)NCC(=O)N[C@@H](CC(C)C)C(O)=O FKLSMYYLJHYPHH-UWVGGRQHSA-N 0.000 description 5
- 108010009111 arginyl-glycyl-glutamic acid Proteins 0.000 description 5
- 108010047857 aspartylglycine Proteins 0.000 description 5
- 210000000349 chromosome Anatomy 0.000 description 5
- 230000037430 deletion Effects 0.000 description 5
- 238000012217 deletion Methods 0.000 description 5
- 239000012634 fragment Substances 0.000 description 5
- 230000002068 genetic effect Effects 0.000 description 5
- 210000005259 peripheral blood Anatomy 0.000 description 5
- 239000011886 peripheral blood Substances 0.000 description 5
- 210000001519 tissue Anatomy 0.000 description 5
- 238000002604 ultrasonography Methods 0.000 description 5
- ZVFVBBGVOILKPO-WHFBIAKZSA-N Ala-Gly-Ala Chemical compound C[C@H](N)C(=O)NCC(=O)N[C@@H](C)C(O)=O ZVFVBBGVOILKPO-WHFBIAKZSA-N 0.000 description 4
- BEMGNWZECGIJOI-WDSKDSINSA-N Ala-Gly-Glu Chemical compound [H]N[C@@H](C)C(=O)NCC(=O)N[C@@H](CCC(O)=O)C(O)=O BEMGNWZECGIJOI-WDSKDSINSA-N 0.000 description 4
- MQIGTEQXYCRLGK-BQBZGAKWSA-N Ala-Gly-Pro Chemical compound C[C@H](N)C(=O)NCC(=O)N1CCC[C@H]1C(O)=O MQIGTEQXYCRLGK-BQBZGAKWSA-N 0.000 description 4
- 206010058314 Dysplasia Diseases 0.000 description 4
- 108700024394 Exon Proteins 0.000 description 4
- 206010064571 Gene mutation Diseases 0.000 description 4
- JYPCXBJRLBHWME-IUCAKERBSA-N Gly-Pro-Arg Chemical compound NCC(=O)N1CCC[C@H]1C(=O)N[C@@H](CCCNC(N)=N)C(O)=O JYPCXBJRLBHWME-IUCAKERBSA-N 0.000 description 4
- HAOUOFNNJJLVNS-BQBZGAKWSA-N Gly-Pro-Ser Chemical compound NCC(=O)N1CCC[C@H]1C(=O)N[C@@H](CO)C(O)=O HAOUOFNNJJLVNS-BQBZGAKWSA-N 0.000 description 4
- BNBBNGZZKQUWCD-IUCAKERBSA-N Pro-Arg-Gly Chemical compound NC(N)=NCCC[C@@H](C(=O)NCC(O)=O)NC(=O)[C@@H]1CCCN1 BNBBNGZZKQUWCD-IUCAKERBSA-N 0.000 description 4
- VYWNORHENYEQDW-YUMQZZPRSA-N Pro-Gly-Glu Chemical compound OC(=O)CC[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H]1CCCN1 VYWNORHENYEQDW-YUMQZZPRSA-N 0.000 description 4
- YTPLVNUZZOBFFC-SCZZXKLOSA-N Val-Gly-Pro Chemical compound CC(C)[C@H](N)C(=O)NCC(=O)N1CCC[C@@H]1C(O)=O YTPLVNUZZOBFFC-SCZZXKLOSA-N 0.000 description 4
- 230000002159 abnormal effect Effects 0.000 description 4
- 108010086434 alanyl-seryl-glycine Proteins 0.000 description 4
- 108010091092 arginyl-glycyl-proline Proteins 0.000 description 4
- 210000002257 embryonic structure Anatomy 0.000 description 4
- 238000002474 experimental method Methods 0.000 description 4
- 108010050848 glycylleucine Proteins 0.000 description 4
- 238000009396 hybridization Methods 0.000 description 4
- 238000002513 implantation Methods 0.000 description 4
- 210000001161 mammalian embryo Anatomy 0.000 description 4
- 238000000520 microinjection Methods 0.000 description 4
- 238000003793 prenatal diagnosis Methods 0.000 description 4
- 108010026333 seryl-proline Proteins 0.000 description 4
- 230000012488 skeletal system development Effects 0.000 description 4
- 108010061238 threonyl-glycine Proteins 0.000 description 4
- CUVSTAMIHSSVKL-UWVGGRQHSA-N (4s)-4-[(2-aminoacetyl)amino]-5-[[(2s)-6-amino-1-(carboxymethylamino)-1-oxohexan-2-yl]amino]-5-oxopentanoic acid Chemical compound NCCCC[C@@H](C(=O)NCC(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)CN CUVSTAMIHSSVKL-UWVGGRQHSA-N 0.000 description 3
- JBGSZRYCXBPWGX-BQBZGAKWSA-N Ala-Arg-Gly Chemical compound OC(=O)CNC(=O)[C@@H](NC(=O)[C@@H](N)C)CCCN=C(N)N JBGSZRYCXBPWGX-BQBZGAKWSA-N 0.000 description 3
- RTZCUEHYUQZIDE-WHFBIAKZSA-N Ala-Ser-Gly Chemical compound C[C@H](N)C(=O)N[C@@H](CO)C(=O)NCC(O)=O RTZCUEHYUQZIDE-WHFBIAKZSA-N 0.000 description 3
- AUFHLLPVPSMEOG-YUMQZZPRSA-N Arg-Gly-Glu Chemical compound NC(N)=NCCC[C@H](N)C(=O)NCC(=O)N[C@@H](CCC(O)=O)C(O)=O AUFHLLPVPSMEOG-YUMQZZPRSA-N 0.000 description 3
- 206010010356 Congenital anomaly Diseases 0.000 description 3
- NSORZJXKUQFEKL-JGVFFNPUSA-N Gln-Gly-Pro Chemical compound C1C[C@@H](N(C1)C(=O)CNC(=O)[C@H](CCC(=O)N)N)C(=O)O NSORZJXKUQFEKL-JGVFFNPUSA-N 0.000 description 3
- GPSHCSTUYOQPAI-JHEQGTHGSA-N Glu-Thr-Gly Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)NCC(O)=O GPSHCSTUYOQPAI-JHEQGTHGSA-N 0.000 description 3
- PYTZFYUXZZHOAD-WHFBIAKZSA-N Gly-Ala-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](C)NC(=O)CN PYTZFYUXZZHOAD-WHFBIAKZSA-N 0.000 description 3
- GZUKEVBTYNNUQF-WDSKDSINSA-N Gly-Ala-Gln Chemical compound NCC(=O)N[C@@H](C)C(=O)N[C@@H](CCC(N)=O)C(O)=O GZUKEVBTYNNUQF-WDSKDSINSA-N 0.000 description 3
- KKBWDNZXYLGJEY-UHFFFAOYSA-N Gly-Arg-Pro Natural products NCC(=O)NC(CCNC(=N)N)C(=O)N1CCCC1C(=O)O KKBWDNZXYLGJEY-UHFFFAOYSA-N 0.000 description 3
- MOJKRXIRAZPZLW-WDSKDSINSA-N Gly-Glu-Ala Chemical compound [H]NCC(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C)C(O)=O MOJKRXIRAZPZLW-WDSKDSINSA-N 0.000 description 3
- DHDOADIPGZTAHT-YUMQZZPRSA-N Gly-Glu-Arg Chemical compound NCC(=O)N[C@@H](CCC(O)=O)C(=O)N[C@H](C(O)=O)CCCN=C(N)N DHDOADIPGZTAHT-YUMQZZPRSA-N 0.000 description 3
- HFXJIZNEXNIZIJ-BQBZGAKWSA-N Gly-Glu-Gln Chemical compound NCC(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(N)=O)C(O)=O HFXJIZNEXNIZIJ-BQBZGAKWSA-N 0.000 description 3
- SWQALSGKVLYKDT-UHFFFAOYSA-N Gly-Ile-Ala Natural products NCC(=O)NC(C(C)CC)C(=O)NC(C)C(O)=O SWQALSGKVLYKDT-UHFFFAOYSA-N 0.000 description 3
- TWTPDFFBLQEBOE-IUCAKERBSA-N Gly-Leu-Gln Chemical compound [H]NCC(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(N)=O)C(O)=O TWTPDFFBLQEBOE-IUCAKERBSA-N 0.000 description 3
- GGLIDLCEPDHEJO-BQBZGAKWSA-N Gly-Pro-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@@H]1CCCN1C(=O)CN GGLIDLCEPDHEJO-BQBZGAKWSA-N 0.000 description 3
- GAAHQHNCMIAYEX-UWVGGRQHSA-N Gly-Pro-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)CN GAAHQHNCMIAYEX-UWVGGRQHSA-N 0.000 description 3
- DHMQDGOQFOQNFH-UHFFFAOYSA-N Glycine Chemical compound NCC(O)=O DHMQDGOQFOQNFH-UHFFFAOYSA-N 0.000 description 3
- 101100221182 Homo sapiens COL2A1 gene Proteins 0.000 description 3
- KIZQGKLMXKGDIV-BQBZGAKWSA-N Pro-Ala-Gly Chemical compound OC(=O)CNC(=O)[C@H](C)NC(=O)[C@@H]1CCCN1 KIZQGKLMXKGDIV-BQBZGAKWSA-N 0.000 description 3
- HAEGAELAYWSUNC-WPRPVWTQSA-N Pro-Gly-Val Chemical compound [H]N1CCC[C@H]1C(=O)NCC(=O)N[C@@H](C(C)C)C(O)=O HAEGAELAYWSUNC-WPRPVWTQSA-N 0.000 description 3
- MSIYNSBKKVMGFO-BHNWBGBOSA-N Thr-Gly-Pro Chemical compound C[C@H]([C@@H](C(=O)NCC(=O)N1CCC[C@@H]1C(=O)O)N)O MSIYNSBKKVMGFO-BHNWBGBOSA-N 0.000 description 3
- 230000005856 abnormality Effects 0.000 description 3
- 108010076324 alanyl-glycyl-glycine Proteins 0.000 description 3
- 210000004369 blood Anatomy 0.000 description 3
- 239000008280 blood Substances 0.000 description 3
- 210000000845 cartilage Anatomy 0.000 description 3
- 238000006243 chemical reaction Methods 0.000 description 3
- 230000014509 gene expression Effects 0.000 description 3
- 108010025801 glycyl-prolyl-arginine Proteins 0.000 description 3
- 108010015792 glycyllysine Proteins 0.000 description 3
- 231100000518 lethal Toxicity 0.000 description 3
- 230000001665 lethal effect Effects 0.000 description 3
- 108010005942 methionylglycine Proteins 0.000 description 3
- 150000007523 nucleic acids Chemical class 0.000 description 3
- 230000035935 pregnancy Effects 0.000 description 3
- 230000004853 protein function Effects 0.000 description 3
- 238000012216 screening Methods 0.000 description 3
- 238000002054 transplantation Methods 0.000 description 3
- 238000012795 verification Methods 0.000 description 3
- AQPVUEJJARLJHB-BQBZGAKWSA-N Arg-Gly-Ala Chemical compound OC(=O)[C@H](C)NC(=O)CNC(=O)[C@@H](N)CCCN=C(N)N AQPVUEJJARLJHB-BQBZGAKWSA-N 0.000 description 2
- OQCWXQJLCDPRHV-UWVGGRQHSA-N Arg-Gly-Leu Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)NCC(=O)N[C@@H](CC(C)C)C(O)=O OQCWXQJLCDPRHV-UWVGGRQHSA-N 0.000 description 2
- HAVKMRGWNXMCDR-STQMWFEESA-N Arg-Gly-Phe Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)NCC(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O HAVKMRGWNXMCDR-STQMWFEESA-N 0.000 description 2
- ZATRYQNPUHGXCU-DTWKUNHWSA-N Arg-Gly-Pro Chemical compound C1C[C@@H](N(C1)C(=O)CNC(=O)[C@H](CCCN=C(N)N)N)C(=O)O ZATRYQNPUHGXCU-DTWKUNHWSA-N 0.000 description 2
- GKKUBLFXKRDMFC-BQBZGAKWSA-N Asn-Pro-Gly Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N1CCC[C@H]1C(=O)NCC(O)=O GKKUBLFXKRDMFC-BQBZGAKWSA-N 0.000 description 2
- AXXCUABIFZPKPM-BQBZGAKWSA-N Asp-Arg-Gly Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)NCC(O)=O AXXCUABIFZPKPM-BQBZGAKWSA-N 0.000 description 2
- FANQWNCPNFEPGZ-WHFBIAKZSA-N Asp-Asp-Gly Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(O)=O)C(=O)NCC(O)=O FANQWNCPNFEPGZ-WHFBIAKZSA-N 0.000 description 2
- HAFCJCDJGIOYPW-WDSKDSINSA-N Asp-Gly-Gln Chemical compound OC(=O)C[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CCC(N)=O HAFCJCDJGIOYPW-WDSKDSINSA-N 0.000 description 2
- 238000007400 DNA extraction Methods 0.000 description 2
- 108090000790 Enzymes Proteins 0.000 description 2
- 102000004190 Enzymes Human genes 0.000 description 2
- 108010072062 GEKG peptide Proteins 0.000 description 2
- NLKVNZUFDPWPNL-YUMQZZPRSA-N Glu-Arg-Gly Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)NCC(O)=O NLKVNZUFDPWPNL-YUMQZZPRSA-N 0.000 description 2
- PUUYVMYCMIWHFE-BQBZGAKWSA-N Gly-Ala-Arg Chemical compound NCC(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CCCN=C(N)N PUUYVMYCMIWHFE-BQBZGAKWSA-N 0.000 description 2
- JBRBACJPBZNFMF-YUMQZZPRSA-N Gly-Ala-Lys Chemical compound NCC(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CCCCN JBRBACJPBZNFMF-YUMQZZPRSA-N 0.000 description 2
- CLODWIOAKCSBAN-BQBZGAKWSA-N Gly-Arg-Asp Chemical compound NC(N)=NCCC[C@H](NC(=O)CN)C(=O)N[C@@H](CC(O)=O)C(O)=O CLODWIOAKCSBAN-BQBZGAKWSA-N 0.000 description 2
- KQDMENMTYNBWMR-WHFBIAKZSA-N Gly-Asp-Ala Chemical compound [H]NCC(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](C)C(O)=O KQDMENMTYNBWMR-WHFBIAKZSA-N 0.000 description 2
- SWQALSGKVLYKDT-ZKWXMUAHSA-N Gly-Ile-Ala Chemical compound NCC(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](C)C(O)=O SWQALSGKVLYKDT-ZKWXMUAHSA-N 0.000 description 2
- OCPPBNKYGYSLOE-IUCAKERBSA-N Gly-Pro-Met Chemical compound CSCC[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)CN OCPPBNKYGYSLOE-IUCAKERBSA-N 0.000 description 2
- FNXSYBOHALPRHV-ONGXEEELSA-N Gly-Val-Lys Chemical compound NCC(=O)N[C@@H](C(C)C)C(=O)N[C@H](C(O)=O)CCCCN FNXSYBOHALPRHV-ONGXEEELSA-N 0.000 description 2
- 101100175482 Glycine max CG-3 gene Proteins 0.000 description 2
- 241000282412 Homo Species 0.000 description 2
- GQKSJYINYYWPMR-NGZCFLSTSA-N Ile-Gly-Pro Chemical compound CC[C@H](C)[C@@H](C(=O)NCC(=O)N1CCC[C@@H]1C(=O)O)N GQKSJYINYYWPMR-NGZCFLSTSA-N 0.000 description 2
- 208000001182 Kniest dysplasia Diseases 0.000 description 2
- OUYCCCASQSFEME-QMMMGPOBSA-N L-tyrosine Chemical compound OC(=O)[C@@H](N)CC1=CC=C(O)C=C1 OUYCCCASQSFEME-QMMMGPOBSA-N 0.000 description 2
- YFBBUHJJUXXZOF-UWVGGRQHSA-N Leu-Gly-Pro Chemical compound CC(C)C[C@H](N)C(=O)NCC(=O)N1CCC[C@H]1C(O)=O YFBBUHJJUXXZOF-UWVGGRQHSA-N 0.000 description 2
- UCBPDSYUVAAHCD-UWVGGRQHSA-N Leu-Pro-Gly Chemical compound CC(C)C[C@H](N)C(=O)N1CCC[C@H]1C(=O)NCC(O)=O UCBPDSYUVAAHCD-UWVGGRQHSA-N 0.000 description 2
- AAORVPFVUIHEAB-YUMQZZPRSA-N Lys-Asp-Gly Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(O)=O)C(=O)NCC(O)=O AAORVPFVUIHEAB-YUMQZZPRSA-N 0.000 description 2
- ITWQLSZTLBKWJM-YUMQZZPRSA-N Lys-Gly-Ala Chemical compound OC(=O)[C@H](C)NC(=O)CNC(=O)[C@@H](N)CCCCN ITWQLSZTLBKWJM-YUMQZZPRSA-N 0.000 description 2
- XNKDCYABMBBEKN-IUCAKERBSA-N Lys-Gly-Gln Chemical compound NCCCC[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CCC(N)=O XNKDCYABMBBEKN-IUCAKERBSA-N 0.000 description 2
- LCMWVZLBCUVDAZ-IUCAKERBSA-N Lys-Gly-Glu Chemical compound [NH3+]CCCC[C@H]([NH3+])C(=O)NCC(=O)N[C@H](C([O-])=O)CCC([O-])=O LCMWVZLBCUVDAZ-IUCAKERBSA-N 0.000 description 2
- RFQATBGBLDAKGI-VHSXEESVSA-N Lys-Gly-Pro Chemical compound C1C[C@@H](N(C1)C(=O)CNC(=O)[C@H](CCCCN)N)C(=O)O RFQATBGBLDAKGI-VHSXEESVSA-N 0.000 description 2
- PDIDTSZKKFEDMB-UWVGGRQHSA-N Lys-Pro-Gly Chemical compound [H]N[C@@H](CCCCN)C(=O)N1CCC[C@H]1C(=O)NCC(O)=O PDIDTSZKKFEDMB-UWVGGRQHSA-N 0.000 description 2
- 108091028043 Nucleic acid sequence Proteins 0.000 description 2
- DMKWYMWNEKIPFC-IUCAKERBSA-N Pro-Gly-Arg Chemical compound [H]N1CCC[C@H]1C(=O)NCC(=O)N[C@@H](CCCNC(N)=N)C(O)=O DMKWYMWNEKIPFC-IUCAKERBSA-N 0.000 description 2
- UIMCLYYSUCIUJM-UWVGGRQHSA-N Pro-Gly-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H]1CCCN1 UIMCLYYSUCIUJM-UWVGGRQHSA-N 0.000 description 2
- DXTOOBDIIAJZBJ-BQBZGAKWSA-N Pro-Gly-Ser Chemical compound [H]N1CCC[C@H]1C(=O)NCC(=O)N[C@@H](CO)C(O)=O DXTOOBDIIAJZBJ-BQBZGAKWSA-N 0.000 description 2
- AFXCXDQNRXTSBD-FJXKBIBVSA-N Pro-Gly-Thr Chemical compound [H]N1CCC[C@H]1C(=O)NCC(=O)N[C@@H]([C@@H](C)O)C(O)=O AFXCXDQNRXTSBD-FJXKBIBVSA-N 0.000 description 2
- ABSSTGUCBCDKMU-UWVGGRQHSA-N Pro-Lys-Gly Chemical compound NCCCC[C@@H](C(=O)NCC(O)=O)NC(=O)[C@@H]1CCCN1 ABSSTGUCBCDKMU-UWVGGRQHSA-N 0.000 description 2
- WIPAMEKBSHNFQE-IUCAKERBSA-N Pro-Met-Gly Chemical compound CSCC[C@@H](C(=O)NCC(=O)O)NC(=O)[C@@H]1CCCN1 WIPAMEKBSHNFQE-IUCAKERBSA-N 0.000 description 2
- BSXKBOUZDAZXHE-CIUDSAMLSA-N Ser-Pro-Glu Chemical compound [H]N[C@@H](CO)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCC(O)=O)C(O)=O BSXKBOUZDAZXHE-CIUDSAMLSA-N 0.000 description 2
- AQAMPXBRJJWPNI-JHEQGTHGSA-N Thr-Gly-Glu Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)NCC(=O)N[C@@H](CCC(O)=O)C(O)=O AQAMPXBRJJWPNI-JHEQGTHGSA-N 0.000 description 2
- MXDOAJQRJBMGMO-FJXKBIBVSA-N Thr-Pro-Gly Chemical compound C[C@@H](O)[C@H](N)C(=O)N1CCC[C@H]1C(=O)NCC(O)=O MXDOAJQRJBMGMO-FJXKBIBVSA-N 0.000 description 2
- 102000008579 Transposases Human genes 0.000 description 2
- 108010020764 Transposases Proteins 0.000 description 2
- 210000001015 abdomen Anatomy 0.000 description 2
- 210000004381 amniotic fluid Anatomy 0.000 description 2
- 238000010171 animal model Methods 0.000 description 2
- 108010077245 asparaginyl-proline Proteins 0.000 description 2
- 108010038633 aspartylglutamate Proteins 0.000 description 2
- 230000014461 bone development Effects 0.000 description 2
- 230000022159 cartilage development Effects 0.000 description 2
- 238000010276 construction Methods 0.000 description 2
- 235000018417 cysteine Nutrition 0.000 description 2
- XUJNEKJLAYXESH-UHFFFAOYSA-N cysteine Natural products SCC(N)C(O)=O XUJNEKJLAYXESH-UHFFFAOYSA-N 0.000 description 2
- 238000012350 deep sequencing Methods 0.000 description 2
- 238000013461 design Methods 0.000 description 2
- 238000000605 extraction Methods 0.000 description 2
- 210000001503 joint Anatomy 0.000 description 2
- 108010034529 leucyl-lysine Proteins 0.000 description 2
- 108010057821 leucylproline Proteins 0.000 description 2
- 108020004999 messenger RNA Proteins 0.000 description 2
- 108020004707 nucleic acids Proteins 0.000 description 2
- 102000039446 nucleic acids Human genes 0.000 description 2
- 230000002018 overexpression Effects 0.000 description 2
- 238000000746 purification Methods 0.000 description 2
- 238000012175 pyrosequencing Methods 0.000 description 2
- 108091008146 restriction endonucleases Proteins 0.000 description 2
- -1 smarccal 1 Proteins 0.000 description 2
- 208000024891 symptom Diseases 0.000 description 2
- 208000011580 syndromic disease Diseases 0.000 description 2
- OUYCCCASQSFEME-UHFFFAOYSA-N tyrosine Natural products OC(=O)C(N)CC1=CC=C(O)C=C1 OUYCCCASQSFEME-UHFFFAOYSA-N 0.000 description 2
- CEHZCZCQHUNAJF-AVGNSLFASA-N (2s)-1-[2-[[(2s)-1-[(2s)-2-amino-5-(diaminomethylideneamino)pentanoyl]pyrrolidine-2-carbonyl]amino]acetyl]pyrrolidine-2-carboxylic acid Chemical compound NC(N)=NCCC[C@H](N)C(=O)N1CCC[C@H]1C(=O)NCC(=O)N1[C@H](C(O)=O)CCC1 CEHZCZCQHUNAJF-AVGNSLFASA-N 0.000 description 1
- NNRFRJQMBSBXGO-CIUDSAMLSA-N (3s)-3-[[2-[[(2s)-2-amino-5-(diaminomethylideneamino)pentanoyl]amino]acetyl]amino]-4-[[(1s)-1-carboxy-2-hydroxyethyl]amino]-4-oxobutanoic acid Chemical compound NC(N)=NCCC[C@H](N)C(=O)NCC(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CO)C(O)=O NNRFRJQMBSBXGO-CIUDSAMLSA-N 0.000 description 1
- ITZMJCSORYKOSI-AJNGGQMLSA-N APGPR Enterostatin Chemical compound C[C@H](N)C(=O)N1CCC[C@H]1C(=O)NCC(=O)N1[C@H](C(=O)N[C@@H](CCCN=C(N)N)C(O)=O)CCC1 ITZMJCSORYKOSI-AJNGGQMLSA-N 0.000 description 1
- 229920000936 Agarose Polymers 0.000 description 1
- SBGXWWCLHIOABR-UHFFFAOYSA-N Ala Ala Gly Ala Chemical compound CC(N)C(=O)NC(C)C(=O)NCC(=O)NC(C)C(O)=O SBGXWWCLHIOABR-UHFFFAOYSA-N 0.000 description 1
- BUANFPRKJKJSRR-ACZMJKKPSA-N Ala-Ala-Gln Chemical compound C[C@H]([NH3+])C(=O)N[C@@H](C)C(=O)N[C@H](C([O-])=O)CCC(N)=O BUANFPRKJKJSRR-ACZMJKKPSA-N 0.000 description 1
- ZIBWKCRKNFYTPT-ZKWXMUAHSA-N Ala-Asn-Val Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](C(C)C)C(O)=O ZIBWKCRKNFYTPT-ZKWXMUAHSA-N 0.000 description 1
- IFTVANMRTIHKML-WDSKDSINSA-N Ala-Gln-Gly Chemical compound C[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)NCC(O)=O IFTVANMRTIHKML-WDSKDSINSA-N 0.000 description 1
- PAIHPOGPJVUFJY-WDSKDSINSA-N Ala-Glu-Gly Chemical compound C[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)NCC(O)=O PAIHPOGPJVUFJY-WDSKDSINSA-N 0.000 description 1
- WMYJZJRILUVVRG-WDSKDSINSA-N Ala-Gly-Gln Chemical compound C[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CCC(N)=O WMYJZJRILUVVRG-WDSKDSINSA-N 0.000 description 1
- VGPWRRFOPXVGOH-BYPYZUCNSA-N Ala-Gly-Gly Chemical compound C[C@H](N)C(=O)NCC(=O)NCC(O)=O VGPWRRFOPXVGOH-BYPYZUCNSA-N 0.000 description 1
- BLIMFWGRQKRCGT-YUMQZZPRSA-N Ala-Gly-Lys Chemical compound C[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CCCCN BLIMFWGRQKRCGT-YUMQZZPRSA-N 0.000 description 1
- QHASENCZLDHBGX-ONGXEEELSA-N Ala-Gly-Phe Chemical compound C[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 QHASENCZLDHBGX-ONGXEEELSA-N 0.000 description 1
- SUHLZMHFRALVSY-YUMQZZPRSA-N Ala-Lys-Gly Chemical compound NCCCC[C@H](NC(=O)[C@@H](N)C)C(=O)NCC(O)=O SUHLZMHFRALVSY-YUMQZZPRSA-N 0.000 description 1
- 108010011667 Ala-Phe-Ala Proteins 0.000 description 1
- VQAVBBCZFQAAED-FXQIFTODSA-N Ala-Pro-Asn Chemical compound C[C@@H](C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(=O)N)C(=O)O)N VQAVBBCZFQAAED-FXQIFTODSA-N 0.000 description 1
- BHTBAVZSZCQZPT-GUBZILKMSA-N Ala-Pro-Met Chemical compound CSCC[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)[C@H](C)N BHTBAVZSZCQZPT-GUBZILKMSA-N 0.000 description 1
- WNHNMKOFKCHKKD-BFHQHQDPSA-N Ala-Thr-Gly Chemical compound [H]N[C@@H](C)C(=O)N[C@@H]([C@@H](C)O)C(=O)NCC(O)=O WNHNMKOFKCHKKD-BFHQHQDPSA-N 0.000 description 1
- VHAQSYHSDKERBS-XPUUQOCRSA-N Ala-Val-Gly Chemical compound C[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)NCC(O)=O VHAQSYHSDKERBS-XPUUQOCRSA-N 0.000 description 1
- 102100034112 Alkyldihydroxyacetonephosphate synthase, peroxisomal Human genes 0.000 description 1
- 108700028369 Alleles Proteins 0.000 description 1
- OTCJMMRQBVDQRK-DCAQKATOSA-N Arg-Asp-Leu Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O OTCJMMRQBVDQRK-DCAQKATOSA-N 0.000 description 1
- RYRQZJVFDVWURI-SRVKXCTJSA-N Arg-Gln-His Chemical compound C1=C(NC=N1)C[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)N)NC(=O)[C@H](CCCN=C(N)N)N RYRQZJVFDVWURI-SRVKXCTJSA-N 0.000 description 1
- OGUPCHKBOKJFMA-SRVKXCTJSA-N Arg-Glu-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H](N)CCCN=C(N)N OGUPCHKBOKJFMA-SRVKXCTJSA-N 0.000 description 1
- PNIGSVZJNVUVJA-BQBZGAKWSA-N Arg-Gly-Asn Chemical compound NC(N)=NCCC[C@H](N)C(=O)NCC(=O)N[C@@H](CC(N)=O)C(O)=O PNIGSVZJNVUVJA-BQBZGAKWSA-N 0.000 description 1
- YNSGXDWWPCGGQS-YUMQZZPRSA-N Arg-Gly-Gln Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)NCC(=O)N[C@@H](CCC(N)=O)C(O)=O YNSGXDWWPCGGQS-YUMQZZPRSA-N 0.000 description 1
- RFXXUWGNVRJTNQ-QXEWZRGKSA-N Arg-Gly-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)CNC(=O)[C@H](CCCN=C(N)N)N RFXXUWGNVRJTNQ-QXEWZRGKSA-N 0.000 description 1
- PRLPSDIHSRITSF-UNQGMJICSA-N Arg-Phe-Thr Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)O)C(O)=O PRLPSDIHSRITSF-UNQGMJICSA-N 0.000 description 1
- 108010051330 Arg-Pro-Gly-Pro Proteins 0.000 description 1
- VRTWYUYCJGNFES-CIUDSAMLSA-N Arg-Ser-Gln Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CO)C(=O)N[C@@H](CCC(N)=O)C(O)=O VRTWYUYCJGNFES-CIUDSAMLSA-N 0.000 description 1
- HRCIIMCTUIAKQB-XGEHTFHBSA-N Arg-Thr-Cys Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)[C@H](CCCN=C(N)N)N)O HRCIIMCTUIAKQB-XGEHTFHBSA-N 0.000 description 1
- 102100023943 Arylsulfatase L Human genes 0.000 description 1
- QISZHYWZHJRDAO-CIUDSAMLSA-N Asn-Asp-Lys Chemical compound C(CCN)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](CC(=O)N)N QISZHYWZHJRDAO-CIUDSAMLSA-N 0.000 description 1
- VJTWLBMESLDOMK-WDSKDSINSA-N Asn-Gln-Gly Chemical compound NC(=O)C[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)NCC(O)=O VJTWLBMESLDOMK-WDSKDSINSA-N 0.000 description 1
- IICZCLFBILYRCU-WHFBIAKZSA-N Asn-Gly-Asp Chemical compound [H]N[C@@H](CC(N)=O)C(=O)NCC(=O)N[C@@H](CC(O)=O)C(O)=O IICZCLFBILYRCU-WHFBIAKZSA-N 0.000 description 1
- PBSQFBAJKPLRJY-BYULHYEWSA-N Asn-Gly-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)CNC(=O)[C@H](CC(=O)N)N PBSQFBAJKPLRJY-BYULHYEWSA-N 0.000 description 1
- FTCGGKNCJZOPNB-WHFBIAKZSA-N Asn-Gly-Ser Chemical compound NC(=O)C[C@H](N)C(=O)NCC(=O)N[C@@H](CO)C(O)=O FTCGGKNCJZOPNB-WHFBIAKZSA-N 0.000 description 1
- HMUKKNAMNSXDBB-CIUDSAMLSA-N Asn-Met-Glu Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CCC(O)=O)C(O)=O HMUKKNAMNSXDBB-CIUDSAMLSA-N 0.000 description 1
- IPAQILGYEQFCFO-NYVOZVTQSA-N Asn-Trp-Trp Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)N[C@@H](CC3=CNC4=CC=CC=C43)C(=O)O)NC(=O)[C@H](CC(=O)N)N IPAQILGYEQFCFO-NYVOZVTQSA-N 0.000 description 1
- XEDQMTWEYFBOIK-ACZMJKKPSA-N Asp-Ala-Glu Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](C)C(=O)N[C@@H](CCC(O)=O)C(O)=O XEDQMTWEYFBOIK-ACZMJKKPSA-N 0.000 description 1
- CXBOKJPLEYUPGB-FXQIFTODSA-N Asp-Ala-Met Chemical compound C[C@@H](C(=O)N[C@@H](CCSC)C(=O)O)NC(=O)[C@H](CC(=O)O)N CXBOKJPLEYUPGB-FXQIFTODSA-N 0.000 description 1
- UGKZHCBLMLSANF-CIUDSAMLSA-N Asp-Asn-Leu Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(C)C)C(O)=O UGKZHCBLMLSANF-CIUDSAMLSA-N 0.000 description 1
- TVVYVAUGRHNTGT-UGYAYLCHSA-N Asp-Asp-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@H](CC(O)=O)NC(=O)[C@@H](N)CC(O)=O TVVYVAUGRHNTGT-UGYAYLCHSA-N 0.000 description 1
- FTNVLGCFIJEMQT-CIUDSAMLSA-N Asp-Cys-Leu Chemical compound CC(C)C[C@@H](C(=O)O)NC(=O)[C@H](CS)NC(=O)[C@H](CC(=O)O)N FTNVLGCFIJEMQT-CIUDSAMLSA-N 0.000 description 1
- NYQHSUGFEWDWPD-ACZMJKKPSA-N Asp-Gln-Ala Chemical compound C[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)N)NC(=O)[C@H](CC(=O)O)N NYQHSUGFEWDWPD-ACZMJKKPSA-N 0.000 description 1
- VHQOCWWKXIOAQI-WDSKDSINSA-N Asp-Gln-Gly Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCC(N)=O)C(=O)NCC(O)=O VHQOCWWKXIOAQI-WDSKDSINSA-N 0.000 description 1
- DTNUIAJCPRMNBT-WHFBIAKZSA-N Asp-Gly-Ala Chemical compound [H]N[C@@H](CC(O)=O)C(=O)NCC(=O)N[C@@H](C)C(O)=O DTNUIAJCPRMNBT-WHFBIAKZSA-N 0.000 description 1
- OMMIEVATLAGRCK-BYPYZUCNSA-N Asp-Gly-Gly Chemical compound OC(=O)C[C@H](N)C(=O)NCC(=O)NCC(O)=O OMMIEVATLAGRCK-BYPYZUCNSA-N 0.000 description 1
- PSLSTUMPZILTAH-BYULHYEWSA-N Asp-Gly-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H](N)CC(O)=O PSLSTUMPZILTAH-BYULHYEWSA-N 0.000 description 1
- SEMWSADZTMJELF-BYULHYEWSA-N Asp-Ile-Gly Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)NCC(O)=O SEMWSADZTMJELF-BYULHYEWSA-N 0.000 description 1
- LBOVBQONZJRWPV-YUMQZZPRSA-N Asp-Lys-Gly Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCCCN)C(=O)NCC(O)=O LBOVBQONZJRWPV-YUMQZZPRSA-N 0.000 description 1
- JJQGZGOEDSSHTE-FOHZUACHSA-N Asp-Thr-Gly Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(=O)NCC(O)=O JJQGZGOEDSSHTE-FOHZUACHSA-N 0.000 description 1
- HTSSXFASOUSJQG-IHPCNDPISA-N Asp-Tyr-Trp Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(O)=O HTSSXFASOUSJQG-IHPCNDPISA-N 0.000 description 1
- XWKPSMRPIKKDDU-RCOVLWMOSA-N Asp-Val-Gly Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](C(C)C)C(=O)NCC(O)=O XWKPSMRPIKKDDU-RCOVLWMOSA-N 0.000 description 1
- GGBQDSHTXKQSLP-NHCYSSNCSA-N Asp-Val-Lys Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCCCN)C(=O)O)NC(=O)[C@H](CC(=O)O)N GGBQDSHTXKQSLP-NHCYSSNCSA-N 0.000 description 1
- GYNUXDMCDILYIQ-QRTARXTBSA-N Asp-Val-Trp Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC1=CNC2=CC=CC=C21)C(=O)O)NC(=O)[C@H](CC(=O)O)N GYNUXDMCDILYIQ-QRTARXTBSA-N 0.000 description 1
- 102100031403 Beta-1,3-N-acetylglucosaminyltransferase lunatic fringe Human genes 0.000 description 1
- 101150008656 COL1A1 gene Proteins 0.000 description 1
- 101150032176 COL21A1 gene Proteins 0.000 description 1
- 102100038768 Carbohydrate sulfotransferase 3 Human genes 0.000 description 1
- 101710176668 Cartilage oligomeric matrix protein Proteins 0.000 description 1
- 102100024940 Cathepsin K Human genes 0.000 description 1
- ZEOWTGPWHLSLOG-UHFFFAOYSA-N Cc1ccc(cc1-c1ccc2c(n[nH]c2c1)-c1cnn(c1)C1CC1)C(=O)Nc1cccc(c1)C(F)(F)F Chemical compound Cc1ccc(cc1-c1ccc2c(n[nH]c2c1)-c1cnn(c1)C1CC1)C(=O)Nc1cccc(c1)C(F)(F)F ZEOWTGPWHLSLOG-UHFFFAOYSA-N 0.000 description 1
- 206010068051 Chimerism Diseases 0.000 description 1
- 108091026890 Coding region Proteins 0.000 description 1
- 102100031043 Coiled-coil domain-containing protein 8 Human genes 0.000 description 1
- 102000008186 Collagen Human genes 0.000 description 1
- 108010035532 Collagen Proteins 0.000 description 1
- 108010088874 Cullin 1 Proteins 0.000 description 1
- LHLSSZYQFUNWRZ-NAKRPEOUSA-N Cys-Arg-Ile Chemical compound [H]N[C@@H](CS)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O LHLSSZYQFUNWRZ-NAKRPEOUSA-N 0.000 description 1
- LHMSYHSAAJOEBL-CIUDSAMLSA-N Cys-Lys-Asn Chemical compound [H]N[C@@H](CS)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(N)=O)C(O)=O LHMSYHSAAJOEBL-CIUDSAMLSA-N 0.000 description 1
- RESAHOSBQHMOKH-KKUMJFAQSA-N Cys-Phe-Leu Chemical compound CC(C)C[C@@H](C(=O)O)NC(=O)[C@H](CC1=CC=CC=C1)NC(=O)[C@H](CS)N RESAHOSBQHMOKH-KKUMJFAQSA-N 0.000 description 1
- SAEVTQWAYDPXMU-KATARQTJSA-N Cys-Thr-Leu Chemical compound [H]N[C@@H](CS)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(O)=O SAEVTQWAYDPXMU-KATARQTJSA-N 0.000 description 1
- MQQLYEHXSBJTRK-FXQIFTODSA-N Cys-Val-Cys Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)[C@H](CS)N MQQLYEHXSBJTRK-FXQIFTODSA-N 0.000 description 1
- 201000005171 Cystadenoma Diseases 0.000 description 1
- 238000001712 DNA sequencing Methods 0.000 description 1
- 102100040679 Dihydroxyacetone phosphate acyltransferase Human genes 0.000 description 1
- 206010013883 Dwarfism Diseases 0.000 description 1
- 102100031509 Fibrillin-1 Human genes 0.000 description 1
- 102100023593 Fibroblast growth factor receptor 1 Human genes 0.000 description 1
- 101710182386 Fibroblast growth factor receptor 1 Proteins 0.000 description 1
- 102100026559 Filamin-B Human genes 0.000 description 1
- 108700039691 Genetic Promoter Regions Proteins 0.000 description 1
- MQANCSUBSBJNLU-KKUMJFAQSA-N Gln-Arg-Tyr Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O MQANCSUBSBJNLU-KKUMJFAQSA-N 0.000 description 1
- WQWMZOIPXWSZNE-WDSKDSINSA-N Gln-Asp-Gly Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(O)=O)C(=O)NCC(O)=O WQWMZOIPXWSZNE-WDSKDSINSA-N 0.000 description 1
- VSXBYIJUAXPAAL-WDSKDSINSA-N Gln-Gly-Ala Chemical compound OC(=O)[C@H](C)NC(=O)CNC(=O)[C@@H](N)CCC(N)=O VSXBYIJUAXPAAL-WDSKDSINSA-N 0.000 description 1
- NSNUZSPSADIMJQ-WDSKDSINSA-N Gln-Gly-Asp Chemical compound NC(=O)CC[C@H](N)C(=O)NCC(=O)N[C@@H](CC(O)=O)C(O)=O NSNUZSPSADIMJQ-WDSKDSINSA-N 0.000 description 1
- QQAPDATZKKTBIY-YUMQZZPRSA-N Gln-Gly-Met Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)NCC(=O)N[C@@H](CCSC)C(O)=O QQAPDATZKKTBIY-YUMQZZPRSA-N 0.000 description 1
- VGTDBGYFVWOQTI-RYUDHWBXSA-N Gln-Gly-Phe Chemical compound NC(=O)CC[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 VGTDBGYFVWOQTI-RYUDHWBXSA-N 0.000 description 1
- HXOLDXKNWKLDMM-YVNDNENWSA-N Gln-Ile-Glu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)O)NC(=O)[C@H](CCC(=O)N)N HXOLDXKNWKLDMM-YVNDNENWSA-N 0.000 description 1
- RUFHOVYUYSNDNY-ACZMJKKPSA-N Glu-Ala-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](C)NC(=O)[C@@H](N)CCC(O)=O RUFHOVYUYSNDNY-ACZMJKKPSA-N 0.000 description 1
- UTKUTMJSWKKHEM-WDSKDSINSA-N Glu-Ala-Gly Chemical compound OC(=O)CNC(=O)[C@H](C)NC(=O)[C@@H](N)CCC(O)=O UTKUTMJSWKKHEM-WDSKDSINSA-N 0.000 description 1
- DSPQRJXOIXHOHK-WDSKDSINSA-N Glu-Asp-Gly Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)NCC(O)=O DSPQRJXOIXHOHK-WDSKDSINSA-N 0.000 description 1
- MXPBQDFWIMBACQ-ACZMJKKPSA-N Glu-Cys-Cys Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CS)C(=O)N[C@@H](CS)C(O)=O MXPBQDFWIMBACQ-ACZMJKKPSA-N 0.000 description 1
- SJPMNHCEWPTRBR-BQBZGAKWSA-N Glu-Glu-Gly Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)NCC(O)=O SJPMNHCEWPTRBR-BQBZGAKWSA-N 0.000 description 1
- AIGROOHQXCACHL-WDSKDSINSA-N Glu-Gly-Ala Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)NCC(=O)N[C@@H](C)C(O)=O AIGROOHQXCACHL-WDSKDSINSA-N 0.000 description 1
- OAGVHWYIBZMWLA-YFKPBYRVSA-N Glu-Gly-Gly Chemical compound OC(=O)CC[C@H](N)C(=O)NCC(=O)NCC(O)=O OAGVHWYIBZMWLA-YFKPBYRVSA-N 0.000 description 1
- RAUDKMVXNOWDLS-WDSKDSINSA-N Glu-Gly-Ser Chemical compound OC(=O)CC[C@H](N)C(=O)NCC(=O)N[C@@H](CO)C(O)=O RAUDKMVXNOWDLS-WDSKDSINSA-N 0.000 description 1
- OQXDUSZKISQQSS-GUBZILKMSA-N Glu-Lys-Ala Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C)C(O)=O OQXDUSZKISQQSS-GUBZILKMSA-N 0.000 description 1
- YRMZCZIRHYCNHX-RYUDHWBXSA-N Glu-Phe-Gly Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)NCC(O)=O YRMZCZIRHYCNHX-RYUDHWBXSA-N 0.000 description 1
- RGJKYNUINKGPJN-RWRJDSDZSA-N Glu-Thr-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H]([C@@H](C)O)NC(=O)[C@H](CCC(=O)O)N RGJKYNUINKGPJN-RWRJDSDZSA-N 0.000 description 1
- HQTDNEZTGZUWSY-XVKPBYJWSA-N Glu-Val-Gly Chemical compound CC(C)[C@H](NC(=O)[C@@H](N)CCC(O)=O)C(=O)NCC(O)=O HQTDNEZTGZUWSY-XVKPBYJWSA-N 0.000 description 1
- BRFJMRSRMOMIMU-WHFBIAKZSA-N Gly-Ala-Asn Chemical compound NCC(=O)N[C@@H](C)C(=O)N[C@@H](CC(N)=O)C(O)=O BRFJMRSRMOMIMU-WHFBIAKZSA-N 0.000 description 1
- RLFSBAPJTYKSLG-WHFBIAKZSA-N Gly-Ala-Asp Chemical compound NCC(=O)N[C@@H](C)C(=O)N[C@@H](CC(O)=O)C(O)=O RLFSBAPJTYKSLG-WHFBIAKZSA-N 0.000 description 1
- JXYMPBCYRKWJEE-BQBZGAKWSA-N Gly-Arg-Ala Chemical compound [H]NCC(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](C)C(O)=O JXYMPBCYRKWJEE-BQBZGAKWSA-N 0.000 description 1
- VXKCPBPQEKKERH-IUCAKERBSA-N Gly-Arg-Pro Chemical compound NC(N)=NCCC[C@H](NC(=O)CN)C(=O)N1CCC[C@H]1C(O)=O VXKCPBPQEKKERH-IUCAKERBSA-N 0.000 description 1
- GWCRIHNSVMOBEQ-BQBZGAKWSA-N Gly-Arg-Ser Chemical compound [H]NCC(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CO)C(O)=O GWCRIHNSVMOBEQ-BQBZGAKWSA-N 0.000 description 1
- DTPOVRRYXPJJAZ-FJXKBIBVSA-N Gly-Arg-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)CN)CCCN=C(N)N DTPOVRRYXPJJAZ-FJXKBIBVSA-N 0.000 description 1
- WKJKBELXHCTHIJ-WPRPVWTQSA-N Gly-Arg-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)CN)CCCN=C(N)N WKJKBELXHCTHIJ-WPRPVWTQSA-N 0.000 description 1
- DWUKOTKSTDWGAE-BQBZGAKWSA-N Gly-Asn-Arg Chemical compound NCC(=O)N[C@@H](CC(N)=O)C(=O)N[C@H](C(O)=O)CCCN=C(N)N DWUKOTKSTDWGAE-BQBZGAKWSA-N 0.000 description 1
- JVWPPCWUDRJGAE-YUMQZZPRSA-N Gly-Asn-Leu Chemical compound [H]NCC(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(C)C)C(O)=O JVWPPCWUDRJGAE-YUMQZZPRSA-N 0.000 description 1
- JVACNFOPSUPDTK-QWRGUYRKSA-N Gly-Asn-Phe Chemical compound NCC(=O)N[C@@H](CC(N)=O)C(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 JVACNFOPSUPDTK-QWRGUYRKSA-N 0.000 description 1
- XRTDOIOIBMAXCT-NKWVEPMBSA-N Gly-Asn-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC(=O)N)NC(=O)CN)C(=O)O XRTDOIOIBMAXCT-NKWVEPMBSA-N 0.000 description 1
- XCLCVBYNGXEVDU-WHFBIAKZSA-N Gly-Asn-Ser Chemical compound NCC(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CO)C(O)=O XCLCVBYNGXEVDU-WHFBIAKZSA-N 0.000 description 1
- IWAXHBCACVWNHT-BQBZGAKWSA-N Gly-Asp-Arg Chemical compound NCC(=O)N[C@@H](CC(O)=O)C(=O)N[C@H](C(O)=O)CCCN=C(N)N IWAXHBCACVWNHT-BQBZGAKWSA-N 0.000 description 1
- XEJTYSCIXKYSHR-WDSKDSINSA-N Gly-Asp-Gln Chemical compound C(CC(=O)N)[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)CN XEJTYSCIXKYSHR-WDSKDSINSA-N 0.000 description 1
- MHHUEAIBJZWDBH-YUMQZZPRSA-N Gly-Asp-Lys Chemical compound C(CCN)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)CN MHHUEAIBJZWDBH-YUMQZZPRSA-N 0.000 description 1
- VNBNZUAPOYGRDB-ZDLURKLDSA-N Gly-Cys-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CS)NC(=O)CN)O VNBNZUAPOYGRDB-ZDLURKLDSA-N 0.000 description 1
- JMQFHZWESBGPFC-WDSKDSINSA-N Gly-Gln-Asp Chemical compound NCC(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O JMQFHZWESBGPFC-WDSKDSINSA-N 0.000 description 1
- AQLHORCVPGXDJW-IUCAKERBSA-N Gly-Gln-Lys Chemical compound C(CCN)C[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)N)NC(=O)CN AQLHORCVPGXDJW-IUCAKERBSA-N 0.000 description 1
- BEQGFMIBZFNROK-JGVFFNPUSA-N Gly-Glu-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CCC(=O)O)NC(=O)CN)C(=O)O BEQGFMIBZFNROK-JGVFFNPUSA-N 0.000 description 1
- MBOAPAXLTUSMQI-JHEQGTHGSA-N Gly-Glu-Thr Chemical compound [H]NCC(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H]([C@@H](C)O)C(O)=O MBOAPAXLTUSMQI-JHEQGTHGSA-N 0.000 description 1
- KAJAOGBVWCYGHZ-JTQLQIEISA-N Gly-Gly-Phe Chemical compound [NH3+]CC(=O)NCC(=O)N[C@H](C([O-])=O)CC1=CC=CC=C1 KAJAOGBVWCYGHZ-JTQLQIEISA-N 0.000 description 1
- BUEFQXUHTUZXHR-LURJTMIESA-N Gly-Gly-Pro zwitterion Chemical compound NCC(=O)NCC(=O)N1CCC[C@H]1C(O)=O BUEFQXUHTUZXHR-LURJTMIESA-N 0.000 description 1
- UTYGDAHJBBDPBA-BYULHYEWSA-N Gly-Ile-Asp Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)O)NC(=O)CN UTYGDAHJBBDPBA-BYULHYEWSA-N 0.000 description 1
- PAWIVEIWWYGBAM-YUMQZZPRSA-N Gly-Leu-Ala Chemical compound NCC(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C)C(O)=O PAWIVEIWWYGBAM-YUMQZZPRSA-N 0.000 description 1
- LHYJCVCQPWRMKZ-WEDXCCLWSA-N Gly-Leu-Thr Chemical compound [H]NCC(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O LHYJCVCQPWRMKZ-WEDXCCLWSA-N 0.000 description 1
- OQQKUTVULYLCDG-ONGXEEELSA-N Gly-Lys-Val Chemical compound CC(C)[C@H](NC(=O)[C@H](CCCCN)NC(=O)CN)C(O)=O OQQKUTVULYLCDG-ONGXEEELSA-N 0.000 description 1
- GAFKBWKVXNERFA-QWRGUYRKSA-N Gly-Phe-Asp Chemical compound OC(=O)C[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)CN)CC1=CC=CC=C1 GAFKBWKVXNERFA-QWRGUYRKSA-N 0.000 description 1
- YYXJFBMCOUSYSF-RYUDHWBXSA-N Gly-Phe-Gln Chemical compound [H]NCC(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCC(N)=O)C(O)=O YYXJFBMCOUSYSF-RYUDHWBXSA-N 0.000 description 1
- YLEIWGJJBFBFHC-KBPBESRZSA-N Gly-Phe-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)CN)CC1=CC=CC=C1 YLEIWGJJBFBFHC-KBPBESRZSA-N 0.000 description 1
- HJARVELKOSZUEW-YUMQZZPRSA-N Gly-Pro-Gln Chemical compound [H]NCC(=O)N1CCC[C@H]1C(=O)N[C@@H](CCC(N)=O)C(O)=O HJARVELKOSZUEW-YUMQZZPRSA-N 0.000 description 1
- GLACUWHUYFBSPJ-FJXKBIBVSA-N Gly-Pro-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)CN GLACUWHUYFBSPJ-FJXKBIBVSA-N 0.000 description 1
- BMWFDYIYBAFROD-WPRPVWTQSA-N Gly-Pro-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)CN BMWFDYIYBAFROD-WPRPVWTQSA-N 0.000 description 1
- IRJWAYCXIYUHQE-WHFBIAKZSA-N Gly-Ser-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](CO)NC(=O)CN IRJWAYCXIYUHQE-WHFBIAKZSA-N 0.000 description 1
- YOBGUCWZPXJHTN-BQBZGAKWSA-N Gly-Ser-Arg Chemical compound NCC(=O)N[C@@H](CO)C(=O)N[C@H](C(O)=O)CCCN=C(N)N YOBGUCWZPXJHTN-BQBZGAKWSA-N 0.000 description 1
- IALQAMYQJBZNSK-WHFBIAKZSA-N Gly-Ser-Asn Chemical compound [H]NCC(=O)N[C@@H](CO)C(=O)N[C@@H](CC(N)=O)C(O)=O IALQAMYQJBZNSK-WHFBIAKZSA-N 0.000 description 1
- ABPRMMYHROQBLY-NKWVEPMBSA-N Gly-Ser-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CO)NC(=O)CN)C(=O)O ABPRMMYHROQBLY-NKWVEPMBSA-N 0.000 description 1
- MYXNLWDWWOTERK-BHNWBGBOSA-N Gly-Thr-Pro Chemical compound C[C@H]([C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)CN)O MYXNLWDWWOTERK-BHNWBGBOSA-N 0.000 description 1
- 239000004471 Glycine Substances 0.000 description 1
- 102100032610 Guanine nucleotide-binding protein G(s) subunit alpha isoforms XLas Human genes 0.000 description 1
- 101150092640 HES1 gene Proteins 0.000 description 1
- HYWZHNUGAYVEEW-KKUMJFAQSA-N His-Phe-Ser Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CO)C(=O)O)NC(=O)[C@H](CC2=CN=CN2)N HYWZHNUGAYVEEW-KKUMJFAQSA-N 0.000 description 1
- PYNPBMCLAKTHJL-SRVKXCTJSA-N His-Pro-Glu Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCC(O)=O)C(O)=O PYNPBMCLAKTHJL-SRVKXCTJSA-N 0.000 description 1
- MDOBWSFNSNPENN-PMVVWTBXSA-N His-Thr-Gly Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H]([C@@H](C)O)C(=O)NCC(O)=O MDOBWSFNSNPENN-PMVVWTBXSA-N 0.000 description 1
- 101000727994 Homo sapiens ADAMTS-like protein 2 Proteins 0.000 description 1
- 101000799143 Homo sapiens Alkyldihydroxyacetonephosphate synthase, peroxisomal Proteins 0.000 description 1
- 101000975827 Homo sapiens Arylsulfatase L Proteins 0.000 description 1
- 101001130526 Homo sapiens Beta-1,3-N-acetylglucosaminyltransferase lunatic fringe Proteins 0.000 description 1
- 101000882992 Homo sapiens Carbohydrate sulfotransferase 3 Proteins 0.000 description 1
- 101000761509 Homo sapiens Cathepsin K Proteins 0.000 description 1
- 101000777367 Homo sapiens Coiled-coil domain-containing protein 8 Proteins 0.000 description 1
- 101000875067 Homo sapiens Collagen alpha-2(I) chain Proteins 0.000 description 1
- 101001039272 Homo sapiens Dihydroxyacetone phosphate acyltransferase Proteins 0.000 description 1
- 101000846893 Homo sapiens Fibrillin-1 Proteins 0.000 description 1
- 101000913551 Homo sapiens Filamin-B Proteins 0.000 description 1
- 101001014590 Homo sapiens Guanine nucleotide-binding protein G(s) subunit alpha isoforms XLas Proteins 0.000 description 1
- 101001014594 Homo sapiens Guanine nucleotide-binding protein G(s) subunit alpha isoforms short Proteins 0.000 description 1
- 101001014610 Homo sapiens Neuroendocrine secretory protein 55 Proteins 0.000 description 1
- 101000992104 Homo sapiens Obscurin-like protein 1 Proteins 0.000 description 1
- 101001003584 Homo sapiens Prelamin-A/C Proteins 0.000 description 1
- 101000928339 Homo sapiens Progressive ankylosis protein homolog Proteins 0.000 description 1
- 101000797903 Homo sapiens Protein ALEX Proteins 0.000 description 1
- 101000711796 Homo sapiens Sclerostin Proteins 0.000 description 1
- 101000635938 Homo sapiens Transforming growth factor beta-1 proprotein Proteins 0.000 description 1
- PMMYEEVYMWASQN-DMTCNVIQSA-N Hydroxyproline Chemical compound O[C@H]1CN[C@H](C(O)=O)C1 PMMYEEVYMWASQN-DMTCNVIQSA-N 0.000 description 1
- BGZIJZJBXRVBGJ-SXTJYALSSA-N Ile-Asp-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)O)N BGZIJZJBXRVBGJ-SXTJYALSSA-N 0.000 description 1
- NPROWIBAWYMPAZ-GUDRVLHUSA-N Ile-Asp-Pro Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N1CCC[C@@H]1C(=O)O)N NPROWIBAWYMPAZ-GUDRVLHUSA-N 0.000 description 1
- LDRALPZEVHVXEK-KBIXCLLPSA-N Ile-Cys-Glu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CS)C(=O)N[C@@H](CCC(=O)O)C(=O)O)N LDRALPZEVHVXEK-KBIXCLLPSA-N 0.000 description 1
- CYHJCEKUMCNDFG-LAEOZQHASA-N Ile-Gln-Gly Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)NCC(=O)O)N CYHJCEKUMCNDFG-LAEOZQHASA-N 0.000 description 1
- JXMSHKFPDIUYGS-SIUGBPQLSA-N Ile-Glu-Tyr Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CCC(=O)O)C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)O)N JXMSHKFPDIUYGS-SIUGBPQLSA-N 0.000 description 1
- DGTOKVBDZXJHNZ-WZLNRYEVSA-N Ile-Thr-Tyr Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)O)N DGTOKVBDZXJHNZ-WZLNRYEVSA-N 0.000 description 1
- 108010065920 Insulin Lispro Proteins 0.000 description 1
- 108091092195 Intron Proteins 0.000 description 1
- WNGVUZWBXZKQES-YUMQZZPRSA-N Leu-Ala-Gly Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](C)C(=O)NCC(O)=O WNGVUZWBXZKQES-YUMQZZPRSA-N 0.000 description 1
- XBBKIIGCUMBKCO-JXUBOQSCSA-N Leu-Ala-Thr Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O XBBKIIGCUMBKCO-JXUBOQSCSA-N 0.000 description 1
- KSZCCRIGNVSHFH-UWVGGRQHSA-N Leu-Arg-Gly Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCCNC(N)=N)C(=O)NCC(O)=O KSZCCRIGNVSHFH-UWVGGRQHSA-N 0.000 description 1
- KUEVMUXNILMJTK-JYJNAYRXSA-N Leu-Gln-Tyr Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 KUEVMUXNILMJTK-JYJNAYRXSA-N 0.000 description 1
- POZULHZYLPGXMR-ONGXEEELSA-N Leu-Gly-Val Chemical compound CC(C)C[C@H](N)C(=O)NCC(=O)N[C@@H](C(C)C)C(O)=O POZULHZYLPGXMR-ONGXEEELSA-N 0.000 description 1
- RXGLHDWAZQECBI-SRVKXCTJSA-N Leu-Leu-Ser Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(O)=O RXGLHDWAZQECBI-SRVKXCTJSA-N 0.000 description 1
- RZXLZBIUTDQHJQ-SRVKXCTJSA-N Leu-Lys-Asp Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(O)=O)C(O)=O RZXLZBIUTDQHJQ-SRVKXCTJSA-N 0.000 description 1
- YUTNOGOMBNYPFH-XUXIUFHCSA-N Leu-Pro-Ile Chemical compound [H]N[C@@H](CC(C)C)C(=O)N1CCC[C@H]1C(=O)N[C@@H]([C@@H](C)CC)C(O)=O YUTNOGOMBNYPFH-XUXIUFHCSA-N 0.000 description 1
- SBANPBVRHYIMRR-UHFFFAOYSA-N Leu-Ser-Pro Natural products CC(C)CC(N)C(=O)NC(CO)C(=O)N1CCCC1C(O)=O SBANPBVRHYIMRR-UHFFFAOYSA-N 0.000 description 1
- XFIHDSBIPWEYJJ-YUMQZZPRSA-N Lys-Ala-Gly Chemical compound OC(=O)CNC(=O)[C@H](C)NC(=O)[C@@H](N)CCCCN XFIHDSBIPWEYJJ-YUMQZZPRSA-N 0.000 description 1
- GAOJCVKPIGHTGO-UWVGGRQHSA-N Lys-Arg-Gly Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCCNC(N)=N)C(=O)NCC(O)=O GAOJCVKPIGHTGO-UWVGGRQHSA-N 0.000 description 1
- NCTDKZKNBDZDOL-GARJFASQSA-N Lys-Asn-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC(=O)N)NC(=O)[C@H](CCCCN)N)C(=O)O NCTDKZKNBDZDOL-GARJFASQSA-N 0.000 description 1
- WGCKDDHUFPQSMZ-ZPFDUUQYSA-N Lys-Asp-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@H](CC(O)=O)NC(=O)[C@@H](N)CCCCN WGCKDDHUFPQSMZ-ZPFDUUQYSA-N 0.000 description 1
- VSRXPEHZMHSFKU-IUCAKERBSA-N Lys-Gln-Gly Chemical compound NCCCC[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)NCC(O)=O VSRXPEHZMHSFKU-IUCAKERBSA-N 0.000 description 1
- UETQMSASAVBGJY-QWRGUYRKSA-N Lys-Gly-His Chemical compound NCCCC[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CC1=CNC=N1 UETQMSASAVBGJY-QWRGUYRKSA-N 0.000 description 1
- GQFDWEDHOQRNLC-QWRGUYRKSA-N Lys-Gly-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H](N)CCCCN GQFDWEDHOQRNLC-QWRGUYRKSA-N 0.000 description 1
- WOEDRPCHKPSFDT-MXAVVETBSA-N Lys-His-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CC1=CN=CN1)NC(=O)[C@H](CCCCN)N WOEDRPCHKPSFDT-MXAVVETBSA-N 0.000 description 1
- PINHPJWGVBKQII-SRVKXCTJSA-N Lys-Leu-Cys Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)[C@H](CCCCN)N PINHPJWGVBKQII-SRVKXCTJSA-N 0.000 description 1
- VHTOGMKQXXJOHG-RHYQMDGZSA-N Lys-Thr-Val Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C(C)C)C(O)=O VHTOGMKQXXJOHG-RHYQMDGZSA-N 0.000 description 1
- CFOLERIRBUAYAD-HOCLYGCPSA-N Lys-Trp-Gly Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)NCC(O)=O CFOLERIRBUAYAD-HOCLYGCPSA-N 0.000 description 1
- 208000024556 Mendelian disease Diseases 0.000 description 1
- GAELMDJMQDUDLJ-BQBZGAKWSA-N Met-Ala-Gly Chemical compound CSCC[C@H](N)C(=O)N[C@@H](C)C(=O)NCC(O)=O GAELMDJMQDUDLJ-BQBZGAKWSA-N 0.000 description 1
- DLAFCQWUMFMZSN-GUBZILKMSA-N Met-Arg-Ala Chemical compound CSCC[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](C)C(O)=O)CCCN=C(N)N DLAFCQWUMFMZSN-GUBZILKMSA-N 0.000 description 1
- UOENBSHXYCHSAU-YUMQZZPRSA-N Met-Gln-Gly Chemical compound [H]N[C@@H](CCSC)C(=O)N[C@@H](CCC(N)=O)C(=O)NCC(O)=O UOENBSHXYCHSAU-YUMQZZPRSA-N 0.000 description 1
- LQMHZERGCQJKAH-STQMWFEESA-N Met-Gly-Phe Chemical compound CSCC[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CC1=CC=CC=C1 LQMHZERGCQJKAH-STQMWFEESA-N 0.000 description 1
- CIDICGYKRUTYLE-FXQIFTODSA-N Met-Ser-Ala Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@@H](C)C(O)=O CIDICGYKRUTYLE-FXQIFTODSA-N 0.000 description 1
- 101100284799 Mus musculus Hesx1 gene Proteins 0.000 description 1
- 101100310657 Mus musculus Sox1 gene Proteins 0.000 description 1
- SITLTJHOQZFJGG-UHFFFAOYSA-N N-L-alpha-glutamyl-L-valine Natural products CC(C)C(C(O)=O)NC(=O)C(N)CCC(O)=O SITLTJHOQZFJGG-UHFFFAOYSA-N 0.000 description 1
- 238000005481 NMR spectroscopy Methods 0.000 description 1
- 206010028980 Neoplasm Diseases 0.000 description 1
- 208000036364 Normal newborn Diseases 0.000 description 1
- 102100031914 Obscurin-like protein 1 Human genes 0.000 description 1
- 206010030113 Oedema Diseases 0.000 description 1
- 208000004286 Osteochondrodysplasias Diseases 0.000 description 1
- 238000012408 PCR amplification Methods 0.000 description 1
- FPTXMUIBLMGTQH-ONGXEEELSA-N Phe-Ala-Gly Chemical compound OC(=O)CNC(=O)[C@H](C)NC(=O)[C@@H](N)CC1=CC=CC=C1 FPTXMUIBLMGTQH-ONGXEEELSA-N 0.000 description 1
- XMQSOOJRRVEHRO-ULQDDVLXSA-N Phe-Leu-Arg Chemical compound NC(N)=NCCC[C@@H](C(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H](N)CC1=CC=CC=C1 XMQSOOJRRVEHRO-ULQDDVLXSA-N 0.000 description 1
- MMJJFXWMCMJMQA-STQMWFEESA-N Phe-Pro-Gly Chemical compound C([C@H](N)C(=O)N1[C@@H](CCC1)C(=O)NCC(O)=O)C1=CC=CC=C1 MMJJFXWMCMJMQA-STQMWFEESA-N 0.000 description 1
- 102100026531 Prelamin-A/C Human genes 0.000 description 1
- YFNOUBWUIIJQHF-LPEHRKFASA-N Pro-Asp-Pro Chemical compound C1C[C@H](NC1)C(=O)N[C@@H](CC(=O)O)C(=O)N2CCC[C@@H]2C(=O)O YFNOUBWUIIJQHF-LPEHRKFASA-N 0.000 description 1
- VDGTVWFMRXVQCT-GUBZILKMSA-N Pro-Glu-Gln Chemical compound NC(=O)CC[C@@H](C(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@@H]1CCCN1 VDGTVWFMRXVQCT-GUBZILKMSA-N 0.000 description 1
- NMELOOXSGDRBRU-YUMQZZPRSA-N Pro-Glu-Gly Chemical compound OC(=O)CNC(=O)[C@H](CCC(=O)O)NC(=O)[C@@H]1CCCN1 NMELOOXSGDRBRU-YUMQZZPRSA-N 0.000 description 1
- LGSANCBHSMDFDY-GARJFASQSA-N Pro-Glu-Pro Chemical compound C1C[C@H](NC1)C(=O)N[C@@H](CCC(=O)O)C(=O)N2CCC[C@@H]2C(=O)O LGSANCBHSMDFDY-GARJFASQSA-N 0.000 description 1
- ULIWFCCJIOEHMU-BQBZGAKWSA-N Pro-Gly-Asp Chemical compound OC(=O)C[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H]1CCCN1 ULIWFCCJIOEHMU-BQBZGAKWSA-N 0.000 description 1
- JMVQDLDPDBXAAX-YUMQZZPRSA-N Pro-Gly-Gln Chemical compound NC(=O)CC[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H]1CCCN1 JMVQDLDPDBXAAX-YUMQZZPRSA-N 0.000 description 1
- HAAQQNHQZBOWFO-LURJTMIESA-N Pro-Gly-Gly Chemical compound OC(=O)CNC(=O)CNC(=O)[C@@H]1CCCN1 HAAQQNHQZBOWFO-LURJTMIESA-N 0.000 description 1
- FEPSEIDIPBMIOS-QXEWZRGKSA-N Pro-Gly-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H]1CCCN1 FEPSEIDIPBMIOS-QXEWZRGKSA-N 0.000 description 1
- TYMBHHITTMGGPI-NAKRPEOUSA-N Pro-Ile-Cys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)[C@@H]1CCCN1 TYMBHHITTMGGPI-NAKRPEOUSA-N 0.000 description 1
- DWGFLKQSGRUQTI-IHRRRGAJSA-N Pro-Lys-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@@H]1CCCN1 DWGFLKQSGRUQTI-IHRRRGAJSA-N 0.000 description 1
- AJBQTGZIZQXBLT-STQMWFEESA-N Pro-Phe-Gly Chemical compound C([C@@H](C(=O)NCC(=O)O)NC(=O)[C@H]1NCCC1)C1=CC=CC=C1 AJBQTGZIZQXBLT-STQMWFEESA-N 0.000 description 1
- QUBVFEANYYWBTM-VEVYYDQMSA-N Pro-Thr-Asp Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(O)=O)C(O)=O QUBVFEANYYWBTM-VEVYYDQMSA-N 0.000 description 1
- DCHQYSOGURGJST-FJXKBIBVSA-N Pro-Thr-Gly Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H]([C@@H](C)O)C(=O)NCC(O)=O DCHQYSOGURGJST-FJXKBIBVSA-N 0.000 description 1
- UCTIUWKCVNGEFH-OBJOEFQTSA-N Pro-Val-Gly-Pro Chemical compound N([C@@H](C(C)C)C(=O)NCC(=O)N1[C@@H](CCC1)C(O)=O)C(=O)[C@@H]1CCCN1 UCTIUWKCVNGEFH-OBJOEFQTSA-N 0.000 description 1
- 102100036812 Progressive ankylosis protein homolog Human genes 0.000 description 1
- ONIBWKKTOPOVIA-UHFFFAOYSA-N Proline Natural products OC(=O)C1CCCN1 ONIBWKKTOPOVIA-UHFFFAOYSA-N 0.000 description 1
- 238000002123 RNA extraction Methods 0.000 description 1
- 238000011529 RT qPCR Methods 0.000 description 1
- 108091006178 SLC26 Proteins 0.000 description 1
- 108091006957 SLC35D1 Proteins 0.000 description 1
- 102100034201 Sclerostin Human genes 0.000 description 1
- BTKUIVBNGBFTTP-WHFBIAKZSA-N Ser-Ala-Gly Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](C)C(=O)NCC(O)=O BTKUIVBNGBFTTP-WHFBIAKZSA-N 0.000 description 1
- SWIQQMYVHIXPEK-FXQIFTODSA-N Ser-Cys-Val Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CS)C(=O)N[C@@H](C(C)C)C(O)=O SWIQQMYVHIXPEK-FXQIFTODSA-N 0.000 description 1
- CDVFZMOFNJPUDD-ACZMJKKPSA-N Ser-Gln-Asn Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O CDVFZMOFNJPUDD-ACZMJKKPSA-N 0.000 description 1
- BPMRXBZYPGYPJN-WHFBIAKZSA-N Ser-Gly-Asn Chemical compound [H]N[C@@H](CO)C(=O)NCC(=O)N[C@@H](CC(N)=O)C(O)=O BPMRXBZYPGYPJN-WHFBIAKZSA-N 0.000 description 1
- SNVIOQXAHVORQM-WDSKDSINSA-N Ser-Gly-Gln Chemical compound [H]N[C@@H](CO)C(=O)NCC(=O)N[C@@H](CCC(N)=O)C(O)=O SNVIOQXAHVORQM-WDSKDSINSA-N 0.000 description 1
- WSTIOCFMWXNOCX-YUMQZZPRSA-N Ser-Gly-Lys Chemical compound C(CCN)C[C@@H](C(=O)O)NC(=O)CNC(=O)[C@H](CO)N WSTIOCFMWXNOCX-YUMQZZPRSA-N 0.000 description 1
- KDGARKCAKHBEDB-NKWVEPMBSA-N Ser-Gly-Pro Chemical compound C1C[C@@H](N(C1)C(=O)CNC(=O)[C@H](CO)N)C(=O)O KDGARKCAKHBEDB-NKWVEPMBSA-N 0.000 description 1
- UIGMAMGZOJVTDN-WHFBIAKZSA-N Ser-Gly-Ser Chemical compound OC[C@H](N)C(=O)NCC(=O)N[C@@H](CO)C(O)=O UIGMAMGZOJVTDN-WHFBIAKZSA-N 0.000 description 1
- SFTZTYBXIXLRGQ-JBDRJPRFSA-N Ser-Ile-Ala Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](C)C(O)=O SFTZTYBXIXLRGQ-JBDRJPRFSA-N 0.000 description 1
- YIUWWXVTYLANCJ-NAKRPEOUSA-N Ser-Ile-Arg Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O YIUWWXVTYLANCJ-NAKRPEOUSA-N 0.000 description 1
- KCNSGAMPBPYUAI-CIUDSAMLSA-N Ser-Leu-Asn Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(O)=O KCNSGAMPBPYUAI-CIUDSAMLSA-N 0.000 description 1
- GVMUJUPXFQFBBZ-GUBZILKMSA-N Ser-Lys-Glu Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCC(O)=O)C(O)=O GVMUJUPXFQFBBZ-GUBZILKMSA-N 0.000 description 1
- OZPDGESCTGGNAD-CIUDSAMLSA-N Ser-Ser-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@H](CO)NC(=O)[C@@H](N)CO OZPDGESCTGGNAD-CIUDSAMLSA-N 0.000 description 1
- 208000020221 Short stature Diseases 0.000 description 1
- 238000002105 Southern blotting Methods 0.000 description 1
- 208000000875 Spinal Curvatures Diseases 0.000 description 1
- 102000018509 Sulfate Transporters Human genes 0.000 description 1
- 108010006785 Taq Polymerase Proteins 0.000 description 1
- 101710168651 Thioredoxin 1 Proteins 0.000 description 1
- DFTCYYILCSQGIZ-GCJQMDKQSA-N Thr-Ala-Asn Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C)C(=O)N[C@@H](CC(N)=O)C(O)=O DFTCYYILCSQGIZ-GCJQMDKQSA-N 0.000 description 1
- JEDIEMIJYSRUBB-FOHZUACHSA-N Thr-Asp-Gly Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)NCC(O)=O JEDIEMIJYSRUBB-FOHZUACHSA-N 0.000 description 1
- DSLHSTIUAPKERR-XGEHTFHBSA-N Thr-Cys-Val Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CS)C(=O)N[C@@H](C(C)C)C(O)=O DSLHSTIUAPKERR-XGEHTFHBSA-N 0.000 description 1
- SHOMROOOQBDGRL-JHEQGTHGSA-N Thr-Glu-Gly Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(O)=O)C(=O)NCC(O)=O SHOMROOOQBDGRL-JHEQGTHGSA-N 0.000 description 1
- MPUMPERGHHJGRP-WEDXCCLWSA-N Thr-Gly-Lys Chemical compound C[C@H]([C@@H](C(=O)NCC(=O)N[C@@H](CCCCN)C(=O)O)N)O MPUMPERGHHJGRP-WEDXCCLWSA-N 0.000 description 1
- MECLEFZMPPOEAC-VOAKCMCISA-N Thr-Leu-Lys Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(=O)O)N)O MECLEFZMPPOEAC-VOAKCMCISA-N 0.000 description 1
- PRTHQBSMXILLPC-XGEHTFHBSA-N Thr-Ser-Arg Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O PRTHQBSMXILLPC-XGEHTFHBSA-N 0.000 description 1
- BKVICMPZWRNWOC-RHYQMDGZSA-N Thr-Val-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](C(C)C)NC(=O)[C@@H](N)[C@@H](C)O BKVICMPZWRNWOC-RHYQMDGZSA-N 0.000 description 1
- 102100030742 Transforming growth factor beta-1 proprotein Human genes 0.000 description 1
- 108700019146 Transgenes Proteins 0.000 description 1
- ULHASJWZGUEUNN-XIRDDKMYSA-N Trp-Lys-Ser Chemical compound [H]N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CO)C(O)=O ULHASJWZGUEUNN-XIRDDKMYSA-N 0.000 description 1
- ACGIVBXINJFALS-HKUYNNGSSA-N Trp-Phe-Gly Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)NCC(=O)O)NC(=O)[C@H](CC2=CNC3=CC=CC=C32)N ACGIVBXINJFALS-HKUYNNGSSA-N 0.000 description 1
- AKLNEFNQWLHIGY-QWRGUYRKSA-N Tyr-Gly-Asp Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)NCC(=O)N[C@@H](CC(=O)O)C(=O)O)N)O AKLNEFNQWLHIGY-QWRGUYRKSA-N 0.000 description 1
- BSCBBPKDVOZICB-KKUMJFAQSA-N Tyr-Leu-Asp Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(O)=O)C(O)=O BSCBBPKDVOZICB-KKUMJFAQSA-N 0.000 description 1
- PYJKETPLFITNKS-IHRRRGAJSA-N Tyr-Pro-Asn Chemical compound N[C@@H](Cc1ccc(O)cc1)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(N)=O)C(O)=O PYJKETPLFITNKS-IHRRRGAJSA-N 0.000 description 1
- XUIOBCQESNDTDE-FQPOAREZSA-N Tyr-Thr-Ala Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](C)C(=O)O)NC(=O)[C@H](CC1=CC=C(C=C1)O)N)O XUIOBCQESNDTDE-FQPOAREZSA-N 0.000 description 1
- 102100032284 UDP-glucuronic acid/UDP-N-acetylgalactosamine transporter Human genes 0.000 description 1
- ISERLACIZUGCDX-ZKWXMUAHSA-N Val-Asp-Ala Chemical compound C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](C(C)C)N ISERLACIZUGCDX-ZKWXMUAHSA-N 0.000 description 1
- XLDYBRXERHITNH-QSFUFRPTSA-N Val-Asp-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@H](CC(O)=O)NC(=O)[C@@H](N)C(C)C XLDYBRXERHITNH-QSFUFRPTSA-N 0.000 description 1
- PGBJAZDAEWPDAA-NHCYSSNCSA-N Val-Gln-Met Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)N[C@@H](CCSC)C(=O)O)N PGBJAZDAEWPDAA-NHCYSSNCSA-N 0.000 description 1
- URIRWLJVWHYLET-ONGXEEELSA-N Val-Gly-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H](N)C(C)C URIRWLJVWHYLET-ONGXEEELSA-N 0.000 description 1
- YQMILNREHKTFBS-IHRRRGAJSA-N Val-Phe-Cys Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CS)C(=O)O)N YQMILNREHKTFBS-IHRRRGAJSA-N 0.000 description 1
- SJRUJQFQVLMZFW-WPRPVWTQSA-N Val-Pro-Gly Chemical compound CC(C)[C@H](N)C(=O)N1CCC[C@H]1C(=O)NCC(O)=O SJRUJQFQVLMZFW-WPRPVWTQSA-N 0.000 description 1
- 101000928515 Xenopus laevis Homeobox protein DLL-1 Proteins 0.000 description 1
- 108010024078 alanyl-glycyl-serine Proteins 0.000 description 1
- 108010044940 alanylglutamine Proteins 0.000 description 1
- 108010070783 alanyltyrosine Proteins 0.000 description 1
- 108010029483 alpha 1 Chain Collagen Type I Proteins 0.000 description 1
- 108010050025 alpha-glutamyltryptophan Proteins 0.000 description 1
- 230000003321 amplification Effects 0.000 description 1
- 239000012491 analyte Substances 0.000 description 1
- 108010013835 arginine glutamate Proteins 0.000 description 1
- 108010072041 arginyl-glycyl-aspartic acid Proteins 0.000 description 1
- 108010089975 arginyl-glycyl-aspartyl-serine Proteins 0.000 description 1
- 108010069926 arginyl-glycyl-serine Proteins 0.000 description 1
- 108010062796 arginyllysine Proteins 0.000 description 1
- 108010010430 asparagine-proline-alanine Proteins 0.000 description 1
- 108010040443 aspartyl-aspartic acid Proteins 0.000 description 1
- 108010093581 aspartyl-proline Proteins 0.000 description 1
- 208000021018 autosomal dominant inheritance Diseases 0.000 description 1
- 238000005452 bending Methods 0.000 description 1
- 238000003766 bioinformatics method Methods 0.000 description 1
- 208000022696 bone development disease Diseases 0.000 description 1
- 210000000038 chest Anatomy 0.000 description 1
- 238000003759 clinical diagnosis Methods 0.000 description 1
- 229920001436 collagen Polymers 0.000 description 1
- 108010093679 collagen type II (108-116) Proteins 0.000 description 1
- 230000000295 complement effect Effects 0.000 description 1
- 239000002299 complementary DNA Substances 0.000 description 1
Classifications
-
- C—CHEMISTRY; METALLURGY
- C07—ORGANIC CHEMISTRY
- C07K—PEPTIDES
- C07K14/00—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
- C07K14/435—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from animals; from humans
- C07K14/78—Connective tissue peptides, e.g. collagen, elastin, laminin, fibronectin, vitronectin or cold insoluble globulin [CIG]
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q1/00—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
- C12Q1/68—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
- C12Q1/6869—Methods for sequencing
- C12Q1/6874—Methods for sequencing involving nucleic acid arrays, e.g. sequencing by hybridisation
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q1/00—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
- C12Q1/68—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
- C12Q1/6876—Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
- C12Q1/6883—Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q2600/00—Oligonucleotides characterized by their use
- C12Q2600/156—Polymorphic or mutational markers
Abstract
The invention discloses a pathogenic mutation of osteogenesis imperfecta and a detection reagent thereof. A novel mutant COL2A1 gene carries the single point mutation c.3944G>A (chr12:48368588). The mutation is pathogenic in the heterozygous state: it can change the structure of type II collagen in articular cartilage and cause a type II collagen disease. A kit for detecting osteogenesis imperfecta comprises a reagent for detecting nucleotide position 3944 of the CDS of the COL2A1 gene, or a reagent for detecting amino acid position 1315 of the COL2A1 protein. The invention identifies a pathogenic mutation of osteogenesis imperfecta (c.3944G>A in the COL2A1 gene), and the disease can be diagnosed by detecting this mutation.
Description
Technical Field
The invention belongs to the fields of biomedicine and genetics, and relates to a mutation of the pathogenic gene COL2A1 in a skeletal dysplasia disease and a detection reagent thereof.
Background
Skeletal dysplasia, also known as dwarfism, is a general term for a group of short-stature diseases caused by abnormal bone or cartilage. Although most patients have normal intelligence, these diseases affect the normal development of bone and cartilage tissue, sometimes cause bone deformation, and may even affect other body systems. More than 200 kinds of skeletal dysplasia have been reported, and many congenital skeletal dysplasias are caused by gene mutations. The condition is generally diagnosed and evaluated by examinations such as X-ray and magnetic resonance imaging, usually only after symptoms appear. Because such genetic diseases cannot be cured, they place a heavy burden on society and families, and the only available measure is early diagnosis and informed choice. Searching for and verifying the pathogenic sites of these diseases, establishing a gene diagnosis method for skeletal dysplasia, and carrying out prenatal gene diagnosis can therefore avoid the birth of affected children, reduce the burden on society and on patients' families, support healthy births, and improve the quality of the population in China.
More than ten years ago, genetic diseases were generally diagnosed by physical examination, biochemical and mass spectrometric tests, then auxiliary examinations such as MRI and CT, and finally targeted gene testing, so prenatal diagnosis could not be achieved. The advent of high-throughput sequencing (NGS) provides a good solution to these problems. NGS is fast, high-throughput and highly automated: from library preparation and on-machine sequencing to data analysis, a DNA sample needs only 8-9 hours for all disease-related genes to be tested at once, which greatly improves the diagnostic efficiency for genetic diseases. In particular, the emergence of whole-exome sequencing has greatly increased the detection rate of pathogenic mutations. The technique can examine the exonic regions of about 20,000 genes in the human genome at one time, with coverage above 95% for 18105 genes; it covers 4000 genes with established pathogenic relationships in the OMIM database, corresponding to more than 5000 diseases, and detects 68 microdeletion/microduplication syndromes. It is particularly suitable when ultrasound during pregnancy indicates a skeletal abnormality but there is no family history (a new variant) and the clinical phenotype obtainable in the fetal period is limited: by sequencing the whole exomes of the fetus and its parents, pathogenic mutation sites can be searched quickly, accurately and broadly (across all exons).
The emergence of NGS has greatly changed clinical diagnostic strategies for monogenic diseases, and different detection schemes have emerged for three kinds of clinical need: (1) For a clearly defined single disease, Sanger sequencing, qPCR or single-disease panel detection is used, on a first-generation sequencing, MLPA or NGS platform. A trait or disease gene is amplified, and point mutations and deletions/duplications are detected; new mutations can be found, but new genes cannot. (2) For subjects with a clear single clinical phenotype, a family history or other obvious genetic factors, and higher demands on turnaround time and efficiency, a disease-combination panel or full panel is used on an NGS platform. Related gene combinations are detected; the results have clear clinical significance and are easy to interpret, the turnaround is short and the cost-effectiveness is high; new mutations can be found, but new genes cannot. (3) For difficult cases with complex clinical presentation or unclear disease direction, for negative panel results, or for clinical research based on multiple disease families, NGS-based whole-exome sequencing is used. The whole exome is captured and sequenced, point mutations and deletions/duplications are found, 75-85% of pathogenic mutations can be detected, and new genes can be found, but variants of unknown clinical significance may also be found and are difficult to interpret. NGS-based whole-genome sequencing can also be used: the whole human genome is sequenced; point mutations, deletions/duplications, structural variations and dynamic mutations can be found; mitochondrial variants can be detected at the same time; and new genes can be found, but variants of unknown clinical significance may likewise be found and are difficult to interpret.
The COL2A1 gene is located at 12q13.11-q13.2 on the long arm of chromosome 12. The gene is 31.538 kb long, has 54 exons and 53 introns, and encodes 1487 amino acids. COL2A1 encodes the pro-α1(II) chain of type II collagen, which adds structure and strength to the connective tissue supporting the muscles, joints, organs and skin of the body. Type II collagen is found mainly in cartilage, most of which is later converted into bone, and it takes part in the regulation of intramembranous and endochondral ossification. Type II collagen is a homotrimer in which three α polypeptide chains fold into a rod-like triple helix. The collagenous region consists of amino acid sequences repeated in a Gly-X-Y pattern (Gly is glycine, X is typically proline and Y is typically hydroxyproline). Mutations have been reported in many exons of COL2A1, more than one hundred so far. COL2A1-related diseases include platyspondylic lethal skeletal dysplasia (Torrance type), achondrogenesis type II, spondyloepiphyseal dysplasia congenita, Kniest dysplasia, spondyloepimetaphyseal dysplasia (Strudwick type) and others (https:// mistor. OMIM. org/entry/120140), and all COL2A1-related diseases listed in OMIM show autosomal dominant inheritance. However, the COL2A1 c.3944G>A (p.Cys1315Tyr) mutation, detected with NGS technology under the strategies above, has never been reported or demonstrated to cause osteogenesis imperfecta (OI).
Disclosure of Invention
The invention aims to address the above shortcomings by providing a mutant COL2A1 gene that causes skeletal dysplasia and a detection reagent thereof.
Another object of the present invention is to provide the use of this pathogenic mutation.
The purpose of the invention can be realized by the following technical scheme:
A mutant COL2A1 gene for detecting osteogenesis imperfecta, carrying the heterozygous or homozygous mutation c.3944G>A. The wild-type COL2A1 gene has the NCBI accession number NM_001844.4; in the mutant, the nucleotide at position 3944 of the CDS is changed from G to A, and the rest of the sequence is identical to the wild type. The CDS sequence of the wild-type COL2A1 gene is shown in SEQ ID NO.1.
A mutant COL2A1 protein. The wild-type COL2A1 protein has the NCBI accession number NP_001835.3; in the mutant, amino acid 1315 of the wild-type protein is changed from cysteine to tyrosine, and the rest of the sequence is identical to the wild-type protein. The amino acid sequence of the wild-type COL2A1 protein is shown in SEQ ID NO.2.
Use of a reagent for detecting the mutant COL2A1 gene or the mutant COL2A1 protein in preparing a detection reagent or detection equipment for osteogenesis imperfecta.
The detection reagent is preferably selected from one or more of primers or primer pairs, probes, antibodies, nucleic acid chips, high-throughput sequencing and Sanger sequencing.
The primer pair preferably consists of SEQ ID NO.3 and SEQ ID NO.4.
The detection equipment preferably comprises a gene chip for detecting the mutant COL2A1 gene, a high-throughput sequencing platform and a Sanger sequencing platform.
A kit for detecting osteogenesis imperfecta, said kit comprising:
(1) a reagent for detecting nucleotide position 3944 of the CDS of the COL2A1 gene, or a reagent for detecting amino acid position 1315 of the COL2A1 protein;
(2) instructions for use that explicitly state that mutation of the nucleotide at position 3944 of the CDS of the COL2A1 gene from G to A, or mutation of amino acid 1315 of the COL2A1 protein from C to Y, causes osteogenesis imperfecta.
The reagent is preferably selected from a primer or primer pair, a probe, an antibody, or a nucleic acid chip.
Preferably, the reagent is a gene chip hybridization probe for a deep-sequencing platform.
The reagent is further preferably a primer pair for detecting nucleotide position 3944 of the CDS of the COL2A1 gene, and still further preferably a primer pair consisting of 5'-TGGACTTAGCTCATGCAGAT-3' (SEQ ID NO.3) and 5'-TGGATTGGGGTAGACGC-3' (SEQ ID NO.4).
The gene chip hybridization probe sequence for detecting the nucleotide at 3944bp of CDS of COL2A1 gene in the kit is preferably shown as SEQ ID NO. 5.
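The correspondence between CDS position 3944 and amino acid 1315 stated above can be checked with simple codon arithmetic. The following is a minimal sketch, not part of the patent; the function names are hypothetical, and because the wild-type codon itself is not quoted in this text, both cysteine codons of the standard genetic code are tried.

```python
# Minimal sketch: map a CDS position to its codon and apply a G>A change.
# Only the standard-genetic-code entries needed here are listed.
CODON_TO_AA = {"TGT": "Cys", "TGC": "Cys", "TAT": "Tyr", "TAC": "Tyr"}


def cds_position_to_codon(cds_pos):
    """Return (codon number, position within codon), both 1-based."""
    codon_number = (cds_pos - 1) // 3 + 1
    offset = (cds_pos - 1) % 3 + 1
    return codon_number, offset


def mutate_codon(codon, offset, new_base):
    """Replace the base at the 1-based offset of a three-letter codon."""
    bases = list(codon)
    bases[offset - 1] = new_base
    return "".join(bases)


codon_number, offset = cds_position_to_codon(3944)
print(f"c.3944 lies in codon {codon_number}, position {offset}")

# Assumption: the wild-type codon is one of the two cysteine codons.
for wild_codon in ("TGT", "TGC"):
    mutant = mutate_codon(wild_codon, offset, "A")
    print(wild_codon, CODON_TO_AA[wild_codon], "->", mutant, CODON_TO_AA[mutant])
```

Position 3944 falls on the second base of codon 1315, and a G to A change at the second base of either cysteine codon gives a tyrosine codon, which agrees with the p.Cys1315Tyr change described above.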
A method that uses deep sequencing as a platform to screen OI patients for new COL2A1 mutations, and verifies that such a mutation is pathogenic by combining a zebrafish mutation model with SIFT and PolyPhen protein function prediction, comprises the following steps:
(1) For families in which fetal ultrasound shows skeletal dysplasia (with or without other abnormalities), or with a history of OI genetic disease, collect clinical data and DNA-containing specimens such as blood and tissue, and extract genomic DNA.
(2) Detect a series of genes involved in skeletal dysplasia, including the genes ADAMTSL2, AGPS, ANKH, ARSE, CCDC8, CHST3, COL10A1, COL2A1, COL9A1, COMP, CTSK, CUL 1, DLL 1, EBP, EVC 1, FBN1, FGFR1, FLNB, GNAS, GNPAT, HES 1, LFNG, LMNA, MATN 1, MESP 1, OBSL1, PEX 1, PTH 11, ROR 1, RUNX 1, SLC26 a1, SLC35D1, smarccal 1, SOST, SOX 1, TGFB1, TNFRS, ppf 11, trac 1, trx 1, ntb 1, ptx 1, pctfb 1, ptb 1, ptx 1, spb 1, ptx 1, spb 1, ptx 1, spb 1, spx 1, spf 1, spx 1, spf 1, spx 1, spf 1, spp 1, spf; or directly perform whole-exome detection, in which the mutations detected include point mutations in about 20,000 genes and small insertions and deletions within 20 bp.
(3) Fragment the DNA and prepare a library, capture and enrich the coding regions of the target genes, or the whole exome, together with the adjacent splice-site DNA on a chip, and finally perform mutation detection on a high-throughput sequencing platform.
(4) Perform optimized bioinformatics analysis on the sequencing results to screen out a new OI pathogenic mutation, COL2A1 p.Cys1315Tyr. At the DNA level, the mutation is on chromosome 12, where the base at physical position 48368588 (NCBI database) is changed from G to A. At the protein level, amino acid 1315 of the protein encoded by the COL2A1 gene is changed from cysteine to tyrosine.
(5) In the high-throughput sequencing of step (3), the target region for whole-exome sequencing is 58682415 bp long, the coverage of the target region is at least 99.91%, the average depth of the target region is at least 83.48X, and the proportion of target-region sites with depth greater than 30X is at least 98.70%.
(6) Perform protein function prediction on the new mutation site COL2A1 p.Cys1315Tyr with SIFT and PolyPhen.
(7) Use a zebrafish model to verify that the c.3944G>A point mutation in the COL2A1 gene affects skeletal development. The gene col2a1a (ENSDARG00000069093), which is highly similar to the human COL2A1 gene, is found in the ENSEMBL database. Dominant expression in the human body is simulated by expressing the col2a1a gene carrying the equivalent point mutation in wild-type zebrafish, and the skeletal development of the zebrafish embryos is observed to verify that the new mutation site COL2A1 c.3944G>A causes the skeletal development abnormality OI.
(8) Based on the results of Trio (family) whole-exome detection, determine whether the mutation is a new mutation or an inherited mutation; if it is inherited, also determine from the proportion of the mutation in the parent whether it is a gonadal mosaic mutation.
(9) If the mutation is new, natural conception plus prenatal diagnosis can be chosen for the next pregnancy, and the probability of the same mutation occurring again is very low. If the mutation is inherited, in vitro fertilization plus preimplantation genetic diagnosis plus prenatal diagnosis can be chosen, so that embryos that have inherited the same mutation COL2A1 c.3944G>A are excluded from implantation.
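The sequencing quality criteria in step (5) can be expressed as a simple check on per-base depths. The sketch below is illustrative only: the function names and the toy depth lists are assumptions, and a real pipeline would derive depths from an aligned BAM file with an established tool. The thresholds are the ones stated in step (5).

```python
# Minimal sketch of the step (5) quality check on per-base depth values.
def coverage_metrics(depths, min_depth=30):
    """Summarize per-base depths over a target region."""
    n = len(depths)
    if n == 0:
        raise ValueError("depths must not be empty")
    covered = sum(1 for d in depths if d > 0)
    deep = sum(1 for d in depths if d > min_depth)
    return {
        "coverage_pct": 100.0 * covered / n,
        "mean_depth": sum(depths) / n,
        "pct_above_min_depth": 100.0 * deep / n,
    }


def passes_qc(metrics, min_coverage_pct=99.91, min_mean_depth=83.48,
              min_pct_above=98.70):
    """Apply the thresholds quoted in step (5)."""
    return (metrics["coverage_pct"] >= min_coverage_pct
            and metrics["mean_depth"] >= min_mean_depth
            and metrics["pct_above_min_depth"] >= min_pct_above)


# Toy depth lists (hypothetical values, for illustration only).
good_run = [90] * 10000
poor_run = [0, 40, 100, 100]
print(passes_qc(coverage_metrics(good_run)))
print(passes_qc(coverage_metrics(poor_run)))
```

Keeping the thresholds as keyword arguments makes it easy to apply the same check to a panel with different requirements.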
Advantageous effects
1. The invention reports for the first time a new mutation site, c.3944G>A, in the OI pathogenic gene COL2A1. For this autosomal dominant disease the mutation is pathogenic whether heterozygous or homozygous, and the invention provides a new pathogenic site for diagnosis of the disease.
2. The invention provides a new approach to finding and verifying new clinical mutations. Trio whole-exome sequencing is performed for difficult cases with complex clinical features or unclear disease direction; the harmfulness of the new mutation is analyzed by family-based bioinformatics and verified in animal models such as zebrafish; and, with informed consent, in vitro fertilization is combined with preimplantation diagnosis so that embryos carrying the mutation COL2A1 c.3944G>A are excluded from transfer and a normal child without clinical features of abnormal bone development is obtained.
3. The method can be extended beyond skeletal dysplasia to the diagnosis and analysis of difficult cases with complex clinical features or unclear disease direction in other fields, and provides an effective clinical platform for finding new mutation sites and even new mutant genes.
Drawings
FIG. 1 Fetal ultrasound results
FIG. 2 Family Sanger sequencing results
FIG. 3 Conservation of the mutation site between the human COL2A1 gene and the homologous zebrafish gene; the mutation site described in the present invention (G3944A) and its counterpart in zebrafish are indicated by boxes
FIG. 4 Nucleotide and amino acid map of the sequence at the zebrafish mutation site
FIG. 5 Construction of the zebrafish transcription and microinjection plasmids
FIG. 6 SIFT prediction results for the new site in each database
FIG. 7 PolyPhen prediction results for the new site in each database
FIG. 8 Wild-type zebrafish and COL2A1 c.3944G>A mutant zebrafish
FIG. 9 Fetal Sanger sequencing results
FIG. 10 PGT-M fetal ultrasound results
Detailed Description
The present inventors have extensively and intensively studied and found that a novel mutation site of COL2A1, a gene related to OI, can be used for diagnosing the above-mentioned diseases and for developing a gene therapeutic drug effective for the above-mentioned diseases.
In detecting the variation at the relevant site, the detection may be directed to genomic DNA, to cDNA or mRNA, or to a protein. The mutation can be detected by using known techniques such as Western blotting, Southern blotting, DNA sequencing, PCR and in situ hybridization.
Various techniques can be used to detect the presence of a G to A mutation at position 3944 of the wild-type COL2A1 gene (SEQ ID NO.1), and are encompassed by the present invention. For example, gene chips and high throughput sequencing capture probes are prepared based on the relevant sites. In addition, PCR can be performed using primers specific to the relevant site for identification; or probes that specifically bind can be designed for identification based on the relevant sites; or may be identified using specific restriction enzymes.
As an optional mode, the mutation site can be detected by a PCR-based single-base extension technique. The principle is to design a primer located upstream of the mutation site to be detected, with its 3' end one base before the mutation site. Differently fluorescently labeled ddNTPs are added for the reaction, or, in pyrosequencing, dNTPs and the relevant enzymes are added; the primer is extended only when the added ddNTP or dNTP is complementary to the base at the mutation site. The type of mutation can be determined by detecting the fluorescence emitted by the extended base, or the visible light emitted by the series of enzyme reactions in pyrosequencing.
The invention also includes reagents for detecting the presence of the mutation site (a G to A mutation at position 3944 of the CDS of COL2A1) in an analyte. The reagents are, for example: primers specific to the relevant mutation site, wherein the amplified product contains the base corresponding to position 3944 of the COL2A1 gene; a probe specific to the relevant mutation site, capable of specifically binding to the mutated region but not to the non-mutated region, and carrying a detectable signal; or a restriction enzyme specific to the relevant mutation site.
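The primer placement used in single-base extension can be sketched as follows. This is a minimal illustration under stated assumptions: the template is a made-up toy sequence (not the COL2A1 sequence), the 20-nt primer length and the function names are hypothetical, and the primer is taken from the same strand on which the variant is written, so the incorporated ddNTP carries the same base as that strand at the site.

```python
# Minimal sketch of single-base extension primer placement.
def extension_primer(template, site_index, primer_len=20):
    """Return the primer whose 3' end sits immediately upstream of the
    0-based site_index on the given strand."""
    if site_index < primer_len:
        raise ValueError("not enough upstream sequence for the primer")
    return template[site_index - primer_len:site_index]


def incorporated_ddntp(base_at_site):
    """Name the labeled terminator that extends the primer by one base."""
    return "dd" + base_at_site + "TP"


# Toy sequence (hypothetical): upstream flank, variant base, downstream flank.
upstream = "ATGGCCTTAGCAGGTCCATTGAC"
downstream = "TACGA"
wild_type = upstream + "G" + downstream
mutant = upstream + "A" + downstream
site = len(upstream)

primer = extension_primer(wild_type, site)
print("primer 5'-" + primer + "-3'")
print("wild type incorporates", incorporated_ddntp(wild_type[site]))
print("mutant incorporates", incorporated_ddntp(mutant[site]))
```

Because the same primer anneals to both alleles, a heterozygous sample would give both signals, which is how the assay distinguishes genotypes.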
The kit may also include various reagents required for DNA extraction, RNA extraction, hybridization, color development, and the like, including but not limited to: an extraction solution, an amplification solution, a hybridization solution, an enzyme, a control solution, a color development solution, a washing solution, and the like.
In addition, the kit can also comprise instructions for use, nucleic acid sequence analysis software and the like.
The invention will be further illustrated with reference to the following specific examples.
Example 1
Genetic testing was performed on a fetus in which ultrasound indicated abnormal bone development.
The experimental method comprises the following steps:
1. Case data collection: the husband's karyotype was 46,XY,inv(1)(p11q12), and the pregnancy was achieved by IVF-ET (PGT-A) single-embryo transfer; the serial ultrasound results of the pregnant woman were collected. Umbilical cord tissue of the induced fetus and peripheral blood of the fetus's parents were collected for genetic diagnosis. Genomic DNA was extracted from the blood and tissue samples of each family member using a blood genomic DNA extraction kit (Tiangen Biochemical Technology Co., Ltd.).
2. Mining the pathogenic mutation of the family by high-throughput sequencing: CNV-seq was used to detect deletion and duplication syndromes larger than 100 kb across the whole genome, and whole-exome sequencing was used to detect mutations down to 1 bp across the exome. Clinical whole-exome detection covers the exonic regions of about 20,000 genes in the human genome; the detection strategy was to analyze the 3583 genes with established pathogenic relationships in the OMIM database that relate to the subject's clinical presentation.
3. Trio whole-exome data analysis: adapters, low-quality bases and uncalled bases were filtered from the raw sequencing data, and the filtered data were aligned to the reference genome. SNP (single nucleotide polymorphism), InDel and CNV (copy number variation) analyses were then performed, the variants were annotated against databases, and harmful sites or disease-related genes were screened from the variant calls under three analysis strategies based on variant harmfulness, the family situation of the sample, and gene function and phenotype.
4. Identification of the pathogenic gene by Sanger sequencing verification: the screened mutation site and its adjacent DNA sequence were amplified by PCR in the corresponding family members. The primers were designed with Primer 5 primer design software, and the sequences of the primer pair for detecting the pathogenic mutation are shown in SEQ ID NO.3 and SEQ ID NO.4. The PCR reaction system (50 μl) was: 10× buffer 5 μl, 25 mM MgCl2 3 μl, Taq DNA polymerase 5 U, dNTP mix 2 mM, forward and reverse primers 1.2 μM each, and sterile distilled water to 50 μl. The program on the PCR instrument was: 94 °C for 3 min; 35 cycles of (94 °C, 25 s; 55 °C, 25 s; 72 °C, 20 s); 72 °C for 7 min; hold at 4 °C. Products were separated by 2% agarose electrophoresis and examined on a gel imager, with a Marker added to judge fragment size. Samples showing a single band of the expected size were Sanger sequenced, and the peak height at the mutation site was used to judge whether the site was mutated and whether mosaicism was present.
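The quantities in the PCR protocol of step 4 can be worked through with a few lines of arithmetic. This sketch is illustrative only; the function names are hypothetical, and the time estimate ignores ramping between temperatures and the final 4 °C hold.

```python
# Minimal sketch of the arithmetic behind the PCR protocol in step 4.
def cycling_seconds(initial_denature_s, cycle_steps_s, n_cycles,
                    final_extension_s):
    """Total programmed hold time, excluding ramping and the 4 °C hold."""
    return initial_denature_s + n_cycles * sum(cycle_steps_s) + final_extension_s


def final_concentration(stock_conc, stock_volume_ul, total_volume_ul):
    """Dilution: final concentration in the same unit as the stock."""
    return stock_conc * stock_volume_ul / total_volume_ul


def picomoles(conc_um, volume_ul):
    """Amount in pmol, since 1 µM x 1 µl = 1 pmol."""
    return conc_um * volume_ul


total_s = cycling_seconds(3 * 60, (25, 25, 20), 35, 7 * 60)
minutes, seconds = divmod(total_s, 60)
print(f"programmed time: {total_s} s ({minutes} min {seconds} s)")

# 3 µl of 25 mM MgCl2 in a 50 µl reaction.
print("final MgCl2 (mM):", final_concentration(25, 3, 50))

# Each primer at 1.2 µM in 50 µl.
print("primer per reaction (pmol):", round(picomoles(1.2, 50), 2))
```

The programmed holds add up to 3050 s, that is 50 min 50 s; the final MgCl2 concentration works out to 1.5 mM, and each primer is present at about 60 pmol per reaction.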
The experimental results are as follows:
1. Ultrasound examination of the fetus by the ultrasound department found congenital dysplasia: fetal cervical cystic hygroma (lymphatic cystoma), fetal edema, and short limbs (the long bones of the limbs appeared as short strip-like echoes); the fetal head and trunk appeared wrapped in a cocoon-like shape, the thorax was narrow and small, and the abdomen was bulging (FIG. 1). Lethal dysplasia was considered possible on preliminary assessment, and no family member had similar symptoms.
2. CNV-seq and whole-exome detection on DNA from the induced fetal tissue sample found a c.3944G>A mutation in the COL2A1 gene of the fetus. The mutation was classified as VOUS, that is, a variant of unknown clinical significance, and no other suspected pathogenic mutation sites were found. Whole-exome next-generation sequencing showed a low-proportion mosaic mutation at the same site in the father of the fetus, while the same mutation was not found in the peripheral blood DNA of the mother. Sanger sequencing confirmed that the mutation was heterozygous in the fetal sample, present as a low-proportion mosaic in the father's peripheral blood, and absent from the mother's peripheral blood, consistent with the next-generation sequencing results (FIG. 2). The father was suspected to have gonadal mosaicism, and the birth of a child with the same mutation can be avoided by in vitro fertilization with preimplantation diagnosis.
3. According to the design scheme of the invention, the detected c.3944G > A mutation of the COL2A1 gene is successfully proved to be a new OI pathogenic site.
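The reasoning used in the results above (heterozygous in the fetus, low-proportion mosaic in the father, absent in the mother) can be sketched as a rule on variant allele fractions from trio sequencing. Everything specific in this sketch is an assumption for illustration: the read counts are invented rather than taken from the patent, the detection limit and mosaic cutoff are arbitrary, and the function names are hypothetical.

```python
# Minimal sketch: classify the origin of a child's variant from trio
# allele fractions. Cutoffs and read counts are illustrative assumptions.
def allele_fraction(alt_reads, total_reads):
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return alt_reads / total_reads


def classify_origin(child_af, father_af, mother_af,
                    detect_limit=0.02, mosaic_below=0.30):
    """Label the variant as de novo, inherited, or parental mosaic."""
    if child_af < detect_limit:
        return "variant not detected in child"
    parents = {"father": father_af, "mother": mother_af}
    carriers = {name: af for name, af in parents.items() if af >= detect_limit}
    if not carriers:
        return "apparently new (de novo) mutation"
    labels = []
    for name, af in carriers.items():
        if af < mosaic_below:
            labels.append(f"{name} low-proportion mosaic")
        else:
            labels.append(f"inherited from heterozygous {name}")
    return "; ".join(labels)


# Hypothetical read counts (alt reads, total reads), not from the patent.
child = allele_fraction(48, 100)
father = allele_fraction(6, 100)
mother = allele_fraction(0, 100)
print(classify_origin(child, father, mother))
```

A low allele fraction in parental blood, as in this made-up example, would only suggest mosaicism; as in the case described above, confirmation by an orthogonal method such as Sanger sequencing is still needed.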
Example 2:
Functional studies and animal model studies were performed on the pathogenic gene detected in Example 1, taking the detected new mutation c.3944G>A in the COL2A1 gene as the example.
The experimental method comprises the following steps:
1. Conservation analysis: the frequency of occurrence of the site in each database was evaluated.
2. The pathogenicity of the mutation was predicted from the SIFT and PolyPhen values.
3. An animal model was used to demonstrate that the mutation site is a pathogenic mutation site.
(1) The zebrafish homologs of COL2A1 and the position of the point mutation in them were analyzed, and the correct zebrafish homolog was selected for making the point mutation. A gene highly similar to the human COL2A1 gene, ENSDARG00000069093, was found on the ENSEMBL website, and the conservation of the mutation position was analyzed, as shown in FIG. 3. The alignment shows that the site is conserved in the zebrafish gene, which suggests that the site is functionally important. To verify that mutation of this site in zebrafish results in a similar phenotype, the zebrafish gene col2a1a (ENSDARG00000069093), which has the higher similarity to human COL2A1, was selected for the experiment.
(2) Method for verifying the function of the COL2A1 (G3944A) point mutation: in humans, COL2A1 (G3944A) shows a dominant skeletal dysplasia phenotype at the embryonic stage. The mutation can therefore be verified by expressing a col2a1a gene carrying the equivalent point mutation in wild-type zebrafish, to simulate dominant expression in the human body, and then observing the skeletal development of the zebrafish embryos. Using the col2a1a sequence under ENSDARG00000069093 as a reference, primers were designed to clone the full length of the gene and construct the col2a1a (G3944A) point mutation; the nucleotide and amino acid map of the sequence at the mutation site is shown in FIG. 4.
(3) Construction of a plasmid for expressing col2a1a (G3944A) in zebrafish. Mutation point: G3944A. Mutagenesis primers: with the original plasmid as the template, the promoter region was amplified using primers col2a1a-F1 (5'-CCT CTG ACA CCT GAT GCC AAT TGC-3') and col2a1a-R1 (5'-ATG CAG GTC CTA AGG GGT GAA AGT CG-3'). With zebrafish genomic DNA as the template, the col2a1a mutant fragments were amplified using three primer pairs: col2a1a-M1F1 (5'-ATG TTC AGA TTG CTG GAT TCA CG-3') and col2a1a-CDSR4 (5'-GCC AAT TGG ACC AGT CAA ACC T-3'); col2a1a-F4 (5'-AAG AGG TTT GAC TGG TCC AAT-3') and col2a1a-mutR (5'-CCA TGT TGT AGA AAA CTT TGA TGG CAT CAG CAG-3'); and col2a1a-mut (5'-GCC ATC AAA GTT TTC TAC AAC ATG GAG ACC GGA GAG ACC T-3') and col2a1a-CDSR5 (5'-CAA GAA GCA GAC TGC GCC AAT GTC-3'). The plasmid was assembled by homologous recombination and, once constructed, was sent to a sequencing company; the sequencing results were analyzed to confirm that the plasmid was constructed correctly. The strain was then preserved, and the plasmid was extracted and purified. The plasmid concentration after purification was 400 ng/μL. col2a1a (G3944A) was expressed by DNA microinjection, with Tol2 transposase mediating efficient transgenesis: the constructed overexpression plasmid (FIG. 5) was microinjected together with transposase mRNA to determine whether overexpression affects the phenotype.
(4) Phenotypic observation after expression of col2a1a (G3944A): after microinjection, overall morphological development was observed continuously, with particular attention to the development of the trunk skeleton (presence or absence of curvature).
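The primer pairs listed in step (3) are meant to produce fragments that can be joined by homologous recombination, which requires adjacent fragments to share sequence at their junctions. The following Python sketch checks this for the two internal junctions using only the primer sequences printed above. The function names are illustrative, and the col2a1a-mut sequence is used as printed up to "...ACC T"; treating that as the complete primer is an assumption.

```python
# Primer sequences copied from step (3), with spaces removed.
# Assumption: col2a1a-mut is complete as printed (ending in "...ACC T").
PRIMERS = {
    "col2a1a-CDSR4": "GCCAATTGGACCAGTCAAACCT",
    "col2a1a-F4": "AAGAGGTTTGACTGGTCCAAT",
    "col2a1a-mutR": "CCATGTTGTAGAAAACTTTGATGGCATCAGCAG",
    "col2a1a-mut": "GCCATCAAAGTTTTCTACAACATGGAGACCGGAGAGACCT",
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def suffix_prefix_overlap(a, b):
    """Length of the longest suffix of `a` that equals a prefix of `b`."""
    for k in range(min(len(a), len(b)), 0, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def junction_overlap(reverse_primer, forward_primer):
    """Shared sequence (in nt) between the end of one fragment and the
    start of the next, judged from the two primers at the junction."""
    top_strand = revcomp(reverse_primer)  # reverse primer read on the top strand
    return max(
        suffix_prefix_overlap(top_strand, forward_primer),
        suffix_prefix_overlap(forward_primer, top_strand),
    )


if __name__ == "__main__":
    junctions = [
        ("col2a1a-CDSR4", "col2a1a-F4"),
        ("col2a1a-mutR", "col2a1a-mut"),
    ]
    for rev_name, fwd_name in junctions:
        n = junction_overlap(PRIMERS[rev_name], PRIMERS[fwd_name])
        print(f"{rev_name} / {fwd_name}: {n} nt shared")
```

With the sequences as given, the CDSR4/F4 junction shares 18 nt and the mutR/mut junction shares 25 nt. The mutR/mut overlap contains the TAC codon, so both fragments meeting at that junction carry the introduced point mutation.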
The experimental results are as follows:
2. Phenotypic analysis of spinal curvature showed that injection of the col2a1a plasmid caused curvature of the spinal column in zebrafish (FIG. 8).
Example 3:
An assisted reproduction strategy was drawn up. With the couple's informed consent, preimplantation genetic testing for monogenic disease (PGT-M) in an assisted reproduction cycle, followed by prenatal diagnosis, was selected. The experimental method was as follows:
PGT-M was performed on the embryos remaining after the transfer in Example 1, and an embryo without the mutation (and with a normal chromosomal CNV profile) was selected for transfer. At 18+ weeks of gestation, amniocentesis was performed for karyotype analysis and for detection of the COL2A1 c.3944G>A (p.Cys1315Tyr) mutation, to further confirm that the site was not mutated. Genomic DNA was extracted directly from the subject's peripheral blood, tissue, or amniotic fluid; the fragment of COL2A1 exon 52 containing c.3944G>A (p.Cys1315Tyr) was amplified by PCR and Sanger sequenced; and the sequencing result was compared with the reference sequence of the COL2A1 gene (NM_001844). Ultrasound follow-up and postnatal follow-up were then carried out.
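The position arithmetic behind c.3944G>A (p.Cys1315Tyr) can be checked against the sequence listing. The Python sketch below uses a 120-nt excerpt of SEQ ID NO: 1 (listing positions 4081 to 4200) and assumes that the coding sequence starts at listing position 182, the ATG that matches Met1 of SEQ ID NO: 2. The function names and the two-entry codon table are illustrative and not part of the patent.

```python
# Excerpt of SEQ ID NO: 1, listing positions 4081-4200, uppercased.
WINDOW_START = 4081  # 1-based listing position of WINDOW[0]
WINDOW = (
    "TGACCCCAACCAAGGCTGCACCTTGGACGCCATGAAGGTTTTCTGCAACATGGAGACTGG"
    "CGAGACTTGCGTCTACCCCAATCCAGCAAACGTTCCCAAGAAGAACTGGTGGAGCAGCAA"
)
# Assumption: the CDS starts at the ATG at listing position 182.
CDS_START = 182
SEQ_ID_5 = "AAGGTTTTCTGCAACATGGAG"  # SEQ ID NO: 5 from the listing

# Minimal codon table: only the two codons this check needs.
CODON_TABLE = {"TGC": "Cys", "TAC": "Tyr"}


def locate(c_pos):
    """Map a CDS position (c.N) to (codon number, 0-based offset in codon)."""
    return (c_pos - 1) // 3 + 1, (c_pos - 1) % 3


def listing_position(c_pos):
    """Map a CDS position to its 1-based position in SEQ ID NO: 1."""
    return CDS_START + c_pos - 1


def codon_change(c_pos, ref, alt):
    """Return (reference codon, mutated codon) for a single-base change."""
    _, offset = locate(c_pos)
    start = listing_position(c_pos) - offset - WINDOW_START
    codon = WINDOW[start:start + 3]
    if codon[offset] != ref:
        raise ValueError(f"reference base mismatch: found {codon[offset]}")
    return codon, codon[:offset] + alt + codon[offset + 1:]


if __name__ == "__main__":
    codon_number, offset = locate(3944)
    ref_codon, alt_codon = codon_change(3944, "G", "A")
    print(f"codon {codon_number}, base {offset + 1} of 3")
    print(f"{ref_codon} ({CODON_TABLE[ref_codon]}) -> "
          f"{alt_codon} ({CODON_TABLE[alt_codon]})")
    first = WINDOW_START + WINDOW.find(SEQ_ID_5)
    print(f"SEQ ID NO: 5 spans listing positions {first}-{first + len(SEQ_ID_5) - 1}")
```

Under these assumptions, c.3944 is the second base of codon 1315, the reference codon TGC (Cys) becomes TAC (Tyr), and SEQ ID NO: 5 spans listing positions 4115 to 4135, which includes the variant base at listing position 4125.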
The experimental results are as follows:
1. At the genetic level, the results indicated that the fetus (M3944, II-2) did not carry the COL2A1 c.3944G>A (p.Cys1315Tyr) variant (FIG. 9).
2. Ultrasound showed no abnormality, and after birth the newborn developed normally, with no skeletal deformity (FIG. 10).
Sequence listing
<110> Huang Huan
<120> pathogenic gene COL2A1 mutation of bone dysplasia disease and detection reagent thereof
<160> 5
<170> SIPOSequenceListing 1.0
<210> 1
<211> 5087
<212> DNA
<213> human (Homo sapiens)
<400> 1
aacgggcgcc gcggcgggga gaagacgcag agcgctgctg ggctgccggg tctcccgctt 60
ccccctcctg ctccaagggc ctcctgcatg agggcgcggt agagacccgg acccgcgccg 120
tgctcctgcc gtttcgctgc gctccgcccg ggcccggctc agccaggccc cgcggtgagc 180
catgattcgc ctcggggctc cccagacgct ggtgctgctg acgctgctcg tcgccgctgt 240
ccttcggtgt cagggccagg atgtccagga ggctggcagc tgtgtgcagg atgggcagag 300
gtataatgat aaggatgtgt ggaagccgga gccctgccgg atctgtgtct gtgacactgg 360
gactgtcctc tgcgacgaca taatctgtga agacgtgaaa gactgcctca gccctgagat 420
ccccttcgga gagtgctgcc ccatctgccc aactgacctc gccactgcca gtgggcaacc 480
aggaccaaag ggacagaaag gagaacctgg agacatcaag gatattgtag gacccaaagg 540
acctcctggg cctcagggac ctgcagggga acaaggaccc agaggggatc gtggtgacaa 600
aggtgaaaaa ggtgcccctg gacctcgtgg cagagatgga gaacctggga cccctggaaa 660
tcctggcccc cctggtcctc ccggcccccc tggtccccct ggtcttggtg gaaactttgc 720
tgcccagatg gctggaggat ttgatgaaaa ggctggtggc gcccagttgg gagtaatgca 780
aggaccaatg ggccccatgg gacctcgagg acctccaggc cctgcaggtg ctcctgggcc 840
tcaaggattt caaggcaatc ctggtgaacc tggtgaacct ggtgtctctg gtcccatggg 900
tccccgtggt cctcctggtc cccctggaaa gcctggtgat gatggtgaag ctggaaaacc 960
tggaaaagct ggtgaaaggg gtccgcctgg tcctcagggt gctcgtggtt tcccaggaac 1020
cccaggcctt cctggtgtca aaggtcacag aggttatcca ggcctggacg gtgctaaggg 1080
agaggcgggt gctcctggtg tgaagggtga gagtggttcc ccgggtgaga acggatctcc 1140
gggcccaatg ggtcctcgtg gcctgcctgg tgaaagagga cggactggcc ctgctggcgc 1200
tgcgggtgcc cgaggcaacg atggtcagcc aggccccgca gggcctccgg gtcctgtcgg 1260
tcctgctggt ggtcctggct tccctggtgc tcctggagcc aagggtgaag ccggccccac 1320
tggtgcccgt ggtcctgaag gtgctcaagg tcctcgcggt gaacctggta ctcctgggtc 1380
ccctgggcct gctggtgcct ccggtaaccc tggaacagat ggaattcctg gagccaaagg 1440
atctgctggt gctcctggca ttgctggtgc tcctggcttc cctgggccac ggggccctcc 1500
tggccctcaa ggtgcaactg gtcctctggg cccgaaaggt cagacgggtg aacctggtat 1560
tgctggcttc aaaggtgaac aaggccccaa gggagaacct ggccctgctg gcccccaggg 1620
agcccctgga cccgctggtg aagaaggcaa gagaggtgcc cgtggagagc ctggtggcgt 1680
tgggcccatc ggtccccctg gagaaagagg tgctcccggc aaccgcggtt tcccaggtca 1740
agatggtctg gcaggtccca agggagcccc tggagagcga gggcccagtg gtcttgctgg 1800
ccccaaggga gccaacggtg accctggccg tcctggagaa cctggccttc ctggagcccg 1860
gggtctcact ggccgccctg gtgatgctgg tcctcaaggc aaagttggcc cttctggagc 1920
ccctggtgaa gatggtcgtc ctggacctcc aggtcctcag ggggctcgtg ggcagcctgg 1980
tgtcatgggt ttccctggcc ccaaaggtgc caacggtgag cctggcaaag ctggtgagaa 2040
gggactgcct ggtgctcctg gtctgagggg tcttcctggc aaagatggtg agacaggtgc 2100
tgcaggaccc cctggccctg ctggacctgc tggtgaacga ggcgagcagg gtgctcctgg 2160
gccatctggg ttccagggac ttcctggccc tcctggtccc ccaggtgaag gtggaaaacc 2220
aggtgaccag ggtgttcccg gtgaagctgg agcccctggc ctcgtgggtc ccaggggtga 2280
acgaggtttc ccaggtgaac gtggctctcc cggtgcccag ggcctccagg gtccccgtgg 2340
cctccccggc actcctggca ctgatggtcc caaaggtgca tctggcccag caggcccccc 2400
tggggctcag ggccctccag gtcttcaggg aatgcctggc gagaggggag cagctggtat 2460
cgctgggccc aaaggcgaca ggggtgacgt tggtgagaaa ggccctgagg gagcccctgg 2520
aaaggatggt ggacgaggcc tgacaggtcc cattggcccc cctggcccag ctggtgctaa 2580
tggcgagaag ggagaagttg gacctcctgg tcctgcagga agtgctggtg ctcgtggcgc 2640
tccgggtgaa cgtggagaga ctgggccccc cggaccagcg ggatttgctg ggcctcctgg 2700
tgctgatggc cagcctgggg ccaagggtga gcaaggagag gccggccaga aaggcgatgc 2760
tggtgcccct ggtcctcagg gcccctctgg agcacctggg cctcagggtc ctactggagt 2820
gactggtcct aaaggagccc gaggtgccca aggccccccg ggagccactg gattccctgg 2880
agctgctggc cgcgttggac ccccaggctc caatggcaac cctggacccc ctggtccccc 2940
tggtccttct ggaaaagatg gtcccaaagg tgctcgagga gacagcggcc cccctggccg 3000
agctggtgaa cccggcctcc aaggtcctgc tggaccccct ggcgagaagg gagagcctgg 3060
agatgacggt ccctctggtg ccgaaggtcc accaggtccc cagggtctgg ctggtcagag 3120
aggcatcgtc ggtctgcctg ggcaacgtgg tgagagagga ttccctggct tgcctggccc 3180
gtcgggtgag cccggcaagc agggtgctcc tggagcatct ggagacagag gtcctcctgg 3240
ccccgtgggt cctcctggcc tgacgggtcc tgcaggtgaa cctggacgag agggaagccc 3300
cggtgctgat ggcccccctg gcagagatgg cgctgctgga gtcaagggtg atcgtggtga 3360
gactggtgct gtgggagctc ctggagcccc tgggccccct ggctcccctg gccccgctgg 3420
tccaactggc aagcaaggag acagaggaga agctggtgca caaggcccca tgggaccctc 3480
aggaccagct ggagcccggg gaatccaggg tcctcaaggc cccagaggtg acaaaggaga 3540
ggctggagag cctggcgaga gaggcctgaa gggacaccgt ggcttcactg gtctgcaggg 3600
tctgcccggc cctcctggtc cttctggaga ccaaggtgct tctggtcctg ctggtccttc 3660
tggccctaga ggtcctcctg gccccgtcgg tccctctggc aaagatggtg ctaatggaat 3720
ccctggcccc attgggcctc ctggtccccg tggacgatca ggcgaaaccg gccctgctgg 3780
tcctcctgga aatcctggac cccctggtcc tccaggtccc cctggccctg gcatcgacat 3840
gtccgccttt gctggcttag gcccgagaga gaagggcccc gaccccctgc agtacatgcg 3900
ggccgaccag gcagccggtg gcctgagaca gcatgacgcc gaggtggatg ccacactcaa 3960
gtccctcaac aaccagattg agagcatccg cagccccgag ggctcccgca agaaccctgc 4020
tcgcacctgc agagacctga aactctgcca ccctgagtgg aagagtggag actactggat 4080
tgaccccaac caaggctgca ccttggacgc catgaaggtt ttctgcaaca tggagactgg 4140
cgagacttgc gtctacccca atccagcaaa cgttcccaag aagaactggt ggagcagcaa 4200
gagcaaggag aagaaacaca tctggtttgg agaaaccatc aatggtggct tccatttcag 4260
ctatggagat gacaatctgg ctcccaacac tgccaacgtc cagatgacct tcctacgcct 4320
gctgtccacg gaaggctccc agaacatcac ctaccactgc aagaacagca ttgcctatct 4380
ggacgaagca gctggcaacc tcaagaaggc cctgctcatc cagggctcca atgacgtgga 4440
gatccgggca gagggcaata gcaggttcac gtacactgcc ctgaaggatg gctgcacgaa 4500
acataccggt aagtggggca agactgttat cgagtaccgg tcacagaaga cctcacgcct 4560
ccccatcatt gacattgcac ccatggacat aggagggccc gagcaggaat tcggtgtgga 4620
catagggccg gtctgcttct tgtaaaaacc tgaacccaga aacaacacaa tccgttgcaa 4680
acccaaagga cccaagtact ttccaatctc agtcactcta ggactctgca ctgaatggct 4740
gacctgacct gatgtccatt catcccaccc tctcacagtt cggacttttc tcccctctct 4800
ttctaagaga cctgaactgg gcagactgca aaataaaatc tcggtgttct atttatttat 4860
tgtcttcctg taagaccttc gggtcaaggc agaggcagga aactaactgg tgtgagtcaa 4920
atgccccctg agtgactgcc cccagcccag gccagaagac ctcccttcag gtgccgggcg 4980
caggaactgt gtgtgtccta cacaatggtg ctattctgtg tcaaacacct ctgtattttt 5040
taaaacatca attgatatta aaaatgaaaa gattattgga aagtaca 5087
<210> 2
<211> 1487
<212> PRT
<213> human (Homo sapiens)
<400> 2
Met Ile Arg Leu Gly Ala Pro Gln Thr Leu Val Leu Leu Thr Leu Leu
1 5 10 15
Val Ala Ala Val Leu Arg Cys Gln Gly Gln Asp Val Gln Glu Ala Gly
20 25 30
Ser Cys Val Gln Asp Gly Gln Arg Tyr Asn Asp Lys Asp Val Trp Lys
35 40 45
Pro Glu Pro Cys Arg Ile Cys Val Cys Asp Thr Gly Thr Val Leu Cys
50 55 60
Asp Asp Ile Ile Cys Glu Asp Val Lys Asp Cys Leu Ser Pro Glu Ile
65 70 75 80
Pro Phe Gly Glu Cys Cys Pro Ile Cys Pro Thr Asp Leu Ala Thr Ala
85 90 95
Ser Gly Gln Pro Gly Pro Lys Gly Gln Lys Gly Glu Pro Gly Asp Ile
100 105 110
Lys Asp Ile Val Gly Pro Lys Gly Pro Pro Gly Pro Gln Gly Pro Ala
115 120 125
Gly Glu Gln Gly Pro Arg Gly Asp Arg Gly Asp Lys Gly Glu Lys Gly
130 135 140
Ala Pro Gly Pro Arg Gly Arg Asp Gly Glu Pro Gly Thr Pro Gly Asn
145 150 155 160
Pro Gly Pro Pro Gly Pro Pro Gly Pro Pro Gly Pro Pro Gly Leu Gly
165 170 175
Gly Asn Phe Ala Ala Gln Met Ala Gly Gly Phe Asp Glu Lys Ala Gly
180 185 190
Gly Ala Gln Leu Gly Val Met Gln Gly Pro Met Gly Pro Met Gly Pro
195 200 205
Arg Gly Pro Pro Gly Pro Ala Gly Ala Pro Gly Pro Gln Gly Phe Gln
210 215 220
Gly Asn Pro Gly Glu Pro Gly Glu Pro Gly Val Ser Gly Pro Met Gly
225 230 235 240
Pro Arg Gly Pro Pro Gly Pro Pro Gly Lys Pro Gly Asp Asp Gly Glu
245 250 255
Ala Gly Lys Pro Gly Lys Ala Gly Glu Arg Gly Pro Pro Gly Pro Gln
260 265 270
Gly Ala Arg Gly Phe Pro Gly Thr Pro Gly Leu Pro Gly Val Lys Gly
275 280 285
His Arg Gly Tyr Pro Gly Leu Asp Gly Ala Lys Gly Glu Ala Gly Ala
290 295 300
Pro Gly Val Lys Gly Glu Ser Gly Ser Pro Gly Glu Asn Gly Ser Pro
305 310 315 320
Gly Pro Met Gly Pro Arg Gly Leu Pro Gly Glu Arg Gly Arg Thr Gly
325 330 335
Pro Ala Gly Ala Ala Gly Ala Arg Gly Asn Asp Gly Gln Pro Gly Pro
340 345 350
Ala Gly Pro Pro Gly Pro Val Gly Pro Ala Gly Gly Pro Gly Phe Pro
355 360 365
Gly Ala Pro Gly Ala Lys Gly Glu Ala Gly Pro Thr Gly Ala Arg Gly
370 375 380
Pro Glu Gly Ala Gln Gly Pro Arg Gly Glu Pro Gly Thr Pro Gly Ser
385 390 395 400
Pro Gly Pro Ala Gly Ala Ser Gly Asn Pro Gly Thr Asp Gly Ile Pro
405 410 415
Gly Ala Lys Gly Ser Ala Gly Ala Pro Gly Ile Ala Gly Ala Pro Gly
420 425 430
Phe Pro Gly Pro Arg Gly Pro Pro Gly Pro Gln Gly Ala Thr Gly Pro
435 440 445
Leu Gly Pro Lys Gly Gln Thr Gly Glu Pro Gly Ile Ala Gly Phe Lys
450 455 460
Gly Glu Gln Gly Pro Lys Gly Glu Pro Gly Pro Ala Gly Pro Gln Gly
465 470 475 480
Ala Pro Gly Pro Ala Gly Glu Glu Gly Lys Arg Gly Ala Arg Gly Glu
485 490 495
Pro Gly Gly Val Gly Pro Ile Gly Pro Pro Gly Glu Arg Gly Ala Pro
500 505 510
Gly Asn Arg Gly Phe Pro Gly Gln Asp Gly Leu Ala Gly Pro Lys Gly
515 520 525
Ala Pro Gly Glu Arg Gly Pro Ser Gly Leu Ala Gly Pro Lys Gly Ala
530 535 540
Asn Gly Asp Pro Gly Arg Pro Gly Glu Pro Gly Leu Pro Gly Ala Arg
545 550 555 560
Gly Leu Thr Gly Arg Pro Gly Asp Ala Gly Pro Gln Gly Lys Val Gly
565 570 575
Pro Ser Gly Ala Pro Gly Glu Asp Gly Arg Pro Gly Pro Pro Gly Pro
580 585 590
Gln Gly Ala Arg Gly Gln Pro Gly Val Met Gly Phe Pro Gly Pro Lys
595 600 605
Gly Ala Asn Gly Glu Pro Gly Lys Ala Gly Glu Lys Gly Leu Pro Gly
610 615 620
Ala Pro Gly Leu Arg Gly Leu Pro Gly Lys Asp Gly Glu Thr Gly Ala
625 630 635 640
Ala Gly Pro Pro Gly Pro Ala Gly Pro Ala Gly Glu Arg Gly Glu Gln
645 650 655
Gly Ala Pro Gly Pro Ser Gly Phe Gln Gly Leu Pro Gly Pro Pro Gly
660 665 670
Pro Pro Gly Glu Gly Gly Lys Pro Gly Asp Gln Gly Val Pro Gly Glu
675 680 685
Ala Gly Ala Pro Gly Leu Val Gly Pro Arg Gly Glu Arg Gly Phe Pro
690 695 700
Gly Glu Arg Gly Ser Pro Gly Ala Gln Gly Leu Gln Gly Pro Arg Gly
705 710 715 720
Leu Pro Gly Thr Pro Gly Thr Asp Gly Pro Lys Gly Ala Ser Gly Pro
725 730 735
Ala Gly Pro Pro Gly Ala Gln Gly Pro Pro Gly Leu Gln Gly Met Pro
740 745 750
Gly Glu Arg Gly Ala Ala Gly Ile Ala Gly Pro Lys Gly Asp Arg Gly
755 760 765
Asp Val Gly Glu Lys Gly Pro Glu Gly Ala Pro Gly Lys Asp Gly Gly
770 775 780
Arg Gly Leu Thr Gly Pro Ile Gly Pro Pro Gly Pro Ala Gly Ala Asn
785 790 795 800
Gly Glu Lys Gly Glu Val Gly Pro Pro Gly Pro Ala Gly Ser Ala Gly
805 810 815
Ala Arg Gly Ala Pro Gly Glu Arg Gly Glu Thr Gly Pro Pro Gly Pro
820 825 830
Ala Gly Phe Ala Gly Pro Pro Gly Ala Asp Gly Gln Pro Gly Ala Lys
835 840 845
Gly Glu Gln Gly Glu Ala Gly Gln Lys Gly Asp Ala Gly Ala Pro Gly
850 855 860
Pro Gln Gly Pro Ser Gly Ala Pro Gly Pro Gln Gly Pro Thr Gly Val
865 870 875 880
Thr Gly Pro Lys Gly Ala Arg Gly Ala Gln Gly Pro Pro Gly Ala Thr
885 890 895
Gly Phe Pro Gly Ala Ala Gly Arg Val Gly Pro Pro Gly Ser Asn Gly
900 905 910
Asn Pro Gly Pro Pro Gly Pro Pro Gly Pro Ser Gly Lys Asp Gly Pro
915 920 925
Lys Gly Ala Arg Gly Asp Ser Gly Pro Pro Gly Arg Ala Gly Glu Pro
930 935 940
Gly Leu Gln Gly Pro Ala Gly Pro Pro Gly Glu Lys Gly Glu Pro Gly
945 950 955 960
Asp Asp Gly Pro Ser Gly Ala Glu Gly Pro Pro Gly Pro Gln Gly Leu
965 970 975
Ala Gly Gln Arg Gly Ile Val Gly Leu Pro Gly Gln Arg Gly Glu Arg
980 985 990
Gly Phe Pro Gly Leu Pro Gly Pro Ser Gly Glu Pro Gly Lys Gln Gly
995 1000 1005
Ala Pro Gly Ala Ser Gly Asp Arg Gly Pro Pro Gly Pro Val Gly Pro
1010 1015 1020
Pro Gly Leu Thr Gly Pro Ala Gly Glu Pro Gly Arg Glu Gly Ser Pro
1025 1030 1035 1040
Gly Ala Asp Gly Pro Pro Gly Arg Asp Gly Ala Ala Gly Val Lys Gly
1045 1050 1055
Asp Arg Gly Glu Thr Gly Ala Val Gly Ala Pro Gly Ala Pro Gly Pro
1060 1065 1070
Pro Gly Ser Pro Gly Pro Ala Gly Pro Thr Gly Lys Gln Gly Asp Arg
1075 1080 1085
Gly Glu Ala Gly Ala Gln Gly Pro Met Gly Pro Ser Gly Pro Ala Gly
1090 1095 1100
Ala Arg Gly Ile Gln Gly Pro Gln Gly Pro Arg Gly Asp Lys Gly Glu
1105 1110 1115 1120
Ala Gly Glu Pro Gly Glu Arg Gly Leu Lys Gly His Arg Gly Phe Thr
1125 1130 1135
Gly Leu Gln Gly Leu Pro Gly Pro Pro Gly Pro Ser Gly Asp Gln Gly
1140 1145 1150
Ala Ser Gly Pro Ala Gly Pro Ser Gly Pro Arg Gly Pro Pro Gly Pro
1155 1160 1165
Val Gly Pro Ser Gly Lys Asp Gly Ala Asn Gly Ile Pro Gly Pro Ile
1170 1175 1180
Gly Pro Pro Gly Pro Arg Gly Arg Ser Gly Glu Thr Gly Pro Ala Gly
1185 1190 1195 1200
Pro Pro Gly Asn Pro Gly Pro Pro Gly Pro Pro Gly Pro Pro Gly Pro
1205 1210 1215
Gly Ile Asp Met Ser Ala Phe Ala Gly Leu Gly Pro Arg Glu Lys Gly
1220 1225 1230
Pro Asp Pro Leu Gln Tyr Met Arg Ala Asp Gln Ala Ala Gly Gly Leu
1235 1240 1245
Arg Gln His Asp Ala Glu Val Asp Ala Thr Leu Lys Ser Leu Asn Asn
1250 1255 1260
Gln Ile Glu Ser Ile Arg Ser Pro Glu Gly Ser Arg Lys Asn Pro Ala
1265 1270 1275 1280
Arg Thr Cys Arg Asp Leu Lys Leu Cys His Pro Glu Trp Lys Ser Gly
1285 1290 1295
Asp Tyr Trp Ile Asp Pro Asn Gln Gly Cys Thr Leu Asp Ala Met Lys
1300 1305 1310
Val Phe Cys Asn Met Glu Thr Gly Glu Thr Cys Val Tyr Pro Asn Pro
1315 1320 1325
Ala Asn Val Pro Lys Lys Asn Trp Trp Ser Ser Lys Ser Lys Glu Lys
1330 1335 1340
Lys His Ile Trp Phe Gly Glu Thr Ile Asn Gly Gly Phe His Phe Ser
1345 1350 1355 1360
Tyr Gly Asp Asp Asn Leu Ala Pro Asn Thr Ala Asn Val Gln Met Thr
1365 1370 1375
Phe Leu Arg Leu Leu Ser Thr Glu Gly Ser Gln Asn Ile Thr Tyr His
1380 1385 1390
Cys Lys Asn Ser Ile Ala Tyr Leu Asp Glu Ala Ala Gly Asn Leu Lys
1395 1400 1405
Lys Ala Leu Leu Ile Gln Gly Ser Asn Asp Val Glu Ile Arg Ala Glu
1410 1415 1420
Gly Asn Ser Arg Phe Thr Tyr Thr Ala Leu Lys Asp Gly Cys Thr Lys
1425 1430 1435 1440
His Thr Gly Lys Trp Gly Lys Thr Val Ile Glu Tyr Arg Ser Gln Lys
1445 1450 1455
Thr Ser Arg Leu Pro Ile Ile Asp Ile Ala Pro Met Asp Ile Gly Gly
1460 1465 1470
Pro Glu Gln Glu Phe Gly Val Asp Ile Gly Pro Val Cys Phe Leu
1475 1480 1485
<210> 3
<211> 20
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 3
tggacttagc tcatgcagat 20
<210> 4
<211> 17
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 4
tggattgggg tagacgc 17
<210> 5
<211> 21
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 5
aaggttttct gcaacatgga g 21
Claims (2)
1. Use of a reagent for detecting a mutant COL2A1 gene in the preparation of a detection reagent for osteogenesis imperfecta, wherein the mutant COL2A1 gene carries the heterozygous or homozygous mutation c.3944G>A; the wild-type COL2A1 gene has the NCBI accession number NM_001844.4; the nucleotide at position 3944 of the CDS is mutated from G to A; and the rest of the sequence is identical to the wild type.
2. The use according to claim 1, wherein the reagent for detecting the mutant COL2A1 gene is selected from one or more of a probe and a primer for detecting the mutant COL2A1 gene.
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011551982.5A CN112608925B (en) | 2020-12-24 | 2020-12-24 | Pathogenic gene COL2A1 mutation of bone dysplasia disease and detection reagent thereof |
PCT/CN2020/141426 WO2022134165A1 (en) | 2020-12-24 | 2020-12-30 | Pathogenic gene col1a2 mutation of bone dysplasia disease and detection reagent thereof |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112608925A CN112608925A (en) | 2021-04-06 |
CN112608925B true CN112608925B (en) | 2022-08-30 |
Family
ID=75244741
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202011551982.5A Expired - Fee Related CN112608925B (en) | 2020-12-24 | 2020-12-24 | Pathogenic gene COL2A1 mutation of bone dysplasia disease and detection reagent thereof |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN112608925B (en) |
WO (1) | WO2022134165A1 (en) |
Also Published As
Publication number | Publication date |
---|---|
CN112608925A (en) | 2021-04-06 |
WO2022134165A1 (en) | 2022-06-30 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
CF01 | Termination of patent right due to non-payment of annual fee | ||
Granted publication date: 20220830 |