KR102250063B1 - Method for identifying causative genes of Tourette syndrome - Google Patents
Method for identifying causative genes of Tourette syndrome
- Publication number
- KR102250063B1 (application KR1020190070779A)
- Authority
- KR
- South Korea
- Prior art keywords
- gly
- pro
- leu
- ser
- thr
- Prior art date
Classifications
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q1/00—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
- C12Q1/68—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
- C12Q1/6876—Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
- C12Q1/6883—Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B20/00—ICT specially adapted for functional genomics or proteomics, e.g. genotype-phenotype associations
- G16B20/20—Allele or variant detection, e.g. single nucleotide polymorphism [SNP] detection
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B25/00—ICT specially adapted for hybridisation; ICT specially adapted for gene or protein expression
- G16B25/10—Gene or protein expression profiling; Expression-ratio estimation or normalisation
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q2600/00—Oligonucleotides characterized by their use
- C12Q2600/156—Polymorphic or mutational markers
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q2600/00—Oligonucleotides characterized by their use
- C12Q2600/158—Expression markers
Landscapes
- Health & Medical Sciences (AREA)
- Life Sciences & Earth Sciences (AREA)
- Chemical & Material Sciences (AREA)
- Physics & Mathematics (AREA)
- Genetics & Genomics (AREA)
- Engineering & Computer Science (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Proteomics, Peptides & Aminoacids (AREA)
- Molecular Biology (AREA)
- Analytical Chemistry (AREA)
- Biotechnology (AREA)
- Biophysics (AREA)
- Organic Chemistry (AREA)
- General Health & Medical Sciences (AREA)
- Theoretical Computer Science (AREA)
- Bioinformatics & Computational Biology (AREA)
- Wood Science & Technology (AREA)
- Zoology (AREA)
- Spectroscopy & Molecular Physics (AREA)
- Medical Informatics (AREA)
- Evolutionary Biology (AREA)
- Microbiology (AREA)
- General Engineering & Computer Science (AREA)
- Pathology (AREA)
- Biochemistry (AREA)
- Immunology (AREA)
- Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)
Abstract
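The present invention relates to a method for identifying causative genes of Tourette syndrome. Analysis of the Tourette syndrome gene group of the present invention enables effective diagnosis of Tourette syndrome. In addition, the method for screening Tourette syndrome causative genes and Tourette syndrome-related CNVs according to the present invention can increase the efficiency of identifying causative genes while reducing cost and time.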
Description
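The present invention relates to a method for identifying causative genes of Tourette syndrome.
Tourette syndrome (TS) is a neurological disorder and is known as the most common cause of motor tics, which are sudden, simple, repetitive movements that are difficult to control voluntarily, and of vocal tics, in which sounds are produced. Tourette syndrome is defined as part of a spectrum of tic disorders that includes transient and persistent (chronic) tics. Its exact cause is unknown, but it is thought to involve a combination of genetic and environmental factors. Tourette syndrome is often accompanied by other conditions: attention-deficit/hyperactivity disorder has been reported in 60% of Tourette syndrome cases, and obsessive-compulsive disorder (27%), obsessive-compulsive behavior (32%), learning disabilities (23%), and conduct disorder/oppositional defiant disorder (15%) may also co-occur (Kwon, Lee and Shin, 2012).
Meanwhile, with the advent of whole exome sequencing (WES) based on next-generation sequencing (NGS) technology, genetic research on a wide range of diseases, including their causative genes and mutations, is expanding rapidly.
Tourette syndrome is one of the intractable diseases, and there is currently no specific diagnostic test for it; diagnosis relies on a physician's opinion and MRI examination. Accordingly, there is a growing need for a diagnostic system capable of analyzing all causative genes associated with Tourette syndrome. In particular, genetic approaches such as NGS are considered well suited to identifying causative genes and mutations in heterogeneous diseases, which have more than one or two causative genes.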
Kwon SJ, Lee JY, Shin JP. Bilateral retinal detachment in a 15-year-old child with Tourette syndrome. J Korean Ophthalmol Soc 2012;53(11):1704-1707.
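Against this background, the present inventors made intensive research efforts to develop a method for identifying causative genes and copy number variations (CNVs) of Tourette syndrome. As a result, they identified causative genes of Tourette syndrome, established an algorithm for identifying them, and built a diagnostic system, thereby completing the present invention.
An object of the present invention is to provide a method for identifying causative genes of Tourette syndrome.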
To achieve the above object, the present invention provides a Tourette syndrome biomarker composition comprising at least one SNV-related protein selected from the group consisting of COL27A1, BTBD9, SGCE, MECP2, USH2A, CEP290, DRD5, ASIC3-1, ASIC3-2, and TRAPPC9.
The present invention also provides a Tourette syndrome biomarker composition comprising at least one CNV-related protein selected from the group consisting of MST1L, GBP3, CFHR3, CFHR1, OR2T2, OR2T3, AQP12A, MUC4, USP17L17, USP17L18, TMPRSS11E, UGT2B17, PDZD2, GOLPH3, KLHL3, CTNNA3, FSCB, DUOXA1, DLG4, ACADVL, CDRT1, and BSG.
The present invention also provides a kit for diagnosing Tourette syndrome comprising an agent that detects at least one SNV-related protein, or a gene encoding it, selected from the group consisting of TPH2, HTR1F, COL27A1, BTBD9, IL1RN, SGCE, MECP2, ITGA1, DRD3, USH2A, CEP290, DRD5, SLC6A2, ASIC3-1, ASIC3-2, TRAPPC9, and HTR3A.
The present invention also provides a kit for diagnosing Tourette syndrome comprising, as an active ingredient, at least one CNV-related protein selected from the group consisting of MST1L, GBP3, CFHR3, CFHR1, OR2T2, OR2T3, AQP12A, MUC4, USP17L17, USP17L18, TMPRSS11E, UGT2B17, PDZD2, GOLPH3, KLHL3, CTNNA3, FSCB, DUOXA1, DLG4, ACADVL, CDRT1, and BSG.
The present invention also provides a DNA microarray chip for diagnosing Tourette syndrome on which is immobilized at least one SNV-related gene, or a nucleic acid complementary to it, selected from the group consisting of TPH2, HTR1F, COL27A1, BTBD9, IL1RN, SGCE, MECP2, ITGA1, DRD3, USH2A, CEP290, DRD5, SLC6A2, ASIC3-1, ASIC3-2, TRAPPC9, and HTR3A.
The present invention also provides a DNA microarray chip for diagnosing Tourette syndrome on which is immobilized at least one CNV-related gene, or a nucleic acid complementary to it, selected from the group consisting of MST1L, GBP3, CFHR3, CFHR1, OR2T2, OR2T3, AQP12A, MUC4, USP17L17, USP17L18, TMPRSS11E, UGT2B17, PDZD2, GOLPH3, KLHL3, CTNNA3, FSCB, DUOXA1, DLG4, ACADVL, CDRT1, and BSG.
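The present invention also provides a method for screening causative genes of Tourette syndrome and Tourette syndrome-related CNVs, comprising: 1) obtaining SNV (single nucleotide variation) and CNV (copy number variation) data from a Tourette syndrome patient, or the patient's parents or siblings; 2) mapping the SNV and CNV data; and 3) identifying the mutated position of a Tourette syndrome causative gene or identifying a CNV.
The present invention also provides a system for diagnosing Tourette syndrome, comprising: 1) a data acquisition unit into which Tourette syndrome SNV data, CNV data, and family information are input; 2) a data computation unit that calculates priority scores from the input SNV data, CNV data, and family information using preset formulas and a window; 3) a mapping unit that maps the SNVs and CNVs selected according to the priority scores calculated by the computation unit to Tourette syndrome SNV and CNV data; and 4) an identification unit that outputs the Tourette syndrome risk status using the mapped SNVs and CNVs.
The present invention also provides a method for providing information for the diagnosis of Tourette syndrome, comprising: 1) identifying a variation in an SNV-related gene or a CNV-related gene in a sample isolated from an individual suspected of having Tourette syndrome; and 2) determining that the individual has Tourette syndrome when a variation has occurred in the SNV-related gene or the CNV-related gene.
Analysis of the Tourette syndrome gene group of the present invention enables effective diagnosis of Tourette syndrome. In addition, the method for screening Tourette syndrome causative genes and Tourette syndrome-related CNVs according to the present invention can increase the efficiency of identifying causative genes while reducing cost and time.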
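FIG. 1 is a schematic diagram of the method for analyzing nucleotide variations in Tourette syndrome samples.
FIG. 2 is a schematic diagram of the analysis method for deriving Tourette syndrome-related genes through nucleotide variation analysis and CNV analysis of Tourette syndrome samples.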
In one aspect, the present invention provides a Tourette syndrome biomarker composition comprising at least one SNV-related protein selected from the group consisting of COL27A1, BTBD9, SGCE, MECP2, USH2A, CEP290, DRD5, ASIC3-1, ASIC3-2, and TRAPPC9.
Here, each protein may be a polypeptide having the following amino acid sequence: COL27A1, SEQ ID NO: 1; BTBD9, SEQ ID NO: 3; SGCE, SEQ ID NO: 5; MECP2, SEQ ID NO: 7; USH2A, SEQ ID NO: 9; CEP290, SEQ ID NO: 11; DRD5, SEQ ID NO: 13; ASIC3-1, SEQ ID NO: 15; ASIC3-2, SEQ ID NO: 17; TRAPPC9, SEQ ID NO: 19.
As used herein, the term "COL27A1" is an abbreviation of "Collagen Type XXVII Alpha 1 Chain". The COL27A1 protein of the present invention may be a polypeptide having the amino acid sequence of SEQ ID NO: 1, or a sequence with about 70%, 80%, 90%, or 95% or more homology to it. The gene encoding the COL27A1 protein may be a polynucleotide having the nucleotide sequence of SEQ ID NO: 2, or a sequence with about 70%, 80%, 90%, or 95% or more homology to it.
As used herein, the term "BTBD9" is an abbreviation of "BTB domain containing 9". The BTBD9 protein of the present invention may be a polypeptide having the amino acid sequence of SEQ ID NO: 3, or a sequence with about 70%, 80%, 90%, or 95% or more homology to it. The gene encoding the BTBD9 protein may be a polynucleotide having the nucleotide sequence of SEQ ID NO: 4, or a sequence with about 70%, 80%, 90%, or 95% or more homology to it.
As used herein, the term "SGCE" is an abbreviation of "sarcoglycan epsilon". The SGCE protein of the present invention may be a polypeptide having the amino acid sequence of SEQ ID NO: 5, or a sequence with about 70%, 80%, 90%, or 95% or more homology to it. The gene encoding the SGCE protein may be a polynucleotide having the nucleotide sequence of SEQ ID NO: 6, or a sequence with about 70%, 80%, 90%, or 95% or more homology to it.
As used herein, the term "MECP2" is an abbreviation of "methyl CpG binding protein 2". The MECP2 protein of the present invention may be a polypeptide having the amino acid sequence of SEQ ID NO: 7, or a sequence with about 70%, 80%, 90%, or 95% or more homology to it. The gene encoding the MECP2 protein may be a polynucleotide having the nucleotide sequence of SEQ ID NO: 8, or a sequence with about 70%, 80%, 90%, or 95% or more homology to it.
As used herein, the term "USH2A" is an abbreviation of "Usher syndrome 2A". The USH2A protein of the present invention may be a polypeptide having the amino acid sequence of SEQ ID NO: 9, or a sequence with about 70%, 80%, 90%, or 95% or more homology to it. The gene encoding the USH2A protein may be a polynucleotide having the nucleotide sequence of SEQ ID NO: 10, or a sequence with about 70%, 80%, 90%, or 95% or more homology to it.
As used herein, the term "CEP290" is an abbreviation of "Centrosomal protein of 290 kDa". The CEP290 protein of the present invention may be a polypeptide having the amino acid sequence of SEQ ID NO: 11, or a sequence with about 70%, 80%, 90%, or 95% or more homology to it. The gene encoding the CEP290 protein may be a polynucleotide having the nucleotide sequence of SEQ ID NO: 12, or a sequence with about 70%, 80%, 90%, or 95% or more homology to it.
As used herein, the term "DRD5" refers to dopamine receptor D5. The DRD5 protein of the present invention may be a polypeptide having the amino acid sequence of SEQ ID NO: 13, or a sequence with about 70%, 80%, 90%, or 95% or more homology to it. The gene encoding the DRD5 protein may be a polynucleotide having the nucleotide sequence of SEQ ID NO: 14, or a sequence with about 70%, 80%, 90%, or 95% or more homology to it.
As used herein, the term "ASIC3-1" is an abbreviation of "Acid-sensing ion channel 1". The ASIC3-1 protein of the present invention may be a polypeptide having the amino acid sequence of SEQ ID NO: 15, or a sequence with about 70%, 80%, 90%, or 95% or more homology to it. The gene encoding the ASIC3-1 protein may be a polynucleotide having the nucleotide sequence of SEQ ID NO: 16, or a sequence with about 70%, 80%, 90%, or 95% or more homology to it.
As used herein, the term "ASIC3-2" is an abbreviation of "Acid-sensing ion channel 2". The ASIC3-2 protein of the present invention may be a polypeptide having the amino acid sequence of SEQ ID NO: 17, or a sequence with about 70%, 80%, 90%, or 95% or more homology to it. The gene encoding the ASIC3-2 protein may be a polynucleotide having the nucleotide sequence of SEQ ID NO: 18, or a sequence with about 70%, 80%, 90%, or 95% or more homology to it.
As used herein, the "TRAPPC9" protein may be a polypeptide having the amino acid sequence of SEQ ID NO: 19, or a sequence with about 70%, 80%, 90%, or 95% or more homology to it. The gene encoding the TRAPPC9 protein may be a polynucleotide having the nucleotide sequence of SEQ ID NO: 20, or a sequence with about 70%, 80%, 90%, or 95% or more homology to it.
The present invention also provides a Tourette syndrome biomarker composition comprising at least one CNV-related protein selected from the group consisting of MST1L, GBP3, CFHR3, CFHR1, OR2T2, OR2T3, AQP12A, MUC4, USP17L17, USP17L18, TMPRSS11E, UGT2B17, PDZD2, GOLPH3, KLHL3, CTNNA3, FSCB, DUOXA1, DLG4, ACADVL, CDRT1, and BSG.
Here, each protein may be a polypeptide having the following amino acid sequence: MST1L, SEQ ID NO: 21; GBP3, SEQ ID NO: 23; CFHR3, SEQ ID NO: 25; CFHR1, SEQ ID NO: 27; OR2T2, SEQ ID NO: 29; OR2T3, SEQ ID NO: 31; AQP12A, SEQ ID NO: 33; MUC4, SEQ ID NO: 35; USP17L17, SEQ ID NO: 37; USP17L18, SEQ ID NO: 39; TMPRSS11E, SEQ ID NO: 41; UGT2B17, SEQ ID NO: 43; PDZD2, SEQ ID NO: 45; GOLPH3, SEQ ID NO: 47; KLHL3, SEQ ID NO: 49; CTNNA3, SEQ ID NO: 51; FSCB, SEQ ID NO: 53; DUOXA1, SEQ ID NO: 55; DLG4, SEQ ID NO: 57; ACADVL, SEQ ID NO: 59; CDRT1, SEQ ID NO: 61; BSG, SEQ ID NO: 63.
As used herein, the term "MST1L" is an abbreviation of "macrophage stimulating 1-like". The MST1L protein of the present invention may be a polypeptide having the amino acid sequence of SEQ ID NO: 21, or a sequence with about 70%, 80%, 90%, or 95% or more homology to it. The gene encoding the MST1L protein may be a polynucleotide having the nucleotide sequence of SEQ ID NO: 22, or a sequence with about 70%, 80%, 90%, or 95% or more homology to it.
As used herein, the term "GBP3" is an abbreviation of "guanylate binding protein 3". The GBP3 protein of the present invention may be a polypeptide having the amino acid sequence of SEQ ID NO: 23, or a sequence with about 70%, 80%, 90%, or 95% or more homology to it. The gene encoding the GBP3 protein may be a polynucleotide having the nucleotide sequence of SEQ ID NO: 24, or a sequence with about 70%, 80%, 90%, or 95% or more homology to it.
As used herein, the term "CFHR3" is an abbreviation of "Complement Factor H Related 3". The CFHR3 protein of the present invention may be a polypeptide having the amino acid sequence of SEQ ID NO: 25, or a sequence with about 70%, 80%, 90%, or 95% or more homology to it. The gene encoding the CFHR3 protein may be a polynucleotide having the nucleotide sequence of SEQ ID NO: 26, or a sequence with about 70%, 80%, 90%, or 95% or more homology to it.
As used herein, the term "CFHR1" is an abbreviation of "Complement Factor H Related 1". The CFHR1 protein of the present invention may be a polypeptide having the amino acid sequence of SEQ ID NO: 27, or a sequence with about 70%, 80%, 90%, or 95% or more homology to it. The gene encoding the CFHR1 protein may be a polynucleotide having the nucleotide sequence of SEQ ID NO: 28, or a sequence with about 70%, 80%, 90%, or 95% or more homology to it.
As used herein, the term "OR2T2" is an abbreviation of "Olfactory receptor 2T2". The OR2T2 protein of the present invention may be a polypeptide having the amino acid sequence of SEQ ID NO: 29, or a sequence with about 70%, 80%, 90%, or 95% or more homology to it. The gene encoding the OR2T2 protein may be a polynucleotide having the nucleotide sequence of SEQ ID NO: 30, or a sequence with about 70%, 80%, 90%, or 95% or more homology to it.
As used herein, the term "OR2T3" is an abbreviation of "Olfactory Receptor Family 2 Subfamily T Member 3". The OR2T3 protein of the present invention may be a polypeptide having the amino acid sequence of SEQ ID NO: 31, or a sequence with about 70%, 80%, 90%, or 95% or more homology to it. The gene encoding the OR2T3 protein may be a polynucleotide having the nucleotide sequence of SEQ ID NO: 32, or a sequence with about 70%, 80%, 90%, or 95% or more homology to it.
As used herein, the term "AQP12A" is an abbreviation of "Aquaporin 12A". The AQP12A protein of the present invention may be a polypeptide having the amino acid sequence of SEQ ID NO: 33, or a sequence with about 70%, 80%, 90%, or 95% or more homology to it. The gene encoding the AQP12A protein may be a polynucleotide having the nucleotide sequence of SEQ ID NO: 34, or a sequence with about 70%, 80%, 90%, or 95% or more homology to it.
As used herein, the term "MUC4" is an abbreviation of "Mucin 4". The MUC4 protein of the present invention may be a polypeptide having the amino acid sequence of SEQ ID NO: 35, or a sequence with about 70%, 80%, 90%, or 95% or more homology to it. The gene encoding the MUC4 protein may be a polynucleotide having the nucleotide sequence of SEQ ID NO: 36, or a sequence with about 70%, 80%, 90%, or 95% or more homology to it.
As used herein, the term "USP17L17" is an abbreviation of "Ubiquitin Specific Peptidase 17-Like Family Member 17". The USP17L17 protein of the present invention may be a polypeptide having the amino acid sequence of SEQ ID NO: 37, or a sequence with about 70%, 80%, 90%, or 95% or more homology to it. The gene encoding the USP17L17 protein may be a polynucleotide having the nucleotide sequence of SEQ ID NO: 38, or a sequence with about 70%, 80%, 90%, or 95% or more homology to it.
As used herein, the term "USP17L18" is an abbreviation of "Ubiquitin Specific Peptidase 17-Like Family Member 18". The USP17L18 protein of the present invention may be a polypeptide having the amino acid sequence of SEQ ID NO: 39, or a sequence with about 70%, 80%, 90%, or 95% or more homology to it. The gene encoding the USP17L18 protein may be a polynucleotide having the nucleotide sequence of SEQ ID NO: 40, or a sequence with about 70%, 80%, 90%, or 95% or more homology to it.
As used herein, the term "TMPRSS11E" is an abbreviation of "Transmembrane Serine Protease 11E". The TMPRSS11E protein of the present invention may be a polypeptide having the amino acid sequence of SEQ ID NO: 41, or a sequence with about 70%, 80%, 90%, or 95% or more homology to it. The gene encoding the TMPRSS11E protein may be a polynucleotide having the nucleotide sequence of SEQ ID NO: 42, or a sequence with about 70%, 80%, 90%, or 95% or more homology to it.
As used herein, the term "UGT2B17" is an abbreviation of "UDP Glucuronosyltransferase Family 2 Member B17". The UGT2B17 protein of the present invention may be a polypeptide having the amino acid sequence of SEQ ID NO: 43, or a sequence with about 70%, 80%, 90%, or 95% or more homology to it. The gene encoding the UGT2B17 protein may be a polynucleotide having the nucleotide sequence of SEQ ID NO: 44, or a sequence with about 70%, 80%, 90%, or 95% or more homology to it.
As used herein, the term "PDZD2" is an abbreviation of "PDZ Domain Containing 2". The PDZD2 protein of the present invention may be a polypeptide having the amino acid sequence of SEQ ID NO: 45, or a sequence with about 70%, 80%, 90%, or 95% or more homology to it. The gene encoding the PDZD2 protein may be a polynucleotide having the nucleotide sequence of SEQ ID NO: 46, or a sequence with about 70%, 80%, 90%, or 95% or more homology to it.
As used herein, the term "GOLPH3" is an abbreviation of "Golgi Phosphoprotein 3". The GOLPH3 protein of the present invention may be a polypeptide having the amino acid sequence of SEQ ID NO: 47, or a sequence with about 70%, 80%, 90%, or 95% or more homology to it. The gene encoding the GOLPH3 protein may be a polynucleotide having the nucleotide sequence of SEQ ID NO: 48, or a sequence with about 70%, 80%, 90%, or 95% or more homology to it.
As used herein, the term "KLHL3" is an abbreviation of "Kelch Like Family Member 3". The KLHL3 protein of the present invention may be a polypeptide having the amino acid sequence of SEQ ID NO: 49, or a sequence with about 70%, 80%, 90%, or 95% or more homology to it. The gene encoding the KLHL3 protein may be a polynucleotide having the nucleotide sequence of SEQ ID NO: 50, or a sequence with about 70%, 80%, 90%, or 95% or more homology to it.
As used herein, the term "CTNNA3" is an abbreviation of "Catenin Alpha 3". The CTNNA3 protein of the present invention may be a polypeptide having the amino acid sequence of SEQ ID NO: 51, or a sequence with about 70%, 80%, 90%, or 95% or more homology to it. The gene encoding the CTNNA3 protein may be a polynucleotide having the nucleotide sequence of SEQ ID NO: 52, or a sequence with about 70%, 80%, 90%, or 95% or more homology to it.
As used herein, the term "FSCB" is an abbreviation of "Fibrous Sheath CABYR Binding Protein". The FSCB protein of the present invention may be a polypeptide having the amino acid sequence of SEQ ID NO: 53, or a sequence with about 70%, 80%, 90%, or 95% or more homology to it. The gene encoding the FSCB protein may be a polynucleotide having the nucleotide sequence of SEQ ID NO: 54, or a sequence with about 70%, 80%, 90%, or 95% or more homology to it.
As used herein, the term "DUOXA1" is an abbreviation of "Dual Oxidase Maturation Factor 1". The DUOXA1 protein of the present invention may be a polypeptide having the amino acid sequence of SEQ ID NO: 55, or a sequence with about 70%, 80%, 90%, or 95% or more homology to it. The gene encoding the DUOXA1 protein may be a polynucleotide having the nucleotide sequence of SEQ ID NO: 56, or a sequence with about 70%, 80%, 90%, or 95% or more homology to it.
As used herein, the term "DLG4" is an abbreviation of "Discs Large MAGUK Scaffold Protein 4". The DLG4 protein of the present invention may be a polypeptide having the amino acid sequence of SEQ ID NO: 57, or a sequence with about 70%, 80%, 90%, or 95% or more homology to it. The gene encoding the DLG4 protein may be a polynucleotide having the nucleotide sequence of SEQ ID NO: 58, or a sequence with about 70%, 80%, 90%, or 95% or more homology to it.
As used herein, the term "ACADVL" is an abbreviation of "Acyl-CoA Dehydrogenase Very Long Chain". The ACADVL protein of the present invention may be a polypeptide having the amino acid sequence of SEQ ID NO: 59, or a sequence with about 70%, 80%, 90%, or 95% or more homology to it. The gene encoding the ACADVL protein may be a polynucleotide having the nucleotide sequence of SEQ ID NO: 60, or a sequence with about 70%, 80%, 90%, or 95% or more homology to it.
As used herein, the term "CDRT1" is an abbreviation of "CMT1A Duplicated Region Transcript 1". The CDRT1 protein of the present invention may be a polypeptide having the amino acid sequence of SEQ ID NO: 61, or a sequence with about 70%, 80%, 90%, or 95% or more homology to it. The gene encoding the CDRT1 protein may be a polynucleotide having the nucleotide sequence of SEQ ID NO: 62, or a sequence with about 70%, 80%, 90%, or 95% or more homology to it.
As used herein, the term "BSG" is an abbreviation of "Basigin". The BSG protein of the present invention may be a polypeptide having the amino acid sequence of SEQ ID NO: 63, or a sequence with about 70%, 80%, 90%, or 95% or more homology to it. The gene encoding the BSG protein may be a polynucleotide having the nucleotide sequence of SEQ ID NO: 64, or a sequence with about 70%, 80%, 90%, or 95% or more homology to it.
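The homology thresholds recited above (about 70%, 80%, 90%, or 95% or more) can be illustrated with a minimal sketch. It assumes that homology is measured as percent identity over a pairwise alignment that has already been computed, which the specification does not state; the function names `percent_identity` and `meets_threshold` are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two pre-aligned sequences ('-' marks a gap).

    Assumption: identity is counted over all alignment columns except those
    where both sequences have a gap. Other conventions exist.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == "-" and b == "-")]
    if not columns:
        return 0.0
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def meets_threshold(aligned_a: str, aligned_b: str, threshold: float = 70.0) -> bool:
    """True when the two sequences reach the given percent-identity threshold."""
    return percent_identity(aligned_a, aligned_b) >= threshold
```

For example, two ten-residue sequences that differ at one position have 90% identity, so they would satisfy the 70%, 80%, and 90% thresholds but not the 95% threshold.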
As used herein, the term "single nucleotide variant (SNV)" refers to a genomic variation in which a single nucleotide differs. The term "copy number variation (CNV)" is defined as a variation in which, when two different DNA sequences are compared, the copy number of a DNA segment of 50 bp or longer differs. CNV is one of the most important types of variation associated with human diseases such as autism, intellectual disability, epilepsy, schizophrenia, childhood obesity, and cancer.
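In another aspect, the present invention provides a kit for diagnosing Tourette syndrome comprising an agent that detects at least one SNV-related protein, or a gene encoding it, selected from the group consisting of TPH2, HTR1F, COL27A1, BTBD9, IL1RN, SGCE, MECP2, ITGA1, DRD3, USH2A, CEP290, DRD5, SLC6A2, ASIC3-1, ASIC3-2, TRAPPC9, and HTR3A. Here, the detection agent may be a primer or probe capable of binding complementarily to the gene encoding the SNV-related protein.
The present invention also provides a kit for diagnosing Tourette syndrome comprising, as an active ingredient, at least one CNV-related protein selected from the group consisting of MST1L, GBP3, CFHR3, CFHR1, OR2T2, OR2T3, AQP12A, MUC4, USP17L17, USP17L18, TMPRSS11E, UGT2B17, PDZD2, GOLPH3, KLHL3, CTNNA3, FSCB, DUOXA1, DLG4, ACADVL, CDRT1, and BSG. Here, the detection agent may be a primer or probe capable of binding complementarily to the gene encoding the CNV-related protein.
"Primer" or "probe" refers to a nucleic acid sequence that has a free 3' hydroxyl group, binds complementarily to a template so that reverse transcriptase or DNA polymerase can initiate replication of the template, forms base pairs with the complementary nucleic acid template, and serves as the starting point for copying the template strand. Based on the known sequence information of the genes according to the present invention, such probes or primers can be readily prepared using techniques well known in the art.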
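In one embodiment of the present invention, exome probes were produced using SureSelect (Agilent Technologies, Santa Clara, CA). Specifically, the untranslated regions (UTRs) of the genes were excluded from the targets, and exome probes were designed based on the sequences of 4,402 exon regions covering the coding sequence (CDS) regions of a total of 219 genes.
Here, the exome refers to the sum of the exonic regions of the genome. In the present invention, whole exome sequencing (WES) genome data from Korean Tourette syndrome patients were analyzed to calibrate the method for analyzing Tourette syndrome causative genes. WES is based on next-generation sequencing (NGS) technology.
In NGS, DNA is extracted from a specimen and mechanically fragmented, and a library of a specific size is prepared and used for sequencing. A high-throughput sequencer generates raw sequencing data by repeating the binding and release of the four kinds of complementary nucleotides one base at a time. Bioinformatics analysis steps are then performed, including trimming of the raw data, mapping, identification of genomic variants, and annotation of variant information, to discover genomic variants that affect, or are likely to affect, diseases and various biological phenotypes. Among NGS technologies, amplicon-based NGS designs primers that amplify the genes of interest, produces many short reads, and then aligns and analyzes them. A representative technique is emulsion PCR, and instruments based on it include the Roche 454 platform and the Thermo Fisher SOLiD and Ion Torrent platforms.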
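As used herein, the term "diagnostic kit" refers to a material that can distinguish a biological sample collected from a Tourette syndrome patient from a biological sample collected from a healthy individual. In the present invention, the kit may be, but is not limited to, an RT-PCR kit, a DNA chip kit, or a protein chip kit.
In yet another aspect, the present invention provides a DNA microarray chip for diagnosing Tourette syndrome on which is immobilized at least one SNV-related gene, or a nucleic acid complementary to it, selected from the group consisting of TPH2, HTR1F, COL27A1, BTBD9, IL1RN, SGCE, MECP2, ITGA1, DRD3, USH2A, CEP290, DRD5, SLC6A2, ASIC3-1, ASIC3-2, TRAPPC9, and HTR3A.
The present invention also provides a DNA microarray chip for diagnosing Tourette syndrome on which is immobilized at least one CNV-related gene, or a nucleic acid complementary to it, selected from the group consisting of MST1L, GBP3, CFHR3, CFHR1, OR2T2, OR2T3, AQP12A, MUC4, USP17L17, USP17L18, TMPRSS11E, UGT2B17, PDZD2, GOLPH3, KLHL3, CTNNA3, FSCB, DUOXA1, DLG4, ACADVL, CDRT1, and BSG.
As used herein, the term "microarray chip" refers to a solid substrate on whose surface biomaterials such as DNA, proteins, or cells are immobilized at high density. It is a useful tool for obtaining biological information for purposes such as analyzing gene expression patterns, gene defects, DNA-protein interactions, protein-protein interactions, and chemical-protein interactions, and diagnosing disease. Analyzers for microarray chips may use detection techniques such as fluorescence, chemiluminescence, and mass spectrometry. Of these, fluorescence labeling, in which the sample is labeled with a fluorescent substance and analyzed with a fluorescence scanner, is the most widely used.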
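In yet another aspect, the present invention provides a method for screening causative genes of Tourette syndrome and Tourette syndrome-related CNVs, comprising: 1) obtaining SNV (single nucleotide variation) and CNV (copy number variation) data from a Tourette syndrome patient, or the patient's parents or siblings; 2) mapping the SNV and CNV data; and 3) identifying the mutated position of a Tourette syndrome causative gene or identifying a CNV.
Specifically, in the present invention, causative genes of Tourette syndrome were screened through the following steps: i) preprocessing the exome sequence data using the Sickle program (https://github.com/najoshi/sickle); ii) mapping to the human reference sequence using Burrows-Wheeler Aligner (BWA) version 0.1.17; iii) local realignment using GATK Lite version 2.3.9; iv) base recalibration using the BaseRecalibrator option of GATK Lite 2.3.9; v) variant discovery using the UnifiedGenotyper option of GATK Lite 2.3.9; vi) variant filtering using the VariantFiltration option of GATK Lite 2.3.9; and vii) selecting, from the sequence information of the Tourette syndrome patients and their parents and siblings, the positions at which the sequences agree among the patients and agree among the unaffected individuals, respectively. The method for screening Tourette syndrome causative genes and Tourette syndrome-related CNVs according to the present invention may further include a step of judging a variant to be a Tourette syndrome-related CNV when the copy number is 2 or more.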
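Step vii) and the copy-number rule can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: genotypes are assumed to be already called and stored per family member as a position-to-genotype mapping, and a position is kept only when the patients' shared genotype also differs from the unaffected members' shared genotype (an added assumption). The names `segregating_positions` and `is_cnv_call` are hypothetical.

```python
from typing import Dict, List


def segregating_positions(
    genotypes: Dict[str, Dict[int, str]],
    affected: List[str],
    unaffected: List[str],
) -> List[int]:
    """Positions where all patients agree and all unaffected members agree.

    Assumption: the two groups must also differ from each other, otherwise
    the position carries no information about disease status.
    """
    if not affected or not unaffected:
        raise ValueError("need at least one affected and one unaffected member")
    members = affected + unaffected
    # Only positions genotyped in every family member can be compared.
    shared = set.intersection(*(set(genotypes[m]) for m in members))
    hits = []
    for pos in sorted(shared):
        patient_calls = {genotypes[m][pos] for m in affected}
        control_calls = {genotypes[m][pos] for m in unaffected}
        if len(patient_calls) == 1 and len(control_calls) == 1 and patient_calls != control_calls:
            hits.append(pos)
    return hits


def is_cnv_call(copy_number: int, threshold: int = 2) -> bool:
    """Copy-number rule as worded in the text: copy number of 2 or more."""
    return copy_number >= threshold


family = {
    "father": {100: "A/A", 200: "C/T"},
    "mother": {100: "A/A", 200: "C/C"},
    "child": {100: "A/G", 200: "C/T"},
}
candidates = segregating_positions(family, affected=["child"], unaffected=["father", "mother"])
```

In this toy family, position 100 is retained because both unaffected parents share one genotype and the affected child carries another, while position 200 is dropped because the parents disagree.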
The SNV may occur in any one or more of the following genes: ABCA13, ADCY2, ADORA1, ADORA2, AGPAT5, AMBRA1, ANK3, ARHGAP26, ARHGAP30, ARID1A, ARL8A, ASIC3-1, ASIC3-2, ATF6, ATP1A1, BARD1, BCAS3, BCAT1, BDNF, BSN, BTBD9, C8A, CACNA1D, CAMSAP1, CAPRIN2, CARD8, CBFA2T1, CCAR1, CDK12, CELSR3, CEP290, CHD2, CHD5, CHRNA7, CIT, CLCN1, CNTNAP2, COL27A1, CPA4, CREBBP, CSDE1, CSNK1G3, CTCF, CX3CL1, CYP2B6, CYP2C18, DBH, DCLK2, DENND5A, DHX15, DLG5, DLGAP3, DNAH2, DNAJC13, DOCK7, DPP6, DRD1, DRD2, DRD3, DSCAM, DSCAML1, EVPL, FAM120A, FAM71A, FBXO15, FMNL2, FN1, FRY, GAPVD1, GBX2, GCH1, GDNF, GET4, GIGYF1, GNB2L1, GOPC, HDAC5, HDC, HEATR5B, HECTD3, HEPACAM2, HERC1, HERC2, HIST1H1T, HLA-E, HNRNPA0, HTR1F, HTR2C, HTR3A, IL16, IL1RN, ITGA1, IMMP2L, ITOA1, ITPR2, KBTBD8, KDM5B, KIAA0368, KIAA1429, KLHL32, KLHL9, KNDC1, KRTAP10-4, LAX1, LILRA2, LLGL1, LMNA, LRP8, LZTR1, MAB21L2, MARK2, MCM7, ME2, MECP2, MGAM, MPL, MRPL3, MUC5B, MYH10, MYH4, MYO5A, NCBP1, NID1, NIPBL, NLGN4X, NLRP11, NPC1, NUP85, OFCC1, OLFM1, OPA1, OR9I1, PAG1, PDP1, PKD1L1, PREX2, PROM1, PYROXD2, RELN, RFWD3, RNF213, RYR1, RYR2, RYR3, SCN11A, SCNN1B, SEL1L3, Serotonin 1B, SGCE, SH3TC1, SKP2, SLC1A3, SLC38A8, SLC6A1, SLC6A2, SLITRK1, SLO6A2, SNRNP200, SOCE, SPEN, SPRY2, SPTBN1, SRGAP3, SSBP2, ST18, STAB2, TDRD9, TGM1, THBS3, TLN2, TMEM147, TNPO1, TOX, TP53BP2, TPH2, TPX2, TRAPPC9, TTN, TULP4, UBASH3A, UBR4, UNC13C, USH2A, USPL1, WDFY3, WDR72, WNK4, WNT7B, WWC1, YLPM1, ZMIZ1, ZNF385A, ZNF799, or DRD5.
The CNV may occur in any one or more of the following genes: A2BP1, AADAC, ACADVL, ADSL, ALDH18A1, AQP12A, ASTN2, AUTS2, BSG, CACNA1C, CBR2, CDH10, CDH13, CDH18, CDRT1, CFHR1, CFHR3, CNTN4, CNTNAP2, Col8A1, CTNNA3, CTNND2, DISC1, DLG4, DOPEY2, DPP6, DUOXA1, FHIT, FSCB, GABRA4, GABRB1, GABRG1, GALNT13, GBP3, GOLPH3, GPR89A, GRM8, KCNE1, KCNMA1, KLHL3, MACROD2, MST1L, MUC4, NF1, NRXN1, NSD1, OR2R2, OR2T2, OR2T3, OXTR, P2RX2, PAK7, PARK2, PDE9A, PDZD2, POLE, RB1CC1, SEMA5A, TBX1, TMEM195, TMPRSS11E, UGT2B17, USP17L17, USP17L18, or WDR4.
The step of obtaining SNV (single nucleotide variation) and CNV (copy number variation) data from a Tourette syndrome patient, or the patient's parents or siblings, may be performed by next-generation sequencing (NGS) using primers or probes that bind specifically to a nucleotide sequence encoding one or more genes selected from the SNV candidate group and the CNV candidate group.
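In yet another aspect, the present invention provides a system for diagnosing Tourette syndrome, comprising: 1) a data acquisition unit into which Tourette syndrome SNV data, CNV data, and family information are input; 2) a data computation unit that calculates priority scores from the input SNV data, CNV data, and family information using preset formulas and a window; 3) a mapping unit that maps the SNVs and CNVs selected according to the priority scores calculated by the computation unit to Tourette syndrome SNV and CNV data; and 4) an identification unit that outputs the Tourette syndrome risk status using the mapped SNVs and CNVs.
The SNV data may include at least one gene selected from the group consisting of COL27A1, BTBD9, SGCE, MECP2, USH2A, CEP290, DRD5, ASIC3-1, ASIC3-2, and TRAPPC9. The CNV data may include at least one gene selected from the group consisting of MST1L, GBP3, CFHR3, CFHR1, OR2T2, OR2T3, AQP12A, MUC4, USP17L17, USP17L18, TMPRSS11E, UGT2B17, PDZD2, GOLPH3, KLHL3, CTNNA3, FSCB, DUOXA1, DLG4, ACADVL, CDRT1, and BSG.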
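In the Tourette syndrome diagnostic system of the present invention, the data computation unit may operate through the following steps: (i) quantifying the SNV and CNV data from the input SNV data, CNV data, and family information using Equation 1 or Equation 2 below;
[Equation 1]
[Equation 2]
(ii) setting the number of family members whose quantified SNV and CNV data can be analyzed as the window size n; (iii) calculating the proportion of families under analysis using the quantified SNV and CNV data within the set window; (iv) calculating a significance probability (p-value) at the SNV and CNV positions within the set window using a one-sample proportion test; (v) calculating a weight to correct for the physical positions of both ends of the set window; (vi) calculating a score using the calculated p-value and weight; (vii) checking whether, at single-nucleotide positions where the score from (vi) is at least -log(0.05) = 2.996, the patterns of the Tourette syndrome patients and of the unaffected individuals each agree; (viii) checking whether the single-nucleotide positions satisfying (vii) lie in a coding region; (ix) converting the single-nucleotide positions satisfying (viii) to gene symbols; and (x) ranking by score and reviewing the list of candidate causative genes.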
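Steps (iv), (vi), and (vii) can be sketched numerically. The text gives only the cutoff, -log(0.05) = 2.996 (natural logarithm); the sketch below additionally assumes a two-sided one-sample proportion z-test with a null proportion `p0`, and a score of the form -log(p) multiplied by the weight. Both are illustrative assumptions, since Equations 1 to 3 and the weight formula are not reproduced in this text, and the function names are hypothetical.

```python
import math
import sys

# Cutoff stated in the text: -log(0.05) with the natural logarithm.
SCORE_THRESHOLD = -math.log(0.05)


def one_proportion_p_value(successes: int, n: int, p0: float = 0.5) -> float:
    """Two-sided one-sample proportion z-test (normal approximation).

    p0 is an assumed null proportion; the source does not specify it.
    """
    if n <= 0 or not 0.0 < p0 < 1.0:
        raise ValueError("n must be positive and p0 strictly between 0 and 1")
    p_hat = successes / n
    standard_error = math.sqrt(p0 * (1.0 - p0) / n)
    z = (p_hat - p0) / standard_error
    # Two-sided tail probability of the standard normal distribution.
    return math.erfc(abs(z) / math.sqrt(2.0))


def window_score(p_value: float, weight: float = 1.0) -> float:
    """Assumed score form: -log(p) scaled by the window-edge weight."""
    safe_p = max(p_value, sys.float_info.min)  # avoid log(0)
    return -math.log(safe_p) * weight


def passes_threshold(score: float) -> bool:
    """Step (vii) cutoff: keep positions scoring at least -log(0.05)."""
    return score >= SCORE_THRESHOLD
```

Under these assumptions, a position with p = 0.01 and weight 1.0 scores about 4.6 and passes the cutoff, whereas p = 0.2 scores about 1.6 and does not.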
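When sequence data from both parents of the Tourette syndrome patient are available, the SNV and CNV data may be quantified using Equation 1. In Equation 1, SNVjv(S) denotes the j-th SNV of the v-th family member. S is a variable that distinguishes Tourette syndrome patients from unaffected individuals: SNVjv(0) and SNVjv(1) denote an unaffected individual and a Tourette syndrome patient, respectively. In addition, "v=1", "v=2", and "v=3,...,V" denote the sequence data of the father, the mother, and the (v-2)-th child, respectively. In Equation 1, REFj denotes the genotype at the j-th position of the human genome reference sequence, and LFj, the pattern frequency across the whole family, can be calculated using Equation 3 below, where C is the number of children.
[Equation 3]
When sequence data are available for only one parent of the Tourette syndrome patient, or for neither parent, the single-nucleotide information may be quantified using Equation 2.
In Equation 2, MSNVj(S=1) denotes the most frequent pattern among the SNVs carried only by the Tourette syndrome patients within the family under analysis, and the family-wide pattern frequency LFj can be calculated using Equation 3. Scores are assigned to the candidate causative genes of familial Tourette syndrome based on LFj at the SNVs. If no candidate causative gene can be selected from the SNVs, a CNV found in a gene on the list of known Tourette syndrome CNV genes may be defined as the causative gene for that family.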
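The fallback from SNV candidates to known CNV genes described above can be sketched as a small decision function. This is an illustration only: the scoring itself is taken as given, and the function name `assign_causative_genes` and the input shapes are hypothetical.

```python
from typing import Dict, List, Set


def assign_causative_genes(
    snv_scores: Dict[str, float],
    cnv_genes_found: Set[str],
    known_cnv_genes: Set[str],
) -> List[str]:
    """Return SNV candidates ranked by score, or fall back to known CNV genes.

    snv_scores maps a gene symbol to its (already computed) score for one
    family. When it is empty, CNVs observed in genes from the known list
    are reported instead.
    """
    if snv_scores:
        return sorted(snv_scores, key=lambda gene: snv_scores[gene], reverse=True)
    return sorted(cnv_genes_found & known_cnv_genes)


with_snv = assign_causative_genes({"SGCE": 3.4, "DRD5": 5.1}, {"MUC4"}, {"MUC4", "BSG"})
without_snv = assign_causative_genes({}, {"MUC4", "TTN"}, {"MUC4", "BSG"})
```

The gene symbols are taken from the lists in this document purely as example inputs; the scores are made up for illustration.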
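The present invention also provides a method for providing information for the diagnosis of Tourette syndrome, comprising: 1) identifying a variation in an SNV-related gene or a CNV-related gene in a sample isolated from an individual suspected of having Tourette syndrome; and 2) determining that the individual has Tourette syndrome when a variation has occurred in the SNV-related gene or the CNV-related gene.
Specifically, when both SNV and CNV data are predicted from previously known genes in a sample isolated from an individual suspected of having Tourette syndrome, the present invention includes a step in which the causative SNV or CNV variant is determined by additionally providing output from programs that predict how strongly a variant affects protein structure, together with data predicting the cross-species conservation of the SNV. In addition, by comparing against genomic data from unaffected individuals to determine how frequently each SNV occurs in them, SNVs and CNVs selected by the prediction programs that also occur in unaffected individuals can be excluded, which improved the accuracy of determining Tourette syndrome causative genes.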
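The exclusion of variants that also occur in unaffected individuals can be sketched as a simple frequency filter. The variant identifiers, the frequency table, and the default cutoff of zero are illustrative assumptions; the text does not specify a numeric cutoff.

```python
from typing import Dict, List


def exclude_variants_seen_in_controls(
    candidates: List[str],
    control_frequency: Dict[str, float],
    max_control_frequency: float = 0.0,
) -> List[str]:
    """Drop candidate variants whose frequency in controls exceeds the cutoff.

    Variants missing from the control table are treated as unobserved
    (frequency 0.0), which is an assumption of this sketch.
    """
    return [
        variant for variant in candidates
        if control_frequency.get(variant, 0.0) <= max_control_frequency
    ]


kept = exclude_variants_seen_in_controls(
    ["chr1:100A>G", "chr2:200C>T"],
    {"chr2:200C>T": 0.12},
)
```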
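Hereinafter, the present invention is described in more detail through examples. These examples are intended to illustrate the present invention more specifically, and the scope of the present invention is not limited to them.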
I. Tourette syndrome-related genes
Example 1. Selection of study subjects
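A total of 81 individuals from 34 families, comprising Tourette syndrome patients and their unaffected family members, were enrolled in this study. Clinical evaluation was performed independently by two neurologists. All participants provided informed consent according to procedures approved by the Institutional Review Board of Inje University Paik Hospital. The study was conducted with 36 Tourette syndrome patients and 45 unaffected family members, the latter serving as healthy controls.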
Example 2. Selection of candidate genes for detecting causative genes of Tourette syndrome
To select Tourette syndrome-related genes, the OMIM (Online Mendelian Inheritance in Man) database, which contains information on a wide range of diseases, and the prior literature were reviewed, and the optimal Tourette syndrome-related candidate genes were selected with priority given to genes frequently altered in Tourette syndrome patients.
As a result, the following were selected as genes for detecting Tourette syndrome causative genes: CNV (46 genes: A2BP1, AADAC, ADSL, ALDH18A1, ASTN2, AUTS2, CACNA1C, CBR2, CDH10, CDH13, CDH18, CNTN4, CNTNAP2, Col8A1, CTNNA3, CTNND2, DISC1, DOPEY2, DPP6, DUOXA1, FHIT, FSCB, GABRA4, GABRB1, GABRG1, GALNT13, GPR89A, GRM8, KCNE1, KCNMA1, KLHL3, MACROD2, NF1, NRXN1, NSD1, OXTR, P2RX2, PAK7, PARK2, PDE9A, POLE, RB1CC1, SEMA5A, TBX1, TMEM195, WDR4) and SNV (176 genes: ABCA13, ADCY2, ADORA1, ADORA2, AGPAT5, AMBRA1, ANK3, ARHGAP26, ARHGAP30, ARID1A, ARL8A, ATF6, ATP1A1, BARD1, BCAS3, BCAT1, BDNF, BSN, BTBD9, C8A, CACNA1D, CAMSAP1, CAPRIN2, CARD8, CBFA2T1, CCAR1, CDK12, CELSR3, CHD2, CHD5, CHRNA7, CIT, CLCN1, CNTNAP2, COL27A1, CPA4, CREBBP, CSDE1, CSNK1G3, CTCF, CX3CL1, CYP2B6, CYP2C18, DBH, DCLK2, DENND5A, DHX15, DLG5, DLGAP3, DNAH2, DNAJC13, DOCK7, DPP6, DRD1, DRD2, DSCAM, DSCAML1, EVPL, FAM120A, FAM71A, FBXO15, FMNL2, FN1, FRY, GAPVD1, GBX2, GCH1, GDNF, GET4, GIGYF1, GNB2L1, GOPC, HDAC5, HDC, HEATR5B, HECTD3, HEPACAM2, HERC1, HERC2, HIST1H1T, HLA-E, HNRNPA0, HTR2C, IL16, IMMP2L, ITPR2, KBTBD8, KDM5B, KIAA0368, KIAA1429, KLHL32, KLHL9, KNDC1, KRTAP10-4, LAX1, LILRA2, LLGL1, LMNA, LRP8, LZTR1, MAB21L2, MARK2, MCM7, ME2, MGAM, MPL, MRPL3, MUC5B, MYH10, MYH4, MYO5A, NCBP1, NID1, NIPBL, NLGN4X, NLRP11, NPC1, NUP85, OFCC1, OLFM1, OPA1, OR9I1, PAG1, PDP1, PKD1L1, PREX2, PROM1, PYROXD2, RELN, RFWD3, RNF213, RYR1, RYR2, RYR3, SCN11A, SCNN1B, SEL1L3, Serotonin 1B, SH3TC1, SKP2, SLC1A3, SLC38A8, SLC6A1, SLITRK1, SNRNP200, SPEN, SPRY2, SPTBN1, SRGAP3, SSBP2, ST18, STAB2, TDRD9, TGM1, THBS3, TLN2, TMEM147, TNPO1, TOX, TP53BP2, TPX2, TTN, TULP4, UBASH3A, UBR4, UNC13C, USPL1, WDFY3, WDR72, WNK4, WNT7B, WWC1, YLPM1, ZMIZ1, ZNF385A, ZNF799).
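Example 3. Preparation of probes for detecting causative genes of Tourette syndrome
Primer design was performed with SureSelect (Agilent Technologies, Santa Clara, CA). The UTR (untranslated region) of each gene was excluded from the targets, and probes were designed to cover as much of the protein-coding region as possible. Exome probes were prepared on the basis of the sequences of 4,402 exon regions, including the CDS (coding sequence) regions of a total of 219 genes. The total size of the 4,402 exon regions is 1,170,663 bp.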
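Example 4. Preparation of NGS libraries for Tourette syndrome detection genes
For NGS (next-generation sequencing) of the genes used to detect Tourette syndrome, genomic DNA was isolated from the blood of Tourette syndrome patients using the QIAamp DNA Mini Kit (Qiagen, Valencia, CA, USA). The concentration, purity, and degradation status of the isolated genomic DNA were then checked with a Nanodrop 8000 UV-Vis spectrometer (Thermo Scientific Inc., DE, USA), a Qubit 2.0 Fluorometer (Life Technologies Inc., Grand Island, NY, USA), and a 2200 TapeStation instrument (Agilent Technologies, Santa Clara, CA, USA). Clinical samples that met the QC (quality control) criteria were used in the subsequent steps.
Genomic DNA (~250 ng) obtained from the blood of clinical samples that passed QC was sheared with a Covaris S220 (Covaris, MA, USA), and sequencing libraries were then prepared through end-repair, A-tailing, paired-end adaptor ligation, and amplification. To capture the 4,402 exon regions of the 219 genes selected in Example 2 above, a composition containing all of the prepared probes was used. Library hybridization was carried out at 65°C for 24 hours, and the genomic DNA library fragments captured by hybridization were purified. Purification relied on the binding between streptavidin and the biotin attached to the exons. Specifically, streptavidin-coated magnetic beads were bound to the biotin attached to the captured library fragments, and the captured fragments were then separated from the mixture by magnetic force. The purified genomic DNA library fragments were subsequently amplified together with index barcode tags.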
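Example 5. Acquisition of sequence information for Tourette syndrome detection genes
Sequencing libraries capturing the exome of Tourette syndrome clinical samples, comprising 4,402 exon regions of 222 genes, were loaded onto an NGS sequencer (MiSeq, Illumina, USA) to obtain the sequence of each DNA fragment. The genomic data were then trimmed and aligned to the human reference genome to obtain sequence information for each gene in the samples. Sequencing reactions used the TruSeq Rapid PE Cluster kit and the TruSeq Rapid SBS kit (Illumina, USA) and were run under paired-end conditions reading 100 bp in each direction.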
Example 6. Extraction of Tourette syndrome variant data
The sequencing reads produced by the NGS instrument were processed and then aligned to the UCSC hg19 reference genome (http://genome.ucsc.edu) using the Burrows-Wheeler Aligner (BWA) algorithm. When QC-passed sequences are mapped to the UCSC hg19 reference, PCR duplicate reads generated during NGS library preparation may be included, and these were removed. PCR duplicates are removed because sequencing requires an amplification step, and fragments that were erroneously over-amplified in that step must be eliminated. PCR duplicate reads were removed with picard-tools-1.119 (http://picard.sourceforge.net/), and single nucleotide variations (SNVs) and insertion-deletion variants (indels) were identified with GenomeAnalysisTK-3.8. The ANNOVAR program was used to annotate each identified nucleotide variant on the basis of UCSC hg19.
Example 7. Selection of causative gene variants of Tourette syndrome
The SNVs and indels extracted for each clinical sample were evaluated for variant characteristics, variant type, amino acid change, and SNP database (dbSNP) records. All variants other than SNVs and indels capable of altering the protein were removed from causative gene selection.
In addition, because Tourette syndrome is to be studied as a rare disease, variants with an overall frequency of 1% or higher among all identified SNVs and indels were defined as common variants and removed. Specifically, the overall frequency of each SNV and indel was checked against genomic data from dbSNP, the 1000 Genomes Project, and the NHLBI GO Exome Sequencing Project, and variants with an overall frequency of 1% or higher were removed. The variants remaining after removal of common variants were defined as candidate causative gene variants for Tourette syndrome. When variants were selected in two or more different genes, variants with a high degree of cross-species sequence conservation (conservation rate) were given priority.
Furthermore, to remove genes that could confound the selection of Tourette syndrome causative genes, genes related to ADHD, which co-occurs with Tourette syndrome, were removed to facilitate the selection (359 genes: AANAT, ADRA1A, ADRA1B, ADRA1D, ADRA2A, ADRA2B, ADRA2C, ADRB1, ADRB2, ADRB3, ADRBK1, ADRBK2, AGBL1, AK094352, AK8, ANK3, ANO5, ARRB1, ARRB2, ARSB, ARVCF, AS3MT, ASMT, ASTN2, ATP11A, ATP2B3, ATP2C2, ATXN1, ATXN2, BAIAP2, BCHE, BCL11A, BDNF, BMPR1B, BRE, C1orf173, CACNA1C, CACNB2, CADM2, CALY, CAMK1D, CCSER1, CDH13, CDH23, CDK20, CEP112, CHMP7, CHRNA3, CHRNA4, CHRNA7, CLASP2, CLOCK, CLYBL, CMTM8, CNR1, CNTF, CNTFR, CNTN4, CNTN5, COMT, COX7B2, CPLX1, CPLX2, CPLX4, CREB5, CRYGC, CSMD2, CSNK1E, CTNNA2, DACT1, DBH, DCDC2, DCLK1, DCLK2, DDC, DENND3, DGKH, DHCR7, DIRAS2, DISC1, DLEU2, DMRT2, DNAJA1P4, DNAJC27, DNM1, DOCK10, DPH6, DPP6, DRD1, DRD2, DRD3, DRD4, DRD5, DSCC1, DUSP1, DYX1C1, ELK3, ELOVL6, EMP2, EREG, ETV5, FADS1, FADS2, FAIM2, FANCL, FGF10, FGF12, FHIT, FLNC, FLRT2, FOXP1, FOXP2, FTO, FURIN, GABRG1, GDNF, GEMIN2, GFI1B, GFOD1, GIT1, GNAL, GNAO1, GNAT2, GNAZ, GNPDA2, GPC5, GPC6, GPR125, GPR50, GPRC5B, GPX6, GRID2, GRIK1, GRIK4, GRIN2A, GRIN2B, GRM1, GRM5, GRM7, GRM8, GSK3B, H2AFY, HAS3, HCN1, HES1, HES6, HK1, HKDC1, HLA-DRB1, HOXB1, HTR1A, HTR1B, HTR1D, HTR1E, HTR1F, HTR2A, HTR2C, HTR3A, HTR3B, HTR4, HTR5A, HTR6, HTR7, ID2, IL16, IL1RN, IL20RA, ISL1, ITGA1, ITGA11, ITGAE, ITIH3, KANK2, KANSL1, KCNC1, KCNIP4, KCTD15, KIAA0319, LARP7, LECT1, LHFPL3, LIN7C, LINC00478, LINGO2, LMAN2L, LMO4, LOC100128765, LOC100188947, LOC151121, LOC643308, LPHN3, LPL, LRP1B, MACROD2, MAD1L1, MAGI2, MAN2A2, MAOA, MAOB, MAP1B, MAP2K3, MAP2K5, MC4R, MCTP1, MED27, MEIS2, MIR96, MMP24, MMP7, MOBP, MOG, MTCH2, MTHFR, MTIF3, MTNR1A, MTNR1B, MYBPC1, MYO5B, MYT1L, NADSYN1, NAPRT1, NCAN, NCKAP5, NEGR1, NET1, NEUROD6, NFIL3, NGF, NLN, NOS1, NPAS3, NPPC, NPSR1, NR3C2, NR4A2, NRSN1, NRXN1, NRXN3, NT5C2, NT5DC3, NTF3, NTM, NTRK2, NUCB1, NUDT3, NXPH1, NYAP2, OPRM1, OR4C3, OXER1, OXTR, PARK2, PCDP1, PER1, PER2, PEX5L, PGRMC2, PHLDA1, PLCL1, PNMT, POC5, PPM1F, PPM1H, PPP1R1B, PPP2R2C, PRELID2, PRKAG2, PRKD1, PRKG1, PRTG, PSMC3, PTBP2, PTPRG, PTPRJ, PTPRN2, PYDC2, QPCTL, RASSF2, RBMS3, REEP5, RGS18, RPL23AP56, RPL27A, RPL31P43, SDK2, SEC16B, SGTB, SH2B1, SH3BP5, SLC18A2, SLC1A3, SLC38A1, SLC39A3, SLC39A8, SLC5A7, SLC6A1, SLC6A2, SLC6A3, SLC6A4, SLC7A10, SLC9A9, SLCO3A1, SLIT1, SNAP25, SNCA, SNPH, SORCS1, SORCS3, SPOCK3, SRGAP1, SSFA2, STS, STX1A, SUPT3H, SYN3, SYP, SYT1, SYT2, TAAR3, TAF2, TCEB1, TCERG1L, TDO2, TDP2, TENM4, TFAP2B, TFEB, TH, TIAM2, TLE1, TLE4, TLL2, TMEM132B, TMEM160, TMEM18, TNNI3K, TPH1, TPH2, TRANK1, TRIM32, TRIO, TSPAN8, UGT1A9, UNC5B, USP24, VAMP2, VEGFA, WDR96, XKR3, XKR4, XPO1, ZBBX, ZNF423, ZNF516, ZNF544, ZNF608, ZNF75A, ZNF804A, ZNF805).
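The selection rules of Example 7 can be sketched as a small filter. This is only an illustration under stated assumptions, not the inventors' implementation: the record fields (`gene`, `effect`, `freqs`, `conservation`), the ANNOVAR-style effect labels, and the example values are hypothetical; only the 1% frequency cutoff, the removal of ADHD-related genes, and the conservation-based prioritization come from the text.

```python
# Effect labels treated as protein-altering (assumed, ANNOVAR-style naming).
PROTEIN_ALTERING = {
    "nonsynonymous SNV", "stopgain", "stoploss", "splicing",
    "frameshift insertion", "frameshift deletion",
    "nonframeshift insertion", "nonframeshift deletion",
}

def select_candidates(variants, excluded_genes, max_freq=0.01):
    """Keep rare, protein-altering variants outside the excluded gene set."""
    kept = []
    for v in variants:
        if v["effect"] not in PROTEIN_ALTERING:
            continue
        # None means the variant is absent from that population database.
        freqs = [f for f in v["freqs"].values() if f is not None]
        if freqs and max(freqs) >= max_freq:
            continue  # common variant (frequency of 1% or higher)
        if v["gene"] in excluded_genes:
            continue  # e.g. ADHD-related genes
        kept.append(v)
    # Higher (hypothetical) conservation score means higher priority.
    return sorted(kept, key=lambda v: v["conservation"], reverse=True)

# Made-up example records, for illustration only.
variants = [
    {"gene": "COL27A1", "effect": "nonsynonymous SNV",
     "freqs": {"1000g": None, "esp": 0.002}, "conservation": 5.1},
    {"gene": "FN1", "effect": "synonymous SNV", "freqs": {}, "conservation": 4.0},
    {"gene": "ROBO3", "effect": "nonsynonymous SNV",
     "freqs": {"1000g": 0.03}, "conservation": 4.5},
    {"gene": "DRD4", "effect": "stopgain", "freqs": {}, "conservation": 3.0},
    {"gene": "TBCD", "effect": "splicing",
     "freqs": {"1000g": None}, "conservation": 5.8},
]
candidates = select_candidates(variants, excluded_genes={"DRD4", "SLC6A3"})
```

In this sketch a variant missing from every database is treated as rare, which matches the intent of a rare-disease filter but is a design choice rather than something the text specifies.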
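Example 8. In silico analysis of the Tourette gene panel
In silico analysis of Tourette syndrome gene panel sequencing, performed on clinical samples from 81 individuals in 34 Korean families with Tourette syndrome, identified causative genes in 30 families. No causative gene was identified in the remaining 4 families. To refine the analysis method, a procedure capable of analyzing CNVs was therefore added for families in which the SNV analysis of WES (whole exome sequencing) data from Korean Tourette syndrome patients had not identified a causative gene. That is, SNV analysis was first performed on known Tourette syndrome-related genes, and families in which no causative variant was found were then analyzed for CNVs selected through a literature search (PubMed), which raised the diagnostic yield of the Tourette syndrome diagnostic panel.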
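Example 9. Synthesis of primers for confirming gene variants
To confirm the Tourette syndrome causative gene variants identified by NGS, gene regions of 150 to 300 bp containing the variant site were selected, and PCR primers capable of amplifying them were designed and synthesized (Bioneer, Daejeon).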
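As a rough illustration of choosing a target region around a variant, the sketch below centers a window of a requested size on the variant position and enforces the 150 to 300 bp range stated above. The function name and the centering rule are assumptions made for illustration. Actual primer design also has to account for melting temperature, specificity, and primer placement, none of which is modeled here.

```python
def amplicon_window(variant_pos, size=200, min_size=150, max_size=300):
    """Return (start, end), 1-based inclusive, of a window containing the variant."""
    if not min_size <= size <= max_size:
        raise ValueError(f"amplicon size {size} outside {min_size}-{max_size} bp")
    start = variant_pos - size // 2   # roughly center the variant in the window
    end = start + size - 1            # inclusive end, so end - start + 1 == size
    return start, end

# Example: a 238 bp window around chr12:72335380.
start, end = amplicon_window(72335380, size=238)
```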
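Example 10. Preparation of gDNA samples
To establish gDNA (genomic DNA) PCR conditions using the primers synthesized in Example 9, PCR was performed on gDNA from WI38, a normal human lung cell line, and the PCR products were sequenced. Once PCR primers suitable for sequencing had been secured, gDNA was isolated from the blood cells of patient blood, PCR was performed, and the products were sequenced. gDNA was extracted from the cell line or from patient blood cells with a gDNA prep kit (NANOHELIX, Cat No. GCBL 200), as detailed below.
First, to isolate blood cells, 200 µL of whole blood was mixed with 1 mL of RBL (red blood cell lysis) solution in a 1.5 mL microtube and gently shaken 2 to 3 times while kept at 0°C. After about 10 minutes, white blood cells were collected by centrifugation at 12,000 rpm for 10 minutes. WI38 cells (1x10^6 to 1x10^6) were collected in a 1.5 mL microtube and centrifuged at 3,000 rpm for 5 minutes. Cell lysis solution was added to the microtubes containing the collected cells, and gDNA was then isolated, including the use of a nucleic acid isolation column, according to the protocol provided with the gDNA prep kit (PureHelix Genomic DNA prep kit, NANOHELIX Co., Daejeon). The isolated gDNA was quantified by measuring absorbance at 260 nm with a Nanodrop.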
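Example 11. PCR and sequencing of gDNA
PCR was performed with 100 ng of gDNA or 1 µL of patient gDNA as template. The PCR reaction contained DNA template (100 ng or 1 µL), synthetic primers (1 µL each of the forward and reverse primers), AccuPower® HotStart Pfu PCR premix (Bioneer, Daejeon), and distilled water in a total volume of 20 µL. PCR consisted of an initial denaturation (95°C, 15 min, 1 cycle), followed by 30 cycles of denaturation (95°C, 30 s), annealing (55°C, 30 s), and extension (72°C, 1 min), and a final extension (72°C, 5 min). Of the final PCR reaction, 2 µL was analyzed on a 1% agarose gel to confirm a PCR product of the expected size. The remainder was separated on an agarose gel, and the PCR product was purified by gel extraction. Gel extraction followed the protocol provided with the kit (PureHelix Gel extraction kit, NANOHELIX Co., Daejeon).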
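The cycling program above implies a minimum run time that is easy to check with a few lines of arithmetic. The helper below is a sketch whose function name is made up. It ignores temperature ramping, which depends on the thermocycler.

```python
def pcr_run_seconds(initial=15 * 60, cycles=30, steps=(30, 30, 60), final=5 * 60):
    """Programmed hold time in seconds, excluding temperature ramps.

    Defaults follow the protocol in the text: 15 min initial denaturation,
    30 cycles of 30 s / 30 s / 60 s, and a 5 min final extension.
    """
    return initial + cycles * sum(steps) + final

total_minutes = pcr_run_seconds() / 60   # 900 s + 30 * 120 s + 300 s = 4800 s
```

Under these assumptions the programmed holds add up to 80 minutes.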
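Of the PCR product purified in a total volume of 30 µL, 2 µL was run on a 1% agarose gel for reconfirmation, and 10 µL of the remainder was used for sequencing. The PCR primers were used as sequencing primers, and sequencing was performed by Solgent (Daejeon). The PCR primer sequences are shown in Tables 1 and 2 below, and the validation results for the Tourette syndrome gene panel are shown in Table 3 below.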
Family | Chr | Chromosomal localization | Gene Name | RefSeq | Exon | reference | mutation | LEFT (SEQ ID NO) | RIGHT (SEQ ID NO) | PRODUCT SIZE |
Family_1 | chr12 | 72335380 | TPH2 | | | C | A | GAGTGACACGGCAACTTCAC (SEQ ID NO: 65) | CAACTGCTGTCTTGCCACTT (SEQ ID NO: 66) | 238 |
Family_1 | chr3 | 88040035 | HTR1F | | | G | A | TGGTGTCCCTCACTCTGTCT (SEQ ID NO: 67) | GCCAGTGGGATGTAGAAAGCT (SEQ ID NO: 68) | 511 |
Family_2 | chr9 | 117070009 | COL27A1 | | | A | G | TCACAAGATGCAGGGTCCAT (SEQ ID NO: 69) | CTGGGGATAGAGGCAGACAG (SEQ ID NO: 70) | 248 |
Family_3 | chr12 | 72335380 | TPH2 | | | C | A | GAGTGACACGGCAACTTCAC (SEQ ID NO: 71) | CAACTGCTGTCTTGCCACTT (SEQ ID NO: 72) | 238 |
Family_3 | chr9 | 116930998 | COL27A1 | | | C | T | CAGCCACCAAAATCCCCAAA (SEQ ID NO: 73) | AACTGGACGGGAAGTAGGTG (SEQ ID NO: 74) | 164 |
Family_3 | chr7 | 153749970 | DPP6 | | | C | T | AGTGGGAACCGGAGAGA (SEQ ID NO: 75) | GGAACGTAAGGCGAATTCC (SEQ ID NO: 76) | 596 |
Family_3 | chr6 | 38142846 | BTBD9 | | | G | C | CGCTGCCTCCTTTATTGGTG (SEQ ID NO: 77) | CTTTGAGTGTCCAGAGCAGC (SEQ ID NO: 78) | 192 |
Family_5 | chr2 | 113890284 | IL1RN | | | G | A | AACATCACTGACCTGAGCGA (SEQ ID NO: 79) | GGCAGTACTACTCGTCCTCC (SEQ ID NO: 80) | 217 |
Family_5 | chr6 | 38142846 | BTBD9 | | | G | C | CGCTGCCTCCTTTATTGGTG (SEQ ID NO: 81) | CTTTGAGTGTCCAGAGCAGC (SEQ ID NO: 82) | 192 |
Family_6 | chr7 | 94218004 | SGCE | | | T | C | GACACAAGTGTTTTGCCTT (SEQ ID NO: 83) | GGGGTCATAGTTTACCCG (SEQ ID NO: 84) | 267 |
Family_6 | chrX | 153296684 | MECP2 | | | G | A | AGGCATCTTGACAAGGAGCT (SEQ ID NO: 85) | TTCACGGTAACTGGGAGAGG (SEQ ID NO: 86) | 207 |
Family_11 | chr5 | 52240850 | ITGA1 | | | C | A | TCGGAGTGAAAATGCATCTCTG (SEQ ID NO: 87) | TCTGTCACTTACCGAGAGCA (SEQ ID NO: 88) | 173 |
Family_11 | chr3 | 113890728 | DRD3 | | | C | T | TGGATGAGGGACAGGATGGT (SEQ ID NO: 89) | ACCAAGCCCCAAAGAGTCTG (SEQ ID NO: 90) | 546 |
Family_12 | chr7 | 153750096 | DPP6 | | | G | A | AGTGGGAACCGGAGAGA (SEQ ID NO: 91) | GGAACGTAAGGCGAATTCC (SEQ ID NO: 92) | 596 |
Family_12 | chr7 | 94259133 | SGCE | | | C | T | CAGGTTTTGGGTAAGGTGGA (SEQ ID NO: 93) | GACCCCTCTTTATAAACAGCGT (SEQ ID NO: 94) | 214 |
Family_13 | chr9 | 116994117 | COL27A1 | | | G | T | CTTCTGTGGCCTAGAGTCCC (SEQ ID NO: 95) | CACAGATTTAGGGGAGGCCA (SEQ ID NO: 96) | 236 |
Family_13 | chr9 | 116931124 | COL27A1 | | | C | T | AAAGTCAGCCCTACCCACTC (SEQ ID NO: 97) | CATGGCTGGTTATCTTGGCC (SEQ ID NO: 98) | 212 |
Family_15 | chr1 | 216166484 | USH2A-1 | | | A | T | AAAGTCAGCCCTACCCACTC (SEQ ID NO: 99) | CATGGCTGGTTATCTTGGCC (SEQ ID NO: 100) | 212 |
Family_15 | chr1 | 216144109 | USH2A-2 | | | G | A | GCTTGAAAGGCTAGCTGTGC (SEQ ID NO: 101) | TCATGCTGGAACTGTTGGGT (SEQ ID NO: 102) | 505 |
Family_15 | chr12 | 88512305 | CEP290 | | | T | - | GCAGATCCACAATAGAACA (SEQ ID NO: 103) | CACTTAAAACAGCAGCAG (SEQ ID NO: 104) | 319 |
Family_15 | chr22 | 42523636 | CYP2D6 | | | C | A | CATCTGGGAAACAGTGCA (SEQ ID NO: 105) | ATGTCACGGGATGTCATA (SEQ ID NO: 106) | 360 |
Family_16 | chr4 | 9783901 | DRD5 | | | T | C | GGGGCAGTTCGCTCTATACC (SEQ ID NO: 107) | CGGTCCACGCTGATGACGC (SEQ ID NO: 108) | 378 |
Family_16 | chr16 | 55734106 | SLC6A2 | | | T | C | TTCTCTCCCTTCTCTGCCCA (SEQ ID NO: 109) | GACATCACAGTGAGCTGGGT (SEQ ID NO: 110) | 536 |
Family_19 | chr22 | 18905859 | PRODH | | | G | A | CATGACATAAAAGCTGAGG (SEQ ID NO: 111) | CCACAGGATGCCTATGA (SEQ ID NO: 112) | 322 |
Family_19 | chr7 | 150747934 | ASIC3-1 | | | CCCCAG | - | CATCATCGATCAGCTGGGCT (SEQ ID NO: 113) | GGGTGGGCACAGTTCTTGTA (SEQ ID NO: 114) | 549 |
Family_19 | chr7 | 150746097 | ASIC3-2 | | | G | A | TAGCCCCCTGACTGACTCTC (SEQ ID NO: 115) | AGTCCAGCAGCATGTCATCC (SEQ ID NO: 116) | 560 |
Family_21 | chr8 | 141231575 | TRAPPC9-1 | | | C | T | AGCTTCACTGTGACGGCTTT (SEQ ID NO: 117) | AAAACAAAACCAGCCTGGGC (SEQ ID NO: 118) | 579 |
Family_21 | chr8 | 141468504 | TRAPPC9-2 | | | T | C | GAAGGAGGCCCAGTTCTGTC (SEQ ID NO: 119) | AGTCTGTAAGCCTCCCCCAT (SEQ ID NO: 120) | 518 |
Family_21 | chr11 | 113860274 | HTR3A | | | G | A | ACCATGTTCAGGTCACCACC (SEQ ID NO: 121) | AGGGTTCAGACCTTGGCTTG (SEQ ID NO: 122) | 539 |
Family_23 | chr7 | 153750096 | DPP6 | | | G | A | AGTGGGAACCGGAGAGA (SEQ ID NO: 123) | GGAACGTAAGGCGAATTCC (SEQ ID NO: 124) | 596 |
Family | Chr | Start | Gene Name | RefSeq | Exon | mutation | LEFT (SEQ ID NO) | RIGHT (SEQ ID NO) | PRODUCT SIZE |
Family_1 | chr11 | 60704123 | ARHGAP32 | NM_014715 | exon13 | c.C4390T p.R1464C | CTGACCAGGAGGAACTGAGC (SEQ ID NO: 125) | GGCGCAAATGTCACAAACT (SEQ ID NO: 126) | 211 |
Family_1 | chr11 | 60704125 | ARHGAP32 | NM_014715 | exon13 | c.G4391A p.R1464H | CTGACCAGGAGGAACTGAGC (SEQ ID NO: 127) | GGCGCAAATGTCACAAACT (SEQ ID NO: 128) | 211 |
Family_1 | chr12 | 124323060 | DNAH10 | NM_207437 | exon28 | c.C4606T p.R1536C | GAACAGTGTCTCCGCTCTCC (SEQ ID NO: 129) | TTGAGGCTTTTCTGGCATTT (SEQ ID NO: 130) | 177 |
Family_1 | chr12 | 124315194 | DNAH10 | NM_207437 | exon25 | c.A4139T p.D1380V | AAATGACCGAAACGTTCACC (SEQ ID NO: 131) | CATACCACCACGCTCAGCTA (SEQ ID NO: 132) | 230 |
Family_3 | chr2 | 216239973 | FN1 | NM_212474 | exon36 | c.C5578T p.R1860W | AGCATGGAAGCAGCAATACC (SEQ ID NO: 133) | ATTGATGCACCATCCAACCT (SEQ ID NO: 134) | 197 |
Family_3 | chr9 | 116930998 | COL27A1 | NM_032888 | exon3 | c.C1163T p.T388I | CGCTCAACCATCACAGAAGA (SEQ ID NO: 135) | GAGACTCTGGCAGGAACTGG (SEQ ID NO: 136) | 201 |
Family_2 | chr17 | 80755631 | TBCD | NM_005993 | exon8 | c.772-2A>T | TTTTCAGATGAATTTTTGGGAGA (SEQ ID NO: 137) | GGGCAAACAGTCTTCACGTT (SEQ ID NO: 138) | 247 |
Family_6 | chr19 | 40900339 | PRX | NM_181882 | exon7 | c.G3920C p.R1307P | TTCCCCAGTGACCATCTCA (SEQ ID NO: 139) | GCGTACCTTCTGCCTCTCAC (SEQ ID NO: 140) | 235 |
Family_6 | chr19 | 40901579 | PRX | NM_181882 | exon7 | c.G2680A p.V894M | GAACTTGGAAGAGGGCTTGA (SEQ ID NO: 141) | TAGACCTGCCAGGAGCACTT (SEQ ID NO: 142) | 218 |
Family_11 | chr3 | 113890728 | DRD3 | NM_000796 | exon2 | c.G112A p.A38T | TATACCACCCAGGGCATCAC (SEQ ID NO: 143) | ACTACACCTGTGGGGCAGAG (SEQ ID NO: 144) | 229 |
Family_12 | chr19 | 8161788 | FBN3 | NM_032447 | exon42 | c.T5390C p.L1797P | GACCTGGACAGAGCCATACC (SEQ ID NO: 145) | CCCAGATGTCGATGAGTGTG (SEQ ID NO: 146) | 215 |
Family_12 | chr19 | 8151993 | FBN3 | NM_032447 | exon53 | c.G6722A p.R2241Q | AGTTTCCTGCACCCATGAAG (SEQ ID NO: 147) | AGTGTGCAGATGGTCAGCAG (SEQ ID NO: 148) | 162 |
Family_13 | chr9 | 116931124 | COL27A1 | NM_032888 | exon3 | c.C1289T p.P430L | CAGTTCCTGCCAGAGTCTCC (SEQ ID NO: 149) | CTGGCATGGCTGGTTATCTT (SEQ ID NO: 150) | 169 |
Family_13 | chr9 | 116994117 | COL27A1 | NM_032888 | exon16 | c.G2536T p.V846L | CATTTGCCCCCTTTTACAGA (SEQ ID NO: 151) | GCAGAGAAACCACAGTGCAA (SEQ ID NO: 152) | 226 |
Family_15 | chr9 | 141014669 | CACNA1B | NM_000718 | exon44 | c.G6083A p.R2028Q | TGACTGTGAGACCAGGATGG (SEQ ID NO: 153) | TGGTGCTGCAAAGATGAGTC (SEQ ID NO: 154) | 210 |
Family_15 | chr9 | 140772393 | CACNA1B | NM_000718 | exon1 | c.G8T p.R3L | ACGTGACCGGCCCCTTAT (SEQ ID NO: 155) | CGATCGATTGCTTGTAGAGGA (SEQ ID NO: 156) | 344 |
Family_16 | chr10 | 76781852 | KAT6B | NM_001256468 | exon16 | c.2686_2697del p.896_899del | CAGTAGGCAATCACCTGCAA (SEQ ID NO: 157) | TTGGGGGAGAGCTTTGAATA (SEQ ID NO: 158) | 242 |
Family_16 | chr21 | 47422538 | COL6A1 | NM_001848 | exon33 | c.G2348A p.R783Q | CTTGTCCCCAGAAAGACGAG (SEQ ID NO: 159) | GCGGTGACATTCTTCAGGA (SEQ ID NO: 160) | 233 |
Family_17 | chr11 | 124744033 | ROBO3 | NM_022370 | exon12 | c.G1852A p.G618S | GGAGTAGGCAGGTTGGGAGT (SEQ ID NO: 161) | CACTGCTCGAACCAGAAACA (SEQ ID NO: 162) | 172 |
Family_19 | chr22 | 18905859 | PRODH | NM_001195226 | exon11 | c.C1073T p.T358M | CTGCCCTGAGAAGACAGAGG (SEQ ID NO: 163) | CCACAGGATGCCTATGACAA (SEQ ID NO: 164) | 220 |
Family_20 | chr10 | 68040262 | CTNNA3 | NM_001127384 | exon13 | c.T1850C p.I617T | AGGCATTCCAGATGGTGAAG (SEQ ID NO: 165) | CAAGTGAATGTTGCCTTGGA (SEQ ID NO: 166) | 191 |
Family_21 | chr4 | 955317 | DGKQ | NM_001347 | exon21 | c.G2512C p.E838Q | GCTCACCATGTGCACGAC (SEQ ID NO: 167) | CTTCATCAACATCCCCAGGT (SEQ ID NO: 168) | 245 |
Family_21 | chr4 | 961785 | DGKQ | NM_001347 | exon6 | c.C694G p.P232A | AAGCTCTGCGTCTTGCTGA (SEQ ID NO: 169) | GTGGGGTCTTTCCCTGGAC (SEQ ID NO: 170) | 206 |
Family_22 | chr11 | 124764205 | ROBO4 | NM_001301088 | exon8 | c.A775G p.T259A | GCACTGCCCTCACCTAAAAG (SEQ ID NO: 171) | GCTGTTCACCTCTGCTTGTG (SEQ ID NO: 172) | 208 |
Family_23 | chr6 | 90428873 | MDN1 | NM_014611 | exon41 | c.C6039G p.I2013M | CCACGGGAAAGGACTGAGTA (SEQ ID NO: 173) | ACCCATACATGGGAACCAGA (SEQ ID NO: 174) | 181 |
Family_23 | chr6 | 90382295 | MDN1 | NM_014611 | exon81 | c.C13601G p.T4534S | TGCCTGATTTCAGACATACCA (SEQ ID NO: 175) | GTTGGACGAAGGATTTGTGG (SEQ ID NO: 176) | 165 |
Family_24 | chr21 | 47832788 | PCNT | NM_006031 | exon29 | c.C6032T p.A2011V | GTACTGGTTCCCAGCTCCAG (SEQ ID NO: 177) | AGGCGCATTTCATTTTTCAC (SEQ ID NO: 178) | 222 |
Family_24 | chr21 | 47847674 | PCNT | NM_006031 | exon34 | c.C7459G p.L2487V | TTCTGCAGGTTGTGCAAGAG (SEQ ID NO: 179) | GCAGAGCTGACACTCACCTG (SEQ ID NO: 180) | 154 |
Family_26 | chr12 | 96076512 | NTN4 | NM_021229 | exon7 | c.C1481T p.A494V | TCCCCTCATAGGATCCAAAA (SEQ ID NO: 181) | TGCACAATAAGAGCGAACCA (SEQ ID NO: 182) | 168 |
Fam_1 | chr21 | 35897642 | RCAN1 | NM_001285391 | exon1 | c.T71G p.L24R | CGTTAAGGAGCAGTCGGAAC (SEQ ID NO: 183) | TCAAGAGAGGTGGGGAAAAA (SEQ ID NO: 184) | 188 |
Fam_2 | chr10 | 76788690 | KAT6B | NM_001256468 | exon18 | c.3559_3561del p.1187_1187del | ACATGTGCCCCTGTAAGTCC (SEQ ID NO: 185) | TTTTCCGTGGAGATTTCTGG (SEQ ID NO: 186) | 212 |
Fam_4 | chr3 | 99509813 | COL8A1 | NM_020351 | exon3 | c.C287T p.A96V | AGATGCCCCACTTGCAGTAT (SEQ ID NO: 187) | TCCCCCTCTGATCCCATAAT (SEQ ID NO: 188) | 183 |
Fam_5 | chr2 | 216296589 | FN1 | NM_001306129 | exon4 | c.A514G p.N172D | CTAAGCATCCCAGCTCTTGC (SEQ ID NO: 189) | CATGAAGGGGGTCAGTCCTA (SEQ ID NO: 190) | 165 |
Fam_6 | chr11 | 124763789 | ROBO4 | NM_001301088 | exon9 | c.C1036T p.R346C | CCTGGTCAGAGATCCAAAGC (SEQ ID NO: 191) | CAGCTGAGGGCTACCTTGAA (SEQ ID NO: 192) | 213 |
Fam_7 | chr11 | 124765478 | ROBO4 | NM_001301088 | exon6 | c.G476T p.G159V | GCCAGAGGATGGTCTCACTT (SEQ ID NO: 193) | CGTTCCTGAGCTCTCTGACC (SEQ ID NO: 194) | 226 |
Fam_7 | chr11 | 124742934 | ROBO3 | NM_022370 | exon9 | c.C1485A p.D495E | GAGTGACTGGGAACCCTCAA (SEQ ID NO: 195) | GGCTACAGGCCCAGTGAGTA (SEQ ID NO: 196) | 208 |
Fam_8 | chr16 | 55690691 | SLC6A2 | NM_001043 | exon1 | c.A85G p.K29E | GACCGGTAAAGTTCCTCTCG (SEQ ID NO: 197) | ATCTTCTTGCCCCAGGTCTC (SEQ ID NO: 198) | 220 |
Fam_8 | chr11 | 124761429 | ROBO4 | NM_001301088 | exon12 | c.G1279A p.V427M | GAGGCTGTCTGAGCTGGAAC (SEQ ID NO: 199) | GATCTCAGGGATGGAAAGCA (SEQ ID NO: 200) | 249 |
Fam_9 | chr11 | 2186957 | TH | NM_000360 | exon11 | c.C1141T p.Q381X | GAGGACTGGGCAGAGACAAG (SEQ ID NO: 201) | ACTGGTTCACGGTGGAGTTC (SEQ ID NO: 202) | 244 |
Fam_10 | chr3 | 45814094 | SLC6A20 | NM_020208 | exon5 | c.C596T p.T199M | GCCCCTGATGAGGTAGATGA (SEQ ID NO: 203) | GAATCTCCATGCCTTTTCCA (SEQ ID NO: 204) | 199 |
Fam_12 | chr2 | 216244028 | FN1 | NM_001306131 | exon32 | c.C4904A p.P1635Q | TTCATTGGTCCGGTCTTCTC (SEQ ID NO: 205) | TTTTCCTTTTCCCCCATTTC (SEQ ID NO: 206) | 193 |
Fam_13 | chr11 | 124739454 | ROBO3 | NM_022370 | exon3 | c.C596G p.S199C | CCAGTCCTCCGTGATGATTT (SEQ ID NO: 207) | CCTATGTCCCCTCCCTTGTT (SEQ ID NO: 208) | 211 |
Sample | Chr | Start | Gene | Ref | Alt | Left Primer |
Family-1 | chr12 | 72335380 | TPH2 | C | A | |
Family-1 | chr3 | 88040035 | HTR1F | G | A | |
Family-2 | chr9 | 117070009 | COL27A1 | A | G | |
Family-3 | chr12 | 72335380 | TPH2 | C | A | |
Family-3 | chr9 | 116930998 | COL27A1 | C | T | |
Family-3 | chr6 | 38142846 | BTBD9 | G | C | |
Family-5 | chr2 | 113890284 | IL1RN | G | A | |
Family-5 | chr6 | 38142846 | BTBD9 | G | C | |
Family-6 | chr7 | 94218004 | SGCE | T | C | |
Family-6 | chrX | 153296684 | MECP2 | G | A | |
Family-11 | chr5 | 52240850 | ITGA1 | C | A | |
Family-11 | chr3 | 113890728 | DRD3 | C | T | |
Family-12 | chr7 | 94259133 | SGCE | C | T | |
Family-13 | chr9 | 116994117 | COL27A1 | G | T | |
Family-13 | chr9 | 116931124 | COL27A1 | C | T | |
Family-15 | chr1 | 216166484 | USH2A | A | T | |
Family-15 | chr1 | 216144109 | USH2A | G | A | |
Family-15 | chr12 | 88512305 | CEP290 | T | * | |
Family-16 | chr4 | 42523636 | DRD5 | T | C | |
Family-16 | chr16 | 55734106 | SLC6A2 | T | C | |
Family-19 | chr7 | 150747934 | ASIC3-1 | CCCCAG | * | |
Family-19 | chr7 | 150746097 | ASIC3-2 | G | A | |
Family-21 | chr8 | 141231575 | TRAPPC9 | C | T | |
Family-21 | chr11 | 113860274 | HTR3A | G | A | |
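Example 12. CNV gene analysis results of the Tourette syndrome gene panel
Families in which no causative variant was found among the Tourette syndrome-related genes were analyzed for CNVs selected through a literature search (PubMed). In the in silico analysis of the Tourette syndrome gene panel, no causative variant was identified in 4 families out of all the clinical samples. CNV analysis of the 3 of these families that consisted of trio samples identified previously reported Tourette syndrome CNVs in 2 families. Taking the panel results together, causative genes were identified by SNV in 30 of the 34 families (88%). In addition, in the CNV analysis of the 3 trio families among the remaining 4, duplications and deletions of previously known CNV genes were observed in 2 families. The diagnostic yield of the Tourette syndrome gene panel was thus confirmed to be about 94%. These results are shown in Table 4 below.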
Index | Sample name | chr | CNV type | Start (bp) | End (bp) | Length (kb) | Raw coverage | copy_no | Gene |
Family 5 | TN1512D0471 | chr1 | del | 17081128 | 17090975 | 9.848 | 3822 | 1.468 | MST1L |
Family 5 | TN1512D0471 | chr1 | dup | 89476586 | 89477710 | 1.125 | 334 | 3.064 | GBP3 |
Family 5 | TN1512D0471 | chr1 | dup | 1.97E+08 | 1.97E+08 | 48.451 | 593 | 2.771 | CFHR3, CFHR1 |
Family 5 | TN1512D0471 | chr1 | del | 2.49E+08 | 2.49E+08 | 21.511 | 1146 | 1.541 | OR2T2, OR2T3 |
Family 5 | TN1512D0471 | chr2 | del | 2.42E+08 | 2.42E+08 | 0.678 | 200 | 1.258 | AQP12A |
Family 5 | TN1512D0471 | chr3 | dup | 1.96E+08 | 1.96E+08 | 6.497 | 16770 | 2.805 | MUC4 |
Family 5 | TN1512D0471 | chr4 | del | 9245604 | 9251948 | 6.345 | 212 | 0.696 | USP17L17, USP17L18 |
Family 5 | TN1512D0471 | chr4 | del | 69337177 | 69434245 | 97.069 | 424 | 1.42 | TMPRSS11E, UGT2B17 |
Family 5 | TN1512D0471 | chr5 | del | 32101210 | 32174425 | 73.216 | 674 | 1.516 | PDZD2, GOLPH3 |
Family 5 | TN1512D0471 | chr5 | del | 1.3E+08 | 1.4E+08 | 3352.853 | 18571 | 1.904 | KLHL3 |
Family 3 | TN1711D1003 | chr10 | del | 6.5E+07 | 7E+07 | 5386.132 | 8532 | 1.853 | CTNNA3 |
Family 3 | TN1711D1003 | chr14 | del | 3.8E+07 | 4.8E+07 | 9596.145 | 13042 | 1.886 | FSCB |
Family 3 | TN1711D1003 | chr15 | dup | 45398322 | 45440195 | 41.874 | 2247 | 2.303 | DUOXA1 |
Family 3 | TN1711D1003 | chr17 | dup | 7121559 | 7128586 | 7.028 | 1379 | 2.436 | DLG4, ACADVL |
Family 3 | TN1711D1003 | chr17 | dup | 15468795 | 15496809 | 28.015 | 475 | 2.987 | CDRT1 |
Family 3 | TN1711D1003 | chr19 | dup | 580645 | 581591 | 0.947 | 227 | 3.22 | BSG |
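The figures in Example 12 and Table 4 can be cross-checked with a short sketch. The helper names are made up. The rule that a copy number below 2 is a deletion and above 2 is a duplication is an assumption that merely agrees with the rows of Table 4; it is not a threshold stated in the text. Only rows with fully written coordinates are used, because the values shown in scientific notation have lost precision.

```python
def cnv_length_kb(start, end):
    """Length in kb of an inclusive interval, as tabulated in Table 4."""
    return (end - start + 1) / 1000

def cnv_type(copy_no, diploid=2.0):
    """Classify relative to the diploid baseline (assumed rule, see lead-in)."""
    if copy_no < diploid:
        return "del"
    if copy_no > diploid:
        return "dup"
    return "neutral"

def yield_percent(solved, total):
    """Share of families with an identified cause, rounded to a whole percent."""
    return round(100 * solved / total)

rows = [  # (start, end, length_kb, copy_no, type) copied from Table 4
    (17081128, 17090975, 9.848, 1.468, "del"),
    (89476586, 89477710, 1.125, 3.064, "dup"),
    (45398322, 45440195, 41.874, 2.303, "dup"),
]
consistent = all(
    cnv_length_kb(s, e) == length and cnv_type(cn) == kind
    for s, e, length, cn, kind in rows
)
snv_yield = yield_percent(30, 34)        # SNV analysis alone
panel_yield = yield_percent(30 + 2, 34)  # after adding the 2 CNV-positive families
```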
II. Method for screening candidate genes for Tourette syndrome
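Example 13. Development of an in silico analysis pipeline for the Tourette syndrome gene panel
The program used in this example provides a method for finding candidate causative genes for familial and non-familial Tourette syndrome. Specifically, it quantifies the nucleotide sequence information of Tourette syndrome patients and of their parents or siblings, analyzes the resulting patterns, and provides a list of candidate causative genes for Tourette syndrome. The method by which the program provides this list is described in Examples 13.1 to 13.7 below.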
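Example 13.1. Preprocessing of exome sequence data
First, the exome sequence data are preprocessed to secure accurate data for selecting causative genes in Tourette syndrome patients. To obtain accurate variant information from the short reads produced by the sequencer, low-quality reads were removed using the Sickle program (https://github.com/najoshi/sickle). The program calculates the mean Q score of each read and, if this value is below the user-selected quality threshold (Q score < 20), trims the read one base at a time starting from the end. The Q score is recalculated for the remaining bases, and this is repeated until the mean reaches the threshold quality or higher. Trimming can start from the 3' end or the 5' end, as selected by the user. If the final read is shorter than the user-selected length threshold (50 bp), the read is discarded; otherwise, filtering of that read is complete.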
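The trimming rule described above can be sketched as follows. This follows the description in the text (mean Q score of the whole read, one base removed at a time) and is not a reimplementation of Sickle itself. The function name and signature are hypothetical.

```python
def trim_read(seq, quals, min_q=20, min_len=50, from_3prime=True):
    """Trim one base at a time until the mean quality reaches min_q.

    Returns (seq, quals) for a surviving read, or None if the trimmed
    read is shorter than min_len and is therefore discarded.
    """
    quals = list(quals)
    while quals and sum(quals) / len(quals) < min_q:
        if from_3prime:
            seq, quals = seq[:-1], quals[:-1]   # drop the last base
        else:
            seq, quals = seq[1:], quals[1:]     # drop the first base
    if len(quals) < min_len:
        return None
    return seq, quals

# A 120 bp read whose second half is low quality (made-up values).
result = trim_read("A" * 120, [30] * 60 + [2] * 60)
```

In this example the read is shortened until the mean of the remaining bases is at least 20, which leaves 93 bases; a uniformly poor read would be trimmed to nothing and discarded.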
Example 13.2. Mapping to the human reference sequence
Next, the reads are mapped to the human reference sequence. To efficiently map the short reads, which for each sample amount to about 100-fold or more coverage of the whole genome, to the human reference sequence (NCBI build GRCh37, UCSC build hg19), Burrows-Wheeler Aligner (BWA) version 0.1.17 was used with the following commands and options:
bwa aln -I -t 3 -l 45 -k 2 ref.fa sample1_1.fq.gz > sample1_1.sai
bwa aln -I -t 3 -l 45 -k 2 ref.fa sample1_2.fq.gz > sample1_2.sai
bwa sampe -r '@RG\tID:TGP2010D0009\tSM:TGP2010D0009\tPL:Illumina' ref.fa sample1_1.sai sample1_2.sai sample1_1.fq.gz sample1_2.fq.gz > sample1.sam
Next, PCR duplicates are removed. For accurate variant discovery, PCR duplicate reads were removed with Samtools version 0.1.18, using the following command and options:
samtools rmdup sample1.sorted.bam sample1.sorted.rmdup.bam
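The idea behind duplicate removal can be illustrated with a simplified sketch: reads that map to the same position and strand are treated as copies of one original fragment, and only the copy with the highest mapping quality is kept. The record type and field names are hypothetical, and the sketch handles single reads only. Real tools work on paired-end alignments in BAM files and also use the mate's coordinates.

```python
from collections import namedtuple

# Hypothetical minimal alignment record.
Read = namedtuple("Read", "name chrom pos strand mapq")

def remove_duplicates(reads):
    """Keep one read per (chrom, pos, strand), preferring higher mapq."""
    best = {}
    for read in reads:
        key = (read.chrom, read.pos, read.strand)
        if key not in best or read.mapq > best[key].mapq:
            best[key] = read
    return list(best.values())

reads = [
    Read("r1", "chr9", 116930950, "+", 37),
    Read("r2", "chr9", 116930950, "+", 60),   # duplicate of r1 with higher mapq
    Read("r3", "chr9", 116930950, "-", 60),   # opposite strand, not a duplicate
    Read("r4", "chr9", 116931000, "+", 60),
]
unique = remove_duplicates(reads)
```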
Example 13.3. Local realignment
Next is the local realignment step, in which local realignment was performed on the BAM file generated in the previous step using GATK Lite version 2.3.9. This step is designed to minimize the number of mismatching bases across all reads by locally realigning them, and consists of two sub-steps: determining the suspicious short intervals that appear to need realignment (RealignerTargetCreator), and performing local realignment over those intervals (IndelRealigner). Here, local realignment was carried out on reads that were misaligned because of indels.
Example 13.4. Base recalibration
Next is the base recalibration step. Using the BaseRecalibrator option provided by GATK Lite 2.3.9, a recalibration table was built from various user-specified covariates related to the properties of the reads, and the base quality scores were recalibrated accordingly. The covariates that the user can specify include the read group, the originally reported quality score, the machine cycle, and the nucleotide context. The command and options used in this step were as follows:
java -jar GenomeAnalysisTKLite.jar -T BaseRecalibrator -R ref.fa -I sample1.sorted.rmdup.realign.bam -o out.grp --plot_pdf_file out.grp.pdf --knownSites dbsnp_137.hg19.vcf --disable_indel_quals
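The principle of a recalibration table can be sketched in a few lines: bases are binned by covariates, mismatches against the reference are counted at sites not listed as known variants, and an empirical Phred score is derived per bin. This is a conceptual illustration only. The two covariates used, the input format, and the +1/+2 smoothing are assumptions and do not reproduce GATK's actual model.

```python
import math
from collections import defaultdict

def recalibration_table(observations):
    """Map (reported_q, cycle) to an empirical Phred-scaled quality.

    observations: iterable of (reported_q, cycle, is_mismatch) tuples taken
    from bases at positions that are not known variant sites.
    """
    counts = defaultdict(lambda: [0, 0])   # bin -> [n_bases, n_mismatches]
    for reported_q, cycle, is_mismatch in observations:
        entry = counts[(reported_q, cycle)]
        entry[0] += 1
        entry[1] += int(is_mismatch)
    # Smoothed error rate (assumed +1/+2 pseudo-counts) converted to Phred scale.
    return {
        key: -10 * math.log10((mismatches + 1) / (n + 2))
        for key, (n, mismatches) in counts.items()
    }

# 998 bases reported as Q30 at cycle 5, of which 9 mismatch the reference.
obs = [(30, 5, True)] * 9 + [(30, 5, False)] * 989
table = recalibration_table(obs)
```

Here the instrument reported Q30, but the observed error rate of about 1 in 100 corresponds to roughly Q20, so bases in this bin would be adjusted downward.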
Example 13.5. Variant discovery
Next is the variant discovery step. Variants were called with the UnifiedGenotyper option provided by GATK Lite 2.3.9. This option uses a Bayesian genotype likelihood model to predict the most probable variant genotype and its position.
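A minimal diploid version of a Bayesian genotype likelihood model is sketched below to show the idea. It is a textbook-style simplification, not UnifiedGenotyper's implementation: it considers one biallelic site, assumes independent reads, spreads sequencing errors evenly over the three other bases, and uses a heterozygosity prior of 0.001 as an assumed value.

```python
import math

def _p_base(base, allele, err):
    """Probability of observing `base` if the read came from `allele`."""
    return 1 - err if base == allele else err / 3

def call_genotype(bases, quals, ref, alt, heterozygosity=1e-3):
    """Return the most probable genotype among '0/0', '0/1', and '1/1'.

    Assumes every quality is greater than 0 so that no likelihood is zero.
    """
    genotypes = {"0/0": (ref, ref), "0/1": (ref, alt), "1/1": (alt, alt)}
    priors = {
        "0/0": 1 - 1.5 * heterozygosity,
        "0/1": heterozygosity,
        "1/1": heterozygosity / 2,
    }
    log_post = {}
    for gt, (a1, a2) in genotypes.items():
        total = math.log(priors[gt])
        for base, q in zip(bases, quals):
            err = 10 ** (-q / 10)   # Phred score to error probability
            # Each read samples one of the two chromosomes with equal chance.
            total += math.log(0.5 * _p_base(base, a1, err)
                              + 0.5 * _p_base(base, a2, err))
        log_post[gt] = total
    return max(log_post, key=log_post.get)

het_call = call_genotype("C" * 10 + "T" * 10, [30] * 20, ref="C", alt="T")
```

Working in log space avoids numerical underflow when many reads cover the site.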
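Example 13.6. Variant filtering
Next is the variant filtering step, which was performed with the VariantFiltration option provided by GATK Lite 2.3.9. This option evaluates the variant annotation values against user-specified filters and records whether each variant fails a criterion in the 'FILTER' column of a new file, as PASS, HARD_TO_VALIDATE, DepthFilter, and so on.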
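How such labels might be assigned can be sketched as below. The text does not give the filter expressions that were used, so both thresholds here are assumptions chosen for illustration: a minimum depth of 10 for DepthFilter, and a HARD_TO_VALIDATE rule based on the number and fraction of reads with mapping quality zero (MQ0).

```python
def filter_label(dp, mq0, min_dp=10):
    """Return the FILTER value for one variant (thresholds are illustrative).

    dp:  read depth at the site
    mq0: number of reads with mapping quality zero
    """
    failed = []
    if dp < min_dp:
        failed.append("DepthFilter")
    if mq0 >= 4 and dp > 0 and mq0 / dp > 0.1:
        failed.append("HARD_TO_VALIDATE")
    # VCF convention: PASS if no filter failed, else a ';'-joined list.
    return ";".join(failed) if failed else "PASS"

labels = [filter_label(50, 0), filter_label(5, 0), filter_label(30, 6)]
```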
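Example 13.7. Selection of sequence positions that are concordant within patients and within unaffected individuals
Finally, among the SNVs that satisfied the above criteria, the sequence information of the Tourette syndrome patient and of the patient's parents and siblings was used to select positions at which the sequences are concordant among patients and concordant among unaffected individuals, respectively. Within the family under analysis, the positions that precisely distinguish the sequences of patients from those of unaffected members are SNV and CNV positions. Each selected position was checked for whether it lies in a coding region, and only coding-region sequences were retained. The selected SNVs were converted to gene symbols. Additional information was provided for the genes corresponding to the selected SNVs, including gene descriptions, phenotypes, and PolyPhen-2 and SIFT scores. PolyPhen-2 and SIFT are programs that predict whether a sequence variant is likely to be disease-causing. All of the above steps were assembled into a pipeline that can be run on LINUX. The analysis reports every variant found in the Tourette syndrome gene panel, in a format intended to help identify the causative genes of Tourette syndrome patients.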
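The family-based selection can be sketched as a simple segregation test: a position is kept when all affected members share one genotype, all unaffected members share one genotype, and the two differ. The data layout and names are hypothetical, and the strict rule used here is one reading of the description rather than the pipeline's confirmed logic.

```python
def segregating_sites(genotypes, affected, unaffected):
    """Return sites whose genotypes separate affected from unaffected members.

    genotypes: {site: {sample_name: genotype_string}}
    """
    hits = []
    for site, calls in genotypes.items():
        patient_gts = {calls[s] for s in affected}
        control_gts = {calls[s] for s in unaffected}
        if (len(patient_gts) == 1 and len(control_gts) == 1
                and patient_gts != control_gts):
            hits.append(site)
    return hits

# Made-up trio: one affected child and two unaffected parents.
genotypes = {
    ("chr9", 116930998): {"child": "0/1", "father": "0/0", "mother": "0/0"},
    ("chr2", 216239973): {"child": "0/1", "father": "0/1", "mother": "0/0"},
    ("chr6", 38142846):  {"child": "0/1", "father": "0/1", "mother": "0/1"},
}
hits = segregating_sites(genotypes, affected={"child"},
                         unaffected={"father", "mother"})
```

In this example only the first site is kept, since it is the only one where the child differs from both parents while the parents agree with each other.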
<110> Korea Research Institute of Bioscience and Biotechnology
<120> METHOD FOR IDENTIFYING CAUSATIVE GENES OF TOURETTE SYNDROME
<130> FPD/201901-0012
<160> 208
<170> KoPatentIn 3.0
<210> 1
<211> 1860
<212> PRT
<213> Artificial Sequence
<220>
<223> COL27A1
<400> 1
Met Gly Ala Gly Ser Ala Arg Gly Ala Arg Gly Thr Ala Ala Ala Ala
1 5 10 15
Ala Ala Arg Gly Gly Gly Phe Leu Phe Ser Trp Ile Leu Val Ser Phe
20 25 30
Ala Cys His Leu Ala Ser Thr Gln Gly Ala Pro Glu Asp Val Asp Ile
35 40 45
Leu Gln Arg Leu Gly Leu Ser Trp Thr Lys Ala Gly Ser Pro Ala Pro
50 55 60
Pro Gly Val Ile Pro Phe Gln Ser Gly Phe Ile Phe Thr Gln Arg Ala
65 70 75 80
Arg Leu Gln Ala Pro Thr Gly Thr Val Ile Pro Ala Ala Leu Gly Thr
85 90 95
Glu Leu Ala Leu Val Leu Ser Leu Cys Ser His Arg Val Asn His Ala
100 105 110
Phe Leu Phe Ala Val Arg Ser Gln Lys Arg Lys Leu Gln Leu Gly Leu
115 120 125
Gln Phe Leu Pro Gly Lys Thr Val Val His Leu Gly Ser Arg Arg Ser
130 135 140
Val Ala Phe Asp Leu Asp Met His Asp Gly Arg Trp His His Leu Ala
145 150 155 160
Leu Glu Leu Arg Gly Arg Thr Val Thr Leu Val Thr Ala Cys Gly Gln
165 170 175
Arg Arg Val Pro Val Leu Leu Pro Phe His Arg Asp Pro Ala Leu Asp
180 185 190
Pro Gly Gly Ser Phe Leu Phe Gly Lys Met Asn Pro His Ala Val Gln
195 200 205
Phe Glu Gly Ala Leu Cys Gln Phe Ser Ile Tyr Pro Val Thr Gln Val
210 215 220
Ala His Asn Tyr Cys Thr His Leu Arg Lys Gln Cys Gly Gln Ala Asp
225 230 235 240
Thr Tyr Gln Ser Pro Leu Gly Pro Leu Phe Ser Gln Asp Ser Gly Arg
245 250 255
Pro Phe Thr Phe Gln Ser Asp Leu Ala Leu Leu Gly Leu Glu Asn Leu
260 265 270
Thr Thr Ala Thr Pro Ala Leu Gly Ser Leu Pro Ala Gly Arg Gly Pro
275 280 285
Arg Gly Thr Val Ala Pro Ala Thr Pro Thr Lys Pro Gln Arg Thr Ser
290 295 300
Pro Thr Asn Pro His Gln His Met Ala Val Gly Gly Pro Ala Gln Thr
305 310 315 320
Pro Leu Leu Pro Ala Lys Leu Ser Ala Ser Asn Ala Leu Asp Pro Met
325 330 335
Leu Pro Ala Ser Val Gly Gly Ser Thr Arg Thr Pro Arg Pro Ala Ala
340 345 350
Ala Gln Pro Ser Gln Lys Ile Thr Ala Thr Lys Ile Pro Lys Ser Leu
355 360 365
Pro Thr Lys Pro Ser Ala Pro Ser Thr Ser Ile Val Pro Ile Lys Ser
370 375 380
Pro His Pro Thr Gln Lys Thr Ala Pro Ser Ser Phe Thr Lys Ser Ala
385 390 395 400
Leu Pro Thr Gln Lys Gln Val Pro Pro Thr Ser Arg Pro Val Pro Ala
405 410 415
Arg Val Ser Arg Pro Ala Glu Lys Pro Ile Gln Arg Asn Pro Gly Met
420 425 430
Pro Arg Pro Pro Pro Pro Ser Thr Arg Pro Leu Pro Pro Thr Thr Ser
435 440 445
Ser Ser Lys Lys Pro Ile Pro Thr Leu Ala Arg Thr Glu Ala Lys Ile
450 455 460
Thr Ser His Ala Ser Lys Pro Ala Ser Ala Arg Thr Ser Thr His Lys
465 470 475 480
Pro Pro Pro Phe Thr Ala Leu Ser Ser Ser Pro Ala Pro Thr Pro Gly
485 490 495
Ser Thr Arg Ser Thr Arg Pro Pro Ala Thr Met Val Pro Pro Thr Ser
500 505 510
Gly Thr Ser Thr Pro Arg Thr Ala Pro Ala Val Pro Thr Pro Gly Ser
515 520 525
Ala Pro Thr Gly Ser Lys Lys Pro Ile Gly Ser Glu Ala Ser Lys Lys
530 535 540
Ala Gly Pro Lys Ser Ser Pro Arg Lys Pro Val Pro Leu Arg Pro Gly
545 550 555 560
Lys Ala Ala Arg Asp Val Pro Leu Ser Asp Leu Thr Thr Arg Pro Ser
565 570 575
Pro Arg Gln Pro Gln Pro Ser Gln Gln Thr Thr Pro Ala Leu Val Leu
580 585 590
Ala Pro Ala Gln Phe Leu Ser Ser Ser Pro Arg Pro Thr Ser Ser Gly
595 600 605
Tyr Ser Ile Phe His Leu Ala Gly Ser Thr Pro Phe Pro Leu Leu Met
610 615 620
Gly Pro Pro Gly Pro Lys Gly Asp Cys Gly Leu Pro Gly Pro Pro Gly
625 630 635 640
Leu Pro Gly Leu Pro Gly Ile Pro Gly Ala Arg Gly Pro Arg Gly Pro
645 650 655
Pro Gly Pro Tyr Gly Asn Pro Gly Leu Pro Gly Pro Pro Gly Ala Lys
660 665 670
Gly Gln Lys Gly Asp Pro Gly Leu Ser Pro Gly Lys Ala His Asp Gly
675 680 685
Ala Lys Gly Asp Met Gly Leu Pro Gly Leu Ser Gly Asn Pro Gly Pro
690 695 700
Pro Gly Arg Lys Gly His Lys Gly Tyr Pro Gly Pro Ala Gly His Pro
705 710 715 720
Gly Glu Gln Gly Gln Pro Gly Pro Glu Gly Ser Pro Gly Ala Lys Gly
725 730 735
Tyr Pro Gly Arg Gln Gly Leu Pro Gly Pro Val Gly Asp Pro Gly Pro
740 745 750
Lys Gly Ser Arg Gly Tyr Ile Gly Leu Pro Gly Leu Phe Gly Leu Pro
755 760 765
Gly Ser Asp Gly Glu Arg Gly Leu Pro Gly Val Pro Gly Lys Arg Gly
770 775 780
Lys Met Gly Met Pro Gly Phe Pro Gly Val Phe Gly Glu Arg Gly Pro
785 790 795 800
Pro Gly Leu Asp Gly Asn Pro Gly Glu Leu Gly Leu Pro Gly Pro Pro
805 810 815
Gly Val Pro Gly Leu Ile Gly Asp Leu Gly Val Leu Gly Pro Ile Gly
820 825 830
Tyr Pro Gly Pro Lys Gly Met Lys Gly Leu Met Gly Ser Val Gly Glu
835 840 845
Pro Gly Leu Lys Gly Asp Lys Gly Glu Gln Gly Val Pro Gly Val Ser
850 855 860
Gly Asp Pro Gly Phe Gln Gly Asp Lys Gly Ser Gln Gly Leu Pro Gly
865 870 875 880
Phe Pro Gly Ala Arg Gly Lys Pro Gly Pro Leu Gly Lys Val Gly Asp
885 890 895
Lys Gly Ser Ile Gly Phe Pro Gly Pro Pro Gly Pro Glu Gly Phe Pro
900 905 910
Gly Asp Ile Gly Pro Pro Gly Asp Asn Gly Pro Glu Gly Met Lys Gly
915 920 925
Lys Pro Gly Ala Arg Gly Leu Pro Gly Pro Arg Gly Gln Leu Gly Pro
930 935 940
Glu Gly Asp Glu Gly Pro Met Gly Pro Pro Gly Ala Pro Gly Leu Glu
945 950 955 960
Gly Gln Pro Gly Arg Lys Gly Phe Pro Gly Arg Pro Gly Leu Asp Gly
965 970 975
Val Lys Gly Glu Pro Gly Asp Pro Gly Arg Pro Gly Pro Val Gly Glu
980 985 990
Gln Gly Phe Met Gly Phe Ile Gly Leu Val Gly Glu Pro Gly Ile Val
995 1000 1005
Gly Glu Lys Gly Asp Arg Gly Met Met Gly Pro Pro Gly Val Pro Gly
1010 1015 1020
Pro Lys Gly Ser Met Gly His Pro Gly Met Pro Gly Gly Met Gly Thr
1025 1030 1035 1040
Pro Gly Glu Pro Gly Pro Gln Gly Pro Pro Gly Ser Arg Gly Pro Pro
1045 1050 1055
Gly Met Arg Gly Ala Lys Gly Arg Arg Gly Pro Arg Gly Pro Asp Gly
1060 1065 1070
Pro Ala Gly Glu Gln Gly Ser Arg Gly Leu Lys Gly Pro Pro Gly Pro
1075 1080 1085
Gln Gly Arg Pro Gly Arg Pro Gly Gln Gln Gly Val Ala Gly Glu Arg
1090 1095 1100
Gly His Leu Gly Ser Arg Gly Phe Pro Gly Ile Pro Gly Pro Ser Gly
1105 1110 1115 1120
Pro Pro Gly Thr Lys Gly Leu Pro Gly Glu Pro Gly Pro Gln Gly Pro
1125 1130 1135
Gln Gly Pro Ile Gly Pro Pro Gly Glu Met Gly Pro Lys Gly Pro Pro
1140 1145 1150
Gly Ala Val Gly Glu Pro Gly Leu Pro Gly Glu Ala Gly Met Lys Gly
1155 1160 1165
Asp Leu Gly Pro Leu Gly Thr Pro Gly Glu Gln Gly Leu Ile Gly Gln
1170 1175 1180
Arg Gly Glu Pro Gly Leu Glu Gly Asp Ser Gly Pro Met Gly Pro Asp
1185 1190 1195 1200
Gly Leu Lys Gly Asp Arg Gly Asp Pro Gly Pro Asp Gly Glu His Gly
1205 1210 1215
Glu Lys Gly Gln Glu Gly Leu Met Gly Glu Asp Gly Pro Pro Gly Pro
1220 1225 1230
Pro Gly Val Thr Gly Val Arg Gly Pro Glu Gly Lys Ser Gly Lys Gln
1235 1240 1245
Gly Glu Lys Gly Arg Thr Gly Ala Lys Gly Ala Lys Gly Tyr Gln Gly
1250 1255 1260
Gln Leu Gly Glu Met Gly Val Pro Gly Asp Pro Gly Pro Pro Gly Thr
1265 1270 1275 1280
Pro Gly Pro Lys Gly Ser Arg Gly Ser Leu Gly Pro Thr Gly Ala Pro
1285 1290 1295
Gly Arg Met Gly Ala Gln Gly Glu Pro Gly Leu Ala Gly Tyr Asp Gly
1300 1305 1310
His Lys Gly Ile Val Gly Pro Leu Gly Pro Pro Gly Pro Lys Gly Glu
1315 1320 1325
Lys Gly Glu Gln Gly Glu Asp Gly Lys Ala Glu Gly Pro Pro Gly Pro
1330 1335 1340
Pro Gly Asp Arg Gly Pro Val Gly Asp Arg Gly Asp Arg Gly Glu Pro
1345 1350 1355 1360
Gly Asp Pro Gly Tyr Pro Gly Gln Glu Gly Val Gln Gly Leu Arg Gly
1365 1370 1375
Lys Pro Gly Gln Gln Gly Gln Pro Gly His Pro Gly Pro Arg Gly Trp
1380 1385 1390
Pro Gly Pro Lys Gly Ser Lys Gly Ala Glu Gly Pro Lys Gly Lys Gln
1395 1400 1405
Gly Lys Ala Gly Ala Pro Gly Arg Arg Gly Val Gln Gly Leu Gln Gly
1410 1415 1420
Leu Pro Gly Pro Arg Gly Val Val Gly Arg Gln Gly Leu Glu Gly Ile
1425 1430 1435 1440
Ala Gly Pro Asp Gly Leu Pro Gly Arg Asp Gly Gln Ala Gly Gln Gln
1445 1450 1455
Gly Glu Gln Gly Asp Asp Gly Asp Pro Gly Pro Met Gly Pro Ala Gly
1460 1465 1470
Lys Arg Gly Asn Pro Gly Val Ala Gly Leu Pro Gly Ala Gln Gly Pro
1475 1480 1485
Pro Gly Phe Lys Gly Glu Ser Gly Leu Pro Gly Gln Leu Gly Pro Pro
1490 1495 1500
Gly Lys Arg Gly Thr Glu Gly Arg Thr Gly Leu Pro Gly Asn Gln Gly
1505 1510 1515 1520
Glu Pro Gly Ser Lys Gly Gln Pro Gly Asp Ser Gly Glu Met Gly Phe
1525 1530 1535
Pro Gly Met Ala Gly Leu Phe Gly Pro Lys Gly Pro Pro Gly Asp Ile
1540 1545 1550
Gly Phe Lys Gly Ile Gln Gly Pro Arg Gly Pro Pro Gly Leu Met Gly
1555 1560 1565
Lys Glu Gly Ile Val Gly Pro Leu Gly Ile Leu Gly Pro Ser Gly Leu
1570 1575 1580
Pro Gly Pro Lys Gly Asp Lys Gly Ser Arg Gly Asp Trp Gly Leu Gln
1585 1590 1595 1600
Gly Pro Arg Gly Pro Pro Gly Pro Arg Gly Arg Pro Gly Pro Pro Gly
1605 1610 1615
Pro Pro Gly Gly Pro Ile Gln Leu Gln Gln Asp Asp Leu Gly Ala Ala
1620 1625 1630
Phe Gln Thr Trp Met Asp Thr Ser Gly Ala Leu Arg Pro Glu Ser Tyr
1635 1640 1645
Ser Tyr Pro Asp Arg Leu Val Leu Asp Gln Gly Gly Glu Ile Phe Lys
1650 1655 1660
Thr Leu His Tyr Leu Ser Asn Leu Ile Gln Ser Ile Lys Thr Pro Leu
1665 1670 1675 1680
Gly Thr Lys Glu Asn Pro Ala Arg Val Cys Arg Asp Leu Met Asp Cys
1685 1690 1695
Glu Gln Lys Met Val Asp Gly Thr Tyr Trp Val Asp Pro Asn Leu Gly
1700 1705 1710
Cys Ser Ser Asp Thr Ile Glu Val Ser Cys Asn Phe Thr His Gly Gly
1715 1720 1725
Gln Thr Cys Leu Lys Pro Ile Thr Ala Ser Lys Val Glu Phe Ala Ile
1730 1735 1740
Ser Arg Val Gln Met Asn Phe Leu His Leu Leu Ser Ser Glu Val Thr
1745 1750 1755 1760
Gln His Ile Thr Ile His Cys Leu Asn Met Thr Val Trp Gln Glu Gly
1765 1770 1775
Thr Gly Gln Thr Pro Ala Lys Gln Ala Val Arg Phe Arg Ala Trp Asn
1780 1785 1790
Gly Gln Ile Phe Glu Ala Gly Gly Gln Phe Arg Pro Glu Val Ser Met
1795 1800 1805
Asp Gly Cys Lys Val Gln Asp Gly Arg Trp His Gln Thr Leu Phe Thr
1810 1815 1820
Phe Arg Thr Gln Asp Pro Gln Gln Leu Pro Ile Ile Ser Val Asp Asn
1825 1830 1835 1840
Leu Pro Pro Ala Ser Ser Gly Lys Gln Tyr Arg Leu Glu Val Gly Pro
1845 1850 1855
Ala Cys Phe Leu
1860
<210> 2
<211> 7813
<212> DNA
<213> Artificial Sequence
<220>
<223> COL27A1
<400> 2
ccttttcctc tcctccccca ggccggcggg gaggcagctt ccaccgccct ccgcgcgccc 60
tcacccggcc ttgctctgcc tccggggacc gccagcagcc cgcctccaaa agtttgatca 120
tctctctctc tctttttctt gcttcttctt cctttttggt ggaagcagaa aaggaccgag 180
gcaggggcga gcgcggcgcc cggactcctg ggaccatggg cctggcgcgg gcgcccgcgg 240
ggccccagcc gcgctgcctg cctgctcggg cgcccctggg cgcggggctg cgctgggggc 300
gcgggggccg cgcgctctaa gccggcctgg cgcggcgggg cggggggctg gcggccccat 360
ggggcgcgcc cacacttgcc ccccgggctc gggagcatga agtaggggcc tgccatggga 420
gcgggatcgg cgcggggggc ccgaggcaca gcggcggcgg cggcggcgcg cggggggggg 480
tttctcttct cctggatctt agtctcgttt gcctgtcacc tggcctccac ccaaggagct 540
cctgaagatg tggacatcct ccagcggctg ggcctcagct ggacgaaggc cgggagccct 600
gcacccccgg gagtcattcc tttccagtcg ggcttcatct ttacgcagcg ggcccggctc 660
caggctccca cgggcaccgt cattcctgcc gccttgggca cagagctggc actggtgctg 720
agcctctgct cccaccgggt gaaccatgcc ttcctcttcg ctgtccgcag ccagaaacgc 780
aagctgcagc tgggcctgca gttcctcccc ggcaagacgg tcgtccacct cgggtcccgg 840
cgctcagtgg ccttcgacct cgacatgcac gacgggcgct ggcaccacct ggccctcgag 900
ctccgaggcc gcacagtcac tctggtgact gcctgcgggc agcgccgggt gcctgtcctg 960
ctgcctttcc acagggaccc tgcactcgac cctgggggct ccttcctctt tgggaagatg 1020
aacccgcatg cagtccagtt tgaaggtgct ctctgccagt tcagtatcta ccctgtgacg 1080
caggtcgctc acaattactg tacccacctg aggaagcagt gtggacaggc tgacacgtac 1140
cagtccccac tgggacctct cttctcccaa gactctggca gaccttttac cttccagtcc 1200
gacctcgccc tgctaggcct ggagaacttg accactgcca caccagccct ggggtcactg 1260
ccagcaggca ggggacccag ggggactgtg gcacccgcca cgcccaccaa gccccaaagg 1320
actagcccca caaaccctca ccagcatatg gcggtgggag gcccagccca aaccccgctg 1380
ctacctgcca agctgtcagc cagtaacgca cttgatccca tgctcccagc ctctgttggc 1440
ggctctacca gaacgcctcg ccctgcggcc gctcaaccat cacagaagat cacagccacc 1500
aaaatcccca aaagcctccc taccaagcct tcggcccctt ctacttcaat tgtgcccatc 1560
aaaagccccc atcctaccca gaaaacagct ccatcttcat ttacaaagtc agccctaccc 1620
actcagaagc aagtgccacc tacttcccgt ccagttcctg ccagagtctc ccgtcccgca 1680
gagaagccca tccagaggaa cccgggaatg cccaggcccc caccgcccag cacccggccc 1740
ctacctccta ccaccagctc ctctaaaaaa cccattccca cactagctcg gactgaggcc 1800
aagataacca gccatgccag taagccggcc tctgcccgca ccagcaccca caaacctccc 1860
ccatttactg ctttatcctc atctcctgcc cctactcctg gttctaccag gagtactcgg 1920
ccaccagcca cgatggtacc tccaacttcg ggcaccagca ctcccagaac agcacctgcc 1980
gtccccactc ctggctcagc tcccactgga agcaagaagc ccattggatc ggaagcctca 2040
aagaaagccg gacccaagag cagcccccgg aagcctgtcc ccctcagacc tgggaaggca 2100
gccagggatg tccccttgag cgatctgaca accaggccta gccccagaca gccccagccc 2160
agtcagcaga ccaccccggc cctggtattg gccccggcgc aattcctgtc ctccagcccc 2220
cggcccacga gcagtggcta ttcgatcttc cacctggcag gatctacgcc tttccctctg 2280
ctgatggggc ctccgggacc caagggagac tgtggcttgc cgggtccccc tgggctacct 2340
gggctacctg gaatccctgg tgcacgtggg cctcggggtc ctcctgggcc ttatggaaat 2400
ccaggtctcc ccggccctcc tggagccaaa ggacagaaag gggacccagg gctctcacca 2460
ggaaaggccc acgatggggc aaagggtgac atgggcttgc ctgggctctc cgggaatcca 2520
ggacctccgg gacgaaaggg acacaagggc tatcctggac cggcagggca ccccggagaa 2580
caggggcagc caggacctga gggcagccca ggggccaaag gttaccctgg caggcagggg 2640
ttacctggac cggtaggaga tcccggcccc aaaggcagca ggggctacat tgggctccca 2700
gggctcttcg gcctgccagg gtctgatgga gaacgaggcc tgcctggcgt tcctggcaag 2760
aggggcaaga tgggtatgcc ggggtttcct ggagtctttg gggaaagagg ccctcctgga 2820
ctggatggaa atcctggaga actgggcctg ccaggccccc ctggagtccc cggcctcatt 2880
ggtgacttag gagtgttggg tccgattggc tacccgggac ccaagggcat gaagggactg 2940
atgggcagcg tgggggagcc cggactgaaa ggtgataagg gtgaacaagg ggttccaggt 3000
gtgtcaggag atcccggatt ccaaggagac aaggggagcc aggggttgcc agggttcccc 3060
ggtgcacggg ggaagccagg gcctctgggc aaagtcggag acaaaggatc cattgggttt 3120
cccgggcccc ctggacccga gggattccca ggagacatcg gcccccctgg cgacaatggc 3180
ccagaaggca tgaagggtaa gcctggagcc cgaggcctgc cgggaccccg tgggcagctg 3240
gggcccgagg gagatgaggg acccatgggg ccgccagggg cccctggctt ggagggtcag 3300
cctggcagga aggggtttcc tgggaggccc ggcctggatg gcgtgaaggg ggaaccaggg 3360
gatcctggtc ggccggggcc tgtgggagag cagggattta tgggattcat tggtctggtc 3420
ggggagccag gaatcgtggg agaaaagggt gatcgtggca tgatgggacc cccaggcgtg 3480
cctggaccca aggggtcgat gggtcatcct ggaatgccag gtggtatggg gacccctgga 3540
gagcctggac cccagggtcc tccaggatct cgaggcccac caggcatgag gggagcaaag 3600
ggacgtcggg gcccccgagg accggacgga ccagctgggg agcaagggtc caggggcctg 3660
aagggccctc caggacccca gggcagaccg ggccggcctg gacagcaggg tgtggctggt 3720
gagcgaggcc acttgggctc gagaggcttt cctggcatcc cgggtccctc aggcccccca 3780
ggcaccaagg gcctcccagg agaaccgggc cctcagggac cccaggggcc aattgggcct 3840
ccaggagaga tgggacccaa ggggccgcct ggtgcagtgg gagaaccggg ccttcctggg 3900
gaagccggga tgaagggtga ccttggaccc ctgggcactc ctggggagca gggcctcatt 3960
gggcaacggg gagagccagg ccttgagggt gacagtggcc ccatgggacc tgatgggctg 4020
aagggggaca ggggagaccc agggcctgat ggagaacatg gcgagaaagg ccaggaaggg 4080
ctgatgggtg aggacgggcc ccccggcccc cctggcgtca ctggtgtccg gggtcctgaa 4140
ggaaaatcag ggaagcaagg cgagaagggc cgcactggag ccaagggtgc caagggctat 4200
caaggacagc tgggtgagat gggcgtccct ggagaccctg gaccccctgg cactccaggc 4260
cctaaagggt cccggggcag cctgggacca acgggtgctc cgggacgcat gggggcccaa 4320
ggagaaccgg gactggctgg ttatgatgga cacaaaggca ttgtgggacc ccttggacct 4380
cctggaccaa aaggcgaaaa gggggagcag ggcgaggacg gcaaggctga ggggccccct 4440
gggccacctg gagatcgggg ccctgtgggt gatcgaggag accgcgggga accgggagac 4500
cctgggtacc ctggacagga gggtgtgcaa ggcctccgtg gaaagccagg ccagcagggc 4560
caacccgggc atccgggacc ccgggggtgg ccgggaccca aaggatcgaa aggcgcagag 4620
ggaccaaagg gaaagcaagg caaggcaggg gccccaggcc ggaggggggt ccagggcctg 4680
caggggctgc cagggccccg gggcgtggtg gggagacagg gcctcgaggg catcgctgga 4740
ccagatgggc ttcctggcag ggacgggcaa gcaggacagc agggggagca gggagacgat 4800
ggggaccctg gccccatggg ccctgctggg aagagaggaa atccaggtgt ggccggctta 4860
cctggagcac agggaccccc aggattcaag ggtgagagtg ggttacccgg acagctgggt 4920
ccccctggca agcgaggaac agagggcaga acggggctcc ctggaaacca gggggagcct 4980
gggtccaaag gccagccggg cgactctggc gagatgggct tcccaggaat ggcaggtctc 5040
ttcggaccca agggcccgcc tggagacatt ggcttcaaag gcatccaggg ccctcggggg 5100
ccacctggct tgatgggaaa ggaaggcatc gtcgggcccc tcggaatcct gggaccttcg 5160
ggactcccgg gtccgaaggg tgacaaaggc agccgtgggg actggggatt gcaaggtccg 5220
aggggtcctc ccggccccag agggcggccc ggccccccgg gtcctccagg gggtcctatc 5280
caattgcaac aagatgatct tggggcagct ttccagacgt ggatggacac cagtggagca 5340
ctcaggccag agagttacag ctatccagac cggctggtgc tggaccaggg aggagagatc 5400
tttaaaacct tacactacct cagcaacctc atccagagca ttaagacgcc cctgggcacc 5460
aaagagaacc ccgcccgggt ctgcagggac ctcatggact gtgagcagaa gatggtggat 5520
ggtacctact gggtggatcc aaaccttggc tgctcctctg acaccatcga ggtctcctgc 5580
aacttcactc atggtggaca gacgtgtctc aagcccatca cggcctccaa ggtcgagttt 5640
gccatcagcc gggtccagat gaatttcctg cacctgctaa gctccgaggt gacccagcac 5700
atcaccatcc actgccttaa catgaccgtg tggcaggagg gcactgggca gaccccagcc 5760
aagcaggccg tacgcttccg ggcctggaat ggacagattt ttgaagctgg gggtcagttc 5820
cggcccgagg tgtccatgga tggctgcaag gtccaagatg gccgctggca tcagacactc 5880
ttcaccttcc ggacccaaga cccccaacag ctgcccatca tcagtgtgga caacctccct 5940
cctgcctcat cagggaagca gtaccgcctg gaagttggac ctgcgtgctt cctctgacct 6000
ctgacctcgt ggccactcta ggcctcacgg aggagggaag aggaagaggc aaggggaggg 6060
tactgagggg cagatggctc caggagaggc agctcccctg cccaagggtc cttgggcaga 6120
ccccagctgt tgtctgccca gtagaagtgg gtgggggtag gaggggatag ggtgtccttg 6180
ggaacaatgg atcccagctt agccccaaag accaaccaaa gagccagcca gagtaagctg 6240
gacctgcaac ctgcctgagc cccgtggcct ctcagctctg cggccacccc gttccctccc 6300
cagcttcctg cccaaagagc cccacattca agccaacttg agggaagggg gcgtctcgtc 6360
agctggtccc tgctagggag ctattgatgt gcaatattag aaaggagaca tgaaaaaagg 6420
agaaaaggaa agacagaagt gtatatatat attatttaaa caaacaaaaa gaaggtgcgt 6480
tactattttt ttttcacccg ggaaagaggt gagaggatgg gaaggagcag ccaggcgtgg 6540
gaagcggcga gatcctcggg ctgggggtgc ccacgtttgc tacctcccac tgtgaaatcg 6600
ctggtgctca caattgtctc tcacagtgta tgtgattttt ttaaggaaaa aaaaaaatcc 6660
ctatttaaga ttctgaaggt gctaccatta ttttgccaca gactttgaag aaacttttgg 6720
atgtggggca tcatccgcat ctttctctct cctccaaatg acaaagtttg gggaattttt 6780
gaattttcct agcatcgccc ttgtgctcat caggtaatct gctaaggagg aaaaaagaaa 6840
agaaaaaagg aaaaaaaaaa aaaaaaagca aaacaaaaac aaaaacaaaa accctaccag 6900
aaaccagaag tagagagatt taccatataa cttatggact ttgaaatgtc tgtcctttta 6960
aggcagcagg gaggcctggg tgcgaagcat gttggcttgg cccttcacgg tcctggaggg 7020
aggtgaggct ggccttggaa ggcgtgccct ggagaggtct tgggtgaaaa cttgaccttg 7080
aagaaaccaa tcacaaaagc ggcgttgggt cagggctagg cttagaggtg aagcatcaac 7140
atggaaccat ctcaggaagc cgcatcgcct cttccgaggt cctcacttcc aggagcctgt 7200
ccttgcaaga tgcaatcatc gttcctgctt tttcattgtc attaaattct gtagaaaccc 7260
attgtcatta gctccaagtg taaatttggg tcaaggagac agaataataa tgggaatctc 7320
ggagttcgac accatagtga cgttcagcgt cctctgaatt gtgctacatc agcgaacaag 7380
tcggcgcttg aattggattt tgaggttatt ttaaccatgg aattattttt atagaagggg 7440
aaaatgtatg tgaaagtctc tatttgtgta tttctctcct aaagttgtgt ctctttggga 7500
attggatttg atttttatta tttaatacct cactttggcc cgtcccccct cccaacactt 7560
ctgtatcctc gccctgccgc cccagcctgg acgctctgcg tggaagtgcg tgtttgtagc 7620
agctcgggcc tcatctcagc gctcggatcc ctcctgctgc cagaatccac tggcctctgt 7680
ctcattcttg ggttttcctg ctgtcttcgt ttacgtctct gtccacatgt cagtgtatta 7740
aaaccccaat gggttccgtt tctccttttc ccctctggat tttaaataaa tatttaaaac 7800
tgaggcaatg gaa 7813
<210> 3
<211> 583
<212> PRT
<213> Artificial Sequence
<220>
<223> BTBD9
<400> 3
Met Cys Arg Ala Leu Leu Tyr Gly Gly Met Arg Glu Ser Gln Pro Glu
1 5 10 15
Ala Glu Ile Pro Leu Gln Asp Thr Thr Ala Glu Ala Phe Thr Met Leu
20 25 30
Leu Lys Tyr Ile Tyr Thr Gly Arg Ala Thr Leu Thr Asp Glu Lys Glu
35 40 45
Glu Val Leu Leu Asp Phe Leu Ser Leu Ala His Lys Tyr Gly Phe Pro
50 55 60
Glu Leu Glu Asp Ser Thr Ser Glu Tyr Leu Cys Thr Ile Leu Asn Ile
65 70 75 80
Gln Asn Val Cys Met Thr Phe Asp Val Ala Ser Leu Tyr Ser Leu Pro
85 90 95
Lys Leu Thr Cys Met Cys Cys Met Phe Met Asp Arg Asn Ala Gln Glu
100 105 110
Val Leu Ser Ser Glu Gly Phe Leu Ser Leu Ser Lys Thr Ala Leu Leu
115 120 125
Asn Ile Val Leu Arg Asp Ser Phe Ala Ala Pro Glu Lys Asp Ile Phe
130 135 140
Leu Ala Leu Leu Asn Trp Cys Lys His Asn Ser Lys Glu Asn His Ala
145 150 155 160
Glu Ile Met Gln Ala Val Arg Leu Pro Leu Met Ser Leu Thr Glu Leu
165 170 175
Leu Asn Val Val Arg Pro Ser Gly Leu Leu Ser Pro Asp Ala Ile Leu
180 185 190
Asp Ala Ile Lys Val Arg Ser Glu Ser Arg Asp Met Asp Leu Asn Tyr
195 200 205
Arg Gly Met Leu Ile Pro Glu Glu Asn Ile Ala Thr Met Lys Tyr Gly
210 215 220
Ala Gln Val Val Lys Gly Glu Leu Lys Ser Ala Leu Leu Asp Gly Asp
225 230 235 240
Thr Gln Asn Tyr Asp Leu Asp His Gly Phe Ser Arg His Pro Ile Asp
245 250 255
Asp Asp Cys Arg Ser Gly Ile Glu Ile Lys Leu Gly Gln Pro Ser Ile
260 265 270
Ile Asn His Ile Arg Ile Leu Leu Trp Asp Arg Asp Ser Arg Ser Tyr
275 280 285
Ser Tyr Phe Ile Glu Val Ser Met Asp Glu Leu Asp Trp Val Arg Val
290 295 300
Ile Asp His Ser Gln Tyr Leu Cys Arg Ser Trp Gln Lys Leu Tyr Phe
305 310 315 320
Pro Ala Arg Val Cys Ser Gly Asp Gly Val Ser Leu Trp Cys Pro Leu
325 330 335
Trp Ser Arg Thr Pro Glu Leu Lys Gln Ser Ser Leu Leu Gly Leu Pro
340 345 350
Lys Cys Arg Tyr Ile Arg Ile Val Gly Thr His Asn Thr Val Asn Lys
355 360 365
Ile Phe His Ile Val Ala Phe Glu Cys Met Phe Thr Asn Lys Thr Phe
370 375 380
Thr Leu Glu Lys Gly Leu Ile Val Pro Met Glu Asn Val Ala Thr Ile
385 390 395 400
Ala Asp Cys Ala Ser Val Ile Glu Gly Val Ser Arg Ser Arg Asn Ala
405 410 415
Leu Leu Asn Gly Asp Thr Lys Asn Tyr Asp Trp Asp Ser Gly Tyr Thr
420 425 430
Cys His Gln Leu Gly Ser Gly Ala Ile Val Val Gln Leu Ala Gln Pro
435 440 445
Tyr Met Ile Gly Ser Ile Arg Leu Leu Leu Trp Asp Cys Asp Asp Arg
450 455 460
Ser Tyr Ser Tyr Tyr Val Glu Val Ser Thr Asn Gln Gln Gln Trp Thr
465 470 475 480
Met Val Ala Asp Arg Thr Lys Val Ser Cys Lys Ser Trp Gln Ser Val
485 490 495
Thr Phe Glu Arg Gln Pro Ala Ser Phe Ile Arg Ile Val Gly Thr His
500 505 510
Asn Thr Ala Asn Glu Val Phe His Cys Val His Phe Glu Cys Pro Glu
515 520 525
Gln Gln Ser Ser Gln Lys Glu Glu Asn Ser Glu Glu Ser Gly Thr Gly
530 535 540
Asp Thr Ser Leu Ala Gly Gln Gln Leu Asp Ser His Ala Leu Arg Ala
545 550 555 560
Pro Ser Gly Ser Ser Leu Pro Ser Ser Pro Gly Ser Asn Ser Arg Ser
565 570 575
Pro Asn Arg Gln His Gln Ala
580
<210> 4
<211> 2034
<212> DNA
<213> Artificial Sequence
<220>
<223> BTBD9
<400> 4
catgtttatt gttccaaggg agcctcatca aaaccacttt taaagcattg agaatccaaa 60
taaataccat ggatcataca ggattgggga taagtcttga ggtctccccg agaaggagaa 120
gttctttaat ttggtaattt tagggtaatg gttggatgct ctctggatta atggaaacct 180
tcaggggtac ttttaggcag agtataaccc agtgaagagc aaatcctatt tgagggtttc 240
ttcatgtgtt tcttccctga tgtgcagagc attattatat ggtggaatgc gagagtctca 300
gcctgaagca gaaattcctc tccaagacac cactgcagaa gcattcacaa tgctactcaa 360
atatatctac actgggcggg caacgctgac agatgagaag gaggaggtgc tgctggactt 420
tttgagcctg gctcataaat atggatttcc agagctagag gattctacct ctgagtatct 480
ctgcaccata cttaacattc agaatgtctg catgactttt gatgttgcca gtctctactc 540
acttcccaag ttaacttgta tgtgctgcat gtttatggat aggaatgctc aggaagtcct 600
ctcaagtgaa ggtttcctct ccctttctaa gacagcactt ttaaacatcg tgttaagaga 660
ctcatttgca gctcccgaaa aagatatttt cctagcctta ttaaactggt gtaagcacaa 720
ttcaaaggag aatcatgctg aaatcatgca ggctgtgcgt ttacctctca tgagcctcac 780
agagcttctg aatgttgtga ggccttcagg actgctgtct cctgatgcca tcctggatgc 840
cattaaagtg cgatctgaga gccgggatat ggacctcaat tatagaggca tgctcatacc 900
agaagaaaac attgcaacta tgaagtatgg agcccaagtt gtaaaggggg agctgaaatc 960
agccttatta gatggtgata ctcaaaatta tgatttggat catggatttt caaggcaccc 1020
aattgatgat gactgccgtt ccggcatcga gattaagcta ggtcagccat ccattatcaa 1080
tcacatacgg atactcttgt gggaccgaga tagccggtct tactcatact tcattgaagt 1140
gtcaatggat gaacttgatt gggtcagagt gatagatcat tcacaatatc tgtgtcgttc 1200
ttggcagaaa ttatattttc cagcccgtgt ctgcagtgga gatggagtct cactatggtg 1260
cccactctgg tctcgaactc ctgagctcaa gcaatcctcc ctccttggcc ttccaaagtg 1320
caggtatatt cgaattgttg ggactcacaa cacagtgaac aagatttttc acattgtggc 1380
ttttgaatgt atgtttacaa acaaaacctt cactcttgag aaggggctga tagttcccat 1440
ggagaatgtt gcaacaattg ctgattgtgc cagtgtgatt gaaggagtca gtcggagccg 1500
aaatgccttg ctgaatgggg acactaagaa ttatgactgg gattctggct acacatgtca 1560
ccagctagga agtggtgcga ttgtggttca gttggcacaa ccgtacatga ttgggtcaat 1620
acggttacta ctttgggatt gtgatgatcg aagctatagc tactacgttg aggtttctac 1680
caaccagcaa cagtggacca tggttgctga cagaactaaa gtctcctgca agtcctggca 1740
gtcagtaact tttgaaaggc agcctgcctc cttcatccgt atcgttggga cacacaacac 1800
agcaaatgag gtgttccact gtgtccactt tgagtgtcca gagcagcaga gcagccagaa 1860
ggaggaaaat agtgaggaat cggggacagg ggacaccagc ctggccggtc agcagctcga 1920
ctcccatgcg ctgcgggcgc ctagtggcag ctcactaccc tccagcccag gctccaactc 1980
acgctccccc aaccggcagc accaataaag gaggcagcgg gcctggtgtg actt 2034
<210> 5
<211> 437
<212> PRT
<213> Artificial Sequence
<220>
<223> SGCE
<400> 5
Met Gln Leu Pro Arg Trp Trp Glu Leu Gly Asp Pro Cys Ala Trp Thr
1 5 10 15
Gly Gln Gly Arg Gly Thr Arg Arg Met Ser Pro Ala Thr Thr Gly Thr
20 25 30
Phe Leu Leu Thr Val Tyr Ser Ile Phe Ser Lys Val His Ser Asp Arg
35 40 45
Asn Val Tyr Pro Ser Ala Gly Val Leu Phe Val His Val Leu Glu Arg
50 55 60
Glu Tyr Phe Lys Gly Glu Phe Pro Pro Tyr Pro Lys Pro Gly Glu Ile
65 70 75 80
Ser Asn Asp Pro Ile Thr Phe Asn Thr Asn Leu Met Gly Tyr Pro Asp
85 90 95
Arg Pro Gly Trp Leu Arg Tyr Ile Gln Arg Thr Pro Tyr Ser Asp Gly
100 105 110
Val Leu Tyr Gly Ser Pro Thr Ala Glu Asn Val Gly Lys Pro Thr Ile
115 120 125
Ile Glu Ile Thr Ala Tyr Asn Arg Arg Thr Phe Glu Thr Ala Arg His
130 135 140
Asn Leu Ile Ile Asn Ile Met Ser Ala Glu Asp Phe Pro Leu Pro Tyr
145 150 155 160
Gln Ala Glu Phe Phe Ile Lys Asn Met Asn Val Glu Glu Met Leu Ala
165 170 175
Ser Glu Val Leu Gly Asp Phe Leu Gly Ala Val Lys Asn Val Trp Gln
180 185 190
Pro Glu Arg Leu Asn Ala Ile Asn Ile Thr Ser Ala Leu Asp Arg Gly
195 200 205
Gly Arg Val Pro Leu Pro Ile Asn Asp Leu Lys Glu Gly Val Tyr Val
210 215 220
Met Val Gly Ala Asp Val Pro Phe Ser Ser Cys Leu Arg Glu Val Glu
225 230 235 240
Asn Pro Gln Asn Gln Leu Arg Cys Ser Gln Glu Met Glu Pro Val Ile
245 250 255
Thr Cys Asp Lys Lys Phe Arg Thr Gln Phe Tyr Ile Asp Trp Cys Lys
260 265 270
Ile Ser Leu Val Asp Lys Thr Lys Gln Val Ser Thr Tyr Gln Glu Val
275 280 285
Ile Arg Gly Glu Gly Ile Leu Pro Asp Gly Gly Glu Tyr Lys Pro Pro
290 295 300
Ser Asp Ser Leu Lys Ser Arg Asp Tyr Tyr Thr Asp Phe Leu Ile Thr
305 310 315 320
Leu Ala Val Pro Ser Ala Val Ala Leu Val Leu Phe Leu Ile Leu Ala
325 330 335
Tyr Ile Met Cys Cys Arg Arg Glu Gly Val Glu Lys Arg Asn Met Gln
340 345 350
Thr Pro Asp Ile Gln Leu Val His His Ser Ala Ile Gln Lys Ser Thr
355 360 365
Lys Glu Leu Arg Asp Met Ser Lys Asn Arg Glu Ile Ala Trp Pro Leu
370 375 380
Ser Thr Leu Pro Val Phe His Pro Val Thr Gly Glu Ile Ile Pro Pro
385 390 395 400
Leu His Thr Asp Asn Tyr Asp Ser Thr Asn Met Pro Leu Met Gln Thr
405 410 415
Gln Gln Asn Leu Pro His Gln Thr Gln Ile Pro Gln Gln Gln Thr Thr
420 425 430
Gly Lys Trp Tyr Pro
435
<210> 6
<211> 1615
<212> DNA
<213> Artificial Sequence
<220>
<223> SGCE
<400> 6
gcctagccag gccaagaatg caattgcccc ggtggtggga gctgggagac ccctgtgctt 60
ggacgggaca gggtcggggg acacgcagga tgagccccgc gaccactggc acattcttgc 120
tgacagtgta cagtattttc tccaaggtac actccgatcg gaatgtatac ccatcagcag 180
gtgtcctctt tgttcatgtt ttggaaagag aatattttaa gggggaattt ccaccttacc 240
caaaacctgg cgagattagt aatgatccca taacatttaa tacaaattta atgggttacc 300
cagaccgacc tggatggctt cgatatatcc aaaggacacc atatagtgat ggagtcctat 360
atgggtcccc aacagctgaa aatgtgggga agccaacaat cattgagata actgcctaca 420
acaggcgcac ctttgagact gcaaggcata atttgataat taatataatg tctgcagaag 480
acttcccgtt gccatatcaa gcagaattct tcattaagaa tatgaatgta gaagaaatgt 540
tggccagtga ggttcttgga gactttcttg gcgcagtgaa aaatgtgtgg cagccagagc 600
gcctgaacgc cataaacatc acatcggccc tagacagggg tggcagggtg ccacttccca 660
ttaatgacct gaaggagggc gtttatgtca tggttggtgc agatgtcccg ttttcttctt 720
gtttacgaga agttgaaaat ccacagaatc aattgagatg tagtcaagaa atggagcctg 780
taataacatg tgataaaaaa tttcgtactc aattttacat tgactggtgc aaaatttcat 840
tggttgataa aacaaagcaa gtgtccacct atcaggaagt gattcgtgga gaggggattt 900
tacctgatgg tggagaatac aaaccccctt ctgattcttt gaaaagcaga gactattaca 960
cggatttcct aattacactg gctgtgccct cggcagtggc actggtcctt tttctaatac 1020
ttgcttatat catgtgctgc cgacgggaag gcgtggaaaa gagaaacatg caaacaccag 1080
acatccaact ggtccatcac agtgctattc agaaatctac caaggagctt cgagacatgt 1140
ccaagaatag agagatagca tggcccctgt caacgcttcc tgtgttccac cctgtgactg 1200
gggaaatcat acctccttta cacacagaca actatgatag cacaaacatg ccattgatgc 1260
aaacgcagca gaacttgcca catcagactc agattcccca acagcagact acaggtaaat 1320
ggtatccctg aagaaagaaa actgactgaa gcaatgaatt tataatcaga caatatagca 1380
gttacatcac atttcttttc tcttccaata atgcatgagc ttttctggca tatgttatgc 1440
atgttggcag tattaagtgt ataccaaata atacaacata actttcattt tactaatgta 1500
tttttttgta cttaaagcat ttttgacaat ttgtaaaaca ttgatgactt tatatttgtt 1560
acaataaaag ttgatcttta aaataaatat tattaatgaa gcctaaaaaa aaaaa 1615
<210> 7
<211> 486
<212> PRT
<213> Artificial Sequence
<220>
<223> MECP2
<400> 7
Met Val Ala Gly Met Leu Gly Leu Arg Glu Glu Lys Ser Glu Asp Gln
1 5 10 15
Asp Leu Gln Gly Leu Lys Asp Lys Pro Leu Lys Phe Lys Lys Val Lys
20 25 30
Lys Asp Lys Lys Glu Glu Lys Glu Gly Lys His Glu Pro Val Gln Pro
35 40 45
Ser Ala His His Ser Ala Glu Pro Ala Glu Ala Gly Lys Ala Glu Thr
50 55 60
Ser Glu Gly Ser Gly Ser Ala Pro Ala Val Pro Glu Ala Ser Ala Ser
65 70 75 80
Pro Lys Gln Arg Arg Ser Ile Ile Arg Asp Arg Gly Pro Met Tyr Asp
85 90 95
Asp Pro Thr Leu Pro Glu Gly Trp Thr Arg Lys Leu Lys Gln Arg Lys
100 105 110
Ser Gly Arg Ser Ala Gly Lys Tyr Asp Val Tyr Leu Ile Asn Pro Gln
115 120 125
Gly Lys Ala Phe Arg Ser Lys Val Glu Leu Ile Ala Tyr Phe Glu Lys
130 135 140
Val Gly Asp Thr Ser Leu Asp Pro Asn Asp Phe Asp Phe Thr Val Thr
145 150 155 160
Gly Arg Gly Ser Pro Ser Arg Arg Glu Gln Lys Pro Pro Lys Lys Pro
165 170 175
Lys Ser Pro Lys Ala Pro Gly Thr Gly Arg Gly Arg Gly Arg Pro Lys
180 185 190
Gly Ser Gly Thr Thr Arg Pro Lys Ala Ala Thr Ser Glu Gly Val Gln
195 200 205
Val Lys Arg Val Leu Glu Lys Ser Pro Gly Lys Leu Leu Val Lys Met
210 215 220
Pro Phe Gln Thr Ser Pro Gly Gly Lys Ala Glu Gly Gly Gly Ala Thr
225 230 235 240
Thr Ser Thr Gln Val Met Val Ile Lys Arg Pro Gly Arg Lys Arg Lys
245 250 255
Ala Glu Ala Asp Pro Gln Ala Ile Pro Lys Lys Arg Gly Arg Lys Pro
260 265 270
Gly Ser Val Val Ala Ala Ala Ala Ala Glu Ala Lys Lys Lys Ala Val
275 280 285
Lys Glu Ser Ser Ile Arg Ser Val Gln Glu Thr Val Leu Pro Ile Lys
290 295 300
Lys Arg Lys Thr Arg Glu Thr Val Ser Ile Glu Val Lys Glu Val Val
305 310 315 320
Lys Pro Leu Leu Val Ser Thr Leu Gly Glu Lys Ser Gly Lys Gly Leu
325 330 335
Lys Thr Cys Lys Ser Pro Gly Arg Lys Ser Lys Glu Ser Ser Pro Lys
340 345 350
Gly Arg Ser Ser Ser Ala Ser Ser Pro Pro Lys Lys Glu His His His
355 360 365
His His His His Ser Glu Ser Pro Lys Ala Pro Val Pro Leu Leu Pro
370 375 380
Pro Leu Pro Pro Pro Pro Pro Glu Pro Glu Ser Ser Glu Asp Pro Thr
385 390 395 400
Ser Pro Pro Glu Pro Gln Asp Leu Ser Ser Ser Val Cys Lys Glu Glu
405 410 415
Lys Met Pro Arg Gly Gly Ser Leu Glu Ser Asp Gly Cys Pro Lys Glu
420 425 430
Pro Ala Lys Thr Gln Pro Ala Val Ala Thr Ala Ala Thr Ala Ala Glu
435 440 445
Lys Tyr Lys His Arg Gly Glu Gly Glu Arg Lys Asp Ile Val Ser Ser
450 455 460
Ser Met Pro Arg Pro Asn Arg Glu Glu Pro Val Asp Ser Arg Thr Pro
465 470 475 480
Val Thr Glu Arg Val Ser
485
<210> 8
<211> 10241
<212> DNA
<213> Artificial Sequence
<220>
<223> MECP2
<400> 8
ccggcgtcgg cggcgcgcgc gctccctcct ctcggagaga gggctgtggt aaaagccgtc 60
cggaaaatgg ccgccgccgc cgccgccgcg ccgagcggag gaggaggagg aggcgaggag 120
gagagactgc tccataaaaa tacagactca ccagttcctg ctttgatgtg acatgtgact 180
ccccagaata caccttgctt ctgtagacca gctccaacag gattccatgg tagctgggat 240
gttagggctc agggaagaaa agtcagaaga ccaggacctc cagggcctca aggacaaacc 300
cctcaagttt aaaaaggtga agaaagataa gaaagaagag aaagagggca agcatgagcc 360
cgtgcagcca tcagcccacc actctgctga gcccgcagag gcaggcaaag cagagacatc 420
agaagggtca ggctccgccc cggctgtgcc ggaagcttct gcctccccca aacagcggcg 480
ctccatcatc cgtgaccggg gacccatgta tgatgacccc accctgcctg aaggctggac 540
acggaagctt aagcaaagga aatctggccg ctctgctggg aagtatgatg tgtatttgat 600
caatccccag ggaaaagcct ttcgctctaa agtggagttg attgcgtact tcgaaaaggt 660
aggcgacaca tccctggacc ctaatgattt tgacttcacg gtaactggga gagggagccc 720
ctcccggcga gagcagaaac cacctaagaa gcccaaatct cccaaagctc caggaactgg 780
cagaggccgg ggacgcccca aagggagcgg caccacgaga cccaaggcgg ccacgtcaga 840
gggtgtgcag gtgaaaaggg tcctggagaa aagtcctggg aagctccttg tcaagatgcc 900
ttttcaaact tcgccagggg gcaaggctga ggggggtggg gccaccacat ccacccaggt 960
catggtgatc aaacgccccg gcaggaagcg aaaagctgag gccgaccctc aggccattcc 1020
gaagaaacgg ggccgaaagc cggggagtgt ggtggcagcc gctgccgccg aggccaaaaa 1080
gaaagccgtg aaggagtctt ctatccgatc tgtgcaggag accgtactcc ccatcaagaa 1140
gcgcaagacc cgggagacgg tcagcatcga ggtcaaggaa gtggtgaagc ccctgctggt 1200
gtccaccctc ggtgagaaga gcgggaaagg actgaagacc tgtaagagcc ctgggcggaa 1260
aagcaaggag agcagcccca aggggcgcag cagcagcgcc tcctcacccc ccaagaagga 1320
gcaccaccac catcaccacc actcagagtc cccaaaggcc cccgtgccac tgctcccacc 1380
cctgccccca cctccacctg agcccgagag ctccgaggac cccaccagcc cccctgagcc 1440
ccaggacttg agcagcagcg tctgcaaaga ggagaagatg cccagaggag gctcactgga 1500
gagcgacggc tgccccaagg agccagctaa gactcagccc gcggttgcca ccgccgccac 1560
ggccgcagaa aagtacaaac accgagggga gggagagcgc aaagacattg tttcatcctc 1620
catgccaagg ccaaacagag aggagcctgt ggacagccgg acgcccgtga ccgagagagt 1680
tagctgactt tacacggagc ggattgcaaa gcaaaccaac aagaataaag gcagctgttg 1740
tctcttctcc ttatgggtag ggctctgaca aagcttcccg attaactgaa ataaaaaata 1800
tttttttttc tttcagtaaa cttagagttt cgtggcttca gggtgggagt agttggagca 1860
ttggggatgt ttttcttacc gacaagcaca gtcaggttga agacctaacc agggccagaa 1920
gtagctttgc acttttctaa actaggctcc ttcaacaagg cttgctgcag atactactga 1980
ccagacaagc tgttgaccag gcacctcccc tcccgcccaa acctttcccc catgtggtcg 2040
ttagagacag agcgacagag cagttgagag gacactcccg ttttcggtgc catcagtgcc 2100
ccgtctacag ctcccccagc tccccccacc tcccccactc ccaaccacgt tgggacaggg 2160
aggtgtgagg caggagagac agttggattc tttagagaag atggatatga ccagtggcta 2220
tggcctgtgc gatcccaccc gtggtggctc aagtctggcc ccacaccagc cccaatccaa 2280
aactggcaag gacgcttcac aggacaggaa agtggcacct gtctgctcca gctctggcat 2340
ggctaggagg ggggagtccc ttgaactact gggtgtagac tggcctgaac cacaggagag 2400
gatggcccag ggtgaggtgg catggtccat tctcaaggga cgtcctccaa cgggtggcgc 2460
tagaggccat ggaggcagta ggacaaggtg caggcaggct ggcctggggt caggccgggc 2520
agagcacagc ggggtgagag ggattcctaa tcactcagag cagtctgtga cttagtggac 2580
aggggagggg gcaaaggggg aggagaagaa aatgttcttc cagttacttt ccaattctcc 2640
tttagggaca gcttagaatt atttgcacta ttgagtcttc atgttcccac ttcaaaacaa 2700
acagatgctc tgagagcaaa ctggcttgaa ttggtgacat ttagtccctc aagccaccag 2760
atgtgacagt gttgagaact acctggattt gtatatatac ctgcgcttgt tttaaagtgg 2820
gctcagcaca tagggttccc acgaagctcc gaaactctaa gtgtttgctg caattttata 2880
aggacttcct gattggtttc tcttctcccc ttccatttct gccttttgtt catttcatcc 2940
tttcacttct ttcccttcct ccgtcctcct ccttcctagt tcatcccttc tcttccaggc 3000
agccgcggtg cccaaccaca cttgtcggct ccagtcccca gaactctgcc tgccctttgt 3060
cctcctgctg ccagtaccag ccccaccctg ttttgagccc tgaggaggcc ttgggctctg 3120
ctgagtccga cctggcctgt ctgtgaagag caagagagca gcaaggtctt gctctcctag 3180
gtagccccct cttccctggt aagaaaaagc aaaaggcatt tcccaccctg aacaacgagc 3240
cttttcaccc ttctactcta gagaagtgga ctggaggagc tgggcccgat ttggtagttg 3300
aggaaagcac agaggcctcc tgtggcctgc cagtcatcga gtggcccaac aggggctcca 3360
tgccagccga ccttgacctc actcagaagt ccagagtcta gcgtagtgca gcagggcagt 3420
agcggtacca atgcagaact cccaagaccc gagctgggac cagtacctgg gtccccagcc 3480
cttcctctgc tccccctttt ccctcggagt tcttcttgaa tggcaatgtt ttgcttttgc 3540
tcgatgcaga cagggggcca gaacaccaca catttcactg tctgtctggt ccatagctgt 3600
ggtgtagggg cttagaggca tgggcttgct gtgggttttt aattgatcag ttttcatgtg 3660
ggatcccatc tttttaacct ctgttcagga agtccttatc tagctgcata tcttcatcat 3720
attggtatat ccttttctgt gtttacagag atgtctctta tatctaaatc tgtccaactg 3780
agaagtacct tatcaaagta gcaaatgaga cagcagtctt atgcttccag aaacacccac 3840
aggcatgtcc catgtgagct gctgccatga actgtcaagt gtgtgttgtc ttgtgtattt 3900
cagttattgt ccctggcttc cttactatgg tgtaatcatg aaggagtgaa acatcataga 3960
aactgtctag cacttccttg ccagtcttta gtgatcagga accatagttg acagttccaa 4020
tcagtagctt aagaaaaaac cgtgtttgtc tcttctggaa tggttagaag tgagggagtt 4080
tgccccgttc tgtttgtaga gtctcatagt tggactttct agcatatatg tgtccatttc 4140
cttatgctgt aaaagcaagt cctgcaacca aactcccatc agcccaatcc ctgatccctg 4200
atcccttcca cctgctctgc tgatgacccc cccagcttca cttctgactc ttccccagga 4260
agggaagggg ggtcagaaga gagggtgagt cctccagaac tcttcctcca aggacagaag 4320
gctcctgccc ccatagtggc ctcgaactcc tggcactacc aaaggacact tatccacgag 4380
agcgcagcat ccgaccaggt tgtcactgag aagatgttta ttttggtcag ttgggttttt 4440
atgtattata cttagtcaaa tgtaatgtgg cttctggaat cattgtccag agctgcttcc 4500
ccgtcacctg ggcgtcatct ggtcctggta agaggagtgc gtggcccacc aggcccccct 4560
gtcacccatg acagttcatt cagggccgat ggggcagtcg tggttgggaa cacagcattt 4620
caagcgtcac tttatttcat tcgggcccca cctgcagctc cctcaaagag gcagttgccc 4680
agcctctttc ccttccagtt tattccagag ctgccagtgg ggcctgaggc tccttagggt 4740
tttctctcta tttccccctt tcttcctcat tccctcgtct ttcccaaagg catcacgagt 4800
cagtcgcctt tcagcaggca gccttggcgg tttatcgccc tggcaggcag gggccctgca 4860
gctctcatgc tgcccctgcc ttggggtcag gttgacagga ggttggaggg aaagccttaa 4920
gctgcaggat tctcaccagc tgtgtccggc ccagttttgg ggtgtgacct caatttcaat 4980
tttgtctgta cttgaacatt atgaagatgg gggcctcttt cagtgaattt gtgaacagca 5040
gaattgaccg acagctttcc agtacccatg gggctaggtc attaaggcca catccacagt 5100
ctcccccacc cttgttccag ttgttagtta ctacctcctc tcctgacaat actgtatgtc 5160
gtcgagctcc ccccaggtct acccctcccg gccctgcctg ctggtgggct tgtcatagcc 5220
agtgggattg ccggtcttga cagctcagtg agctggagat acttggtcac agccaggcgc 5280
tagcacagct cccttctgtt gatgctgtat tcccatatca aaagacacag gggacaccca 5340
gaaacgccac atcccccaat ccatcagtgc caaactagcc aacggcccca gcttctcagc 5400
tcgctggatg gcggaagctg ctactcgtga gcgccagtgc gggtgcagac aatcttctgt 5460
tgggtggcat cattccaggc ccgaagcatg aacagtgcac ctgggacagg gagcagcccc 5520
aaattgtcac ctgcttctct gcccagcttt tcattgctgt gacagtgatg gcgaaagagg 5580
gtaataacca gacacaaact gccaagttgg gtggagaaag gagtttcttt agctgacaga 5640
atctctgaat tttaaatcac ttagtaagcg gctcaagccc aggagggagc agagggatac 5700
gagcggagtc ccctgcgcgg gaccatctgg aattggttta gcccaagtgg agcctgacag 5760
ccagaactct gtgtcccccg tctaaccaca gctccttttc cagagcattc cagtcaggct 5820
ctctgggctg actgggccag gggaggttac aggtaccagt tctttaagaa gatctttggg 5880
catatacatt tttagcctgt gtcattgccc caaatggatt cctgtttcaa gttcacacct 5940
gcagattcta ggacctgtgt cctagacttc agggagtcag ctgtttctag agttcctacc 6000
atggagtggg tctggaggac ctgcccggtg ggggggcaga gccctgctcc ctccgggtct 6060
tcctactctt ctctctgctc tgacgggatt tgttgattct ctccattttg gtgtctttct 6120
cttttagata ttgtatcaat ctttagaaaa ggcatagtct acttgttata aatcgttagg 6180
atactgcctc ccccagggtc taaaattaca tattagaggg gaaaagctga acactgaagt 6240
cagttctcaa caatttagaa ggaaaaccta gaaaacattt ggcagaaaat tacatttcga 6300
tgtttttgaa tgaatacgag caagctttta caacagtgct gatctaaaaa tacttagcac 6360
ttggcctgag atgcctggtg agcattacag gcaaggggaa tctggaggta gccgacctga 6420
ggacatggct tctgaacctg tcttttggga gtggtatgga aggtggagcg ttcaccagtg 6480
acctggaagg cccagcacca ccctccttcc cactcttctc atcttgacag agcctgcccc 6540
agcgctgacg tgtcaggaaa acacccaggg aactaggaag gcacttctgc ctgaggggca 6600
gcctgccttg cccactcctg ctctgctcgc ctcggatcag ctgagccttc tgagctggcc 6660
tctcactgcc tccccaaggc cccctgcctg ccctgtcagg aggcagaagg aagcaggtgt 6720
gagggcagtg caaggaggga gcacaacccc cagctcccgc tccgggctcc gacttgtgca 6780
caggcagagc ccagaccctg gaggaaatcc tacctttgaa ttcaagaaca tttggggaat 6840
ttggaaatct ctttgccccc aaacccccat tctgtcctac ctttaatcag gtcctgctca 6900
gcagtgagag cagatgaggt gaaaaggcca agaggtttgg ctcctgccca ctgatagccc 6960
ctctccccgc agtgtttgtg tgtcaagtgg caaagctgtt cttcctggtg accctgatta 7020
tatccagtaa cacatagact gtgcgcatag gcctgctttg tctcctctat cctgggcttt 7080
tgttttgctt tttagttttg cttttagttt ttctgtccct tttatttaac gcaccgacta 7140
gacacacaaa gcagttgaat ttttatatat atatctgtat attgcacaat tataaactca 7200
ttttgcttgt ggctccacac acacaaaaaa agacctgtta aaattatacc tgttgcttaa 7260
ttacaatatt tctgataacc atagcatagg acaagggaaa ataaaaaaag aaaaaaaaga 7320
aaaaaaaacg acaaatctgt ctgctggtca cttcttctgt ccaagcagat tcgtggtctt 7380
ttcctcgctt ctttcaaggg ctttcctgtg ccaggtgaag gaggctccag gcagcaccca 7440
ggttttgcac tcttgtttct cccgtgcttg tgaaagaggt cccaaggttc tgggtgcagg 7500
agcgctccct tgacctgctg aagtccggaa cgtagtcggc acagcctggt cgccttccac 7560
ctctgggagc tggagtccac tggggtggcc tgactccccc agtccccttc ccgtgacctg 7620
gtcagggtga gcccatgtgg agtcagcctc gcaggcctcc ctgccagtag ggtccgagtg 7680
tgtttcatcc ttcccactct gtcgagcctg ggggctggag cggagacggg aggcctggcc 7740
tgtctcggaa cctgtgagct gcaccaggta gaacgccagg gaccccagaa tcatgtgcgt 7800
cagtccaagg ggtcccctcc aggagtagtg aagactccag aaatgtccct ttcttctccc 7860
ccatcctacg agtaattgca tttgcttttg taattcttaa tgagcaatat ctgctagaga 7920
gtttagctgt aacagttctt tttgatcatc tttttttaat aattagaaac accaaaaaaa 7980
tccagaaact tgttcttcca aagcagagag cattataatc accagggcca aaagcttccc 8040
tccctgctgt cattgcttct tctgaggcct gaatccaaaa gaaaaacagc cataggccct 8100
ttcagtggcc gggctacccg tgagcccttc ggaggaccag ggctggggca gcctctgggc 8160
ccacatccgg ggccagctcc ggcgtgtgtt cagtgttagc agtgggtcat gatgctcttt 8220
cccacccagc ctgggatagg ggcagaggag gcgaggaggc cgttgccgct gatgtttggc 8280
cgtgaacagg tgggtgtctg cgtgcgtcca cgtgcgtgtt ttctgactga catgaaatcg 8340
acgcccgagt tagcctcacc cggtgacctc tagccctgcc cggatggagc ggggcccacc 8400
cggttcagtg tttctgggga gctggacagt ggagtgcaaa aggcttgcag aacttgaagc 8460
ctgctccttc ccttgctacc acggcctcct ttccgtttga tttgtcactg cttcaatcaa 8520
taacagccgc tccagagtca gtagtcaatg aatatatgac caaatatcac caggactgtt 8580
actcaatgtg tgccgagccc ttgcccatgc tgggctcccg tgtatctgga cactgtaacg 8640
tgtgctgtgt ttgctcccct tccccttcct tctttgccct ttacttgtct ttctggggtt 8700
tttctgtttg ggtttggttt ggtttttatt tctccttttg tgttccaaac atgaggttct 8760
ctctactggt cctcttaact gtggtgttga ggcttatatt tgtgtaattt ttggtgggtg 8820
aaaggaattt tgctaagtaa atctcttctg tgtttgaact gaagtctgta ttgtaactat 8880
gtttaaagta attgttccag agacaaatat ttctagacac tttttcttta caaacaaaag 8940
cattcggagg gagggggatg gtgactgaga tgagagggga gagctgaaca gatgacccct 9000
gcccagatca gccagaagcc acccaaagca gtggagccca ggagtcccac tccaagccag 9060
caagccgaat agctgatgtg ttgccacttt ccaagtcact gcaaaaccag gttttgttcc 9120
gcccagtgga ttcttgtttt gcttcccctc cccccgagat tattaccacc atcccgtgct 9180
tttaaggaaa ggcaagattg atgtttcctt gaggggagcc aggaggggat gtgtgtgtgc 9240
agagctgaag agctggggag aatggggctg ggcccaccca agcaggaggc tgggacgctc 9300
tgctgtgggc acaggtcagg ctaatgttgg cagatgcagc tcttcctgga caggccaggt 9360
ggtgggcatt ctctctccaa ggtgtgcccc gtgggcatta ctgtttaaga cacttccgtc 9420
acatcccacc ccatcctcca gggctcaaca ctgtgacatc tctattcccc accctcccct 9480
tcccagggca ataaaatgac catggagggg gcttgcactc tcttggctgt cacccgatcg 9540
ccagcaaaac ttagatgtga gaaaacccct tcccattcca tggcgaaaac atctccttag 9600
aaaagccatt accctcatta ggcatggttt tgggctccca aaacacctga cagcccctcc 9660
ctcctctgag aggcggagag tgctgactgt agtgaccatt gcatgccggg tgcagcatct 9720
ggaagagcta ggcagggtgt ctgccccctc ctgagttgaa gtcatgctcc cctgtgccag 9780
cccagaggcc gagagctatg gacagcattg ccagtaacac aggccaccct gtgcagaagg 9840
gagctggctc cagcctggaa acctgtctga ggttgggaga ggtgcacttg gggcacaggg 9900
agaggccggg acacacttag ctggagatgt ctctaaaagc cctgtatcgt attcaccttc 9960
agtttttgtg ttttgggaca attactttag aaaataagta ggtcgtttta aaaacaaaaa 10020
ttattgattg cttttttgta gtgttcagaa aaaaggttct ttgtgtatag ccaaatgact 10080
gaaagcactg atatatttaa aaacaaaagg caatttatta aggaaatttg taccatttca 10140
gtaaacctgt ctgaatgtac ctgtatacgt ttcaaaaaca cccccccccc actgaatccc 10200
tgtaacctat ttattatata aagagtttgc cttataaatt t 10241
<210> 9
<211> 5202
<212> PRT
<213> Artificial Sequence
<220>
<223> USH2A
<400> 9
Met Asn Cys Pro Val Leu Ser Leu Gly Ser Gly Phe Leu Phe Gln Val
1 5 10 15
Ile Glu Met Leu Ile Phe Ala Tyr Phe Ala Ser Ile Ser Leu Thr Glu
20 25 30
Ser Arg Gly Leu Phe Pro Arg Leu Glu Asn Val Gly Ala Phe Lys Lys
35 40 45
Val Ser Ile Val Pro Thr Gln Ala Val Cys Gly Leu Pro Asp Arg Ser
50 55 60
Thr Phe Cys His Ser Ser Ala Ala Ala Glu Ser Ile Gln Phe Cys Thr
65 70 75 80
Gln Arg Phe Cys Ile Gln Asp Cys Pro Tyr Arg Ser Ser His Pro Thr
85 90 95
Tyr Thr Ala Leu Phe Ser Ala Gly Leu Ser Ser Cys Ile Thr Pro Asp
100 105 110
Lys Asn Asp Leu His Pro Asn Ala His Ser Asn Ser Ala Ser Phe Ile
115 120 125
Phe Gly Asn His Lys Ser Cys Phe Ser Ser Pro Pro Ser Pro Lys Leu
130 135 140
Met Ala Ser Phe Thr Leu Ala Val Trp Leu Lys Pro Glu Gln Gln Gly
145 150 155 160
Val Met Cys Val Ile Glu Lys Thr Val Asp Gly Gln Ile Val Phe Lys
165 170 175
Leu Thr Ile Ser Glu Lys Glu Thr Met Phe Tyr Tyr Arg Thr Val Asn
180 185 190
Gly Leu Gln Pro Pro Ile Lys Val Met Thr Leu Gly Arg Ile Leu Val
195 200 205
Lys Lys Trp Ile His Leu Ser Val Gln Val His Gln Thr Lys Ile Ser
210 215 220
Phe Phe Ile Asn Gly Val Glu Lys Asp His Thr Pro Phe Asn Ala Arg
225 230 235 240
Thr Leu Ser Gly Ser Ile Thr Asp Phe Ala Ser Gly Thr Val Gln Ile
245 250 255
Gly Gln Ser Leu Asn Gly Leu Glu Gln Phe Val Gly Arg Met Gln Asp
260 265 270
Phe Arg Leu Tyr Gln Val Ala Leu Thr Asn Arg Glu Ile Leu Glu Val
275 280 285
Phe Ser Gly Asp Leu Leu Arg Leu His Ala Gln Ser His Cys Arg Cys
290 295 300
Pro Gly Ser His Pro Arg Val His Pro Leu Ala Gln Arg Tyr Cys Ile
305 310 315 320
Pro Asn Asp Ala Gly Asp Thr Ala Asp Asn Arg Val Ser Arg Leu Asn
325 330 335
Pro Glu Ala His Pro Leu Ser Phe Val Asn Asp Asn Asp Val Gly Thr
340 345 350
Ser Trp Val Ser Asn Val Phe Thr Asn Ile Thr Gln Leu Asn Gln Gly
355 360 365
Val Thr Ile Ser Val Asp Leu Glu Asn Gly Gln Tyr Gln Val Phe Tyr
370 375 380
Ile Ile Ile Gln Phe Phe Ser Pro Gln Pro Thr Glu Ile Arg Ile Gln
385 390 395 400
Arg Lys Lys Glu Asn Ser Leu Asp Trp Glu Asp Trp Gln Tyr Phe Ala
405 410 415
Arg Asn Cys Gly Ala Phe Gly Met Lys Asn Asn Gly Asp Leu Glu Lys
420 425 430
Pro Asp Ser Val Asn Cys Leu Gln Leu Ser Asn Phe Thr Pro Tyr Ser
435 440 445
Arg Gly Asn Val Thr Phe Ser Ile Leu Thr Pro Gly Pro Asn Tyr Arg
450 455 460
Pro Gly Tyr Asn Asn Phe Tyr Asn Thr Pro Ser Leu Gln Glu Phe Val
465 470 475 480
Lys Ala Thr Gln Ile Arg Phe His Phe His Gly Gln Tyr Tyr Thr Thr
485 490 495
Glu Thr Ala Val Asn Leu Arg His Arg Tyr Tyr Ala Val Asp Glu Ile
500 505 510
Thr Ile Ser Gly Arg Cys Gln Cys His Gly His Ala Asp Asn Cys Asp
515 520 525
Thr Thr Ser Gln Pro Tyr Arg Cys Leu Cys Ser Gln Glu Ser Phe Thr
530 535 540
Glu Gly Leu His Cys Asp Arg Cys Leu Pro Leu Tyr Asn Asp Lys Pro
545 550 555 560
Phe Arg Gln Gly Asp Gln Val Tyr Ala Phe Asn Cys Lys Pro Cys Gln
565 570 575
Cys Asn Ser His Ser Lys Ser Cys His Tyr Asn Ile Ser Val Asp Pro
580 585 590
Phe Pro Phe Glu His Phe Arg Gly Gly Gly Gly Val Cys Asp Asp Cys
595 600 605
Glu His Asn Thr Thr Gly Arg Asn Cys Glu Leu Cys Lys Asp Tyr Phe
610 615 620
Phe Arg Gln Val Gly Ala Asp Pro Ser Ala Ile Asp Val Cys Lys Pro
625 630 635 640
Cys Asp Cys Asp Thr Val Gly Thr Arg Asn Gly Ser Ile Leu Cys Asp
645 650 655
Gln Ile Gly Gly Gln Cys Asn Cys Lys Arg His Val Ser Gly Arg Gln
660 665 670
Cys Asn Gln Cys Gln Asn Gly Phe Tyr Asn Leu Gln Glu Leu Asp Pro
675 680 685
Asp Gly Cys Ser Pro Cys Asn Cys Asn Thr Ser Gly Thr Val Asp Gly
690 695 700
Asp Ile Thr Cys His Gln Asn Ser Gly Gln Cys Lys Cys Lys Ala Asn
705 710 715 720
Val Ile Gly Leu Arg Cys Asp His Cys Asn Phe Gly Phe Lys Phe Leu
725 730 735
Arg Ser Phe Asn Asp Val Gly Cys Glu Pro Cys Gln Cys Asn Leu His
740 745 750
Gly Ser Val Asn Lys Phe Cys Asn Pro His Ser Gly Gln Cys Glu Cys
755 760 765
Lys Lys Glu Ala Lys Gly Leu Gln Cys Asp Thr Cys Arg Glu Asn Phe
770 775 780
Tyr Gly Leu Asp Val Thr Asn Cys Lys Ala Cys Asp Cys Asp Thr Ala
785 790 795 800
Gly Ser Leu Pro Gly Thr Val Cys Asn Ala Lys Thr Gly Gln Cys Ile
805 810 815
Cys Lys Pro Asn Val Glu Gly Arg Gln Cys Asn Lys Cys Leu Glu Gly
820 825 830
Asn Phe Tyr Leu Arg Gln Asn Asn Ser Phe Leu Cys Leu Pro Cys Asn
835 840 845
Cys Asp Lys Thr Gly Thr Ile Asn Gly Ser Leu Leu Cys Asn Lys Ser
850 855 860
Thr Gly Gln Cys Pro Cys Lys Leu Gly Val Thr Gly Leu Arg Cys Asn
865 870 875 880
Gln Cys Glu Pro His Arg Tyr Asn Leu Thr Ile Asp Asn Phe Gln His
885 890 895
Cys Gln Met Cys Glu Cys Asp Ser Leu Gly Thr Leu Pro Gly Thr Ile
900 905 910
Cys Asp Pro Ile Ser Gly Gln Cys Leu Cys Val Pro Asn Arg Gln Gly
915 920 925
Arg Arg Cys Asn Gln Cys Gln Pro Gly Phe Tyr Ile Ser Pro Gly Asn
930 935 940
Ala Thr Gly Cys Leu Pro Cys Ser Cys His Thr Thr Gly Ala Val Asn
945 950 955 960
His Ile Cys Asn Ser Leu Thr Gly Gln Cys Val Cys Gln Asp Ala Ser
965 970 975
Ile Ala Gly Gln Arg Cys Asp Gln Cys Lys Asp His Tyr Phe Gly Phe
980 985 990
Asp Pro Gln Thr Gly Arg Cys Gln Pro Cys Asn Cys His Leu Ser Gly
995 1000 1005
Ala Leu Asn Glu Thr Cys His Leu Val Thr Gly Gln Cys Phe Cys Lys
1010 1015 1020
Gln Phe Val Thr Gly Ser Lys Cys Asp Ala Cys Val Pro Ser Ala Ser
1025 1030 1035 1040
His Leu Asp Val Asn Asn Leu Leu Gly Cys Ser Lys Thr Pro Phe Gln
1045 1050 1055
Gln Pro Pro Pro Arg Gly Gln Val Gln Ser Ser Ser Ala Ile Asn Leu
1060 1065 1070
Ser Trp Ser Pro Pro Asp Ser Pro Asn Ala His Trp Leu Thr Tyr Ser
1075 1080 1085
Leu Leu Arg Asp Gly Phe Glu Ile Tyr Thr Thr Glu Asp Gln Tyr Pro
1090 1095 1100
Tyr Ser Ile Gln Tyr Phe Leu Asp Thr Asp Leu Leu Pro Tyr Thr Lys
1105 1110 1115 1120
Tyr Ser Tyr Tyr Ile Glu Thr Thr Asn Val His Gly Ser Thr Arg Ser
1125 1130 1135
Val Ala Val Thr Tyr Lys Thr Lys Pro Gly Val Pro Glu Gly Asn Leu
1140 1145 1150
Thr Leu Ser Tyr Ile Ile Pro Ile Gly Ser Asp Ser Val Thr Leu Thr
1155 1160 1165
Trp Thr Thr Leu Ser Asn Gln Ser Gly Pro Ile Glu Lys Tyr Ile Leu
1170 1175 1180
Ser Cys Ala Pro Leu Ala Gly Gly Gln Pro Cys Val Ser Tyr Glu Gly
1185 1190 1195 1200
His Glu Thr Ser Ala Thr Ile Trp Asn Leu Val Pro Phe Ala Lys Tyr
1205 1210 1215
Asp Phe Ser Val Gln Ala Cys Thr Ser Gly Gly Cys Leu His Ser Leu
1220 1225 1230
Pro Ile Thr Val Thr Thr Ala Gln Ala Pro Pro Gln Arg Leu Ser Pro
1235 1240 1245
Pro Lys Met Gln Lys Ile Ser Ser Thr Glu Leu His Val Glu Trp Ser
1250 1255 1260
Pro Pro Ala Glu Leu Asn Gly Ile Ile Ile Arg Tyr Glu Leu Tyr Met
1265 1270 1275 1280
Arg Arg Leu Arg Ser Thr Lys Glu Thr Thr Ser Glu Glu Ser Arg Val
1285 1290 1295
Phe Gln Ser Ser Gly Trp Leu Ser Pro His Ser Phe Val Glu Ser Ala
1300 1305 1310
Asn Glu Asn Ala Leu Lys Pro Pro Gln Thr Met Thr Thr Ile Thr Gly
1315 1320 1325
Leu Glu Pro Tyr Thr Lys Tyr Glu Phe Arg Val Leu Ala Val Asn Met
1330 1335 1340
Ala Gly Ser Val Ser Ser Ala Trp Val Ser Glu Arg Thr Gly Glu Ser
1345 1350 1355 1360
Ala Pro Val Phe Met Ile Pro Pro Ser Val Phe Pro Leu Ser Ser Tyr
1365 1370 1375
Ser Leu Asn Ile Ser Trp Glu Lys Pro Ala Asp Asn Val Thr Arg Gly
1380 1385 1390
Lys Val Val Gly Tyr Asp Ile Asn Met Leu Ser Glu Gln Ser Pro Gln
1395 1400 1405
Gln Ser Ile Pro Met Ala Phe Ser Gln Leu Leu His Thr Ala Lys Ser
1410 1415 1420
Gln Glu Leu Ser Tyr Thr Val Glu Gly Leu Lys Pro Tyr Arg Ile Tyr
1425 1430 1435 1440
Glu Phe Thr Ile Thr Leu Cys Asn Ser Val Gly Cys Val Thr Ser Ala
1445 1450 1455
Ser Gly Ala Gly Gln Thr Leu Ala Ala Ala Pro Ala Gln Leu Arg Pro
1460 1465 1470
Pro Leu Val Lys Gly Ile Asn Ser Thr Thr Ile His Leu Lys Trp Phe
1475 1480 1485
Pro Pro Glu Glu Leu Asn Gly Pro Ser Pro Ile Tyr Gln Leu Glu Arg
1490 1495 1500
Arg Glu Ser Ser Leu Pro Ala Leu Met Thr Thr Met Met Lys Gly Ile
1505 1510 1515 1520
Arg Phe Ile Gly Asn Gly Tyr Cys Lys Phe Pro Ser Ser Thr His Pro
1525 1530 1535
Val Asn Thr Asp Phe Thr Gly Ile Lys Ala Ser Phe Arg Thr Lys Val
1540 1545 1550
Pro Glu Gly Leu Ile Val Phe Ala Ala Ser Pro Gly Asn Gln Glu Glu
1555 1560 1565
Tyr Phe Ala Leu Gln Leu Lys Lys Gly Arg Leu Tyr Phe Leu Phe Asp
1570 1575 1580
Pro Gln Gly Ser Pro Val Glu Val Thr Thr Thr Asn Asp His Gly Lys
1585 1590 1595 1600
Gln Tyr Ser Asp Gly Lys Trp His Glu Ile Ile Ala Ile Arg His Gln
1605 1610 1615
Ala Phe Gly Gln Ile Thr Leu Asp Gly Ile Tyr Thr Gly Ser Ser Ala
1620 1625 1630
Ile Leu Asn Gly Ser Thr Val Ile Gly Asp Asn Thr Gly Val Phe Leu
1635 1640 1645
Gly Gly Leu Pro Arg Ser Tyr Thr Ile Leu Arg Lys Asp Pro Glu Ile
1650 1655 1660
Ile Gln Lys Gly Phe Val Gly Cys Leu Lys Asp Val His Phe Met Lys
1665 1670 1675 1680
Asn Tyr Asn Pro Ser Ala Ile Trp Glu Pro Leu Asp Trp Gln Ser Ser
1685 1690 1695
Glu Glu Gln Ile Asn Val Tyr Asn Ser Trp Glu Gly Cys Pro Ala Ser
1700 1705 1710
Leu Asn Glu Gly Ala Gln Phe Leu Gly Ala Gly Phe Leu Glu Leu His
1715 1720 1725
Pro Tyr Met Phe His Gly Gly Met Asn Phe Glu Ile Ser Phe Lys Phe
1730 1735 1740
Arg Thr Asp Gln Leu Asn Gly Leu Leu Leu Phe Val Tyr Asn Lys Asp
1745 1750 1755 1760
Gly Pro Asp Phe Leu Ala Met Glu Leu Lys Ser Gly Ile Leu Thr Phe
1765 1770 1775
Arg Leu Asn Thr Ser Leu Ala Phe Thr Gln Val Asp Leu Leu Leu Gly
1780 1785 1790
Leu Ser Tyr Cys Asn Gly Lys Trp Asn Lys Val Ile Ile Lys Lys Glu
1795 1800 1805
Gly Ser Phe Ile Ser Ala Ser Val Asn Gly Leu Met Lys His Ala Ser
1810 1815 1820
Glu Ser Gly Asp Gln Pro Leu Val Val Asn Ser Pro Val Tyr Val Gly
1825 1830 1835 1840
Gly Ile Pro Gln Glu Leu Leu Asn Ser Tyr Gln His Leu Cys Leu Glu
1845 1850 1855
Gln Gly Phe Gly Gly Cys Met Lys Asp Val Lys Phe Thr Arg Gly Ala
1860 1865 1870
Val Val Asn Leu Ala Ser Val Ser Ser Gly Ala Val Arg Val Asn Leu
1875 1880 1885
Asp Gly Cys Leu Ser Thr Asp Ser Ala Val Asn Cys Arg Gly Asn Asp
1890 1895 1900
Ser Ile Leu Val Tyr Gln Gly Lys Glu Gln Ser Val Tyr Glu Gly Gly
1905 1910 1915 1920
Leu Gln Pro Phe Thr Glu Tyr Leu Tyr Arg Val Ile Ala Ser His Glu
1925 1930 1935
Gly Gly Ser Val Tyr Ser Asp Trp Ser Arg Gly Arg Thr Thr Gly Ala
1940 1945 1950
Ala Pro Gln Ser Val Pro Thr Pro Ser Arg Val Arg Ser Leu Asn Gly
1955 1960 1965
Tyr Ser Ile Glu Val Thr Trp Asp Glu Pro Val Val Arg Gly Val Ile
1970 1975 1980
Glu Lys Tyr Ile Leu Lys Ala Tyr Ser Glu Asp Ser Thr Arg Pro Pro
1985 1990 1995 2000
Arg Met Pro Ser Ala Ser Ala Glu Phe Val Asn Thr Ser Asn Leu Thr
2005 2010 2015
Gly Ile Leu Thr Gly Leu Leu Pro Phe Lys Asn Tyr Ala Val Thr Leu
2020 2025 2030
Thr Ala Cys Thr Leu Ala Gly Cys Thr Glu Ser Ser His Ala Leu Asn
2035 2040 2045
Ile Ser Thr Pro Gln Glu Ala Pro Gln Glu Val Gln Pro Pro Val Ala
2050 2055 2060
Lys Ser Leu Pro Ser Ser Leu Leu Leu Ser Trp Asn Pro Pro Lys Lys
2065 2070 2075 2080
Ala Asn Gly Ile Ile Thr Gln Tyr Cys Leu Tyr Met Asp Gly Arg Leu
2085 2090 2095
Ile Tyr Ser Gly Ser Glu Glu Asn Tyr Thr Val Thr Asp Leu Ala Val
2100 2105 2110
Phe Thr Pro His Gln Phe Leu Leu Ser Ala Cys Thr His Val Gly Cys
2115 2120 2125
Thr Asn Ser Ser Trp Val Leu Leu Tyr Thr Ala Gln Leu Pro Pro Glu
2130 2135 2140
His Val Asp Ser Pro Val Leu Thr Val Leu Asp Ser Arg Thr Ile His
2145 2150 2155 2160
Ile Gln Trp Lys Gln Pro Arg Lys Ile Ser Gly Ile Leu Glu Arg Tyr
2165 2170 2175
Val Leu Tyr Met Ser Asn His Thr His Asp Phe Thr Ile Trp Ser Val
2180 2185 2190
Ile Tyr Asn Ser Thr Glu Leu Phe Gln Asp His Met Leu Gln Tyr Val
2195 2200 2205
Leu Pro Gly Asn Lys Tyr Leu Ile Lys Leu Gly Ala Cys Thr Gly Gly
2210 2215 2220
Gly Cys Thr Val Ser Glu Ala Ser Glu Ala Leu Thr Asp Glu Asp Ile
2225 2230 2235 2240
Pro Glu Gly Val Pro Ala Pro Lys Ala His Ser Tyr Ser Pro Asp Ser
2245 2250 2255
Phe Asn Val Ser Trp Thr Glu Pro Glu Tyr Pro Asn Gly Val Ile Thr
2260 2265 2270
Ser Tyr Gly Leu Tyr Leu Asp Gly Ile Leu Ile His Asn Ser Ser Glu
2275 2280 2285
Leu Ser Tyr Arg Ala Tyr Gly Phe Ala Pro Trp Ser Leu His Ser Phe
2290 2295 2300
Arg Val Gln Ala Cys Thr Ala Lys Gly Cys Ala Leu Gly Pro Leu Val
2305 2310 2315 2320
Glu Asn Arg Thr Leu Glu Ala Pro Pro Glu Gly Thr Val Asn Val Phe
2325 2330 2335
Val Lys Thr Gln Gly Ser Arg Lys Ala His Val Arg Trp Glu Ala Pro
2340 2345 2350
Phe Arg Pro Asn Gly Leu Leu Thr His Ser Val Leu Phe Thr Gly Ile
2355 2360 2365
Phe Tyr Val Asp Pro Val Gly Asn Asn Tyr Thr Leu Leu Asn Val Thr
2370 2375 2380
Lys Val Met Tyr Ser Gly Glu Glu Thr Asn Leu Trp Val Leu Ile Asp
2385 2390 2395 2400
Gly Leu Val Pro Phe Thr Asn Tyr Thr Val Gln Val Asn Ile Ser Asn
2405 2410 2415
Ser Gln Gly Ser Leu Ile Thr Asp Pro Ile Thr Ile Ala Met Pro Pro
2420 2425 2430
Gly Ala Pro Asp Gly Val Leu Pro Pro Arg Leu Ser Ser Ala Thr Pro
2435 2440 2445
Thr Ser Leu Gln Val Val Trp Ser Thr Pro Ala Arg Asn Asn Ala Pro
2450 2455 2460
Gly Ser Pro Arg Tyr Gln Leu Gln Met Arg Ser Gly Asp Ser Thr His
2465 2470 2475 2480
Gly Phe Leu Glu Leu Phe Ser Asn Pro Ser Ala Ser Leu Ser Tyr Glu
2485 2490 2495
Val Ser Asp Leu Gln Pro Tyr Thr Glu Tyr Met Phe Arg Leu Val Ala
2500 2505 2510
Ser Asn Gly Phe Gly Ser Ala His Ser Ser Trp Ile Pro Phe Met Thr
2515 2520 2525
Ala Glu Asp Lys Pro Gly Pro Val Val Pro Pro Ile Leu Leu Asp Val
2530 2535 2540
Lys Ser Arg Met Met Leu Val Thr Trp Gln His Pro Arg Lys Ser Asn
2545 2550 2555 2560
Gly Val Ile Thr His Tyr Asn Ile Tyr Leu His Gly Arg Leu Tyr Leu
2565 2570 2575
Arg Thr Pro Gly Asn Val Thr Asn Cys Thr Val Met His Leu His Pro
2580 2585 2590
Tyr Thr Ala Tyr Lys Phe Gln Val Glu Ala Cys Thr Ser Lys Gly Cys
2595 2600 2605
Ser Leu Ser Pro Glu Ser Gln Thr Val Trp Thr Leu Pro Gly Ala Pro
2610 2615 2620
Glu Gly Ile Pro Ser Pro Glu Leu Phe Ser Asp Thr Pro Thr Ser Val
2625 2630 2635 2640
Ile Ile Ser Trp Gln Pro Pro Thr His Pro Asn Gly Leu Val Glu Asn
2645 2650 2655
Phe Thr Ile Glu Arg Arg Val Lys Gly Lys Glu Glu Val Thr Thr Leu
2660 2665 2670
Val Thr Leu Pro Arg Ser His Ser Met Arg Phe Ile Asp Lys Thr Ser
2675 2680 2685
Ala Leu Ser Pro Trp Thr Lys Tyr Glu Tyr Arg Val Leu Met Ser Thr
2690 2695 2700
Leu His Gly Gly Thr Asn Ser Ser Ala Trp Val Glu Val Thr Thr Arg
2705 2710 2715 2720
Pro Ser Arg Pro Ala Gly Val Gln Pro Pro Val Val Thr Val Leu Glu
2725 2730 2735
Pro Asp Ala Val Gln Val Thr Trp Lys Pro Pro Leu Ile Gln Asn Gly
2740 2745 2750
Asp Ile Leu Ser Tyr Glu Ile His Met Pro Asp Pro His Ile Thr Leu
2755 2760 2765
Thr Asn Val Thr Ser Ala Val Leu Ser Gln Lys Val Thr His Leu Ile
2770 2775 2780
Pro Phe Thr Asn Tyr Ser Val Thr Ile Val Ala Cys Ser Gly Gly Asn
2785 2790 2795 2800
Gly Tyr Leu Gly Gly Cys Thr Glu Ser Leu Pro Thr Tyr Val Thr Thr
2805 2810 2815
His Pro Thr Val Pro Gln Asn Val Gly Pro Leu Ser Val Ile Pro Leu
2820 2825 2830
Ser Glu Ser Tyr Val Val Ile Ser Trp Gln Pro Pro Ser Lys Pro Asn
2835 2840 2845
Gly Pro Asn Leu Arg Tyr Glu Leu Leu Arg Arg Lys Ile Gln Gln Pro
2850 2855 2860
Leu Ala Ser Asn Pro Pro Glu Asp Leu Asn Arg Trp His Asn Ile Tyr
2865 2870 2875 2880
Ser Gly Thr Gln Trp Leu Tyr Glu Asp Lys Gly Leu Ser Arg Phe Thr
2885 2890 2895
Thr Tyr Glu Tyr Met Leu Phe Val His Asn Ser Val Gly Phe Thr Pro
2900 2905 2910
Ser Arg Glu Val Thr Val Thr Thr Leu Ala Gly Leu Pro Glu Arg Gly
2915 2920 2925
Ala Asn Leu Thr Ala Ser Val Leu Asn His Thr Ala Ile Asp Val Arg
2930 2935 2940
Trp Ala Lys Pro Thr Val Gln Asp Leu Gln Gly Glu Val Glu Tyr Tyr
2945 2950 2955 2960
Thr Leu Phe Trp Ser Ser Ala Thr Ser Asn Asp Ser Leu Lys Ile Leu
2965 2970 2975
Pro Asp Val Asn Ser His Val Ile Gly His Leu Lys Pro Asn Thr Glu
2980 2985 2990
Tyr Trp Ile Phe Ile Ser Val Phe Asn Gly Val His Ser Ile Asn Ser
2995 3000 3005
Ala Gly Leu His Ala Thr Thr Cys Asp Gly Glu Pro Gln Gly Met Leu
3010 3015 3020
Pro Pro Glu Val Val Ile Ile Asn Ser Thr Ala Val Arg Val Ile Trp
3025 3030 3035 3040
Thr Ser Pro Ser Asn Pro Asn Gly Val Val Thr Glu Tyr Ser Ile Tyr
3045 3050 3055
Val Asn Asn Lys Leu Tyr Lys Thr Gly Met Asn Val Pro Gly Ser Phe
3060 3065 3070
Ile Leu Arg Asp Leu Ser Pro Phe Thr Ile Tyr Asp Ile Gln Val Glu
3075 3080 3085
Val Cys Thr Ile Tyr Ala Cys Val Lys Ser Asn Gly Thr Gln Ile Thr
3090 3095 3100
Thr Val Glu Asp Thr Pro Ser Asp Ile Pro Thr Pro Thr Ile Arg Gly
3105 3110 3115 3120
Ile Thr Ser Arg Ser Leu Gln Ile Asp Trp Val Ser Pro Arg Lys Pro
3125 3130 3135
Asn Gly Ile Ile Leu Gly Tyr Asp Leu Leu Trp Lys Thr Trp Tyr Pro
3140 3145 3150
Cys Ala Lys Thr Gln Lys Leu Val Gln Asp Gln Ser Asp Glu Leu Cys
3155 3160 3165
Lys Ala Val Arg Cys Gln Lys Pro Glu Ser Ile Cys Gly His Ile Cys
3170 3175 3180
Tyr Ser Ser Glu Ala Lys Val Cys Cys Asn Gly Val Leu Tyr Asn Pro
3185 3190 3195 3200
Lys Pro Gly His Arg Cys Cys Glu Glu Lys Tyr Ile Pro Phe Val Leu
3205 3210 3215
Asn Ser Thr Gly Val Cys Cys Gly Gly Arg Ile Gln Glu Ala Gln Pro
3220 3225 3230
Asn His Gln Cys Cys Ser Gly Tyr Tyr Ala Arg Ile Leu Pro Gly Glu
3235 3240 3245
Val Cys Cys Pro Asp Glu Gln His Asn Arg Val Ser Val Gly Ile Gly
3250 3255 3260
Asp Ser Cys Cys Gly Arg Met Pro Tyr Ser Thr Ser Gly Asn Gln Ile
3265 3270 3275 3280
Cys Cys Ala Gly Arg Leu His Asp Gly His Gly Gln Lys Cys Cys Gly
3285 3290 3295
Arg Gln Ile Val Ser Asn Asp Leu Glu Cys Cys Gly Gly Glu Glu Gly
3300 3305 3310
Val Val Tyr Asn Arg Leu Pro Gly Met Phe Cys Cys Gly Gln Asp Tyr
3315 3320 3325
Val Asn Met Ser Asp Thr Ile Cys Cys Ser Ala Ser Ser Gly Glu Ser
3330 3335 3340
Lys Ala His Ile Lys Lys Asn Asp Pro Val Pro Val Lys Cys Cys Glu
3345 3350 3355 3360
Thr Glu Leu Ile Pro Lys Ser Gln Lys Cys Cys Asn Gly Val Gly Tyr
3365 3370 3375
Asn Pro Leu Lys Tyr Val Cys Ser Asp Lys Ile Ser Thr Gly Met Met
3380 3385 3390
Met Lys Glu Thr Lys Glu Cys Arg Ile Leu Cys Pro Ala Ser Met Glu
3395 3400 3405
Ala Thr Glu His Cys Gly Arg Cys Asp Phe Asn Phe Thr Ser His Ile
3410 3415 3420
Cys Thr Val Ile Arg Gly Ser His Asn Ser Thr Gly Lys Ala Ser Ile
3425 3430 3435 3440
Glu Glu Met Cys Ser Ser Ala Glu Glu Thr Ile His Thr Gly Ser Val
3445 3450 3455
Asn Thr Tyr Ser Tyr Thr Asp Val Asn Leu Lys Pro Tyr Met Thr Tyr
3460 3465 3470
Glu Tyr Arg Ile Ser Ala Trp Asn Ser Tyr Gly Arg Gly Leu Ser Lys
3475 3480 3485
Ala Val Arg Ala Arg Thr Lys Glu Asp Val Pro Gln Gly Val Ser Pro
3490 3495 3500
Pro Thr Trp Thr Lys Ile Asp Asn Leu Glu Asp Thr Ile Val Leu Asn
3505 3510 3515 3520
Trp Arg Lys Pro Ile Gln Ser Asn Gly Pro Ile Ile Tyr Tyr Ile Leu
3525 3530 3535
Leu Arg Asn Gly Ile Glu Arg Phe Arg Gly Thr Ser Leu Ser Phe Ser
3540 3545 3550
Asp Lys Glu Gly Ile Gln Pro Phe Gln Glu Tyr Ser Tyr Gln Leu Lys
3555 3560 3565
Ala Cys Thr Val Ala Gly Cys Ala Thr Ser Ser Lys Val Val Ala Ala
3570 3575 3580
Thr Thr Gln Gly Val Pro Glu Ser Ile Leu Pro Pro Ser Ile Thr Ala
3585 3590 3595 3600
Leu Ser Ala Val Ala Leu His Leu Ser Trp Ser Val Pro Glu Lys Ser
3605 3610 3615
Asn Gly Val Ile Lys Glu Tyr Gln Ile Arg Gln Val Gly Lys Gly Leu
3620 3625 3630
Ile His Thr Asp Thr Thr Asp Arg Arg Gln His Thr Val Thr Gly Leu
3635 3640 3645
Gln Pro Tyr Thr Asn Tyr Ser Phe Thr Leu Thr Ala Cys Thr Ser Ala
3650 3655 3660
Gly Cys Thr Ser Ser Glu Pro Phe Leu Gly Gln Thr Leu Gln Ala Ala
3665 3670 3675 3680
Pro Glu Gly Val Trp Val Thr Pro Arg His Ile Ile Ile Asn Ser Thr
3685 3690 3695
Thr Val Glu Leu Tyr Trp Ser Leu Pro Glu Lys Pro Asn Gly Leu Val
3700 3705 3710
Ser Gln Tyr Gln Leu Ser Arg Asn Gly Asn Leu Leu Phe Leu Gly Gly
3715 3720 3725
Ser Glu Glu Gln Asn Phe Thr Asp Lys Asn Leu Glu Pro Asn Ser Arg
3730 3735 3740
Tyr Thr Tyr Lys Leu Glu Val Lys Thr Gly Gly Gly Ser Ser Ala Ser
3745 3750 3755 3760
Asp Asp Tyr Ile Val Gln Thr Pro Met Ser Thr Pro Glu Glu Ile Tyr
3765 3770 3775
Pro Pro Tyr Asn Ile Thr Val Ile Gly Pro Tyr Ser Ile Phe Val Ala
3780 3785 3790
Trp Ile Pro Pro Gly Ile Leu Ile Pro Glu Ile Pro Val Glu Tyr Asn
3795 3800 3805
Val Leu Leu Asn Asp Gly Ser Val Thr Pro Leu Ala Phe Ser Val Gly
3810 3815 3820
His His Gln Ser Thr Leu Leu Glu Asn Leu Thr Pro Phe Thr Gln Tyr
3825 3830 3835 3840
Glu Ile Arg Ile Gln Ala Cys Gln Asn Gly Ser Cys Gly Val Ser Ser
3845 3850 3855
Arg Met Phe Val Lys Thr Pro Glu Ala Ala Pro Met Asp Leu Asn Ser
3860 3865 3870
Pro Val Leu Lys Ala Leu Gly Ser Ala Cys Ile Glu Ile Lys Trp Met
3875 3880 3885
Pro Pro Glu Lys Pro Asn Gly Ile Ile Ile Asn Tyr Phe Ile Tyr Arg
3890 3895 3900
Arg Pro Ala Gly Ile Glu Glu Glu Ser Val Leu Phe Val Trp Ser Glu
3905 3910 3915 3920
Gly Ala Leu Glu Phe Met Asp Glu Gly Asp Thr Leu Arg Pro Phe Thr
3925 3930 3935
Leu Tyr Glu Tyr Arg Val Arg Ala Cys Asn Ser Lys Gly Ser Val Glu
3940 3945 3950
Ser Leu Trp Ser Leu Thr Gln Thr Leu Glu Ala Pro Pro Gln Asp Phe
3955 3960 3965
Pro Ala Pro Trp Ala Gln Ala Thr Ser Ala His Ser Val Leu Leu Asn
3970 3975 3980
Trp Thr Lys Pro Glu Ser Pro Asn Gly Ile Ile Ser His Tyr Arg Val
3985 3990 3995 4000
Val Tyr Gln Glu Arg Pro Asp Asp Pro Thr Phe Asn Ser Pro Thr Val
4005 4010 4015
His Ala Phe Thr Val Lys Gly Thr Ser His Gln Ala His Leu Tyr Gly
4020 4025 4030
Leu Glu Pro Phe Thr Thr Tyr Arg Ile Gly Val Val Ala Ala Asn His
4035 4040 4045
Ala Gly Glu Ile Leu Ser Pro Trp Thr Leu Ile Gln Thr Leu Glu Ser
4050 4055 4060
Ser Pro Ser Gly Leu Arg Asn Phe Ile Val Glu Gln Lys Glu Asn Gly
4065 4070 4075 4080
Arg Ala Leu Leu Leu Gln Trp Ser Glu Pro Met Arg Thr Asn Gly Val
4085 4090 4095
Ile Lys Thr Tyr Asn Ile Phe Ser Asp Gly Phe Leu Glu Tyr Ser Gly
4100 4105 4110
Leu Asn Arg Gln Phe Leu Phe Arg Arg Leu Asp Pro Phe Thr Leu Tyr
4115 4120 4125
Thr Leu Thr Leu Glu Ala Cys Thr Arg Ala Gly Cys Ala His Ser Ala
4130 4135 4140
Pro Gln Pro Leu Trp Thr Asp Glu Ala Pro Pro Asp Ser Gln Leu Ala
4145 4150 4155 4160
Pro Thr Val His Ser Val Lys Ser Thr Ser Val Glu Leu Ser Trp Ser
4165 4170 4175
Glu Pro Val Asn Pro Asn Gly Lys Ile Ile Arg Tyr Glu Val Ile Arg
4180 4185 4190
Arg Cys Phe Glu Gly Lys Ala Trp Gly Asn Gln Thr Ile Gln Ala Asp
4195 4200 4205
Glu Lys Ile Val Phe Thr Glu Tyr Asn Thr Glu Arg Asn Thr Phe Met
4210 4215 4220
Tyr Asn Asp Thr Gly Leu Gln Pro Trp Thr Gln Cys Glu Tyr Lys Ile
4225 4230 4235 4240
Tyr Thr Trp Asn Ser Ala Gly His Thr Cys Ser Ser Trp Asn Val Val
4245 4250 4255
Arg Thr Leu Gln Ala Pro Pro Glu Gly Leu Ser Pro Pro Val Ile Ser
4260 4265 4270
Tyr Val Ser Met Asn Pro Gln Lys Leu Leu Ile Ser Trp Ile Pro Pro
4275 4280 4285
Glu Gln Ser Asn Gly Ile Ile Gln Ser Tyr Arg Leu Gln Arg Asn Glu
4290 4295 4300
Met Leu Tyr Pro Phe Ser Phe Asp Pro Val Thr Phe Asn Tyr Thr Asp
4305 4310 4315 4320
Glu Glu Leu Leu Pro Phe Ser Thr Tyr Ser Tyr Ala Leu Gln Ala Cys
4325 4330 4335
Thr Ser Gly Gly Cys Ser Thr Ser Lys Pro Thr Ser Ile Thr Thr Leu
4340 4345 4350
Glu Ala Ala Pro Ser Glu Val Ser Pro Pro Asp Leu Trp Ala Val Ser
4355 4360 4365
Ala Thr Gln Met Asn Val Cys Trp Ser Pro Pro Thr Val Gln Asn Gly
4370 4375 4380
Lys Ile Thr Lys Tyr Leu Val Arg Tyr Asp Asn Lys Glu Ser Leu Ala
4385 4390 4395 4400
Gly Gln Gly Leu Cys Leu Leu Val Ser His Leu Lys Pro Tyr Ser Gln
4405 4410 4415
Tyr Asn Phe Ser Leu Val Ala Cys Thr Asn Gly Gly Cys Thr Ala Ser
4420 4425 4430
Val Ser Lys Ser Ala Trp Thr Met Glu Ala Leu Pro Glu Asn Met Asp
4435 4440 4445
Ser Pro Thr Leu Gln Val Thr Gly Ser Glu Ser Ile Glu Ile Thr Trp
4450 4455 4460
Lys Pro Pro Arg Asn Pro Asn Gly Gln Ile Arg Ser Tyr Glu Leu Arg
4465 4470 4475 4480
Arg Asp Gly Thr Ile Val Tyr Thr Gly Leu Glu Thr Arg Tyr Arg Asp
4485 4490 4495
Phe Thr Leu Thr Pro Gly Val Glu Tyr Ser Tyr Thr Val Thr Ala Ser
4500 4505 4510
Asn Ser Gln Gly Gly Ile Leu Ser Pro Leu Val Lys Asp Arg Thr Ser
4515 4520 4525
Pro Ser Ala Pro Ser Gly Met Glu Pro Pro Lys Leu Gln Ala Arg Gly
4530 4535 4540
Pro Gln Glu Ile Leu Val Asn Trp Asp Pro Pro Val Arg Thr Asn Gly
4545 4550 4555 4560
Asp Ile Ile Asn Tyr Thr Leu Phe Ile Arg Glu Leu Phe Glu Arg Glu
4565 4570 4575
Thr Lys Ile Ile His Ile Asn Thr Thr His Asn Ser Phe Gly Met Gln
4580 4585 4590
Ser Tyr Ile Val Asn Gln Leu Lys Pro Phe His Arg Tyr Glu Ile Arg
4595 4600 4605
Ile Gln Ala Cys Thr Thr Leu Gly Cys Ala Ser Ser Asp Trp Thr Phe
4610 4615 4620
Ile Gln Thr Pro Glu Ile Ala Pro Leu Met Gln Pro Pro Pro His Leu
4625 4630 4635 4640
Glu Val Gln Met Ala Pro Gly Gly Phe Gln Pro Thr Val Ser Leu Leu
4645 4650 4655
Trp Thr Gly Pro Leu Gln Pro Asn Gly Lys Val Leu Tyr Tyr Glu Leu
4660 4665 4670
Tyr Arg Arg Gln Ile Ala Thr Gln Pro Arg Lys Ser Asn Pro Val Leu
4675 4680 4685
Ile Tyr Asn Gly Ser Ser Thr Ser Phe Ile Asp Ser Glu Leu Leu Pro
4690 4695 4700
Phe Thr Glu Tyr Glu Tyr Gln Val Trp Ala Val Asn Ser Ala Gly Lys
4705 4710 4715 4720
Ala Pro Ser Ser Trp Thr Trp Cys Arg Thr Gly Pro Ala Pro Pro Glu
4725 4730 4735
Gly Leu Arg Ala Pro Thr Phe His Val Ile Ser Ser Thr Gln Ala Val
4740 4745 4750
Val Asn Ile Ser Ala Pro Gly Lys Pro Asn Gly Ile Val Ser Leu Tyr
4755 4760 4765
Arg Leu Phe Ser Ser Ser Ala His Gly Ala Glu Thr Val Leu Ser Glu
4770 4775 4780
Gly Met Ala Thr Gln Gln Thr Leu His Gly Leu Gln Ala Phe Thr Asn
4785 4790 4795 4800
Tyr Ser Ile Gly Val Glu Ala Cys Thr Cys Phe Asn Cys Cys Ser Lys
4805 4810 4815
Gly Pro Thr Ala Glu Leu Arg Thr His Pro Ala Pro Pro Ser Gly Leu
4820 4825 4830
Ser Ser Pro Gln Ile Gly Thr Leu Ala Ser Arg Thr Ala Ser Phe Arg
4835 4840 4845
Trp Ser Pro Pro Met Phe Pro Asn Gly Val Ile His Ser Tyr Glu Leu
4850 4855 4860
Gln Phe His Val Ala Cys Pro Pro Asp Ser Ala Leu Pro Cys Thr Pro
4865 4870 4875 4880
Ser Gln Ile Glu Thr Lys Tyr Thr Gly Leu Gly Gln Lys Ala Ser Leu
4885 4890 4895
Gly Gly Leu Gln Pro Tyr Thr Thr Tyr Lys Leu Arg Val Val Ala His
4900 4905 4910
Asn Glu Val Gly Ser Thr Ala Ser Glu Trp Ile Ser Phe Thr Thr Gln
4915 4920 4925
Lys Glu Leu Pro Gln Tyr Arg Ala Pro Phe Ser Val Asp Ser Asn Leu
4930 4935 4940
Ser Val Val Cys Val Asn Trp Ser Asp Thr Phe Leu Leu Asn Gly Gln
4945 4950 4955 4960
Leu Lys Glu Tyr Val Leu Thr Asp Gly Gly Arg Arg Val Tyr Ser Gly
4965 4970 4975
Leu Asp Thr Thr Leu Tyr Ile Pro Arg Thr Ala Asp Lys Thr Phe Phe
4980 4985 4990
Phe Gln Val Ile Cys Thr Thr Asp Glu Gly Ser Val Lys Thr Pro Leu
4995 5000 5005
Ile Gln Tyr Asp Thr Ser Thr Gly Leu Gly Leu Val Leu Thr Thr Pro
5010 5015 5020
Gly Lys Lys Lys Gly Ser Arg Ser Lys Ser Thr Glu Phe Tyr Ser Glu
5025 5030 5035 5040
Leu Trp Phe Ile Val Leu Met Ala Met Leu Gly Leu Ile Leu Leu Ala
5045 5050 5055
Ile Phe Leu Ser Leu Ile Leu Gln Arg Lys Ile His Lys Glu Pro Tyr
5060 5065 5070
Ile Arg Glu Arg Pro Pro Leu Val Pro Leu Gln Lys Arg Met Ser Pro
5075 5080 5085
Leu Asn Val Tyr Pro Pro Gly Glu Asn His Met Gly Leu Ala Asp Thr
5090 5095 5100
Lys Ile Pro Arg Ser Gly Thr Pro Val Ser Ile Arg Ser Asn Arg Ser
5105 5110 5115 5120
Ala Cys Val Leu Arg Ile Pro Ser Gln Asn Gln Thr Ser Leu Thr Tyr
5125 5130 5135
Ser Gln Gly Ser Leu His Arg Ser Val Ser Gln Leu Met Asp Ile Gln
5140 5145 5150
Asp Lys Lys Val Leu Met Asp Asn Ser Leu Trp Glu Ala Ile Met Gly
5155 5160 5165
His Asn Ser Gly Leu Tyr Val Asp Glu Glu Asp Leu Met Asn Ala Ile
5170 5175 5180
Lys Asp Phe Ser Ser Val Thr Lys Glu Arg Thr Thr Phe Thr Asp Thr
5185 5190 5195 5200
His Leu
<210> 10
<211> 25983
<212> DNA
<213> Artificial Sequence
<220>
<223> USH2A
<400> 10
tgtttgctct gcagaatact ttacctgggc acccaagtct tccttccagc attcctgctg 60
ctacagccta tttgctgagt aaccaggggt tacagcagcg ttgccaggca acgagggaca 120
gcggtcctgt tgaagagcca tttgtcacac tgaggggact ggttgaaatg caataaagaa 180
atgnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 240
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnataccag cagctactca 300
tgtcttcgcc attgctaaga acgtcgttgg tattacctta ctctgagaac gtgtctgcag 360
tttccagaaa atggagtatc gcaacatcac ttaaagtacc ctgcttcaaa gtattgctgg 420
caagtggcgt gggcctgatt atttatttag aaatgcttta tcaggaggag aatgcttttt 480
tgtaaacatg aattgcccag ttctttcatt gggctctggc ttcttgtttc aggtcattga 540
aatgttgatc tttgcctatt ttgcttcaat atccttgact gagtcacgag gtcttttccc 600
aaggctggag aacgtgggag ctttcaagaa agtttccatc gtgccaaccc aagcagtatg 660
tggactccca gaccgaagca ctttttgtca cagctctgct gctgctgaaa gtattcagtt 720
ctgtacccag cggttttgta ttcaggattg cccatacaga tcttcacacc ctacctacac 780
tgcccttttc tcagcaggcc tcagtagctg catcacacca gacaagaatg atctgcatcc 840
taacgcccat agcaattctg caagttttat ttttggaaat cacaagagct gcttttcttc 900
tcctccttct ccaaagctga tggcatcatt taccttagct gtatggctga aacctgagca 960
acaaggtgta atnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 1020
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nngtgtgtta 1080
tagaaaagac agtagatggg cagattgtgt tcaaacttac aatatctgag aaagagacca 1140
tgttttatta tcgcacagta aatggtttgc aacctccaat aaaagtaatg acactgggga 1200
gaattcttgt gaagaaatgg attcatctta gtgtgcagnn nnnnnnnnnn nnnnnnnnnn 1260
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 1320
nnnnnnnnnn nnnnnnnngt gcatcagaca aaaatcagct tctttatcaa tggcgtggag 1380
aaggatcata cacctttcaa tgcaagaact ctaagtggtt caattacaga ttttgcatct 1440
ggtactgtgc aaataggaca gagtttaaat gnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 1500
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 1560
nnnnnnnnnn ngtttagagc agtttgtcgg aagaatgcaa gattttcgat tataccaagt 1620
ggcacttaca aacagnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 1680
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnagaga 1740
ttctggaagt cttctctgga gatcttctca gattgcatgc ccaatcacat tgccgttgcc 1800
ctggcagcca cccgcgggtc caccctttgg cacagcggta ctgcattcct aatgatgcag 1860
gagacacagc tgataataga gtgtcacggt tgaatcctga agcccatcct ctctcttttg 1920
tcaatgataa tgatgttggt acttcatggg tttcaaatgt gtttacaaac attacacagc 1980
ttaatcaagg agtgactatt tcagttgatt tggaaaatgg acagtatcag nnnnnnnnnn 2040
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 2100
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn gtgttttata ttatcattca gttctttagt 2160
ccacaaccaa cggaaataag gattcaaagg aagaaggaaa atagtttaga ttgggaggac 2220
tggcaatatt ttgccaggaa ttgtggtgct tttggaatga aaaacaatgg agatttggaa 2280
aaacctgatt ctgtcaactg tcttcagctt tccaannnnn nnnnnnnnnn nnnnnnnnnn 2340
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 2400
nnnnnnnnnn nnnnntttta ctccatattc ccgtggcaat gtcacattta gcatcctgac 2460
acctggacca aattatcgtc ctggatacaa taacttctat aataccccat ctcttcaaga 2520
gttcgtaaaa gccacgcaaa taaggtttca ttttcatggg cagtactata caactgagac 2580
tgctgttaac ctcagacaca gatattatgc agtggacgaa atcaccatta gtgggagnnn 2640
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 2700
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnatg tcagtgccat ggtcatgccg 2760
ataactgcga cacaacaagc cagccatata gatgcctctg ctcccaggag agcttcactg 2820
aaggacttca tnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 2880
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn ntgtgatcgc 2940
tgcttgcctc tttataatga caagcctttc cgccaaggtg atcaagttta cgctttcaat 3000
tgtaaacctt gtcaatgcaa cagccattcc aaaagctgcc attacaacat ctctgtagac 3060
ccatttcctt ttgagcactt cagaggggga ggaggagttt gtgatgattg tgagcataac 3120
actacagnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 3180
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnngaa ggaactgtga 3240
gctgtgcaag gattactttt tccgacaagt tggtgcagat ccttcggcca tagatgtttg 3300
caaaccctgt gactgtgata cagttggcac tagaaatggt agcattcttt gtgatcagnn 3360
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 3420
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnat tggaggacag tgtaattgta 3480
agagacacgt gtctggcagg cagtgcaatc agtgccagaa tggattctac aatctacaag 3540
agttggatcc tgatggctgc agtccctgta actgcaatac ctctgggaca gtggatggag 3600
atattacctg tcaccaaaat tcaggccagt gcaagtgcaa agcaaacgtt attgnnnnnn 3660
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 3720
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnggctta ggtgtgatca ttgcaatttt 3780
ggatttaaat ttctccgaag ctttaatgat gttggatgtg agccctgcca gtgtaacctc 3840
catggctcag tgaacaaatt ctgcaatcct cactctgggc agtgtgagtg caaaaaagaa 3900
gccaaaggac ttcagtgtga cacctgcaga gaaaactttt atgggttaga tgtcaccaat 3960
tgtaaggcct gtgactgtga cacagctgga tccctccctg ggactgtctg taatgctaag 4020
acagggcagt gcatctgcaa gcccaatgtt gaagggagac agtgcaataa atgtttggag 4080
ggaaacttct acctacggca aaataattct ttcctctgtc tgccttgcaa ctgtgataag 4140
actgggacaa taaatggctc tctgctgtgt aacaaatcaa caggacaatg tccttgcaaa 4200
ttaggggtaa caggtcttcg ctgtaatcag tgtgagcctc acaggtacaa tttgaccatt 4260
gacaattttc aacactgcca gatgtgtgag tgtgattcct tggggacatt acctgggacc 4320
atttgtgacc caatcagtgg ccagtgcctg tgtgtgccta atcgtcaagg aagaaggtgt 4380
aatcagtgtc aaccagnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 4440
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnngttt 4500
ttatatttct ccaggcaatg ccactggctg cctgccatgc tcatgccata caactggtgc 4560
agttaatcac atctgtaata gcctgactgg tcagtgtgtt tgccaagatg cttccattgc 4620
tgggcaacgt tgtgaccaat gcaaagacca ttactttgga tttgatcctc agactggaag 4680
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 4740
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn atgtcagcct tgtaattgtc 4800
atctctcagg agccttgaat gaaacctgtc acttggtcac aggccagtgt ttctgtaaac 4860
aatttgtcac tggctcaaag tgtgatgctt gtgttcccag tgcaagccac ttggatgtca 4920
acaatctatt gggttgcagc aaaannnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 4980
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 5040
nnnnctccat tccagcaacc tccgcccaga ggacaagttc aaagttcttc tgctatcaat 5100
ctctcctgga gtccacctga ttctccaaat gcccactggc ttacttacag tttactcagg 5160
gatggttttg aaatctacac aacagaggat caatacccat acannnnnnn nnnnnnnnnn 5220
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 5280
nnnnnnnnnn nnnnnnnnnn nnngtattca atacttctta gacacagacc tgttaccata 5340
taccaaatat tcctattaca ttgagaccac caatgtgcat ggttcaacaa ggagtgtagc 5400
tgtcacttac aagacaaaac caggggtccc agagggaaac ttgactttaa gttatatcat 5460
tcctattggc tcagactctg tgacacttac ctggacaaca ctctcaaatc aatctggtcc 5520
catagagaaa tatattttgt cctgtgcccc tttggctggt ggtcagccat gtgtttccta 5580
cgaaggtcat gaaacctcag ctaccatctg gaatctggtt ccatttgcca agtacgattt 5640
ttctgtacag gcgtgtacta gcgggggctg tttacacagc ttgcccatta cagtgaccac 5700
agcccaggcc cctccccaaa gactaagtcc acctaagatg cagaaaatca gttctacaga 5760
acttcatgta gaatggtctc caccagcgga actaaatgnn nnnnnnnnnn nnnnnnnnnn 5820
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 5880
nnnnnnnnnn nnnnnnnnga ataattataa gatatgaact atacatgaga agactgagat 5940
ctactaaaga aaccacatct gaggaaagtc gagtttttca gagcagtggt tggctcagtc 6000
ctcattcatt tgtagaatcg gccaatgaaa atgcattaaa acctcctcaa acaatgacaa 6060
ccatcactgg cttggagcca tacaccaagt atgagttcag agtcttagct gtgaatatgg 6120
ctggaagtgt gtcttctgcc tgggtctcag aaagaacggg agaatcagnn nnnnnnnnnn 6180
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 6240
nnnnnnnnnn nnnnnnnnnn nnnnnnnnca cctgtattca tgatccctcc ttcagtcttt 6300
cccctctctt cgtactctct caatatctcc tgggagaagc cagcagataa tgttacaaga 6360
ggaaaagttg tggggtatga catcaatatg ctttctgaac aatcacctca acagtctatt 6420
cccatggcgt tttcacagnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 6480
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnct 6540
gttgcacact gctaaatccc aagaactatc ttacactgta gaaggactga aaccttatag 6600
gatatatgag tttactatta ctctctgcaa ttcagttggt tgtgtgacca gtgcttcggg 6660
agcaggacaa actttagcag cagnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 6720
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 6780
nnncaccagc acaactgagg ccacctctgg ttaaaggaat caacagcaca acaatccatc 6840
ttaagtggtt tccacctgaa gaactgaatg gaccctctcc tatatatcag ctggaaagga 6900
gagagtcatc tctaccagct ctgatgacca cgatgatgaa aggaatccgt ttcataggaa 6960
atgggtattg taaatttccc agctccactc acccagtcaa tacagacttc actgnnnnnn 7020
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 7080
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnngcatta aggccagctt tcgaacaaaa 7140
gtgcctgaag gtttgattgt ctttgcagca tcacctggca atcaggaaga gtattttgca 7200
cttcagttga agaagggacg tctttatttt ctttttgatc ctcagnnnnn nnnnnnnnnn 7260
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 7320
nnnnnnnnnn nnnnnnnnnn nnnnngggtc accagtggaa gtaactacaa ctaatgatca 7380
tggcaaacaa tatagtgatg gaaaatggca tgaaataatt gctattaggc atcaggcttt 7440
tggccaaatc actctggatg ggatatatac agnnnnnnnn nnnnnnnnnn nnnnnnnnnn 7500
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 7560
nnnnnnnnnn nngttcctct gccatcctga atggtagtac tgttattgga gataacacag 7620
gagtctttct gggagggctc ccgcgaagtt ataccatcct caggaaggat cctgnnnnnn 7680
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 7740
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnagataa tccaaaaagg ttttgtgggc 7800
tgtctcaagg atgtacattt tatgaagaat tacaatccgt cagctatttg ggaacctctg 7860
gattggcaga gttctgaaga acaaatcaac gtgtataaca gctgggaggg atgtcccgct 7920
tcattaaatg agggagctca gttcctagga gcagnnnnnn nnnnnnnnnn nnnnnnnnnn 7980
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 8040
nnnnnnnnnn nnnnggttcc tggaacttca tccatatatg tttcatggtg gaatgaactt 8100
tgagatttcc tttaagttca gaactgacca attaaatgga ttgcttcttt tcgtttataa 8160
caaagatgga cctgattttc ttgctnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 8220
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 8280
nnnnnatgga gctgaaaagt ggaatattga ccttccggtt aaataccagt cttgccttta 8340
cacaagtgga tctattgctg gggctatcct attgtaatgg aaagtggaat aaagtcatta 8400
ttaaaaagga aggctctttc atatcagcaa gtgtgaatgg actgatgaag catgcatcgg 8460
agtccggaga ccagccactg gtggtgaatt caccagttta tgtgggagga atcccacagg 8520
aactgctgaa ctcttatcaa catttgtgtt tggaacaagn nnnnnnnnnn nnnnnnnnnn 8580
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 8640
nnnnnnnnnn nnnnnnnnng tttcggtggt tgcatgaagg atgttaaatt tacacggggt 8700
gctgtcgtta acttggcatc tgtgtccagc ggtgctgtca gagtcaatct ggatggatgc 8760
ctatcaactg acagtgctgt taactgcagg ggaaatgact ccatcctggt ttaccaggga 8820
aaagagcaga gtgtttacga gggtggtctc cagcctttta cagnnnnnnn nnnnnnnnnn 8880
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 8940
nnnnnnnnnn nnnnnnnnnn nnnaatacct gtatcgagtg atagcctcgc atgaaggagg 9000
ttcagtatat agtgattgga gtcgaggacg tacaacagga gcagnnnnnn nnnnnnnnnn 9060
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 9120
nnnnnnnnnn nnnnnnnnnn nnnnctccac aaagtgtgcc aactccctca agagtccgca 9180
gcttaaatgg atacagcatt gaggtgacct gggatgaacc tgttgtcaga ggtgtaattg 9240
agaagtacat tctgaaagcc tatagtgagg acagcacccg tccaccccgc atgccctctg 9300
ccagtgctga atttgtcaat acaagcaacc tcacagnnnn nnnnnnnnnn nnnnnnnnnn 9360
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 9420
nnnnnnnnnn nnnnnngcat attgacaggc ttgctaccct tcaaaaacta tgcagtaacc 9480
ctaactgctt gcactttggc tggctgtact gagagctcac atgcattgaa catctctact 9540
ccacaagaag nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 9600
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn ccccacaaga 9660
ggttcagcca ccagtagcca aatcccttcc cagttctttg ctgctctcct ggaacccacc 9720
caaaaaggca aatggtatta taactcagta ctgtttatac atggatggga ggctgatcta 9780
ttcaggcagt gaggagaact acacagtcac agnnnnnnnn nnnnnnnnnn nnnnnnnnnn 9840
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 9900
nnnnnnnnnn nnatttagca gtatttacac cccaccagtt tctactaagt gcatgcacac 9960
atgtgggctg tacaaacagt tcctgggtcc tactgtacac agcacagctg ccaccagaac 10020
acgtggattc cccagttctg actgtcctgg attctagaac tatacacata cannnnnnnn 10080
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 10140
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nngtggaaac aaccaagaaa aataagtggg 10200
attctggaac gctatgtatt atatatgtca aaccatacac atgattttac aatttggagt 10260
gtcatctata acagtacaga acttttccag gatcatatgc tacaatacgt tttacctggt 10320
aataaatatc tcatcaagct gggannnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 10380
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 10440
nnnngcttgc acaggtggtg ggtgcacagt gagtgaggcc agtgaggccc taactgacga 10500
ggacataccc gaaggcgtgc cagcccccaa agcccactca tattcacctg actcctttaa 10560
tgtctcctgg actgagcctg aatatccgaa tgnnnnnnnn nnnnnnnnnn nnnnnnnnnn 10620
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 10680
nnnnnnnnnn nngtgttatc acgagttatg gattatatct agatggtata ttaatccaca 10740
attcctcaga actcagctat cgtgcttacg gatttgctcc ttggagttta cattccttca 10800
gagtccaagc atgcacggcc aaaggttgtg ctctgggccc actgnnnnnn nnnnnnnnnn 10860
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 10920
nnnnnnnnnn nnnnnnnnnn nnnngtggaa aatcgaactc tagaagctcc tcctgaagga 10980
acagtaaatg tgtttgtcaa aacacaggga tcccggaaag cccacgtgag gtgggaagca 11040
ccttttcgcc ctaatggact cttaacacac tcagtccttt tcactgggat attctatgta 11100
gacccagnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 11160
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnntag gtaataacta 11220
cacccttctg aatgtcacaa aagtcatgta cagcggagaa gagacaaacc tttgggtgct 11280
catcgatggg ctggttcctt ttaccaacta tactgtacaa gtgaatattt caaatagcca 11340
aggcagcttg ataactgatc ctataacaat tgcaatgcct ccaggagnnn nnnnnnnnnn 11400
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 11460
nnnnnnnnnn nnnnnnnnnn nnnnnnnctc cagatggcgt gctgcctccc aggctttcat 11520
ctgccactcc aaccagtctt caggttgtct ggtctacacc agctcgtaat aacgctcctg 11580
gctctcccag ataccaactc cagatgaggt ctggcgactc cacccatgga tttctagann 11640
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 11700
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnngt tattttccaa tccttctgca 11760
tcgttaagct atgaagtgag tgatctccaa ccgtacacag agtatatgtt tcggttggtt 11820
gcctccaatg gatttggcag tgcacatagt tcttggattc cattcatgac cgcagaggac 11880
annnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 11940
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn naacctggac ctgtagttcc 12000
tccgattctt ctggatgtga agtcaagaat gatgttggtc acctggcagc atcctagaaa 12060
atccaatggg gttattaccc attataacat ttatctacat ggccgtctat acttgagaac 12120
tcctggaaat gtcactaatt gcacagtgat gcatttacac ccatacactg cctataagtt 12180
tcaggtagaa gcctgcactt caaaaggatg ttccctttca ccagagtccc agactgtatg 12240
gacactccca ggggcaccgg aagggatccc aagtccagag ctgttctctg atactccaac 12300
atctgtgatt atatcttggc aaccccctac ccaccccaat ggcttggtgg agaatttcac 12360
aattgagaga agagtcaaag gaaaggaaga agttactacc ctggtgactc tcccgaggag 12420
tcattccatg aggtttattg acaagacttc tgctcttagc ccatggacaa aatatgaata 12480
tcgggtactg atgagcactc ttcatggagg cacaaacagc agtgcttggg tagaagttac 12540
cacaagaccc tcacgacctg ctggggtgca gccacctgtg gtgacagtgc tggaacccga 12600
tgcagtccag nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 12660
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn gtcacttgga 12720
aacccccact catccagaac ggagacatac ttagctatga gattcacatg cctgaccctc 12780
acatcacttt aaccaatgtg acttccgcag tgttaagtca aaaagttact catctgattc 12840
ctttcactaa ttattctgtc accattgttg cttgctcagg gggtaatggg taccttggag 12900
ggtgcacaga gagtttacct acctatgtta ccactcaccc caccgtacct cagaatgttg 12960
gcccattgtc tgtgattcca ctaagtgaat catatgttgt gatttcttgg caaccaccat 13020
ccaagccaaa tggacctaat ttgagnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 13080
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 13140
nnnnnatatg agcttctgag acgtaaaatc cagcagccac ttgcatcaaa tcccccagaa 13200
gatttaaatc ggtggcacaa tatttattca ggaactcagt ggctttatga agataagggt 13260
cttagcagnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 13320
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnngt ttacaaccta 13380
tgaatatatg ctcttcgtac acaacagtgt gggttttaca ccgagccgag aagtgactgt 13440
gacaacgtta gctggtcttc cagagagagg agccaatctc actgcgagtg tccttaacca 13500
cacagccatc gacgtgaggt gggctaaacc aannnnnnnn nnnnnnnnnn nnnnnnnnnn 13560
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 13620
nnnnnnnnnn nnctgttcaa gacctacaag gtgaagttga atattacaca cttttttgga 13680
gttctgctac ctcaaacgac tctctaaaaa tcttgccaga tgtaaactct catgtcattg 13740
gccacctaaa gccaaacaca gagtattgga tctttatctc tgtcttcaat ggagtccaca 13800
gcatcaacag tgcaggactt catgcaacca cttgcgatgg ggnnnnnnnn nnnnnnnnnn 13860
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 13920
nnnnnnnnnn nnnnnnnnnn nnagcctcag ggcatgcttc ctccagaggt tgtcatcatc 13980
aacagtacag ctgtacgtgt catctggaca tctccttcaa acccaaatgg tgttgtcact 14040
gagtattcta tctatgtaaa taataagctc tacaagactg gaatgaatgt gcctgggtcg 14100
tttattctga gagacctgtc tcccttcact atctatgaca ttcagnnnnn nnnnnnnnnn 14160
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 14220
nnnnnnnnnn nnnnnnnnnn nnnnngttga agtctgcaca atatatgcct gcgtgaaaag 14280
caatggaacc caaattacca ctgtggaaga cactccaagt gatataccaa cacccacaat 14340
tcgtggcatc acttcaagnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 14400
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnat 14460
ctcttcaaat tgattgggtg tctccacgga agccaaatgg catcattctt ggatatgatc 14520
tcctatggaa aacatggtat ccatgcgcta aaactcaaaa gttagtgcag gatcagagtg 14580
atgagctctg caaggcagtg aggtgtcaaa aacctgaatc tatctgtgga cacatttgct 14640
attcttctga agctaagnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 14700
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnngtt 14760
tgttgtaacg gagtgctcta taaccccaag cctggacatc gctgttgtga agaaaagtat 14820
atcccgtttg ttctgaattc tactggagtt tgttgtggtg gccgaataca ggaggcacaa 14880
ccaaatcatc agtgctgctc tgggtattac gctagaattc taccagnnnn nnnnnnnnnn 14940
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 15000
nnnnnnnnnn nnnnnnnnnn nnnnnngtga agtatgctgt ccagatgaac agcacaatcg 15060
ggtttctgtt ggcattggtg attcctgctg tggcagaatg ccgtactcca cctcaggaaa 15120
ccagatttgc tgtgctggga ggcttcatga tggccatggc cagaagtgct gtggcagaca 15180
gattgtgagc aacgatttag agtgttgtgg tggagaagaa ggagtggtgt acaatcgcct 15240
tccagnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 15300
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnngtatg ttctgttgtg 15360
ggcaggatta tgtgaatatg tcagatacca tatgctgctc agcttccagt ggagagtcta 15420
aagcacatat taaaaagaat gacccggtgc cagtaaaatg ctgtgagact gaacttattc 15480
caaagagcca gaaatgctgt aatggagttg gatataatcc tttgaaatat gtttgctctg 15540
acaagatttc aactggaatg atgatgaagn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 15600
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 15660
nnnnnnnnng aaaccaaaga gtgcaggatc ctctgcccag catctatgga agccacagaa 15720
cattgtggca ggtgtgactt caactttacc agccacattt gcactgtgat aagagggtct 15780
cacaattcca cagggaaggc atcaattgaa gaaatgtgtt catctgccga agaaaccatt 15840
catacaggga gtgtaaacac gtactcttac acagnnnnnn nnnnnnnnnn nnnnnnnnnn 15900
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 15960
nnnnnnnnnn nnnnatgtga acctcaagcc ctacatgaca tatgagtaca ggatttctgc 16020
ctggaacagc tatgggcgag gactcagcaa agctgtgaga gccagaacaa aagaagatgt 16080
gcctcaagga gtgagtcccc ctacgtggac caaaatagac aatcttgaag atacaattgt 16140
cttaaactgg agaaaaccta tacaatcaaa tgnnnnnnnn nnnnnnnnnn nnnnnnnnnn 16200
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 16260
nnnnnnnnnn nngtcctatt atttactaca tccttcttcg aaatggaatt gaacgttttc 16320
ggggaacatc actgagcttc tctgataaag agggaattca accatttcag gaatattcat 16380
atcagctgaa agcttgcacg gttgctggct gtgccaccag tagcaagnnn nnnnnnnnnn 16440
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 16500
nnnnnnnnnn nnnnnnnnnn nnnnnnngta gttgcagcta ctacccaagg agttccggag 16560
agcatcctgc caccaagcat cacagcccta agtgcagtgg ctctgcatct gagctggagt 16620
gtccctgaga aatcaaacgg cgtcattaaa gagtaccaga tcaggcaggt tgggaaaggt 16680
ctcatccaca ctgacaccac tgacaggaga cagcatacgg tcacagnnnn nnnnnnnnnn 16740
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 16800
nnnnnnnnnn nnnnnnnnnn nnnnnngtct ccagccatac accaactaca gcttcactct 16860
tacagcttgt acatctgctg ggtgcacttc aagcgagcct tttctaggtc agacactgca 16920
ggcagctcct gaagnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 16980
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnngagttt 17040
gggtgacacc tcgacacatt atcatcaatt ctacaacagt ggaattatat tggagtctgc 17100
cagaaaagcc caatggcctc gtttctcaat atcaattgag tcgtaatgga aacttgcttt 17160
tcctgggtgg cagtgaggag cagaatttca ctgataaaaa cctggagccc aatagcagnn 17220
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 17280
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnat acacttacaa gttagaagtc 17340
aaaactggag gtggcagcag tgctagtgat gattacattg ttcaaacacc tatgtcaaca 17400
ccagaagaaa tctatcctcc atataatatc acagtaattg ggccttattc tatatttgta 17460
gcttggatac caccagnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 17520
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnggat 17580
cctcatcccc gaaattcctg tggagtacaa tgtcttactc aatgatggaa gtgtaacacc 17640
tctggccttc tccgttggtc atcatcaatc cacccttctg gaaaatttga ctccattcac 17700
acagtatgag ataaggatac aagcatgtca aaatgnnnnn nnnnnnnnnn nnnnnnnnnn 17760
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 17820
nnnnnnnnnn nnnnngaagt tgtggagtta gcagtaggat gtttgtcaaa acacctgaag 17880
cagccccaat ggatcttaat tctcctgttc ttaaggcact ggggtcagct tgcatagaga 17940
ttaagtggat gccacctgaa aaaccaaatg gaatcatcat caactacttt atttacagnn 18000
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 18060
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnac gccctgctgg cattgaagag 18120
gagtctgttt tatttgtctg gtcagaagga gcccttgaat ttatggatga aggagacacc 18180
ctgaggcctt tcacactcta cgaatatcgg gtcagagcct gtaactccaa gggttcagtg 18240
gagagtctgt ggtcattaac acaaactctg gaagctccac ctcaagattt tccagctcct 18300
tgggctcaag ccacgagtgc tcattcagtt ctgttgaatt ggacaaagcc agaatctccc 18360
aatggcatta tctcccatta ccgtgtggtc taccaggaga gacccgacga tcctacattt 18420
aacagcccta ccgtgcatgc tttcacagtg aagnnnnnnn nnnnnnnnnn nnnnnnnnnn 18480
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 18540
nnnnnnnnnn nnnggaacaa gccatcaagc ccacctgtac gggttagaac cattcacaac 18600
atatcgcatt ggtgttgtgg ctgcaaacca tgcaggagaa attttaagcc cttggactct 18660
gattcaaacc ttagaatctt ccccaagtgg actgagaaac tttatagtag aacagaaaga 18720
gaatggccgg gcattgctac tacagtggtc agaacctatg agaaccaatg gtgtgattaa 18780
gnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 18840
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nacatacaac atcttcagtg 18900
acgggttcct ggagtactct ggtttgaatc gtcagtttct cttccgccgc ctggatcctt 18960
tcactctcta cacactgacc ctggaggcct gcaccagagc aggttgtgca cactcggcgc 19020
ctcagcctct gtggacagat gaagcccctc cagactctca gctggctcct actgtccact 19080
ctgtgaagtc caccagtgtt gagctgagct ggtctgagcc tgttaaccca aatggaaaaa 19140
taattcgcta tgaagtgatt cgcagatgct tcgagggaaa agcttgggga aatcagacga 19200
tccaggccga cgagaaaatt gttttcacag aatataacac tgaaaggaat acatttatgt 19260
ataatgacac aggtttgcaa ccatggacgc agtgtgaata taaaatctac acttggaatt 19320
cagctgggca tacctgtagc tcttggaatg tggtgaggac attgcaagca cctccagaag 19380
gtctctctcc acctgtgata tcctatgttt ctatgaatcc ccaaaaactg ctgatttcct 19440
ggatcccacc agaacagtct aatggtatta tccagtccta taggcttcaa aggaatgaaa 19500
tgctctatcc ttttagcttt gatcctgtga ctttcaatta cactgatgaa gagcttcttc 19560
ctttttccac ctatagctat gcactccaag cctgcacgag tggaggatgc tccaccagca 19620
aacccaccag catcacaact ctggaggctg ctccatcaga agtcagccct ccagatcttt 19680
gggccgtcag tgccactcaa atgaatgtat gttggtcacc gcccacagtg caaaatggaa 19740
agattactaa atatttagtt agatatgata ataaagagtc ccttgctggc cagggcctgt 19800
gcctgctggt ttcccacctg aagccttact ctcagtataa cttctccctt gtagcctgca 19860
cgaatggagg ttgcacagct agtgtgtcaa aatctgcctg gacaatggag gccctgccag 19920
agaacatgga ctctccaaca ttgcaagtca caggctcaga atcaatagaa atcacctgga 19980
aacctccaag aaacccaaat ggccagatca gaagttatga acttaggagg gatggaacca 20040
ttgtatatac aggcttggaa acacgctatc gtgattttac tctcacccca ggtgtggagt 20100
atagctacac agtaactgcc agcaacagcc aagggggtat tttgagtcct cttgtcaaag 20160
atcgaaccag cccctcagca ccctcaggga tggaacctcc aaaattgcag gccaggggtc 20220
ctcaggagat cttagtgaac tgggaccctc cagtgagaac aaatggtgat atcatcaatt 20280
ataccctctt catccgtgaa ctatttgaaa gagaaactaa aatcatacac ataaacacaa 20340
ctcataattc ttttggtatg cagtcatata tagtaaacca gctgaagcca tttcacagnn 20400
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 20460
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnngt atgaaatacg aattcaagcg 20520
tgcaccaccc tgggatgtgc atcaagtgac tggacattca tacagacccc tgagattgca 20580
cctttgatgc aaccccctcc acatctggag gtacaaatgg ctccaggagg attccagcca 20640
actgtttctc ttttgtggac aggaccgctg cagccaaatg gaaaagtttt gtattacgaa 20700
ttatacagaa gacaaatagc aactcagcct agaaaatcca atccagtcct aatctataac 20760
ggaagctcaa catcttttat agattccgaa ctattgcctt tcacagagta tgagtatcag 20820
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 20880
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn gtctgggcag tgaattctgc 20940
aggaaaagcc cccagtagct ggacatggtg cagaaccggg ccagccccac cagaaggtct 21000
cagagccccc acgttccatg tgatctcttc tacccaagca gtggtcaaca tcagtgcccc 21060
tgggaagccc aacgggatcg tcagtctcta caggctgttc tccagcagcg cccatggggc 21120
tgagacagtg nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 21180
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn ctatccgaag 21240
gcatggccac ccagcagact ctccatggcc ttcaagcctt cactaactac tctattggag 21300
tagaggcctg cacctgcttc aactgttgca gcaaaggacc gacagctgaa ctgagaaccc 21360
atcctgcccc accctcagga ctgtcctctc cacaaatcgg gacgctggcc tcaaggacgg 21420
cctccttccg gtggagtccc cccatgttcc ccaatggtgt cattcacagn nnnnnnnnnn 21480
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 21540
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnc tatgaactcc aattccacgt ggcttgccct 21600
cctgactcag ccctcccctg tactcccagc caaatagaaa caaagtacac ggggctgggg 21660
cagaaagcca gccttggggg tctccagccc tacaccacat acaagctgag agtggtggca 21720
cacaacgagg tgggcagtac ggcttccgag tggatcagtt tcaccaccca aaaagaatnn 21780
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 21840
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnntg cctcagtacc gagccccatt 21900
ttcggtggac agcaatttgt ctgtggtgtg tgtgaactgg agtgacacct tcctcctgaa 21960
cggccaactg aaggagtacg tgttaaccga cggagggcga cgcgtgtaca gcggcttgga 22020
caccaccctc tacataccga gaacggcgga caaaannnnn nnnnnnnnnn nnnnnnnnnn 22080
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 22140
nnnnnnnnnn nnnnnccttc tttttccagg tcatctgcac gactgacgaa ggaagtgtta 22200
agacgccgtt gatccaatat gatacctcta ctggacttgn nnnnnnnnnn nnnnnnnnnn 22260
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 22320
nnnnnnnnnn nnnnnnnnng cttggtccta acaactcctg ggaaaaagaa gggatcgcgg 22380
agcaaaagca cagagttcta cagcgagctg tggttcatag tgttaatggc gatgctgggc 22440
ttgatcttgt tggccatttt tctgtccctg atactacaaa gaaaaatcca caaagagcca 22500
tatatcagag aaagacctcc cttggtacct cttcagaaga ggatgtctcc attgaatgtt 22560
tacccaccgg gggaaaacca tatgnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 22620
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 22680
nnnngggtta gccgatacca aaattccccg gtctgggaca cctgtgagta tccgcagcaa 22740
ccggagtgca tgtgtcctgc gcatcccgag tcaaaaccaa accagcctaa cctactccca 22800
gggttctctt caccgcagcg tcagccagct catggacatt caagacaaga aagtcttgat 22860
ggacaactca ctgtgggaag ccatcatggg ccacaacagt ggactgnnnn nnnnnnnnnn 22920
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 22980
nnnnnnnnnn nnnnnnnnnn nnnnnntatg tggatgaaga ggacctgatg aacgccatca 23040
aggatttcag ctcagtgact aaggaacgca ccacattcac agacacccac ctgtaaagga 23100
tggaaaccca gaagacgtaa ccctggaatg caaggtctgc acccatttcc tcctgggtta 23160
tcactcacac atcataaatg ctgaaaagcc attgtttatt atcctataat tctttaaaga 23220
aatgatgact gtttttgaaa gtgttccttc ctaatagagg tctaagaaat gatatttttc 23280
tcatcttaaa tgagagagaa tattcatatg aaaatacttg atttgctctt attttgtaga 23340
agacaaagaa gtatgtaatt gtcacttggt tctgtttggc agtgatgctc ctggttaact 23400
gaataatcag tggcaatttc aagatggctc acagttgtta gaagtagtaa gttagttact 23460
ggctcaaaaa tgattctgtt gaaaggatgt cactgctgtt catttctatc tgccatttct 23520
gtcagggttg acacaatcct gcaagaatag ttattctaat gatcacagct gctaaatgaa 23580
tcccaaactt tgcaccaggt cgacaaactt ttctgaaggt tctatttatt taccatacat 23640
agggttactt accaaacttt ttgacaaggc tgaaggttct atttatttac aatacatagg 23700
gttactcacc aaactttttg acaaggcaac acataactta cacataaatg tctctgttct 23760
tgcatttatg aattttccaa aaatctaagg agtaaacagc ttatttatac attttgagga 23820
gaaaacaaag tgtttcacta ggaacacctc tacttgaacc aatgttttta tttcatatat 23880
tttatagttt tgaaactagt ttctcataaa attctgtcaa ttcactgaat atcagagaat 23940
actgacatct tcaacctagc acatttcaaa tggaaactac tgttctattt gcaatattag 24000
gctgcgtgaa attttaaaag gaaaaatgta tctgttcctt ctagcattaa catatataca 24060
tgtagagaca agactatacc tatgtgtata tatatgtata tcatgtatat attactctgc 24120
actatatccc ttctttttgg agaactagcc attattttag ccacagaatc agtaagaaca 24180
gatgatatgc aacagtacca attacggttc aaaaatgtct gtcacctgct ctagttggat 24240
tacaaagtca ttggtgaaag tcctatggca agaaaaattt tcttgcaaat catccacata 24300
aaatcagata tttaaatttg ttcttcatgg aaaacagagt aagaaaacct cttgtcttcc 24360
ttcatcctta aaggtctttg tgaccccagg aaaatattga ctctgtctaa cacacaatag 24420
tcacaatact ttttgtgaat ctacaaccag agacaggcaa aaacttgtaa agtaagggat 24480
agtcttactt attctgcctg aaaacaatgt attaccccag ggcccaacag taaaagattg 24540
tggacttttt gggtattgag atttcatcta gctctgtgag agagcagctc ctcagactga 24600
ccaactccta gacaaagttt gccaaccata agtgtcaaaa gcacaggcca gtattaagca 24660
gaagttctac caccttatta gaactgctat aaacaaaagc atctgaaata attgtgcaca 24720
tctggcagtg actgtagaaa atacgaaata tatatttctc gccaagtttt tatactttct 24780
gaaatgaaaa cataggattg actagtttac tggtttttat tcccatatgc cgattctggg 24840
acaataaagt tgtttaaagc tggcacaaat aagcattaac caaggctgtg tccaccttct 24900
gtgagctact taaggtatat aggaaaggag tggtcacaaa cttgcatcct aatccttggt 24960
ggactcttct aagaatacag tttgctagtc acaaagaata gtctacaaat atgctttgct 25020
aggttcagaa gattgagttt atcctgattt ttgaaaaatt aaccaggtat ctttatcact 25080
gtgtattttt ccaagcacag tataaaattt taacaacgca caaaaaaata cagaactgca 25140
ggggatttta tcttggatca ttatccattt aatcatctaa ttagacatga actcagttag 25200
ctgaatcatt tacattttga ctccatagct tagggcagac agaagcctgt atggcttctg 25260
cccagaactc tgtcccctgc tacatgtcta agtttacttg tatttatttc agagaagaac 25320
tctaagatgt tgctttgcta ctttaagtgg tattgcgtgc caagcctcta ttatacaaac 25380
catgcagact cgcctctaga gattctgatt cggttgatct ggggtgtgtg gctgaggcat 25440
cagtactttt taaagcttcc aggtgttcta atgttgagac ccactgatgt tccacaatct 25500
ggaagaaatc atgtacagga ataatatgct atgcacaggg actatgctcc ttggctcacc 25560
ccttctccct tataaacaat gagcagttct tgatgaacct ctttaaattt aaatctcctg 25620
actcacattt taccaattgt acatgccaca ttctcagctt acgaactacc atgttttgtt 25680
attcttaata tcaactgttt ggtaagagta cagttgtttt tatacactct aagaaatgtg 25740
tttataatct actgtaattt ccactaaatg gaacccaaat attaatgtta tggtaccata 25800
tactgatgta aaaatcatgc tggcatccat gaacacaccg gtaaataaaa catagtccaa 25860
gtggaagaat tcattaataa ggaactttta attatgtcac aaatgaatag ttggtttcca 25920
atgcacaaat atcatgtaaa ctaatctaaa gatggtttgc ttaataaata tttgaatgtg 25980
acc 25983
<210> 11
<211> 805
<212> PRT
<213> Artificial Sequence
<220>
<223> CEP290
<400> 11
Met Pro Pro Asn Ile Asn Trp Lys Glu Ile Met Lys Val Asp Pro Asp
1 5 10 15
Asp Leu Pro Arg Gln Glu Glu Leu Ala Asp Asn Leu Leu Ile Ser Leu
20 25 30
Ser Lys Val Glu Val Asn Glu Leu Lys Ser Glu Lys Gln Glu Asn Val
35 40 45
Ile His Leu Phe Arg Ile Thr Gln Ser Leu Met Lys Met Lys Ala Gln
50 55 60
Glu Val Glu Leu Ala Leu Glu Glu Val Glu Lys Ala Gly Glu Glu Gln
65 70 75 80
Ala Lys Phe Glu Asn Gln Leu Lys Thr Lys Val Met Lys Leu Glu Asn
85 90 95
Glu Leu Glu Met Ala Gln Gln Ser Ala Gly Gly Arg Asp Thr Arg Phe
100 105 110
Leu Arg Asn Glu Ile Cys Gln Leu Glu Lys Gln Leu Glu Gln Lys Asp
115 120 125
Arg Glu Leu Glu Asp Met Glu Lys Glu Leu Glu Lys Glu Lys Lys Val
130 135 140
Asn Glu Gln Leu Ala Leu Arg Asn Glu Glu Ala Glu Asn Glu Asn Ser
145 150 155 160
Lys Leu Arg Arg Glu Asn Lys Arg Leu Lys Lys Lys Asn Glu Gln Leu
165 170 175
Cys Gln Asp Ile Ile Asp Tyr Gln Lys Gln Ile Asp Ser Gln Lys Glu
180 185 190
Thr Leu Leu Ser Arg Arg Gly Glu Asp Ser Asp Tyr Arg Ser Gln Leu
195 200 205
Ser Lys Lys Asn Tyr Glu Leu Ile Gln Tyr Leu Asp Glu Ile Gln Thr
210 215 220
Leu Thr Glu Ala Asn Glu Lys Ile Glu Val Gln Asn Gln Glu Met Arg
225 230 235 240
Lys Asn Leu Glu Glu Ser Val Gln Glu Met Glu Lys Met Thr Asp Glu
245 250 255
Tyr Asn Arg Met Lys Ala Ile Val His Gln Thr Asp Asn Val Ile Asp
260 265 270
Gln Leu Lys Lys Glu Asn Asp His Tyr Gln Leu Gln Val Gln Glu Leu
275 280 285
Thr Asp Leu Leu Lys Ser Lys Asn Glu Glu Asp Asp Pro Ile Met Val
290 295 300
Ala Val Asn Ala Lys Val Glu Glu Trp Lys Leu Ile Leu Ser Ser Lys
305 310 315 320
Asp Asp Glu Ile Ile Glu Tyr Gln Gln Met Leu His Asn Leu Arg Glu
325 330 335
Lys Leu Lys Asn Ala Gln Leu Asp Ala Asp Lys Ser Asn Val Met Ala
340 345 350
Leu Gln Gln Gly Ile Gln Glu Arg Asp Ser Gln Ile Lys Met Leu Thr
355 360 365
Glu Gln Val Glu Gln Tyr Thr Lys Glu Met Glu Lys Asn Thr Cys Ile
370 375 380
Ile Glu Asp Leu Lys Asn Glu Leu Gln Arg Asn Lys Gly Ala Ser Thr
385 390 395 400
Leu Ser Gln Gln Thr His Met Lys Ile Gln Ser Thr Leu Asp Ile Leu
405 410 415
Lys Glu Lys Thr Lys Glu Ala Glu Arg Thr Ala Glu Leu Ala Glu Ala
420 425 430
Asp Ala Arg Glu Lys Asp Lys Glu Leu Val Glu Ala Leu Lys Arg Leu
435 440 445
Lys Asp Tyr Glu Ser Gly Val Tyr Gly Leu Glu Asp Ala Val Val Glu
450 455 460
Ile Lys Asn Cys Lys Asn Gln Ile Lys Ile Arg Asp Arg Glu Ile Glu
465 470 475 480
Ile Leu Thr Lys Glu Ile Asn Lys Leu Glu Leu Lys Ile Ser Asp Phe
485 490 495
Leu Asp Glu Asn Glu Ala Leu Arg Glu Arg Val Gly Leu Glu Pro Lys
500 505 510
Thr Met Ile Asp Leu Thr Glu Phe Arg Asn Ser Lys His Leu Lys Gln
515 520 525
Gln Gln Tyr Arg Ala Glu Asn Gln Ile Leu Leu Lys Glu Ile Glu Ser
530 535 540
Leu Glu Glu Glu Arg Leu Asp Leu Lys Lys Lys Ile Arg Gln Met Ala
545 550 555 560
Gln Glu Arg Gly Lys Arg Ser Ala Thr Ser Gly Leu Thr Thr Glu Asp
565 570 575
Leu Asn Leu Thr Glu Asn Ile Ser Gln Gly Asp Arg Ile Ser Glu Arg
580 585 590
Lys Leu Asp Leu Leu Ser Leu Lys Asn Met Ser Glu Ala Gln Ser Lys
595 600 605
Asn Glu Phe Leu Ser Arg Glu Leu Ile Glu Lys Glu Arg Asp Leu Glu
610 615 620
Arg Ser Arg Thr Val Ile Ala Lys Phe Gln Asn Lys Leu Lys Glu Leu
625 630 635 640
Val Glu Glu Asn Lys Gln Leu Glu Glu Gly Met Lys Glu Ile Leu Gln
645 650 655
Ala Ile Lys Glu Met Gln Lys Asp Pro Asp Val Lys Gly Gly Glu Thr
660 665 670
Ser Leu Ile Ile Pro Ser Leu Glu Arg Leu Val Asn Ala Ile Glu Ser
675 680 685
Lys Asn Ala Glu Gly Ile Phe Asp Ala Ser Leu His Leu Lys Ala Gln
690 695 700
Val Asp Gln Leu Thr Gly Arg Asn Glu Glu Leu Arg Gln Glu Leu Arg
705 710 715 720
Glu Ser Arg Lys Glu Ala Ile Asn Tyr Ser Gln Gln Leu Ala Lys Ala
725 730 735
Asn Leu Lys Ile Asp His Leu Glu Lys Glu Thr Ser Leu Leu Arg Gln
740 745 750
Ser Glu Gly Ser Asn Val Val Phe Lys Gly Ile Asp Leu Pro Asp Gly
755 760 765
Ile Ala Pro Ser Ser Ala Ser Ile Ile Asn Ser Gln Asn Glu Tyr Leu
770 775 780
Ile His Leu Leu Gln Glu Leu Glu Asn Lys Glu Lys Lys Val Lys Glu
785 790 795 800
Phe Arg Arg Phe Ser
805
<210> 12
<211> 2846
<212> DNA
<213> Artificial Sequence
<220>
<223> CEP290
<400> 12
ggcccgcggc cgggtccagc ttggtggttg cggtagtgag aggcctccgc tggttgccag 60
gcttggtcta gaggtggagc acagtgaaag aattcaagat gccacctaat ataaactgga 120
aagaaataat gaaagttgac ccagatgacc tgccccgtca agaagaactg gcagataatt 180
tattgatttc cttatccaag gtggaagtaa atgagctaaa aagtgaaaag caagaaaatg 240
tgatacacct tttcagaatt actcagtcac taatgaagat gaaagctcaa gaagtggagc 300
tggctttgga agaagtagaa aaagctggag aagaacaagc aaaatttgaa aatcaattaa 360
aaactaaagt aatgaaactg gaaaatgaac tggagatggc tcagcagtct gcaggtggac 420
gagatactcg gtttttacgt aatgaaattt gccaacttga aaaacaatta gaacaaaaag 480
atagagaatt ggaggacatg gaaaaggagt tggagaaaga gaagaaagtt aatgagcaat 540
tggctcttcg aaatgaggag gcagaaaatg aaaacagcaa attaagaaga gagaacaaac 600
gtctaaagaa aaagaatgaa caactttgtc aggatattat tgactaccag aaacaaatag 660
attcacagaa agaaacactt ttatcaagaa gaggggaaga cagtgactac cgatcacagt 720
tgtctaaaaa aaactatgag cttatccaat atcttgatga aattcagact ttaacagaag 780
ctaatgagaa aattgaagtt cagaatcaag aaatgagaaa aaatttagaa gagtctgtac 840
aggaaatgga gaagatgact gatgaatata atagaatgaa agctattgtg catcagacag 900
ataatgtaat agatcagtta aaaaaagaaa acgatcatta tcaacttcaa gtgcaggagc 960
ttacagatct tctgaaatca aaaaatgaag aagatgatcc aattatggta gctgtcaatg 1020
caaaagtaga agaatggaag ctaattttgt cttctaaaga tgatgaaatt attgagtatc 1080
agcaaatgtt acataaccta agggagaaac ttaagaatgc tcagcttgat gctgataaaa 1140
gtaatgttat ggctctacag cagggtatac aggaacgaga cagtcaaatt aagatgctca 1200
ccgaacaagt agaacaatat acaaaagaaa tggaaaagaa tacttgtatt attgaagatt 1260
tgaaaaatga gctccaaaga aacaaaggtg cttcaaccct ttctcaacag actcatatga 1320
aaattcagtc aacgttagac attttaaaag agaaaactaa agaggctgag agaacagctg 1380
aactggctga ggctgatgct agggaaaagg ataaagaatt agttgaggct ctgaagaggt 1440
taaaagatta tgaatcggga gtatatggtt tagaagatgc tgtcgttgaa ataaagaatt 1500
gtaaaaacca aattaaaata agagatcgag agattgaaat attaacaaag gaaatcaata 1560
aacttgaatt gaagatcagt gatttccttg atgaaaatga ggcacttaga gagcgtgtgg 1620
gccttgaacc aaagacaatg attgatttaa ctgaatttag aaatagcaaa cacttaaaac 1680
agcagcagta cagagctgaa aaccagattc ttttgaaaga gattgaaagt ctagaggaag 1740
aacgacttga tctgaaaaaa aaaattcgtc aaatggctca agaaagagga aaaagaagtg 1800
caacttcagg attaaccact gaggacctga acctaactga aaacatttct caaggagata 1860
gaataagtga aagaaaattg gatttattga gcctcaaaaa tatgagtgaa gcacaatcaa 1920
agaatgaatt tctttcaaga gaactaattg aaaaagaaag agatttagaa aggagtagga 1980
cagtgatagc caaatttcag aataaattaa aagaattagt tgaagaaaat aagcaacttg 2040
aagaaggtat gaaagaaata ttgcaagcaa ttaaggaaat gcagaaagat cctgatgtta 2100
aaggaggaga aacatctcta attatcccta gccttgaaag actagttaat gctatagaat 2160
caaagaatgc agaaggaatc tttgatgcga gtctgcattt gaaagcccaa gttgatcagc 2220
ttaccggaag aaatgaagaa ttaagacagg agctcaggga atctcggaaa gaggctataa 2280
attattcaca gcagttggca aaagctaatt taaagataga ccatcttgaa aaagaaacta 2340
gtcttttacg acaatcagaa ggatcgaatg ttgtttttaa aggaattgac ttacctgatg 2400
ggatagcacc atctagtgcc agtatcatta attctcagaa tgaatattta atacatttgt 2460
tacaggaact agaaaataaa gaaaaaaaag ttaaagaatt tagaagattc tcttgaagat 2520
tacaacagaa aatttgctgt aattcgtcat caacaaagtt tgttgtataa agaataccta 2580
agtgaaaagg agacctggaa aacagaatct aaaacaataa aagaggaaaa gagaaaactt 2640
gaggatcaag tccaacaaga tgctataaaa gtaaaagaat ataataattt gctcaatgct 2700
cttcagatgg attcggatga aatgaaaaaa atacttgcag aaaatagtag gaaaattact 2760
gttttgcaag tgaatgaaaa atcacttata aggcaatata caaccttagt agaattggag 2820
cgacaactta gaaaaaaaaa aaaaaa 2846
<210> 13
<211> 477
<212> PRT
<213> Artificial Sequence
<220>
<223> DRD5
<400> 13
Met Leu Pro Pro Gly Ser Asn Gly Thr Ala Tyr Pro Gly Gln Phe Ala
1 5 10 15
Leu Tyr Gln Gln Leu Ala Gln Gly Asn Ala Val Gly Gly Ser Ala Gly
20 25 30
Ala Pro Pro Leu Gly Pro Ser Gln Val Val Thr Ala Cys Leu Leu Thr
35 40 45
Leu Leu Ile Ile Trp Thr Leu Leu Gly Asn Val Leu Val Cys Ala Ala
50 55 60
Ile Val Arg Ser Arg His Leu Arg Ala Asn Met Thr Asn Val Phe Ile
65 70 75 80
Val Ser Leu Ala Val Ser Asp Leu Phe Val Ala Leu Leu Val Met Pro
85 90 95
Trp Lys Ala Val Ala Glu Val Ala Gly Tyr Trp Pro Phe Gly Ala Phe
100 105 110
Cys Asp Val Trp Val Ala Phe Asp Ile Met Cys Ser Thr Ala Ser Ile
115 120 125
Leu Asn Leu Cys Val Ile Ser Val Asp Arg Tyr Trp Ala Ile Ser Arg
130 135 140
Pro Phe Arg Tyr Lys Arg Lys Met Thr Gln Arg Met Ala Leu Val Met
145 150 155 160
Val Gly Leu Ala Trp Thr Leu Ser Ile Leu Ile Ser Phe Ile Pro Val
165 170 175
Gln Leu Asn Trp His Arg Asp Gln Ala Ala Ser Trp Gly Gly Leu Asp
180 185 190
Leu Pro Asn Asn Leu Ala Asn Trp Thr Pro Trp Glu Glu Asp Phe Trp
195 200 205
Glu Pro Asp Val Asn Ala Glu Asn Cys Asp Ser Ser Leu Asn Arg Thr
210 215 220
Tyr Ala Ile Ser Ser Ser Leu Ile Ser Phe Tyr Ile Pro Val Ala Ile
225 230 235 240
Met Ile Val Thr Tyr Thr Arg Ile Tyr Arg Ile Ala Gln Val Gln Ile
245 250 255
Arg Arg Ile Ser Ser Leu Glu Arg Ala Ala Glu His Ala Gln Ser Cys
260 265 270
Arg Ser Ser Ala Ala Cys Ala Pro Asp Thr Ser Leu Arg Ala Ser Ile
275 280 285
Lys Lys Glu Thr Lys Val Leu Lys Thr Leu Ser Val Ile Met Gly Val
290 295 300
Phe Val Cys Cys Trp Leu Pro Phe Phe Ile Leu Asn Cys Met Val Pro
305 310 315 320
Phe Cys Ser Gly His Pro Glu Gly Pro Pro Ala Gly Phe Pro Cys Val
325 330 335
Ser Glu Thr Thr Phe Asp Val Phe Val Trp Phe Gly Trp Ala Asn Ser
340 345 350
Ser Leu Asn Pro Val Ile Tyr Ala Phe Asn Ala Asp Phe Gln Lys Val
355 360 365
Phe Ala Gln Leu Leu Gly Cys Ser His Phe Cys Ser Arg Thr Pro Val
370 375 380
Glu Thr Val Asn Ile Ser Asn Glu Leu Ile Ser Tyr Asn Gln Asp Ile
385 390 395 400
Val Phe His Lys Glu Ile Ala Ala Ala Tyr Ile His Met Met Pro Asn
405 410 415
Ala Val Thr Pro Gly Asn Arg Glu Val Asp Asn Asp Glu Glu Glu Gly
420 425 430
Pro Phe Asp Arg Met Phe Gln Ile Tyr Gln Thr Ser Pro Asp Gly Asp
435 440 445
Pro Val Ala Glu Ser Val Trp Glu Leu Asp Cys Glu Gly Glu Ile Ser
450 455 460
Leu Asp Lys Ile Thr Pro Phe Thr Pro Asn Gly Phe His
465 470 475
<210> 14
<211> 1673
<212> DNA
<213> Artificial Sequence
<220>
<223> DRD5
<400> 14
cccggcgcag ctcatggtga gcgcctctgg ggctcgaggg tcccttggct gagggggcgc 60
atcctcgggg tgcccgatgg ggctgcctgg gggtcgcagg gctgaagttg ggatcgcgca 120
caaaccgacc ctgcagtcca gcccgaaatg ctgccgccag gcagcaacgg caccgcgtac 180
ccggggcagt tcgctctata ccagcagctg gcgcagggga acgccgtggg gggctcggcg 240
ggggcaccgc cactggggcc ctcacaggtg gtcaccgcct gcctgctgac cctactcatc 300
atctggaccc tgctgggcaa cgtgctggtg tgcgcagcca tcgtgcggag ccgccacctg 360
cgcgccaaca tgaccaacgt cttcatcgtg tctctggccg tgtctgacct tttcgtggcg 420
ctgctggtca tgccctggaa ggcagtcgcc gaggtggccg gttactggcc ctttggagcg 480
ttctgcgacg tctgggtggc cttcgacatc atgtgctcca ctgcctccat cctgaacctg 540
tgcgtcatca gcgtggaccg ctactgggcc atctccaggc ccttccgcta caagcgcaag 600
atgactcagc gcatggcctt ggtcatggtc ggcctggcat ggaccttgtc catcctcatc 660
tccttcattc cggtccagct caactggcac agggaccagg cggcctcttg gggcgggctg 720
gacctgccaa acaacctggc caactggacg ccctgggagg aggacttttg ggagcccgac 780
gtgaatgcag agaactgtga ctccagcctg aatcgaacct acgccatctc ttcctcgctc 840
atcagcttct acatccccgt tgccatcatg atcgtgacct acacgcgcat ctaccgcatc 900
gcccaggtgc agatccgcag gatttcctcc ctggagaggg ccgcagagca cgcgcagagc 960
tgccggagca gcgcagcctg cgcgcccgac accagcctgc gcgcttccat caagaaggag 1020
accaaggttc tcaagaccct gtcggtgatc atgggggtct tcgtgtgttg ctggctgccc 1080
ttcttcatcc ttaactgcat ggtccctttc tgcagtggac accctgaagg ccctccggcc 1140
ggcttcccct gcgtcagtga gaccaccttc gacgtcttcg tctggttcgg ctgggctaac 1200
tcctcactca accccgtcat ctatgccttc aacgccgact ttcagaaggt gtttgcccag 1260
ctgctggggt gcagccactt ctgctcccgc acgccggtgg agacggtgaa catcagcaat 1320
gagctcatct cctacaacca agacatcgtc ttccacaagg aaatcgcagc tgcctacatc 1380
cacatgatgc ccaacgccgt tacccccggc aaccgggagg tggacaacga cgaggaggag 1440
ggtcctttcg atcgcatgtt ccagatctat cagacgtccc cagatggtga ccctgttgct 1500
gagtctgtct gggagctgga ctgcgagggg gagatttctt tagacaaaat aacacctttc 1560
accccgaatg gattccatta aactgcatta agaaaccccc tcatggatct gcataaccgc 1620
acagacactg acaagcacgc acacacacgc aaatacatgc ctttccagta ctg 1673
<210> 15
<211> 531
<212> PRT
<213> Artificial Sequence
<220>
<223> ASIC3-1
<400> 15
Met Lys Pro Thr Ser Gly Pro Glu Glu Ala Arg Arg Pro Ala Ser Asp
1 5 10 15
Ile Arg Val Phe Ala Ser Asn Cys Ser Met His Gly Leu Gly His Val
20 25 30
Phe Gly Pro Gly Ser Leu Ser Leu Arg Arg Gly Met Trp Ala Ala Ala
35 40 45
Val Val Leu Ser Val Ala Thr Phe Leu Tyr Gln Val Ala Glu Arg Val
50 55 60
Arg Tyr Tyr Arg Glu Phe His His Gln Thr Ala Leu Asp Glu Arg Glu
65 70 75 80
Ser His Arg Leu Ile Phe Pro Ala Val Thr Leu Cys Asn Ile Asn Pro
85 90 95
Leu Arg Arg Ser Arg Leu Thr Pro Asn Asp Leu His Trp Ala Gly Ser
100 105 110
Ala Leu Leu Gly Leu Asp Pro Ala Glu His Ala Ala Phe Leu Arg Ala
115 120 125
Leu Gly Arg Pro Pro Ala Pro Pro Gly Phe Met Pro Ser Pro Thr Phe
130 135 140
Asp Met Ala Gln Leu Tyr Ala Arg Ala Gly His Ser Leu Asp Asp Met
145 150 155 160
Leu Leu Asp Cys Arg Phe Arg Gly Gln Pro Cys Gly Pro Glu Asn Phe
165 170 175
Thr Thr Ile Phe Thr Arg Met Gly Lys Cys Tyr Thr Phe Asn Ser Gly
180 185 190
Ala Asp Gly Ala Glu Leu Leu Thr Thr Thr Arg Gly Gly Met Gly Asn
195 200 205
Gly Leu Asp Ile Met Leu Asp Val Gln Gln Glu Glu Tyr Leu Pro Val
210 215 220
Trp Arg Asp Asn Glu Glu Thr Pro Phe Glu Val Gly Ile Arg Val Gln
225 230 235 240
Ile His Ser Gln Glu Glu Pro Pro Ile Ile Asp Gln Leu Gly Leu Gly
245 250 255
Val Ser Pro Gly Tyr Gln Thr Phe Val Ser Cys Gln Gln Gln Gln Leu
260 265 270
Ser Phe Leu Pro Pro Pro Trp Gly Asp Cys Ser Ser Ala Ser Leu Asn
275 280 285
Pro Asn Tyr Glu Pro Glu Pro Ser Asp Pro Leu Gly Ser Pro Ser Pro
290 295 300
Ser Pro Ser Pro Pro Tyr Thr Leu Met Gly Cys Arg Leu Ala Cys Glu
305 310 315 320
Thr Arg Tyr Val Ala Arg Lys Cys Gly Cys Arg Met Val Tyr Met Pro
325 330 335
Gly Asp Val Pro Val Cys Ser Pro Gln Gln Tyr Lys Asn Cys Ala His
340 345 350
Pro Ala Ile Asp Ala Met Leu Arg Lys Asp Ser Cys Ala Cys Pro Asn
355 360 365
Pro Cys Ala Ser Thr Arg Tyr Ala Lys Glu Leu Ser Met Val Arg Ile
370 375 380
Pro Ser Arg Ala Ala Ala Arg Phe Leu Ala Arg Lys Leu Asn Arg Ser
385 390 395 400
Glu Ala Tyr Ile Ala Glu Asn Val Leu Ala Leu Asp Ile Phe Phe Glu
405 410 415
Ala Leu Asn Tyr Glu Thr Val Glu Gln Lys Lys Ala Tyr Glu Met Ser
420 425 430
Glu Leu Leu Gly Asp Ile Gly Gly Gln Met Gly Leu Phe Ile Gly Ala
435 440 445
Ser Leu Leu Thr Ile Leu Glu Ile Leu Asp Tyr Leu Cys Glu Val Phe
450 455 460
Arg Asp Lys Val Leu Gly Tyr Phe Trp Asn Arg Gln His Ser Gln Arg
465 470 475 480
His Ser Ser Thr Asn Leu Leu Gln Glu Gly Leu Gly Ser His Arg Thr
485 490 495
Gln Val Pro His Leu Ser Leu Gly Pro Arg Pro Pro Thr Pro Pro Cys
500 505 510
Ala Val Thr Lys Thr Leu Ser Ala Ser His Arg Thr Cys Tyr Leu Val
515 520 525
Thr Gln Leu
530
<210> 16
<211> 2053
<212> DNA
<213> Artificial Sequence
<220>
<223> ASIC3-1
<400> 16
ctctgcagca gcgccggctc agcaccgccg gctcagcacc gctccgcagc ccctgcctgc 60
cacggtcagc tacgtcccac ctggtctgct gcggagtccc cagcccagtg cctagcccag 120
tggagccacc gcctgttcct cgggaaggaa cagtgggacc tgaccggcca gatcacctcc 180
tccaatcctg ccaggctagt gcctccctgc cttccaacct tggctgtctc ccaccctctc 240
ttctcctctc cttgcctggc ctcctgaatc ctatcttagc ctccttagcc ccctgactga 300
ctctctctcg cttcttccaa gcctctgtag ctggttccgc tcctgggttc tggccatgaa 360
gcccacctca ggcccagagg aggcccggcg gccagcctcg gacatccgcg tgttcgccag 420
caactgctcg atgcacgggc tgggccacgt cttcgggcca ggcagcctga gcctgcgccg 480
ggggatgtgg gcagcggccg tggtcctgtc agtggccacc ttcctctacc aggtggctga 540
gagggtgcgc tactacaggg agttccacca ccagactgcc ctggatgagc gagaaagcca 600
ccggctcatc ttcccggctg tcaccctgtg caacatcaac ccactgcgcc gctcgcgcct 660
aacgcccaac gacctgcact gggctgggtc tgcgctgctg ggcctggatc ccgcagagca 720
cgccgccttc ctgcgcgccc tgggccggcc ccctgcaccg cccggcttca tgcccagtcc 780
cacctttgac atggcgcaac tctatgcccg tgctgggcac tccctggatg acatgctgct 840
ggactgtcgc ttccgtggcc aaccttgtgg gcctgagaac ttcaccacga tcttcacccg 900
gatgggaaag tgctacacat ttaactctgg cgctgatggg gcagagctgc tcaccactac 960
taggggtggc atgggcaatg ggctggacat catgctggac gtgcagcagg aggaatatct 1020
acctgtgtgg agggacaatg aggagacccc gtttgaggtg gggatccgag tgcagatcca 1080
cagccaggag gagccgccca tcatcgatca gctgggcttg ggggtgtccc cgggctacca 1140
gacctttgtt tcttgccagc agcagcagct gagcttcctg ccaccgccct ggggcgattg 1200
cagttcagca tctctgaacc ccaactatga gccagagccc tctgatcccc taggctcccc 1260
cagccccagc cccagccctc cctataccct tatggggtgt cgcctggcct gcgaaacccg 1320
ctacgtggct cggaagtgcg gctgccgaat ggtgtacatg ccaggcgacg tgccagtgtg 1380
cagcccccag cagtacaaga actgtgccca cccggccata gatgccatgc ttcgcaagga 1440
ctcgtgcgcc tgccccaacc cgtgcgccag cacgcgctac gccaaggagc tctccatggt 1500
gcggatcccg agccgcgccg ccgcgcgctt cctggcccgg aagctcaacc gcagcgaggc 1560
ctacatcgcg gagaacgtgc tggccctgga catcttcttt gaggccctca actatgagac 1620
cgtggagcag aagaaggcct atgagatgtc agagctgctt ggtgacattg ggggccagat 1680
ggggctgttc atcggggcca gcctgctcac catcctcgag atcctagact acctctgtga 1740
ggtgttccga gacaaggtcc tgggatattt ctggaaccga cagcactccc aaaggcactc 1800
cagcaccaat ctgcttcagg aagggctggg cagccatcga acccaagttc cccacctcag 1860
cctgggcccc agacctccca cccctccctg tgccgtcacc aagactctct ccgcctccca 1920
ccgcacctgc taccttgtca cacagctcta gacctgctgt ctgtgtcctc ggagccccgc 1980
cctgacatcc tggacatgcc tagcctgcac gtagcttttc cgtcttcacc ccaaataaag 2040
tcctaatgca tca 2053
<210> 17
<211> 549
<212> PRT
<213> Artificial Sequence
<220>
<223> ASIC3-2
<400> 17
Met Lys Pro Thr Ser Gly Pro Glu Glu Ala Arg Arg Pro Ala Ser Asp
1 5 10 15
Ile Arg Val Phe Ala Ser Asn Cys Ser Met His Gly Leu Gly His Val
20 25 30
Phe Gly Pro Gly Ser Leu Ser Leu Arg Arg Gly Met Trp Ala Ala Ala
35 40 45
Val Val Leu Ser Val Ala Thr Phe Leu Tyr Gln Val Ala Glu Arg Val
50 55 60
Arg Tyr Tyr Arg Glu Phe His His Gln Thr Ala Leu Asp Glu Arg Glu
65 70 75 80
Ser His Arg Leu Ile Phe Pro Ala Val Thr Leu Cys Asn Ile Asn Pro
85 90 95
Leu Arg Arg Ser Arg Leu Thr Pro Asn Asp Leu His Trp Ala Gly Ser
100 105 110
Ala Leu Leu Gly Leu Asp Pro Ala Glu His Ala Ala Phe Leu Arg Ala
115 120 125
Leu Gly Arg Pro Pro Ala Pro Pro Gly Phe Met Pro Ser Pro Thr Phe
130 135 140
Asp Met Ala Gln Leu Tyr Ala Arg Ala Gly His Ser Leu Asp Asp Met
145 150 155 160
Leu Leu Asp Cys Arg Phe Arg Gly Gln Pro Cys Gly Pro Glu Asn Phe
165 170 175
Thr Thr Ile Phe Thr Arg Met Gly Lys Cys Tyr Thr Phe Asn Ser Gly
180 185 190
Ala Asp Gly Ala Glu Leu Leu Thr Thr Thr Arg Gly Gly Met Gly Asn
195 200 205
Gly Leu Asp Ile Met Leu Asp Val Gln Gln Glu Glu Tyr Leu Pro Val
210 215 220
Trp Arg Asp Asn Glu Glu Thr Pro Phe Glu Val Gly Ile Arg Val Gln
225 230 235 240
Ile His Ser Gln Glu Glu Pro Pro Ile Ile Asp Gln Leu Gly Leu Gly
245 250 255
Val Ser Pro Gly Tyr Gln Thr Phe Val Ser Cys Gln Gln Gln Gln Leu
260 265 270
Ser Phe Leu Pro Pro Pro Trp Gly Asp Cys Ser Ser Ala Ser Leu Asn
275 280 285
Pro Asn Tyr Glu Pro Glu Pro Ser Asp Pro Leu Gly Ser Pro Ser Pro
290 295 300
Ser Pro Ser Pro Pro Tyr Thr Leu Met Gly Cys Arg Leu Ala Cys Glu
305 310 315 320
Thr Arg Tyr Val Ala Arg Lys Cys Gly Cys Arg Met Val Tyr Met Pro
325 330 335
Gly Asp Val Pro Val Cys Ser Pro Gln Gln Tyr Lys Asn Cys Ala His
340 345 350
Pro Ala Ile Asp Ala Met Leu Arg Lys Asp Ser Cys Ala Cys Pro Asn
355 360 365
Pro Cys Ala Ser Thr Arg Tyr Ala Lys Glu Leu Ser Met Val Arg Ile
370 375 380
Pro Ser Arg Ala Ala Ala Arg Phe Leu Ala Arg Lys Leu Asn Arg Ser
385 390 395 400
Glu Ala Tyr Ile Ala Glu Asn Val Leu Ala Leu Asp Ile Phe Phe Glu
405 410 415
Ala Leu Asn Tyr Glu Thr Val Glu Gln Lys Lys Ala Tyr Glu Met Ser
420 425 430
Glu Leu Leu Gly Asp Ile Gly Gly Gln Met Gly Leu Phe Ile Gly Ala
435 440 445
Ser Leu Leu Thr Ile Leu Glu Ile Leu Asp Tyr Leu Cys Glu Val Phe
450 455 460
Arg Asp Lys Val Leu Gly Tyr Phe Trp Asn Arg Gln His Ser Gln Arg
465 470 475 480
His Ser Ser Thr Asn Leu Leu Gln Glu Gly Leu Gly Ser His Arg Thr
485 490 495
Gln Val Pro His Leu Ser Leu Gly Pro Ser Thr Leu Leu Cys Ser Glu
500 505 510
Asp Leu Pro Pro Leu Pro Val Pro Ser Pro Arg Leu Ser Pro Pro Pro
515 520 525
Thr Ala Pro Ala Thr Leu Ser His Ser Ser Arg Pro Ala Val Cys Val
530 535 540
Leu Gly Ala Pro Pro
545
<210> 18
<211> 2314
<212> DNA
<213> Artificial Sequence
<220>
<223> ASIC3-2
<400> 18
tacaggggtt gcaactggga gcctaggggg ccccaaggca tctccaggcc caatctacct 60
ctgggctttt ctcaagctct ccctaggatt actgcggttt cctcctggcg cctctcgtct 120
tggacagcca tgcccccctc catgctgcac taatggctca gcctggggcc ctagggacct 180
ctcctacccc ccagactgct ctgtcggccc cctttccccc ctactgctga aacccaatcc 240
tctgcagcag cgccggctca gcaccgccgg ctcagcaccg ctccgcagcc cctgcctgcc 300
acggtcagct acgtcccacc tggtctgctg cggagtcccc agcccagtgc ctagcccagt 360
ggagccaccg cctgttcctc gggaaggaac agtgggacct gaccggccag atcacctcct 420
ccaatcctgc caggctagtg cctccctgcc ttccaacctt ggctgtctcc caccctctct 480
tctcctctcc ttgcctggcc tcctgaatcc tatcttagcc tccttagccc cctgactgac 540
tctctctcgc ttcttccaag cctctgtagc tggttccgct cctgggttct ggccatgaag 600
cccacctcag gcccagagga ggcccggcgg ccagcctcgg acatccgcgt gttcgccagc 660
aactgctcga tgcacgggct gggccacgtc ttcgggccag gcagcctgag cctgcgccgg 720
gggatgtggg cagcggccgt ggtcctgtca gtggccacct tcctctacca ggtggctgag 780
agggtgcgct actacaggga gttccaccac cagactgccc tggatgagcg agaaagccac 840
cggctcatct tcccggctgt caccctgtgc aacatcaacc cactgcgccg ctcgcgccta 900
acgcccaacg acctgcactg ggctgggtct gcgctgctgg gcctggatcc cgcagagcac 960
gccgccttcc tgcgcgccct gggccggccc cctgcaccgc ccggcttcat gcccagtccc 1020
acctttgaca tggcgcaact ctatgcccgt gctgggcact ccctggatga catgctgctg 1080
gactgtcgct tccgtggcca accttgtggg cctgagaact tcaccacgat cttcacccgg 1140
atgggaaagt gctacacatt taactctggc gctgatgggg cagagctgct caccactact 1200
aggggtggca tgggcaatgg gctggacatc atgctggacg tgcagcagga ggaatatcta 1260
cctgtgtgga gggacaatga ggagaccccg tttgaggtgg ggatccgagt gcagatccac 1320
agccaggagg agccgcccat catcgatcag ctgggcttgg gggtgtcccc gggctaccag 1380
acctttgttt cttgccagca gcagcagctg agcttcctgc caccgccctg gggcgattgc 1440
agttcagcat ctctgaaccc caactatgag ccagagccct ctgatcccct aggctccccc 1500
agccccagcc ccagccctcc ctataccctt atggggtgtc gcctggcctg cgaaacccgc 1560
tacgtggctc ggaagtgcgg ctgccgaatg gtgtacatgc caggcgacgt gccagtgtgc 1620
agcccccagc agtacaagaa ctgtgcccac ccggccatag atgccatgct tcgcaaggac 1680
tcgtgcgcct gccccaaccc gtgcgccagc acgcgctacg ccaaggagct ctccatggtg 1740
cggatcccga gccgcgccgc cgcgcgcttc ctggcccgga agctcaaccg cagcgaggcc 1800
tacatcgcgg agaacgtgct ggccctggac atcttctttg aggccctcaa ctatgagacc 1860
gtggagcaga agaaggccta tgagatgtca gagctgcttg gtgacattgg gggccagatg 1920
gggctgttca tcggggccag cctgctcacc atcctcgaga tcctagacta cctctgtgag 1980
gtgttccgag acaaggtcct gggatatttc tggaaccgac agcactccca aaggcactcc 2040
agcaccaatc tgcttcagga agggctgggc agccatcgaa cccaagttcc ccacctcagc 2100
ctgggcccca gcactctgct ctgttccgaa gacctcccac ccctccctgt gccgtcacca 2160
agactctctc cgcctcccac cgcacctgct accttgtcac acagctctag acctgctgtc 2220
tgtgtcctcg gagccccgcc ctgacatcct ggacatgcct agcctgcacg tagcttttcc 2280
gtcttcaccc caaataaagt cctaatgcat cagc 2314
<210> 19
<211> 1246
<212> PRT
<213> Artificial Sequence
<220>
<223> TRAPPC9
<400> 19
Met Val Pro Ala Gly Asp Gln Asp Arg Ala Pro His Arg Gly Lys Pro
1 5 10 15
Ala Gln Ala Gly Ala Arg Thr Ser Arg Ala Ser Arg Ala Leu Arg Ser
20 25 30
Trp Arg Arg Ser Gln Ala Ala Arg Ala Thr Val Thr His Pro Arg Gly
35 40 45
Gly His Asp Arg Gly Ser His Gly Gly Tyr Arg Glu Gly His Arg Gly
50 55 60
Cys Arg Arg Asp Pro Gln Trp Ala Ser Ala Gly Pro Pro Pro Leu Ser
65 70 75 80
Phe Thr Glu Glu Val Lys Phe Glu Leu Arg Ala Leu Lys Asp Trp Asp
85 90 95
Phe Lys Met Ser Val Pro Asp Tyr Met Gln Cys Ala Glu Asp His Gln
100 105 110
Thr Leu Leu Val Val Val Gln Pro Val Gly Ile Val Ser Glu Glu Asn
115 120 125
Phe Phe Arg Ile Tyr Lys Arg Ile Cys Ser Val Ser Gln Ile Ser Val
130 135 140
Arg Asp Ser Gln Arg Val Leu Tyr Ile Arg Tyr Arg His His Tyr Pro
145 150 155 160
Pro Glu Asn Asn Glu Trp Gly Asp Phe Gln Thr His Arg Lys Val Val
165 170 175
Gly Leu Ile Thr Ile Thr Asp Cys Phe Ser Ala Lys Asp Trp Pro Gln
180 185 190
Thr Phe Glu Lys Phe His Val Gln Lys Glu Ile Tyr Gly Ser Thr Leu
195 200 205
Tyr Asp Ser Arg Leu Phe Val Phe Gly Leu Gln Gly Glu Ile Val Glu
210 215 220
Gln Pro Arg Thr Asp Val Ala Phe Tyr Pro Asn Tyr Glu Asp Cys Gln
225 230 235 240
Thr Val Glu Lys Arg Ile Glu Asp Phe Ile Glu Ser Leu Phe Ile Val
245 250 255
Leu Glu Ser Lys Arg Leu Asp Arg Ala Thr Asp Lys Ser Gly Asp Lys
260 265 270
Ile Pro Leu Leu Cys Val Pro Phe Glu Lys Lys Asp Phe Val Gly Leu
275 280 285
Asp Thr Asp Ser Arg His Tyr Lys Lys Arg Cys Gln Gly Arg Met Arg
290 295 300
Lys His Val Gly Asp Leu Cys Leu Gln Ala Gly Met Leu Gln Asp Ser
305 310 315 320
Leu Val His Tyr His Met Ser Val Glu Leu Leu Arg Ser Val Asn Asp
325 330 335
Phe Leu Trp Leu Gly Ala Ala Leu Glu Gly Leu Cys Ser Ala Ser Val
340 345 350
Ile Tyr His Tyr Pro Gly Gly Thr Gly Gly Lys Ser Gly Ala Arg Arg
355 360 365
Phe Gln Gly Ser Thr Leu Pro Ala Glu Ala Ala Asn Arg His Arg Pro
370 375 380
Gly Ala Gln Glu Val Leu Ile Asp Pro Gly Ala Leu Thr Thr Asn Gly
385 390 395 400
Ile Asn Pro Asp Thr Ser Thr Glu Ile Gly Arg Ala Lys Asn Cys Leu
405 410 415
Ser Pro Glu Asp Ile Ile Asp Lys Tyr Lys Glu Ala Ile Ser Tyr Tyr
420 425 430
Ser Lys Tyr Lys Asn Ala Gly Val Ile Glu Leu Glu Ala Cys Ile Lys
435 440 445
Ala Val Arg Val Leu Ala Ile Gln Lys Arg Ser Met Glu Ala Ser Glu
450 455 460
Phe Leu Gln Asn Ala Val Tyr Ile Asn Leu Arg Gln Leu Ser Glu Glu
465 470 475 480
Glu Lys Ile Gln Arg Tyr Ser Ile Leu Ser Glu Leu Tyr Glu Leu Ile
485 490 495
Gly Phe His Arg Lys Ser Ala Phe Phe Lys Arg Val Ala Ala Met Gln
500 505 510
Cys Val Ala Pro Ser Ile Ala Glu Pro Gly Trp Arg Ala Cys Tyr Lys
515 520 525
Leu Leu Leu Glu Thr Leu Pro Gly Tyr Ser Leu Ser Leu Asp Pro Lys
530 535 540
Asp Phe Ser Arg Gly Thr His Arg Gly Trp Ala Ala Val Gln Met Arg
545 550 555 560
Leu Leu His Glu Leu Val Tyr Ala Ser Arg Arg Met Gly Asn Pro Ala
565 570 575
Leu Ser Val Arg His Leu Ser Phe Leu Leu Gln Thr Met Leu Asp Phe
580 585 590
Leu Ser Asp Gln Glu Lys Lys Asp Val Ala Gln Ser Leu Glu Asn Tyr
595 600 605
Thr Ser Lys Cys Pro Gly Thr Met Glu Pro Ile Ala Leu Pro Gly Gly
610 615 620
Leu Thr Leu Pro Pro Val Pro Phe Thr Lys Leu Pro Ile Val Arg His
625 630 635 640
Val Lys Leu Leu Asn Leu Pro Ala Ser Leu Arg Pro His Lys Met Lys
645 650 655
Ser Leu Leu Gly Gln Asn Val Ser Thr Lys Ser Pro Phe Ile Tyr Ser
660 665 670
Pro Ile Ile Ala His Asn Arg Gly Glu Glu Arg Asn Lys Lys Ile Asp
675 680 685
Phe Gln Trp Val Gln Gly Asp Val Cys Glu Val Gln Leu Met Val Tyr
690 695 700
Asn Pro Met Pro Phe Glu Leu Arg Val Glu Asn Met Gly Leu Leu Thr
705 710 715 720
Ser Gly Val Glu Phe Glu Ser Leu Pro Ala Ala Leu Ser Leu Pro Ala
725 730 735
Glu Ser Gly Leu Tyr Pro Val Thr Leu Val Gly Val Pro Gln Thr Thr
740 745 750
Gly Thr Ile Thr Val Asn Gly Tyr His Thr Thr Val Phe Gly Val Phe
755 760 765
Ser Asp Cys Leu Leu Asp Asn Leu Pro Gly Ile Lys Thr Ser Gly Ser
770 775 780
Thr Val Glu Val Ile Pro Ala Leu Pro Arg Leu Gln Ile Ser Thr Ser
785 790 795 800
Leu Pro Arg Ser Ala His Ser Leu Gln Pro Ser Ser Gly Asp Glu Ile
805 810 815
Ser Thr Asn Val Ser Val Gln Leu Tyr Asn Gly Glu Ser Gln Gln Leu
820 825 830
Ile Ile Lys Leu Glu Asn Ile Gly Met Glu Pro Leu Glu Lys Leu Glu
835 840 845
Val Thr Ser Lys Val Leu Thr Thr Lys Glu Lys Leu Tyr Gly Asp Phe
850 855 860
Leu Ser Trp Lys Leu Glu Glu Thr Leu Ala Gln Phe Pro Leu Gln Pro
865 870 875 880
Gly Lys Val Ala Thr Phe Thr Ile Asn Ile Lys Val Lys Leu Asp Phe
885 890 895
Ser Cys Gln Glu Asn Leu Leu Gln Asp Leu Ser Asp Asp Gly Ile Ser
900 905 910
Val Ser Gly Phe Pro Leu Ser Ser Pro Phe Arg Gln Val Val Arg Pro
915 920 925
Arg Val Glu Gly Lys Pro Val Asn Pro Pro Glu Ser Asn Lys Ala Gly
930 935 940
Asp Tyr Ser His Val Lys Thr Leu Glu Ala Val Leu Asn Phe Lys Tyr
945 950 955 960
Ser Gly Gly Pro Gly His Thr Glu Gly Tyr Tyr Arg Asn Leu Ser Leu
965 970 975
Gly Leu His Val Glu Val Glu Pro Ser Val Phe Phe Thr Arg Val Ser
980 985 990
Thr Leu Pro Ala Thr Ser Thr Arg Gln Cys His Leu Leu Leu Asp Val
995 1000 1005
Phe Asn Ser Thr Glu His Glu Leu Thr Val Ser Thr Arg Ser Ser Glu
1010 1015 1020
Ala Leu Ile Leu His Ala Gly Glu Cys Gln Arg Met Ala Ile Gln Val
1025 1030 1035 1040
Asp Lys Phe Asn Phe Glu Ser Phe Pro Glu Ser Pro Gly Glu Lys Gly
1045 1050 1055
Gln Phe Ala Asn Pro Lys Gln Leu Glu Glu Glu Arg Arg Glu Ala Arg
1060 1065 1070
Gly Leu Glu Ile His Ser Lys Leu Gly Ile Cys Trp Arg Ile Pro Ser
1075 1080 1085
Leu Lys Arg Ser Gly Glu Ala Ser Val Glu Gly Leu Leu Asn Gln Leu
1090 1095 1100
Val Leu Glu His Leu Gln Leu Ala Pro Leu Gln Trp Asp Val Leu Val
1105 1110 1115 1120
Asp Gly Gln Pro Cys Asp Arg Glu Ala Val Ala Ala Cys Gln Val Gly
1125 1130 1135
Asp Pro Val Arg Leu Glu Val Arg Leu Thr Asn Arg Ser Pro Arg Ser
1140 1145 1150
Val Gly Pro Phe Ala Leu Thr Val Val Pro Phe Gln Asp His Gln Asn
1155 1160 1165
Gly Val His Asn Tyr Asp Leu His Asp Thr Val Ser Phe Val Gly Ser
1170 1175 1180
Ser Thr Phe Tyr Leu Asp Ala Val Gln Pro Ser Gly Gln Ser Ala Cys
1185 1190 1195 1200
Leu Gly Ala Leu Leu Phe Leu Tyr Thr Gly Asp Phe Phe Leu His Ile
1205 1210 1215
Arg Phe His Glu Asp Ser Thr Ser Lys Glu Leu Pro Pro Ser Trp Phe
1220 1225 1230
Cys Leu Pro Ser Val His Val Cys Ala Leu Glu Ala Gln Ala
1235 1240 1245
<210> 20
<211> 7099
<212> DNA
<213> Artificial Sequence
<220>
<223> TRAPPC9
<400> 20
aaagtcggga gtgccatggt gccagctggg gatcaagacc gcgcgccaca cagggggaag 60
ccggcccagg ctggggctcg cacctcacgt gcctcccggg ccctgcgatc ctggaggcgc 120
tcccaggccg cgcgcgccac ggtcacccac ccacgtgggg ggcacgaccg tgggagtcac 180
ggggggtacc gtgagggtca cagggggtgc cgcagggatc cacagtgggc ttccgcgggg 240
cctccacccc tgagcttcac agaggaagtg aaatttgagc tgcgcgccct gaaggactgg 300
gacttcaaaa tgagcgtccc tgactacatg cagtgtgctg aggaccacca gacgctgctc 360
gtggtggtcc agcctgtggg catcgtctcc gaggagaact tcttcaggat ctataagagg 420
atttgctctg tgagtcagat cagcgtgcgg gactcccagc gagtcctcta catccgctac 480
aggcaccact acccacccga gaacaacgag tggggtgact tccagaccca ccgcaaagtc 540
gtgggcctca tcaccatcac agactgcttc tcggccaagg actggccaca gacctttgag 600
aagttccacg tgcagaagga gatctacggc tccacactgt atgactcccg gctctttgtc 660
ttcgggctgc agggggagat cgtggagcag ccgcgcaccg acgtggcttt ctaccccaac 720
tacgaggact gccagacggt ggagaagaga atcgaggact tcatcgagtc actgttcatc 780
gtgctggagt ccaagcgtct ggacagagcc acagacaagt ctggggataa gatccccctt 840
ctctgtgtcc cgtttgagaa aaaggacttt gtaggactgg acacagacag cagacattac 900
aagaagcggt gccaaggccg catgcggaag cacgtggggg acctgtgcct gcaggcaggg 960
atgctgcagg actccctggt gcattaccac atgtcggtgg agctgctgcg ttctgtgaat 1020
gactttctgt ggcttggagc tgccctggaa ggattgtgtt cagcttctgt catctatcac 1080
tatcctggtg gaactggtgg gaagagtgga gctcggaggt tccagggcag cacccttcct 1140
gctgaagcag ccaatagaca ccggccaggg gcacaggaag ttctcattga tccaggtgcc 1200
ctcaccacca atggcatcaa ccctgacacc agtactgaga tcggacgtgc taagaactgc 1260
cttagccctg aagacataat tgacaagtat aaagaggcga tttcctatta cagcaagtat 1320
aagaatgcgg gagtgattga gttggaagcg tgcatcaagg ctgtacgtgt ccttgcaatt 1380
cagaaacgga gcatggaagc atcagaattt cttcagaatg cagtttacat taaccttcga 1440
cagctttctg aggaagagaa aattcagcgc tacagcatcc tctccgagct ctatgagctg 1500
atcggcttcc atcgcaagtc tgcgttcttc aagcgcgtgg ccgccatgca gtgcgtggcc 1560
ccaagcatcg cggagcctgg gtggagggcc tgctacaaac tcctcctgga aacgctgccc 1620
ggctacagtc tgtcgctgga tcccaaagat ttcagcagag gcacgcacag aggctgggct 1680
gcggtccaga tgcgtttgct ccatgaattg gtctacgcct cccgaaggat ggggaaccct 1740
gccctctctg tcagacacct gtccttcctt ctacagacca tgctggactt cttgtcggat 1800
caggaaaaga aagatgtggc ccaaagccta gagaactata cgtccaagtg tcctgggacc 1860
atggagccca tcgccctccc tggcggcctc accctgccac cggtgccctt caccaagctt 1920
cccatcgtca ggcatgtgaa actattgaac cttcctgcta gcctccggcc acacaaaatg 1980
aaaagcttgc tgggtcagaa cgtgtcaacc aaaagtcctt tcatctattc accaattatc 2040
gcacacaacc gtggagaaga gcggaacaag aaaatagatt tccagtgggt tcaaggagat 2100
gtgtgtgaag ttcagctgat ggtatataac ccaatgccgt ttgaacttcg agttgaaaac 2160
atggggctgc tcaccagcgg agtggagttc gagtctctcc ctgcggcgct ttctcttccg 2220
gctgaatctg gtctgtaccc agtgacgctc gtcggggtcc cgcagacgac tggaacgatt 2280
actgtgaacg gttaccatac cacggtcttc ggtgtgttca gtgactgttt gctggataac 2340
ctgccgggaa taaaaaccag tggctccaca gtggaagtca ttcccgcgtt gccaagactg 2400
cagatcagca cctctctgcc cagatctgca cattcattgc aaccttcttc tggtgatgaa 2460
atatctacta atgtatctgt ccagctttac aatggagaaa gtcagcaact aatcattaaa 2520
ttggaaaata ttggaatgga accattggag aaactggagg tcacctcgaa agttctcacc 2580
actaaagaaa aattgtatgg cgacttcttg agctggaagc tagaggaaac ccttgcccag 2640
ttccctttgc agcctgggaa ggtggccacg ttcacaatca acatcaaagt gaagctggat 2700
ttctcctgcc aggagaatct cctgcaggat ctcagtgatg atggaatcag tgtgagtggc 2760
tttcccctgt ccagtccttt tcggcaggtc gttcggcccc gagtggaggg caaacctgtg 2820
aacccacccg agagcaacaa agcaggcgac tacagccacg tgaagaccct ggaagctgtc 2880
ctgaatttca aatactctgg aggcccgggc cacactgaag gatattacag gaatctctcc 2940
ctggggctgc atgtagaagt cgagccgtct gtatttttca cccgagtcag caccctccca 3000
gcaaccagta cccggcagtg tcacctgctc ctggatgtct tcaactccac cgagcatgag 3060
ctgaccgtca gcaccaggag cagcgaggca ctcatcctgc acgccggtga gtgccagcga 3120
atggctattc aagtggacaa gttcaacttt gagagtttcc cggagtcccc tggggagaag 3180
gggcaatttg caaaccccaa gcagctggag gaagagcggc gggaagcccg aggcctggag 3240
atccacagca agctgggcat ctgctggaga atcccctccc tgaagcgcag tggcgaggcg 3300
agtgtggaag gactcctgaa ccagctcgtc ctggagcacc tgcagctggc gcctctgcag 3360
tgggatgtgc tggtggacgg acagccatgt gaccgcgagg ctgtggcggc ctgccaggtg 3420
ggcgaccccg tgcgcctgga ggtgcggctg accaaccgga gcccgcgcag cgtagggccc 3480
ttcgccctca ctgtggtccc cttccaggac caccagaacg gcgtgcacaa ctacgacctg 3540
cacgacaccg tctccttcgt gggctccagc accttctacc tcgacgcggt gcagccgtcc 3600
ggccagtcgg cctgcctcgg ggccctcctc ttcctctaca cgggagactt cttcctccac 3660
atccggttcc acgaggacag caccagcaag gagctgccac cctcttggtt ctgcctgccc 3720
agtgtgcacg tgtgtgccct ggaggcgcag gcctgagccc gcctacttcc gtccctcttt 3780
ctgcagggcc agaggtgacc ctgcctggcc tcccacaccc cctgcaatga gcaaggcctt 3840
cactgcagcc ccatctcctc ctcctccccc agacccctcc cagccctctc ctcctgttcc 3900
tcctgtagca tctttgctgg gctacgcaga agccccggac atggcagccc caccccatgc 3960
cacgcccctt cctacactgt tccctggacc atacacaggc tgaagcagag gaaatcccaa 4020
agcgggtgcc catccagccc aggtcccagg atccctgcac ccatttctgt gacctggggc 4080
cccagccgtg ctgtgctgct catcccagca gagggacctc cctcgtccag cgacttccct 4140
ttggccatag aaagaaatgg tgagcatgag actgggcaca gcctgagggc gtgggcagct 4200
tcccaccctc cctgggcctt ggaatccccc aaggctggtt ttcttcctgg agacccccat 4260
gggcaacttg gcaggagaga tggtgccgta ggaggtcgtg gatggttgat gccaagagag 4320
gccctccacc cgtggtgggc aaatgtccag gcctgggctg gcagcccagg gctgtttctg 4380
ggtgctccct ggccccaggg tggcgtctgg ttaccatggc tgtgtgtgtc catgtctgca 4440
agcagttctt caataaatgg cctgcctccc cctccctgcc tccctgcatc tgctagccca 4500
gtgcagtccg gggcccccac ccagcccgtc agcccccacc tcaggtggct ggcttcccag 4560
cagaagccgg acccaggaag ggacgggttc tgagttagag atctcacata agcaaacgct 4620
gagacaggaa tctggtcacc agcagcttgt ctgggaggtg gaaggaggcc tgcaggggag 4680
ggagatcacc agtgaaggag tgtgactggg cccgagctgc cgccacagtg ggcagctggc 4740
cctgcgttcc tcagggagga ccggagagag aaagcacacg gctcacatct cccttgaggg 4800
ccagagctgg ggtgtgtcta cactacccac aagagtcctg ttctgggggc aggcttctgg 4860
ctgtgttgac ccttagctcc ccaggtccca gggaggataa acagcccttg gtccccagca 4920
gataccaggc tcgtcaggtg cccgttgggc actgaaactc ccatggagta ttgggggaca 4980
gagagcgggg cttcctaaag gccagttggc caaacaggca gagccaggaa gtggctgccc 5040
tggcctccca tggggcagag tcatgttggc atcaggaggc ctgcggggct gagggaactt 5100
cctgaggacc tgaagtccca ggcccaaacc tccctcgctg ggagcaaggt caccctgtgg 5160
cctccggcct aaggaacaaa ttttttgttc catctgtggc ctcctgttgt cgctcagtga 5220
atgatgggag cacactgtga tgtggtgtgg gctgggggca tgtgggggcg ggtgtccagt 5280
ggccctcagt tcctggagct cattcagcat ctgctcgagg ctccccaggg aaggcagccc 5340
cagaaggtct ggctgcagta gggggtggag accccagggc ggtctcttca gcccttccgt 5400
tcaggatgcc tcagcgtagt tgagggcctg gccacctggg ctgtttctga atggaacaaa 5460
cagcatccat gtgtgtcaac ctcaacagag cccaaaaata cgccattgaa tgaaacaaca 5520
cattgcagga tgatgggctc agcccaacct cacttgtgta gacaatgtcc ccccaaatta 5580
taaattttct atgcctcaga acgtgtctat aacaacatgt ttaaaaggta gaaaggatac 5640
acatcccctg tagaagagtg acctgaactg gagttgcatt tgacatctga tgtcagcatg 5700
agggtgctgg tcaccgggga cttgagtacg tactttcctt tcaaaagcac aatagatcaa 5760
gatacaggta cacacatagc tacagatata gatccagcca cagatccaga ttagatatag 5820
atgcatatag aagcctagaa aagccaatat ggaaaactct ggcagtgttt gattctgagt 5880
ggcaagcata ttggtgactg taattatttc ttgcagtttt ctatattttt tagtttccca 5940
aactgcaaaa aatacccaca acagttacca gagaaaatgc tggatgccac aagaaggaag 6000
aacctcaagg ggcaggaggc aaagttgtct tcttggagga gggggtgtta gaaataaccc 6060
aggcctcctg gaaaatgacc caaatacagc ctggctctgc cagccccagg cccttcccca 6120
cccccagggc accctttctg tcccagtaga gaaggtgacc tggagtcagg ccttgtgtgt 6180
gctccaagcg tttggagctt taccgtggga tgtggggagc caggggtgtt gtttgctggc 6240
atttctgtgg ctgggtcttt ctggctgtgg agctgtgttt gtgggcagtg gctgagatgg 6300
aggacctggg gcaggtgtct cagtgacagc gcagtatcct ccagcctctt ccagggcccc 6360
acgctactgg ctacctggca atacagtcca gcagttgctg cttctctagg cccctggtgc 6420
attcagaaac ctcctgaagg ccagcggagg gtaagccagg aacagatcat gtccattgca 6480
cttcactagc tgagcgagtt ggttctgccc ttctgatcct tggtccctca gggccaggac 6540
aggcccggtg accccataat catgcagccg tcatgctccg tcatctgaaa ttcacaagga 6600
aggcaaggat tcaaaggaag atgtctgaga tgctgaactc cagctcagcc acttgtcaac 6660
tccttgacct tggacaagtt acttaaatta agccttgatt ttttttcatc tgaataaata 6720
gagataaaaa tacttatttc atggacactt tcatatcagt ttatttgatt taacctccta 6780
gatttttttt tagcacatga tagcaacttc ttaataaata ctgggtccct tctcctagcc 6840
ttcccattgc tgttcttttt aattccatcc acatgtcctg tgcctgaaat atcctgtgcc 6900
atgtgcctga aattccattc ttctcctgga acccttaaca ggccacattg ctatgctccg 6960
ttagtcccct gcatggctta atgtattagg taacatgaag catgtaagga agaactttct 7020
aacacaggtg ctgaaatggg acaggaaatg ctgatggagt gagctctttg tcaatagaag 7080
tattcaagca aaaaaaaaa 7099
<210> 21
<211> 689
<212> PRT
<213> Artificial Sequence
<220>
<223> MST1L
<400> 21
Met Ala Pro Ala Pro Val Thr Leu Leu Ala Pro Gly Ala Ala Ser Ser
1 5 10 15
Met Ser Cys Ser Gln Pro Gly Gln Arg Ser Pro Ser Asn Asp Phe Gln
20 25 30
Val Leu Arg Gly Thr Glu Leu Gln His Leu Leu His Ala Val Val Pro
35 40 45
Gly Pro Trp Gln Glu Asp Val Ala Asp Ala Glu Glu Cys Ala Gly Arg
50 55 60
Cys Gly Pro Leu Met Asp Cys Trp Ala Phe His Tyr Asn Val Ser Ser
65 70 75 80
His Gly Cys Gln Leu Leu Pro Trp Thr Gln His Ser Pro His Ser Arg
85 90 95
Leu Trp His Ser Gly Arg Cys Asp Leu Phe Gln Glu Lys Gly Glu Trp
100 105 110
Gly Tyr Met Pro Thr Leu Arg Asn Gly Leu Glu Glu Asn Phe Cys Arg
115 120 125
Asn Pro Asp Gly Asp Pro Gly Gly Pro Trp Cys His Thr Thr Asp Pro
130 135 140
Ala Val Arg Phe Gln Ser Cys Ser Ile Lys Ser Cys Arg Val Ala Ala
145 150 155 160
Cys Val Trp Cys Asn Gly Glu Glu Tyr Arg Gly Ala Val Asp Arg Thr
165 170 175
Glu Ser Gly Arg Glu Cys Gln Arg Trp Asp Leu Gln His Pro His Gln
180 185 190
His Pro Phe Glu Pro Gly Lys Phe Leu Asp Gln Gly Leu Asp Asp Asn
195 200 205
Tyr Cys Arg Asn Pro Asp Gly Ser Glu Arg Pro Trp Cys Tyr Thr Thr
210 215 220
Asp Pro Gln Ile Glu Arg Glu Phe Cys Asp Leu Pro Arg Cys Gly Ser
225 230 235 240
Glu Ala Gln Pro Arg Gln Glu Ala Thr Ser Val Ser Cys Phe Arg Gly
245 250 255
Lys Gly Glu Gly Tyr Arg Gly Thr Ala Asn Thr Thr Thr Ala Ala Tyr
260 265 270
Leu Ala Ser Val Gly Thr Arg Lys Ser His Ile Ser Thr Asp Leu Arg
275 280 285
Gln Lys Asn Thr Arg Ala Ser Glu Val Gly Gly Gly Ala Gly Val Gly
290 295 300
Thr Cys Cys Cys Gly Asp Leu Arg Glu Asn Phe Cys Trp Asn Leu Asp
305 310 315 320
Gly Ser Glu Ala Pro Trp Cys Phe Thr Leu Arg Pro Gly Thr Arg Val
325 330 335
Gly Phe Cys Tyr Gln Ile Arg Arg Cys Thr Asp Asp Val Arg Pro Gln
340 345 350
Asp Cys Tyr His Gly Ala Gly Glu Gln Tyr Arg Gly Thr Val Ser Lys
355 360 365
Thr Arg Lys Gly Val Gln Cys Gln Arg Trp Ser Ala Glu Thr Pro His
370 375 380
Lys Leu Gln Ala Leu Thr Leu Gly Arg His Ala Leu Met Ser Gly Thr
385 390 395 400
Arg Ala Trp Lys Trp Leu Arg Leu Pro Cys His Asp Phe Ala Pro Ala
405 410 415
Pro Ala Ser Val His Ile Tyr Leu Arg Thr Ala Cys Thr Thr Gly Gly
420 425 430
Glu Leu Leu Pro Asp Pro Asp Gly Asp Ser His Gly Pro Trp Cys Tyr
435 440 445
Thr Met Asp Pro Arg Thr Pro Phe Asp Tyr Cys Ala Leu Arg Arg Cys
450 455 460
Asp Gln Val Gln Phe Glu Lys Cys Gly Lys Arg Val Asp Arg Leu Asp
465 470 475 480
Gln Arg Arg Ser Lys Leu Arg Val Ala Gly Gly His Pro Gly Asn Ser
485 490 495
Pro Trp Thr Val Ser Leu Arg Asn Arg His Met Pro Leu Thr Gly Tyr
500 505 510
Glu Val Trp Leu Gly Thr Leu Phe Gln Asn Pro Gln His Gly Glu Pro
515 520 525
Gly Leu Gln Arg Val Pro Val Ala Lys Met Leu Cys Gly Pro Ser Gly
530 535 540
Ser Gln Leu Val Leu Leu Lys Leu Glu Arg Ser Val Thr Leu Asn Gln
545 550 555 560
Arg Val Ala Leu Ile Cys Leu Pro Pro Glu Trp Tyr Val Val Pro Pro
565 570 575
Gly Thr Lys Cys Glu Ile Ala Gly Trp Gly Glu Thr Lys Gly Thr Gly
580 585 590
Asn Asp Thr Val Leu Asn Val Ala Leu Leu Asn Val Ile Ser Asn Gln
595 600 605
Glu Cys Asn Ile Lys His Arg Gly His Val Arg Glu Ser Glu Met Cys
610 615 620
Thr Glu Gly Leu Leu Ala Pro Val Gly Ala Cys Glu Gly Asp Tyr Gly
625 630 635 640
Gly Pro Leu Ala Cys Phe Thr His Asn Cys Trp Val Leu Lys Gly Ile
645 650 655
Arg Ile Pro Asn Arg Val Cys Thr Arg Ser Arg Trp Pro Ala Val Phe
660 665 670
Thr Arg Val Ser Val Phe Val Asp Trp Ile His Lys Val Met Arg Leu
675 680 685
Gly
<210> 22
<211> 4668
<212> DNA
<213> Artificial Sequence
<220>
<223> MST1L
<400> 22
atggcgcctg ccccagtcac cctgctggcc cctggggcag catcctcaat gtcttgcagc 60
cagcccgggc agcgctcgcc atcgaatgac ttccaggtgc tccggggcac agagctacag 120
cacctgctac atgcggtggt gcccgggcct tggcaggagg atgtggcaga tgctgaagag 180
tgtgctggtc gctgtgggcc cttaatggac tgctgggcgt tccactacaa tgtgagcagc 240
catggttgcc aactgctgcc atggactcaa cactcgcccc actcaaggct gtggcattct 300
gggcgctgtg acctcttcca ggagaaaggc gagtgggggt acatgcccac gctccggaat 360
ggcctggaag agaacttctg ccgtaaccct gatggcgacc ccggaggtcc ttggtgccac 420
acaacagacc ctgccgtgcg cttccagagc tgcagcatca aatcctgccg ggtggccgcg 480
tgtgtctggt gcaatggcga ggaataccgc ggcgcggtag accgcaccga gtcagggcgc 540
gagtgccagc gctgggatct tcagcacccg caccagcacc ccttcgagcc gggcaagttc 600
ctcgaccaag gtctggacga caactattgc cggaatcctg acggctccga gcggccatgg 660
tgctacacta cggatccgca gatcgagcga gagttctgtg acctcccccg ctgcgggtcc 720
gaggcacagc cccgccaaga ggccacaagt gtcagctgct tccgcgggaa gggtgagggc 780
taccggggca cagccaatac caccaccgcg gcgtaccttg ccagcgttgg gacgcgcaaa 840
tcccacatca gcaccgattt acgccagaaa aatacgcgtg caagtgaggt gggcgggggg 900
gcgggcgttg ggacgtgctg ctgcggagac cttcgggaga acttctgctg gaacctcgac 960
ggctcagagg cgccctggtg cttcaccctg cggcccggca cgcgcgtggg cttttgctac 1020
cagatccggc gttgtacaga cgacgtgcgg ccccaggact gctaccacgg cgcgggggag 1080
cagtaccgcg gcacggtcag caagacccgc aagggtgtcc agtgccagcg ctggtccgct 1140
gagacgccgc acaagctgca ggccctaacc ctggggcggc atgctttgat gtctgggacc 1200
agagcctgga aatggttgag actaccctgc cacgattttg ctcccgctcc cgcctcggtt 1260
cacatttacc tccgaaccgc atgcacaact ggaggagaac ttctgccaga cccagatggg 1320
gatagccatg ggccctggtg ctacacgatg gacccaagga ccccattcga ctactgtgcc 1380
ctgcgacgct gcgaccaggt gcagtttgag aagtgtggca agagggtgga tcggctggat 1440
cagcgtcgtt ccaagctgcg cgtggctggg ggccatccgg gcaactcacc ctggacagtc 1500
agcttgcgga atcgccatat gcctctcacg ggctatgagg tatggttggg caccctgttc 1560
cagaacccac aacatggaga gccaggccta cagcgggtcc cagtagccaa gatgctgtgt 1620
gggccctcag gctcccagct tgtcctgctc aagctggaga gatctgtgac cctgaaccag 1680
cgtgtggccc tgatctgcct gccgcctgaa tggtatgtgg tgcctccagg gaccaagtgt 1740
gagattgcag gctggggtga gaccaaaggt acgggtaatg acacagtcct aaatgtggcc 1800
ttgctgaacg tcatctccaa ccaggagtgt aacatcaagc accgaggaca tgtgcgggag 1860
agcgagatgt gcactgaggg actgttggcc cctgtggggg cctgtgaggg tgactacggg 1920
ggcccacttg cctgctttac ccacaactgc tgggtcctga aaggaattag aatccccaac 1980
cgagtatgca caaggtcgcg ctggccagcc gtcttcacgc gtgtctctgt gtttgtggac 2040
tggattcaca aggtcatgag actgggttag gcccagcctt gacgccatat gctttgggga 2100
ggacaaaact tgtaagtaca gtcaaggaca agacttgtac tcaaggttga gatttaataa 2160
aattaatatt tttactactt caccaaggac tttcttaaac gaaaatggtt tttccccctg 2220
caagtaaaca gtaatgaaga agagaattat tcctagtgca gtttgttttc atggtcttaa 2280
tttttgctaa gactccactg tttttgcctt atcaatacaa gtgccaacac agtgaaaagg 2340
caaatatcat cttagtatta ctctgaaaat agttctgagc taatggccta ctgaaaggaa 2400
aagagtggct cctgctattc tattagactt attacaatta tcttaagtat tctttctacc 2460
ctcctttaat tgaatggaaa cagggatgga ttggaagagc tgtttttctc ctttctttcc 2520
cccggcaata tttaccattt aatgccactt actaacactc aaagaaacaa aaccaaactt 2580
ctcaattgac agtgcagtga cccaacaaag acacgggttc ttgaattcaa agtggagcag 2640
gagagacggt aaatacacat ttactttaat atatatatat ttattattta tgtgtttaaa 2700
gcacaaatta gtttggtaaa aaacatctca tgtctgtttt atttccacat ccctgagact 2760
gacaatggga tgcctatcaa ttaattcatt tagagagcca tacaccacaa gaaacaaatt 2820
atttgtcctc tggagcttgt cacaggggga tttttaaaaa accattaaac agaaagacaa 2880
ctgtgcatct tagaaagata aaaggccaat tcttcctctc cggctgatag gttcttaata 2940
atagtgatat ctactaataa ggtgttttac atagtgtaaa gcatgttcac atacaaatta 3000
cttagcctct ttgagcctca gttttcttat atgtaaaact ggattaatag tacattttgt 3060
gtttaaaaag ataatgtata tgaagtgttt accatttttg cttggcatct agttcagttc 3120
tcagtaactg atgtggtggt ggtggtggtc atagtagcag taagatccgt agtaatagta 3180
gcagcagttg ttttagaaat tagtaactga ggcctggcaa agttaaaggc tctttcatta 3240
acacccagag gggaagaaat gaagctggtc ttcagaggca ggctattttc actctgtgtc 3300
ccaaattttc ccccctagac cgtttttata cttctggggc ctcagaaaat attctcagct 3360
attctgttag cttgatctcc taccatctga gagtgggctt ccttcaaaca accaaatttc 3420
caggtatttc taaactgccc ttcccctaca ccattctttg gttcagtatt tcaagacccc 3480
taagagaaat ggtacattta catgtaagca caggatagtg aagtatttac aacaagtgct 3540
ttggagccag caaatatgaa tcagaatcca gctttccttt cctacataca tgacattggg 3600
cagctaattt ctaagatttt acttctttat ctatgaaagt ggagtactag tacttgctct 3660
gtgcaactgt gatggttgtt acatgaggta gcatctagaa gcagcttgca cattgccaga 3720
cacccagtgg aaggtcaatg aatgactatt tgaggactaa ctattacaga aatgtttact 3780
cttctgagtc ctgatttcta gtctcctgga ctaaataggt tcactgtttt cctcccggtt 3840
cagtttccag acacatcaca gaattataag aatattaaaa actcaggctt atacctacac 3900
aggattttct ataaccctct ttctgctttg agctcctaaa ggtatttcat agaaaaatga 3960
ccttattttt aaatagaggg ggcagttgaa aatcagtgaa cgggcctacc ccctaatgat 4020
ttttttctca gacctaatta taataattag cattataaag tgctaattat ctttggacac 4080
agaggacctg cacaccagag acagaggtcc gcattaagta aagtggattt cactttcttc 4140
agttgtgaga tttctctttt ttcttctttg taatgatgca aagatatatc ttccaccaag 4200
cctcatttaa aagctttttc cagttaagga aactatctct tggccatcca cagccagact 4260
gcatattgag attatggata ttcaaagaaa ttgtctttcc tttgtatatt gtcataactt 4320
tttgtgaaat gttcgtttta tagttccagg ccagcaccta gaacctggct agaataaaaa 4380
actgcagaaa tcatgagttt cttgtttgga tgaaagagca cacctattaa caaatgatag 4440
acggctatcc tactgtgagt cctgaaaact ggtggtgtga ttgttgaatg ggttaggggt 4500
atagcagaga aactcagtgt gggctacata caatttcagc ttgaatcaca cttaacagat 4560
cctctgttcc aaccatttaa atttacaaag aagaaactaa ggcacagaac tacttgagaa 4620
gagaagcaga attgaaaact agagctcctg attgttctca aaataatt 4668
<210> 23
<211> 595
<212> PRT
<213> Artificial Sequence
<220>
<223> GBP3
<400> 23
Met Ala Pro Glu Ile His Met Thr Gly Pro Met Cys Leu Ile Glu Asn
1 5 10 15
Thr Asn Gly Glu Leu Val Ala Asn Pro Glu Ala Leu Lys Ile Leu Ser
20 25 30
Ala Ile Thr Gln Pro Val Val Val Val Ala Ile Val Gly Leu Tyr Arg
35 40 45
Thr Gly Lys Ser Tyr Leu Met Asn Lys Leu Ala Gly Lys Asn Lys Gly
50 55 60
Phe Ser Leu Gly Ser Thr Val Lys Ser His Thr Lys Gly Ile Trp Met
65 70 75 80
Trp Cys Val Pro His Pro Lys Lys Pro Glu His Thr Leu Val Leu Leu
85 90 95
Asp Thr Glu Gly Leu Gly Asp Val Lys Lys Gly Asp Asn Gln Asn Asp
100 105 110
Ser Trp Ile Phe Thr Leu Ala Val Leu Leu Ser Ser Thr Leu Val Tyr
115 120 125
Asn Ser Met Gly Thr Ile Asn Gln Gln Ala Met Asp Gln Leu Tyr Tyr
130 135 140
Val Thr Glu Leu Thr His Arg Ile Arg Ser Lys Ser Ser Pro Asp Glu
145 150 155 160
Asn Glu Asn Glu Asp Ser Ala Asp Phe Val Ser Phe Phe Pro Asp Phe
165 170 175
Val Trp Thr Leu Arg Asp Phe Ser Leu Asp Leu Glu Ala Asp Gly Gln
180 185 190
Pro Leu Thr Pro Asp Glu Tyr Leu Glu Tyr Ser Leu Lys Leu Thr Gln
195 200 205
Gly Thr Ser Gln Lys Asp Lys Asn Phe Asn Leu Pro Gln Leu Cys Ile
210 215 220
Trp Lys Phe Phe Pro Lys Lys Lys Cys Phe Val Phe Asp Leu Pro Ile
225 230 235 240
His Arg Arg Lys Leu Ala Gln Leu Glu Lys Leu Gln Asp Glu Glu Leu
245 250 255
Asp Pro Glu Phe Val Gln Gln Val Ala Asp Phe Cys Ser Tyr Ile Phe
260 265 270
Ser Asn Ser Lys Thr Lys Thr Leu Ser Gly Gly Ile Lys Val Asn Gly
275 280 285
Pro Cys Leu Glu Ser Leu Val Leu Thr Tyr Ile Asn Ala Ile Ser Arg
290 295 300
Gly Asp Leu Pro Cys Met Glu Asn Ala Val Leu Ala Leu Ala Gln Ile
305 310 315 320
Glu Asn Ser Ala Ala Val Gln Lys Ala Ile Ala His Tyr Asp Gln Gln
325 330 335
Met Gly Gln Lys Val Gln Leu Pro Ala Glu Thr Leu Gln Glu Leu Leu
340 345 350
Asp Leu His Arg Val Ser Glu Arg Glu Ala Thr Glu Val Tyr Met Lys
355 360 365
Asn Ser Phe Lys Asp Val Asp His Leu Phe Gln Lys Lys Leu Ala Ala
370 375 380
Gln Leu Asp Lys Lys Arg Asp Asp Phe Cys Lys Gln Asn Gln Glu Ala
385 390 395 400
Ser Ser Asp Arg Cys Ser Ala Leu Leu Gln Val Ile Phe Ser Pro Leu
405 410 415
Glu Glu Glu Val Lys Ala Gly Ile Tyr Ser Lys Pro Gly Gly Tyr Cys
420 425 430
Leu Phe Ile Gln Lys Leu Gln Asp Leu Glu Lys Lys Tyr Tyr Glu Glu
435 440 445
Pro Arg Lys Gly Ile Gln Ala Glu Glu Ile Leu Gln Thr Tyr Leu Lys
450 455 460
Ser Lys Glu Ser Val Thr Asp Ala Ile Leu Gln Thr Asp Gln Ile Leu
465 470 475 480
Thr Glu Lys Glu Lys Glu Ile Glu Val Glu Cys Val Lys Ala Glu Ser
485 490 495
Ala Gln Ala Ser Ala Lys Met Val Glu Glu Met Gln Ile Lys Tyr Gln
500 505 510
Gln Met Met Glu Glu Lys Glu Lys Ser Tyr Gln Glu His Val Lys Gln
515 520 525
Leu Thr Glu Lys Met Glu Arg Glu Arg Ala Gln Leu Leu Glu Glu Gln
530 535 540
Glu Lys Thr Leu Thr Ser Lys Leu Gln Glu Gln Ala Arg Val Leu Lys
545 550 555 560
Glu Arg Cys Gln Gly Glu Ser Thr Gln Leu Gln Asn Glu Ile Gln Lys
565 570 575
Leu Gln Lys Thr Leu Lys Lys Lys Thr Lys Arg Tyr Met Ser His Lys
580 585 590
Leu Lys Ile
595
<210> 24
<211> 2332
<212> DNA
<213> Artificial Sequence
<220>
<223> GBP3
<400> 24
cagcgatcca gcgaaagaaa agagaagtga cagaaacaac tttacctgga ctgaagataa 60
aagcacagac aagagaacaa tgccctggac atggctccag agatccacat gacaggccca 120
atgtgcctca ttgagaacac taatggggaa ctggtggcga atccagaagc tctgaaaatc 180
ctgtctgcca ttacacagcc tgtggtggtg gtggcaattg tgggcctcta ccgcacagga 240
aaatcctacc tgatgaacaa gctagctggg aagaataagg gcttctctct gggctccaca 300
gtgaaatctc acaccaaagg aatctggatg tggtgtgtgc ctcaccccaa aaagccagaa 360
cacaccttag tcctgcttga cactgagggc ctgggagatg taaagaaggg tgacaaccag 420
aatgactcct ggatcttcac cctggccgtc ctcctgagca gcactctcgt gtacaatagc 480
atgggaacca tcaaccagca ggctatggac caactgtact atgtgacaga gctgacacat 540
cgaatccgat caaaatcctc acctgatgag aatgagaatg aggattcagc tgactttgtg 600
agcttcttcc cagattttgt gtggacactg agagatttct ccctggactt ggaagcagat 660
ggacaacccc tcacaccaga tgagtacctg gagtattccc tgaagctaac gcaaggtacc 720
agtcaaaaag ataaaaattt taatctgccc caactctgta tctggaagtt cttcccaaag 780
aaaaaatgtt ttgtcttcga tctgcccatt caccgcagga agcttgccca gcttgagaaa 840
ctacaagatg aagagctgga ccctgaattt gtgcaacaag tagcagactt ctgttcctac 900
atctttagca attccaaaac taaaactctt tcaggaggca tcaaggtcaa tgggccttgt 960
ctagagagcc tagtgctgac ctatatcaat gctatcagca gaggggatct gccctgcatg 1020
gagaacgcag tcctggcctt ggcccagata gagaactcag ccgcagtgca aaaggctatt 1080
gcccactatg accagcagat gggccagaag gtgcagctgc ccgcagaaac cctccaggag 1140
ctgctggacc tgcacagggt tagtgagagg gaggccactg aagtctatat gaagaactct 1200
ttcaaggatg tggaccatct gtttcaaaag aaattagcgg cccagctaga caaaaagcgg 1260
gatgactttt gtaaacagaa tcaagaagca tcatcagatc gttgctcagc tttacttcag 1320
gtcattttca gtcctctaga agaagaagtg aaggcgggaa tttattcgaa accagggggc 1380
tattgtctct ttattcagaa gctacaagac ctggagaaaa agtactatga ggaaccaagg 1440
aaggggatac aggctgaaga gattctgcag acatacttga aatccaagga gtctgtgacc 1500
gatgcaattc tacagacaga ccagattctc acagaaaagg aaaaggagat tgaagtggaa 1560
tgtgtaaaag ctgaatctgc acaggcttca gcaaaaatgg tggaggaaat gcaaataaag 1620
tatcagcaga tgatggaaga gaaagagaag agttatcaag aacatgtgaa acaattgact 1680
gagaagatgg agagggagag ggcccagttg ctggaagagc aagagaagac cctcactagt 1740
aaacttcagg aacaggcccg agtactaaag gagagatgcc aaggtgaaag tacccaactt 1800
caaaatgaga tacaaaagct acagaagacc ctgaaaaaaa aaaccaagag atatatgtcg 1860
cataagctaa agatctaaac aacagagctt ttctgtcatc ctaacccaag gcataactga 1920
aacaatttta gaatttggaa caagtgtcac tatatttgat aataattaga tcttgcatca 1980
taacactaaa agtttacaag aacatgcagt tcaatgatca aaatcatgtt ttttccttaa 2040
aaagattgta aattgtgcaa caaagatgca tttacctctg taccaacaga ggagggatca 2100
tgagttgcca ccactcagaa gtttattctt ccagacgacc agtggatact gaggaaagtc 2160
ttaggtaaaa atcttgggac atatttgggc actggtttgg ccaagtgtac aatgggtccc 2220
aatatcagaa acaaccatcc tagcttccta gggaagacag tgtacagttc tccattatat 2280
caaggctaca aggtctatga gcaataatgt gatttctgga cattgcccat gg 2332
<210> 25
<211> 232
<212> PRT
<213> Artificial Sequence
<220>
<223> CFHR3
<400> 25
Met Leu Leu Leu Ile Asn Val Ile Leu Thr Leu Trp Val Ser Cys Ala
1 5 10 15
Asn Gly Gln Val Lys Pro Cys Asp Phe Pro Asp Ile Lys His Gly Gly
20 25 30
Leu Phe His Glu Asn Met Arg Arg Pro Tyr Phe Pro Val Ala Val Gly
35 40 45
Lys Tyr Tyr Ser Tyr Tyr Cys Asp Glu His Phe Glu Thr Pro Ser Gly
50 55 60
Ser Tyr Trp Asp Tyr Ile His Cys Thr Gln Asn Gly Trp Ser Pro Ala
65 70 75 80
Val Pro Cys Leu Arg Lys Cys Tyr Phe Pro Tyr Leu Glu Asn Gly Tyr
85 90 95
Asn Gln Asn Tyr Gly Arg Lys Phe Val Gln Gly Asn Ser Thr Glu Val
100 105 110
Ala Cys His Pro Gly Tyr Gly Leu Pro Lys Ala Gln Thr Thr Val Thr
115 120 125
Cys Thr Glu Lys Gly Trp Ser Pro Thr Pro Arg Cys Ile Arg Val Arg
130 135 140
Thr Cys Ser Lys Ser Asp Ile Glu Ile Glu Asn Gly Phe Ile Ser Glu
145 150 155 160
Ser Ser Ser Ile Tyr Ile Leu Asn Lys Glu Ile Gln Tyr Lys Cys Lys
165 170 175
Pro Gly Tyr Ala Thr Ala Asp Gly Asn Ser Ser Gly Ser Ile Thr Cys
180 185 190
Leu Gln Asn Gly Trp Ser Ala Gln Pro Ile Cys Ile Thr Ala Cys Ile
195 200 205
Ala Phe Arg Ala His Ala Gln Lys Ser Cys Thr Cys Arg Gly Arg Asn
210 215 220
Glu Cys Leu Ile Leu Asn Phe Cys
225 230
<210> 26
<211> 1995
<212> DNA
<213> Artificial Sequence
<220>
<223> CFHR3
<400> 26
aataatgaaa ggtttcaaac cccaaacagt gcaactgaaa cttttgtatt agcatactac 60
tgagaatatc taacatgttg ttactaatca atgtcattct gaccttgtgg gtttcctgtg 120
ctaatggaca agtgaaacct tgtgattttc cagacattaa acatggaggt ctatttcatg 180
agaatatgcg tagaccatac tttccagtag ctgtaggaaa atattactcc tattactgtg 240
atgaacattt tgagactcct tcaggaagtt actgggatta cattcattgc acacaaaatg 300
ggtggtcacc agcagtacca tgtctcagaa aatgttattt tccttatttg gaaaatggat 360
ataatcaaaa ttatggaaga aagtttgtac agggtaactc tacagaagtt gcctgccatc 420
ctggctacgg tcttccaaaa gcgcagacca cagttacatg tacggagaaa ggctggtctc 480
ctactcccag atgcatccgt gtcagaacat gctcaaaatc agatatagaa attgaaaatg 540
gattcatttc cgaatcttcc tctatttata ttttaaataa agaaatacaa tataaatgta 600
aaccaggata tgcaacagca gatggaaatt cttcaggatc aattacatgt ttgcaaaatg 660
gatggtcagc acaaccaatt tgcattactg cttgtattgc attccgtgct cacgctcaga 720
aaagttgtac atgtcggggg agaaatgagt gcttaattct gaatttctgc tagcgtcagg 780
agaatcagac cttaataatt tgtatcaatg atgctactga ggatatccaa tcaaaaaatt 840
atctctaccc tattgtttac tacagagaaa acaagtaaag gaaaagggta agtgggtggg 900
ctgaatgttt acaaacctca tacttgccaa tggattcttt acatgtaaag ttctctgaac 960
gtgctcgacc tttacttagt atgataaaga gaatgcataa tgaacaaagg aaatctttca 1020
gtggcaaaag ttgatttttt ttctttcctc tttatatatt cacaaagagt tttaaagata 1080
aattgctgaa tgtattttta acccaaatac tgttgtataa tttacatact cccaaatcca 1140
cttcattttc aagggtacaa ttcaactatt tttagtcatg agataatatg gaaccttgat 1200
acataataca acatggaaga ctctcaaaaa tattaagcta agagagaaaa aagatacact 1260
tacaaagtga ttatctcaat gttatcatga gttttggggg ttatatgaat tcctacattt 1320
ctagaatact gttttcaatt tctatactta tcaagggctc tgtgtaaaga aaaaggtgta 1380
tacttcaatt ttactcagct ttgataaatg atctacttta aaacttgtga aataaaagac 1440
aggatgcttc ataaatttta tagaacttat atccaattaa tataattttt atctaatata 1500
tacaccctta ctgttagtaa atcgttacat aactacataa atggttacat taacttttaa 1560
attcacaaat ttaaaagcgg tttaagctcc gtgacacatt ggactacgaa tgctacgatg 1620
gatatggaat cagttatgga aacaccacag gttccatagt gtgtggtgaa gatgggtagt 1680
cccatttccc aacatgttat agtaagtatt ttattcaagt attttttatt agaattaaat 1740
aaaataataa atagacacct acatatgtat atgtacacat atgtgtgtac atatatgtac 1800
atatatatgt agtcctccta tgagtgtgaa ttatcttgag acttaaaaag aaaaaacaac 1860
gttgaaaatg cagatgtctt cctaagaaat caaataagat acagttaaga gtatataaaa 1920
agctttattt agaaagtttc caataagact attgattttt ccccaaaaaa aaaaaaaaaa 1980
aaaaaaaaaa aaaaa 1995
<210> 27
<211> 330
<212> PRT
<213> Artificial Sequence
<220>
<223> CFHR1
<400> 27
Met Trp Leu Leu Val Ser Val Ile Leu Ile Ser Arg Ile Ser Ser Val
1 5 10 15
Gly Gly Glu Ala Thr Phe Cys Asp Phe Pro Lys Ile Asn His Gly Ile
20 25 30
Leu Tyr Asp Glu Glu Lys Tyr Lys Pro Phe Ser Gln Val Pro Thr Gly
35 40 45
Glu Val Phe Tyr Tyr Ser Cys Glu Tyr Asn Phe Val Ser Pro Ser Lys
50 55 60
Ser Phe Trp Thr Arg Ile Thr Cys Thr Glu Glu Gly Trp Ser Pro Thr
65 70 75 80
Pro Lys Cys Leu Arg Leu Cys Phe Phe Pro Phe Val Glu Asn Gly His
85 90 95
Ser Glu Ser Ser Gly Gln Thr His Leu Glu Gly Asp Thr Val Gln Ile
100 105 110
Ile Cys Asn Thr Gly Tyr Arg Leu Gln Asn Asn Glu Asn Asn Ile Ser
115 120 125
Cys Val Glu Arg Gly Trp Ser Thr Pro Pro Lys Cys Arg Ser Thr Asp
130 135 140
Thr Ser Cys Val Asn Pro Pro Thr Val Gln Asn Ala His Ile Leu Ser
145 150 155 160
Arg Gln Met Ser Lys Tyr Pro Ser Gly Glu Arg Val Arg Tyr Glu Cys
165 170 175
Arg Ser Pro Tyr Glu Met Phe Gly Asp Glu Glu Val Met Cys Leu Asn
180 185 190
Gly Asn Trp Thr Glu Pro Pro Gln Cys Lys Asp Ser Thr Gly Lys Cys
195 200 205
Gly Pro Pro Pro Pro Ile Asp Asn Gly Asp Ile Thr Ser Phe Pro Leu
210 215 220
Ser Val Tyr Ala Pro Ala Ser Ser Val Glu Tyr Gln Cys Gln Asn Leu
225 230 235 240
Tyr Gln Leu Glu Gly Asn Lys Arg Ile Thr Cys Arg Asn Gly Gln Trp
245 250 255
Ser Glu Pro Pro Lys Cys Leu His Pro Cys Val Ile Ser Arg Glu Ile
260 265 270
Met Glu Asn Tyr Asn Ile Ala Leu Arg Trp Thr Ala Lys Gln Lys Leu
275 280 285
Tyr Leu Arg Thr Gly Glu Ser Ala Glu Phe Val Cys Lys Arg Gly Tyr
290 295 300
Arg Leu Ser Ser Arg Ser His Thr Leu Arg Thr Thr Cys Trp Asp Gly
305 310 315 320
Lys Leu Glu Tyr Pro Thr Cys Ala Lys Arg
325 330
<210> 28
<211> 1320
<212> DNA
<213> Artificial Sequence
<220>
<223> CFHR1
<400> 28
ggggacactg aaattcaaag tcatgctcat aactgttaat gaaagcagat tcaaagcaac 60
accaccacca ctgaagtatt tttggttata taagattgga actaccaagc atgtggctcc 120
tggtcagtgt aattctaatc tcacggatat cctctgttgg gggagaagca acattttgtg 180
attttccaaa aataaaccat ggaattctat atgatgaaga aaaatataag ccattttccc 240
aggttcctac aggggaagtt ttctattact cctgtgaata taattttgtg tctccttcaa 300
aatcattttg gactcgcata acatgcacag aagaaggatg gtcaccaaca ccaaagtgtc 360
tcagactgtg tttctttcct tttgtggaaa atggtcattc tgaatcttca ggacaaacac 420
atctggaagg tgatactgtg caaattattt gcaacacagg atacaggctt caaaacaatg 480
agaacaacat ttcatgtgta gaacggggct ggtccacccc tcccaaatgc aggtccactg 540
acacttcctg tgtgaatccg cccacagtac aaaatgctca tatactgtcg agacagatga 600
gtaaatatcc atctggtgag agagtacgtt atgaatgtag gagcccttat gaaatgtttg 660
gggatgaaga agtgatgtgt ttaaatggaa actggacaga accacctcaa tgcaaagatt 720
ctacgggaaa atgtgggccc cctccaccta ttgacaatgg ggacattact tcattcccgt 780
tgtcagtata tgctccagct tcatcagttg agtaccaatg ccagaacttg tatcaacttg 840
agggtaacaa gcgaataaca tgtagaaatg gacaatggtc agaaccacca aaatgcttac 900
atccgtgtgt aatatcccga gaaattatgg aaaattataa catagcatta aggtggacag 960
ccaaacagaa gctttatttg agaacaggtg aatcagctga atttgtgtgt aaacggggat 1020
atcgtctttc atcacgttct cacacattgc gaacaacatg ttgggatggg aaactggagt 1080
atccaacttg tgcaaaaaga tagaatcaat cataaaatgc acacctttat tcagaacttt 1140
agtattaaat cagttcttaa tttcattttt aagtattgtt ttactccttt ttattcatac 1200
gtaaaatttt ggattaattt gtgaaaatgt aattataagc tgagaccggt ggctctcttc 1260
ttaaaagcac catattaaaa cttggaaaac caaaaaaaaa aaaaaaaaaa aaaaaaaaaa 1320
<210> 29
<211> 322
<212> PRT
<213> Artificial Sequence
<220>
<223> OR2T2
<400> 29
Met Glu Gly Leu Leu Gln Asn Ser Thr Asn Phe Val Leu Thr Gly Leu
1 5 10 15
Ile Thr His Pro Ala Phe Pro Gly Leu Leu Phe Ala Ile Val Phe Ser
20 25 30
Ile Phe Val Val Ala Ile Thr Ala Asn Leu Val Met Ile Leu Leu Ile
35 40 45
His Met Asp Ser Arg Leu His Thr Pro Met Tyr Phe Leu Leu Ser Gln
50 55 60
Leu Ser Ile Met Asp Thr Ile Tyr Ile Cys Ile Thr Val Pro Lys Met
65 70 75 80
Leu Gln Asp Leu Leu Ser Lys Asp Lys Thr Ile Ser Phe Leu Gly Cys
85 90 95
Ala Val Gln Ile Phe Leu Tyr Leu Thr Leu Ile Gly Gly Glu Phe Phe
100 105 110
Leu Leu Gly Leu Met Ala Tyr Asp Arg Tyr Val Ala Val Cys Asn Pro
115 120 125
Leu Arg Tyr Pro Leu Leu Met Asn Arg Arg Val Cys Leu Phe Met Val
130 135 140
Val Gly Ser Trp Val Gly Gly Ser Leu Asp Gly Phe Met Leu Thr Pro
145 150 155 160
Val Thr Met Ser Phe Pro Phe Cys Arg Ser Arg Glu Ile Asn His Phe
165 170 175
Phe Cys Glu Ile Pro Ala Val Leu Lys Leu Ser Cys Thr Asp Thr Ser
180 185 190
Leu Tyr Glu Thr Leu Met Tyr Ala Cys Cys Val Leu Met Leu Leu Ile
195 200 205
Pro Leu Ser Val Ile Ser Val Ser Tyr Thr His Ile Leu Leu Thr Val
210 215 220
His Arg Met Asn Ser Ala Glu Gly Arg Arg Lys Ala Phe Ala Thr Cys
225 230 235 240
Ser Ser His Ile Met Val Val Ser Val Phe Tyr Gly Ala Ala Phe Tyr
245 250 255
Thr Asn Val Leu Pro His Ser Tyr His Thr Pro Glu Lys Asp Lys Val
260 265 270
Val Ser Ala Phe Tyr Thr Ile Leu Thr Pro Met Leu Asn Pro Leu Ile
275 280 285
Tyr Ser Leu Arg Asn Lys Asp Val Ala Ala Ala Leu Arg Lys Val Leu
290 295 300
Gly Arg Cys Gly Ser Ser Gln Ser Ile Arg Val Ala Thr Val Ile Arg
305 310 315 320
Lys Gly
<210> 30
<211> 969
<212> DNA
<213> Artificial Sequence
<220>
<223> OR2T2
<400> 30
atggagggtc ttctccagaa ctccactaac ttcgtcctca caggcctcat cacccatcct 60
gccttccccg ggcttctctt tgcaatagtc ttctccatct ttgtggtggc tataacagcc 120
aacttggtca tgattctgct catccacatg gactcccgcc tccacacacc catgtacttc 180
ttgctcagcc agctctccat catggatacc atctacatct gtatcactgt ccccaagatg 240
ctccaggacc tcctgtccaa ggacaagacc atttccttcc tgggctgtgc agttcagatc 300
ttcctctacc tgaccctgat tggaggggaa ttcttcctgc tgggtctcat ggcctatgac 360
cgctatgtgg ctgtgtgcaa ccctctacgg taccctctcc tcatgaaccg cagggtttgc 420
ttattcatgg tggtcggctc ctgggttggt ggttccttgg atgggttcat gctgactcct 480
gtcactatga gtttcccctt ctgtagatcc cgagagatca atcacttttt ctgtgagatc 540
ccagccgtgc tgaagttgtc ttgcacagac acgtcactct atgagaccct gatgtatgcc 600
tgctgcgtgc tgatgctgct tatccctcta tctgtcatct ctgtctccta cacgcacatc 660
ctcctgactg tccacaggat gaactctgct gagggccggc gcaaagcctt tgctacgtgt 720
tcctcccaca ttatggtggt gagcgttttc tacggggcag ccttctacac caacgtgctg 780
ccccactcct accacactcc agagaaagat aaagtggtgt ctgccttcta caccatcctc 840
acccccatgc tcaacccact catctacagc ttgaggaata aagatgtggc tgcagctctg 900
aggaaagtac tagggagatg tggttcctcc cagagcatca gggtggcgac tgtgatcagg 960
aagggctag 969
<210> 31
<211> 318
<212> PRT
<213> Artificial Sequence
<220>
<223> OR2T3
<400> 31
Met Cys Ser Gly Asn Gln Thr Ser Gln Asn Gln Thr Ala Ser Thr Asp
1 5 10 15
Phe Thr Leu Thr Gly Leu Phe Ala Glu Ser Lys His Ala Ala Leu Leu
20 25 30
Tyr Thr Val Thr Phe Leu Leu Phe Leu Met Ala Leu Thr Gly Asn Ala
35 40 45
Leu Leu Ile Leu Leu Ile His Ser Glu Pro Arg Leu His Thr Pro Met
50 55 60
Tyr Phe Phe Ile Ser Gln Leu Ala Leu Met Asp Leu Met Tyr Leu Cys
65 70 75 80
Val Thr Val Pro Lys Met Leu Val Gly Gln Val Thr Gly Asp Asp Thr
85 90 95
Ile Ser Pro Ser Gly Cys Gly Ile Gln Met Phe Phe Tyr Leu Thr Leu
100 105 110
Ala Gly Ala Glu Val Phe Leu Leu Ala Ala Met Ala Tyr Asp Arg Tyr
115 120 125
Ala Ala Val Cys Arg Pro Leu His Tyr Pro Leu Leu Met Asn Gln Arg
130 135 140
Val Cys Gln Leu Leu Val Ser Ala Cys Trp Val Leu Gly Met Val Asp
145 150 155 160
Gly Leu Leu Leu Thr Pro Ile Thr Met Ser Phe Pro Phe Cys Gln Ser
165 170 175
Arg Lys Ile Leu Ser Phe Phe Cys Glu Thr Pro Ala Leu Leu Lys Leu
180 185 190
Ser Cys Ser Asp Val Ser Leu Tyr Lys Thr Leu Met Tyr Leu Cys Cys
195 200 205
Ile Leu Met Leu Leu Ala Pro Ile Met Val Ile Ser Ser Ser Tyr Thr
210 215 220
Leu Ile Leu His Leu Ile His Arg Met Asn Ser Ala Ala Gly His Arg
225 230 235 240
Lys Ala Leu Ala Thr Cys Ser Ser His Met Ile Ile Val Leu Leu Leu
245 250 255
Phe Gly Ala Ser Phe Tyr Thr Tyr Met Leu Pro Ser Ser Tyr His Thr
260 265 270
Ala Glu Gln Asp Met Met Val Ser Ala Phe Tyr Thr Ile Phe Thr Pro
275 280 285
Val Leu Asn Pro Leu Ile Tyr Ser Leu Arg Asn Lys Asp Val Thr Arg
290 295 300
Ala Leu Arg Ser Met Met Gln Ser Arg Met Asn Gln Glu Lys
305 310 315
<210> 32
<211> 957
<212> DNA
<213> Artificial Sequence
<220>
<223> OR2T3
<400> 32
atgtgctcag ggaatcagac ttctcagaat caaacagcaa gcactgattt caccctcacg 60
ggactctttg ctgagagcaa gcatgctgcc ctcctctaca ccgtgacctt ccttcttttc 120
ttgatggccc tcactgggaa tgccctcctc atcctcctca tccactcaga gccccgcctc 180
cacaccccca tgtacttctt catcagccag ctcgcgctca tggatctcat gtacctatgc 240
gtgactgtgc ccaagatgct tgtgggccag gtcactggag atgataccat ttccccgtca 300
ggctgtggga tccagatgtt cttctacctg accctggctg gagctgaggt tttcctcctg 360
gctgccatgg cctatgaccg atatgctgct gtttgcagac ctctccatta cccactgctg 420
atgaaccaga gggtgtgcca gctcctggtg tcagcctgct gggttttggg aatggttgat 480
ggtttgttgc tcacccccat taccatgagc ttcccctttt gccagtctag gaaaatcctg 540
agttttttct gtgagactcc tgccctgctg aagctctcct gctctgacgt ctccctctat 600
aagacgctca tgtacctgtg ctgcatcctc atgcttctcg cccccatcat ggtcatctcc 660
agctcataca ccctcatcct gcatctcatc cacaggatga attctgccgc cggccacagg 720
aaggccttgg ccacctgctc ctcccacatg atcatagtgc tgctgctctt cggtgcttcc 780
ttctacacct acatgctccc gagttcctac cacacagctg agcaggacat gatggtgtct 840
gccttttaca ccatcttcac tcctgtgctg aaccccctca tttacagtct ccgcaacaaa 900
gatgtcacca gggctctgag gagcatgatg cagtcaagaa tgaaccaaga aaagtag 957
<210> 33
<211> 295
<212> PRT
<213> Artificial Sequence
<220>
<223> AQP12A
<400> 33
Met Ala Gly Leu Asn Val Ser Leu Ser Phe Phe Phe Ala Thr Phe Ala
1 5 10 15
Leu Cys Glu Ala Ala Arg Arg Ala Ser Lys Ala Leu Leu Pro Val Gly
20 25 30
Ala Tyr Glu Val Phe Ala Arg Glu Ala Met Arg Thr Leu Val Glu Leu
35 40 45
Gly Pro Trp Ala Gly Asp Phe Gly Pro Asp Leu Leu Leu Thr Leu Leu
50 55 60
Phe Leu Leu Phe Leu Ala His Gly Val Thr Leu Asp Gly Ala Ser Ala
65 70 75 80
Asn Pro Thr Val Ser Leu Gln Glu Phe Leu Met Ala Glu Gln Ser Leu
85 90 95
Pro Gly Thr Leu Leu Lys Leu Ala Ala Gln Gly Leu Gly Met Gln Ala
100 105 110
Ala Cys Thr Leu Met Arg Leu Cys Trp Ala Trp Glu Leu Ser Asp Leu
115 120 125
His Leu Leu Gln Ser Leu Met Ala Gln Ser Cys Ser Ser Ala Leu Arg
130 135 140
Thr Ser Val Pro His Gly Ala Leu Val Glu Ala Ala Cys Ala Phe Cys
145 150 155 160
Phe His Leu Thr Leu Leu His Leu Arg His Ser Pro Pro Ala Tyr Ser
165 170 175
Gly Pro Ala Val Ala Leu Leu Val Thr Val Thr Ala Tyr Thr Ala Gly
180 185 190
Pro Phe Thr Ser Ala Phe Phe Asn Pro Ala Leu Ala Ala Ser Val Thr
195 200 205
Phe Ala Cys Ser Gly His Thr Leu Leu Glu Tyr Val Gln Val Tyr Trp
210 215 220
Leu Gly Pro Leu Thr Gly Met Val Leu Ala Val Leu Leu His Gln Gly
225 230 235 240
Arg Leu Pro His Leu Phe Gln Arg Asn Leu Phe Tyr Gly Gln Lys Asn
245 250 255
Lys Tyr Arg Ala Pro Arg Gly Lys Pro Ala Pro Ala Ser Gly Asp Thr
260 265 270
Gln Thr Pro Ala Lys Gly Ser Ser Val Arg Glu Pro Gly Arg Ser Gly
275 280 285
Val Glu Gly Pro His Ser Ser
290 295
<210> 34
<211> 1097
<212> DNA
<213> Artificial Sequence
<220>
<223> AQP12A
<400> 34
gaaccagcca gctcctgctc tgtcccctca ggtgtcctgc aggcacagct cctcgggggg 60
cccaggccga tggcaggtct taacgtgtcc ctctccttct tctttgccac cttcgccctc 120
tgtgaggcgg ccaggcgggc ctccaaggcc ctgctcccag tgggcgccta tgaagtcttc 180
gcccgggagg cgatgaggac gctggtcgag ctcgggccct gggctgggga ctttgggcct 240
gacctgctgc tcaccctgct cttcctgctc ttcctggcgc acggggtcac cttggacggg 300
gcctcggcca accccactgt gtccctgcag gagttcctca tggccgagca gtctctgcct 360
ggcacgctgt tgaagctggc ggcacagggg ctgggcatgc aggccgcctg caccctgatg 420
cgcctctgct gggcctggga gctcagtgac ctgcacctgc tgcagagcct catggcccag 480
agctgcagct cggccctgcg cacatccgtg ccccacgggg cgcttgtgga ggccgcctgc 540
gccttttgtt tccatctgac cctcctgcac ctgcggcaca gtcctcccgc ctacagcggg 600
cccgctgtgg ctctgttggt caccgtcacg gcctacacgg ccgggccctt cacgtctgcc 660
ttcttcaacc ctgccctggc cgcctctgtg acctttgcct gctcgggaca caccttactg 720
gagtacgtgc aggtgtactg gctgggccct ctgacaggga tggtcctggc tgtgctgctg 780
caccagggcc gccttcccca ccttttccag aggaacctgt tctacggcca gaagaacaag 840
taccgagcac cccgagggaa gccggccccg gcctcagggg acacccagac ccctgcaaag 900
gggtccagtg tccgggagcc tgggcgcagt ggtgttgagg ggccacattc cagctgagtg 960
gccttgctct gtgtgagccc cgtgcgaggg ccctgcttgt agctggaccc tggaaccttc 1020
tgtagctaag agggaatcct ggccccctcc ccagaagcca tttgtcaata aaccatttct 1080
aagaaaaaaa aaaaaaa 1097
<210> 35
<211> 2169
<212> PRT
<213> Artificial Sequence
<220>
<223> MUC4
<400> 35
Met Lys Gly Ala Arg Trp Arg Arg Val Pro Trp Val Ser Leu Ser Cys
1 5 10 15
Leu Cys Leu Cys Leu Leu Pro His Val Val Pro Gly Thr Thr Glu Asp
20 25 30
Thr Leu Ile Thr Gly Ser Lys Thr Pro Ala Pro Val Thr Ser Thr Gly
35 40 45
Ser Thr Thr Ala Thr Leu Glu Gly Gln Ser Thr Ala Ala Ser Ser Arg
50 55 60
Thr Ser Asn Gln Asp Ile Ser Ala Ser Ser Gln Asn His Gln Thr Lys
65 70 75 80
Ser Thr Glu Thr Thr Ser Lys Ala Gln Thr Asp Thr Leu Thr Gln Met
85 90 95
Met Thr Ser Thr Leu Phe Ser Ser Pro Ser Val His Asn Val Met Glu
100 105 110
Thr Val Thr Gln Glu Thr Ala Pro Pro Asp Glu Met Thr Thr Ser Phe
115 120 125
Pro Ser Ser Val Thr Asn Thr Leu Met Met Thr Ser Lys Thr Ile Thr
130 135 140
Met Thr Thr Ser Thr Asp Ser Thr Leu Gly Asn Thr Glu Glu Thr Ser
145 150 155 160
Thr Ala Gly Thr Glu Ser Ser Thr Pro Val Thr Ser Ala Val Ser Ile
165 170 175
Thr Ala Gly Gln Glu Gly Gln Ser Arg Thr Thr Ser Trp Arg Thr Ser
180 185 190
Ile Gln Asp Thr Ser Ala Ser Ser Gln Asn His Trp Thr Arg Ser Thr
195 200 205
Gln Thr Thr Arg Glu Ser Gln Thr Ser Thr Leu Thr His Arg Thr Thr
210 215 220
Ser Thr Pro Ser Phe Ser Pro Ser Val His Asn Val Thr Gly Thr Val
225 230 235 240
Ser Gln Lys Thr Ser Pro Ser Gly Glu Thr Ala Thr Ser Ser Leu Cys
245 250 255
Ser Val Thr Asn Thr Ser Met Met Thr Ser Glu Lys Ile Thr Val Thr
260 265 270
Thr Ser Thr Gly Ser Thr Leu Gly Asn Pro Gly Glu Thr Ser Ser Val
275 280 285
Pro Val Thr Gly Ser Leu Met Pro Val Thr Ser Ala Ala Leu Val Thr
290 295 300
Val Asp Pro Glu Gly Gln Ser Pro Ala Thr Phe Ser Arg Thr Ser Thr
305 310 315 320
Gln Asp Thr Thr Ala Phe Ser Lys Asn His Gln Thr Gln Ser Val Glu
325 330 335
Thr Thr Arg Val Ser Gln Ile Asn Thr Leu Asn Thr Leu Thr Pro Val
340 345 350
Thr Thr Ser Thr Val Leu Ser Ser Pro Ser Gly Phe Asn Pro Ser Gly
355 360 365
Thr Val Ser Gln Glu Thr Phe Pro Ser Gly Glu Thr Thr Ile Ser Ser
370 375 380
Pro Ser Ser Val Ser Asn Thr Phe Leu Val Thr Ser Lys Val Phe Arg
385 390 395 400
Met Pro Ile Ser Arg Asp Ser Thr Leu Gly Asn Thr Glu Glu Thr Ser
405 410 415
Leu Ser Val Ser Gly Thr Ile Ser Ala Ile Thr Ser Lys Val Ser Thr
420 425 430
Ile Trp Trp Ser Asp Thr Leu Ser Thr Ala Leu Ser Pro Ser Ser Leu
435 440 445
Pro Pro Lys Ile Ser Thr Ala Phe His Thr Gln Gln Ser Glu Gly Ala
450 455 460
Glu Thr Thr Gly Arg Pro His Glu Arg Ser Ser Phe Ser Pro Gly Val
465 470 475 480
Ser Gln Glu Ile Phe Thr Leu His Glu Thr Thr Thr Trp Pro Ser Ser
485 490 495
Phe Ser Ser Lys Gly His Thr Thr Trp Ser Gln Thr Glu Leu Pro Ser
500 505 510
Thr Ser Thr Gly Ala Ala Thr Arg Leu Val Thr Gly Asn Pro Ser Thr
515 520 525
Arg Ala Ala Gly Thr Ile Pro Arg Val Pro Ser Lys Val Ser Ala Ile
530 535 540
Gly Glu Pro Gly Glu Pro Thr Thr Tyr Ser Ser His Ser Thr Thr Leu
545 550 555 560
Pro Lys Thr Thr Gly Ala Gly Ala Gln Thr Gln Trp Thr Gln Glu Thr
565 570 575
Gly Thr Thr Gly Glu Ala Leu Leu Ser Ser Pro Ser Tyr Ser Val Ile
580 585 590
Gln Met Ile Lys Thr Ala Thr Ser Pro Ser Ser Ser Pro Met Leu Asp
595 600 605
Arg His Thr Ser Gln Gln Ile Thr Thr Ala Pro Ser Thr Asn His Ser
610 615 620
Thr Ile His Ser Thr Ser Thr Ser Pro Gln Glu Ser Pro Ala Val Ser
625 630 635 640
Gln Arg Gly His Thr Arg Ala Pro Gln Thr Thr Gln Glu Ser Gln Thr
645 650 655
Thr Arg Ser Val Ser Pro Met Thr Asp Thr Lys Thr Val Thr Thr Pro
660 665 670
Gly Ser Ser Phe Thr Ala Ser Gly His Ser Pro Ser Glu Ile Val Pro
675 680 685
Gln Asp Ala Pro Thr Ile Ser Ala Ala Thr Thr Phe Ala Pro Ala Pro
690 695 700
Thr Gly Asn Gly His Thr Thr Gln Ala Pro Thr Thr Ala Leu Gln Ala
705 710 715 720
Ala Pro Ser Ser His Asp Ala Thr Leu Gly Pro Ser Gly Gly Thr Ser
725 730 735
Leu Ser Lys Thr Gly Ala Leu Thr Leu Ala Asn Ser Val Val Ser Thr
740 745 750
Pro Gly Gly Pro Glu Gly Gln Trp Thr Ser Ala Ser Ala Ser Thr Ser
755 760 765
Pro Asp Thr Ala Ala Ala Met Thr His Thr His Gln Ala Glu Ser Thr
770 775 780
Glu Ala Ser Gly Gln Thr Gln Thr Ser Glu Pro Ala Ser Ser Gly Ser
785 790 795 800
Arg Thr Thr Ser Ala Gly Thr Ala Thr Pro Ser Ser Ser Gly Ala Ser
805 810 815
Gly Thr Thr Pro Ser Gly Ser Glu Gly Ile Ser Thr Ser Gly Glu Thr
820 825 830
Thr Arg Phe Ser Ser Asn Pro Ser Arg Asp Ser His Thr Thr Gln Ser
835 840 845
Thr Thr Glu Leu Leu Ser Ala Ser Ala Ser His Gly Ala Ile Pro Val
850 855 860
Ser Thr Gly Met Ala Ser Ser Ile Val Pro Gly Thr Phe His Pro Thr
865 870 875 880
Leu Ser Glu Ala Ser Thr Ala Gly Arg Pro Thr Gly Gln Ser Ser Pro
885 890 895
Thr Ser Pro Ser Ala Ser Pro Gln Glu Thr Ala Ala Ile Ser Arg Met
900 905 910
Ala Gln Thr Gln Arg Thr Gly Thr Ser Arg Gly Ser Asp Thr Ile Ser
915 920 925
Leu Ala Ser Gln Ala Thr Asp Thr Phe Ser Thr Val Pro Pro Thr Pro
930 935 940
Pro Ser Ile Thr Ser Ser Gly Leu Thr Ser Pro Gln Thr Gln Thr His
945 950 955 960
Thr Leu Ser Pro Ser Gly Ser Gly Lys Thr Phe Thr Thr Ala Leu Ile
965 970 975
Ser Asn Ala Thr Pro Leu Pro Val Thr Ser Thr Ser Ser Ala Ser Thr
980 985 990
Gly His Ala Thr Pro Leu Ala Val Ser Ser Ala Thr Ser Ala Ser Thr
995 1000 1005
Val Ser Ser Asp Ser Pro Leu Lys Met Glu Thr Ser Gly Met Thr Thr
1010 1015 1020
Pro Ser Leu Lys Thr Asp Gly Gly Arg Arg Thr Ala Thr Ser Pro Pro
1025 1030 1035 1040
Pro Thr Thr Ser Gln Thr Ile Ile Ser Thr Ile Pro Ser Thr Ala Met
1045 1050 1055
His Thr Arg Ser Thr Ala Ala Pro Ile Pro Ile Leu Pro Glu Arg Gly
1060 1065 1070
Val Ser Leu Phe Pro Tyr Gly Ala Asp Ala Gly Asp Leu Glu Phe Val
1075 1080 1085
Arg Arg Thr Val Asp Phe Thr Ser Pro Leu Phe Lys Pro Ala Thr Gly
1090 1095 1100
Phe Pro Leu Gly Ser Ser Leu Arg Asp Ser Leu Tyr Phe Thr Asp Asn
1105 1110 1115 1120
Gly Gln Ile Ile Phe Pro Glu Ser Asp Tyr Gln Ile Phe Ser Tyr Pro
1125 1130 1135
Asn Pro Leu Pro Thr Gly Phe Thr Gly Arg Asp Pro Val Ala Leu Val
1140 1145 1150
Ala Pro Phe Trp Asp Asp Ala Asp Phe Ser Thr Gly Arg Gly Thr Thr
1155 1160 1165
Phe Tyr Gln Glu Tyr Glu Thr Phe Tyr Gly Glu His Ser Leu Leu Val
1170 1175 1180
Gln Gln Ala Glu Ser Trp Ile Arg Lys Ile Thr Asn Asn Gly Gly Tyr
1185 1190 1195 1200
Lys Ala Arg Trp Ala Leu Lys Val Thr Trp Val Asn Ala His Ala Tyr
1205 1210 1215
Pro Ala Gln Trp Thr Leu Gly Ser Asn Thr Tyr Gln Ala Ile Leu Ser
1220 1225 1230
Thr Asp Gly Ser Arg Ser Tyr Ala Leu Phe Leu Tyr Gln Ser Gly Gly
1235 1240 1245
Met Gln Trp Asp Val Ala Gln Arg Ser Gly Lys Pro Val Leu Met Gly
1250 1255 1260
Phe Ser Ser Gly Asp Gly Phe Phe Glu Asn Ser Pro Leu Met Ser Gln
1265 1270 1275 1280
Pro Val Trp Glu Arg Tyr Arg Pro Asp Arg Phe Leu Asn Ser Asn Ser
1285 1290 1295
Gly Leu Gln Gly Leu Gln Phe Tyr Gly Leu His Arg Glu Glu Arg Pro
1300 1305 1310
Asn Tyr Arg Leu Glu Cys Leu Gln Trp Leu Lys Ser Gln Pro Arg Trp
1315 1320 1325
Pro Ser Trp Gly Trp Asn Gln Val Ser Cys Pro Cys Ser Trp Gln Gln
1330 1335 1340
Gly Arg Arg Asp Leu Arg Phe Gln Pro Val Ser Ile Gly Arg Trp Gly
1345 1350 1355 1360
Leu Gly Ser Arg Gln Leu Cys Ser Phe Thr Ser Trp Arg Gly Gly Val
1365 1370 1375
Cys Cys Ser Tyr Gly Pro Trp Gly Glu Phe Arg Glu Gly Trp His Val
1380 1385 1390
Gln Arg Pro Trp Gln Leu Ala Gln Glu Leu Glu Pro Gln Ser Trp Cys
1395 1400 1405
Cys Arg Trp Asn Asp Lys Pro Tyr Leu Cys Ala Leu Tyr Gln Gln Arg
1410 1415 1420
Arg Pro His Val Gly Cys Ala Thr Tyr Arg Pro Pro Gln Pro Ala Trp
1425 1430 1435 1440
Met Phe Gly Asp Pro His Ile Thr Thr Leu Asp Gly Val Ser Tyr Thr
1445 1450 1455
Phe Asn Gly Leu Gly Asp Phe Leu Leu Val Gly Ala Gln Asp Gly Asn
1460 1465 1470
Ser Ser Phe Leu Leu Gln Gly Arg Thr Ala Gln Thr Gly Ser Ala Gln
1475 1480 1485
Ala Thr Asn Phe Ile Ala Phe Ala Ala Gln Tyr Arg Ser Ser Ser Leu
1490 1495 1500
Gly Pro Val Thr Val Gln Trp Leu Leu Glu Pro His Asp Ala Ile Arg
1505 1510 1515 1520
Val Leu Leu Asp Asn Gln Thr Val Thr Phe Gln Pro Asp His Glu Asp
1525 1530 1535
Gly Gly Gly Gln Glu Thr Phe Asn Ala Thr Gly Val Leu Leu Ser Arg
1540 1545 1550
Asn Gly Ser Glu Ala Ser Ala Ser Phe Asp Gly Trp Ala Thr Val Ser
1555 1560 1565
Val Ile Ala Leu Ser Asn Ile Leu His Ser Ser Ala Ser Leu Pro Pro
1570 1575 1580
Glu Tyr Gln Asn Arg Thr Glu Gly Leu Leu Gly Val Trp Asn Asn Asn
1585 1590 1595 1600
Pro Glu Asp Asp Phe Arg Met Pro Asn Gly Ser Thr Ile Pro Pro Gly
1605 1610 1615
Ser Pro Glu Glu Met Leu Phe His Phe Gly Met Thr Trp Gln Ile Asn
1620 1625 1630
Gly Thr Gly Leu Leu Gly Lys Arg Asn Asp Gln Leu Pro Ser Asn Phe
1635 1640 1645
Thr Pro Val Phe Tyr Ser Gln Leu Gln Lys Asn Ser Ser Trp Ala Glu
1650 1655 1660
His Leu Ile Ser Asn Cys Asp Gly Asp Ser Ser Cys Ile Tyr Asp Thr
1665 1670 1675 1680
Leu Ala Leu Arg Asn Ala Ser Ile Gly Leu His Thr Arg Glu Val Ser
1685 1690 1695
Lys Asn Tyr Glu Gln Ala Asn Ala Thr Leu Asn Gln Tyr Pro Pro Ser
1700 1705 1710
Ile Asn Gly Gly Arg Val Ile Glu Ala Tyr Lys Gly Gln Thr Thr Leu
1715 1720 1725
Ile Gln Tyr Thr Ser Asn Ala Glu Asp Ala Asn Phe Thr Leu Arg Asp
1730 1735 1740
Ser Cys Thr Asp Leu Glu Leu Phe Glu Asn Gly Thr Leu Leu Trp Thr
1745 1750 1755 1760
Pro Lys Ser Leu Glu Pro Phe Thr Leu Glu Ile Leu Ala Arg Ser Ala
1765 1770 1775
Lys Ile Gly Leu Ala Ser Ala Leu Gln Pro Arg Thr Val Val Cys His
1780 1785 1790
Cys Asn Ala Glu Ser Gln Cys Leu Tyr Asn Gln Thr Ser Arg Val Gly
1795 1800 1805
Asn Ser Ser Leu Glu Val Ala Gly Cys Lys Cys Asp Gly Gly Thr Phe
1810 1815 1820
Gly Arg Tyr Cys Glu Gly Ser Glu Asp Ala Cys Glu Glu Pro Cys Phe
1825 1830 1835 1840
Pro Ser Val His Cys Val Pro Gly Lys Gly Cys Glu Ala Cys Pro Pro
1845 1850 1855
Asn Leu Thr Gly Asp Gly Arg His Cys Ala Ala Leu Gly Ser Ser Phe
1860 1865 1870
Leu Cys Gln Asn Gln Ser Cys Pro Val Asn Tyr Cys Tyr Asn Gln Gly
1875 1880 1885
His Cys Tyr Ile Ser Gln Thr Leu Gly Cys Gln Pro Met Cys Thr Cys
1890 1895 1900
Pro Pro Ala Phe Thr Asp Ser Arg Cys Phe Leu Ala Gly Asn Asn Phe
1905 1910 1915 1920
Ser Pro Thr Val Asn Leu Glu Leu Pro Leu Arg Val Ile Gln Leu Leu
1925 1930 1935
Leu Ser Glu Glu Glu Asn Ala Ser Met Ala Glu Val Asn Ala Ser Val
1940 1945 1950
Ala Tyr Arg Leu Gly Thr Leu Asp Met Arg Ala Phe Leu Arg Asn Ser
1955 1960 1965
Gln Val Glu Arg Ile Asp Ser Ala Ala Pro Ala Ser Gly Ser Pro Ile
1970 1975 1980
Gln His Trp Met Val Ile Ser Glu Phe Gln Tyr Arg Pro Arg Gly Pro
1985 1990 1995 2000
Val Ile Asp Phe Leu Asn Asn Gln Leu Leu Ala Ala Val Val Glu Ala
2005 2010 2015
Phe Leu Tyr His Val Pro Arg Arg Ser Glu Glu Pro Arg Asn Asp Val
2020 2025 2030
Val Phe Gln Pro Ile Ser Gly Glu Asp Val Arg Asp Val Thr Ala Leu
2035 2040 2045
Asn Val Ser Thr Leu Lys Ala Tyr Phe Arg Cys Asp Gly Tyr Lys Gly
2050 2055 2060
Tyr Asp Leu Val Tyr Ser Pro Gln Ser Gly Phe Thr Cys Val Ser Pro
2065 2070 2075 2080
Cys Ser Arg Gly Tyr Cys Asp His Gly Gly Gln Cys Gln His Leu Pro
2085 2090 2095
Ser Gly Pro Arg Cys Ser Cys Val Ser Phe Ser Ile Tyr Thr Ala Trp
2100 2105 2110
Gly Glu His Cys Glu His Leu Ser Met Lys Leu Asp Ala Phe Phe Gly
2115 2120 2125
Ile Phe Phe Gly Ala Leu Gly Gly Leu Leu Leu Leu Gly Val Gly Thr
2130 2135 2140
Phe Val Val Leu Arg Phe Trp Gly Cys Ser Gly Ala Arg Phe Ser Tyr
2145 2150 2155 2160
Phe Leu Asn Ser Ala Glu Ala Leu Pro
2165
<210> 36
<211> 9579
<212> DNA
<213> Artificial Sequence
<220>
<223> MUC4
<400> 36
gtctgctcct cacactgcag ctgctgggcc gtggagcttc cccgggagcc agggggactt 60
ttgccgcagc catgaagggg gcacgctgga ggagggtccc ctgggtgtcc ctgagctgcc 120
tgtgtctctg cctccttccg catgtggtcc caggtaagtg atgnnnnnnn nnnnnnnnnn 180
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 240
nnnnnnnnnn nnnnnnnnnn nnnttcactc caggaaccac agaggacaca ttaataactg 300
gaagtaaaac tcctgcccca gtcacctcaa caggctcaac aacagcgaca ctagagggac 360
aatcaactgc agcttcttca aggacctcta atcaggacat atcagcttca tctcagaacc 420
accagactaa gagcacggag accaccagca aagctcaaac cgacaccctc acgcagatga 480
tgacatcaac tcttttttct tccccaagtg tacacaatgt gatggagact gttacgcagg 540
agacagctcc tccagatgaa atgaccacat catttccctc cagtgtcacc aacacactca 600
tgatgacatc aaagactata acaatgacaa cctccacaga ctccactctt ggaaacacag 660
aagagacatc aacagcagga actgaaagtt ctaccccagt gacctcagca gtctcaataa 720
cagctggaca ggaaggacaa tcacgaacaa cttcctggag gacctctatc caagacacat 780
cagcttcttc tcagaaccac tggactcgga gcacgcagac caccagggaa tctcaaacca 840
gcaccctaac acacagaacc acttcaactc cttctttctc tccaagtgta cacaatgtga 900
cagggactgt ttctcagaag acatctcctt caggtgaaac agctacctca tccctctgta 960
gtgtcacaaa cacatccatg atgacatcag agaagataac agtgacaacc tccacaggct 1020
ccactcttgg aaacccaggg gagacatcat cagtacctgt tactggaagt cttatgccag 1080
tcacctcagc agccttagta acagttgatc cagaaggaca atcaccagca actttctcaa 1140
ggacttctac tcaggacaca acagcttttt ctaagaacca ccagactcag agcgtggaga 1200
ccaccagagt atctcaaatc aacaccctca acaccctcac accggttaca acatcaactg 1260
ttttatcctc accaagtgga ttcaacccaa gtggaacagt ttctcaggag acattccctt 1320
ctggtgaaac aaccatctca tccccttcca gtgtcagcaa tacattcctg gtaacatcaa 1380
aggtgttcag aatgccaatc tccagagact ctactcttgg aaacacagag gagacatcac 1440
tatctgtaag tggaaccatt tctgcaatca cttccaaagt ttcaaccata tggtggtcag 1500
acactctgtc aacagcactc tcccccagtt ctctacctcc aaaaatatcc acagctttcc 1560
acacccagca gagtgaaggt gcagagacca caggacggcc tcatgagagg agctcattct 1620
ctccaggtgt gtctcaagaa atatttactc tacatgaaac aacaacatgg ccttcctcat 1680
tctccagcaa aggccacaca acttggtcac aaacagaact gccctcaaca tcaacaggtg 1740
ctgccactag gcttgtcaca ggaaatccat ctacaagggc agctggcact attccaaggg 1800
tcccctctaa ggtctcagca ataggggaac caggagagcc caccacatac tcctcccaca 1860
gcacaactct cccaaaaaca acaggggcag gcgcccagac acaatggaca caagaaacgg 1920
ggaccactgg agaggctctt ctcagcagcc caagctatag tgtgattcag atgataaaaa 1980
cggccacatc cccatcttct tcacctatgc tggatagaca cacatcacaa caaattacaa 2040
cggcaccatc aacaaatcat tcaacaatac attccacaag cacctctcct caggaatcac 2100
cagctgtttc ccaaaggggt cacactcgag ccccgcagac cacacaagaa tcacaaacca 2160
cgaggtccgt ctcccccatg actgacacca agacagtcac caccccaggt tcttccttca 2220
cagccagtgg gcactcgccc tcagaaattg ttcctcagga cgcacccacc ataagtgcag 2280
caacaacctt tgccccagct cccaccggga atggtcacac aacccaggcc ccgaccacag 2340
cactgcaggc agcacccagc agccatgatg ccaccctggg gccctcagga ggcacgtcac 2400
tttccaaaac aggtgccctt actctggcca actctgtagt gtcaacacca gggggcccag 2460
aaggacaatg gacatcagcc tctgccagca cctcacctga cacagcagca gccatgaccc 2520
atacccacca ggctgagagc acagaggcct ctggacaaac acagaccagc gaaccggcct 2580
cctcagggtc acgaaccacc tcagcgggca cagctacccc ttcctcatcc ggggcgagtg 2640
gcacaacacc ttcaggaagc gaaggaatat ccacctcagg agagacgaca aggttttcat 2700
caaacccctc cagggacagt cacacaaccc agtcaacaac cgaattgctg tccgcctcag 2760
ccagtcatgg tgccatccca gtaagcacag gaatggcgtc ttcgatcgtc cccggcacct 2820
ttcatcccac cctctctgag gcctccactg cagggagacc gacaggacag tcaagcccaa 2880
cttctcccag tgcctctcct caggagacag ccgccatttc ccggatggcc cagactcaga 2940
ggacaggaac cagcagaggg tctgacacta tcagcctggc gtcccaggca accgacacct 3000
tctcaacagt cccacccaca cctccatcga tcacatccag tgggcttaca tctccacaaa 3060
cccagaccca cactctgtca ccttcagggt ctggtaaaac cttcaccacg gccctcatca 3120
gcaacgccac ccctcttcct gtcaccagca cctcctcagc ctccacaggt cacgccaccc 3180
ctcttgctgt cagcagtgct acctcagctt ccacagtatc ctcggactcc cctctgaaga 3240
tggaaacatc aggtagctgc cannnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 3300
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 3360
nnacgtgtcc aggaatgaca acaccgtcac tgaagacaga cggtgggaga cgcacagcca 3420
catcaccacc ccccacaacc tcccagacca tcatttccac cattcccagc actgccatgc 3480
acacccgctc cacagctgcc cccatcccca tcctgcctga gagaggtgag gccatnnnnn 3540
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 3600
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnncctgg cccaggagtt tccctcttcc 3660
cctatggggc agacgccggg gacctggagt tcgtcaggag gaccgtggac ttcacctccc 3720
cactcttcaa gccggcgact ggcttccccc ttggctcctc tctccgtgat tccctctacg 3780
tgagtccggn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 3840
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnna tgtgctcagt 3900
tcacagacaa tggccagatc atcttcccag agtcagacta ccagattttc tcctacccca 3960
acccactccc aacaggcttc acaggccggg accctgtggc cctggtggct ccgttctggg 4020
acgatgctga cttctccact ggtcggggga ccacatttta tcaggtgagc ctttnnnnnn 4080
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 4140
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnncctttc ctaggaatac gagacgttct 4200
atggtgaaca cagcctgcta gtccagcagg ccgagtcttg gattagaaag atcacaaaca 4260
acgggggcta caaggccagg tgggccctaa aggtcacgtg ggtcaatgcc cacgcctatc 4320
ctgcccagtg gaccctcggg gtgagtagac nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 4380
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 4440
nnnnnnnnnn ccacccccag agcaacacct accaagccat cctctccacg gacgggagca 4500
ggtcctatgc cctgtttctc taccagagcg gtgggatgca gtgggacgtg gcccagcgct 4560
caggcaagcc ggtgctcatg ggcttctcta ggtaggatgg gnnnnnnnnn nnnnnnnnnn 4620
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 4680
nnnnnnnnnn nnnnnnnnnn ntttcctgca gtggagatgg ctttttcgaa aacagcccac 4740
tgatgtccca gccagtgtgg gagaggtatc gccctgatag attcctgaat tccaactcag 4800
gtaaaagtgc nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 4860
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn ccgacctcag 4920
gcctccaagg gctgcagttc tacgggctac accgggaaga aaggcccaac taccgtctcg 4980
agtgcctgca gtggctgaag agccagcctc ggtggcccag ctggggctgg aaccaggtct 5040
cctgcccttg ttcctggcag cagggacgac gggacttacg attccaaccc gtcagcatag 5100
gtgacacctc nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 5160
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn cttgtttcag 5220
gtcgctgggg cctcggcagt aggcagctgt gcagcttcac ctcttggcga ggaggcgtgt 5280
gctgcagcta cgggccctgg ggagagtttc gtgaaggctg gcacgtgcag cgtccttggc 5340
agttgggtga tctcaannnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 5400
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnttct 5460
ccgcagccca ggaactggag ccacagagct ggtgctgccg ctggaatgac aagccctacc 5520
tctgtgccct gtaccagcag aggcggcccc acgtgggctg tgctacatac aggcccccac 5580
agcccggtga gcgacannnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 5640
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnttcc 5700
ttccagcctg gatgttcggg gacccccaca tcaccacctt ggatggtgtc agttacacct 5760
tcaatgggct gggggacttc ctgctggtcg gggcccaaga cgggaactcc tccttcctgc 5820
ttcagggccg caccgcccag actggctcag cccaggccac caacttcatc gcctttgcgg 5880
ctcagtaccg ctccagcagc ctgggccccg tcacggtgag tgaggnnnnn nnnnnnnnnn 5940
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 6000
nnnnnnnnnn nnnnnnnnnn nnnnnctcct tccaggtcca atggctcctt gagcctcacg 6060
acgcaatccg tgtcctgctg gataaccaga ctgtgacatt tcagcctgac catgaagacg 6120
gcggaggtag gttgggnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 6180
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnngatg 6240
ctccaggcca ggagacgttc aacgccaccg gagtcctcct gagccgcaac ggctctgagg 6300
cctccgccag cttcgacggc tgggccaccg tctcggtgat cgcgctctcc aacatcctcc 6360
actcctccgc cagcctcccg cccgagtacc agaaccgcac ggaggggctc ctgggtgagg 6420
gcggnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 6480
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnntgtccc tcaggggtct 6540
ggaataacaa tccagaggac gacttcagga tgcccaatgg ctccaccatt cccccaggga 6600
gccctgagga gatgcttttc cactttggaa tgacctgtga gtctggnnnn nnnnnnnnnn 6660
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 6720
nnnnnnnnnn nnnnnnnnnn nnnnnntgtg ttacagggca gatcaacggg acaggcctcc 6780
ttggcaagag gaatgaccag ctgccttcca acttcacccc tgttttctac tcacaactgc 6840
aaaaaaacag ctcctgggct gaacatttga tctccaactg tgacggagat agctcatgca 6900
tctatgacac cctggccctg cgcaacgcaa gcatcggact tcacacgagg gaagtcagta 6960
aaaactacga gcaggcgaac gccaccctca gtaagtggcc nnnnnnnnnn nnnnnnnnnn 7020
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 7080
nnnnnnnnnn nnnnnnnnnn tgtgtttcag atcagtaccc gccctccatc aatggtggtc 7140
gtgtgattga agcctacaag gggcagacca cgctgattca gtacaccagc aatgctgagg 7200
atgccaactt cacgctcaga gacagctgca ccgacttgga gctctttggt aggactatnn 7260
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 7320
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnncc cggggcagag aatgggacgt 7380
tgctgtggac acccaagtcg ctggagccat tcactctgga gattctagca agaagtgcca 7440
agattggctt ggcatctgca ctccagccca ggactgtggt ctgccattgc aatgcagaga 7500
gccagtgttt gtacaatcag accagcaggg tgggcaactc ctccctggag gtgagtgttg 7560
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 7620
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn cctcctccag gtggctggct 7680
gcaagtgtga cgggggcacc ttcggccgct actgcgaggg ctccgaggat gcctgtgagg 7740
agccgtgctt cccgagtgtc cactgcgttc ctgggaaggg ctgcgaggcc tgccctccaa 7800
acctgactgg ggatgggcgg cactgtgcgg gtgagccggg nnnnnnnnnn nnnnnnnnnn 7860
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 7920
nnnnnnnnnn nnnnnnnnnn cactctgcag ctctggggag ctctttcctg tgtcagaacc 7980
agtcctgccc tgtgaattac tgctacaatc aaggccactg ctacatctcc cagactctgg 8040
gctgtcagcc catgtgcacc tgccccccag ccttcactga cagccgctgc ttcctggctg 8100
ggaacaactt cagtccaact gtcaacctag gtaccgccag nnnnnnnnnn nnnnnnnnnn 8160
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 8220
nnnnnnnnnn nnnnnnnnnn ccatctccag aacttccctt aagagtcatc cagctcttgc 8280
tcagtgaaga ggaaaatgcc tccatggcag aggtcaacgc ctcggtcagt gctgnnnnnn 8340
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 8400
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnntctaac ctaggtggca tacagactgg 8460
ggaccctgga catgcgggcc tttctccgca acagccaagt ggaacgaatg taagtgggan 8520
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 8580
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnt cccacacagc gattctgcag 8640
caccggcctc gggaagcccc atccaacact ggatggtcat ctcggagttc cagtaccgcc 8700
ctcggggccc ggtcattgac ttcctgaaca accagctgct ggccgcggtg gtggaggcgt 8760
tcttatacca cgttccacgg aggagtgagg agcccaggaa cgacgtggtc ttccagccca 8820
tttccgggga agacgtgcgc gatgtgacag cccgtgagtc cgtnnnnnnn nnnnnnnnnn 8880
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 8940
nnnnnnnnnn nnnnnnnnnn nnnttcccga cagtgaacgt gagcacgctg aaggcttact 9000
tcagatgcga tggctacaag ggctacgacc tggtctacag cccccagagc ggcttcacct 9060
gcgtgtcccc gtgcagtagg ggctactgtg accatggagg ccagtgccag cacctgccca 9120
gtgggccccg ctgcaggtgc atagggnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 9180
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 9240
nnnnnntggt caccagctgt gtgtccttct ccatctacac ggcctggggc gagcactgtg 9300
agcacctgag catgaaactc gacgcgttct tcggcatctt ctttggggcc ctgggcggcc 9360
tcttgctgct gggggtcggg acgttcgtgg tcctgcgctt ctggggttgc tccggggcca 9420
ggttctccta tttcctgaac tcagctgagg ccttgccttg aaggggcagc tgtggcctag 9480
gctacctcaa gactcacctc atccttaccg cacatttaag gcgccattgc ttttgggaga 9540
ctggaaaagg gaaggtgact gaaggctgtc aggattctt 9579
<210> 37
<211> 530
<212> PRT
<213> Artificial Sequence
<220>
<223> USP17L17
<400> 37
Met Glu Asp Asp Ser Leu Tyr Leu Gly Gly Glu Trp Gln Phe Asn His
1 5 10 15
Phe Ser Lys Leu Thr Ser Ser Arg Pro Asp Ala Ala Phe Ala Glu Ile
20 25 30
Gln Arg Thr Ser Leu Pro Glu Lys Ser Pro Leu Ser Cys Glu Thr Arg
35 40 45
Val Asp Leu Cys Asp Asp Leu Ala Pro Val Ala Arg Gln Leu Ala Pro
50 55 60
Arg Glu Lys Leu Pro Leu Ser Ser Arg Arg Pro Ala Ala Val Gly Ala
65 70 75 80
Gly Leu Gln Asn Met Gly Asn Thr Cys Tyr Val Asn Ala Ser Leu Gln
85 90 95
Cys Leu Thr Tyr Thr Pro Pro Leu Ala Asn Tyr Met Leu Ser Arg Glu
100 105 110
His Ser Gln Thr Cys His Arg His Lys Gly Cys Met Leu Cys Thr Met
115 120 125
Gln Ala His Ile Thr Arg Ala Leu His Asn Pro Gly His Val Ile Gln
130 135 140
Pro Ser Gln Ala Leu Ala Ala Gly Phe His Arg Gly Lys Gln Glu Asp
145 150 155 160
Ala His Glu Phe Leu Met Phe Thr Val Asp Ala Met Lys Lys Ala Cys
165 170 175
Leu Pro Gly His Lys Gln Val Asp His His Ser Lys Asp Thr Thr Leu
180 185 190
Ile His Gln Ile Phe Gly Gly Tyr Trp Arg Ser Gln Ile Lys Cys Leu
195 200 205
His Cys His Gly Ile Ser Asp Thr Phe Asp Pro Tyr Leu Asp Ile Ala
210 215 220
Leu Asp Ile Gln Ala Ala Gln Ser Val Gln Gln Ala Leu Glu Gln Leu
225 230 235 240
Val Lys Pro Glu Glu Leu Asn Gly Glu Asn Ala Tyr His Cys Gly Val
245 250 255
Cys Leu Gln Arg Ala Pro Ala Ser Lys Thr Leu Thr Leu His Thr Ser
260 265 270
Ala Lys Val Leu Ile Leu Val Leu Lys Arg Phe Ser Asp Val Thr Gly
275 280 285
Asn Lys Ile Ala Lys Asn Val Gln Tyr Pro Glu Cys Leu Asp Met Gln
290 295 300
Pro Tyr Met Ser Gln Gln Asn Thr Gly Pro Leu Val Tyr Val Leu Tyr
305 310 315 320
Ala Val Leu Val His Ala Gly Trp Ser Cys His Asn Gly His Tyr Phe
325 330 335
Ser Tyr Val Lys Ala Gln Glu Gly Gln Trp Tyr Lys Met Asp Asp Ala
340 345 350
Glu Val Thr Ala Ala Ser Ile Thr Ser Val Leu Ser Gln Gln Ala Tyr
355 360 365
Val Leu Phe Tyr Ile Gln Lys Ser Glu Trp Glu Arg His Ser Glu Ser
370 375 380
Val Ser Arg Gly Arg Glu Pro Arg Ala Leu Gly Ala Glu Asp Thr Asp
385 390 395 400
Arg Arg Ala Thr Gln Gly Glu Leu Lys Arg Asp His Pro Cys Leu Gln
405 410 415
Ala Pro Glu Leu Asp Glu His Leu Val Glu Arg Ala Thr Gln Glu Ser
420 425 430
Thr Leu Asp His Trp Lys Phe Leu Gln Glu Gln Asn Lys Thr Lys Pro
435 440 445
Glu Phe Asn Val Arg Lys Val Glu Gly Thr Leu Pro Pro Asp Val Leu
450 455 460
Val Ile His Gln Ser Lys Tyr Lys Cys Gly Met Lys Asn His His Pro
465 470 475 480
Glu Gln Gln Ser Ser Leu Leu Asn Leu Ser Ser Ser Thr Pro Thr His
485 490 495
Gln Glu Ser Met Asn Thr Gly Thr Leu Ala Ser Leu Arg Gly Arg Ala
500 505 510
Arg Arg Ser Lys Gly Lys Asn Lys His Ser Lys Arg Ala Leu Leu Val
515 520 525
Cys Gln
530
<210> 38
<211> 1593
<212> DNA
<213> Artificial Sequence
<220>
<223> USP17L17
<400> 38
atggaggacg actcactcta cttgggaggt gagtggcagt tcaaccactt ttcaaaactc 60
acatcttctc ggcccgatgc agcttttgct gaaatccagc ggacttctct ccctgagaag 120
tcaccactct catgtgagac ccgtgtcgac ctctgtgatg atttggctcc tgtggcaaga 180
cagcttgctc ccagggagaa gcttcctctg agtagcagga gacctgctgc ggtgggggct 240
gggctccaga atatgggaaa tacctgctac gtgaacgctt ccttgcagtg cctgacatac 300
acaccgcccc ttgccaacta catgctgtcc cgggagcact ctcaaacgtg tcatcgtcac 360
aagggctgca tgctctgtac gatgcaagct cacatcacac gggccctcca caatcctggc 420
cacgtcatcc agccctcaca ggcattggct gctggcttcc atagaggcaa gcaggaagat 480
gcccatgaat ttctcatgtt cactgtggat gccatgaaaa aggcatgcct tcccgggcac 540
aagcaggtag atcatcactc taaggacacc accctcatcc accaaatatt tggaggctac 600
tggagatctc aaatcaagtg tctccactgc cacggcattt cagacacttt tgacccttac 660
ctggacatcg ccctggatat ccaggcagct cagagtgtcc agcaagcttt ggaacagttg 720
gtgaagcccg aagaactcaa tggagagaat gcctatcatt gtggtgtttg tctccagagg 780
gcgccggcct ccaagacgtt aactttacac acctctgcca aggtcctcat ccttgtattg 840
aagagattct ccgatgtgac aggcaacaag attgccaaga atgtgcaata tcctgagtgc 900
cttgacatgc agccatacat gtctcagcag aacacaggac ctcttgtcta tgtcctctat 960
gctgtgctgg tccacgctgg gtggagttgt cacaacggac attacttctc ttatgtcaaa 1020
gctcaagaag gccaatggta taaaatggat gatgccgagg tcaccgccgc tagcatcact 1080
tctgtcctga gtcaacaggc ctacgtcctc ttttacatcc agaagagtga atgggaaaga 1140
cacagtgaga gtgtgtcaag aggcagggaa ccaagagccc ttggcgcaga agacacagac 1200
aggcgagcaa cgcaaggaga gctcaagaga gaccacccct gcctccaggc ccccgagttg 1260
gacgagcact tggtggaaag agccactcag gaaagcacct tagaccactg gaaattcctt 1320
caagagcaaa acaaaacgaa gcctgagttc aacgtcagaa aagtcgaagg taccctgcct 1380
cccgacgtac ttgtgattca tcaatcaaaa tacaagtgtg ggatgaagaa ccatcatcct 1440
gaacagcaaa gctccctgct aaacctctct tcgtcgaccc cgacacatca ggagtccatg 1500
aacactggca cactcgcttc cctgcgaggg agggccagga gatccaaagg gaagaacaaa 1560
cacagcaaga gggctctgct tgtgtgccag tga 1593
<210> 39
<211> 530
<212> PRT
<213> Artificial Sequence
<220>
<223> USP17L18
<400> 39
Met Glu Asp Asp Ser Leu Tyr Leu Gly Gly Glu Trp Gln Phe Asn His
1 5 10 15
Phe Ser Lys Leu Thr Ser Ser Arg Pro Asp Ala Ala Phe Ala Glu Ile
20 25 30
Gln Arg Thr Ser Leu Pro Glu Lys Ser Pro Leu Ser Cys Glu Thr Arg
35 40 45
Val Asp Leu Cys Asp Asp Leu Ala Pro Val Ala Arg Gln Leu Ala Pro
50 55 60
Arg Glu Lys Leu Pro Leu Ser Ser Arg Arg Pro Ala Ala Val Gly Ala
65 70 75 80
Gly Leu Gln Asn Met Gly Asn Thr Cys Tyr Val Asn Ala Ser Leu Gln
85 90 95
Cys Leu Thr Tyr Thr Pro Pro Leu Ala Asn Tyr Met Leu Ser Arg Glu
100 105 110
His Ser Gln Thr Cys His Arg His Lys Gly Cys Met Leu Cys Thr Met
115 120 125
Gln Ala His Ile Thr Arg Ala Leu His Asn Pro Gly His Val Ile Gln
130 135 140
Pro Ser Gln Ala Leu Ala Ala Gly Phe His Arg Gly Lys Gln Glu Asp
145 150 155 160
Ala His Glu Phe Leu Met Phe Thr Val Asp Ala Met Lys Lys Ala Cys
165 170 175
Leu Pro Gly His Lys Gln Val Asp His His Ser Lys Asp Thr Thr Leu
180 185 190
Ile His Gln Ile Phe Gly Gly Tyr Trp Arg Ser Gln Ile Lys Cys Leu
195 200 205
His Cys His Gly Ile Ser Asp Thr Phe Asp Pro Tyr Leu Asp Ile Ala
210 215 220
Leu Asp Ile Gln Ala Ala Gln Ser Val Gln Gln Ala Leu Glu Gln Leu
225 230 235 240
Val Lys Pro Glu Glu Leu Asn Gly Glu Asn Ala Tyr His Cys Gly Val
245 250 255
Cys Leu Gln Arg Ala Pro Ala Ser Lys Thr Leu Thr Leu His Thr Ser
260 265 270
Ala Lys Val Leu Ile Leu Val Leu Lys Arg Phe Ser Asp Val Thr Gly
275 280 285
Asn Lys Ile Ala Lys Asn Val Gln Tyr Pro Glu Cys Leu Asp Met Gln
290 295 300
Pro Tyr Met Ser Gln Thr Asn Thr Gly Pro Leu Val Tyr Val Leu Tyr
305 310 315 320
Ala Val Leu Val His Ala Gly Trp Ser Cys His Asn Gly His Tyr Phe
325 330 335
Ser Tyr Val Lys Ala Gln Glu Gly Gln Trp Tyr Lys Met Asp Asp Ala
340 345 350
Glu Val Thr Ala Ser Ser Ile Thr Ser Val Leu Ser Gln Gln Ala Tyr
355 360 365
Val Leu Phe Tyr Ile Gln Lys Ser Glu Trp Glu Arg His Ser Glu Ser
370 375 380
Val Ser Arg Gly Arg Glu Pro Arg Ala Leu Gly Ala Glu Asp Thr Asp
385 390 395 400
Arg Arg Ala Lys Gln Gly Glu Leu Lys Arg Asp His Pro Cys Leu Gln
405 410 415
Ala Pro Glu Leu Asp Glu His Leu Val Glu Arg Ala Thr Gln Glu Ser
420 425 430
Thr Leu Asp His Trp Lys Phe Leu Gln Glu Gln Asn Lys Thr Lys Pro
435 440 445
Glu Phe Asn Val Arg Lys Val Glu Gly Thr Leu Pro Pro Asp Val Leu
450 455 460
Val Ile His Gln Ser Lys Tyr Lys Cys Gly Met Lys Asn His His Pro
465 470 475 480
Glu Gln Gln Ser Ser Leu Leu Asn Leu Ser Ser Thr Thr Pro Thr His
485 490 495
Gln Glu Ser Met Asn Thr Gly Thr Leu Ala Ser Leu Arg Gly Arg Ala
500 505 510
Arg Arg Ser Lys Gly Lys Asn Lys His Ser Lys Arg Ala Leu Leu Val
515 520 525
Cys Gln
530
<210> 40
<211> 1593
<212> DNA
<213> Artificial Sequence
<220>
<223> USP17L18
<400> 40
atggaggacg actcactcta cttgggaggt gagtggcagt tcaaccactt ttcaaaactc 60
acatcttctc ggcccgatgc agcttttgct gaaatccagc ggacttctct ccctgagaag 120
tcaccactct catgtgagac ccgtgtcgac ctctgtgatg atttggctcc tgtggcaaga 180
cagcttgctc ccagggagaa gcttcctctg agtagcagga gacctgctgc ggtgggggct 240
gggctccaga atatgggaaa tacctgctac gtgaacgctt ccttgcagtg cctgacatac 300
acaccgcccc ttgccaacta catgctgtcc cgggagcact ctcaaacgtg tcatcgtcac 360
aagggctgta tgctctgtac gatgcaagct cacatcacac gggccctcca caatcctggc 420
cacgtcatcc agccctcaca ggcattggct gctggcttcc atagaggcaa gcaggaagat 480
gcccatgaat ttctcatgtt cactgtggat gccatgaaaa aggcatgcct tcccgggcac 540
aagcaggtgg atcatcactc taaggacacc accctcatcc accaaatatt tggaggctac 600
tggagatctc aaatcaagtg tctccactgc cacggcattt cagacacttt tgacccttac 660
ctggacatcg ccctggatat ccaggcagct cagagtgtcc agcaagcttt ggaacagttg 720
gtgaagcccg aagaactcaa tggagagaat gcctatcatt gtggtgtttg tctccagagg 780
gcgccggcct ccaagacgtt aactttacac acctctgcca aggtcctcat ccttgtattg 840
aagagattct ccgatgtcac aggcaacaag attgccaaga atgtgcaata tcctgagtgc 900
cttgacatgc agccatacat gtctcagacg aacacaggac ctctcgtcta tgtcctctat 960
gctgtgctgg tccacgctgg gtggagttgt cacaacggac attacttctc ttatgtcaaa 1020
gctcaagaag gccagtggta taaaatggat gatgccgagg tcaccgcctc tagcatcact 1080
tctgtcctga gtcaacaggc ctacgtcctc ttttacatcc agaagagtga atgggaaaga 1140
cacagtgaga gtgtgtcaag aggcagggaa ccaagagccc ttggcgcaga agacacagac 1200
aggcgagcaa agcaaggaga gctcaagaga gaccacccct gcctccaggc ccccgagttg 1260
gacgagcact tggtggaaag agccactcag gaaagcacct tagaccactg gaaattcctt 1320
caagagcaaa acaaaacgaa gcctgagttc aacgtcagaa aagtcgaagg taccctgcct 1380
cccgacgtac ttgtgattca tcaatcaaaa tacaagtgtg ggatgaagaa ccatcatcct 1440
gaacagcaaa gctccctgct aaacctctct tcgacgaccc cgacacatca ggagtccatg 1500
aacactggca cactcgcttc cctgcgaggg agggccagga gatccaaagg gaagaacaaa 1560
cacagcaaga gggctctgct tgtgtgccag tga 1593
<210> 41
<211> 423
<212> PRT
<213> Artificial Sequence
<220>
<223> TMPRSS11E
<400> 41
Met Met Tyr Arg Pro Asp Val Val Arg Ala Arg Lys Arg Val Cys Trp
1 5 10 15
Glu Pro Trp Val Ile Gly Leu Val Ile Phe Ile Ser Leu Ile Val Leu
20 25 30
Ala Val Cys Ile Gly Leu Thr Val His Tyr Val Arg Tyr Asn Gln Lys
35 40 45
Lys Thr Tyr Asn Tyr Tyr Ser Thr Leu Ser Phe Thr Thr Asp Lys Leu
50 55 60
Tyr Ala Glu Phe Gly Arg Glu Ala Ser Asn Asn Phe Thr Glu Met Ser
65 70 75 80
Gln Arg Leu Glu Ser Met Val Lys Asn Ala Phe Tyr Lys Ser Pro Leu
85 90 95
Arg Glu Glu Phe Val Lys Ser Gln Val Ile Lys Phe Ser Gln Gln Lys
100 105 110
His Gly Val Leu Ala His Met Leu Leu Ile Cys Arg Phe His Ser Thr
115 120 125
Glu Asp Pro Glu Thr Val Asp Lys Ile Val Gln Leu Val Leu His Glu
130 135 140
Lys Leu Gln Asp Ala Val Gly Pro Pro Lys Val Asp Pro His Ser Val
145 150 155 160
Lys Ile Lys Lys Ile Asn Lys Thr Glu Thr Asp Ser Tyr Leu Asn His
165 170 175
Cys Cys Gly Thr Arg Arg Ser Lys Thr Leu Gly Gln Ser Leu Arg Ile
180 185 190
Val Gly Gly Thr Glu Val Glu Glu Gly Glu Trp Pro Trp Gln Ala Ser
195 200 205
Leu Gln Trp Asp Gly Ser His Arg Cys Gly Ala Thr Leu Ile Asn Ala
210 215 220
Thr Trp Leu Val Ser Ala Ala His Cys Phe Thr Thr Tyr Lys Asn Pro
225 230 235 240
Ala Arg Trp Thr Ala Ser Phe Gly Val Thr Ile Lys Pro Ser Lys Met
245 250 255
Lys Arg Gly Leu Arg Arg Ile Ile Val His Glu Lys Tyr Lys His Pro
260 265 270
Ser His Asp Tyr Asp Ile Ser Leu Ala Glu Leu Ser Ser Pro Val Pro
275 280 285
Tyr Thr Asn Ala Val His Arg Val Cys Leu Pro Asp Ala Ser Tyr Glu
290 295 300
Phe Gln Pro Gly Asp Val Met Phe Val Thr Gly Phe Gly Ala Leu Lys
305 310 315 320
Asn Asp Gly Tyr Ser Gln Asn His Leu Arg Gln Ala Gln Val Thr Leu
325 330 335
Ile Asp Ala Thr Thr Cys Asn Glu Pro Gln Ala Tyr Asn Asp Ala Ile
340 345 350
Thr Pro Arg Met Leu Cys Ala Gly Ser Leu Glu Gly Lys Thr Asp Ala
355 360 365
Cys Gln Gly Asp Ser Gly Gly Pro Leu Val Ser Ser Asp Ala Arg Asp
370 375 380
Ile Trp Tyr Leu Ala Gly Ile Val Ser Trp Gly Asp Glu Cys Ala Lys
385 390 395 400
Pro Asn Lys Pro Gly Val Tyr Thr Arg Val Thr Ala Leu Arg Asp Trp
405 410 415
Ile Thr Ser Lys Thr Gly Ile
420
<210> 42
<211> 1358
<212> DNA
<213> Artificial Sequence
<220>
<223> TMPRSS11E
<400> 42
ggactcttca ttgctggttg gcaatgatgt atcggccaga tgtggtgagg gctaggaaaa 60
gagtttgttg ggaaccctgg gttatcggcc tcgtcatctt catatccctg attgtcctgg 120
cagtgtgcat tggactcact gttcattatg tgagatataa tcaaaagaag acctacaatt 180
actatagcac attgtcattt acaactgaca aactatatgc tgagtttggc agagaggctt 240
ctaacaattt tacagaaatg agccagagac ttgaatcaat ggtgaaaaat gcattttata 300
aatctccatt aagggaagaa tttgtcaagt ctcaggttat caagttcagt caacagaagc 360
atggagtgtt ggctcatatg ctgttgattt gtagatttca ctctactgag gatcctgaaa 420
ctgtagataa aattgttcaa cttgttttac atgaaaagct gcaagatgct gtaggacccc 480
ctaaagtaga tcctcactca gttaaaatta aaaaaatcaa caagacagaa acagacagct 540
atctaaacca ttgctgcgga acacgaagaa gtaaaactct aggtcagagt ctcaggatcg 600
ttggtgggac agaagtagaa gagggtgaat ggccctggca ggctagcctg cagtgggatg 660
ggagtcatcg ctgtggagca accttaatta atgccacatg gcttgtgagt gctgctcact 720
gttttacaac atataagaac cctgccagat ggactgcttc ctttggagta acaataaaac 780
cttcgaaaat gaaacggggt ctccggagaa taattgtcca tgaaaaatac aaacacccat 840
cacatgacta tgatatttct cttgcagagc tttctagccc tgttccctac acaaatgcag 900
tacatagagt ttgtctccct gatgcatcct atgagtttca accaggtgat gtgatgtttg 960
tgacaggatt tggagcactg aaaaatgatg gttacagtca aaatcatctt cgacaagcac 1020
aggtgactct catagacgct acaacttgca atgaacctca agcttacaat gacgccataa 1080
ctcctagaat gttatgtgct ggctccttag aaggaaaaac agatgcatgc cagggtgact 1140
ctggaggacc actggttagt tcagatgcta gagatatctg gtaccttgct ggaatagtga 1200
gctggggaga tgaatgtgcg aaacccaaca agcctggtgt ttatactaga gttacggcct 1260
tgcgggactg gattacttca aaaactggta tctaagagac aaaagcctca tggaacagat 1320
aacatttttt tttgtttttt gggtgtggag gccatttt 1358
<210> 43
<211> 530
<212> PRT
<213> Artificial Sequence
<220>
<223> UGT2B17
<400> 43
Met Ser Leu Lys Trp Met Ser Val Phe Leu Leu Met Gln Leu Ser Cys
1 5 10 15
Tyr Phe Ser Ser Gly Ser Cys Gly Lys Val Leu Val Trp Pro Thr Glu
20 25 30
Tyr Ser His Trp Ile Asn Met Lys Thr Ile Leu Glu Glu Leu Val Gln
35 40 45
Arg Gly His Glu Val Ile Val Leu Thr Ser Ser Ala Ser Ile Leu Val
50 55 60
Asn Ala Ser Lys Ser Ser Ala Ile Lys Leu Glu Val Tyr Pro Thr Ser
65 70 75 80
Leu Thr Lys Asn Asp Leu Glu Asp Phe Phe Met Lys Met Phe Asp Arg
85 90 95
Trp Thr Tyr Ser Ile Ser Lys Asn Thr Phe Trp Ser Tyr Phe Ser Gln
100 105 110
Leu Gln Glu Leu Cys Trp Glu Tyr Ser Asp Tyr Asn Ile Lys Leu Cys
115 120 125
Glu Asp Ala Val Leu Asn Lys Lys Leu Met Arg Lys Leu Gln Glu Ser
130 135 140
Lys Phe Asp Val Leu Leu Ala Asp Ala Val Asn Pro Cys Gly Glu Leu
145 150 155 160
Leu Ala Glu Leu Leu Asn Ile Pro Phe Leu Tyr Ser Leu Arg Phe Ser
165 170 175
Val Gly Tyr Thr Val Glu Lys Asn Gly Gly Gly Phe Leu Phe Pro Pro
180 185 190
Ser Tyr Val Pro Val Val Met Ser Glu Leu Ser Asp Gln Met Ile Phe
195 200 205
Met Glu Arg Ile Lys Asn Met Ile Tyr Met Leu Tyr Phe Asp Phe Trp
210 215 220
Phe Gln Ala Tyr Asp Leu Lys Lys Trp Asp Gln Phe Tyr Ser Glu Val
225 230 235 240
Leu Gly Arg Pro Thr Thr Leu Phe Glu Thr Met Gly Lys Ala Glu Met
245 250 255
Trp Leu Ile Arg Thr Tyr Trp Asp Phe Glu Phe Pro Arg Pro Phe Leu
260 265 270
Pro Asn Val Asp Phe Val Gly Gly Leu His Cys Lys Pro Ala Lys Pro
275 280 285
Leu Pro Lys Glu Met Glu Glu Phe Val Gln Ser Ser Gly Glu Asn Gly
290 295 300
Ile Val Val Phe Ser Leu Gly Ser Met Ile Ser Asn Met Ser Glu Glu
305 310 315 320
Ser Ala Asn Met Ile Ala Ser Ala Leu Ala Gln Ile Pro Gln Lys Val
325 330 335
Leu Trp Arg Phe Asp Gly Lys Lys Pro Asn Thr Leu Gly Ser Asn Thr
340 345 350
Arg Leu Tyr Lys Trp Leu Pro Gln Asn Asp Leu Leu Gly His Pro Lys
355 360 365
Thr Lys Ala Phe Ile Thr His Gly Gly Thr Asn Gly Ile Tyr Glu Ala
370 375 380
Ile Tyr His Gly Ile Pro Met Val Gly Ile Pro Leu Phe Ala Asp Gln
385 390 395 400
His Asp Asn Ile Ala His Met Lys Ala Lys Gly Ala Ala Leu Ser Val
405 410 415
Asp Ile Arg Thr Met Ser Ser Arg Asp Leu Leu Asn Ala Leu Lys Ser
420 425 430
Val Ile Asn Asp Pro Ile Tyr Lys Glu Asn Ile Met Lys Leu Ser Arg
435 440 445
Ile His His Asp Gln Pro Val Lys Pro Leu Asp Arg Ala Val Phe Trp
450 455 460
Ile Glu Phe Val Met Arg His Lys Gly Ala Lys His Leu Arg Val Ala
465 470 475 480
Ala His Asn Leu Thr Trp Ile Gln Tyr His Ser Leu Asp Val Ile Ala
485 490 495
Phe Leu Leu Ala Cys Val Ala Thr Met Ile Phe Met Ile Thr Lys Cys
500 505 510
Cys Leu Phe Cys Phe Arg Lys Leu Ala Lys Thr Gly Lys Lys Lys Lys
515 520 525
Arg Asp
530
<210> 44
<211> 2099
<212> DNA
<213> Artificial Sequence
<220>
<223> UGT2B17
<400> 44
gaaagaaaca acaactggaa aagaagcact gcataagacc aggatgtctc tgaaatggat 60
gtcagtcttt ctgctgatgc agctcagttg ttactttagc tctgggagtt gtggaaaggt 120
gctggtgtgg cccacagaat acagccattg gataaatatg aagacaatcc tggaagagct 180
tgttcagagg ggtcatgagg tgattgtgtt gacatcttcg gcttctattc ttgtcaatgc 240
cagtaaatca tctgctatta aattagaagt ttatcctaca tctttaacta aaaatgattt 300
ggaagatttt tttatgaaaa tgttcgatag atggacatat agtatttcaa aaaatacatt 360
ttggtcatat ttttcacaac tacaagaatt gtgttgggaa tattctgact ataatataaa 420
gctctgtgaa gatgcagttt tgaacaagaa acttatgaga aaactacaag agtcaaaatt 480
tgatgtcctt ctggcagatg ccgttaatcc ctgtggtgag ctgctggctg agctacttaa 540
catacccttt ctgtacagtc tccgcttctc tgttggctac acagttgaga agaatggtgg 600
aggatttctg ttccctcctt cctatgtacc tgttgttatg tcagaattaa gtgatcaaat 660
gattttcatg gagaggataa aaaatatgat atatatgctt tattttgact tttggtttca 720
agcatatgat ctgaagaagt gggaccagtt ttatagtgaa gttctaggaa gacccactac 780
attatttgag acaatgggga aagctgaaat gtggctcatt cgaacctatt gggattttga 840
atttcctcgc ccattcttac caaatgttga ttttgttgga ggacttcact gtaaaccagc 900
caaacccttg cctaaggaaa tggaagagtt tgtgcagagc tctggagaaa atggtattgt 960
ggtgttttct ctggggtcga tgatcagtaa catgtcagaa gaaagtgcca acatgattgc 1020
atcagccctt gcccagatcc cacaaaaggt tctatggaga tttgatggca agaagccaaa 1080
tactttaggt tccaatactc gactgtataa gtggttaccc cagaatgacc ttcttggtca 1140
tcccaaaacc aaagctttta taactcatgg tggaaccaat ggcatctatg aggcaatcta 1200
ccatgggatc cctatggtgg gcattccctt gtttgcggat caacatgata acattgctca 1260
catgaaagcc aagggagcag ccctcagtgt ggacatcagg accatgtcaa gtagagattt 1320
gctcaatgca ttgaagtcag tcattaatga ccctatctat aaagagaata tcatgaaatt 1380
atcaagaatt catcatgatc aaccggtgaa gcccctggat cgagcagtct tctggattga 1440
gtttgtcatg cgccataaag gagccaagca ccttcgggtc gcagcccaca acctcacctg 1500
gatccagtac cactctttgg atgtgatagc attcctgctg gcctgcgtgg caactatgat 1560
atttatgatc acaaaatgtt gcctgttttg tttccgaaag cttgccaaaa caggaaagaa 1620
gaagaaaagg gattagttat atcaaaagcc tgaagtggaa tgaccaaaag atgggactcc 1680
tcctttattc cagcatggag ggttttaaat ggaggatttc ctttttcctg cgacaaaacg 1740
tcttttcaca acttaccctg ttaagtcaaa atttattttc caggaattta atatgtactt 1800
tagttggaat tattctatgt caatgatttt taagctatga aaaataataa tataaaacct 1860
tatgggctta tattgaaatt tattattcta atccaaaagt taccccacac aaaagttact 1920
gagcttcctt atgtttcaca cattgtattt gaacacaaaa cattaacaac tccactcata 1980
gtatcaacat tgttttgcaa atactcagaa tattttggct tcattttgag cagaattttt 2040
gtttttaatt ttgccaatga aatcttcaat aattaaaaaa aaaaaaaaaa aaaaaaaaa 2099
<210> 45
<211> 2839
<212> PRT
<213> Artificial Sequence
<220>
<223> PDZD2
<400> 45
Met Pro Ile Thr Gln Asp Asn Ala Val Leu His Leu Pro Leu Leu Tyr
1 5 10 15
Gln Trp Leu Gln Asn Ser Leu Gln Glu Gly Gly Asp Gly Pro Glu Gln
20 25 30
Arg Leu Cys Gln Ala Ala Ile Gln Lys Leu Gln Glu Tyr Ile Gln Leu
35 40 45
Asn Phe Ala Val Asp Glu Ser Thr Val Pro Pro Asp His Ser Pro Pro
50 55 60
Glu Met Glu Ile Cys Thr Val Tyr Leu Thr Lys Glu Leu Gly Asp Thr
65 70 75 80
Glu Thr Val Gly Leu Ser Phe Gly Asn Ile Pro Val Phe Gly Asp Tyr
85 90 95
Gly Glu Lys Arg Arg Gly Gly Lys Lys Arg Lys Thr His Gln Gly Pro
100 105 110
Val Leu Asp Val Gly Cys Ile Trp Val Thr Glu Leu Arg Lys Asn Ser
115 120 125
Pro Ala Gly Lys Ser Gly Lys Val Arg Leu Arg Asp Glu Ile Leu Ser
130 135 140
Leu Asn Gly Gln Leu Met Val Gly Val Asp Val Ser Gly Ala Ser Tyr
145 150 155 160
Leu Ala Glu Gln Cys Trp Asn Gly Gly Phe Ile Tyr Leu Ile Met Leu
165 170 175
Arg Arg Phe Lys His Lys Ala His Ser Thr Tyr Asn Gly Asn Ser Ser
180 185 190
Asn Ser Ser Glu Pro Gly Glu Thr Pro Thr Leu Glu Leu Gly Asp Arg
195 200 205
Thr Ala Lys Lys Gly Lys Arg Thr Arg Lys Phe Gly Val Ile Ser Arg
210 215 220
Pro Pro Ala Asn Lys Ala Pro Glu Glu Ser Lys Gly Ser Ala Gly Cys
225 230 235 240
Glu Val Ser Ser Asp Pro Ser Thr Glu Leu Glu Asn Gly Pro Asp Pro
245 250 255
Glu Leu Gly Asn Gly His Val Phe Gln Leu Glu Asn Gly Pro Asp Ser
260 265 270
Leu Lys Glu Val Ala Gly Pro His Leu Glu Arg Ser Glu Val Asp Arg
275 280 285
Gly Thr Glu His Arg Ile Pro Lys Thr Asp Ala Pro Leu Thr Thr Ser
290 295 300
Asn Asp Lys Arg Arg Phe Ser Lys Gly Gly Lys Thr Asp Phe Gln Ser
305 310 315 320
Ser Asp Cys Leu Ala Arg Glu Glu Val Gly Arg Ile Trp Lys Met Glu
325 330 335
Leu Leu Lys Glu Ser Asp Gly Leu Gly Ile Gln Val Ser Gly Gly Arg
340 345 350
Gly Ser Lys Arg Ser Pro His Ala Ile Val Val Thr Gln Val Lys Glu
355 360 365
Gly Gly Ala Ala His Arg Asp Gly Arg Leu Ser Leu Gly Asp Glu Leu
370 375 380
Leu Val Ile Asn Gly His Leu Leu Val Gly Leu Ser His Glu Glu Ala
385 390 395 400
Val Ala Ile Leu Arg Ser Ala Thr Gly Met Val Gln Leu Val Val Ala
405 410 415
Ser Lys Glu Asn Ser Ala Glu Asp Leu Leu Arg Leu Thr Ser Lys Ser
420 425 430
Leu Pro Asp Leu Thr Ser Ser Val Glu Asp Val Ser Ser Trp Thr Asp
435 440 445
Asn Glu Asp Gln Glu Ala Asp Gly Glu Glu Asp Glu Gly Thr Ser Ser
450 455 460
Ser Val Gln Arg Ala Met Pro Gly Thr Asp Glu Pro Gln Asp Val Cys
465 470 475 480
Gly Ala Glu Glu Ser Lys Gly Asn Leu Glu Ser Pro Lys Gln Gly Ser
485 490 495
Asn Lys Ile Lys Leu Lys Ser Arg Leu Ser Gly Gly Val His Arg Leu
500 505 510
Glu Ser Val Glu Glu Tyr Asn Glu Leu Met Val Arg Asn Gly Asp Pro
515 520 525
Arg Ile Arg Met Leu Glu Val Ser Arg Asp Gly Arg Lys His Ser Leu
530 535 540
Pro Gln Leu Leu Asp Ser Ser Ser Ala Ser Gln Glu Tyr His Ile Val
545 550 555 560
Lys Lys Ser Thr Arg Ser Leu Ser Thr Thr Gln Val Glu Ser Pro Trp
565 570 575
Arg Leu Ile Arg Pro Ser Val Ile Ser Ile Ile Gly Leu Tyr Lys Glu
580 585 590
Lys Gly Lys Gly Leu Gly Phe Ser Ile Ala Gly Gly Arg Asp Cys Ile
595 600 605
Arg Gly Gln Met Gly Ile Phe Val Lys Thr Ile Phe Pro Asn Gly Ser
610 615 620
Ala Ala Glu Asp Gly Arg Leu Lys Glu Gly Asp Glu Ile Leu Asp Val
625 630 635 640
Asn Gly Ile Pro Ile Lys Gly Leu Thr Phe Gln Glu Ala Ile His Thr
645 650 655
Phe Lys Gln Ile Arg Ser Gly Leu Phe Val Leu Thr Val Arg Thr Lys
660 665 670
Leu Val Ser Pro Ser Leu Thr Pro Cys Ser Thr Pro Thr His Met Ser
675 680 685
Arg Ser Ala Ser Pro Asn Phe Asn Thr Ser Gly Gly Ala Ser Ala Gly
690 695 700
Gly Ser Asp Glu Gly Ser Ser Ser Ser Leu Gly Arg Lys Thr Pro Gly
705 710 715 720
Pro Lys Asp Arg Ile Val Met Glu Val Thr Leu Asn Lys Glu Pro Arg
725 730 735
Val Gly Leu Gly Ile Gly Ala Cys Cys Leu Ala Leu Glu Asn Ser Pro
740 745 750
Pro Gly Ile Tyr Ile His Ser Leu Ala Pro Gly Ser Val Ala Lys Met
755 760 765
Glu Ser Asn Leu Ser Arg Gly Asp Gln Ile Leu Glu Val Asn Ser Val
770 775 780
Asn Val Arg His Ala Ala Leu Ser Lys Val His Ala Ile Leu Ser Lys
785 790 795 800
Cys Pro Pro Gly Pro Val Arg Leu Val Ile Gly Arg His Pro Asn Pro
805 810 815
Lys Val Ser Glu Gln Glu Met Asp Glu Val Ile Ala Arg Ser Thr Tyr
820 825 830
Gln Glu Ser Lys Glu Ala Asn Ser Ser Pro Gly Leu Gly Thr Pro Leu
835 840 845
Lys Ser Pro Ser Leu Ala Lys Lys Asp Ser Leu Ile Ser Glu Ser Glu
850 855 860
Leu Ser Gln Tyr Phe Ala His Asp Val Pro Gly Pro Leu Ser Asp Phe
865 870 875 880
Met Val Ala Gly Ser Glu Asp Glu Asp His Pro Gly Ser Gly Cys Ser
885 890 895
Thr Ser Glu Glu Gly Ser Leu Pro Pro Ser Thr Ser Thr His Lys Glu
900 905 910
Pro Gly Lys Pro Arg Ala Asn Ser Leu Val Thr Leu Gly Ser His Arg
915 920 925
Ala Ser Gly Leu Phe His Lys Gln Val Thr Val Ala Arg Gln Ala Ser
930 935 940
Leu Pro Gly Ser Pro Gln Ala Leu Arg Asn Pro Leu Leu Arg Gln Arg
945 950 955 960
Lys Val Gly Cys Tyr Asp Ala Asn Asp Ala Ser Asp Glu Glu Glu Phe
965 970 975
Asp Arg Glu Gly Asp Cys Ile Ser Leu Pro Gly Ala Leu Pro Gly Pro
980 985 990
Ile Arg Pro Leu Ser Glu Asp Asp Pro Arg Arg Val Ser Ile Ser Ser
995 1000 1005
Ser Lys Gly Met Asp Val His Asn Gln Glu Glu Arg Pro Arg Lys Thr
1010 1015 1020
Leu Val Ser Lys Ala Ile Ser Ala Pro Leu Leu Gly Ser Ser Val Asp
1025 1030 1035 1040
Leu Glu Glu Ser Ile Pro Glu Gly Met Val Asp Ala Ala Ser Tyr Ala
1045 1050 1055
Ala Asn Leu Thr Asp Ser Ala Glu Ala Pro Lys Gly Ser Pro Gly Ser
1060 1065 1070
Trp Trp Lys Lys Glu Leu Ser Gly Ser Ser Ser Ala Pro Lys Leu Glu
1075 1080 1085
Tyr Thr Val Arg Thr Asp Thr Gln Ser Pro Thr Asn Thr Gly Ser Pro
1090 1095 1100
Ser Ser Pro Gln Gln Lys Ser Glu Gly Leu Gly Ser Arg His Arg Pro
1105 1110 1115 1120
Val Ala Arg Val Ser Pro His Cys Lys Arg Ser Glu Ala Glu Ala Lys
1125 1130 1135
Pro Ser Gly Ser Gln Thr Val Asn Leu Thr Gly Arg Ala Asn Asp Pro
1140 1145 1150
Cys Asp Leu Asp Ser Arg Val Gln Ala Thr Ser Val Lys Val Thr Val
1155 1160 1165
Ala Gly Phe Gln Pro Gly Gly Ala Val Glu Lys Glu Ser Leu Gly Lys
1170 1175 1180
Leu Thr Thr Gly Asp Ala Cys Val Ser Thr Ser Cys Glu Leu Ala Ser
1185 1190 1195 1200
Ala Leu Ser His Leu Asp Ala Ser His Leu Thr Glu Asn Leu Pro Lys
1205 1210 1215
Ala Ala Ser Glu Leu Gly Gln Gln Pro Met Thr Glu Leu Asp Ser Ser
1220 1225 1230
Ser Asp Leu Ile Ser Ser Pro Gly Lys Lys Gly Ala Ala His Pro Asp
1235 1240 1245
Pro Ser Lys Thr Ser Val Asp Thr Gly Gln Val Ser Arg Pro Glu Asn
1250 1255 1260
Pro Ser Gln Pro Ala Ser Pro Arg Val Thr Lys Cys Lys Ala Arg Ser
1265 1270 1275 1280
Pro Val Arg Leu Pro His Glu Gly Ser Pro Ser Pro Gly Glu Lys Ala
1285 1290 1295
Ala Ala Pro Pro Asp Tyr Ser Lys Thr Arg Ser Ala Ser Glu Thr Ser
1300 1305 1310
Thr Pro His Asn Thr Arg Arg Val Ala Ala Leu Arg Gly Ala Gly Pro
1315 1320 1325
Gly Ala Glu Gly Met Thr Pro Ala Gly Ala Val Leu Pro Gly Asp Pro
1330 1335 1340
Leu Thr Ser Gln Glu Gln Arg Gln Gly Ala Pro Gly Asn His Ser Lys
1345 1350 1355 1360
Ala Leu Glu Met Thr Gly Ile His Ala Pro Glu Ser Ser Gln Glu Pro
1365 1370 1375
Ser Leu Leu Glu Gly Ala Asp Ser Val Ser Ser Arg Ala Pro Gln Ala
1380 1385 1390
Ser Leu Ser Met Leu Pro Ser Thr Asp Asn Thr Lys Glu Ala Cys Gly
1395 1400 1405
His Val Ser Gly His Cys Cys Pro Gly Gly Ser Arg Glu Ser Pro Val
1410 1415 1420
Thr Asp Ile Asp Ser Phe Ile Lys Glu Leu Asp Ala Ser Ala Ala Arg
1425 1430 1435 1440
Ser Pro Ser Ser Gln Thr Gly Asp Ser Gly Ser Gln Glu Gly Ser Ala
1445 1450 1455
Gln Gly His Pro Pro Ala Gly Ala Gly Gly Gly Ser Ser Cys Arg Ala
1460 1465 1470
Glu Pro Val Pro Gly Gly Gln Thr Ser Ser Pro Arg Arg Ala Trp Ala
1475 1480 1485
Ala Gly Ala Pro Ala Tyr Pro Gln Trp Ala Ser Gln Pro Ser Val Leu
1490 1495 1500
Asp Ser Ile Asn Pro Asp Lys His Phe Thr Val Asn Lys Asn Phe Leu
1505 1510 1515 1520
Ser Asn Tyr Ser Arg Asn Phe Ser Ser Phe His Glu Asp Ser Thr Ser
1525 1530 1535
Leu Ser Gly Leu Gly Asp Ser Thr Glu Pro Ser Leu Ser Ser Met Tyr
1540 1545 1550
Gly Asp Ala Glu Asp Ser Ser Ser Asp Pro Glu Ser Leu Thr Glu Ala
1555 1560 1565
Pro Arg Ala Ser Ala Arg Asp Gly Trp Ser Pro Pro Arg Ser Arg Val
1570 1575 1580
Ser Leu His Lys Glu Asp Pro Ser Glu Ser Glu Glu Glu Gln Ile Glu
1585 1590 1595 1600
Ile Cys Ser Thr Arg Gly Cys Pro Asn Pro Pro Ser Ser Pro Ala His
1605 1610 1615
Leu Pro Thr Gln Ala Ala Ile Cys Pro Ala Ser Ala Lys Val Leu Ser
1620 1625 1630
Leu Lys Tyr Ser Thr Pro Arg Glu Ser Val Ala Ser Pro Arg Glu Lys
1635 1640 1645
Ala Ala Cys Leu Pro Gly Ser Tyr Thr Ser Gly Pro Asp Ser Ser Gln
1650 1655 1660
Pro Ser Ser Leu Leu Glu Met Ser Ser Gln Glu His Glu Thr His Ala
1665 1670 1675 1680
Asp Ile Ser Thr Ser Gln Asn His Arg Pro Ser Cys Ala Glu Glu Thr
1685 1690 1695
Thr Glu Val Thr Ser Ala Ser Ser Ala Met Glu Asn Ser Pro Leu Ser
1700 1705 1710
Lys Val Ala Arg His Phe His Ser Pro Pro Ile Ile Leu Ser Ser Pro
1715 1720 1725
Asn Met Val Asn Gly Leu Glu His Asp Leu Leu Asp Asp Glu Thr Leu
1730 1735 1740
Asn Gln Tyr Glu Thr Ser Ile Asn Ala Ala Ala Ser Leu Ser Ser Phe
1745 1750 1755 1760
Ser Val Asp Val Pro Lys Asn Gly Glu Ser Val Leu Glu Asn Leu His
1765 1770 1775
Ile Ser Glu Ser Gln Asp Leu Asp Asp Leu Leu Gln Lys Pro Lys Met
1780 1785 1790
Ile Ala Arg Arg Pro Ile Met Ala Trp Phe Lys Glu Ile Asn Lys His
1795 1800 1805
Asn Gln Gly Thr His Leu Arg Ser Lys Thr Glu Lys Glu Gln Pro Leu
1810 1815 1820
Met Pro Ala Arg Ser Pro Asp Ser Lys Ile Gln Met Val Ser Ser Ser
1825 1830 1835 1840
Gln Lys Lys Gly Val Thr Val Pro His Ser Pro Pro Gln Pro Lys Thr
1845 1850 1855
Asn Leu Glu Asn Lys Asp Leu Ser Lys Lys Ser Pro Ala Glu Met Leu
1860 1865 1870
Leu Thr Asn Gly Gln Lys Ala Lys Cys Gly Pro Lys Leu Lys Arg Leu
1875 1880 1885
Ser Leu Lys Gly Lys Ala Lys Val Asn Ser Glu Ala Pro Ala Ala Asn
1890 1895 1900
Ala Val Lys Ala Gly Gly Thr Asp His Arg Lys Pro Leu Ile Ser Pro
1905 1910 1915 1920
Gln Thr Ser His Lys Thr Leu Ser Lys Ala Val Ser Gln Arg Leu His
1925 1930 1935
Val Ala Asp His Glu Asp Pro Asp Arg Asn Thr Thr Ala Ala Pro Arg
1940 1945 1950
Ser Pro Gln Cys Val Leu Glu Ser Lys Pro Pro Leu Ala Thr Ser Gly
1955 1960 1965
Pro Leu Lys Pro Ser Val Ser Asp Thr Ser Ile Arg Thr Phe Val Ser
1970 1975 1980
Pro Leu Thr Ser Pro Lys Pro Val Pro Glu Gln Gly Met Trp Ser Arg
1985 1990 1995 2000
Phe His Met Ala Val Leu Ser Glu Pro Asp Arg Gly Cys Pro Thr Thr
2005 2010 2015
Pro Lys Ser Pro Lys Cys Arg Ala Glu Gly Arg Ala Pro Arg Ala Asp
2020 2025 2030
Ser Gly Pro Val Ser Pro Ala Ala Ser Arg Asn Gly Met Ser Val Ala
2035 2040 2045
Gly Asn Arg Gln Ser Glu Pro Arg Leu Ala Ser His Val Ala Ala Asp
2050 2055 2060
Thr Ala Gln Pro Arg Pro Thr Gly Glu Lys Gly Gly Asn Ile Met Ala
2065 2070 2075 2080
Ser Asp Arg Leu Glu Arg Thr Asn Gln Leu Lys Ile Val Glu Ile Ser
2085 2090 2095
Ala Glu Ala Val Ser Glu Thr Val Cys Gly Asn Lys Pro Ala Glu Ser
2100 2105 2110
Asp Arg Arg Gly Gly Cys Leu Ala Gln Gly Asn Cys Gln Glu Lys Ser
2115 2120 2125
Glu Ile Arg Leu Tyr Arg Gln Val Ala Glu Ser Ser Thr Ser His Pro
2130 2135 2140
Ser Ser Leu Pro Ser His Ala Ser Gln Ala Glu Gln Glu Met Ser Arg
2145 2150 2155 2160
Ser Phe Ser Met Ala Lys Leu Ala Ser Ser Ser Ser Ser Leu Gln Thr
2165 2170 2175
Ala Ile Arg Lys Ala Glu Tyr Ser Gln Gly Lys Ser Ser Leu Met Ser
2180 2185 2190
Asp Ser Arg Gly Val Pro Arg Asn Ser Ile Pro Gly Gly Pro Ser Gly
2195 2200 2205
Glu Asp His Leu Tyr Phe Thr Pro Arg Pro Ala Thr Arg Thr Tyr Ser
2210 2215 2220
Met Pro Ala Gln Phe Ser Ser His Phe Gly Arg Glu Gly His Pro Pro
2225 2230 2235 2240
His Ser Leu Gly Arg Ser Arg Asp Ser Gln Val Pro Val Thr Ser Ser
2245 2250 2255
Val Val Pro Glu Ala Lys Ala Ser Arg Gly Gly Leu Pro Ser Leu Ala
2260 2265 2270
Asn Gly Gln Gly Ile Tyr Ser Val Lys Pro Leu Leu Asp Thr Ser Arg
2275 2280 2285
Asn Leu Pro Ala Thr Asp Glu Gly Asp Ile Ile Ser Val Gln Glu Thr
2290 2295 2300
Ser Cys Leu Val Thr Asp Lys Ile Lys Val Thr Arg Arg His Tyr Cys
2305 2310 2315 2320
Tyr Glu Gln Asn Trp Pro His Glu Ser Thr Ser Phe Phe Ser Val Lys
2325 2330 2335
Gln Arg Ile Lys Ser Phe Glu Asn Leu Ala Asn Ala Asp Arg Pro Val
2340 2345 2350
Ala Lys Ser Gly Ala Ser Pro Phe Leu Ser Val Ser Ser Lys Pro Pro
2355 2360 2365
Ile Gly Arg Arg Ser Ser Gly Ser Ile Val Ser Gly Ser Leu Gly His
2370 2375 2380
Pro Gly Asp Ala Ala Ala Arg Leu Leu Arg Arg Ser Leu Ser Ser Cys
2385 2390 2395 2400
Ser Glu Asn Gln Ser Glu Ala Gly Thr Leu Leu Pro Gln Met Ala Lys
2405 2410 2415
Ser Pro Ser Ile Met Thr Leu Thr Ile Ser Arg Gln Asn Pro Pro Glu
2420 2425 2430
Thr Ser Ser Lys Gly Ser Asp Ser Glu Leu Lys Lys Ser Leu Gly Pro
2435 2440 2445
Leu Gly Ile Pro Thr Pro Thr Met Thr Leu Ala Ser Pro Val Lys Arg
2450 2455 2460
Asn Lys Ser Ser Val Arg His Thr Gln Pro Ser Pro Val Ser Arg Ser
2465 2470 2475 2480
Lys Leu Gln Glu Leu Arg Ala Leu Ser Met Pro Asp Leu Asp Lys Leu
2485 2490 2495
Cys Ser Glu Asp Tyr Ser Ala Gly Pro Ser Ala Val Leu Phe Lys Thr
2500 2505 2510
Glu Leu Glu Ile Thr Pro Arg Arg Ser Pro Gly Pro Pro Ala Gly Gly
2515 2520 2525
Val Ser Cys Pro Glu Lys Gly Gly Asn Arg Ala Cys Pro Gly Gly Ser
2530 2535 2540
Gly Pro Lys Thr Ser Ala Ala Glu Thr Pro Ser Ser Ala Ser Asp Thr
2545 2550 2555 2560
Gly Glu Ala Ala Gln Asp Leu Pro Phe Arg Arg Ser Trp Ser Val Asn
2565 2570 2575
Leu Asp Gln Leu Leu Val Ser Ala Gly Asp Gln Gln Arg Leu Gln Ser
2580 2585 2590
Val Leu Ser Ser Val Gly Ser Lys Ser Thr Ile Leu Thr Leu Ile Gln
2595 2600 2605
Glu Ala Lys Ala Gln Ser Glu Asn Glu Glu Asp Val Cys Phe Ile Val
2610 2615 2620
Leu Asn Arg Lys Glu Gly Ser Gly Leu Gly Phe Ser Val Ala Gly Gly
2625 2630 2635 2640
Thr Asp Val Glu Pro Lys Ser Ile Thr Val His Arg Val Phe Ser Gln
2645 2650 2655
Gly Ala Ala Ser Gln Glu Gly Thr Met Asn Arg Gly Asp Phe Leu Leu
2660 2665 2670
Ser Val Asn Gly Ala Ser Leu Ala Gly Leu Ala His Gly Asn Val Leu
2675 2680 2685
Lys Val Leu His Gln Ala Gln Leu His Lys Asp Ala Leu Val Val Ile
2690 2695 2700
Lys Lys Gly Met Asp Gln Pro Arg Pro Ser Ala Arg Gln Glu Pro Pro
2705 2710 2715 2720
Thr Ala Asn Gly Lys Gly Leu Leu Ser Arg Lys Thr Ile Pro Leu Glu
2725 2730 2735
Pro Gly Ile Gly Arg Ser Val Ala Val His Asp Ala Leu Cys Val Glu
2740 2745 2750
Val Leu Lys Thr Ser Ala Gly Leu Gly Leu Ser Leu Asp Gly Gly Lys
2755 2760 2765
Ser Ser Val Thr Gly Asp Gly Pro Leu Val Ile Lys Arg Val Tyr Lys
2770 2775 2780
Gly Gly Ala Ala Glu Gln Ala Gly Ile Ile Glu Ala Gly Asp Glu Ile
2785 2790 2795 2800
Leu Ala Ile Asn Gly Lys Pro Leu Val Gly Leu Met His Phe Asp Ala
2805 2810 2815
Trp Asn Ile Met Lys Ser Val Pro Glu Gly Pro Val Gln Leu Leu Ile
2820 2825 2830
Arg Lys His Arg Asn Ser Ser
2835
<210> 46
<211> 11984
<212> DNA
<213> Artificial Sequence
<220>
<223> PDZD2
<400> 46
gaggctcggc ggatcccctg cgcagcgagg cgaggagcgg accccagcgc cggtgcgtgc 60
cggccccggg cagcgggacg cggcggggcg gcggctgcag gcagccgagg agccgcaggc 120
cgaacccaag gcaccgggat tgcgcctccc gcggctgccg gcgaaccgcg gctctgcagc 180
tcggggcagg cgcggcggcg gcaccggtgg tggccgcggt ggcggcagct gcgcggggac 240
ccgccgggcg gcgcctgggt ctggacgcgc gaggaagccg cgggagcctc ggccaagccg 300
cgagcaggtg tgaatgagcc cagggaagga cacacggcca ctgctggagg gatcctccat 360
tcctgtgtca tttgcatggg tcctgctgtg aaatgaacct ggcagggact tgttagacac 420
ttccttcctt ccctcattga gcactccagt gccattgttc cacagttgtt ctaattgggt 480
cctagcttcc tcctgccaag gcaaacagca tagtctcgag taggtgtccc taggctcatc 540
tgccagcctg aacatgaaca caggcaaagc tgatgatggc cagggacccc aggggacgtg 600
gggccctgtg gggtctggcc cccaggagca agacctctga tgatgctggt gtctgggagt 660
gagcaccatg cccatcaccc aggacaatgc cgtgctgcac ctgcccctcc tctaccagtg 720
gctgcagaac agcctgcagg aaggtgggga tgggccggag cagcggctct gccaggcggc 780
catccagaag ctgcaggagt acatccagct gaactttgct gtggatgaga gtacggtccc 840
acctgatcac agcccccccg aaatggagat ctgtactgtg tacctcacca aggagctggg 900
ggacacagag actgtgggcc tgagttttgg gaacatccct gttttcgggg actatggtga 960
aaagcgcagg gggggcaaga agaggaaaac ccaccagggt cctgtgctgg atgtgggctg 1020
catctgggtg acagagctga ggaagaacag cccagcaggg aagagtggga aggtccgact 1080
gcgggatgag atcctctcac tgaatgggca gctgatggtt ggagttgatg tcagtggggc 1140
cagttacctg gctgagcagt gctggaatgg cggctttatc tacctgatca tgctgcgtcg 1200
ctttaagcac aaagcccact ccacttataa tggcaacagt agcaacagct ctgaaccagg 1260
agaaacacct accttggagc tgggtgaccg aactgcgaaa aaggggaaac gaaccagaaa 1320
gtttggggtc atctccaggc ctcctgccaa caaggcccct gaagaatcca agggcagcgc 1380
tggctgtgag gtgtccagtg accccagcac tgagctggag aacggccctg accctgaact 1440
tggaaacggc catgtctttc agctagaaaa tggcccagat tctctcaagg aggtggctgg 1500
accccatcta gagaggtcag aagtggacag agggacagag catagaattc caaagacaga 1560
tgctcctctg accacaagca atgacaaacg ccgcttctca aaaggtggga agacggactt 1620
ccaatcgagt gactgcctgg cacgggagga agttggccga atatggaaga tggagctgct 1680
caaagaatcg gatgggctgg gaattcaggt tagtggaggc cgaggatcaa agcgctcacc 1740
tcacgctatc gttgtcactc aagtgaagga aggaggtgcc gctcacaggg atggcaggct 1800
gtccttagga gatgagctgc tggtaatcaa tggtcattta ctggtcgggc tctcccacga 1860
ggaagcagtg gccattcttc gctccgccac gggaatggtg cagcttgtgg tggccagcaa 1920
ggaaaactcc gcagaggacc tcctcaggtt aacatctaag agcttgccag atctgaccag 1980
ctcggtagaa gatgtgtcct cctggactga taacgaagac caggaggcag acggggaaga 2040
ggacgaagga accagctctt ctgtccagag agcaatgcct gggacagatg aaccccaaga 2100
tgtgtgcggt gctgaggaat ccaaggggaa cttggaaagt cccaaacagg gcagcaataa 2160
aatcaagctc aagagtcgcc tttcaggggg tgtacaccgc cttgagtcag ttgaagaata 2220
taacgagctg atggtgcgga atggggaccc ccggatccgg atgttggagg tctcccgaga 2280
tggccggaaa cactccctcc cgcagctgct ggactcttcc agtgcctcac aggaatacca 2340
cattgtgaag aagtctaccc gctccttaag cacgactcag gtggaatctc cttggaggct 2400
cattcggcca tccgtcatct cgatcattgg gttgtacaaa gaaaaaggca agggccttgg 2460
ctttagtatt gctggaggtc gagactgcat tcgtggacag atggggattt ttgtcaagac 2520
catcttccca aatggatcag ctgcagagga cggaagactt aaagaagggg atgaaatcct 2580
agatgtaaat ggaataccaa taaagggctt gacatttcaa gaagccattc atacctttaa 2640
gcaaatccgg agtggattat ttgttttaac ggtacgcaca aagttggtga gccccagcct 2700
cacaccctgc tcgacaccca cacacatgag cagatccgcc tccccgaact tcaataccag 2760
tgggggagcc tcagcgggag gttccgatga aggcagttct tcatccctgg gtcggaagac 2820
ccctgggccc aaggacagga tcgtcatgga agtaacactc aacaaagagc caagagttgg 2880
attaggcatt ggtgcctgct gcttggctct ggaaaacagt cctcctggca tctacattca 2940
cagccttgct ccaggatcag tggccaagat ggagagcaac ctgagccgcg gggatcaaat 3000
cctggaagtg aactccgtca acgtccgcca tgctgcttta agcaaagtcc acgccatctt 3060
gagtaaatgc cctccaggac ccgttcgcct tgtcatcggc cggcacccta atccaaaggt 3120
ttccgagcag gaaatggatg aagtcatagc acgcagcact tatcaggaga gcaaagaggc 3180
caattcctct cctggcttag gtaccccctt gaagagtccc tctcttgcaa aaaaggactc 3240
ccttatttct gaatctgaac tctcccagta ctttgcccac gatgtccctg gccccttgtc 3300
agacttcatg gtggccggtt ctgaggacga ggatcacccg ggaagtggct gcagcacgtc 3360
ggaggagggc agcctgcctc ccagcacctc cactcacaag gagcctggaa aacccagagc 3420
caacagcctc gtgactcttg ggagccatcg ggcttctggg ctcttccaca agcaggtgac 3480
agttgccaga caagccagtc tccccggaag cccacaggcc ctccgaaacc ctctcctccg 3540
ccagaggaag gtaggctgct acgatgccaa cgatgccagt gatgaggaag agtttgacag 3600
agaaggggac tgcatttcac tcccaggggc cctcccgggt cccatcaggc ctctgtcaga 3660
ggatgacccg aggcgtgtct caatttcctc ttccaagggc atggacgtcc acaaccaaga 3720
ggaacgaccc cggaaaacac tggtgagcaa ggccatctcg gcacctcttc ttggtagctc 3780
agtggactta gaggagagta tcccagaggg catggtggat gctgcgtcct atgcagccaa 3840
cctcacggac tctgcagagg cccccaaggg gagccctgga agctggtgga agaaggaact 3900
gtcaggatca agtagcgcac ccaaattgga atacacagtc cgtacagaca cccagagtcc 3960
gacgaacact gggagcccca gttcccccca gcagaaaagt gaaggcctgg gctccaggca 4020
cagaccagtg gccagggtaa gcccccactg caagagatcc gaggctgagg ccaagcccag 4080
tggctcacag acagtgaacc tgactggcag agccaatgat ccatgcgatc tggactcgag 4140
agtccaggcc acttctgtca aagtgactgt cgctggcttt cagccaggtg gagctgtgga 4200
gaaggaatct ctgggaaagc tgaccactgg agatgcttgt gtctctacca gctgtgaact 4260
agccagtgct ctgtcccatc tggatgccag ccacctcaca gagaacctgc ccaaagctgc 4320
atcagagctg gggcaacaac ccatgactga actggacagc tcctcagacc tcatctcttc 4380
cccagggaag aagggggccg ctcatcctga ccccagcaag acctctgtag acacagggca 4440
agtcagtcgg ccagagaatc ccagccagcc tgcatcgccc agggtcacca agtgcaaggc 4500
caggtctcca gtcaggctcc cccatgaggg cagcccctcc ccgggggaga aagcagcggc 4560
tccccctgac tacagcaaga ctcgatcagc atcggaaacc agcacacccc acaataccag 4620
gagggtggct gccctcaggg gagcgggacc tggagcagag ggaatgacac cagctggtgc 4680
tgtcctgcca ggagaccccc tcacatccca ggagcagaga cagggagctc caggtaacca 4740
cagtaaggct ctggaaatga caggaatcca tgcacctgaa agctcccagg agccttccct 4800
gctggaggga gcagattctg tgtcctcaag ggcaccgcag gccagcctct ccatgctgcc 4860
atccactgac aacaccaaag aagcatgtgg ccatgtctcg gggcactgct gcccaggggg 4920
gagtagagag agccctgtga cggacattga cagcttcatc aaggagctgg atgcttctgc 4980
agcaaggtct ccgtcttccc agacggggga cagtggctct caggagggca gtgctcaggg 5040
ccacccacca gccggggctg gaggtgggag ctcctgccgt gccgaaccag tcccgggggg 5100
ccagacctcc tccccgagga gggcctgggc tgctggtgcc cccgcctacc cacaatgggc 5160
ctcccagcct tcggttttag attcaattaa tcccgacaaa cattttactg tgaacaaaaa 5220
ctttctgagc aactactcta gaaattttag cagttttcat gaagacagca cctccctatc 5280
aggcctgggt gacagcacgg agccgtctct gtcatccatg tatggcgatg ctgaggattc 5340
ttcttctgac cctgagtcac tcactgaagc cccacgagct tctgccaggg acggctggtc 5400
ccctcctcgt tcccgtgtgt ctttgcacaa ggaagatcct tcggagtcag aagaggaaca 5460
gattgagatt tgttccacac gtggctgccc caatccaccc tcgagtcctg ctcatcttcc 5520
cacccaggct gccatctgtc ctgcctcagc caaagttctg tcattaaaat acagcactcc 5580
gagagagtcg gtggccagtc cccgtgagaa ggccgcctgc ttgccaggct catacacttc 5640
aggcccagac tcttcccagc catcatcact cttggagatg agctctcagg agcatgaaac 5700
tcatgcggac ataagcactt cacagaacca caggccctcg tgtgcagaag aaaccacaga 5760
agtcaccagc gctagctcag ccatggaaaa cagtccgctg tctaaagtag ccaggcattt 5820
tcacagtccg cccatcattc tcagctcccc caacatggta aatggcttgg aacatgacct 5880
gctagatgac gaaaccctga atcaatacga aacaagcatt aatgcagctg ccagtctgtc 5940
ctccttcagt gtggatgtcc ctaagaatgg agaatctgtt ttggaaaacc tccacatctc 6000
tgaaagtcaa gacctggatg acttgctaca gaaaccaaaa atgatcgcta ggaggcccat 6060
catggcctgg tttaaagaaa taaataaaca taaccaaggc acacatttga ggagcaaaac 6120
cgagaaggaa caacctctaa tgcctgccag aagtcccgac tccaagattc agatggtgag 6180
ttcaagccaa aaaaagggcg ttactgtgcc tcatagccct cctcagccga aaacaaacct 6240
ggaaaataag gacctgtcta agaagagtcc ggcagaaatg cttctgacta atggtcagaa 6300
ggcaaagtgt ggtccgaagc tgaagaggct cagcctcaag ggcaaggcca aagtcaactc 6360
tgaggcccct gctgcgaatg ctgtgaaggc tggggggacg gaccacagga aacccttgat 6420
ctcaccccag acctcccaca aaacactttc taaggcagtg tcacagcggc tccatgtagc 6480
cgaccacgag gaccctgaca gaaacaccac agctgccccc aggtcccccc agtgtgtgct 6540
ggaaagcaag ccacctcttg ccacctctgg gccactgaaa ccctcagtgt ctgacacgag 6600
catcaggaca tttgtctcgc ccctgacctc tcccaagcct gttcctgagc aaggcatgtg 6660
gagcaggttc cacatggctg tcctctctga acccgacaga ggttgcccaa ccacccctaa 6720
atctcctaag tgtagagcag agggcagggc gccccgtgct gactccgggc cggtgagtcc 6780
ggcagcgtct aggaacggca tgtccgtggc agggaacaga cagagtgagc cgcgcctggc 6840
cagccatgtg gcagcagaca cagcccaacc caggccgact ggcgaaaaag gaggcaacat 6900
aatggccagc gatcgcctcg aaagaacaaa ccagctgaaa atcgtggaga tttctgctga 6960
agcagtgtca gagactgtat gtggtaacaa gccagctgaa agcgacagac ggggagggtg 7020
cttggcccag ggcaactgtc aggagaagag tgaaatcagg ctctatcgcc aggtcgcaga 7080
atcatccaca agtcatccat cctcactccc atctcatgcc tcccaggcag agcaggaaat 7140
gtcacgatca ttcagcatgg caaaactggc gtcctcctcc tcctcccttc aaacagccat 7200
tagaaaggca gaatactccc agggaaaatc aagcctgatg tcagactccc gaggggtgcc 7260
cagaaacagc attccagggg gcccctcggg ggaggaccat ctctacttca ccccaaggcc 7320
agcgaccagg acctactcca tgccagccca gttctcaagc cattttggac gggagggtca 7380
ccccccacac agcctgggtc gctctcggga cagccaggtc cctgtgacaa gcagtgttgt 7440
ccccgaggca aaggcatcca gaggtggtct tcccagcctg gctaatggac agggcatata 7500
tagtgtaaag ccgctgctgg acacatcgag gaatcttcca gccacagatg aaggggatat 7560
catttcagtc caggagacga gctgcctagt cacagacaaa atcaaagtca ccagacgaca 7620
ctactgctat gagcagaact ggccccatga atctacctca tttttctctg tgaagcagcg 7680
gatcaagtct tttgagaacc tggccaatgc tgaccggcct gtagccaagt ccggggcttc 7740
cccatttttg tcggtgagct ccaagcctcc cattgggagg cggtcttccg gcagcattgt 7800
ttccgggagc ctgggccacc caggtgacgc agcagcaagg ttgttgagac gcagcttgag 7860
ttcctgcagc gaaaaccaaa gcgaagccgg caccctcctg ccccagatgg ccaagtctcc 7920
ctcaatcatg acactgacca tctctcggca gaacccacca gagaccagta gcaagggctc 7980
tgattcggaa ctaaagaaat cacttggtcc tttgggaatt cccaccccaa cgatgaccct 8040
ggcttctcct gttaagagga acaagtcctc ggtacgccac acgcagccct cgcccgtgtc 8100
ccgctccaag ctccaggagc tgagagcctt gagcatgcct gaccttgaca agctctgcag 8160
cgaggattac tcagcagggc cgagcgccgt gctcttcaaa actgagctgg agatcacccc 8220
caggaggtca cctggccctc ctgctggagg cgtttcgtgt cccgagaagg gcgggaacag 8280
ggcctgtcca ggaggaagtg gccctaaaac cagtgctgct gagacaccca gttcagccag 8340
tgatacgggt gaagctgccc aggatctgcc ttttagaaga agctggtcag ttaatttgga 8400
tcaacttcta gtctcagcgg gggaccagca aagattacag tctgttttat cgtcagtggg 8460
atcgaaatct accatcctaa ctctcattca ggaagcgaaa gcacaatcag agaatgaaga 8520
agatgtttgc ttcatagtct tgaatagaaa agaaggctca ggtctgggat tcagtgtggc 8580
aggagggaca gatgtggagc caaaatcaat cacggtccac agggtgtttt ctcagggggc 8640
ggcttctcag gaagggacta tgaaccgagg ggatttcctt ctgtcagtca acggcgcctc 8700
actggctggc ttagcccacg ggaatgtcct gaaggttctg caccaggcac agctgcacaa 8760
agatgccctc gtggtcatca agaaagggat ggatcagccc aggccctctg cccggcagga 8820
gcctcccaca gccaatggga agggtttgct gtccagaaag accatccccc tggagcctgg 8880
cattgggaga agtgtggctg tacacgatgc tctgtgtgtt gaagtgctga agacctcggc 8940
tgggctggga ctgagtctgg atgggggaaa atcatcggtg acgggagatg ggcccttggt 9000
cattaaaaga gtgtacaaag gtggtgcggc tgaacaagct ggaataatag aagctggaga 9060
tgaaattctt gctattaatg ggaaacctct ggttgggctc atgcactttg atgcctggaa 9120
tattatgaag tctgtcccag aaggacctgt gcagttatta attagaaagc ataggaattc 9180
ttcatgaatt ttaacaagaa tcattttctc agttctcttc tttctttagc aaatcagagt 9240
gacttcttta aaccacaggt tgttgaaatg gccaacactg gtacagacac ggactataaa 9300
aatctccaag cttgtgctta cacatgaagc ctgacttaac tgtatgtgca acagcaatga 9360
aattaactcc agaagccttc cacctgcgtc acccaggccg ggagggttcc ttcgttccag 9420
tgcctgtccc ctacctttat gttatgttta ctgatgggga tacaagatgt gacacaccct 9480
tctttatttg aaacaaacaa acatttagct agacctttgc ttccttcttg ccagctctcc 9540
caacataccc aatcctggtg atcagggaac taaaagtctg agggggacac aaatgtcaca 9600
cctaagagga caatcaatca ttttgtatga ttttgtaagt aaatgacaga atgcttttag 9660
gcacattcaa tggaaggagg agatgtaggt ctgtatatgt taccctgaaa agagaataag 9720
acttacttaa aaaaatgaat tatgacctgt taggctgagc tcaggaattg tccaaaaagg 9780
aaaaagcaaa ataattaatt gagagtattt tttagtgagt gtaatgtata atgtacgtat 9840
gcaaagttca actcaatagg ttattgatca ccatgaagta ttgatcattt tctatctcaa 9900
aagtgtaagc cataaggctg ttttacagaa tagcacttct gataagctgt attaaatagc 9960
catgagcttc actgcttaga gggagcagaa aggtcaacat ctaaaagcac cttacaacta 10020
gtttttgaac ctgtcttgat aagtgcttga attcaagact ggtcagtcca agagcagaca 10080
aaaatatcac aagtcagtca gtcactgggt ttccatttct gaattttatg cactccaacc 10140
atgaatttaa actaaatttt tagaaatcaa gtatctttct aagtgtcctt ggatttatag 10200
acaatgtatg tacaatccaa atagaggagc ttaatggaat ccttttagga gactggttgg 10260
tttttttccc tctttcccaa catgtttaag aaatgtaaca ttctaagtat tggatctctt 10320
ttcttgacct agtataatga caactgcagt gacttaagtt tttgctgttt tcgttttccc 10380
gctttgcaat ttcctccttt tgccaaaaat gttttcctac agaagactgt cgtgactcac 10440
gctacttggg aaactcactc tggccactcc tcctctggtg gcatgagctg cttcccagta 10500
gctattccga ttggatattc cgttcgtcgt cacatagctg gcttttctct cctcatgatg 10560
taccttattt tcttaggtaa ataattccaa actctcatcg ggtcataaag aggaggagaa 10620
acagggtgag tcaaggtaaa ggagcagaaa tgtagttaca agccaggtcg tcttcagtgg 10680
cacaaaccaa cccgttgagc cctgacaaca tgagtggaga gtgcatttgc catacctgtg 10740
tgcatgacac taagatttta tgttggagat acttctttaa ataacctaca gcttgggtct 10800
atggctgtga cccccagatt catggagggg ctttagccat cagctttgta catcatcatt 10860
tttctgaatg accaatccca ctaaacatct ttgaagtcgg cctagagagg tccttcagat 10920
gagagagaaa tagctggctt gtctgagtcc agatttctca tcaactggca atacaaagga 10980
aaatatggta caggagttag ttagaaaggt cttattgatt ttacttctac ttttcactac 11040
agttacaggt agaatactgt aggaagtcag tgcaaggtgc atgcttgatt gatagatatt 11100
gattgattgt ttttcagtct ctggggtcag ttttgtggtt tctgctttct tgcctaaatc 11160
aaagactatt tcaagtcaac aacactgaaa actgcttttc gcctccactc ttacagctgt 11220
gcctaataat aattaattaa taaacgcaca gccctatgtg aacagacagg aatttcttgt 11280
gcaatgtgga gcaaatggaa tggtctcctt ccgcaagtct ttttaatcct catatctgga 11340
gtacaagggt agacctctgg cttaccacat acactatgct aaagtcatca gccactgcta 11400
ctacatcttg ccagaaggtt tccctcgcca acaaacagtt gaaatttaag ggaagaagca 11460
aaagctaaac tgtctttgac cctaagatag atagaaagct atttatttgt cttcagtgtt 11520
caaggcatga ctagtatttc taattagcct aataaattcc cacactttct gaagtgaaca 11580
ctaatggtat tgtcctacta aaactgtcat tgtttctttt tttttaactg gtcagtcatt 11640
cacaataagc tatgagggta aataaatatg tgttataaca agtaaaccgt agttgcaaga 11700
atataccatg aagattaaag taggctgggt ttcatttcca tcttcccaca catctcattg 11760
aatttgatgg ttgacttaat tggcaccata actttgtatg atattataca ttaaccttta 11820
tttatgtaaa gtaaaatgcc ttatatatta aagagtaagt gcaataatat gaaatagcct 11880
gtacatttta aaaatgttgt caccaagtta tataaatcca catctctgta aacaaccttt 11940
tttaagtaat tttaaaaaaa ataaacactc tgcttactac ttga 11984
<210> 47
<211> 298
<212> PRT
<213> Artificial Sequence
<220>
<223> GOLPH3
<400> 47
Met Thr Ser Leu Thr Gln Arg Ser Ser Gly Leu Val Gln Arg Arg Thr
1 5 10 15
Glu Ala Ser Arg Asn Ala Ala Asp Lys Glu Arg Ala Ala Gly Gly Gly
20 25 30
Ala Gly Ser Ser Glu Asp Asp Ala Gln Ser Arg Arg Asp Glu Gln Asp
35 40 45
Asp Asp Asp Lys Gly Asp Ser Lys Glu Thr Arg Leu Thr Leu Met Glu
50 55 60
Glu Val Leu Leu Leu Gly Leu Lys Asp Arg Glu Gly Tyr Thr Ser Phe
65 70 75 80
Trp Asn Asp Cys Ile Ser Ser Gly Leu Arg Gly Cys Met Leu Ile Glu
85 90 95
Leu Ala Leu Arg Gly Arg Leu Gln Leu Glu Ala Cys Gly Met Arg Arg
100 105 110
Lys Ser Leu Leu Thr Arg Lys Val Ile Cys Lys Ser Asp Ala Pro Thr
115 120 125
Gly Asp Val Leu Leu Asp Glu Ala Leu Lys His Val Lys Glu Thr Gln
130 135 140
Pro Pro Glu Thr Val Gln Asn Trp Ile Glu Leu Leu Ser Gly Glu Thr
145 150 155 160
Trp Asn Pro Leu Lys Leu His Tyr Gln Leu Arg Asn Val Arg Glu Arg
165 170 175
Leu Ala Lys Asn Leu Val Glu Lys Gly Val Leu Thr Thr Glu Lys Gln
180 185 190
Asn Phe Leu Leu Phe Asp Met Thr Thr His Pro Leu Thr Asn Asn Asn
195 200 205
Ile Lys Gln Arg Leu Ile Lys Lys Val Gln Glu Ala Val Leu Asp Lys
210 215 220
Trp Val Asn Asp Pro His Arg Met Asp Arg Arg Leu Leu Ala Leu Ile
225 230 235 240
Tyr Leu Ala His Ala Ser Asp Val Leu Glu Asn Ala Phe Ala Pro Leu
245 250 255
Leu Asp Glu Gln Tyr Asp Leu Ala Thr Lys Arg Val Arg Gln Leu Leu
260 265 270
Asp Leu Asp Pro Glu Val Glu Cys Leu Lys Ala Asn Thr Asn Glu Val
275 280 285
Leu Trp Ala Val Val Ala Ala Phe Thr Lys
290 295
<210> 48
<211> 2678
<212> DNA
<213> Artificial Sequence
<220>
<223> GOLPH3
<400> 48
atattggaaa ggcgccgccg ccgcctccgc cttggagctc ggggtgtttc ggggactgcg 60
gccacaggca ggaaggcgct cctctcctgc cccgccgacg cccggccagc ccgcttcgcc 120
ctgacctgtt tcctcatgac tgcccccggc cctgctgccg acggacgtcg ccccggcgtc 180
cggatttaac acggaaaccc ggatcggagg ccgcgcgggg aggaggaggg cgacccggtc 240
ggtcctgcga ccctctcggc ccggctcggc gcctcggcgg gagccatgac ctcgctgacc 300
cagcgcagct ccggcctggt gcagcggcgc accgaggcct cccgcaacgc cgccgacaag 360
gagcgggcgg cgggcggcgg cgccggcagc agcgaggacg acgcgcagag ccgccgcgac 420
gagcaggacg acgacgacaa gggcgactcc aaggaaacgc ggctgaccct gatggaggaa 480
gtgctcctgc tgggcctcaa ggaccgcgag ggttacacat cattttggaa tgactgtata 540
tcatctggat tacgtggctg tatgttaatt gaattagcat tgagaggaag gttacaacta 600
gaggcttgtg gaatgagacg taaaagtcta ttaacaagaa aggtaatctg taagtcagat 660
gctccaacag gggatgttct tcttgatgaa gctctgaagc atgttaagga aactcagcct 720
ccagaaacgg tccagaactg gattgaatta cttagtggtg agacatggaa tccattaaaa 780
ttgcattatc agttaagaaa tgtacgggaa cgattagcta aaaacctggt ggaaaagggt 840
gtattgacaa cagagaaaca gaacttccta ctttttgaca tgacaacaca tcccctcacc 900
aataacaaca ttaagcagcg cctcatcaag aaagtacagg aagccgttct tgacaaatgg 960
gtgaatgacc ctcaccgcat ggacaggcgc ttgctggccc tcatttacct ggctcatgcc 1020
tcggacgtcc tggagaatgc ttttgctcct cttctggacg agcagtatga tttggctacc 1080
aagagagtgc ggcagcttct cgacttagac cctgaagtgg aatgtctgaa ggccaacacc 1140
aatgaggttc tgtgggcggt ggtggcggcg ttcaccaagt aactctgctc ggggtgaacc 1200
attctccttt ctctcaagta aaccagtagt ttttcttctg ttgacttctg gttttctgta 1260
atttgtactt tcccacacta taattggctt ctgttttaca aaatggtggg tggctttttc 1320
ttttttgtac gtgtacagga ttctgctggt acgagaggcc ttcctctttc tgtttttaaa 1380
aaaagtttta ctgccatatt ggcattccat tccctgttgc catcctcact gttacctgtt 1440
ttgggtttct ggtctacttt gactttcaaa gtacctccag cctcctcata cgcacagctt 1500
ttggatgacc tcagcttgag tttctccata tgtgcatgta catctagcat tctgcctaca 1560
gttcagacag aagtcacaaa aaggccttca actcaccaaa ggtaaatatc tgtatctatt 1620
aggacatttt ttacatagac ttcagttgag atgtatactt agcaaaatta tttttaaatt 1680
gaaacagcac agtaaatact taatataaaa tgtcccttgg attttgcttc ccatgtaaat 1740
ctattgtatt attacacttg ttataatttt aactataaag gtccaattgt ttcacagagc 1800
cagtttggga tgggctgcat tccatttatg ctgtatatag tttgaattat atataaatta 1860
ccccttcttc tggccacccc tgctcccatc ttagtatttt gcaagatcta atcagttgta 1920
cacctggtgc ccctcgcttg cttcaatcat ggttatttga tggcaaaatc gacctcttgt 1980
cgctgaagga gagagaaaag atgtgtgtct gattggtcct gggatttttt gagctgtgcc 2040
atttatggta ctctttgcct atgcatcccc ttgttagatt ttttttaaat tttatcttac 2100
tgtttttata atttctattg ggaagaggct tgtgaccagt accaatcttg agtttctttt 2160
tctgtccaca agtaaattaa tatctgctct gaaatgtcat ttatctactc acacattctt 2220
ggggaaaaaa atcaaatgtc agtcctagca gatgttgcat gtaaattggt agcaagtaat 2280
gattacaacc cagaggatta agaattttgt aacagaaagc tctatgtttt aattttttat 2340
atacaattag gataattagc attgtcagac tataaacctt tgctttttaa agtttatttt 2400
tactatttct ttatcacttt attgtatcat caccattggt ttcataatgt aaatactata 2460
tgttgaacaa attaaatgtc aaaatttttt attaccatag tccatgttaa tagtggggct 2520
ttcaggtgtt tagagatttt ttttgttgtt gttaacattc attgcaaaag tactagatgg 2580
tgtataactc tagagttgaa ttttaaggga ttccctaata tgtatactat ctttttatct 2640
gaagtaataa ataaacaatg atcttgaaag tgcctgaa 2678
<210> 49
<211> 587
<212> PRT
<213> Artificial Sequence
<220>
<223> KLHL3
<400> 49
Met Glu Gly Glu Ser Val Lys Leu Ser Ser Gln Thr Leu Ile Gln Ala
1 5 10 15
Gly Asp Asp Glu Lys Asn Gln Arg Thr Ile Thr Val Asn Pro Ala His
20 25 30
Met Gly Lys Ala Phe Lys Val Met Asn Glu Leu Arg Ser Lys Gln Leu
35 40 45
Leu Cys Asp Val Met Ile Val Ala Glu Asp Val Glu Ile Glu Ala His
50 55 60
Arg Val Val Leu Ala Ala Cys Ser Pro Tyr Phe Cys Ala Met Phe Thr
65 70 75 80
Gly Asp Met Ser Glu Ser Lys Ala Lys Lys Ile Glu Ile Lys Asp Val
85 90 95
Asp Gly Gln Thr Leu Ser Lys Leu Ile Asp Tyr Ile Tyr Thr Ala Glu
100 105 110
Ile Glu Val Thr Glu Glu Asn Val Gln Val Leu Leu Pro Ala Ala Ser
115 120 125
Leu Leu Gln Leu Met Asp Val Arg Gln Asn Cys Cys Asp Phe Leu Gln
130 135 140
Ser Gln Leu His Pro Thr Asn Cys Leu Gly Ile Arg Ala Phe Ala Asp
145 150 155 160
Val His Thr Cys Thr Asp Leu Leu Gln Gln Ala Asn Ala Tyr Ala Glu
165 170 175
Gln His Phe Pro Glu Val Met Leu Gly Glu Glu Phe Leu Ser Leu Ser
180 185 190
Leu Asp Gln Val Cys Ser Leu Ile Ser Ser Asp Lys Leu Thr Val Ser
195 200 205
Ser Glu Glu Lys Val Phe Glu Ala Val Ile Ser Trp Ile Asn Tyr Glu
210 215 220
Lys Glu Thr Arg Leu Glu His Met Ala Lys Leu Met Glu His Val Arg
225 230 235 240
Leu Pro Leu Leu Pro Arg Asp Tyr Leu Val Gln Thr Val Glu Glu Glu
245 250 255
Ala Leu Ile Lys Asn Asn Asn Thr Cys Lys Asp Phe Leu Ile Glu Ala
260 265 270
Met Lys Tyr His Leu Leu Pro Leu Asp Gln Arg Leu Leu Ile Lys Asn
275 280 285
Pro Arg Thr Lys Pro Arg Thr Pro Val Ser Leu Pro Lys Val Met Ile
290 295 300
Val Val Gly Gly Gln Ala Pro Lys Ala Ile Arg Ser Val Glu Cys Tyr
305 310 315 320
Asp Phe Glu Glu Asp Arg Trp Asp Gln Ile Ala Glu Leu Pro Ser Arg
325 330 335
Arg Cys Arg Ala Gly Val Val Phe Met Ala Gly His Val Tyr Ala Val
340 345 350
Gly Gly Phe Asn Gly Ser Leu Arg Val Arg Thr Val Asp Val Tyr Asp
355 360 365
Gly Val Lys Asp Gln Trp Thr Ser Ile Ala Ser Met Gln Glu Arg Arg
370 375 380
Ser Thr Leu Gly Ala Ala Val Leu Asn Asp Leu Leu Tyr Ala Val Gly
385 390 395 400
Gly Phe Asp Gly Ser Thr Gly Leu Ala Ser Val Glu Ala Tyr Ser Tyr
405 410 415
Lys Thr Asn Glu Trp Phe Phe Val Ala Pro Met Asn Thr Arg Arg Ser
420 425 430
Ser Val Gly Val Gly Val Val Glu Gly Lys Leu Tyr Ala Val Gly Gly
435 440 445
Tyr Asp Gly Ala Ser Arg Gln Cys Leu Ser Thr Val Glu Gln Tyr Asn
450 455 460
Pro Ala Thr Asn Glu Trp Ile Tyr Val Ala Asp Met Ser Thr Arg Arg
465 470 475 480
Ser Gly Ala Gly Val Gly Val Leu Ser Gly Gln Leu Tyr Ala Thr Gly
485 490 495
Gly His Asp Gly Pro Leu Val Arg Lys Ser Val Glu Val Tyr Asp Pro
500 505 510
Gly Thr Asn Thr Trp Lys Gln Val Ala Asp Met Asn Met Cys Arg Arg
515 520 525
Asn Ala Gly Val Cys Ala Val Asn Gly Leu Leu Tyr Val Val Gly Gly
530 535 540
Asp Asp Gly Ser Cys Asn Leu Ala Ser Val Glu Tyr Tyr Asn Pro Val
545 550 555 560
Thr Asp Lys Trp Thr Leu Leu Pro Thr Asn Met Ser Thr Gly Arg Ser
565 570 575
Tyr Ala Gly Val Ala Val Ile His Lys Ser Leu
580 585
<210> 50
<211> 6805
<212> DNA
<213> Artificial Sequence
<220>
<223> KLHL3
<400> 50
attctttgca gcctagacag aggaggcagg agccccaggg gcgggctaat cgcctgggct 60
ggggatgcct gggcagatgc agaggaagct ggaaaggtgg cagtgcacct gggtcgctgg 120
agctgccgcc gttcctagga gaccaaggag cagcaagcct gcgggggagg gggagcaagt 180
gggttgctgc ttttagcagc tgaaagggct gcagggagct ctgggtaaga cattttctgt 240
tgctgctgct tttgcggtag aagctgctgc gagtaagtca gaggaaggag gattgagaag 300
ggaggaggct tcacttgcag ccatgctgaa tcactgttgc tgatcagatt tcccactggt 360
ctgctgggag aatcaagaac agaactagga tccccgagtg catacacagt tcagcagcta 420
ccaccctggc tgggcctcac acaatggagg gtgaaagtgt caagctgagc tcccagactc 480
tgatacaggc tggggatgat gagaagaacc agaggacgat cactgtcaac cctgcccaca 540
tggggaaagc attcaaggtt atgaatgaac tgcggagtaa acagctgttg tgtgacgtga 600
tgattgtggc agaagatgtc gagatagaag cccaccgtgt ggtcctggca gcctgcagcc 660
cctacttctg tgcgatgttc acaggtgaca tgtctgagag taaagccaaa aagatagaaa 720
tcaaggacgt ggatgggcag acgctgagta agctgattga ctacatctat actgctgaaa 780
tcgaggtgac tgaagagaat gtccaggtgc tgctcccggc agccagcttg ctgcagctca 840
tggatgttcg gcagaactgc tgtgacttcc tgcagtctca gttgcatccc accaattgcc 900
tgggcatccg tgcatttgca gatgtacaca cctgcactga ccttctgcag caggccaatg 960
cctacgcaga gcagcacttt ccagaggtga tgctaggaga agaatttctt agcctgagtc 1020
tggaccaggt gtgcagcttg atatccagcg acaagctgac cgtttcttca gaagagaagg 1080
tgtttgaagc tgtgatctca tggatcaatt atgagaaaga aacccgttta gagcacatgg 1140
caaagctgat ggaacatgtc cgacttcctc tcttacctag ggactaccta gtccaaacgg 1200
ttgaagaaga agctttgata aagaataaca acacctgtaa agacttcctc attgaggcca 1260
tgaaatacca tctcctccct ctggatcaga gactattgat taagaaccca aggaccaagc 1320
ccaggactcc agtcagcctt cccaaggtca tgattgtggt tggcggccag gcacccaagg 1380
caatccgcag tgtggagtgc tatgatttcg aggaggaccg gtgggatcag attgctgagc 1440
ttccttccag aagatgcaga gcaggtgtgg tgttcatggc tggccacgtg tatgccgtgg 1500
gagggtttaa tggctcactg cgggtgcgga cagtggatgt gtatgacggc gtgaaggacc 1560
agtggacgtc cattgccagc atgcaggagc gccggagcac actgggcgca gcggtgctca 1620
atgacttgct ctacgcagtg ggaggctttg atggcagtac tggcctagca tcggtggaag 1680
cctacagcta caagaccaac gagtggttct ttgtggcccc gatgaacacg cggcggagca 1740
gtgtgggtgt gggcgttgtg gaggggaagc tatatgctgt tgggggttat gatggagctt 1800
cccgccagtg tctgagcact gtggagcagt acaacccagc gaccaatgaa tggatatacg 1860
tggcggacat gagcacccgc cgcagtggcg caggggttgg agtgcttagc ggacagctgt 1920
acgccacagg tgggcatgat gggcctttgg tgaggaagag cgttgaggtt tacgatcctg 1980
gaacaaatac ctggaagcaa gtggcagaca tgaacatgtg ccggcgcaac gcaggggtct 2040
gtgcagtaaa tgggctcctg tatgtggttg gaggggatga tggatcctgc aacttggctt 2100
cggtggagta ctacaatcct gtcactgaca aatggacgct gcttccaacg aacatgagca 2160
cggggcggag ctatgcaggt gttgccgtga ttcacaagtc cttgtgaccc aaactcctac 2220
tgccaggagg tggaggaagg agcaggtgct gcctgtgact ctgaacagca ggaccttggt 2280
gactggattc aacttgcttg ggagggtctg tgctgctgtg agaaccgctc tcctctgact 2340
tggcagactg gtgttgttca tcgcagtgtg gacaccatta cccacccccg ttcccctgag 2400
gtgctctggc ctatgccctg agcaaggggg gtcttgacat ccccaggcag cacctttggg 2460
ctttgttttg gtgtttctac agggacaata cagaccctgg agtgtgtgtg tgtgtgtgtg 2520
tgtgtgtaga ccatggtgtt tctctatgtt tctctaagtt ggggggtgag cgtgtgtgac 2580
agtctactgg atttctttac tactgatcct ttcgctgtgt taaaaatcaa gtcacagaga 2640
cctctcttct ggatttgtcc catggggacc ctgagactac taaagctgct ttcttctgaa 2700
ggtccagttg gacagtctgg gaatgtccag aaataaccag tgagaggggc agttctctgg 2760
ccacacccac ttatgtactt taactactgt gactttgtct gcagaagagc tggaaaattc 2820
tcgaagctgc accgtgtcct ctgtgtgcta gaataaggga caaatgggtt ccctgtgctt 2880
ctcagctcac tgtttttcct tgagttctcc tacaggaagc agatgagaac tgcccagtct 2940
tcaggtttag gccattggtc tttgatgtca tagattccag gcctgggagg tgttatgtct 3000
cttcagctgg gaaaactagc tcttcagaga agcctcgggt aacactgaaa aacaaaacaa 3060
aacaaaacaa aaacaggaaa aaaacaaaaa accaaagtgg taaggattca gttcctgcct 3120
ataatggtct cagagagggt cctactttta ggttttccca ggacaggaca gtccccattt 3180
atacttatta tcccagttta attattcaca gcaccccatt ttactcagaa gtgttctggt 3240
ctggaggata aataagaggt caccctcctc cagacccaaa gatagatttg tgcctgtgtt 3300
ggatggggtc gtgtgtgatt cagatggaca ttggatggct tcaaaggaat ataccactag 3360
agctggccct tggcactttg tgacagtggt caagtctgtc taatgtcctt gtcttctttt 3420
tcttgtgctt tccccctatt ccagggtgtg caccctctcc ccaaccccca agaaccccac 3480
tactgctttc cctgtgaggt aggagatatc agtgggcctt ggatttgagg cttcctaaga 3540
tgtgcttgca ttttaaaaag ggagcttggt gagagctttg ctaattcaca ggtaaaaatt 3600
attaacaata gaacttcaag catcttgagg agcgggcatt tgagggggca tggagtaatt 3660
tgtatttaaa aaaccttaaa gttgtgctgt tcctaaacta gcaaattgct catgctgaaa 3720
tttctggcat aagcagggga agtcttgtgt ctggagaata gtctcatacc ttgcagtctg 3780
ggacaccctc cctactttga gaatccacct acaggaagcc aaggaacttt ataaatcctg 3840
atgttggact tctgatacga ctgggctact tccaagcagg tgctgcagga gattggcatc 3900
ccccagcccc tgcagttaga aaccccgaag tcttcccagc cagtgagcca ctttgtgtat 3960
ttactgtata tttattgtgc cctaaatgtg caactctcct aaagacaaaa cttctctttc 4020
tgatgttaag cacatgttac ttcaacaaga tgcttggaga acaacaaggt acccagaatt 4080
tttagaagcc ttcagaagag gctaaaatat ccagctttgg gggacctgga agaaatgtct 4140
ccaaaggaag caaggcatgt tttagttgag tgctctggtc tcactatgaa gtggggatga 4200
ctgtggcttc ataactctac ctggctgtgg gttggaagct gatggaatga gaaatgtcct 4260
ttctccttct ctgaggaaat tttgagactt gtttcggtgt gtctgtgtga tggggatgag 4320
gctggggttg ggatctgatg tatgccattc acagaagctc tcaatttcag atgataggtg 4380
aattccctgc ccctccccca ccactgagaa gctagacttt catgcgggag aggctacttt 4440
tatgtgtcgt cttccgggga agggtccctc cactgaaagc tagccagtca tgttttctgt 4500
ttttggattt ttgcaattgg tttcacctca tgtctccctc cctacaaagc actgcctcta 4560
ctgggcgtgc tgccaaggcc atgtgcactc catcctcatg tatccttttt cacggggacc 4620
agaacactgg tacgtcatca ccaaagccaa tctgctctag ctgcccacag atgccaccaa 4680
aacctgctat ctcttcatca ccaggtacga ttctctttcc acagtggaca cagcaggcta 4740
ttttctagtt tgtgctggtc acgtggtaga tgaagcctct tactgcccca cttagggtgg 4800
ccacggctgc ttgtgaatgc agctttgcca gtggcatatc tgtcatctga ttgcggtggt 4860
gaaatggaat tgaggcccaa ggttagaagc agccgagacg ccacttggat actgatttga 4920
acaatgtaga agtcagattc tgaattccaa agttatttct cataagtacc caatggcatc 4980
tctccatcta caaagttgca gtattatgca aataaaactg acctcatttt ctgctatgca 5040
ataagaatac ttaattctag ttcccgacaa gccagttgca atatccccta agatgctttt 5100
tgagctgtct tactttgata tctgttgtgt aatgtttgta tatttctgag ccagatcctt 5160
tcaaagattg cctttttata aaattgaagc tatagctttt aggctaaaat tttaacgtag 5220
atatttttat aagatatttt tttcaagagt ttgaatcgct ttttattgtc catgataatg 5280
aaatgttgtg ttctttgcat cattcactct caaacgtagt tcatgcctgt agctctcttc 5340
cttttgtttc tcacccttca gaaacatatt tttcagtagc tccaggtaga tgagcctttt 5400
tttttttttt aaaaatacca tattcaaggg agtctgctga attttaaaac gcagtcactg 5460
gtgtttcttg aattgctagg gactgatgtt atgttcgact cagcacttgc ccgtctgtat 5520
tgattgtgtc tttttttttt tttttggagt ctgctttctg tgggggtgag gccgggctgt 5580
ctcgtggtgg ctcccactga cgggcactga gcctggtacc ctgtggcatg gagaagcctc 5640
agggaaaggc ctgccccccc agcacatact cccatagtgt cctaggtcca gccgaccatt 5700
ccttattctc ttctatctcc ttgttgatct gaagcttcca atagcttgag gcctttgctg 5760
ctggatgatg ccctttttgg gagcatcttg tctctaacct ttaaaagagg ggtcaatcct 5820
catgatccct gtgtgttaag catatgcttt gcaggtgctc acactacact tacaacttgc 5880
ttcttgagct atgtctctac tccaggctct gttttgtgta tttatctgcc atttgcatca 5940
tggtttttaa aatttattat tattattatt attgttggga caggtgccat ttaaattgcc 6000
tccatgctcc ccatttgcac ctagctggat caagttggga ggctgagcaa actcatattc 6060
cagttagttg gagtttttaa aggctctgtt tgcctggaga agcaaggagg ttagaatgta 6120
atttttttaa gcgtttgcac tatttagagt cctaagcccc tcatgttcag ctgtgctgtg 6180
tttctactga ccaagcagga gagccagcag cacttccagc atttgggaat ggaagagatt 6240
tcttctgtag tggataatta cagcctcata gcccctgtgc agccttcgtc atgggactca 6300
gtgactcatg gatatagcat cagccatggc aggaatgcac aggactgtgg catttgcagc 6360
atcaaatcac cctagtgcca tgtttggtta tgagattgta aattattcgc tcccccatcc 6420
tcccctcccc tcattttcag tggcaataga ggacccttgt tgtacttctt gtttaatttg 6480
catattatgt gtaaaatgct ttcgttgaaa gaaaactgaa gacactgaat gtgtatgtct 6540
gtgtgggtgc tctgtccctg tggttgtcat agccagtcag acttgatcac tgacaccccg 6600
tacaacatat tgcataggta agatcctcga tctggtgttc tctgcgtggc tgttagggac 6660
tgtatatctt gtaaaagaac acttgtcaca tgcttgatca gttacagcaa tagctgaaga 6720
aacatttcct caaatgtatt attttaacag gaatcatgtt ctaatttccc atcctttaat 6780
tttaataaaa gctgaactgt gtgaa 6805
<210> 51
<211> 895
<212> PRT
<213> Artificial Sequence
<220>
<223> CTNNA3
<400> 51
Met Ser Ala Glu Thr Pro Ile Thr Leu Asn Ile Asp Pro Gln Asp Leu
1 5 10 15
Gln Val Gln Thr Phe Thr Val Glu Lys Leu Leu Glu Pro Leu Ile Ile
20 25 30
Gln Val Thr Thr Leu Val Asn Cys Pro Gln Asn Pro Ser Ser Arg Lys
35 40 45
Lys Gly Arg Ser Lys Arg Ala Ser Val Leu Leu Ala Ser Val Glu Glu
50 55 60
Ala Thr Trp Asn Leu Leu Asp Lys Gly Glu Lys Ile Ala Gln Glu Ala
65 70 75 80
Thr Val Leu Lys Asp Glu Leu Thr Ala Ser Leu Glu Glu Val Arg Lys
85 90 95
Glu Ser Glu Ala Leu Lys Val Ser Ala Glu Arg Phe Thr Asp Asp Pro
100 105 110
Cys Phe Leu Pro Lys Arg Glu Ala Val Val Gln Ala Ala Arg Ala Leu
115 120 125
Leu Ala Ala Val Thr Arg Leu Leu Ile Leu Ala Asp Met Ile Asp Val
130 135 140
Met Cys Leu Leu Gln His Val Ser Ala Phe Gln Arg Thr Phe Glu Ser
145 150 155 160
Leu Lys Asn Val Ala Asn Lys Ser Asp Leu Gln Lys Thr Tyr Gln Lys
165 170 175
Leu Gly Lys Glu Leu Glu Asn Leu Asp Tyr Leu Ala Phe Lys Arg Gln
180 185 190
Gln Asp Leu Lys Ser Pro Asn Gln Arg Asp Glu Ile Ala Gly Ala Arg
195 200 205
Ala Ser Leu Lys Glu Asn Ser Pro Leu Leu His Ser Ile Cys Ser Ala
210 215 220
Cys Leu Glu His Ser Asp Val Ala Ser Leu Lys Ala Ser Lys Asp Thr
225 230 235 240
Val Cys Glu Glu Ile Gln Asn Ala Leu Asn Val Ile Ser Asn Ala Ser
245 250 255
Gln Gly Ile Gln Asn Met Thr Thr Pro Pro Glu Pro Gln Ala Ala Thr
260 265 270
Leu Gly Ser Ala Leu Asp Glu Leu Glu Asn Leu Ile Val Leu Asn Pro
275 280 285
Leu Thr Val Thr Glu Glu Glu Ile Arg Pro Ser Leu Glu Lys Arg Leu
290 295 300
Glu Ala Ile Ile Ser Gly Ala Ala Leu Leu Ala Asp Ser Ser Cys Thr
305 310 315 320
Arg Asp Leu His Arg Glu Arg Ile Ile Ala Glu Cys Asn Ala Ile Arg
325 330 335
Gln Ala Leu Gln Asp Leu Leu Ser Glu Tyr Met Asn Asn Ala Gly Lys
340 345 350
Lys Glu Arg Ser Asn Thr Leu Asn Ile Ala Leu Asp Asn Met Cys Lys
355 360 365
Lys Thr Arg Asp Leu Arg Arg Gln Leu Arg Lys Ala Ile Ile Asp His
370 375 380
Val Ser Asp Ser Phe Leu Asp Thr Thr Val Pro Leu Leu Val Leu Ile
385 390 395 400
Glu Ala Ala Lys Asn Gly Arg Glu Lys Glu Ile Lys Glu Tyr Ala Ala
405 410 415
Ile Phe His Glu His Thr Ser Arg Leu Val Glu Val Ala Asn Leu Ala
420 425 430
Cys Ser Met Ser Thr Asn Glu Asp Gly Ile Lys Ile Val Lys Ile Ala
435 440 445
Ala Asn His Leu Glu Thr Leu Cys Pro Gln Ile Ile Asn Ala Ala Leu
450 455 460
Ala Leu Ala Ala Arg Pro Lys Ser Gln Ala Val Lys Asn Thr Met Glu
465 470 475 480
Met Tyr Lys Arg Thr Trp Glu Asn His Ile His Val Leu Thr Glu Ala
485 490 495
Val Asp Asp Ile Thr Ser Ile Asp Asp Phe Leu Ala Val Ser Glu Ser
500 505 510
His Ile Leu Glu Asp Val Asn Lys Cys Ile Ile Ala Leu Arg Asp Gln
515 520 525
Asp Ala Asp Asn Leu Asp Arg Ala Ala Gly Ala Ile Arg Gly Arg Ala
530 535 540
Ala Arg Val Ala His Ile Val Thr Gly Glu Met Asp Ser Tyr Glu Pro
545 550 555 560
Gly Ala Tyr Thr Glu Gly Val Met Arg Asn Val Asn Phe Leu Thr Ser
565 570 575
Thr Val Ile Pro Glu Phe Val Thr Gln Val Asn Val Ala Leu Glu Ala
580 585 590
Leu Ser Lys Ser Ser Leu Asn Val Leu Asp Asp Asn Gln Phe Val Asp
595 600 605
Ile Ser Lys Lys Ile Tyr Asp Thr Ile His Asp Ile Arg Cys Ser Val
610 615 620
Met Met Ile Arg Thr Pro Glu Glu Leu Glu Asp Val Ser Asp Leu Glu
625 630 635 640
Glu Glu His Glu Val Arg Ser His Thr Ser Ile Gln Thr Glu Gly Lys
645 650 655
Thr Asp Arg Ala Lys Met Thr Gln Leu Pro Glu Ala Glu Lys Glu Lys
660 665 670
Ile Ala Glu Gln Val Ala Asp Phe Lys Lys Val Lys Ser Lys Leu Asp
675 680 685
Ala Glu Ile Glu Ile Trp Asp Asp Thr Ser Asn Asp Ile Ile Val Leu
690 695 700
Ala Lys Asn Met Cys Met Ile Met Met Glu Met Thr Asp Phe Thr Arg
705 710 715 720
Gly Lys Gly Pro Leu Lys His Thr Thr Asp Val Ile Tyr Ala Ala Lys
725 730 735
Met Ile Ser Glu Ser Gly Ser Arg Met Asp Val Leu Ala Arg Gln Ile
740 745 750
Ala Asn Gln Cys Pro Asp Pro Ser Cys Lys Gln Asp Leu Leu Ala Tyr
755 760 765
Leu Glu Gln Ile Lys Phe Tyr Ser His Gln Leu Lys Ile Cys Ser Gln
770 775 780
Val Lys Ala Glu Ile Gln Asn Leu Gly Gly Glu Leu Ile Met Ser Ala
785 790 795 800
Leu Asp Ser Val Thr Ser Leu Ile Gln Ala Ala Lys Asn Leu Met Asn
805 810 815
Ala Val Val Gln Thr Val Lys Met Ser Tyr Ile Ala Ser Thr Lys Ile
820 825 830
Ile Arg Ile Gln Ser Pro Ala Gly Pro Arg His Pro Val Val Met Trp
835 840 845
Arg Met Lys Ala Pro Ala Lys Lys Pro Leu Ile Lys Arg Glu Lys Pro
850 855 860
Glu Glu Thr Cys Ala Ala Val Arg Arg Gly Ser Ala Lys Lys Lys Ile
865 870 875 880
His Pro Leu Gln Val Met Ser Glu Phe Arg Gly Arg Gln Ile Tyr
885 890 895
<210> 52
<211> 10696
<212> DNA
<213> Artificial Sequence
<220>
<223> CTNNA3
<400> 52
cccctttctt tcttatcctg ggtgaacaac gctcagcgaa attgactgcc ccactgtcat 60
ctgcctctca atttggtact ctgtaactct gtgaccacca agaagccttt ttccgtcccc 120
cacaaagctc tttttggaaa attccctacg ggagctgaat tttaagccca tttactttat 180
aggaagaaac agaaaggcag catgtcagct gaaacaccaa tcacattgaa tatcgatcct 240
caggatctgc aggtccaaac attcaccgtg gagaagctac tggagcctct cataatccag 300
gttaccacac ttgtaaactg tccccagaac ccttccagca ggaaaaaagg acgttcgaaa 360
agagccagtg tccttctagc ttctgtggag gaagcaactt ggaatttatt agacaaggga 420
gagaagattg cccaggaagc tacagtttta aaggatgagc ttacggcttc acttgaggaa 480
gttcgcaaag aaagtgaagc tctgaaagta tcagctgaga gatttacaga tgacccctgt 540
tttctcccaa aaagggaggc tgtggttcaa gctgcccgtg ccttgctggc tgcggtgacg 600
agactcctta tccttgcgga catgattgat gtcatgtgcc tcttgcaaca tgtgtcagct 660
tttcaaagga catttgagtc tctcaaaaat gttgccaaca aatctgacct ccagaaaacc 720
taccagaagc ttgggaagga gctggaaaat ttggattatt tagccttcaa acgtcagcag 780
gacttaaaat ctccaaatca gagagatgaa attgcaggag cccgagcttc actgaaggag 840
aactctcccc tcttgcattc aatttgttca gcttgtttgg agcattctga tgttgcttcc 900
ctcaaagcaa gcaaggacac agtttgtgaa gaaattcaga atgctctcaa tgtaatttca 960
aatgcttcac aagggatcca gaatatgaca accccaccag aacctcaggc agcaaccctg 1020
ggaagtgccc ttgatgagct ggagaattta attgtcctga atccactcac agtaactgag 1080
gaggaaatac gaccatcact agagaaacgc cttgaagcca ttatcagtgg ggctgctctg 1140
ctggcggatt cttcatgtac gagggactta caccgagagc ggattatcgc agaatgcaac 1200
gccattcgcc aggctcttca ggatctgctt tcagagtaca tgaacaacgc tggaaaaaaa 1260
gaaaggagta ataccctgaa tattgcttta gacaacatgt gtaagaagac aagagacctt 1320
cgcagacagc tccgcaaggc tattatagat catgtgtcag actctttcct ggatacgaca 1380
gtccctcttt tggttctcat tgaagctgct aagaatggcc gggaaaagga aataaaagaa 1440
tatgctgcga tatttcatga acacaccagc aggcttgtag aggtggcaaa tcttgcttgt 1500
tccatgtcaa caaatgaaga tggaattaaa attgtcaaaa ttgcagccaa tcatttggaa 1560
accttgtgtc cacagattat taatgctgca cttgctttgg ctgcaagacc caaaagtcaa 1620
gcggtcaaaa acaccatgga aatgtacaag cgtacatggg agaatcatat acatgtcctc 1680
actgaagccg tagatgacat tacaagcatt gatgacttcc ttgctgtatc tgaaagccat 1740
atcttggaag atgtcaacaa gtgtatcata gccttaagag accaggatgc tgataattta 1800
gaccgtgctg cgggtgctat cagaggccgg gcagcaagag ttgctcacat cgtcacgggt 1860
gaaatggaca gttacgagcc aggggcttac acggaaggtg taatgagaaa tgttaacttc 1920
cttacaagta ctgtaattcc tgaatttgta acacaagtga atgttgcctt ggaagcctta 1980
agcaaaagct cattgaatgt gttggatgat aatcaatttg tggacatctc aaagaagatc 2040
tatgatacaa ttcatgatat cagatgttca gtcatgatga ttcggacccc agaggaactg 2100
gaggatgttt ctgaccttga agaggaacac gaggtccgca gtcacaccag cattcagacc 2160
gaagggaaaa ctgatagggc taagatgact caactgcctg aggcagaaaa agaaaagatt 2220
gctgagcaag ttgctgattt caagaaagta aagagtaagc tggatgctga gattgagata 2280
tgggatgata caagcaacga catcattgtt ctggccaaga acatgtgtat gatcatgatg 2340
gagatgacag acttcactag gggcaaagga ccactaaagc atacaactga tgtgatctat 2400
gcagcgaaaa tgatatcaga atcaggatca aggatggatg tccttgctcg gcagattgct 2460
aatcagtgcc cagatccatc ttgtaaacag gacttgttgg cctacctgga acagattaag 2520
ttctactccc accaactgaa aatctgcagt caagttaaag ctgagatcca gaacctggga 2580
ggagagctca tcatgtcagc tttggacagt gtcacatccc tgatccaagc agccaaaaat 2640
ttaatgaatg ctgtagtgca aacagtgaaa atgtcttaca ttgcctcaac caagatcatc 2700
cgaatccaga gtcctgctgg gccccggcac ccagttgtga tgtggagaat gaaggctcct 2760
gcaaaaaaac ccttgattaa aagagagaag ccagaggaaa cgtgtgcagc tgtcagacga 2820
ggctcagcaa agaaaaaaat ccatccattg caagtcatga gtgaatttag aggaagacaa 2880
atctactgaa accactattc tacatatagt gcctatatga caaaatcctg cctaaccaca 2940
ctgctttatt ttacacttaa gaagttctgt aatttcacta agttttggtg tttaactcac 3000
aaataacata aaatattggg cgctaaatca acaaaagcaa tatatatttg ggatcatatc 3060
actgtcattt ctgtatggtc agcacctaat agttaaggaa tatttgcttg ttgaatgaat 3120
gaaattatca cgtgtcattc agcgtttccc atcatagaga ttatctacta ttcgttacca 3180
aataaacaca ggagaggcca gagagtcctg tttatctgta atacttcatg tacacttatc 3240
atccttatct tgaattaaaa cactaacatg agctcctaac ttggtttttt aatagaaaca 3300
aaagactttt ataaaatatt ttcccattta atctccatgc tttctttatc tgatctaacc 3360
tggcacctaa ccaggcagaa atgtatgatt cctgccatag caaaaaaacc accttttaat 3420
ctctagatag ctgtactcat tgtcaactta ttaggctaat atccattata aactaatcaa 3480
atttgaatag ttaggctact tgctggattt tgaaggtcaa ccttgtttat taataaaatg 3540
ctttcttaac attataaagg ttacaatgag ttctgatgcc acatactcac cttttgggtt 3600
tccaatgtgt tagagagttc tgtacttttg agtgttaggc ctatccgaat acatatgtga 3660
aagaagagtt atcacccatt gggaaaatga gacaacaaac tatatcccaa agtgagttat 3720
attaataata acagaagtgt aagttctgta tgaccctatt tttacaaaca taaaaataca 3780
tttttatcag cttgcaattg taattaaaag gaaaatggca gtttgaaaaa tcttttgacg 3840
ttgagtaaaa tataactgca tgaactgtac cattgaacta tgaagcagtt aatggcaatg 3900
aagctggagt gattttaata acgttctttt aaataaaagt cactggggtc attttacaac 3960
tccagtcact gtgttcattc ctagttgagt tcataatgga cttcataata gtcttagagt 4020
ctagtgtacc tctctctgtc attctctctc tctttttctc tcccctgatg acctgcctct 4080
ctctgtcttt caaaaatgtc cttaacagaa actctttgga ggccataagt tttgttttca 4140
ttttttctca ttcatagagt tctgattcac agactttaaa aaacattttc tattctacat 4200
tatcataata gtattatatt gtaccttttt acttctaaaa cattgccatt gatgagaggc 4260
gtctcagcaa cgatgtaacc ttagattgat gagaaaaaaa tttacacata caataccagt 4320
tgatattagg agaaccagtt tgggtaaaag gatgcatgaa acatagaaaa gcataaagat 4380
tgacagtacg atatgtgtgg aatcttaagc ataatggtat gggaaaagaa ggatctattt 4440
aataggtcat tattaattat gcaggaaatg ttaaatttct tgtaatgaat tatttcaatc 4500
ttccaagtag ctggattgcc tgagctcagt gtaaaacttc ctaggaggtg aaggcaataa 4560
actcctccct gttttgcacc atagactggc ctacaataac tattctgaac agaatgcact 4620
gcccagaaga aaaggcagtg aagaagggaa ttagggaatt cagatgactc tgtttttcag 4680
aattatctag atgattgagg caaatagaga ttttctttta ttgaaatgct gccatcattt 4740
tctgctctaa gtgtagatgt tgcacgagtg acatctgtgg tatatccttc ttccagatag 4800
aagtataaat ttataattta gtggggagag gaattttctt ttaatcacca aactaccagc 4860
tgtatagaaa atttttttga aatcaccaaa ctagcagttg tgtatcttag ctcctaaagc 4920
cacattcaca agaatcataa gcatctcatt tagaaaatct aatgtccacc ggacccaaca 4980
tcatctttcc acatgttgga gtttagaaca tgcagtattg gacaccaaaa gaacagaaat 5040
aacttgaaat tatacaatta ccaaataacc tccatctctt tgataaaaat gactttttac 5100
ccaaagtgag atgtgaacaa gttctcctct ttaacccagg cagattttca gcttagctca 5160
aagcaaacta taattaaata aaacaataaa aattttaact caatgaatat cctaggagtt 5220
aatctgctgt ggtaatcgat agtgacataa aaacaagatg catagagctg aaaaaggcat 5280
ctgtccacct gagtgaattc ctgtagtagg cagcatttca agatttagtt tgattcaaac 5340
ctgctgataa aacaaatgct tatatgcaag aaaaactagc agcaagatat gaataaatct 5400
ccccattatg ttaaaaacta atcagcatga cacatttata tttaggtatt tcttctgcag 5460
aggattgcca gaaatttctt aatgagagtg ataatttggt gatccaccta tgtacctttt 5520
gatgtctcat cacagagatt agtgcttcct ctgataatac atttgcttga tcctgctaat 5580
aagaactcac ctttcttttt aacgaaatca atttgtgtta atttacatag ggaaaaacgt 5640
tccttgagag ggaggagagg ggtgtttacc acccacagga gcagttccat ccctggttac 5700
ccaccagtat ccctgccaga tgttagactg gaattagaat gaaaatttaa aaaaaaacga 5760
aataataatc cactaaatgg ccttctaatt agagaaattt gaagatttct atcaaatata 5820
aaaagtgaga aatagaaatc attcccttta atttatgaca aattaaaatt attgaagtaa 5880
aatgtttcta agtcagtttc tggatagttc ttgaatgagt caagaaaaaa ctcaaacact 5940
aaatttctaa gctaatttgt tcattattcc tttctgttta ttatttatta tatgagtgca 6000
aattgcaaat gtattagaaa agacaatata gccttcacat tcacaagtgt gccttctggc 6060
ctttgaaatc ttttaacata agatcaatat aacatatttg ttccccataa aaatctctta 6120
tgtacaggga ataaggtcct ccatagctac acagagacac ccattcttgc agtctggcat 6180
tcttaacacc tttagcttca aaggcagtcg gtttcaattt caagctaaaa ttgtgaatat 6240
acaattaagt gcgcaacttg ggccttcaga ccaaataatt aatgctcctc aaattaaata 6300
atgagtggct gaaaagccta atgtgtaaca acagagtaag tgacttcaac ggcttcctga 6360
accttcattt tgagggtgtt ctcatctcac ccacctctgg gctaggtatc ctctgttaat 6420
tgctaacaat ttcctaccaa aatagtgcaa tcaggactgc ccatcacagc aaattactcc 6480
aattaatgcc ctttcctctt ctgtgaacca aaaatatctc tctcctgtct ctcatctcta 6540
tcccaggtat aaattgacta ataaatttta cagagtttta gtcaaaatta tgaattttaa 6600
tttccagcaa cattattcct ttagtctttc tattccccaa aacattctta aattttggta 6660
tcaaatgttc atttctgtcc tagagattgt ggttcccatg gctacagatt tttgtatagt 6720
ggtagtatga gtggaataag atctgagact tttcctggtg atatccacca ttctcttgtg 6780
aactcattgc agaataaaca gcctctttca ctttcattgt aagtagttct ttaggttggc 6840
aaagcaggca ccagaatctg aatttgaatg aaatctgtac tttcttttgt tggcctattt 6900
tttgccaaga cagcctgcct attgtatatt taggaatcaa atattcattt tttctgtttg 6960
taaagccaga aatatttcag ttaaaaagaa tataatttta tattatttta ataattcttt 7020
ttaagtaaag taggaaaatt tctgccacct gagatgttac tgtttttatt tatatgaaaa 7080
ttccattttt ttctttatga ctcttttccc ataccattat gatatgttca gcattagtat 7140
tttccatttt accactaaac aaataagcca ggactaataa caatagcagt ttgagagttt 7200
ttattttggc attttggttt taattaggaa agaaaacatg gatgctatta aaactagttg 7260
tagttaaaaa tgttttgaat gaggctacta acgctatctt agtcatctca aagagaaaga 7320
gaagaagtaa aattttaaat acctttttgg catttttaac caattcatat gaacaaaaac 7380
atatattcta ttaaagcaaa ataaagaata gtttaaaaca ctagtcttct tgcatgtgaa 7440
tattttcccc ctaacagaca agatagaatc catttctgga cactctaaaa gaaagtaatg 7500
ttaattgaga gagctgtctc caaatggaac tcaacctgac cagaatccca tccgcaggca 7560
agagatgggc tgtggggccg accataagtt ttattctgag acgagatgct ttatggctgc 7620
tggctgcatt atggcatcta gcagtcccca ctggtttgtg ggaacaaaac acatggattg 7680
atattcaata tgagtttggt aagttggaga aaaatggaga cgatacggag tagtctaaag 7740
gcacaacaaa taagaaggtg ccagaggtag ggaagggtct gagaggggta gttgcagaag 7800
catttagctt tcatagatga ggggagagtt ttttggtttt atgaagatta ggcagccctc 7860
ctgaggaaag tattaaggat agaatgagac acttgtcttg gtggtttggt gaagtcgtgt 7920
cgaaggcaaa aggatgaact acacatattt tgagcatgat ttagtaatat tgtttgcaag 7980
ttgagaaaca gaaattcagc cattctcctg actgatggta ggtgaaattc ttcccaatat 8040
tagagccaat ctgggttacc atacgttcat atatttaaca gctatttgtt aaacaactgt 8100
gatgttccag gcactgttct aggttcttag gatacaccat ataacaaaac aaatagtgat 8160
ccttgccctc atgtgccagc tctgtgttaa atgctaacta ttatcttatt caatcctcca 8220
aataacccta taggatatat atcattccac agacaaggaa gctgagggtt aaatgggata 8280
agaatcttgc tcaaagtctc actggcttta aaatattaat atatatttta catcattcaa 8340
gacagatata taattagctg aatattctca cctctattaa ctgaatgtag agtgtcaggt 8400
acatgggaag taagttattt acgtgatctg aaataagagg acaacctttg ccaaattact 8460
tttattacca tggagaagaa agcttcaatg aaatgcctgt ttcagcccat cttcagtatt 8520
ggttggttaa taccatttgt taacaccaat cctttcctct tttactgcta acatgtgact 8580
gtgtttaaaa ttatagattg cagcagagtg ttggccaagg attattgtag taggataaag 8640
tttccctgta tttgaaattt acccaaactg tagcaaatac atctttcctt ctttatggag 8700
gtcacacgtg tgcatagtat gtgcctgaat acaggtaaac tttgtgtttt aaaattacat 8760
ggcctttttt agagtcgact gaaggctaga ctcttcttgc ccatgttgct ggataggctt 8820
tcaaatctca ggccttggag agttaaatga ctttctgata tttcttacgg tggagccaca 8880
taagaaatac acacattcct agtttgagca atggatgtgt atattgggga tcttcacttt 8940
tatgtatctg gtagtcatga tggtgcctat tcccaactgt aggaaaggaa agccctcaag 9000
gagaaaattc tctttcaaaa aggctcagat ttctaacaat tatgtctaaa tgtttcttca 9060
atttagaaca tgtacaaaag ctacttaata ttagcagagt agttcttgtg ttttcatttc 9120
aaatgaaata ttcccagttc caaagtttat ttccttggca ttttcaagaa ggatggttat 9180
ttttagttct ggttatttgg tgttggaggt ctgtgacact tccttataca tttaaagccc 9240
cctttgaata gttgtatcaa tagttcaaaa gcctgtagat atcttgcttt tctattctga 9300
cagcccagct tctgcagact caggccaacc ttaggtatgt cacatcagtt ccccatctgt 9360
gtaaattacc tgaacctctc tgtggtattc actgagagtt gagcttctac taagggattt 9420
ccttgacctg ggcttttaac accagatttg tagtggagtg tttattagta aagcaaagga 9480
gtaagtcaga gatgctagcc cctgctctca gttaaacaga actatcctcc ttcctgaata 9540
tttaaacata aaattaattt taaactacta taaaaatcat gttccattgc ttgccaaatc 9600
ttataccatt ataatgctgt atttttacat gttagtatac attgtgtcca tagggatata 9660
aatatgagaa tttcagcaac tgtcttgcag ggttaagatg tgctatttgt tgattttgca 9720
aatttttctc atgtggtaca gtccctccaa agaaaatttc cttagaccta aatggtagca 9780
taataatttc tcctccatat ccgtgactaa caagttcctg aaaacagaag tagacatatt 9840
ttatgcccaa tttagtacct aataattgag gcttaaaaca aaattttctt gggcaagttg 9900
ggtgattaaa tatgaattaa aacttctaga gaaagatttc cagatgcccc tgttttccat 9960
tttgattatg gttttagttt catagcaact tgggggaaaa aaatcattta ctttatgtcg 10020
tattacaaca acaactttga cagcttaaat gtagtgacct ctctgaaggt aatatgcaca 10080
ttataattta ggatgaaagc agatttaaac taagactttc tgagatgaaa taaaagtaac 10140
caattcataa tagtcatgtt tattgcagca atatcttggg aacttaagtt tagtggcgtt 10200
tgaggtgttt ctgactatct tgcttgttca tttcaccctg ctctgcagag acagaatatg 10260
attgaatgaa ctgcccatta tattaatgac cagttcaaag cagttttgtg ctttggctat 10320
agataaaatt tggcaaaaat actgtaggcc tttctggtcc atatttatcc ttaaagtatt 10380
tttgccaatc aataaaaatg aaatatttat ccagaaatag aaaaagctct ttcctatgac 10440
attgacatag aagcccaaat actccttgtt tgagaccctg catgtctttt gtgtttaact 10500
gaatcaatga ttttttttag ttgtgctctt aattgcatta tttcacatat gtaacactgg 10560
gtttgttttg gttatatagt attattttac tttattctaa tttcaactca tgtcactctg 10620
tagctcatta agaatttgtt caactgaaaa ttaaaatgtg tgttaaagca ataaaaatga 10680
aaagattggc tatgca 10696
<210> 53
<211> 825
<212> PRT
<213> Artificial Sequence
<220>
<223> FSCB
<400> 53
Met Val Gly Lys Ser Gln Gln Thr Asp Val Ile Glu Lys Lys Lys His
1 5 10 15
Met Ala Ile Pro Lys Ser Ser Ser Pro Lys Ala Thr His Arg Ile Gly
20 25 30
Asn Thr Ser Gly Ser Lys Gly Ser Tyr Ser Ala Lys Ala Tyr Glu Ser
35 40 45
Ile Arg Val Ser Ser Glu Leu Gln Gln Thr Trp Thr Lys Arg Lys His
50 55 60
Gly Gln Glu Met Thr Ser Lys Ser Leu Gln Thr Asp Thr Ile Val Glu
65 70 75 80
Glu Lys Lys Glu Val Lys Leu Val Glu Glu Thr Val Val Pro Glu Glu
85 90 95
Lys Ser Ala Asp Val Arg Glu Ala Ala Ile Glu Leu Pro Glu Ser Val
100 105 110
Gln Asp Val Glu Ile Pro Pro Asn Ile Pro Ser Val Gln Leu Lys Met
115 120 125
Asp Arg Ser Gln Gln Thr Ser Arg Thr Gly Tyr Trp Thr Met Met Asn
130 135 140
Ile Pro Pro Val Glu Lys Val Asp Lys Glu Gln Gln Thr Tyr Phe Ser
145 150 155 160
Glu Ser Glu Ile Val Val Ile Ser Arg Pro Asp Ser Ser Ser Thr Lys
165 170 175
Ser Lys Glu Asp Ala Leu Lys His Lys Ser Ser Gly Lys Ile Phe Ala
180 185 190
Ser Glu Gln Pro Glu Phe Gln Pro Ala Thr Asn Ser Asn Glu Glu Ile
195 200 205
Gly Gln Lys Asn Ile Ser Arg Thr Ser Phe Thr Gln Glu Thr Lys Lys
210 215 220
Gly Pro Pro Val Leu Leu Glu Asp Glu Leu Arg Glu Glu Val Thr Val
225 230 235 240
Pro Val Val Gln Glu Gly Ser Ala Val Lys Lys Val Ala Ser Ala Glu
245 250 255
Ile Glu Pro Pro Ser Thr Glu Lys Phe Pro Ala Lys Ile Gln Pro Pro
260 265 270
Leu Val Glu Glu Ala Thr Ala Lys Ala Glu Pro Arg Pro Ala Glu Glu
275 280 285
Thr His Val Gln Val Gln Pro Ser Thr Glu Glu Thr Pro Asp Ala Glu
290 295 300
Ala Ala Thr Ala Val Ala Glu Asn Ser Val Lys Val Gln Pro Pro Pro
305 310 315 320
Ala Glu Glu Ala Pro Leu Val Glu Phe Pro Ala Glu Ile Gln Pro Pro
325 330 335
Ser Ala Glu Glu Ser Pro Ser Val Glu Leu Leu Ala Glu Ile Leu Pro
340 345 350
Pro Ser Ala Glu Glu Ser Leu Ser Glu Glu Pro Pro Ala Glu Ile Leu
355 360 365
Pro Pro Pro Ala Glu Lys Ser Pro Ser Val Glu Pro Leu Gly Glu Ile
370 375 380
Arg Ser Pro Ser Ala Gln Lys Ala Pro Ile Glu Val Gln Pro Leu Pro
385 390 395 400
Ala Glu Gly Ala Leu Glu Glu Ala Ser Ala Lys Val Glu Pro Pro Thr
405 410 415
Val Glu Glu Thr Leu Ala Asp Val Gln Pro Leu Leu Pro Glu Glu Ala
420 425 430
Pro Arg Glu Glu Ala Arg Glu Leu Gln Leu Ser Thr Ala Met Glu Thr
435 440 445
Pro Ala Glu Glu Ala Pro Thr Glu Phe Gln Ser Pro Leu Pro Lys Glu
450 455 460
Thr Thr Ala Glu Glu Ala Ser Ala Glu Ile Gln Leu Leu Ala Ala Thr
465 470 475 480
Glu Pro Pro Ala Asp Glu Thr Pro Ala Glu Ala Arg Ser Pro Leu Ser
485 490 495
Glu Glu Thr Ser Ala Glu Glu Ala His Ala Glu Val Gln Ser Pro Leu
500 505 510
Ala Glu Glu Thr Thr Ala Glu Glu Ala Ser Ala Glu Ile Gln Leu Leu
515 520 525
Ala Ala Ile Glu Ala Pro Ala Asp Glu Thr Pro Ala Glu Ala Gln Ser
530 535 540
Pro Leu Ser Glu Glu Thr Ser Ala Glu Glu Ala Pro Ala Glu Val Gln
545 550 555 560
Ser Pro Ser Ala Lys Gly Val Ser Ile Glu Glu Ala Pro Leu Glu Leu
565 570 575
Gln Pro Pro Ser Gly Glu Glu Thr Thr Ala Glu Glu Ala Ser Ala Ala
580 585 590
Ile Gln Leu Leu Ala Ala Thr Glu Ala Ser Ala Glu Glu Ala Pro Ala
595 600 605
Glu Val Gln Pro Pro Pro Ala Glu Glu Ala Pro Ala Glu Val Gln Pro
610 615 620
Pro Pro Ala Glu Glu Ala Pro Ala Glu Val Gln Pro Pro Pro Ala Glu
625 630 635 640
Glu Thr Pro Ala Glu Val Gln Pro Pro Pro Ala Glu Glu Ala Pro Ala
645 650 655
Glu Val Gln Pro Pro Pro Ala Glu Glu Ala Pro Ala Glu Val Gln Pro
660 665 670
Pro Pro Ala Glu Glu Ala Pro Ala Glu Val Gln Ser Leu Pro Ala Glu
675 680 685
Glu Thr Pro Ile Glu Glu Thr Leu Ala Ala Val His Ser Pro Pro Ala
690 695 700
Asp Asp Val Pro Ala Glu Glu Ala Ser Val Asp Lys His Ser Pro Pro
705 710 715 720
Ala Asp Leu Leu Leu Thr Glu Glu Phe Pro Ile Gly Glu Ala Ser Ala
725 730 735
Glu Val Ser Pro Pro Pro Ser Glu Gln Thr Pro Glu Asp Glu Ala Leu
740 745 750
Val Glu Asn Val Ser Thr Glu Phe Gln Ser Pro Gln Val Ala Gly Ile
755 760 765
Pro Ala Val Lys Leu Gly Ser Val Val Leu Glu Gly Glu Ala Lys Phe
770 775 780
Glu Glu Val Ser Lys Ile Asn Ser Val Leu Lys Asp Leu Ser Asn Thr
785 790 795 800
Asn Asp Gly Gln Ala Pro Thr Leu Glu Ile Glu Ser Val Phe His Ile
805 810 815
Glu Leu Lys Gln Arg Pro Pro Glu Leu
820 825
<210> 54
<211> 3006
<212> DNA
<213> Artificial Sequence
<220>
<223> FSCB
<400> 54
cgcttactga ataaggtggg atccaacaag agtgtagtat ggaatgcaat agcttatgaa 60
ttactttttt tctgagggag ctcaacagaa tgacacctaa gaaagggaaa gtctttgaca 120
cttggtacgt ttgtgatttt tggtcattac ttgaaaatta ataagtttga aatcactact 180
cttagaaatg gaagaaagtg atgactctaa tcagcctatc tcagcgtgta ggcaagaaat 240
tcgaaagaga agatgaccca gcaaaccaat ggtaggcaaa tcccagcaaa ctgatgtaat 300
agagaaaaag aaacacatgg ccataccaaa atcatctagc cccaaagcta cccatcgtat 360
tggtaatact tctggaagca aaggcagcta ctctgccaaa gcctatgagt ctattagagt 420
atcttctgag cttcagcaaa cttggacaaa gagaaagcat ggacaggaaa tgactagtaa 480
gtctctccag acagacacca ttgtagaaga gaaaaaagaa gtcaagttag ttgaggaaac 540
cgtggtacct gaagaaaagt cagctgatgt tagagaagct gctattgaat tgccagagag 600
tgttcaggat gtagaaattc caccaaacat accttcagtt caactaaaaa tggacagatc 660
tcagcagacc agccgtacag gatactggac catgatgaac atcccccctg tagaaaaagt 720
ggacaaggaa caacagacat actttagtga atcagaaata gtggttattt ccaggccaga 780
tagttcttct acaaagtcaa aggaagatgc cctgaaacat aaatcgtcgg gaaagatttt 840
tgctagtgaa caacctgaat ttcaaccagc aacaaacagc aatgaagaaa ttgggcagaa 900
aaatatcagc agaacttcat ttactcagga gactaaaaaa ggtcccccgg tacttttaga 960
agatgagctt agggaagaag taactgtacc tgttgtacaa gaaggttctg ctgttaaaaa 1020
agtggcttct gctgaaatag agcctccatc aacagaaaaa ttcccagcta aaatacagcc 1080
tccattagtt gaagaggcca ctgctaaagc ggagcccaga cctgctgaag agacccatgt 1140
ccaagtacag ccatcaactg aagagactcc tgatgctgag gcagccactg cagttgcgga 1200
gaattctgtt aaagttcagc ctccacctgc tgaagaggcc cctttagtgg agtttcctgc 1260
tgaaattcag cctccatcag ctgaagagtc tccttctgta gagcttctgg ctgaaattct 1320
gcctccatca gctgaagagt ccctttcaga agagcctcct gctgaaattc tgcctccacc 1380
agctgaaaag tctccttcag tagagcctct tggtgaaatt cggtctccct cagcacaaaa 1440
ggctcccatt gaagtacagc ctttaccagc tgagggcgcc cttgaagagg cctcagctaa 1500
agtagagcct cccactgttg aagagaccct tgctgatgtt cagcctctat tacctgaaga 1560
ggctcctaga gaagaggctc gagaacttca gctttcaaca gctatggaga cccctgcaga 1620
agaggctcct actgaatttc agtctccatt acctaaagag accactgcag aagaggcctc 1680
tgctgaaatt cagcttctag cagctacgga gcctcctgca gatgaaactc ctgccgaagc 1740
tcggtctcca ctatctgagg agacttctgc agaagaggct catgctgaag ttcaatctcc 1800
attagctgaa gagaccactg cagaagaggc ctctgctgaa attcagcttc tagcagctat 1860
agaggctcct gcagatgaaa ctcctgctga agctcagtct ccactatctg aggagacttc 1920
tgcagaagag gctcctgctg aagttcagtc tccatcagct aagggagttt ctatagaaga 1980
ggcccctctt gagcttcagc ctccatcagg tgaagagacc actgcagaag aggcctctgc 2040
tgcaattcag cttctagcag ctacagaggc ttctgcagaa gaggctcctg ctgaagttca 2100
gcctccacca gctgaggagg cccccgctga agttcagcct ccaccagctg aggaggcccc 2160
cgctgaagtt cagcctccac cagctgagga gacccccgct gaagttcagc ctccaccagc 2220
tgaggaggcc cccgctgaag ttcagcctcc accagctgag gaggcccccg ctgaagttca 2280
gcctccacca gctgaggagg cccctgctga agttcagtct ctaccagctg aggagactcc 2340
tatagaagag acccttgctg cagtacactc tcccccagct gatgatgtcc ctgcagaaga 2400
ggcctccgtt gacaaacatt ccccaccagc tgatttgctt ctgactgagg agtttcctat 2460
aggagaggcc tctgctgaag tttcacctcc accatctgaa caaacccctg aagatgaggc 2520
tctggtagag aatgtgtcta cagaatttca gtcaccgcag gtggcaggaa ttccagcagt 2580
aaaattagga tcggttgttt tggaaggtga agcaaaattt gaagaggttt caaaaatcaa 2640
ttctgtcctt aaagatttgt ctaataccaa tgatggacag gctcccactc ttgaaataga 2700
aagtgttttt catatagaat taaaacaacg tcctcctgaa ctgtagtcag gttgtaccta 2760
agctagcaat cagaagctac atggttttgg aagaacatac tttagaaaag ggtgggcagc 2820
gggaagtagc tttgtcaata aggcaaatta aaggggaccc caagacttgg aatacaggtt 2880
ggaaaatgaa caataaaaac tgtagcagca taaaaaaaaa aaaaaaaaaa aaaaaaaaaa 2940
aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa 3000
aaaaaa 3006
<210> 55
<211> 483
<212> PRT
<213> Artificial Sequence
<220>
<223> DUOXA1
<400> 55
Met Ala Thr Leu Gly His Thr Phe Pro Phe Tyr Ala Gly Pro Lys Pro
1 5 10 15
Thr Phe Pro Met Asp Thr Thr Leu Ala Ser Ile Ile Met Ile Phe Leu
20 25 30
Thr Ala Leu Ala Thr Phe Ile Val Ile Leu Pro Gly Ile Arg Gly Lys
35 40 45
Thr Arg Leu Phe Trp Leu Leu Arg Val Val Thr Ser Leu Phe Ile Gly
50 55 60
Ala Ala Ile Leu Ala Val Asn Phe Ser Ser Glu Trp Ser Val Gly Gln
65 70 75 80
Val Ser Thr Asn Thr Ser Tyr Lys Ala Phe Ser Ser Glu Trp Ile Ser
85 90 95
Ala Asp Ile Gly Leu Gln Val Gly Leu Gly Gly Val Asn Ile Thr Leu
100 105 110
Thr Gly Thr Pro Val Gln Gln Leu Asn Glu Thr Ile Asn Tyr Asn Glu
115 120 125
Glu Phe Thr Trp Arg Leu Gly Glu Asn Tyr Ala Glu Glu Tyr Ala Lys
130 135 140
Ala Leu Glu Lys Gly Leu Pro Asp Pro Val Leu Tyr Leu Ala Glu Lys
145 150 155 160
Phe Thr Pro Arg Ser Pro Cys Gly Leu Tyr Arg Gln Tyr Arg Leu Ala
165 170 175
Gly His Tyr Thr Ser Ala Met Leu Trp Val Ala Phe Leu Cys Trp Leu
180 185 190
Leu Ala Asn Val Met Leu Ser Met Pro Val Leu Val Tyr Gly Gly Tyr
195 200 205
Met Leu Leu Ala Thr Gly Ile Phe Gln Leu Leu Ala Leu Leu Phe Phe
210 215 220
Ser Met Ala Thr Ser Leu Thr Ser Pro Cys Pro Leu His Leu Gly Ala
225 230 235 240
Ser Val Leu His Thr His His Gly Pro Ala Phe Trp Ile Thr Leu Thr
245 250 255
Thr Gly Leu Leu Cys Val Leu Leu Gly Leu Ala Met Ala Val Ala His
260 265 270
Arg Met Gln Pro His Arg Leu Lys Ala Phe Phe Asn Gln Ser Val Asp
275 280 285
Glu Asp Pro Met Leu Glu Trp Ser Pro Glu Glu Gly Gly Leu Leu Ser
290 295 300
Pro Arg Tyr Arg Ser Met Ala Asp Ser Pro Lys Ser Gln Asp Ile Pro
305 310 315 320
Leu Ser Glu Ala Ser Ser Thr Lys Ala Tyr Tyr Arg Pro Arg Arg Leu
325 330 335
Ser Leu Val Pro Ala Asp Val Arg Gly Leu Ala Pro Ala Ala Leu Ser
340 345 350
Ala Leu Pro Gly Ala Leu Leu Ala Gln Ala Trp Arg Ala Leu Leu Pro
355 360 365
Gly Leu Arg Cys Pro Lys Ala Gly Lys Glu Ser Arg Leu Gly Pro Pro
370 375 380
His Ser Pro Trp Arg Phe Gly Pro Glu Gly Cys Glu Glu Arg Trp Ala
385 390 395 400
Glu His Thr Gly Asp Ser Pro Arg Pro Leu Arg Gly Arg Gly Thr Gly
405 410 415
Arg Leu Trp Arg Trp Gly Ser Lys Glu Arg Arg Ala Cys Gly Val Arg
420 425 430
Ala Met Leu Pro Arg Leu Val Ser Asn Ser Gly Leu Lys Arg Pro Ser
435 440 445
Cys Leu Asp Leu Pro Lys Cys Trp Asp Tyr Arg Arg Asp Ala Arg Ala
450 455 460
Phe Phe His Leu Leu Glu Pro Thr Pro Cys Val Thr Ser Arg His Thr
465 470 475 480
Pro Leu Ile
<210> 56
<211> 1923
<212> DNA
<213> Artificial Sequence
<220>
<223> DUOXA1
<400> 56
cgcgcgaggt gagacggcga gggctcccgg ggcgcaggta gagatgttcc gtcggtgccg 60
aaggcccggc tagtgcggtt gtgtggacgg cgaaaaaaac caggccaggc gtcaaagcaa 120
gtcacccatc cgacctaaac ccctctaaga ccctggagtc acgtctccct ccggggatcc 180
tcgtgaggtc ttgggggtcc caccgccctg cgagccgcgc ccgcgccccg ccagacccgg 240
aactgcgtcg ctagaacgtc gggacctggt tccctcttta atcacacccc cgggggcgct 300
tcgtgtgagg ttccacgggg gaaggcgaga ggcgcaggct gccacacacg cactgtttgg 360
aagaggggaa gcattgcccc ccctgcacca cctcaccaag atggctactt tgggacacac 420
attccccttc tatgctggcc ccaagccaac cttcccgatg gacaccactt tggccagcat 480
catcatgatc tttctgactg cactggccac gttcatcgtc atcctgcctg gcattcgggg 540
aaagacgagg ctgttctggc tgcttcgggt ggtgaccagc ttattcatcg gggctgcaat 600
cctggctgtg aatttcagtt ctgagtggtc tgtgggccag gtcagcacca acacatcata 660
caaggccttc agttctgagt ggatcagcgc tgatattggg ctgcaggtcg ggctgggtgg 720
agtcaacatc acactcacag ggacccccgt gcagcagctg aatgagacca tcaattacaa 780
cgaggagttc acctggcgcc tgggtgagaa ctatgctgag gagtatgcaa aggctctgga 840
gaaggggctg ccagaccctg tgttgtacct agctgagaag ttcactccaa gaagcccatg 900
tggcctatac cgccagtacc gcctggcggg acactacacc tcagccatgc tatgggtggc 960
attcctctgc tggctgctgg ccaatgtgat gctctccatg cctgtgctgg tatatggtgg 1020
ctacatgcta ttggccacgg gcatcttcca gctgttggct ctgctcttct tctccatggc 1080
cacatcactc acctcaccct gtcccctgca cctgggcgct tctgtgctgc atactcacca 1140
tgggcctgcc ttctggatca cattgaccac aggactgctg tgtgtgctgc tgggcctggc 1200
tatggcggtg gcccacagga tgcagcctca caggctgaag gctttcttca accagagtgt 1260
ggatgaagac cccatgctgg agtggagtcc tgaggaaggt ggactcctga gcccccgcta 1320
ccggtccatg gctgacagtc ccaagtccca ggacattccc ctgtcagagg cttcctccac 1380
caaggcatac tatcgcccca ggagactttc cctggtgcct gcggatgtcc gaggcctcgc 1440
gccagcagcg ctcagtgccc ttcctggagc tctcctggcc caggcctggc gggcactgct 1500
tcccggcctg cgatgtccca aggcggggaa ggagtccaga ttgggtcccc ctcacagtcc 1560
ttggcgcttt ggtccagaag ggtgcgaaga gcgctgggcc gaacatactg gagactcacc 1620
acggcccctc cgaggaagag gcacaggacg cctgtggcgg tggggatcga aagaaaggag 1680
ggcatgtgga gtcagggcta tgttgcccag gctggtctcg aactctggcc tcaaacgacc 1740
ttcctgcctc gacctcccaa agtgctggga ttacaggcgt gatgcccggg ccttcttcca 1800
tcttttggag cctacccctt gtgttacctc ccgccacaca cctctaatct gaattacatg 1860
aaacacggca agacaccaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa 1920
aaa 1923
<210> 57
<211> 764
<212> PRT
<213> Artificial Sequence
<220>
<223> DLG4
<400> 57
Met Ser Gln Arg Pro Arg Ala Pro Arg Ser Ala Leu Trp Leu Leu Ala
1 5 10 15
Pro Pro Leu Leu Arg Trp Ala Pro Pro Leu Leu Thr Val Leu His Ser
20 25 30
Asp Leu Phe Gln Ala Leu Leu Asp Ile Leu Asp Tyr Tyr Glu Ala Ser
35 40 45
Leu Ser Glu Ser Gln Lys Tyr Arg Tyr Gln Asp Glu Asp Thr Pro Pro
50 55 60
Leu Glu His Ser Pro Ala His Leu Pro Asn Gln Ala Asn Ser Pro Pro
65 70 75 80
Val Ile Val Asn Thr Asp Thr Leu Glu Ala Pro Gly Tyr Val Asn Gly
85 90 95
Thr Glu Gly Glu Met Glu Tyr Glu Glu Ile Thr Leu Glu Arg Gly Asn
100 105 110
Ser Gly Leu Gly Phe Ser Ile Ala Gly Gly Thr Asp Asn Pro His Ile
115 120 125
Gly Asp Asp Pro Ser Ile Phe Ile Thr Lys Ile Ile Pro Gly Gly Ala
130 135 140
Ala Ala Gln Asp Gly Arg Leu Arg Val Asn Asp Ser Ile Leu Phe Val
145 150 155 160
Asn Glu Val Asp Val Arg Glu Val Thr His Ser Ala Ala Val Glu Ala
165 170 175
Leu Lys Glu Ala Gly Ser Ile Val Arg Leu Tyr Val Met Arg Arg Lys
180 185 190
Pro Pro Ala Glu Lys Val Met Glu Ile Lys Leu Ile Lys Gly Pro Lys
195 200 205
Gly Leu Gly Phe Ser Ile Ala Gly Gly Val Gly Asn Gln His Ile Pro
210 215 220
Gly Asp Asn Ser Ile Tyr Val Thr Lys Ile Ile Glu Gly Gly Ala Ala
225 230 235 240
His Lys Asp Gly Arg Leu Gln Ile Gly Asp Lys Ile Leu Ala Val Asn
245 250 255
Ser Val Gly Leu Glu Asp Val Met His Glu Asp Ala Val Ala Ala Leu
260 265 270
Lys Asn Thr Tyr Asp Val Val Tyr Leu Lys Val Ala Lys Pro Ser Asn
275 280 285
Ala Tyr Leu Ser Asp Ser Tyr Ala Pro Pro Asp Ile Thr Thr Ser Tyr
290 295 300
Ser Gln His Leu Asp Asn Glu Ile Ser His Ser Ser Tyr Leu Gly Thr
305 310 315 320
Asp Tyr Pro Thr Ala Met Thr Pro Thr Ser Pro Arg Arg Tyr Ser Pro
325 330 335
Val Ala Lys Asp Leu Leu Gly Glu Glu Asp Ile Pro Arg Glu Pro Arg
340 345 350
Arg Ile Val Ile His Arg Gly Ser Thr Gly Leu Gly Phe Asn Ile Val
355 360 365
Gly Gly Glu Asp Gly Glu Gly Ile Phe Ile Ser Phe Ile Leu Ala Gly
370 375 380
Gly Pro Ala Asp Leu Ser Gly Glu Leu Arg Lys Gly Asp Gln Ile Leu
385 390 395 400
Ser Val Asn Gly Val Asp Leu Arg Asn Ala Ser His Glu Gln Ala Ala
405 410 415
Ile Ala Leu Lys Asn Ala Gly Gln Thr Val Thr Ile Ile Ala Gln Tyr
420 425 430
Lys Pro Glu Glu Tyr Ser Arg Phe Glu Ala Lys Ile His Asp Leu Arg
435 440 445
Glu Gln Leu Met Asn Ser Ser Leu Gly Ser Gly Thr Ala Ser Leu Arg
450 455 460
Ser Asn Pro Lys Arg Gly Phe Tyr Ile Arg Ala Leu Phe Asp Tyr Asp
465 470 475 480
Lys Thr Lys Asp Cys Gly Phe Leu Ser Gln Ala Leu Ser Phe Arg Phe
485 490 495
Gly Asp Val Leu His Val Ile Asp Ala Ser Asp Glu Glu Trp Trp Gln
500 505 510
Ala Arg Arg Val His Ser Asp Ser Glu Thr Asp Asp Ile Gly Phe Ile
515 520 525
Pro Ser Lys Arg Arg Val Glu Arg Arg Glu Trp Ser Arg Leu Lys Ala
530 535 540
Lys Asp Trp Gly Ser Ser Ser Gly Ser Gln Gly Arg Glu Asp Ser Val
545 550 555 560
Leu Ser Tyr Glu Thr Val Thr Gln Met Glu Val His Tyr Ala Arg Pro
565 570 575
Ile Ile Ile Leu Gly Pro Thr Lys Asp Arg Ala Asn Asp Asp Leu Leu
580 585 590
Ser Glu Phe Pro Asp Lys Phe Gly Ser Cys Val Pro His Thr Thr Arg
595 600 605
Pro Lys Arg Glu Tyr Glu Ile Asp Gly Arg Asp Tyr His Phe Val Ser
610 615 620
Ser Arg Glu Lys Met Glu Lys Asp Ile Gln Ala His Lys Phe Ile Glu
625 630 635 640
Ala Gly Gln Tyr Asn Ser His Leu Tyr Gly Thr Ser Val Gln Ser Val
645 650 655
Arg Glu Val Ala Glu Gln Gly Lys His Cys Ile Leu Asp Val Ser Ala
660 665 670
Asn Ala Val Arg Arg Leu Gln Ala Ala His Leu His Pro Ile Ala Ile
675 680 685
Phe Ile Arg Pro Arg Ser Leu Glu Asn Val Leu Glu Ile Asn Lys Arg
690 695 700
Ile Thr Glu Glu Gln Ala Arg Lys Ala Phe Asp Arg Ala Thr Lys Leu
705 710 715 720
Glu Gln Glu Phe Thr Glu Cys Phe Ser Ala Ile Val Glu Gly Asp Ser
725 730 735
Phe Glu Glu Ile Tyr His Lys Val Lys Arg Val Ile Glu Asp Leu Ser
740 745 750
Gly Pro Tyr Ile Trp Val Pro Ala Arg Glu Arg Leu
755 760
<210> 58
<211> 3289
<212> DNA
<213> Artificial Sequence
<220>
<223> DLG4
<400> 58
ggcagaggca gagacatgga aagacagact ctagggttcc tgatgaaatc tatctcggcc 60
aacacaaaag ggagggtaca gtggtggggg gcacccaagc tagggtgtga gtaccctaag 120
tgtattcttc tgagatgtag gccattcact aactcttgga acagctacag tttcacagta 180
ggaagacccc cccagattca ctgcccctcc cttagtaaag cctctgagac cttcctgaac 240
attcccttct gtctttgccc tctgttcctt ccagagacca tgtgcccagg cagatggatt 300
cctcccgggc ctgagaggaa ctgcaggaat tctcctgcct cttacccgta aaaccccaac 360
ttctctagcc ctagggcagg aagtcccaaa caatttctac ccctttttct gcaattctca 420
ttggggtgag aggaggccca ggaggagaga gagctgggct cagcttcttt ttgagctgct 480
ggagccctct gtgaggaggc cctctttgct ggcttctcag gagagcgtgg ctaggttctg 540
cctgcctatg ggaagagggg gccagggtgt gtggagcaag atggtggcgg tgctggtgcc 600
ttgggacctg ggggaatggg acagctggtc ggctcagaga cggcctactt tactcacagc 660
tggaatttag tggggagaag cagctcaact ccaatcctgg aggattaggg agattaaagt 720
gagagaagag agagatgtcc cagagaccaa gagctcccag gtcagccctc tggctcctgg 780
cacccccact gctgcggtgg gcacccccac tcctcacagt gctgcatagc gacctcttcc 840
aggccttgct ggacatcctg gactattatg aggcttccct ctcagagagt cagaaatacc 900
gctaccaaga tgaagacacg ccccctctgg agcacagccc ggcccacctc cccaaccagg 960
ccaattctcc cccagtgatt gtcaacacag ataccctaga agccccagga tatgtgaacg 1020
ggaccgaggg ggagatggaa tacgaggaaa tcacattgga aaggggtaac tcaggtctgg 1080
gcttcagcat cgcaggtggc actgacaacc cacacatcgg tgacgaccca tccattttca 1140
tcaccaagat cattcctggt ggggctgcgg cccaggatgg ccgcctcagg gtcaacgaca 1200
gcatcctgtt tgtaaatgaa gtggacgtgc gcgaggtgac ccactcagcg gcggtggaag 1260
ccctcaaaga ggcaggctcc atcgttcgcc tctatgtcat gcgccggaag cccccggctg 1320
agaaggtcat ggagatcaag ctcatcaagg ggcctaaagg tcttggcttc agcatcgcag 1380
ggggcgtagg gaaccagcac atcccaggag ataatagcat ctatgtaaca aagatcatcg 1440
aagggggtgc tgcccacaag gatgggaggt tgcagattgg agacaagatc ctggcggtca 1500
acagtgtggg gctagaggac gtcatgcatg aagatgctgt ggcagccctg aagaacacgt 1560
atgatgttgt ctacctaaag gtggccaagc ccagcaatgc ctacctgagt gacagctatg 1620
ctcccccaga catcacaacc tcttattccc agcacctgga caatgagatc agtcacagca 1680
gctacctggg caccgactac cccacagcca tgacccccac ttcccctcgg cgctactctc 1740
cagtggccaa ggacctgctc ggggaggaag acattccccg agaaccgagg cgaattgtga 1800
tccaccgggg ctccacgggc ctgggcttca acatcgtggg tggcgaggac ggtgaaggca 1860
tcttcatctc ctttatcctg gccgggggcc ctgcagacct cagtggggag ctgcggaagg 1920
gggaccagat cctgtcggtc aacggtgtgg acctccgaaa tgccagccat gagcaggctg 1980
ccattgccct gaagaatgcg ggtcagacgg tcacgatcat cgctcagtat aaaccagaag 2040
agtacagccg attcgaggcc aagatccacg accttcggga acagctcatg aacagcagcc 2100
tgggctcagg gactgcgtcc ctgcggagca accccaaaag gggtttctac atcagggccc 2160
tgtttgatta cgacaagacc aaggactgcg gcttcctgag ccaggccctg agcttccgct 2220
ttggggatgt gctgcatgtc atcgatgcta gtgatgagga gtggtggcag gcacggcggg 2280
tccactctga cagtgagacc gacgacattg ggttcatccc cagcaaacgg cgggttgagc 2340
gacgagagtg gtcaaggtta aaggccaagg actggggctc cagctctgga tcgcagggtc 2400
gagaagactc ggttctgagc tacgagacag tgacgcagat ggaagtgcac tatgctcgcc 2460
ccatcatcat ccttgggccc accaaggacc gcgccaacga tgatcttctc tccgagttcc 2520
ccgacaagtt tggatcctgt gttccccata cgacacggcc caagcgggag tatgagatag 2580
atggccggga ttaccacttt gtgtcgtccc gggagaaaat ggagaaggac attcaggcgc 2640
acaagttcat tgaggccggc cagtacaaca gccacctcta tgggaccagc gtccagtccg 2700
tgcgagaggt ggcagagcag gggaagcact gcatcctcga tgtctcggcc aatgccgtgc 2760
ggcggctgca ggcggcccac ctgcacccca tcgccatctt catccgcccc cgctccctgg 2820
agaatgtgct agagattaac aagcggatca cagaggagca agcccgcaaa gccttcgaca 2880
gagccaccaa gctggagcag gagttcacag agtgcttctc agccatcgtg gagggtgaca 2940
gctttgagga gatctaccac aaggtgaagc gtgtcatcga ggacctctca ggcccctaca 3000
tctgggttcc agcccgagag agactctgat tcctgccctg gcttggcctg gactcgccct 3060
gcctccatca cctgggccct tggtctggac tgaattgccc aagcccttgg ctccccccgg 3120
cctccctccc accccttctt atttatttcc tttctaactg gatccagcct gttggagggg 3180
ggacactcct ctgcatgtat ccccgcaccc cagaactggg ctcctgaacg ccaggaacct 3240
ggggtctggg ggggagctgg gctccttgtt ccgagccctt gctccttag 3289
<210> 59
<211> 655
<212> PRT
<213> Artificial Sequence
<220>
<223> ACADVL
<400> 59
Met Gln Ala Ala Arg Met Ala Ala Ser Leu Gly Arg Gln Leu Leu Arg
1 5 10 15
Leu Gly Gly Gly Ser Ser Arg Leu Thr Ala Leu Leu Gly Gln Pro Arg
20 25 30
Pro Gly Pro Ala Arg Arg Pro Tyr Ala Gly Gly Ala Ala Gln Leu Ala
35 40 45
Leu Asp Lys Ser Asp Ser His Pro Ser Asp Ala Leu Thr Arg Lys Lys
50 55 60
Pro Ala Lys Ala Glu Ser Lys Ser Phe Ala Val Gly Met Phe Lys Gly
65 70 75 80
Gln Leu Thr Thr Asp Gln Val Phe Pro Tyr Pro Ser Val Leu Asn Glu
85 90 95
Glu Gln Thr Gln Phe Leu Lys Glu Leu Val Glu Pro Val Ser Arg Phe
100 105 110
Phe Glu Glu Val Asn Asp Pro Ala Lys Asn Asp Ala Leu Glu Met Val
115 120 125
Glu Glu Thr Thr Trp Gln Gly Leu Lys Glu Leu Gly Ala Phe Gly Leu
130 135 140
Gln Val Pro Ser Glu Leu Gly Gly Val Gly Leu Cys Asn Thr Gln Tyr
145 150 155 160
Ala Arg Leu Val Glu Ile Val Gly Met His Asp Leu Gly Val Gly Ile
165 170 175
Thr Leu Gly Ala His Gln Ser Ile Gly Phe Lys Gly Ile Leu Leu Phe
180 185 190
Gly Thr Lys Ala Gln Lys Glu Lys Tyr Leu Pro Lys Leu Ala Ser Gly
195 200 205
Glu Thr Val Ala Ala Phe Cys Leu Thr Glu Pro Ser Ser Gly Ser Asp
210 215 220
Ala Ala Ser Ile Arg Thr Ser Ala Val Pro Ser Pro Cys Gly Lys Tyr
225 230 235 240
Tyr Thr Leu Asn Gly Ser Lys Leu Trp Ile Ser Asn Gly Gly Leu Ala
245 250 255
Asp Ile Phe Thr Val Phe Ala Lys Thr Pro Val Thr Asp Pro Ala Thr
260 265 270
Gly Ala Val Lys Glu Lys Ile Thr Ala Phe Val Val Glu Arg Gly Phe
275 280 285
Gly Gly Ile Thr His Gly Pro Pro Glu Lys Lys Met Gly Ile Lys Ala
290 295 300
Ser Asn Thr Ala Glu Val Phe Phe Asp Gly Val Arg Val Pro Ser Glu
305 310 315 320
Asn Val Leu Gly Glu Val Gly Ser Gly Phe Lys Val Ala Met His Ile
325 330 335
Leu Asn Asn Gly Arg Phe Gly Met Ala Ala Ala Leu Ala Gly Thr Met
340 345 350
Arg Gly Ile Ile Ala Lys Ala Val Asp His Ala Thr Asn Arg Thr Gln
355 360 365
Phe Gly Glu Lys Ile His Asn Phe Gly Leu Ile Gln Glu Lys Leu Ala
370 375 380
Arg Met Val Met Leu Gln Tyr Val Thr Glu Ser Met Ala Tyr Met Val
385 390 395 400
Ser Ala Asn Met Asp Gln Gly Ala Thr Asp Phe Gln Ile Glu Ala Ala
405 410 415
Ile Ser Lys Ile Phe Gly Ser Glu Ala Ala Trp Lys Val Thr Asp Glu
420 425 430
Cys Ile Gln Ile Met Gly Gly Met Gly Phe Met Lys Glu Pro Gly Val
435 440 445
Glu Arg Val Leu Arg Asp Leu Arg Ile Phe Arg Ile Phe Glu Gly Thr
450 455 460
Asn Asp Ile Leu Arg Leu Phe Val Ala Leu Gln Gly Cys Met Asp Lys
465 470 475 480
Gly Lys Glu Leu Ser Gly Leu Gly Ser Ala Leu Lys Asn Pro Phe Gly
485 490 495
Asn Ala Gly Leu Leu Leu Gly Glu Ala Gly Lys Gln Leu Arg Arg Arg
500 505 510
Ala Gly Leu Gly Ser Gly Leu Ser Leu Ser Gly Leu Val His Pro Glu
515 520 525
Leu Ser Arg Ser Gly Glu Leu Ala Val Arg Ala Leu Glu Gln Phe Ala
530 535 540
Thr Val Val Glu Ala Lys Leu Ile Lys His Lys Lys Gly Ile Val Asn
545 550 555 560
Glu Gln Phe Leu Leu Gln Arg Leu Ala Asp Gly Ala Ile Asp Leu Tyr
565 570 575
Ala Met Val Val Val Leu Ser Arg Ala Ser Arg Ser Leu Ser Glu Gly
580 585 590
His Pro Thr Ala Gln His Glu Lys Met Leu Cys Asp Thr Trp Cys Ile
595 600 605
Glu Ala Ala Ala Arg Ile Arg Glu Gly Met Ala Ala Leu Gln Ser Asp
610 615 620
Pro Trp Gln Gln Glu Leu Tyr Arg Asn Phe Lys Ser Ile Ser Lys Ala
625 630 635 640
Leu Val Glu Arg Gly Gly Val Val Thr Ser Asn Pro Leu Gly Phe
645 650 655
<210> 60
<211> 5400
<212> DNA
<213> Artificial Sequence
<220>
<223> ACADVL
<400> 60
agtcagggtt aggggcgcca ggacgtggcg tgcaggacgc cagagctggg tcagagctcg 60
agccagcggc gcccggagag attcggagat gcaggcggct cggatggccg cgagcttggg 120
gcggcagctg ctgaggctcg ggggcggaag gtctgtgtgt gacaagaggg acggtgggca 180
gcggccctgg gcaccgggcc ggcactgaac ccccactccc cacagctcgc ggctcacggc 240
gctcctgggg cagccccggc ccggccctgc ccggcggccc tatgccgggg gtgccgctca 300
ggtaagtcac cgcagccttg gcaagggggt gtgggagcgg cggtccgctt cggcgcccgc 360
catcggcagg gatctccctc ttggtgccag gtacctgcct actgctcagt cgccgaaagt 420
aggggaaagg gcaagccagg gtcgcctagg gcgaaactag gggaaaggtc acccgttcgc 480
ggcctccccg cgccggtctc gcctgttctc cccttgacac agcggaagtc ccttccctga 540
acttgctaac cgtctctttt cccagctggc tctggacaag tcagattccc acccctctga 600
cgctctgacc aggaaaaaac cggccaaggc ggtaggtagc cccgaggcca ggtggacctt 660
agccagaccc aaccagagcc ctgaaatttg cctctctctg cccaggaatc taagtccttt 720
gctgtgggaa tgttcaaagg ccagctcacc acagatcagg tgttcccata cccgtccggt 780
aagggaaggg ataatcagag ctgggtgggg ccagggtggt ttcccctgcc agcctggcct 840
gaccagcctg tcccccaccc tctgcagtgc tcaacgaaga gcagacacag tttcttaaag 900
agctggtgga gcctgtgtcc cgtttcttcg aggtaaggaa tgactcgggg cttggtccct 960
ggtgaggtgt ttggagatgt taagctcaaa aggagcctgg atgtgggatc ctgtgccttc 1020
cccaggaagt gaacgatccc gccaagaatg acgctctgga gatggtggag gagaccactt 1080
ggcagggcct caaggagctg ggggcctttg gtctgcaagt gcccagtgag ctgggtggtg 1140
tgggcctttg caacacccag gtgagggcgc cctatcgcac atcccagtat gccatacccc 1200
agcttggcag actcagctct tttgccatag acctagagac tagggctaag gtctcttcta 1260
agcacctgcc ctgggtgcct gtgggatgga tgttaactgt ccaaacataa cacaatttac 1320
atgcagtccc ctaggcctga gcatgggcct caccctggtt cccaagtcct tacaaatctc 1380
taagttgggg attgcctaac aatagactga ctagtagcaa gtcaccctcc tacctagacc 1440
taagacagac cagcattctc tgctgtgccc tttgcacacc ccacttcttt tctacacact 1500
ggggatggcc caggtcagca ctgccctagg tcaggaactg ccctgttgcc cacactctcc 1560
tgttaaggtc aggtccccct gcagccagtg acaaccccag attcctgctt cccctccagt 1620
acgcccgttt ggtggagatc gtgggcatgc atgaccttgg cgtgggcatt accctggggg 1680
cccatcagag catcggtttc aaaggcatcc tgctctttgg cacaaaggcc cagaaagaaa 1740
aatacctccc caagctggca tctggtgagg caaccctagg agagccaggg attggggggc 1800
acactgggct tggcacagat taggccagtt ggcacttaga ttatcagatg gctgagcatt 1860
tcagttgggg gaaggttttg gggaggcatc acagtgtgct ggttgggaca tgcaaaagaa 1920
ctggatactc ccaggtgtta agggggaact gcctgctgga gggatgggga agtgggccga 1980
ggggactttg aagctcatca gaacttgggg taaagtagct ctctccccaa caggggagac 2040
tgtggccgct ttctgtctaa ccgagccctc aagcgggtca gatgcagcct ccatccgaac 2100
ctctgctgtg cccagcccct gtggaaaata ctataccctc aatggaagca agctttggat 2160
caggcaacct gcctcccatt tctccccttc tcctccgccc aattccaggc cccactgctc 2220
cccgtcctcc acgccctgaa tatcccattc ttccacagta atgggggcct agcagacatc 2280
ttcacggtct ttgccaagac accagttaca gatccagcca caggagccgt gaaggagaag 2340
atcacagctt ttgtggtgga gaggggcttc gggggcatta cccagtgagt gaatttgggt 2400
tgggggagct taggactgag gggcaggact gggctcctgg gcagatggct gttgcaagtc 2460
accctgggga cgtgtgcaaa agccaaagca ggtggactga atgtggcctt tggactaata 2520
tgtatgcaac gagtcaaagt tttggctcca gcaaccaagt ccaacacaaa ataggacata 2580
gccaggcctc cttaacctca gggcctgagg ggaagtggtg ctgtagcctc taatagtcta 2640
gtggtcgtca ttcctccctg gtgcataagg agcgaaggag cagtttttcc cccagtgaca 2700
acctgttgaa cacacctctg ctttcccaca ctgccctgac acagtgggcc ccctgagaag 2760
aagatgggca tcaaggcttc aaacacagca gaggtgttct ttgatggagt acgggtgcca 2820
tcggagaacg tgctgggtga ggttgggagt ggcttcaagg ttgccatgca catcctcaac 2880
aatggaaggt ttggcatggc tgcggccctg gcaggtacca tgagaggcat cattgctaag 2940
gcggtgagta ccctgcccga gtccctaggt aacccaaaca gaagtctcac tgtccccctt 3000
gccatgtgtc cctgatcact tgcaggcact ccctacacta gaaactcctc ccctaccagc 3060
agcccgactt gctagcttag gtctccatcc agcgtagact gaactctggt tgtatgcaaa 3120
acccatccct ctgcgcaagc ccagcccctt cctagggaga ctgcagaacc acactgaacc 3180
acagcgggat gtgtggaccc tcttccaggt agatcatgcc actaatcgta cccagtttgg 3240
ggagaaaatt cacaactttg ggctgatcca ggagaagctg gcacggatgg ttatgctgca 3300
gtatgtaact gaggtgaggg cctcccaagc ccctctccct ggagccctgg gcgtttcttc 3360
ccagtcgggt cagactacaa cccccagcag cacctggggc agtgggtctc cagctttaca 3420
ccaatgccct aggggatgcg gggaggcata gtcagctcag cttctgcgaa gagagacagc 3480
aatgatgttc tgctcaggtg cctgccagca gtaccagaag ttaattctac ctcatccctt 3540
acatccaccc cttttaagaa aacaaatcct ggaagcacct gatgaactga cccagaacaa 3600
gtatctgcct gacctgacaa gctaggtcag cccttatctt ggagatctgg gtgatgaggc 3660
caagtctgac aaagcccttt gcaattttcc ttcccatgtc ccaactatgc aacctcagtc 3720
catggcttac atggtgagtg ctaacatgga ccagggagcc acggacttcc agatagaggc 3780
cgccatcagc aaaatctttg gctcggtgag gtcccaggca tgctggaggg agtccagttt 3840
gggtgctcag ctcccaaaac cagtctcatc tgttctttgt ccctaggagg cagcctggaa 3900
ggtgacagat gaatgcatcc aaatcatggg gggtatgggc ttcatgaagg tacaggacgg 3960
tcttctgcag agcctcggct gggccagggg tgggatggca catctcagca cgggcatata 4020
atttgtgtgg ccctgtgcta ggaacctgga gtagagcgtg tgctccgaga tcttcgcatc 4080
ttccggatct ttgaggggac aaatgacatt cttcggctgt ttgtggctct gcagggctgt 4140
atggtaagac agagaattgg gtgggggtag aggtggggag gacagtgagt cctgactgct 4200
ggaccctctt cccccatagg acaaaggaaa ggagctctct gggcttggca gtgctctaaa 4260
gaatcccttt gggaatgctg gcctcctgct aggagaggca ggcaaacagc tgaggcggta 4320
ggcttagggc cagagccagg ggagggcagg gtggtgtatg gcaactaacc agtcattctc 4380
cctcttcctc tcaggcgggc agggctgggc agcggcctga gtctcagcgg acttgtccac 4440
ccggagttga gtcggagtgg cgagctggta aggcggccag gggtccagga gagcctgcat 4500
cagggactgc agccgatggc ccctctgagc cccgcactgt ccccatctct taaggcagta 4560
cgggctctgg agcagtttgc cactgtggtg gaggccaagc tgataaaaca caagaagggg 4620
attgtcagta agtgagctct acaccattcc gcccctccct ttcctctcct tgagactaat 4680
gcccccaccc ccacccccac cccacctacc ggacagatga acagtttctg ctgcagcggc 4740
tggcagacgg ggccatcgac ctctatgcca tggtggtggt tctctcgagg tgaggaggca 4800
ggcagggaat gcctgagccg cagggggcct gggcctggat cccagccggc ccagatttat 4860
tttcatctcc tgcttcctgc cagggcctca agatccctga gtgagggcca ccccacggcc 4920
cagcatgaga aaatgctctg tgacacctgg tgtatcgagg tgagactcgg ggctgccaag 4980
ctcaggtgag ggctggaggt gcaggcccaa cccctccttc cctctcccca ggctgcagct 5040
cggatccgag agggcatggc cgccctgcag tctgacccct ggcagcaaga gctctaccgc 5100
aacttcaaaa gcatctccaa ggccttggtg gagcggggtg gtgtggtcac cagcaaccca 5160
cttggcttct gaatactccc ggccagggcc tgtcccagtt atgtgccttc cctcaagcca 5220
aagccgaagc ccctttcctt aaggccctgg tttgtcccga aggggcctag tgttcccagc 5280
actgtgcctg ctctcaagag cacttactgc ctcgcaaata ataaaaattt ctagccagtc 5340
atgctttgct cctgtgtgac ggttctttnc ccctgctgcc tgcctccctc ccaaagaaag 5400
<210> 61
<211> 752
<212> PRT
<213> Artificial Sequence
<220>
<223> CDRT1
<400> 61
Met Glu Asn Leu Glu Ser Arg Leu Lys Asn Ala Pro Tyr Phe Arg Cys
1 5 10 15
Glu Lys Gly Thr Asp Ser Ile Pro Leu Cys Arg Lys Cys Glu Thr Arg
20 25 30
Val Leu Ala Trp Lys Ile Phe Ser Thr Lys Glu Trp Phe Cys Arg Ile
35 40 45
Asn Asp Ile Ser Gln Arg Arg Phe Leu Val Gly Ile Leu Lys Gln Leu
50 55 60
Asn Ser Leu Tyr Leu Leu His Tyr Phe Gln Asn Ile Leu Gln Thr Thr
65 70 75 80
Gln Gly Lys Asp Phe Ile Tyr Asn Arg Ser Arg Ile Asp Leu Ser Lys
85 90 95
Lys Glu Gly Lys Val Val Lys Ser Ser Leu Asn Gln Met Leu Asp Lys
100 105 110
Thr Val Glu Gln Lys Met Lys Glu Ile Leu Tyr Trp Phe Ala Asn Ser
115 120 125
Thr Gln Trp Thr Lys Ala Asn Tyr Thr Leu Leu Leu Leu Gln Met Cys
130 135 140
Asn Pro Lys Leu Leu Leu Thr Ala Ala Asn Val Ile Arg Val Leu Phe
145 150 155 160
Leu Arg Glu Glu Asn Asn Ile Ser Gly Leu Asn Gln Asp Ile Thr Asp
165 170 175
Val Cys Phe Ser Pro Glu Lys Asp His Ser Ser Lys Ser Ala Thr Ser
180 185 190
Gln Val Tyr Trp Thr Ala Lys Thr Gln His Thr Ser Leu Pro Leu Ser
195 200 205
Lys Ala Pro Glu Asn Glu His Phe Leu Gly Ala Ala Ser Asn Pro Glu
210 215 220
Glu Pro Trp Arg Asn Ser Leu Arg Cys Ile Ser Glu Met Asn Arg Leu
225 230 235 240
Phe Ser Gly Lys Ala Asp Ile Thr Lys Pro Gly Tyr Asp Pro Cys Asn
245 250 255
Leu Leu Val Asp Leu Asp Asp Ile Arg Asp Leu Ser Ser Gly Phe Ser
260 265 270
Lys Tyr Arg Asp Phe Ile Arg Tyr Leu Pro Ile His Leu Ser Lys Tyr
275 280 285
Ile Leu Arg Met Leu Asp Arg His Thr Leu Asn Lys Cys Ala Ser Val
290 295 300
Ser Gln His Trp Ala Ala Met Ala Gln Gln Val Lys Met Asp Leu Ser
305 310 315 320
Ala His Gly Phe Ile Gln Asn Gln Ile Thr Phe Leu Gln Gly Ser Tyr
325 330 335
Thr Arg Gly Ile Asp Pro Asn Tyr Ala Asn Lys Val Ser Ile Pro Val
340 345 350
Pro Lys Met Val Asp Asp Gly Lys Ser Met Arg Val Lys His Pro Lys
355 360 365
Trp Lys Leu Arg Thr Lys Asn Glu Tyr Asn Leu Trp Thr Ala Tyr Gln
370 375 380
Asn Glu Glu Thr Gln Gln Val Leu Met Glu Glu Arg Asn Val Phe Cys
385 390 395 400
Gly Thr Tyr Asn Val Arg Ile Leu Ser Asp Thr Trp Asp Gln Asn Arg
405 410 415
Val Ile His Tyr Ser Gly Gly Asp Leu Ile Ala Val Ser Ser Asn Arg
420 425 430
Lys Ile His Leu Leu Asp Ile Ile Gln Val Lys Ala Ile Pro Val Glu
435 440 445
Phe Arg Gly His Ala Gly Ser Val Arg Ala Leu Phe Leu Cys Glu Glu
450 455 460
Glu Asn Phe Leu Leu Ser Gly Ser Tyr Asp Leu Ser Ile Arg Tyr Trp
465 470 475 480
Asp Leu Lys Ser Gly Val Cys Thr Arg Ile Phe Gly Gly His Gln Gly
485 490 495
Thr Ile Thr Cys Met Asp Leu Cys Lys Asn Arg Leu Val Ser Gly Gly
500 505 510
Arg Asp Cys Gln Val Lys Val Trp Asp Val Asp Thr Gly Lys Cys Leu
515 520 525
Lys Thr Phe Arg His Lys Asp Pro Ile Leu Ala Thr Arg Ile Asn Asp
530 535 540
Thr Tyr Ile Val Ser Ser Cys Glu Arg Gly Leu Val Lys Val Trp His
545 550 555 560
Ile Ala Met Ala Gln Leu Val Lys Thr Leu Ser Gly His Glu Gly Ala
565 570 575
Val Lys Cys Leu Phe Phe Asp Gln Trp His Leu Leu Ser Gly Ser Thr
580 585 590
Asp Gly Leu Val Met Ala Trp Ser Met Val Gly Lys Tyr Glu Arg Cys
595 600 605
Leu Met Ala Phe Lys His Pro Lys Glu Val Leu Asp Val Ser Leu Leu
610 615 620
Phe Leu Arg Val Ile Ser Ala Cys Ala Asp Gly Lys Ile Arg Ile Tyr
625 630 635 640
Asn Phe Phe Asn Gly Asn Cys Met Lys Val Ile Lys Ala Asn Gly Arg
645 650 655
Gly Asp Pro Val Leu Ser Phe Phe Ile Gln Gly Asn Arg Ile Ser Val
660 665 670
Cys His Ile Ser Thr Phe Ala Lys Arg Ile Asn Val Gly Trp Asn Gly
675 680 685
Ile Glu Pro Ser Ala Thr Ala Gln Gly Gly Asn Ala Ser Leu Thr Glu
690 695 700
Cys Ala His Val Arg Leu His Ile Ala Gly His Leu Pro Ala Ser Arg
705 710 715 720
Leu Pro Val Ala Ala Val Gln Pro Met Thr Gly Gly Met Ala Pro Thr
725 730 735
Thr Ala Pro Thr His Val Leu Ala Met Leu Ile Leu Phe Ser Gly Val
740 745 750
<210> 62
<211> 2780
<212> DNA
<213> Artificial Sequence
<220>
<223> CDRT1
<400> 62
agacttctgc tggcagttac tgagagagat aggctttcca tccatggcag ccatttactt 60
ttgctctggg agacgtttgt aatagaaaag gcacaactgg ggtatttatt catttccccc 120
cgttcctcta gtgtttggtg gcgttgccgt tgcaagtgcg cagggctaaa atgaactggt 180
tatcttagga tcatggaaaa cctggaatca aggctcaaga atgcccccta ttttcgttgt 240
gagaagggaa ccgattccat ccctctatgc cggaagtgtg agacgcgtgt cttagcctgg 300
aagatcttct ctaccaaaga gtggttctgc aggatcaatg acatatcaca gaggaggttt 360
ctagttggca ttctgaagca gttaaatagc ttatatttgt tacactattt ccaaaatatc 420
cttcagacca cacagggaaa ggatttcatc tataacaggt cccggatcga cctcagcaag 480
aaagagggga aagttgtgaa gtcctccttg aaccaaatgt tggataaaac agtagaacag 540
aagatgaaag agatcttgta ttggtttgcg aacagcaccc agtggaccaa ggcgaattat 600
actctcttac tgctgcagat gtgcaacccc aaattactgc tcactgctgc caatgtgatc 660
agagtcctgt ttctgagaga ggagaacaat atctcagggc tcaatcaaga catcacagat 720
gtgtgttttt cccctgagaa agaccacagc tccaagtctg cgacctcaca agtctattgg 780
acagccaaaa ctcagcacac atcccttcct ttgtccaaag ccccagaaaa tgaacacttc 840
cttggggcag catctaaccc tgaggaacca tggaggaatt cactccggtg tatatccgaa 900
atgaataggc tgttttctgg aaaagcagac ataaccaagc cagggtacga tccctgcaat 960
ctattggttg acctggatga catcagagac ctgtcttctg ggttcagcaa ataccgagac 1020
ttcatccgtt acctgcccat ccacctctcc aagtacattc taagaatgct ggatagacac 1080
accctgaaca agtgcgcctc tgtgagccag cactgggccg ccatggctca acaggtcaag 1140
atggacttgt cagcgcacgg cttcattcag aaccagatta ccttcttgca ggggtcctat 1200
acaagaggaa ttgatcctaa ttatgccaat aaggtttcta tcccagttcc taaaatggta 1260
gatgacggga agagcatgcg tgtgaaacat ccgaagtgga agttgagaac gaagaatgag 1320
tacaacctgt ggactgcata ccagaacgag gaaacgcagc aggtcctgat ggaggagaga 1380
aatgttttct gtgggaccta caatgttcgc attctctctg acacgtggga tcaaaaccga 1440
gtcatccact attccggggg agatctgata gctgtgtcat ctaatcgaaa gatccatctt 1500
ctggacatca tacaagtgaa agcgataccc gttgaattcc gaggccatgc tgggagtgtc 1560
cgggccctct tcctgtgtga ggaggaaaac tttctcctaa gcgggagcta tgacctaagt 1620
atcagatact gggatctgaa aagtggggtt tgcacacgaa tcttcggtgg tcaccagggg 1680
actatcactt gcatggactt gtgtaagaac aggctcgtat ctggaggaag agattgccag 1740
gtaaaagtat gggatgtaga cacagggaag tgcctgaaga cgtttagaca caaagacccc 1800
atcttggcca ccaggatcaa tgatacctac attgtgagca gctgtgagcg agggctggtg 1860
aaagtgtggc acattgccat ggcccagttg gtaaagactc tcagtggcca cgagggagct 1920
gtgaaatgcc tgttctttga ccagtggcat ctcctctcag gaagcactga tggcctggtc 1980
atggcctgga gcatggtggg gaagtacgag cgctgcctga tggccttcaa gcatcccaaa 2040
gaggtgctcg acgtgtccct tctcttcctc cgggtcatca gcgcctgtgc agatggcaag 2100
atccgaattt acaatttctt caatgggaac tgtatgaagg tgataaaagc caatggcaga 2160
ggcgatcctg tgctgtcctt ctttattcag ggcaacagaa tttcagtctg ccacatcagc 2220
acatttgcta aaagaattaa cgtgggatgg aatggaatcg aaccaagtgc tacagctcag 2280
ggaggaaatg cctccttgac cgagtgtgct catgtgagac tccacatcgc aggacactta 2340
ccagcatcga ggctgcccgt ggccgctgtc cagcccatga caggcgggat ggccccaacc 2400
acagctccga cccatgtgtt ggcaatgctg atccttttca gtggtgtgta gcagcaggta 2460
tacaggaaaa tgttgaagag ccccagggct cctgtgagtg gattcacccc caaggtcaga 2520
atggcaactc ctggaacagc acaacaagtg gcaaaggaca cagctagcaa cgggctggaa 2580
aaaggcagag aacgtggagt gatcatctcc aactcaactc aatcacactg cctaaaatcg 2640
gcactgacaa agtacaaata atccacctat ccagtagagg cagcactgat ctggctgatt 2700
cctggatatg gggaatcctc tttatgaaac aaacttacaa taagaaagaa tatgtttgtg 2760
ggaaaaaaaa aaaaaaaaaa 2780
<210> 63
<211> 269
<212> PRT
<213> Artificial Sequence
<220>
<223> Basigin
<400> 63
Met Ala Ala Ala Leu Phe Val Leu Leu Gly Phe Ala Leu Leu Gly Thr
1 5 10 15
His Gly Ala Ser Gly Ala Ala Gly Thr Val Phe Thr Thr Val Glu Asp
20 25 30
Leu Gly Ser Lys Ile Leu Leu Thr Cys Ser Leu Asn Asp Ser Ala Thr
35 40 45
Glu Val Thr Gly His Arg Trp Leu Lys Gly Gly Val Val Leu Lys Glu
50 55 60
Asp Ala Leu Pro Gly Gln Lys Thr Glu Phe Lys Val Asp Ser Asp Asp
65 70 75 80
Gln Trp Gly Glu Tyr Ser Cys Val Phe Leu Pro Glu Pro Met Gly Thr
85 90 95
Ala Asn Ile Gln Leu His Gly Pro Pro Arg Val Lys Ala Val Lys Ser
100 105 110
Ser Glu His Ile Asn Glu Gly Glu Thr Ala Met Leu Val Cys Lys Ser
115 120 125
Glu Ser Val Pro Pro Val Thr Asp Trp Ala Trp Tyr Lys Ile Thr Asp
130 135 140
Ser Glu Asp Lys Ala Leu Met Asn Gly Ser Glu Ser Arg Phe Phe Val
145 150 155 160
Ser Ser Ser Gln Gly Arg Ser Glu Leu His Ile Glu Asn Leu Asn Met
165 170 175
Glu Ala Asp Pro Gly Gln Tyr Arg Cys Asn Gly Thr Ser Ser Lys Gly
180 185 190
Ser Asp Gln Ala Ile Ile Thr Leu Arg Val Arg Ser His Leu Ala Ala
195 200 205
Leu Trp Pro Phe Leu Gly Ile Val Ala Glu Val Leu Val Leu Val Thr
210 215 220
Ile Ile Phe Ile Tyr Glu Lys Arg Arg Lys Pro Glu Asp Val Leu Asp
225 230 235 240
Asp Asp Asp Ala Gly Ser Ala Pro Leu Lys Ser Ser Gly Gln His Gln
245 250 255
Asn Asp Lys Gly Lys Asn Val Arg Gln Arg Asn Ser Ser
260 265
<210> 64
<211> 1586
<212> DNA
<213> Artificial Sequence
<220>
<223> Basigin
<400> 64
ggttgtagga ccggcgagga ataggaatca tggcggctgc gctgttcgtg ctgctgggat 60
tcgcgctgct gggcacccac ggagcctccg gggctgccgg cacagtcttc actaccgtag 120
aagaccttgg ctccaagata ctcctcacct gctccttgaa tgacagcgcc acagaggtca 180
cagggcaccg ctggctgaag gggggcgtgg tgctgaagga ggacgcgctg cccggccaga 240
aaacggagtt caaggtggac tccgacgacc agtggggaga gtactcctgc gtcttcctcc 300
ccgagcccat gggcacggcc aacatccagc tccacgggcc tcccagagtg aaggccgtga 360
agtcgtcaga acacatcaac gagggggaga cggccatgct ggtctgcaag tcagagtccg 420
tgccacctgt cactgactgg gcctggtaca agatcactga ctctgaggac aaggccctca 480
tgaacggctc cgagagcagg ttcttcgtga gttcctcgca gggccggtca gagctacaca 540
ttgagaacct gaacatggag gccgaccccg gccagtaccg gtgcaacggc accagctcca 600
agggctccga ccaggccatc atcacgctcc gcgtgcgcag ccacctggcc gccctctggc 660
ccttcctggg catcgtggct gaggtgctgg tgctggtcac catcatcttc atctacgaga 720
agcgccggaa gcccgaggac gtcctggatg atgacgacgc cggctctgca cccctgaaga 780
gcagcgggca gcaccagaat gacaaaggca agaacgtccg ccagaggaac tcttcctgag 840
gcaggtggcc cgaggacgct ccctgctccg cgtctgcgcc gccgccggag tccactccca 900
gtgcttgcaa gattccaagt tctcacctct taaagaaaac ccaccccgta gattcccatc 960
atacacttcc ttctttttta aaaaagttgg gttttctcca ttcaggattc tgttccttag 1020
gattttttct tctgaagtgt ttcacgagag cccgggagct gctgccctgc ggccccgtct 1080
gtggctttca gcctctgggt ctgagtcatg gccgggtggg cggcacagcc ttctccactg 1140
gccggagtca gtgccaggtc cttgcccttt gtggaaagtc acaggtcaca cgaggggccc 1200
cgtgtcctgc ctgtctgaag ccaatgctgt ctggttgcgc catttttgtg cttttatgtt 1260
taattttatg agggccacgg gtctgtgttc gactcagcct cagggacgac tctgacctct 1320
tggccacaga ggactcactt gcccacaccg agggcgaccc cgtcacagcc tcaagtcact 1380
cccaagcccc ctccttgtct gtgcatccgg gggcagctct ggagggggtt tgctggggaa 1440
ctggcgccat cgccgggact ccagaaccgc agaagcctcc ccagctcacc cctggaggac 1500
ggccggctct ctatagcacc agggctcacg tgggaacccc cctcccaccc accgccacaa 1560
taaagatcgc ccccacctcc agggtc 1586
<210> 65
<211> 20
<212> DNA
<213> Artificial Sequence
<220>
<223> TPH2(chr12, start: 72335380) PCR primer sequence(left)
<400> 65
gagtgacacg gcaacttcac 20
<210> 66
<211> 20
<212> DNA
<213> Artificial Sequence
<220>
<223> TPH2(chr12, start: 72335380) PCR primer sequence(right)
<400> 66
caactgctgt cttgccactt 20
<210> 67
<211> 20
<212> DNA
<213> Artificial Sequence
<220>
<223> HTR1F(chr3, start: 88040035) PCR primer sequence(left)
<400> 67
tggtgtccct cactctgtct 20
<210> 68
<211> 21
<212> DNA
<213> Artificial Sequence
<220>
<223> HTR1F(chr3, start: 88040035) PCR primer sequence(right)
<400> 68
gccagtggga tgtagaaagc t 21
<210> 69
<211> 20
<212> DNA
<213> Artificial Sequence
<220>
<223> COL27A1(chr9, start: 117070009) PCR primer sequence(left)
<400> 69
tcacaagatg cagggtccat 20
<210> 70
<211> 20
<212> DNA
<213> Artificial Sequence
<220>
<223> COL27A1(chr9, start: 117070009) PCR primer sequence(right)
<400> 70
ctggggatag aggcagacag 20
<210> 71
<211> 20
<212> DNA
<213> Artificial Sequence
<220>
<223> TPH2(chr12, start: 72335380) PCR primer sequence(left)
<400> 71
gagtgacacg gcaacttcac 20
<210> 72
<211> 20
<212> DNA
<213> Artificial Sequence
<220>
<223> TPH2(chr12, start: 72335380) PCR primer sequence(right)
<400> 72
caactgctgt cttgccactt 20
<210> 73
<211> 20
<212> DNA
<213> Artificial Sequence
<220>
<223> COL27A1(chr9) PCR primer sequence(left)
<400> 73
cagccaccaa aatccccaaa 20
<210> 74
<211> 20
<212> DNA
<213> Artificial Sequence
<220>
<223> COL27A1(chr9) PCR primer sequence(right)
<400> 74
aactggacgg gaagtaggtg 20
<210> 75
<211> 17
<212> DNA
<213> Artificial Sequence
<220>
<223> DPP6(chr7) PCR primer sequence(left)
<400> 75
agtgggaacc ggagaga 17
<210> 76
<211> 19
<212> DNA
<213> Artificial Sequence
<220>
<223> DPP6(chr7) PCR primer sequence(right)
<400> 76
ggaacgtaag gcgaattcc 19
<210> 77
<211> 20
<212> DNA
<213> Artificial Sequence
<220>
<223> BTBD9(chr6) PCR primer sequence(left)
<400> 77
cgctgcctcc tttattggtg 20
<210> 78
<211> 20
<212> DNA
<213> Artificial Sequence
<220>
<223> BTBD9(chr6) PCR primer sequence(right)
<400> 78
ctttgagtgt ccagagcagc 20
<210> 79
<211> 20
<212> DNA
<213> Artificial Sequence
<220>
<223> IL1RN(chr2) PCR primer sequence(left)
<400> 79
aacatcactg acctgagcga 20
<210> 80
<211> 20
<212> DNA
<213> Artificial Sequence
<220>
<223> IL1RN(chr2) PCR primer sequence(right)
<400> 80
ggcagtacta ctcgtcctcc 20
<210> 81
<211> 20
<212> DNA
<213> Artificial Sequence
<220>
<223> BTBD9(chr6, start: 38142846) PCR primer sequence(left)
<400> 81
cgctgcctcc tttattggtg 20
<210> 82
<211> 20
<212> DNA
<213> Artificial Sequence
<220>
<223> BTBD9(chr6, start: 38142846) PCR primer sequence(right)
<400> 82
ctttgagtgt ccagagcagc 20
<210> 83
<211> 19
<212> DNA
<213> Artificial Sequence
<220>
<223> SGCE(chr7) PCR primer sequence(left)
<400> 83
gacacaagtg ttttgcctt 19
<210> 84
<211> 18
<212> DNA
<213> Artificial Sequence
<220>
<223> SGCE(chr7) PCR primer sequence(right)
<400> 84
ggggtcatag tttacccg 18
<210> 85
<211> 20
<212> DNA
<213> Artificial Sequence
<220>
<223> MECP2(chr10) PCR primer sequence(left)
<400> 85
aggcatcttg acaaggagct 20
<210> 86
<211> 20
<212> DNA
<213> Artificial Sequence
<220>
<223> MECP2(chr10) PCR primer sequence(right)
<400> 86
ttcacggtaa ctgggagagg 20
<210> 87
<211> 22
<212> DNA
<213> Artificial Sequence
<220>
<223> ITGA1(chr5) PCR primer sequence(left)
<400> 87
tcggagtgaa aatgcatctc tg 22
<210> 88
<211> 20
<212> DNA
<213> Artificial Sequence
<220>
<223> ITGA1(chr5) PCR primer sequence(right)
<400> 88
tctgtcactt accgagagca 20
<210> 89
<211> 20
<212> DNA
<213> Artificial Sequence
<220>
<223> DRD3(chr3) PCR primer sequence(left)
<400> 89
tggatgaggg acaggatggt 20
<210> 90
<211> 20
<212> DNA
<213> Artificial Sequence
<220>
<223> DRD3(chr3) PCR primer sequence(right)
<400> 90
accaagcccc aaagagtctg 20
<210> 91
<211> 17
<212> DNA
<213> Artificial Sequence
<220>
<223> DPP6(chr7) PCR primer sequence(left)
<400> 91
agtgggaacc ggagaga 17
<210> 92
<211> 19
<212> DNA
<213> Artificial Sequence
<220>
<223> DPP6(chr7) PCR primer sequence(right)
<400> 92
ggaacgtaag gcgaattcc 19
<210> 93
<211> 20
<212> DNA
<213> Artificial Sequence
<220>
<223> SGCE(chr7) PCR primer sequence(left)
<400> 93
caggttttgg gtaaggtgga 20
<210> 94
<211> 22
<212> DNA
<213> Artificial Sequence
<220>
<223> SGCE(chr7) PCR primer sequence(right)
<400> 94
gacccctctt tataaacagc gt 22
<210> 95
<211> 20
<212> DNA
<213> Artificial Sequence
<220>
<223> COL27A1(chr9) PCR primer sequence(left)
<400> 95
cttctgtggc ctagagtccc 20
<210> 96
<211> 20
<212> DNA
<213> Artificial Sequence
<220>
<223> COL27A1(chr9) PCR primer sequence(right)
<400> 96
cacagattta ggggaggcca 20
<210> 97
<211> 20
<212> DNA
<213> Artificial Sequence
<220>
<223> COL27A1(chr9) PCR primer sequence(left)
<400> 97
aaagtcagcc ctacccactc 20
<210> 98
<211> 20
<212> DNA
<213> Artificial Sequence
<220>
<223> COL27A1(chr9) PCR primer sequence(right)
<400> 98
catggctggt tatcttggcc 20
<210> 99
<211> 20
<212> DNA
<213> Artificial Sequence
<220>
<223> USH2A-1(chr1) PCR primer sequence(left)
<400> 99
aaagtcagcc ctacccactc 20
<210> 100
<211> 20
<212> DNA
<213> Artificial Sequence
<220>
<223> USH2A-1(chr1) PCR primer sequence(right)
<400> 100
catggctggt tatcttggcc 20
<210> 101
<211> 20
<212> DNA
<213> Artificial Sequence
<220>
<223> USH2A-2(chr1) PCR primer sequence(left)
<400> 101
gcttgaaagg ctagctgtgc 20
<210> 102
<211> 20
<212> DNA
<213> Artificial Sequence
<220>
<223> USH2A-2(chr1) PCR primer sequence(right)
<400> 102
tcatgctgga actgttgggt 20
<210> 103
<211> 19
<212> DNA
<213> Artificial Sequence
<220>
<223> CEP290(chr12) PCR primer sequence(left)
<400> 103
gcagatccac aatagaaca 19
<210> 104
<211> 18
<212> DNA
<213> Artificial Sequence
<220>
<223> CEP290(chr12) PCR primer sequence(right)
<400> 104
cacttaaaac agcagcag 18
<210> 105
<211> 18
<212> DNA
<213> Artificial Sequence
<220>
<223> CYP2D6(chr22) PCR primer sequence(left)
<400> 105
catctgggaa acagtgca 18
<210> 106
<211> 18
<212> DNA
<213> Artificial Sequence
<220>
<223> CYP2D6(chr22) PCR primer sequence(right)
<400> 106
atgtcacggg atgtcata 18
<210> 107
<211> 20
<212> DNA
<213> Artificial Sequence
<220>
<223> DRD5(chr4) PCR primer sequence(left)
<400> 107
ggggcagttc gctctatacc 20
<210> 108
<211> 19
<212> DNA
<213> Artificial Sequence
<220>
<223> DRD5(chr4) PCR primer sequence(right)
<400> 108
cggtccacgc tgatgacgc 19
<210> 109
<211> 20
<212> DNA
<213> Artificial Sequence
<220>
<223> SLC6A2(chr16) PCR primer sequence(left)
<400> 109
ttctctccct tctctgccca 20
<210> 110
<211> 20
<212> DNA
<213> Artificial Sequence
<220>
<223> SLC6A2(chr16) PCR primer sequence(right)
<400> 110
gacatcacag tgagctgggt 20
<210> 111
<211> 19
<212> DNA
<213> Artificial Sequence
<220>
<223> PRODH(chr22) PCR primer sequence(left)
<400> 111
catgacataa aagctgagg 19
<210> 112
<211> 17
<212> DNA
<213> Artificial Sequence
<220>
<223> PRODH(chr22) PCR primer sequence(right)
<400> 112
ccacaggatg cctatga 17
<210> 113
<211> 20
<212> DNA
<213> Artificial Sequence
<220>
<223> ASIC3-1(chr7) PCR primer sequence(left)
<400> 113
catcatcgat cagctgggct 20
<210> 114
<211> 20
<212> DNA
<213> Artificial Sequence
<220>
<223> ASIC3-1(chr7) PCR primer sequence(right)
<400> 114
gggtgggcac agttcttgta 20
<210> 115
<211> 20
<212> DNA
<213> Artificial Sequence
<220>
<223> ASIC3-2(chr7) PCR primer sequence(left)
<400> 115
tagccccctg actgactctc 20
<210> 116
<211> 20
<212> DNA
<213> Artificial Sequence
<220>
<223> ASIC3-2(chr7) PCR primer sequence(right)
<400> 116
agtccagcag catgtcatcc 20
<210> 117
<211> 20
<212> DNA
<213> Artificial Sequence
<220>
<223> TRAPPC9-1(chr8) PCR primer sequence(left)
<400> 117
agcttcactg tgacggcttt 20
<210> 118
<211> 20
<212> DNA
<213> Artificial Sequence
<220>
<223> TRAPPC9-1(chr8) PCR primer sequence(right)
<400> 118
aaaacaaaac cagcctgggc 20
<210> 119
<211> 20
<212> DNA
<213> Artificial Sequence
<220>
<223> TRAPPC9-2(chr8) PCR primer sequence(left)
<400> 119
gaaggaggcc cagttctgtc 20
<210> 120
<211> 20
<212> DNA
<213> Artificial Sequence
<220>
<223> TRAPPC9-2(chr8) PCR primer sequence(right)
<400> 120
agtctgtaag cctcccccat 20
<210> 121
<211> 20
<212> DNA
<213> Artificial Sequence
<220>
<223> HTR3A(chr11) PCR primer sequence(left)
<400> 121
accatgttca ggtcaccacc 20
<210> 122
<211> 20
<212> DNA
<213> Artificial Sequence
<220>
<223> HTR3A(chr11) PCR primer sequence(right)
<400> 122
agggttcaga ccttggcttg 20
<210> 123
<211> 17
<212> DNA
<213> Artificial Sequence
<220>
<223> DPP6(chr7, start: 153750096) PCR primer sequence(left)
<400> 123
agtgggaacc ggagaga 17
<210> 124
<211> 19
<212> DNA
<213> Artificial Sequence
<220>
<223> DPP6(chr7, start: 153750096) PCR primer sequence(right)
<400> 124
ggaacgtaag gcgaattcc 19
<210> 125
<211> 20
<212> DNA
<213> Artificial Sequence
<220>
<223> ARHGAP32(chr11) PCR primer sequence(left)
<400> 125
ctgaccagga ggaactgagc 20
<210> 126
<211> 19
<212> DNA
<213> Artificial Sequence
<220>
<223> ARHGAP32(chr11) PCR primer sequence(right)
<400> 126
ggcgcaaatg tcacaaact 19
<210> 127
<211> 20
<212> DNA
<213> Artificial Sequence
<220>
<223> ARHGAP32(chr11, start: 60704125) PCR primer sequence(left)
<400> 127
ctgaccagga ggaactgagc 20
<210> 128
<211> 19
<212> DNA
<213> Artificial Sequence
<220>
<223> ARHGAP32(chr11, start: 60704125) PCR primer sequence(right)
<400> 128
ggcgcaaatg tcacaaact 19
<210> 129
<211> 20
<212> DNA
<213> Artificial Sequence
<220>
<223> DNAH10(chr12, start: 124323060) PCR primer sequence(left)
<400> 129
gaacagtgtc tccgctctcc 20
<210> 130
<211> 20
<212> DNA
<213> Artificial Sequence
<220>
<223> DNAH10(chr12, start: 124323060) PCR primer sequence(right)
<400> 130
ttgaggcttt tctggcattt 20
<210> 131
<211> 20
<212> DNA
<213> Artificial Sequence
<220>
<223> DNAH10(chr12, start: 124315194) PCR primer sequence(left)
<400> 131
aaatgaccga aacgttcacc 20
<210> 132
<211> 20
<212> DNA
<213> Artificial Sequence
<220>
<223> DNAH10(chr12, start: 124315194) PCR primer sequence(right)
<400> 132
cataccacca cgctcagcta 20
<210> 133
<211> 20
<212> DNA
<213> Artificial Sequence
<220>
<223> FN1(chr2, start: 216239973) PCR primer sequence(left)
<400> 133
agcatggaag cagcaatacc 20
<210> 134
<211> 20
<212> DNA
<213> Artificial Sequence
<220>
<223> FN1(chr2, start: 216239973) PCR primer sequence(right)
<400> 134
attgatgcac catccaacct 20
<210> 135
<211> 20
<212> DNA
<213> Artificial Sequence
<220>
<223> COL27A1(chr9, start: 116930998) PCR primer sequence(left)
<400> 135
cgctcaacca tcacagaaga 20
<210> 136
<211> 20
<212> DNA
<213> Artificial Sequence
<220>
<223> COL27A1(chr9, start: 116930998) PCR primer sequence(right)
<400> 136
gagactctgg caggaactgg 20
<210> 137
<211> 23
<212> DNA
<213> Artificial Sequence
<220>
<223> TBCD(chr17, start: 80755631) PCR primer sequence(left)
<400> 137
ttttcagatg aatttttggg aga 23
<210> 138
<211> 20
<212> DNA
<213> Artificial Sequence
<220>
<223> TBCD(chr17, start: 80755631) PCR primer sequence(right)
<400> 138
gggcaaacag tcttcacgtt 20
<210> 139
<211> 19
<212> DNA
<213> Artificial Sequence
<220>
<223> PRX(chr19, start: 40900339) PCR primer sequence(left)
<400> 139
ttccccagtg accatctca 19
<210> 140
<211> 20
<212> DNA
<213> Artificial Sequence
<220>
<223> PRX(chr19, start: 40900339) PCR primer sequence(right)
<400> 140
gcgtaccttc tgcctctcac 20
<210> 141
<211> 20
<212> DNA
<213> Artificial Sequence
<220>
<223> PRX(chr19, start: 40901579) PCR primer sequence(left)
<400> 141
gaacttggaa gagggcttga 20
<210> 142
<211> 20
<212> DNA
<213> Artificial Sequence
<220>
<223> PRX(chr19, start: 40901579) PCR primer sequence(right)
<400> 142
tagacctgcc aggagcactt 20
<210> 143
<211> 20
<212> DNA
<213> Artificial Sequence
<220>
<223> DRD3(chr3, start: 113890728) PCR primer sequence(left)
<400> 143
tataccaccc agggcatcac 20
<210> 144
<211> 20
<212> DNA
<213> Artificial Sequence
<220>
<223> DRD3(chr3, start: 113890728) PCR primer sequence(right)
<400> 144
actacacctg tggggcagag 20
<210> 145
<211> 20
<212> DNA
<213> Artificial Sequence
<220>
<223> FBN3(chr19, start: 8161788) PCR primer sequence(left)
<400> 145
gacctggaca gagccatacc 20
<210> 146
<211> 20
<212> DNA
<213> Artificial Sequence
<220>
<223> FBN3(chr19, start: 8161788) PCR primer sequence(right)
<400> 146
cccagatgtc gatgagtgtg 20
<210> 147
<211> 20
<212> DNA
<213> Artificial Sequence
<220>
<223> FBN3(chr19, start: 8151993) PCR primer sequence(left)
<400> 147
agtttcctgc acccatgaag 20
<210> 148
<211> 20
<212> DNA
<213> Artificial Sequence
<220>
<223> FBN3(chr19, start: 8151993) PCR primer sequence(right)
<400> 148
agtgtgcaga tggtcagcag 20
<210> 149
<211> 20
<212> DNA
<213> Artificial Sequence
<220>
<223> COL27A1(chr9, start: 116931124) PCR primer sequence(left)
<400> 149
cagttcctgc cagagtctcc 20
<210> 150
<211> 20
<212> DNA
<213> Artificial Sequence
<220>
<223> COL27A1(chr9, start: 116931124) PCR primer sequence(right)
<400> 150
ctggcatggc tggttatctt 20
<210> 151
<211> 20
<212> DNA
<213> Artificial Sequence
<220>
<223> COL27A1(chr9, start: 116994117) PCR primer sequence(left)
<400> 151
catttgcccc cttttacaga 20
<210> 152
<211> 20
<212> DNA
<213> Artificial Sequence
<220>
<223> COL27A1(chr9, start: 116994117) PCR primer sequence(right)
<400> 152
gcagagaaac cacagtgcaa 20
<210> 153
<211> 20
<212> DNA
<213> Artificial Sequence
<220>
<223> CACNA1B(chr9, start: 141014669) PCR primer sequence(left)
<400> 153
tgactgtgag accaggatgg 20
<210> 154
<211> 20
<212> DNA
<213> Artificial Sequence
<220>
<223> CACNA1B(chr9, start: 141014669) PCR primer sequence(right)
<400> 154
tggtgctgca aagatgagtc 20
<210> 155
<211> 18
<212> DNA
<213> Artificial Sequence
<220>
<223> CACNA1B(chr9, start: 140772393) PCR primer sequence(left)
<400> 155
acgtgaccgg ccccttat 18
<210> 156
<211> 21
<212> DNA
<213> Artificial Sequence
<220>
<223> CACNA1B(chr9, start: 140772393) PCR primer sequence(right)
<400> 156
cgatcgattg cttgtagagg a 21
<210> 157
<211> 20
<212> DNA
<213> Artificial Sequence
<220>
<223> KAT6B(chr10, start: 76781852) PCR primer sequence(left)
<400> 157
cagtaggcaa tcacctgcaa 20
<210> 158
<211> 20
<212> DNA
<213> Artificial Sequence
<220>
<223> KAT6B(chr10, start: 76781852) PCR primer sequence(right)
<400> 158
ttgggggaga gctttgaata 20
<210> 159
<211> 20
<212> DNA
<213> Artificial Sequence
<220>
<223> COL6A1(chr21, start: 47422538) PCR primer sequence(left)
<400> 159
cttgtcccca gaaagacgag 20
<210> 160
<211> 19
<212> DNA
<213> Artificial Sequence
<220>
<223> COL6A1(chr21, start: 47422538) PCR primer sequence(right)
<400> 160
gcggtgacat tcttcagga 19
<210> 161
<211> 20
<212> DNA
<213> Artificial Sequence
<220>
<223> ROBO3(chr11, start: 124744033) PCR primer sequence(left)
<400> 161
ggagtaggca ggttgggagt 20
<210> 162
<211> 20
<212> DNA
<213> Artificial Sequence
<220>
<223> ROBO3(chr11, start: 124744033) PCR primer sequence(right)
<400> 162
cactgctcga accagaaaca 20
<210> 163
<211> 20
<212> DNA
<213> Artificial Sequence
<220>
<223> PRODH(chr22, start: 18905859) PCR primer sequence(left)
<400> 163
ctgccctgag aagacagagg 20
<210> 164
<211> 20
<212> DNA
<213> Artificial Sequence
<220>
<223> PRODH(chr22, start: 18905859) PCR primer sequence(right)
<400> 164
ccacaggatg cctatgacaa 20
<210> 165
<211> 20
<212> DNA
<213> Artificial Sequence
<220>
<223> CTNNA3(chr10, start: 68040262) PCR primer sequence(left)
<400> 165
aggcattcca gatggtgaag 20
<210> 166
<211> 20
<212> DNA
<213> Artificial Sequence
<220>
<223> CTNNA3(chr10, start: 68040262) PCR primer sequence(right)
<400> 166
caagtgaatg ttgccttgga 20
<210> 167
<211> 18
<212> DNA
<213> Artificial Sequence
<220>
<223> DGKQ(chr4, start: 955317) PCR primer sequence(left)
<400> 167
gctcaccatg tgcacgac 18
<210> 168
<211> 20
<212> DNA
<213> Artificial Sequence
<220>
<223> DGKQ(chr4, start: 955317) PCR primer sequence(right)
<400> 168
cttcatcaac atccccaggt 20
<210> 169
<211> 19
<212> DNA
<213> Artificial Sequence
<220>
<223> DGKQ(chr4, start: 961785) PCR primer sequence(left)
<400> 169
aagctctgcg tcttgctga 19
<210> 170
<211> 19
<212> DNA
<213> Artificial Sequence
<220>
<223> DGKQ(chr4, start: 961785) PCR primer sequence(right)
<400> 170
gtggggtctt tccctggac 19
<210> 171
<211> 20
<212> DNA
<213> Artificial Sequence
<220>
<223> ROBO4(chr11, start: 124764205) PCR primer sequence(left)
<400> 171
gcactgccct cacctaaaag 20
<210> 172
<211> 20
<212> DNA
<213> Artificial Sequence
<220>
<223> ROBO4(chr11, start: 124764205) PCR primer sequence(right)
<400> 172
gctgttcacc tctgcttgtg 20
<210> 173
<211> 20
<212> DNA
<213> Artificial Sequence
<220>
<223> MDN1(chr6, start: 90428873) PCR primer sequence(left)
<400> 173
ccacgggaaa ggactgagta 20
<210> 174
<211> 20
<212> DNA
<213> Artificial Sequence
<220>
<223> MDN1(chr6, start: 90428873) PCR primer sequence(right)
<400> 174
acccatacat gggaaccaga 20
<210> 175
<211> 21
<212> DNA
<213> Artificial Sequence
<220>
<223> MDN1(chr6, start: 90382295) PCR primer sequence(left)
<400> 175
tgcctgattt cagacatacc a 21
<210> 176
<211> 20
<212> DNA
<213> Artificial Sequence
<220>
<223> MDN1(chr6, start: 90382295) PCR primer sequence(right)
<400> 176
gttggacgaa ggatttgtgg 20
<210> 177
<211> 20
<212> DNA
<213> Artificial Sequence
<220>
<223> PCNT(chr21, start: 47832788) PCR primer sequence(left)
<400> 177
gtactggttc ccagctccag 20
<210> 178
<211> 20
<212> DNA
<213> Artificial Sequence
<220>
<223> PCNT(chr21, start: 47832788) PCR primer sequence(right)
<400> 178
aggcgcattt catttttcac 20
<210> 179
<211> 20
<212> DNA
<213> Artificial Sequence
<220>
<223> PCNT(chr21, start: 47847674) PCR primer sequence(left)
<400> 179
ttctgcaggt tgtgcaagag 20
<210> 180
<211> 20
<212> DNA
<213> Artificial Sequence
<220>
<223> PCNT(chr21, start: 47847674) PCR primer sequence(right)
<400> 180
gcagagctga cactcacctg 20
<210> 181
<211> 20
<212> DNA
<213> Artificial Sequence
<220>
<223> NTN4(chr12, start: 96076512) PCR primer sequence(left)
<400> 181
tcccctcata ggatccaaaa 20
<210> 182
<211> 20
<212> DNA
<213> Artificial Sequence
<220>
<223> NTN4(chr12, start: 96076512) PCR primer sequence(right)
<400> 182
tgcacaataa gagcgaacca 20
<210> 183
<211> 20
<212> DNA
<213> Artificial Sequence
<220>
<223> RCAN1(chr21, start: 35897642) PCR primer sequence(left)
<400> 183
cgttaaggag cagtcggaac 20
<210> 184
<211> 20
<212> DNA
<213> Artificial Sequence
<220>
<223> RCAN1(chr21, start: 35897642) PCR primer sequence(right)
<400> 184
tcaagagagg tggggaaaaa 20
<210> 185
<211> 20
<212> DNA
<213> Artificial Sequence
<220>
<223> KAT6B(chr10, start: 76788690) PCR primer sequence(left)
<400> 185
acatgtgccc ctgtaagtcc 20
<210> 186
<211> 20
<212> DNA
<213> Artificial Sequence
<220>
<223> KAT6B(chr10, start: 76788690) PCR primer sequence(right)
<400> 186
ttttccgtgg agatttctgg 20
<210> 187
<211> 20
<212> DNA
<213> Artificial Sequence
<220>
<223> COL8A1(chr3, start: 99509813) PCR primer sequence(left)
<400> 187
agatgcccca cttgcagtat 20
<210> 188
<211> 20
<212> DNA
<213> Artificial Sequence
<220>
<223> COL8A1(chr3, start: 99509813) PCR primer sequence(right)
<400> 188
tccccctctg atcccataat 20
<210> 189
<211> 20
<212> DNA
<213> Artificial Sequence
<220>
<223> FN1(chr2, start: 216296589) PCR primer sequence(left)
<400> 189
ctaagcatcc cagctcttgc 20
<210> 190
<211> 20
<212> DNA
<213> Artificial Sequence
<220>
<223> FN1(chr2, start: 216296589) PCR primer sequence(right)
<400> 190
catgaagggg gtcagtccta 20
<210> 191
<211> 20
<212> DNA
<213> Artificial Sequence
<220>
<223> ROBO4(chr11, start: 124763789) PCR primer sequence(left)
<400> 191
cctggtcaga gatccaaagc 20
<210> 192
<211> 20
<212> DNA
<213> Artificial Sequence
<220>
<223> ROBO4(chr11, start: 124763789) PCR primer sequence(right)
<400> 192
cagctgaggg ctaccttgaa 20
<210> 193
<211> 20
<212> DNA
<213> Artificial Sequence
<220>
<223> ROBO4(chr11, start: 124765478) PCR primer sequence(left)
<400> 193
gccagaggat ggtctcactt 20
<210> 194
<211> 20
<212> DNA
<213> Artificial Sequence
<220>
<223> ROBO4(chr11, start: 124765478) PCR primer sequence(right)
<400> 194
cgttcctgag ctctctgacc 20
<210> 195
<211> 20
<212> DNA
<213> Artificial Sequence
<220>
<223> ROBO3(chr11, start: 124742934) PCR primer sequence(left)
<400> 195
gagtgactgg gaaccctcaa 20
<210> 196
<211> 20
<212> DNA
<213> Artificial Sequence
<220>
<223> ROBO3(chr11, start: 124742934) PCR primer sequence(right)
<400> 196
ggctacaggc ccagtgagta 20
<210> 197
<211> 20
<212> DNA
<213> Artificial Sequence
<220>
<223> SLC6A2(chr16, start: 55690691) PCR primer sequence(left)
<400> 197
gaccggtaaa gttcctctcg 20
<210> 198
<211> 20
<212> DNA
<213> Artificial Sequence
<220>
<223> SLC6A2(chr16, start: 55690691) PCR primer sequence(right)
<400> 198
atcttcttgc cccaggtctc 20
<210> 199
<211> 20
<212> DNA
<213> Artificial Sequence
<220>
<223> ROBO4(chr11, start: 124761429) PCR primer sequence(left)
<400> 199
gaggctgtct gagctggaac 20
<210> 200
<211> 20
<212> DNA
<213> Artificial Sequence
<220>
<223> ROBO4(chr11, start: 124761429) PCR primer sequence(right)
<400> 200
gatctcaggg atggaaagca 20
<210> 201
<211> 20
<212> DNA
<213> Artificial Sequence
<220>
<223> TH(chr11, start: 2186957) PCR primer sequence(left)
<400> 201
gaggactggg cagagacaag 20
<210> 202
<211> 20
<212> DNA
<213> Artificial Sequence
<220>
<223> TH(chr11, start: 2186957) PCR primer sequence(right)
<400> 202
actggttcac ggtggagttc 20
<210> 203
<211> 20
<212> DNA
<213> Artificial Sequence
<220>
<223> SLC6A20(chr3, start: 45814094) PCR primer sequence(left)
<400> 203
gcccctgatg aggtagatga 20
<210> 204
<211> 20
<212> DNA
<213> Artificial Sequence
<220>
<223> SLC6A20(chr3, start: 45814094) PCR primer sequence(right)
<400> 204
gaatctccat gccttttcca 20
<210> 205
<211> 20
<212> DNA
<213> Artificial Sequence
<220>
<223> FN1(chr2, start: 216244028) PCR primer sequence(left)
<400> 205
ttcattggtc cggtcttctc 20
<210> 206
<211> 20
<212> DNA
<213> Artificial Sequence
<220>
<223> FN1(chr2, start: 216244028) PCR primer sequence(right)
<400> 206
ttttcctttt cccccatttc 20
<210> 207
<211> 20
<212> DNA
<213> Artificial Sequence
<220>
<223> ROBO3(chr11, start: 124739454) PCR primer sequence(left)
<400> 207
ccagtcctcc gtgatgattt 20
<210> 208
<211> 20
<212> DNA
<213> Artificial Sequence
<220>
<223> ROBO3(chr11, start: 124739454) PCR primer sequence(right)
<400> 208
cctatgtccc ctcccttgtt 20
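The records above follow a regular tag layout (`<210>` sequence number, `<211>` length, `<223>` free-text description, `<400>` sequence), so left and right primers can be paired up mechanically. The following is a minimal Python sketch of that idea, not part of the patent: the `Primer` class, the function names, and the regular expression for the `<223>` description are illustrative assumptions based only on the formatting visible in this listing.

```python
import re
from dataclasses import dataclass

# Two records copied from the listing above (SEQ ID NOs 183 and 184).
LISTING = """\
<210> 183
<211> 20
<212> DNA
<213> Artificial Sequence
<220>
<223> RCAN1(chr21, start: 35897642) PCR primer sequence(left)
<400> 183
cgttaaggag cagtcggaac 20
<210> 184
<211> 20
<212> DNA
<213> Artificial Sequence
<220>
<223> RCAN1(chr21, start: 35897642) PCR primer sequence(right)
<400> 184
tcaagagagg tggggaaaaa 20
"""

@dataclass
class Primer:  # hypothetical container, not defined in the patent
    seq_id: int
    gene: str
    chrom: str
    start: int
    side: str
    sequence: str

# Assumed shape of the <223> description as it appears in this listing.
DESC = re.compile(
    r"(\w[\w-]*)\((chr\w+), start: (\d+)\) PCR primer sequence\((left|right)\)"
)

def parse_listing(text):
    """Collect one Primer per <400> record and check it against <211>."""
    primers, fields = [], {}
    lines = iter(text.splitlines())
    for line in lines:
        m = re.match(r"<(\d{3})>\s*(.*)", line)
        if not m:
            continue
        tag, value = m.groups()
        fields[tag] = value
        if tag == "400":
            # The sequence line holds space-separated blocks of bases
            # followed by a running base count; keep only the letters.
            bases = re.sub(r"[\s\d]", "", next(lines))
            if len(bases) != int(fields["211"]):
                raise ValueError(f"SEQ ID {value}: length mismatch")
            d = DESC.search(fields["223"])
            primers.append(
                Primer(int(value), d[1], d[2], int(d[3]), d[4], bases)
            )
            fields = {}
    return primers

def pair_primers(primers):
    """Group left/right primers that target the same gene and position."""
    pairs = {}
    for p in primers:
        pairs.setdefault((p.gene, p.chrom, p.start), {})[p.side] = p.sequence
    return pairs

primers = parse_listing(LISTING)
pairs = pair_primers(primers)
```

Keying the pairs on gene, chromosome, and start position keeps the several ROBO4 and FN1 amplicons in the listing distinct, since they share a gene name but differ in start coordinate.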
Claims (22)
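- The kit for diagnosing Tourette syndrome according to claim 4, wherein the detection agent is a primer or probe capable of complementarily binding to the SGCE gene.
- The kit for diagnosing Tourette syndrome according to claim 5, wherein the detection agent is a primer or probe capable of complementarily binding to one or more genes selected from the group consisting of TPH2, HTR1F, COL27A1, BTBD9, IL1RN, MECP2, ITGA1, DRD3, USH2A, CEP290, DRD5, SLC6A2, ASIC3-1, ASIC3-2, TRAPPC9 and HTR3A.
- The kit for diagnosing Tourette syndrome according to claim 8, wherein the detection agent is a primer or probe capable of complementarily binding to one or more genes selected from the group consisting of MST1L, GBP3, CFHR3, CFHR1, OR2T2, OR2T3, AQP12A, MUC4, USP17L17, USP17L18, TMPRSS11E, UGT2B17, PDZD2, GOLPH3, KLHL3, CTNNA3, FSCB, DUOXA1, DLG4, ACADVL, CDRT1 and BSG.
- The gene panel according to claim 10 or 11, wherein the gene panel is for next generation sequencing (NGS).
- A method of providing information for the diagnosis of Tourette syndrome, comprising: 1) identifying mutations in the gene panel of claim 10 or 11 in a sample isolated from a subject; and 2) when the mutation is present, determining that the subject has Tourette syndrome.
- The method of providing information for the diagnosis of Tourette syndrome according to claim 13, wherein step 1) determines the presence or absence of mutations by analyzing, by next generation sequencing, an NGS library prepared from DNA extracted from the sample.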
- (Deleted)
- (Deleted)
- (Deleted)
- (Deleted)
- (Deleted)
- (Deleted)
- (Deleted)
- (Deleted)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
KR1020190070779A KR102250063B1 (ko) | 2019-06-14 | 2019-06-14 | Method for identifying causative genes of Tourette syndrome |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
KR1020190070779A KR102250063B1 (ko) | 2019-06-14 | 2019-06-14 | Method for identifying causative genes of Tourette syndrome |
Publications (2)
Publication Number | Publication Date |
---|---|
KR20200143026A KR20200143026A (ko) | 2020-12-23 |
KR102250063B1 KR102250063B1 (ko) | 2021-05-12 |
Family
ID=74089200
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
KR1020190070779A KR102250063B1 (ko) | 2019-06-14 | 2019-06-14 | 뚜렛증후군의 원인 유전자를 동정하는 방법 |
Country Status (1)
Country | Link |
---|---|
KR (1) | KR102250063B1 (ko) |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
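CN114457154B (zh) * | 2022-04-13 | 2022-06-28 | Shandong Provincial Hospital Affiliated to Shandong First Medical University (Shandong Provincial Hospital) | Use of a KIBRA rs17070145 detection reagent in the preparation of an olfactory function evaluation kit |
CN114999572B (zh) * | 2022-07-13 | 2024-07-26 | Sansure Biotech Inc. | Method, device, readable medium and apparatus for designing primers |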
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2012511895A (ja) | 2008-11-14 | 2012-05-31 | The Children's Hospital of Philadelphia | Genetic variants underlying human cognition and methods of using them as diagnostic and therapeutic targets |
JP2018526398A (ja) | 2015-09-08 | 2018-09-13 | The Children's Hospital of Philadelphia | Methods for diagnosing and treating Tourette syndrome |
- 2019-06-14: KR application KR1020190070779A, patent KR102250063B1 (ko); status: active, IP Right Grant
Non-Patent Citations (7)
Title |
---|
Biological Psychiatry (2012) 71:392-402* |
BMC Medical Genetics (2012) 13:123 |
Cell Reports (2018) 24(13):3441-3454 |
Current Behavioral Neuroscience Reports (2016) 3:218-231* |
Movement Disorder (2004) 19(10):1237-1238 |
Neuron (2017) 94:486-499 |
Psychiatric Genetics (2008) 18(2):98* |
Also Published As
Publication number | Publication date |
---|---|
KR20200143026A (ko) | 2020-12-23 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US10889865B2 (en) | Thyroid tumors identified | |
DK2681333T3 (en) | EVALUATION OF RESPONSE TO GASTROENTEROPANCREATIC NEUROENDOCRINE NEOPLASIS (GEP-NENE) THERAPY | |
US20230416827A1 (en) | Assay for distinguishing between sepsis and systemic inflammatory response syndrome | |
KR101421326B1 (ko) | Composition for predicting breast cancer prognosis and kit comprising the same | |
AU2016331663A1 (en) | Pathogen biomarkers and uses therefor | |
WO2003042661A2 (en) | Methods of diagnosis of cancer, compositions and methods of screening for modulators of cancer | |
KR20150043566A (ko) | Use of markers in the identification of cardiotoxic agents | |
AU2012381038A1 (en) | Interrogatory cell-based assays for identifying drug-induced toxicity markers | |
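KR20140140069A (ko) | Compositions for diagnosing and treating pervasive developmental disorders, and methods for diagnosis and treatment thereof | |
CN101258249A (zh) | Methods and reagents for detecting melanoma | |
CN110628894A (zh) | Targeted capture sequencing kit for detecting Parkinson's disease gene mutations and use thereof | |
MXPA05005653A (es) | Determination and therapeutic selection of heart failure genes. | |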
CA2403946A1 (en) | Genes expressed in foam cell differentiation | |
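CN106636344A (zh) | Gene detection kit for thalassemia based on second-generation high-throughput sequencing technology | |
KR102250063B1 (ko) | Method for identifying causative genes of Tourette syndrome | |
CN113355332B (zh) | Heg1 gene mutant and use thereof | |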
AU2016377391A1 (en) | Triage biomarkers and uses therefor | |
US20020137077A1 (en) | Genes regulated in activated T cells | |
JP2003235573A (ja) | Diabetic nephropathy marker and use thereof | |
AU2018304242A1 (en) | Methods for detection of plasma cell dyscrasia | |
AU2016349950A1 (en) | Viral biomarkers and uses therefor | |
EP1497454A2 (en) | Methods of diagnosis of cancer, compositions and methods of screening for modulators of cancer | |
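KR102513462B1 (ko) | Composition for diagnosing hereditary blood coagulation disorders and use thereof | |
KR102480128B1 (ko) | Single nucleotide polymorphisms specific to the immunity-enhanced African Humped Cattle (AFH) breed and use thereof | |
KR101656744B1 (ko) | Marine medaka genes responsive to triclosan exposure and method for diagnosing aquatic ecosystem pollution using the same |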
Legal Events
Date | Code | Title | Description |
---|---|---|---|
AMND | Amendment | ||
E601 | Decision to refuse application | ||
AMND | Amendment | ||
X701 | Decision to grant (after re-examination) | ||
GRNT | Written decision to grant |