KR20200143026A - Method for identifying causative genes of Tourette syndrome - Google Patents

Info

Publication number: KR20200143026A
Application number: KR1020190070779A
Authority: KR (South Korea)
Other languages: English (en)
Other versions: KR102250063B1 (ko)
Inventors: 김남순, 정경숙, 김대수, 조은위, 강제욱, 정우영, 우영재, 임원희, 이정주
Original assignee: 한국생명공학연구원 (Korea Research Institute of Bioscience and Biotechnology)
Application filed by: 한국생명공학연구원
Priority: KR1020190070779A (patent KR102250063B1)
Publication of KR20200143026A
Application granted; publication of KR102250063B1

Classifications

    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q: MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6876: Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
    • C12Q1/6883: Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material
    • G: PHYSICS
    • G16: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B: BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B20/00: ICT specially adapted for functional genomics or proteomics, e.g. genotype-phenotype associations
    • G16B20/20: Allele or variant detection, e.g. single nucleotide polymorphism [SNP] detection
    • G: PHYSICS
    • G16: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B: BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B25/00: ICT specially adapted for hybridisation; ICT specially adapted for gene or protein expression
    • G16B25/10: Gene or protein expression profiling; Expression-ratio estimation or normalisation
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q: MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q2600/00: Oligonucleotides characterized by their use
    • C12Q2600/156: Polymorphic or mutational markers
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q: MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q2600/00: Oligonucleotides characterized by their use
    • C12Q2600/158: Expression markers

Abstract

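The present invention relates to a method for identifying causative genes of Tourette syndrome. Tourette syndrome can be diagnosed effectively by analyzing the Tourette syndrome gene panel of the present invention. In addition, the method of the present invention for screening causative genes of Tourette syndrome and Tourette syndrome-related CNVs can raise the efficiency of identifying causative genes while reducing cost and time.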

Description

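METHOD FOR IDENTIFYING CAUSATIVE GENES OF TOURETTE SYNDROME
The present invention relates to a method for identifying causative genes of Tourette syndrome.
Tourette syndrome (TS) is a neurological disorder known as the most common cause of motor tics, which are sudden, simple, repetitive movements that are difficult to control voluntarily, and of vocal tics, in which sounds are produced. Tourette syndrome is defined as part of a spectrum of tic disorders that includes transient and persistent (chronic) tics. Its exact cause is unknown, but it is thought to result from a combination of genetic and environmental factors. Tourette syndrome is often accompanied by other conditions: attention deficit hyperactivity disorder has been reported in 60% of Tourette syndrome cases, and obsessive-compulsive disorder (27%), obsessive-compulsive behavior (32%), learning disability (23%), and conduct disorder/oppositional defiant disorder (15%) may also co-occur (권순재, 이진영 and 신재필, 2012).
Meanwhile, with the advent of whole exome sequencing (WES) based on next-generation sequencing (NGS) technology, genetic research on a wide range of diseases, including their causative genes and mutations, is expanding rapidly.
For Tourette syndrome, one of the intractable diseases, there is currently no specific diagnostic test; diagnosis relies on a physician's opinion and MRI examination. Accordingly, there is a growing need for a diagnostic system capable of analyzing all causative genes related to Tourette syndrome. In particular, genetic approaches such as NGS are considered well suited to identifying causative genes and mutations in heterogeneous diseases that involve more than one or two causative genes.
권순재, 이진영 and 신재필, Bilateral retinal detachment in a 15-year-old child with Tourette syndrome. J Korean Ophthalmol Soc 2012;53(11):1704-1707
Against this background, the present inventors made intensive research efforts to develop a method for identifying causative genes and copy number variations (CNVs) of Tourette syndrome. As a result, they identified causative genes of Tourette syndrome, established an algorithm for identifying them, and built a diagnostic system, thereby completing the present invention.
An object of the present invention is to provide a method for identifying causative genes of Tourette syndrome.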
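To achieve the above object, the present invention provides a Tourette syndrome biomarker composition comprising one or more SNV-related proteins selected from the group consisting of COL27A1, BTBD9, SGCE, MECP2, USH2A, CEP290, DRD5, ASIC3-1, ASIC3-2, and TRAPPC9.
The present invention also provides a Tourette syndrome biomarker composition comprising one or more CNV-related proteins selected from the group consisting of MST1L, GBP3, CFHR3, CFHR1, OR2T2, OR2T3, AQP12A, MUC4, USP17L17, USP17L18, TMPRSS11E, UGT2B17, PDZD2, GOLPH3, KLHL3, CTNNA3, FSCB, DUOXA1, DLG4, ACADVL, CDRT1, and BSG.
The present invention also provides a kit for diagnosing Tourette syndrome comprising an agent for detecting one or more SNV-related proteins selected from the group consisting of TPH2, HTR1F, COL27A1, BTBD9, IL1RN, SGCE, MECP2, ITGA1, DRD3, USH2A, CEP290, DRD5, SLC6A2, ASIC3-1, ASIC3-2, TRAPPC9, and HTR3A, or the genes encoding them.
The present invention also provides a kit for diagnosing Tourette syndrome comprising, as an active ingredient, one or more CNV-related proteins selected from the group consisting of MST1L, GBP3, CFHR3, CFHR1, OR2T2, OR2T3, AQP12A, MUC4, USP17L17, USP17L18, TMPRSS11E, UGT2B17, PDZD2, GOLPH3, KLHL3, CTNNA3, FSCB, DUOXA1, DLG4, ACADVL, CDRT1, and BSG.
The present invention also provides a DNA microarray chip for diagnosing Tourette syndrome on which are integrated one or more SNV-related genes selected from the group consisting of TPH2, HTR1F, COL27A1, BTBD9, IL1RN, SGCE, MECP2, ITGA1, DRD3, USH2A, CEP290, DRD5, SLC6A2, ASIC3-1, ASIC3-2, TRAPPC9, and HTR3A, or nucleic acids complementary thereto.
The present invention also provides a DNA microarray chip for diagnosing Tourette syndrome on which are integrated one or more CNV-related genes selected from the group consisting of MST1L, GBP3, CFHR3, CFHR1, OR2T2, OR2T3, AQP12A, MUC4, USP17L17, USP17L18, TMPRSS11E, UGT2B17, PDZD2, GOLPH3, KLHL3, CTNNA3, FSCB, DUOXA1, DLG4, ACADVL, CDRT1, and BSG, or nucleic acids complementary thereto.
The present invention also provides a method for screening causative genes of Tourette syndrome and Tourette syndrome-related CNVs, comprising: 1) obtaining SNV (single nucleotide variation) and CNV (copy number variation) data from a Tourette syndrome patient or the patient's parents or siblings; 2) mapping the SNV and CNV data; and 3) identifying the mutated position of a causative gene of Tourette syndrome or identifying a CNV.
The present invention also provides a system for diagnosing Tourette syndrome, comprising: 1) a data acquisition unit into which Tourette syndrome SNV data, CNV data, and family information are input; 2) a data computation unit that computes priority scores from the input SNV data, CNV data, and family information using preset formulas and a window; 3) a mapping unit that maps the SNVs and CNVs selected according to the priority scores computed by the computation unit against Tourette syndrome SNV and CNV data; and 4) an identification unit that uses the mapped SNVs and CNVs to output whether there is a risk of Tourette syndrome.
The present invention also provides a method for providing information for the diagnosis of Tourette syndrome, comprising: 1) identifying a variation in an SNV-related gene or a CNV-related gene in a sample isolated from an individual suspected of having Tourette syndrome; and 2) determining the individual to have Tourette syndrome when a variation has occurred in the SNV-related gene or CNV-related gene.
Tourette syndrome can be diagnosed effectively by analyzing the Tourette syndrome gene panel of the present invention. In addition, the method of the present invention for screening causative genes of Tourette syndrome and Tourette syndrome-related CNVs can raise the efficiency of identifying causative genes while reducing cost and time.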
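FIG. 1 is a schematic of the method for analyzing nucleotide variations in Tourette syndrome samples.
FIG. 2 is a schematic of the analysis method for deriving Tourette syndrome-related genes through nucleotide variation analysis and CNV analysis of Tourette syndrome samples.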
In one aspect, the present invention provides a Tourette syndrome biomarker composition comprising one or more SNV-related proteins selected from the group consisting of COL27A1, BTBD9, SGCE, MECP2, USH2A, CEP290, DRD5, ASIC3-1, ASIC3-2, and TRAPPC9.
Here, each protein may be a polypeptide having the following amino acid sequence: COL27A1, SEQ ID NO: 1; BTBD9, SEQ ID NO: 3; SGCE, SEQ ID NO: 5; MECP2, SEQ ID NO: 7; USH2A, SEQ ID NO: 9; CEP290, SEQ ID NO: 11; DRD5, SEQ ID NO: 13; ASIC3-1, SEQ ID NO: 15; ASIC3-2, SEQ ID NO: 17; or TRAPPC9, SEQ ID NO: 19.
As used herein, the term "COL27A1" is an abbreviation of "Collagen Type XXVII Alpha 1 Chain". The COL27A1 protein of the present invention may be a polypeptide having the amino acid sequence of SEQ ID NO: 1, or may have about 70%, 80%, 90%, or 95% or more homology with that sequence. The gene encoding the COL27A1 protein of SEQ ID NO: 1 may be a polynucleotide having the nucleotide sequence of SEQ ID NO: 2, and the nucleotide sequence encoding the COL27A1 protein may have about 70%, 80%, 90%, or 95% or more homology with SEQ ID NO: 2.
As used herein, the term "BTBD9" is an abbreviation of "BTB domain containing 9". The BTBD9 protein of the present invention may be a polypeptide having the amino acid sequence of SEQ ID NO: 3, or may have about 70%, 80%, 90%, or 95% or more homology with that sequence. The gene encoding the BTBD9 protein of SEQ ID NO: 3 may be a polynucleotide having the nucleotide sequence of SEQ ID NO: 4, and the nucleotide sequence encoding the BTBD9 protein may have about 70%, 80%, 90%, or 95% or more homology with SEQ ID NO: 4.
As used herein, the term "SGCE" is an abbreviation of "sarcoglycan epsilon". The SGCE protein of the present invention may be a polypeptide having the amino acid sequence of SEQ ID NO: 5, or may have about 70%, 80%, 90%, or 95% or more homology with that sequence. The gene encoding the SGCE protein of SEQ ID NO: 5 may be a polynucleotide having the nucleotide sequence of SEQ ID NO: 6, and the nucleotide sequence encoding the SGCE protein may have about 70%, 80%, 90%, or 95% or more homology with SEQ ID NO: 6.
As used herein, the term "MECP2" is an abbreviation of "methyl CpG binding protein 2". The MECP2 protein of the present invention may be a polypeptide having the amino acid sequence of SEQ ID NO: 7, or may have about 70%, 80%, 90%, or 95% or more homology with that sequence. The gene encoding the MECP2 protein of SEQ ID NO: 7 may be a polynucleotide having the nucleotide sequence of SEQ ID NO: 8, and the nucleotide sequence encoding the MECP2 protein may have about 70%, 80%, 90%, or 95% or more homology with SEQ ID NO: 8.
As used herein, the term "USH2A" is an abbreviation of "Usher syndrome 2A". The USH2A protein of the present invention may be a polypeptide having the amino acid sequence of SEQ ID NO: 9, or may have about 70%, 80%, 90%, or 95% or more homology with that sequence. The gene encoding the USH2A protein of SEQ ID NO: 9 may be a polynucleotide having the nucleotide sequence of SEQ ID NO: 10, and the nucleotide sequence encoding the USH2A protein may have about 70%, 80%, 90%, or 95% or more homology with SEQ ID NO: 10.
As used herein, the term "CEP290" is an abbreviation of "Centrosomal protein of 290 kDa". The CEP290 protein of the present invention may be a polypeptide having the amino acid sequence of SEQ ID NO: 11, or may have about 70%, 80%, 90%, or 95% or more homology with that sequence. The gene encoding the CEP290 protein of SEQ ID NO: 11 may be a polynucleotide having the nucleotide sequence of SEQ ID NO: 12, and the nucleotide sequence encoding the CEP290 protein may have about 70%, 80%, 90%, or 95% or more homology with SEQ ID NO: 12.
The "DRD5" protein of the present invention may be a polypeptide having the amino acid sequence of SEQ ID NO: 13, or may have about 70%, 80%, 90%, or 95% or more homology with that sequence. The gene encoding the DRD5 protein of SEQ ID NO: 13 may be a polynucleotide having the nucleotide sequence of SEQ ID NO: 14, and the nucleotide sequence encoding the DRD5 protein may have about 70%, 80%, 90%, or 95% or more homology with SEQ ID NO: 14.
As used herein, the term "ASIC3-1" is an abbreviation of "Acid-sensing ion channel 1". The ASIC3-1 protein of the present invention may be a polypeptide having the amino acid sequence of SEQ ID NO: 15, or may have about 70%, 80%, 90%, or 95% or more homology with that sequence. The gene encoding the ASIC3-1 protein of SEQ ID NO: 15 may be a polynucleotide having the nucleotide sequence of SEQ ID NO: 16, and the nucleotide sequence encoding the ASIC3-1 protein may have about 70%, 80%, 90%, or 95% or more homology with SEQ ID NO: 16.
As used herein, the term "ASIC3-2" is an abbreviation of "Acid-sensing ion channel 2". The ASIC3-2 protein of the present invention may be a polypeptide having the amino acid sequence of SEQ ID NO: 17, or may have about 70%, 80%, 90%, or 95% or more homology with that sequence. The gene encoding the ASIC3-2 protein of SEQ ID NO: 17 may be a polynucleotide having the nucleotide sequence of SEQ ID NO: 18, and the nucleotide sequence encoding the ASIC3-2 protein may have about 70%, 80%, 90%, or 95% or more homology with SEQ ID NO: 18.
The "TRAPPC9" protein of the present invention may be a polypeptide having the amino acid sequence of SEQ ID NO: 19, or may have about 70%, 80%, 90%, or 95% or more homology with that sequence. The gene encoding the TRAPPC9 protein of SEQ ID NO: 19 may be a polynucleotide having the nucleotide sequence of SEQ ID NO: 20, and the nucleotide sequence encoding the TRAPPC9 protein may have about 70%, 80%, 90%, or 95% or more homology with SEQ ID NO: 20.
The present invention also provides a Tourette syndrome biomarker composition comprising one or more CNV-related proteins selected from the group consisting of MST1L, GBP3, CFHR3, CFHR1, OR2T2, OR2T3, AQP12A, MUC4, USP17L17, USP17L18, TMPRSS11E, UGT2B17, PDZD2, GOLPH3, KLHL3, CTNNA3, FSCB, DUOXA1, DLG4, ACADVL, CDRT1, and BSG.
Here, each protein may be a polypeptide having the following amino acid sequence: MST1L, SEQ ID NO: 21; GBP3, SEQ ID NO: 23; CFHR3, SEQ ID NO: 25; CFHR1, SEQ ID NO: 27; OR2T2, SEQ ID NO: 29; OR2T3, SEQ ID NO: 31; AQP12A, SEQ ID NO: 33; MUC4, SEQ ID NO: 35; USP17L17, SEQ ID NO: 37; USP17L18, SEQ ID NO: 39; TMPRSS11E, SEQ ID NO: 41; UGT2B17, SEQ ID NO: 43; PDZD2, SEQ ID NO: 45; GOLPH3, SEQ ID NO: 47; KLHL3, SEQ ID NO: 49; CTNNA3, SEQ ID NO: 51; FSCB, SEQ ID NO: 53; DUOXA1, SEQ ID NO: 55; DLG4, SEQ ID NO: 57; ACADVL, SEQ ID NO: 59; CDRT1, SEQ ID NO: 61; or BSG, SEQ ID NO: 63.
As used herein, the term "MST1L" is an abbreviation of "macrophage stimulating 1-like". The MST1L protein of the present invention may be a polypeptide having the amino acid sequence of SEQ ID NO: 21, or may have about 70%, 80%, 90%, or 95% or more homology with that sequence. The gene encoding the MST1L protein of SEQ ID NO: 21 may be a polynucleotide having the nucleotide sequence of SEQ ID NO: 22, and the nucleotide sequence encoding the MST1L protein may have about 70%, 80%, 90%, or 95% or more homology with SEQ ID NO: 22.
As used herein, the term "GBP3" is an abbreviation of "guanylate binding protein 3". The GBP3 protein of the present invention may be a polypeptide having the amino acid sequence of SEQ ID NO: 23, or may have about 70%, 80%, 90%, or 95% or more homology with that sequence. The gene encoding the GBP3 protein of SEQ ID NO: 23 may be a polynucleotide having the nucleotide sequence of SEQ ID NO: 24, and the nucleotide sequence encoding the GBP3 protein may have about 70%, 80%, 90%, or 95% or more homology with SEQ ID NO: 24.
As used herein, the term "CFHR3" is an abbreviation of "Complement Factor H Related 3". The CFHR3 protein of the present invention may be a polypeptide having the amino acid sequence of SEQ ID NO: 25, or may have about 70%, 80%, 90%, or 95% or more homology with that sequence. The gene encoding the CFHR3 protein of SEQ ID NO: 25 may be a polynucleotide having the nucleotide sequence of SEQ ID NO: 26, and the nucleotide sequence encoding the CFHR3 protein may have about 70%, 80%, 90%, or 95% or more homology with SEQ ID NO: 26.
As used herein, the term "CFHR1" is an abbreviation of "Complement Factor H Related 1". The CFHR1 protein of the present invention may be a polypeptide having the amino acid sequence of SEQ ID NO: 27, or may have about 70%, 80%, 90%, or 95% or more homology with that sequence. The gene encoding the CFHR1 protein of SEQ ID NO: 27 may be a polynucleotide having the nucleotide sequence of SEQ ID NO: 28, and the nucleotide sequence encoding the CFHR1 protein may have about 70%, 80%, 90%, or 95% or more homology with SEQ ID NO: 28.
As used herein, the term "OR2T2" is an abbreviation of "Olfactory receptor 2T2". The OR2T2 protein of the present invention may be a polypeptide having the amino acid sequence of SEQ ID NO: 29, or may have about 70%, 80%, 90%, or 95% or more homology with that sequence. The gene encoding the OR2T2 protein of SEQ ID NO: 29 may be a polynucleotide having the nucleotide sequence of SEQ ID NO: 30, and the nucleotide sequence encoding the OR2T2 protein may have about 70%, 80%, 90%, or 95% or more homology with SEQ ID NO: 30.
As used herein, the term "OR2T3" is an abbreviation of "Olfactory Receptor Family 2 Subfamily T Member 3". The OR2T3 protein of the present invention may be a polypeptide having the amino acid sequence of SEQ ID NO: 31, or may have about 70%, 80%, 90%, or 95% or more homology with that sequence. The gene encoding the OR2T3 protein of SEQ ID NO: 31 may be a polynucleotide having the nucleotide sequence of SEQ ID NO: 32, and the nucleotide sequence encoding the OR2T3 protein may have about 70%, 80%, 90%, or 95% or more homology with SEQ ID NO: 32.
As used herein, the term "AQP12A" is an abbreviation of "Aquaporin 12A". The AQP12A protein of the present invention may be a polypeptide having the amino acid sequence of SEQ ID NO: 33, or may have about 70%, 80%, 90%, or 95% or more homology with that sequence. The gene encoding the AQP12A protein of SEQ ID NO: 33 may be a polynucleotide having the nucleotide sequence of SEQ ID NO: 34, and the nucleotide sequence encoding the AQP12A protein may have about 70%, 80%, 90%, or 95% or more homology with SEQ ID NO: 34.
As used herein, the term "MUC4" is an abbreviation of "Mucin 4". The MUC4 protein of the present invention may be a polypeptide having the amino acid sequence of SEQ ID NO: 35, or may have about 70%, 80%, 90%, or 95% or more homology with that sequence. The gene encoding the MUC4 protein of SEQ ID NO: 35 may be a polynucleotide having the nucleotide sequence of SEQ ID NO: 36, and the nucleotide sequence encoding the MUC4 protein may have about 70%, 80%, 90%, or 95% or more homology with SEQ ID NO: 36.
As used herein, the term "USP17L17" is an abbreviation of "Ubiquitin Specific Peptidase 17-Like Family Member 17". The USP17L17 protein of the present invention may be a polypeptide having the amino acid sequence of SEQ ID NO: 37, or may have about 70%, 80%, 90%, or 95% or more homology with that sequence. The gene encoding the USP17L17 protein of SEQ ID NO: 37 may be a polynucleotide having the nucleotide sequence of SEQ ID NO: 38, and the nucleotide sequence encoding the USP17L17 protein may have about 70%, 80%, 90%, or 95% or more homology with SEQ ID NO: 38.
As used herein, the term "USP17L18" is an abbreviation of "Ubiquitin Specific Peptidase 17-Like Family Member 18". The USP17L18 protein of the present invention may be a polypeptide having the amino acid sequence of SEQ ID NO: 39, or may have about 70%, 80%, 90%, or 95% or more homology with that sequence. The gene encoding the USP17L18 protein of SEQ ID NO: 39 may be a polynucleotide having the nucleotide sequence of SEQ ID NO: 40, and the nucleotide sequence encoding the USP17L18 protein may have about 70%, 80%, 90%, or 95% or more homology with SEQ ID NO: 40.
As used herein, the term "TMPRSS11E" is an abbreviation of "Transmembrane Serine Protease 11E". The TMPRSS11E protein of the present invention may be a polypeptide having the amino acid sequence of SEQ ID NO: 41, or may have about 70%, 80%, 90%, or 95% or more homology with that sequence. The gene encoding the TMPRSS11E protein of SEQ ID NO: 41 may be a polynucleotide having the nucleotide sequence of SEQ ID NO: 42, and the nucleotide sequence encoding the TMPRSS11E protein may have about 70%, 80%, 90%, or 95% or more homology with SEQ ID NO: 42.
As used herein, the term "UGT2B17" is an abbreviation of "UDP Glucuronosyltransferase Family 2 Member B17". The UGT2B17 protein of the present invention may be a polypeptide having the amino acid sequence of SEQ ID NO: 43, or may have about 70%, 80%, 90%, or 95% or more homology with that sequence. The gene encoding the UGT2B17 protein of SEQ ID NO: 43 may be a polynucleotide having the nucleotide sequence of SEQ ID NO: 44, and the nucleotide sequence encoding the UGT2B17 protein may have about 70%, 80%, 90%, or 95% or more homology with SEQ ID NO: 44.
As used herein, the term "PDZD2" is an abbreviation of "PDZ Domain Containing 2". The PDZD2 protein of the present invention may be a polypeptide having the amino acid sequence of SEQ ID NO: 45, or may have about 70%, 80%, 90%, or 95% or more homology with that sequence. The gene encoding the PDZD2 protein of SEQ ID NO: 45 may be a polynucleotide having the nucleotide sequence of SEQ ID NO: 46, and the nucleotide sequence encoding the PDZD2 protein may have about 70%, 80%, 90%, or 95% or more homology with SEQ ID NO: 46.
As used herein, the term "GOLPH3" is an abbreviation of "Golgi Phosphoprotein 3". The GOLPH3 protein of the present invention may be a polypeptide having the amino acid sequence of SEQ ID NO: 47, or may have about 70%, 80%, 90%, or 95% or more homology with that sequence. The gene encoding the GOLPH3 protein of SEQ ID NO: 47 may be a polynucleotide having the nucleotide sequence of SEQ ID NO: 48, and the nucleotide sequence encoding the GOLPH3 protein may have about 70%, 80%, 90%, or 95% or more homology with SEQ ID NO: 48.
As used herein, the term "KLHL3" is an abbreviation of "Kelch Like Family Member 3". The KLHL3 protein of the present invention may be a polypeptide having the amino acid sequence of SEQ ID NO: 49, or may have about 70%, 80%, 90%, or 95% or more homology with that sequence. The gene encoding the KLHL3 protein of SEQ ID NO: 49 may be a polynucleotide having the nucleotide sequence of SEQ ID NO: 50, and the nucleotide sequence encoding the KLHL3 protein may have about 70%, 80%, 90%, or 95% or more homology with SEQ ID NO: 50.
As used herein, the term "CTNNA3" is an abbreviation of "Catenin Alpha 3". The CTNNA3 protein of the present invention may be a polypeptide having the amino acid sequence of SEQ ID NO: 51, or may have about 70%, 80%, 90%, or 95% or more homology with that sequence. The gene encoding the CTNNA3 protein of SEQ ID NO: 51 may be a polynucleotide having the nucleotide sequence of SEQ ID NO: 52, and the nucleotide sequence encoding the CTNNA3 protein may have about 70%, 80%, 90%, or 95% or more homology with SEQ ID NO: 52.
As used herein, the term "FSCB" is an abbreviation of "Fibrous Sheath CABYR Binding Protein". The FSCB protein of the present invention may be a polypeptide having the amino acid sequence of SEQ ID NO: 53, or may have about 70%, 80%, 90%, or 95% or more homology with that sequence. The gene encoding the FSCB protein of SEQ ID NO: 53 may be a polynucleotide having the nucleotide sequence of SEQ ID NO: 54, and the nucleotide sequence encoding the FSCB protein may have about 70%, 80%, 90%, or 95% or more homology with SEQ ID NO: 54.
As used herein, the term "DUOXA1" is an abbreviation of "Dual Oxidase Maturation Factor 1". The DUOXA1 protein of the present invention may be a polypeptide having the amino acid sequence of SEQ ID NO: 55, or may have about 70%, 80%, 90%, or 95% or more homology with that sequence. The gene encoding the DUOXA1 protein of SEQ ID NO: 55 may be a polynucleotide having the nucleotide sequence of SEQ ID NO: 56, and the nucleotide sequence encoding the DUOXA1 protein may have about 70%, 80%, 90%, or 95% or more homology with SEQ ID NO: 56.
As used herein, the term "DLG4" is an abbreviation of "Discs Large MAGUK Scaffold Protein 4". The DLG4 protein of the present invention may be a polypeptide having the amino acid sequence of SEQ ID NO: 57, or may have about 70%, 80%, 90%, or 95% or more homology with that sequence. The gene encoding the DLG4 protein of SEQ ID NO: 57 may be a polynucleotide having the nucleotide sequence of SEQ ID NO: 58, and the nucleotide sequence encoding the DLG4 protein may have about 70%, 80%, 90%, or 95% or more homology with SEQ ID NO: 58.
As used herein, the term "ACADVL" is an abbreviation of "Acyl-CoA Dehydrogenase Very Long Chain". The ACADVL protein of the present invention may be a polypeptide having the amino acid sequence of SEQ ID NO: 59, or may have about 70%, 80%, 90%, or 95% or more homology with that sequence. The gene encoding the ACADVL protein of SEQ ID NO: 59 may be a polynucleotide having the nucleotide sequence of SEQ ID NO: 60, and the nucleotide sequence encoding the ACADVL protein may have about 70%, 80%, 90%, or 95% or more homology with SEQ ID NO: 60.
As used herein, the term "CDRT1" is an abbreviation of "CMT1A Duplicated Region Transcript 1". The CDRT1 protein of the present invention may be a polypeptide having the amino acid sequence of SEQ ID NO: 61, or may have about 70%, 80%, 90%, or 95% or more homology with that sequence. The gene encoding the CDRT1 protein of SEQ ID NO: 61 may be a polynucleotide having the nucleotide sequence of SEQ ID NO: 62, and the nucleotide sequence encoding the CDRT1 protein may have about 70%, 80%, 90%, or 95% or more homology with SEQ ID NO: 62.
As used herein, the term "BSG" is an abbreviation of "Basigin". The BSG protein of the present invention may be a polypeptide having the amino acid sequence of SEQ ID NO: 63, or may have about 70%, 80%, 90%, or 95% or more homology with that sequence. The gene encoding the BSG protein of SEQ ID NO: 63 may be a polynucleotide having the nucleotide sequence of SEQ ID NO: 64, and the nucleotide sequence encoding the BSG protein may have about 70%, 80%, 90%, or 95% or more homology with SEQ ID NO: 64.
As used herein, the term "single nucleotide variant (SNV)" refers to a genomic variation in which a single nucleotide differs. The term "copy number variation (CNV)" is defined as a variation in which the copy number of a DNA segment of 50 bp or more differs when two different DNA sequences are compared. CNV is one of the most important variant types associated with human diseases such as autism, intellectual disability, epilepsy, schizophrenia, childhood obesity, and cancer.
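In another aspect, the present invention provides a kit for diagnosing Tourette syndrome comprising an agent for detecting one or more SNV-related proteins selected from the group consisting of TPH2, HTR1F, COL27A1, BTBD9, IL1RN, SGCE, MECP2, ITGA1, DRD3, USH2A, CEP290, DRD5, SLC6A2, ASIC3-1, ASIC3-2, TRAPPC9, and HTR3A, or the genes encoding them. Here, the detection agent may be a primer or probe capable of binding complementarily to a gene encoding the SNV-related protein.
The present invention also provides a kit for diagnosing Tourette syndrome comprising, as an active ingredient, one or more CNV-related proteins selected from the group consisting of MST1L, GBP3, CFHR3, CFHR1, OR2T2, OR2T3, AQP12A, MUC4, USP17L17, USP17L18, TMPRSS11E, UGT2B17, PDZD2, GOLPH3, KLHL3, CTNNA3, FSCB, DUOXA1, DLG4, ACADVL, CDRT1, and BSG. Here, the detection agent may be a primer or probe capable of binding complementarily to a gene encoding the CNV-related protein.
A "primer" or "probe" refers to a nucleic acid sequence that can bind complementarily to a template and has a free 3' hydroxyl group that allows a reverse transcriptase or DNA polymerase to initiate replication of the template. It can form base pairs with a complementary nucleic acid template and serves as the starting point for copying the template strand. Based on the known sequence information of the genes according to the present invention, such probes or primers can be readily prepared using techniques well known in the art.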
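In one embodiment of the present invention, exome probes were prepared using SureSelect (Agilent Technologies, Santa Clara, CA). Specifically, the untranslated regions (UTRs) of the genes were excluded from the target, and exome probes were designed based on the sequences of 4,402 exon regions containing the coding sequence (CDS) regions of a total of 219 genes.
Here, the exome refers to the sum of the genic (exon) regions of the genome. In the present invention, whole exome sequencing (WES) genomic data from Korean Tourette syndrome patients were analyzed to calibrate the method for analyzing causative genes of Tourette syndrome. WES is based on next-generation sequencing (NGS) technology.
In NGS, DNA is extracted from a specimen, mechanically fragmented, and used to construct a library of a specific size for sequencing. A high-throughput sequencer generates the initial sequencing data by repeating, one base at a time, the binding and release of the four types of complementary nucleotides. The data then pass through bioinformatics analysis steps, namely trimming of the raw data, mapping, identification of genomic variants, and annotation of variant information, to discover genomic variants that affect, or are likely to affect, disease and various biological phenotypes. Among these next-generation sequencing technologies, amplicon-based NGS designs primers that amplify the genes of interest, produces many short reads, and then aligns them for analysis. A representative technique is emulsion PCR, and instruments based on it include the Roche 454 platform and the Thermo Fisher SOLiD and Ion Torrent platforms.
As used herein, the term "diagnostic kit" refers to a material capable of making a diagnosis by distinguishing a biological sample taken from a Tourette syndrome patient from one taken from a normal individual. In the present invention, the kit may be, but is not limited to, an RT-PCR kit, a DNA chip kit, or a protein chip kit.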
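In yet another aspect, the present invention provides a DNA microarray chip for diagnosing Tourette syndrome on which are integrated one or more SNV-related genes selected from the group consisting of TPH2, HTR1F, COL27A1, BTBD9, IL1RN, SGCE, MECP2, ITGA1, DRD3, USH2A, CEP290, DRD5, SLC6A2, ASIC3-1, ASIC3-2, TRAPPC9, and HTR3A, or nucleic acids complementary thereto.
The present invention also provides a DNA microarray chip for diagnosing Tourette syndrome on which are integrated one or more CNV-related genes selected from the group consisting of MST1L, GBP3, CFHR3, CFHR1, OR2T2, OR2T3, AQP12A, MUC4, USP17L17, USP17L18, TMPRSS11E, UGT2B17, PDZD2, GOLPH3, KLHL3, CTNNA3, FSCB, DUOXA1, DLG4, ACADVL, CDRT1, and BSG, or nucleic acids complementary thereto.
As used herein, the term "microarray chip" refers to a solid substrate on whose surface biological materials such as DNA, proteins, or cells are integrated at high density. The biological information obtained with it makes the chip a useful tool for purposes such as analyzing gene expression patterns, gene defects, DNA-protein interactions, protein-protein interactions, and chemical-protein interactions, and for disease diagnosis. Analyzers for microarray chips may use detection technologies such as fluorescence, chemiluminescence, and mass spectrometry. Among these, fluorescent labeling analysis, in which the sample is labeled with a fluorescent substance and read with a fluorescence scanner, is the most widely used.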
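In yet another aspect, the present invention provides a method for screening causative genes of Tourette syndrome and Tourette syndrome-related CNVs, comprising: 1) obtaining SNV (single nucleotide variation) and CNV (copy number variation) data from a Tourette syndrome patient or the patient's parents or siblings; 2) mapping the SNV and CNV data; and 3) identifying the mutated position of a causative gene of Tourette syndrome or identifying a CNV.
Specifically, in the present invention, causative genes of Tourette syndrome were screened through the steps of: i) preprocessing the exome sequence data using the Sickle program (https://github.com/najoshi/sickle); ii) mapping to the human reference sequence using Burrows-Wheeler Aligner (BWA) version 0.1.17; iii) performing local realignment using GATK Lite version 2.3.9; iv) performing base recalibration using the BaseRecalibrator option of GATK Lite 2.3.9; v) calling variants using the UnifiedGenotyper option of GATK Lite 2.3.9; vi) filtering variants using the VariantFiltration option of GATK Lite 2.3.9; and vii) using the sequence information of Tourette syndrome patients and of their parents and siblings to select positions at which the sequences of the Tourette syndrome patients agree with one another and the sequences of the normal individuals agree with one another. The method according to the present invention for screening causative genes of Tourette syndrome and Tourette syndrome-related CNVs may additionally comprise a step of determining a CNV to be Tourette syndrome-related when the copy number is 2 or more.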
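Step vii) can be pictured with a minimal Python sketch. It is only an illustration under stated assumptions, not the patented implementation: the function name `concordant_positions`, the input layout, and the extra requirement that the patient genotype differs from the unaffected genotype are all hypothetical choices made for the example.

```python
from typing import Dict, List, Tuple

Position = Tuple[str, int]  # (chromosome, coordinate); layout is an assumption


def concordant_positions(
    genotypes: Dict[Position, Dict[str, str]],
    affected: Dict[str, bool],
) -> List[Position]:
    """Return positions where all patients share one genotype and all
    unaffected family members share one genotype.

    Assumption (not stated in the source): the two shared genotypes must
    also differ, otherwise the position cannot separate the two groups.
    """
    selected = []
    for pos, calls in sorted(genotypes.items()):
        patient_calls = {g for member, g in calls.items() if affected[member]}
        normal_calls = {g for member, g in calls.items() if not affected[member]}
        each_group_agrees = len(patient_calls) == 1 and len(normal_calls) == 1
        if each_group_agrees and patient_calls != normal_calls:
            selected.append(pos)
    return selected


# Hypothetical trio: unaffected parents, one affected child.
family_status = {"father": False, "mother": False, "child1": True}
family_calls = {
    ("chr1", 100): {"father": "A/A", "mother": "A/A", "child1": "A/G"},
    ("chr1", 200): {"father": "C/C", "mother": "C/T", "child1": "C/T"},
    ("chr2", 50): {"father": "G/G", "mother": "G/G", "child1": "G/G"},
}
print(concordant_positions(family_calls, family_status))
```

In this toy family only the first position is kept: at the second the unaffected parents disagree with each other, and at the third the patient carries the same genotype as the parents.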
The SNV may occur in any one or more of the following genes: ABCA13, ADCY2, ADORA1, ADORA2, AGPAT5, AMBRA1, ANK3, ARHGAP26, ARHGAP30, ARID1A, ARL8A, ASIC3-1, ASIC3-2, ATF6, ATP1A1, BARD1, BCAS3, BCAT1, BDNF, BSN, BTBD9, C8A, CACNA1D, CAMSAP1, CAPRIN2, CARD8, CBFA2T1, CCAR1, CDK12, CELSR3, CEP290, CHD2, CHD5, CHRNA7, CIT, CLCN1, CNTNAP2, COL27A1, CPA4, CREBBP, CSDE1, CSNK1G3, CTCF, CX3CL1, CYP2B6, CYP2C18, DBH, DCLK2, DENND5A, DHX15, DLG5, DLGAP3, DNAH2, DNAJC13, DOCK7, DPP6, DRD1, DRD2, DRD3, DSCAM, DSCAML1, EVPL, FAM120A, FAM71A, FBXO15, FMNL2, FN1, FRY, GAPVD1, GBX2, GCH1, GDNF, GET4, GIGYF1, GNB2L1, GOPC, HDAC5, HDC, HEATR5B, HECTD3, HEPACAM2, HERC1, HERC2, HIST1H1T, HLA-E, HNRNPA0, HTR1F, HTR2C, HTR3A, IL16, IL1RN, ITGA1, IMMP2L, ITOA1, ITPR2, KBTBD8, KDM5B, KIAA0368, KIAA1429, KLHL32, KLHL9, KNDC1, KRTAP10-4, LAX1, LILRA2, LLGL1, LMNA, LRP8, LZTR1, MAB21L2, MARK2, MCM7, ME2, MECP2, MGAM, MPL, MRPL3, MUC5B, MYH10, MYH4, MYO5A, NCBP1, NID1, NIPBL, NLGN4X, NLRP11, NPC1, NUP85, OFCC1, OLFM1, OPA1, OR9I1, PAG1, PDP1, PKD1L1, PREX2, PROM1, PYROXD2, RELN, RFWD3, RNF213, RYR1, RYR2, RYR3, SCN11A, SCNN1B, SEL1L3, Serotonin 1B, SGCE, SH3TC1, SKP2, SLC1A3, SLC38A8, SLC6A1, SLC6A2, SLITRK1, SLO6A2, SNRNP200, SOCE, SPEN, SPRY2, SPTBN1, SRGAP3, SSBP2, ST18, STAB2, TDRD9, TGM1, THBS3, TLN2, TMEM147, TNPO1, TOX, TP53BP2, TPH2, TPX2, TRAPPC9, TTN, TULP4, UBASH3A, UBR4, UNC13C, USH2A, USPL1, WDFY3, WDR72, WNK4, WNT7B, WWC1, YLPM1, ZMIZ1, ZNF385A, ZNF799, or DRD5.
The CNV may occur in any one or more of the following genes: A2BP1, AADAC, ACADVL, ADSL, ALDH18A1, AQP12A, ASTN2, AUTS2, BSG, CACNA1C, CBR2, CDH10, CDH13, CDH18, CDRT1, CFHR1, CFHR3, CNTN4, CNTNAP2, Col8A1, CTNNA3, CTNND2, DISC1, DLG4, DOPEY2, DPP6, DUOXA1, FHIT, FSCB, GABRA4, GABRB1, GABRG1, GALNT13, GBP3, GOLPH3, GPR89A, GRM8, KCNE1, KCNMA1, KLHL3, MACROD2, MST1L, MUC4, NF1, NRXN1, NSD1, OR2R2, OR2T2, OR2T3, OXTR, P2RX2, PAK7, PARK2, PDE9A, PDZD2, POLE, RB1CC1, SEMA5A, TBX1, TMEM195, TMPRSS11E, UGT2B17, USP17L17, USP17L18, or WDR4.
The step of obtaining SNV (single nucleotide variation) and CNV (copy number variation) data from the Tourette syndrome patient or the patient's parents or siblings may be performed by next-generation sequencing (NGS) using primers or probes that bind specifically to nucleotide sequences encoding one or more genes selected from the SNV candidate group and the CNV candidate group.
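In yet another aspect, the present invention provides a system for diagnosing Tourette syndrome, comprising: 1) a data acquisition unit into which Tourette syndrome SNV data, CNV data, and family information are input; 2) a data computation unit that computes priority scores from the input SNV data, CNV data, and family information using preset formulas and a window; 3) a mapping unit that maps the SNVs and CNVs selected according to the priority scores computed by the computation unit against Tourette syndrome SNV and CNV data; and 4) an identification unit that uses the mapped SNVs and CNVs to output whether there is a risk of Tourette syndrome.
The SNV data may include any one or more genes selected from the group consisting of COL27A1, BTBD9, SGCE, MECP2, USH2A, CEP290, DRD5, ASIC3-1, ASIC3-2, and TRAPPC9. The CNV data may include any one or more genes selected from the group consisting of MST1L, GBP3, CFHR3, CFHR1, OR2T2, OR2T3, AQP12A, MUC4, USP17L17, USP17L18, TMPRSS11E, UGT2B17, PDZD2, GOLPH3, KLHL3, CTNNA3, FSCB, DUOXA1, DLG4, ACADVL, CDRT1, and BSG.
In addition, in the system for diagnosing Tourette syndrome of the present invention, the data computation unit may perform the steps of: (i) quantifying the SNV and CNV data from the input SNV data, CNV data, and family information using Equation 1 or Equation 2 below;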
[Equation 1]
Figure pat00001
[Equation 2]
Figure pat00002
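(ii) setting the number of family members for whom the quantified SNV and CNV data can be analyzed as the window size n; (iii) computing the proportion for the family under analysis using the quantified SNV and CNV data within the set window; (iv) computing the significance probability (p-value) at the SNV and CNV positions within the set window using a one-proportion test; (v) computing a weight that corrects for the physical positions of both ends of the set window; (vi) calculating a score from the computed p-value and weight; (vii) checking, for single nucleotide positions whose score from (vi) is at least -log(0.05) = 2.996 (natural logarithm), whether the patterns of the Tourette syndrome patients and of the normal individuals are each consistent; (viii) checking whether the single nucleotide positions satisfying (vii) lie in a coding region; (ix) converting the single nucleotide positions satisfying (viii) into gene symbols; and (x) ranking by score and then identifying the list of causative candidate genes.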
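Steps (iv) to (vii) can be sketched in Python as follows. This is a hedged illustration only: the text does not give the null proportion, the form of the one-proportion test, or how the p-value and weight are combined, so the exact one-sided binomial test, the null proportion `p0 = 0.5`, the multiplicative weight, and all function names below are assumptions made for the example. Only the threshold, -log(0.05) with a natural logarithm, comes from the text.

```python
import math

# Threshold from step (vii): -ln(0.05), about 2.996.
SCORE_THRESHOLD = -math.log(0.05)


def one_proportion_p_value(k: int, n: int, p0: float = 0.5) -> float:
    """Exact one-sided binomial test: P(X >= k) for X ~ Binomial(n, p0).

    n is the window size (number of analyzable family members) and k the
    number of members matching the expected pattern. p0 is an assumed
    null proportion, not a value given in the source.
    """
    return sum(
        math.comb(n, i) * p0**i * (1.0 - p0) ** (n - i) for i in range(k, n + 1)
    )


def priority_score(k: int, n: int, weight: float = 1.0, p0: float = 0.5) -> float:
    """Combine p-value and position weight as weight * -ln(p) (assumed form)."""
    return weight * -math.log(one_proportion_p_value(k, n, p0))


def passes_threshold(score: float) -> bool:
    """Step (vii): keep positions scoring at least -ln(0.05)."""
    return score >= SCORE_THRESHOLD


# Hypothetical window of five family members.
for matching in (4, 5):
    s = priority_score(matching, 5)
    print(matching, round(s, 3), passes_threshold(s))
```

Under these assumptions, a position where all five members match the expected pattern has p = 0.5^5 = 0.03125 and clears the threshold, while four out of five (p = 0.1875) does not.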
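The SNV and CNV data may be quantified using Equation 1 when the sequence data of both parents of the Tourette syndrome patient are available. In Equation 1, SNVjv(S) denotes the j-th SNV of the v-th family member. S is a variable that distinguishes Tourette syndrome patients from normal individuals; SNVjv(0) and SNVjv(1) denote a normal individual and a Tourette syndrome patient, respectively. "v=1", "v=2", and "v=3,...,V" denote the sequence data of the father, the mother, and the (v-2)-th child, respectively. In Equation 1, REFj denotes the genotype at the j-th position of the human genome reference sequence, and LFj, the pattern frequency across the whole family, can be calculated using Equation 3 below, where C denotes the number of children.
[Equation 3]
Figure pat00003
When only one parent's sequence data are available, or neither parent's sequence data are available, the single nucleotide information may be quantified using Equation 2.
In Equation 2, MSNVj(S=1) denotes the most frequent pattern among the SNVs carried only by the Tourette syndrome patients within the family under analysis, and the pattern frequency LFj across the whole family can be calculated using Equation 3. For SNVs, scores are assigned to the family's causative candidate genes for Tourette syndrome based on LFj. If no causative candidate gene can be selected from the SNVs, and a CNV is found in a gene on the list of genes known as CNV-type causative genes of Tourette syndrome, that gene may be defined as the causative gene of Tourette syndrome for the family.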
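The fallback from SNV candidates to known CNV genes can be sketched as below. This is a minimal illustration under assumptions: the function name `family_causative_genes`, the input structures, and the use of a single score threshold to decide whether an SNV candidate was "selected" are hypothetical and not taken from the patent.

```python
from typing import Dict, Iterable, List, Tuple


def family_causative_genes(
    snv_scores: Dict[str, float],
    family_cnv_genes: Iterable[str],
    known_cnv_genes: Iterable[str],
    threshold: float,
) -> Tuple[str, List[str]]:
    """Prefer SNV candidates; fall back to CNVs in known causative genes.

    snv_scores maps gene symbols to priority scores for one family.
    Returns the evidence type ("SNV" or "CNV") and the gene symbols.
    """
    ranked = sorted(snv_scores.items(), key=lambda item: item[1], reverse=True)
    snv_hits = [gene for gene, score in ranked if score >= threshold]
    if snv_hits:
        return "SNV", snv_hits
    # No SNV candidate selected: report CNVs found in genes on the known list.
    cnv_hits = sorted(set(family_cnv_genes) & set(known_cnv_genes))
    return "CNV", cnv_hits


# Hypothetical families; gene symbols are taken from the panels above.
known = ["FSCB", "DLG4", "CTNNA3"]
print(family_causative_genes({"SGCE": 3.5, "DRD5": 1.0}, ["FSCB"], known, 2.996))
print(family_causative_genes({"DRD5": 1.0}, ["FSCB", "TTN"], known, 2.996))
```

The first family yields an SNV-based candidate, so its CNV calls are not consulted; the second has no SNV above the threshold and falls back to the CNV found in a listed gene.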
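The present invention also provides a method for providing information for the diagnosis of Tourette syndrome, comprising: 1) identifying a variation in an SNV-related gene or a CNV-related gene in a sample isolated from an individual suspected of having Tourette syndrome; and 2) determining the individual to have Tourette syndrome when a variation has occurred in the SNV-related gene or CNV-related gene.
Specifically, when both SNV and CNV data are predicted from previously known genes in a sample isolated from an individual suspected of having Tourette syndrome, the present invention includes a step in which causative SNV or CNV variants related to Tourette syndrome can be judged by additionally providing information from programs that predict how strongly a gene variant affects protein structure, together with data predicting the cross-species conservation of the SNV. In addition, comparison with genomic data from normal individuals makes it possible to determine how frequently an SNV occurs in normal individuals and to exclude SNVs and CNVs selected by the prediction program that also occur in normal individuals, which improved the accuracy of determining the causative genes of Tourette syndrome.
Hereinafter, the present invention is described in more detail through examples. These examples are intended to illustrate the present invention more specifically, and the scope of the present invention is not limited to them.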
I. 뚜렛증후군 관련 유전자
실시예 1. 연구 대상자 선정
총 34가계 81명의 뚜렛증후군 환자 또는 상기 환자의 정상 가계원을 본 실험에 등록하였다. 두 명의 독자적 신경과 전문의에 의한 임상 평가를 실시하였다. 모든 참가자는 인제대학교 백병원의 기관생명윤리위원회에 의해 승인된 절차에 따라 피험자 동의서를 제공하였다. 뚜렛증후군 환자 36명 및 뚜렛증후군 환자의 정상 가계원 45명을 건강한 대조군으로 이용하여 연구를 수행하였다.
실시예 2. 뚜렛증후군 원인 유전자의 검출을 위한 후보 유전자 선정
뚜렛증후군 관련 유전자를 선정하기 위하여, 다양한 질병에 관한 정보를 포함한 OMIM(online mendelian inheritance in man) 데이터베이스 및 선행문헌 연구를 통해 뚜렛증후군 환자에서 빈번하게 발생하는 유전자에 우선순위를 두고 최적의 뚜렛증후군 관련 후보 유전자를 선정하였다.
As a result, the following were selected as genes for detecting causative genes of Tourette syndrome: CNV (46 genes: A2BP1, AADAC, ADSL, ALDH18A1, ASTN2, AUTS2, CACNA1C, CBR2, CDH10, CDH13, CDH18, CNTN4, CNTNAP2, COL8A1, CTNNA3, CTNND2, DISC1, DOPEY2, DPP6, DUOXA1, FHIT, FSCB, GABRA4, GABRB1, GABRG1, GALNT13, GPR89A, GRM8, KCNE1, KCNMA1, KLHL3, MACROD2, NF1, NRXN1, NSD1, OXTR, P2RX2, PAK7, PARK2, PDE9A, POLE, RB1CC1, SEMA5A, TBX1, TMEM195, WDR4) and SNV (176 genes: ABCA13, ADCY2, ADORA1, ADORA2, AGPAT5, AMBRA1, ANK3, ARHGAP26, ARHGAP30, ARID1A, ARL8A, ATF6, ATP1A1, BARD1, BCAS3, BCAT1, BDNF, BSN, BTBD9, C8A, CACNA1D, CAMSAP1, CAPRIN2, CARD8, CBFA2T1, CCAR1, CDK12, CELSR3, CHD2, CHD5, CHRNA7, CIT, CLCN1, CNTNAP2, COL27A1, CPA4, CREBBP, CSDE1, CSNK1G3, CTCF, CX3CL1, CYP2B6, CYP2C18, DBH, DCLK2, DENND5A, DHX15, DLG5, DLGAP3, DNAH2, DNAJC13, DOCK7, DPP6, DRD1, DRD2, DSCAM, DSCAML1, EVPL, FAM120A, FAM71A, FBXO15, FMNL2, FN1, FRY, GAPVD1, GBX2, GCH1, GDNF, GET4, GIGYF1, GNB2L1, GOPC, HDAC5, HDC, HEATR5B, HECTD3, HEPACAM2, HERC1, HERC2, HIST1H1T, HLA-E, HNRNPA0, HTR2C, IL16, IMMP2L, ITPR2, KBTBD8, KDM5B, KIAA0368, KIAA1429, KLHL32, KLHL9, KNDC1, KRTAP10-4, LAX1, LILRA2, LLGL1, LMNA, LRP8, LZTR1, MAB21L2, MARK2, MCM7, ME2, MGAM, MPL, MRPL3, MUC5B, MYH10, MYH4, MYO5A, NCBP1, NID1, NIPBL, NLGN4X, NLRP11, NPC1, NUP85, OFCC1, OLFM1, OPA1, OR9I1, PAG1, PDP1, PKD1L1, PREX2, PROM1, PYROXD2, RELN, RFWD3, RNF213, RYR1, RYR2, RYR3, SCN11A, SCNN1B, SEL1L3, Serotonin 1B, SH3TC1, SKP2, SLC1A3, SLC38A8, SLC6A1, SLITRK1, SNRNP200, SPEN, SPRY2, SPTBN1, SRGAP3, SSBP2, ST18, STAB2, TDRD9, TGM1, THBS3, TLN2, TMEM147, TNPO1, TOX, TP53BP2, TPX2, TTN, TULP4, UBASH3A, UBR4, UNC13C, USPL1, WDFY3, WDR72, WNK4, WNT7B, WWC1, YLPM1, ZMIZ1, ZNF385A, ZNF799).
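Example 3. Preparation of probes for detecting causative genes of Tourette syndrome
Primers were designed using SureSelect (Agilent Technologies, Santa Clara, CA). The untranslated regions (UTRs) of the genes were excluded from the targets, and probes were designed to cover as much of the protein-coding regions as possible. Exome probes were prepared based on the sequences of 4,402 exon regions containing the coding sequence (CDS) regions of a total of 219 genes. The 4,402 exon regions total 1,170,663 bp.
Example 4. Preparation of NGS libraries for the Tourette syndrome detection genes
For the next-generation sequencing (NGS) experiments on the Tourette syndrome detection genes, genomic DNA was isolated from the blood of Tourette syndrome patients using the QIAamp DNA Mini Kit (Qiagen, Valencia, CA, USA). The concentration, purity, and degradation of the isolated genomic DNA were then checked using a NanoDrop 8000 UV-Vis spectrophotometer (Thermo Scientific Inc., DE, USA), a Qubit 2.0 Fluorometer (Life Technologies Inc., Grand Island, NY, USA), and a 2200 TapeStation instrument (Agilent Technologies, Santa Clara, CA, USA). Clinical samples that met the quality control (QC) criteria were used in the subsequent steps.
Genomic DNA (~250 ng) obtained from the blood of clinical samples that passed QC was sheared using a Covaris S220 (Covaris, MA, USA), and sequencing libraries were then prepared through end repair, A-tailing, paired-end adaptor ligation, and amplification. To capture the 4,402 exon regions of the 219 genes selected in Example 2 above, a composition containing all of the prepared probes was used. Library hybridization was carried out at 65°C for 24 hours, and the genomic DNA library fragments captured by hybridization were purified. Purification exploited the binding between streptavidin and the biotin attached to the captured exon fragments. Specifically, streptavidin-coated magnetic beads were bound to the biotin attached to the captured library fragments, and the captured library fragments were then separated from the mixture using magnetic force. The purified genomic DNA library fragments were then amplified together with index barcode tags.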
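Example 5. Acquisition of sequence information for the Tourette syndrome detection genes
A sequencing library capturing the exome, comprising the 4,402 exon regions of 222 genes from Tourette syndrome clinical samples, was loaded onto an NGS sequencer (MiSeq, Illumina, USA) to obtain the sequence of each DNA fragment. The genome data were then trimmed and aligned to the reference human genome to obtain the sequence information for each gene in each sample. Sequencing was performed using the TruSeq Rapid PE Cluster kit and the TruSeq Rapid SBS kit (Illumina, USA) under paired-end conditions reading 100 bp in each direction.
Example 6. Extraction of Tourette syndrome variant data
The sequencing reads produced by the NGS instrument were first processed and then aligned to the UCSC hg19 reference genome (http://genome.ucsc.edu) using the Burrows-Wheeler Aligner (BWA) algorithm. When QC-passed sequence data are mapped to the UCSC hg19 reference sequence, PCR duplicate reads generated during NGS library preparation may be included, and these were removed. PCR duplicates are removed because sequencing requires an amplification step, and regions that were over-amplified as an artifact need to be discarded. PCR duplicate reads were removed using picard-tools-1.119 (http://picard.sourceforge.net/), and single nucleotide variations (SNVs) and insertion-deletion variants (indels) were identified using GenomeAnalysisTK-3.8. The ANNOVAR program was used to annotate each identified nucleotide variant based on UCSC hg19 information.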
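Example 7. Selection of causative gene variants of Tourette syndrome
Among the SNVs and indels extracted for each clinical sample, variant characteristics, variant type, amino acid change information, and SNP database (dbSNP) entries were taken into account, and all variants other than SNVs and indels capable of altering the protein were removed from causative gene selection.
In addition, because Tourette syndrome has to be studied as a rare disease, any identified SNV or indel with an overall frequency of 1% or more was defined as a common variant and removed. Specifically, the overall frequency of each SNV and indel was checked using genome data from dbSNP, the 1000 Genomes Project, and the NHLBI GO Exome Sequencing Project, and variants with an overall frequency of 1% or more were removed. The variants remaining after removal of common variants were defined as candidate causative gene variants of Tourette syndrome. When variants were selected in two or more different genes, variants with a higher cross-species conservation rate of the genomic sequence were given priority.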
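The common-variant filter described above can be sketched as follows. This is a minimal illustration rather than the pipeline actually used: the record layout, the database keys, and the choice to keep variants that have no recorded frequency are assumptions, and the toy records are invented for the example.

```python
def is_rare(frequencies, max_freq=0.01):
    """Return True when no known population frequency reaches max_freq.

    `frequencies` maps a database name to an allele frequency, or to None
    when the variant is absent from that database (assumed to mean "rare").
    """
    known = [f for f in frequencies.values() if f is not None]
    return all(f < max_freq for f in known)


def remove_common(variants, max_freq=0.01):
    """Keep only variants whose frequency is below 1% in every database."""
    return [v for v in variants if is_rare(v["freq"], max_freq)]


if __name__ == "__main__":
    # Hypothetical toy records; the keys are illustrative labels only.
    toy = [
        {"id": "var1", "freq": {"dbsnp": 0.002, "kg1000": None, "esp": 0.001}},
        {"id": "var2", "freq": {"dbsnp": 0.150, "kg1000": 0.12, "esp": None}},
        {"id": "var3", "freq": {"dbsnp": None, "kg1000": None, "esp": None}},
    ]
    print([v["id"] for v in remove_common(toy)])
```

A variant is dropped as soon as any one database reports a frequency of 1% or more, which matches the "1% or more is common" rule in the text.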
In addition, to remove genes that could confound the selection of causative genes of Tourette syndrome, genes related to ADHD, which frequently accompanies Tourette syndrome (359 genes: AANAT, ADRA1A, ADRA1B, ADRA1D, ADRA2A, ADRA2B, ADRA2C, ADRB1, ADRB2, ADRB3, ADRBK1, ADRBK2, AGBL1, AK094352, AK8, ANK3, ANO5, ARRB1, ARRB2, ARSB, ARVCF, AS3MT, ASMT, ASTN2, ATP11A, ATP2B3, ATP2C2, ATXN1, ATXN2, BAIAP2, BCHE, BCL11A, BDNF, BMPR1B, BRE, C1orf173, CACNA1C, CACNB2, CADM2, CALY, CAMK1D, CCSER1, CDH13, CDH23, CDK20, CEP112, CHMP7, CHRNA3, CHRNA4, CHRNA7, CLASP2, CLOCK, CLYBL, CMTM8, CNR1, CNTF, CNTFR, CNTN4, CNTN5, COMT, COX7B2, CPLX1, CPLX2, CPLX4, CREB5, CRYGC, CSMD2, CSNK1E, CTNNA2, DACT1, DBH, DCDC2, DCLK1, DCLK2, DDC, DENND3, DGKH, DHCR7, DIRAS2, DISC1, DLEU2, DMRT2, DNAJA1P4, DNAJC27, DNM1, DOCK10, DPH6, DPP6, DRD1, DRD2, DRD3, DRD4, DRD5, DSCC1, DUSP1, DYX1C1, ELK3, ELOVL6, EMP2, EREG, ETV5, FADS1, FADS2, FAIM2, FANCL, FGF10, FGF12, FHIT, FLNC, FLRT2, FOXP1, FOXP2, FTO, FURIN, GABRG1, GDNF, GEMIN2, GFI1B, GFOD1, GIT1, GNAL, GNAO1, GNAT2, GNAZ, GNPDA2, GPC5, GPC6, GPR125, GPR50, GPRC5B, GPX6, GRID2, GRIK1, GRIK4, GRIN2A, GRIN2B, GRM1, GRM5, GRM7, GRM8, GSK3B, H2AFY, HAS3, HCN1, HES1, HES6, HK1, HKDC1, HLA-DRB1, HOXB1, HTR1A, HTR1B, HTR1D, HTR1E, HTR1F, HTR2A, HTR2C, HTR3A, HTR3B, HTR4, HTR5A, HTR6, HTR7, ID2, IL16, IL1RN, IL20RA, ISL1, ITGA1, ITGA11, ITGAE, ITIH3, KANK2, KANSL1, KCNC1, KCNIP4, KCTD15, KIAA0319, LARP7, LECT1, LHFPL3, LIN7C, LINC00478, LINGO2, LMAN2L, LMO4, LOC100128765, LOC100188947, LOC151121, LOC643308, LPHN3, LPL, LRP1B, MACROD2, MAD1L1, MAGI2, MAN2A2, MAOA, MAOB, MAP1B, MAP2K3, MAP2K5, MC4R, MCTP1, MED27, MEIS2, MIR96, MMP24, MMP7, MOBP, MOG, MTCH2, MTHFR, MTIF3, MTNR1A, MTNR1B, MYBPC1, MYO5B, MYT1L, NADSYN1, NAPRT1, NCAN, NCKAP5, NEGR1, NET1, NEUROD6, NFIL3, NGF, NLN, NOS1, NPAS3, NPPC, NPSR1, NR3C2, NR4A2, NRSN1, NRXN1, NRXN3, NT5C2, NT5DC3, NTF3, NTM, NTRK2, NUCB1, NUDT3, NXPH1, NYAP2, OPRM1, OR4C3, OXER1, OXTR, PARK2, PCDP1, PER1, PER2, PEX5L, PGRMC2, PHLDA1, PLCL1, PNMT, POC5, PPM1F, PPM1H, PPP1R1B, PPP2R2C, PRELID2, PRKAG2, PRKD1, PRKG1, PRTG, PSMC3, PTBP2, PTPRG, PTPRJ, PTPRN2, PYDC2, QPCTL, RASSF2, RBMS3, REEP5, RGS18, RPL23AP56, RPL27A, RPL31P43, SDK2, SEC16B, SGTB, SH2B1, SH3BP5, SLC18A2, SLC1A3, SLC38A1, SLC39A3, SLC39A8, SLC5A7, SLC6A1, SLC6A2, SLC6A3, SLC6A4, SLC7A10, SLC9A9, SLCO3A1, SLIT1, SNAP25, SNCA, SNPH, SORCS1, SORCS3, SPOCK3, SRGAP1, SSFA2, STS, STX1A, SUPT3H, SYN3, SYP, SYT1, SYT2, TAAR3, TAF2, TCEB1, TCERG1L, TDO2, TDP2, TENM4, TFAP2B, TFEB, TH, TIAM2, TLE1, TLE4, TLL2, TMEM132B, TMEM160, TMEM18, TNNI3K, TPH1, TPH2, TRANK1, TRIM32, TRIO, TSPAN8, UGT1A9, UNC5B, USP24, VAMP2, VEGFA, WDR96, XKR3, XKR4, XPO1, ZBBX, ZNF423, ZNF516, ZNF544, ZNF608, ZNF75A, ZNF804A, ZNF805), were removed, which made the selection of causative genes of Tourette syndrome easier.
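Example 8. In silico analysis of the Tourette gene panel
In silico analysis of the Tourette syndrome gene panel sequencing data from the clinical samples of 81 individuals in 34 families of Korean Tourette syndrome patients selected causative genes in 30 of the families. No causative gene was selected in the remaining 4 families. Therefore, to refine the method for analyzing causative genes of Tourette syndrome, the method was supplemented so that CNVs could be analyzed in families for which SNV analysis of the whole exome sequencing (WES) data of Korean Tourette syndrome patients had not selected a causative gene. SNV analysis was first performed on the known Tourette syndrome-related genes; for families in which no causative gene variant was found, an analysis of whether they carry CNVs selected through a literature search (PubMed) was added, thereby increasing the diagnostic yield of the Tourette syndrome diagnostic panel.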
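Example 9. Synthesis of primers for confirming gene variants
To confirm the causative gene variants of Tourette syndrome identified by NGS, a 150 to 300 bp gene region containing each variant was chosen, and PCR primers capable of amplifying it were designed and synthesized (Bioneer, Daejeon).
Example 10. Preparation of gDNA samples
To establish genomic DNA (gDNA) PCR conditions using the primers synthesized in Example 9, PCR was performed on gDNA from WI38, a normal human lung cell line, and the PCR products were sequenced. After PCR primers usable for sequencing had been secured, gDNA was isolated from the blood cells of patient blood, PCR was performed, and the products were sequenced. gDNA from the cell line or from patient blood cells was extracted with a gDNA prep kit (NANOHELIX, Cat No. GCBL 200); the specific procedure was as follows.
First, to isolate blood cells, 200 µL of whole blood was mixed with 1 mL of red blood cell lysis (RBL) solution in a 1.5 mL microtube and gently shaken 2 to 3 times while being kept at 0°C. After about 10 minutes, white blood cells were collected by centrifugation at 12,000 rpm for 10 minutes. WI38 cells (1×10^6 to 1×10^6) were collected in a 1.5 mL microtube and centrifuged at 3,000 rpm for 5 minutes. Cell lysis solution was added to the microtubes containing the isolated cells, and gDNA was then isolated, including the use of a nucleic acid isolation column, according to the method provided with the gDNA prep kit (PureHelix Genomic DNA prep kit, NANOHELIX Co., Daejeon). The isolated gDNA was quantified by measuring absorbance at 260 nm with a NanoDrop.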
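Example 11. PCR and sequencing of gDNA
PCR was performed using 100 ng of gDNA or 1 µL of patient gDNA as the template. The PCR mixture consisted of DNA template (100 ng or 1 µL), synthetic primers (1 µL each of the forward and reverse primers), AccuPower® HotStart Pfu PCR PreMix (Bioneer, Daejeon), and distilled water, for a total volume of 20 µL. PCR consisted of an initial denaturation (95°C, 15 min, 1 cycle), followed by 30 cycles of denaturation (95°C, 30 s), annealing (55°C, 30 s), and extension (72°C, 1 min), and a final extension (72°C, 5 min). Of the final PCR reaction, 2 µL was analyzed on a 1% agarose gel to confirm a PCR product of the expected size; the rest of the reaction was run on an agarose gel, and the PCR product was purified by gel extraction. Gel extraction was carried out according to the method provided with the gel extraction kit (PureHelix Gel extraction kit, NANOHELIX Co., Daejeon).
Of the PCR product purified in a total volume of 30 µL, 2 µL was run on a 1% agarose gel for re-confirmation, and 10 µL of the remainder was used for sequencing. The PCR primers were used as sequencing primers, and sequencing was performed by Solgent (Daejeon). The PCR primer sequences are shown in Tables 1 and 2 below, and the validation results for the Tourette syndrome gene panel are shown in Table 3 below.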
[Table 1]
Family | Chr | Chromosomal localization | Gene Name | Reference | Mutation | LEFT (SEQ ID NO) | RIGHT (SEQ ID NO) | PRODUCT SIZE
Family_1 | chr12 | 72335380 | TPH2 | C | A | GAGTGACACGGCAACTTCAC (SEQ ID NO: 65) | CAACTGCTGTCTTGCCACTT (SEQ ID NO: 66) | 238
Family_1 | chr3 | 88040035 | HTR1F | G | A | TGGTGTCCCTCACTCTGTCT (SEQ ID NO: 67) | GCCAGTGGGATGTAGAAAGCT (SEQ ID NO: 68) | 511
Family_2 | chr9 | 117070009 | COL27A1 | A | G | TCACAAGATGCAGGGTCCAT (SEQ ID NO: 69) | CTGGGGATAGAGGCAGACAG (SEQ ID NO: 70) | 248
Family_3 | chr12 | 72335380 | TPH2 | C | A | GAGTGACACGGCAACTTCAC (SEQ ID NO: 71) | CAACTGCTGTCTTGCCACTT (SEQ ID NO: 72) | 238
Family_3 | chr9 | 116930998 | COL27A1 | C | T | CAGCCACCAAAATCCCCAAA (SEQ ID NO: 73) | AACTGGACGGGAAGTAGGTG (SEQ ID NO: 74) | 164
Family_3 | chr7 | 153749970 | DPP6 | C | T | AGTGGGAACCGGAGAGA (SEQ ID NO: 75) | GGAACGTAAGGCGAATTCC (SEQ ID NO: 76) | 596
Family_3 | chr6 | 38142846 | BTBD9 | G | C | CGCTGCCTCCTTTATTGGTG (SEQ ID NO: 77) | CTTTGAGTGTCCAGAGCAGC (SEQ ID NO: 78) | 192
Family_5 | chr2 | 113890284 | IL1RN | G | A | AACATCACTGACCTGAGCGA (SEQ ID NO: 79) | GGCAGTACTACTCGTCCTCC (SEQ ID NO: 80) | 217
Family_5 | chr6 | 38142846 | BTBD9 | G | C | CGCTGCCTCCTTTATTGGTG (SEQ ID NO: 81) | CTTTGAGTGTCCAGAGCAGC (SEQ ID NO: 82) | 192
Family_6 | chr7 | 94218004 | SGCE | T | C | GACACAAGTGTTTTGCCTT (SEQ ID NO: 83) | GGGGTCATAGTTTACCCG (SEQ ID NO: 84) | 267
Family_6 | chrX | 153296684 | MECP2 | G | A | AGGCATCTTGACAAGGAGCT (SEQ ID NO: 85) | TTCACGGTAACTGGGAGAGG (SEQ ID NO: 86) | 207
Family_11 | chr5 | 52240850 | ITGA1 | C | A | TCGGAGTGAAAATGCATCTCTG (SEQ ID NO: 87) | TCTGTCACTTACCGAGAGCA (SEQ ID NO: 88) | 173
Family_11 | chr3 | 113890728 | DRD3 | C | T | TGGATGAGGGACAGGATGGT (SEQ ID NO: 89) | ACCAAGCCCCAAAGAGTCTG (SEQ ID NO: 90) | 546
Family_12 | chr7 | 153750096 | DPP6 | G | A | AGTGGGAACCGGAGAGA (SEQ ID NO: 91) | GGAACGTAAGGCGAATTCC (SEQ ID NO: 92) | 596
Family_12 | chr7 | 94259133 | SGCE | C | T | CAGGTTTTGGGTAAGGTGGA (SEQ ID NO: 93) | GACCCCTCTTTATAAACAGCGT (SEQ ID NO: 94) | 214
Family_13 | chr9 | 116994117 | COL27A1 | G | T | CTTCTGTGGCCTAGAGTCCC (SEQ ID NO: 95) | CACAGATTTAGGGGAGGCCA (SEQ ID NO: 96) | 236
Family_13 | chr9 | 116931124 | COL27A1 | C | T | AAAGTCAGCCCTACCCACTC (SEQ ID NO: 97) | CATGGCTGGTTATCTTGGCC (SEQ ID NO: 98) | 212
Family_15 | chr1 | 216166484 | USH2A-1 | A | T | AAAGTCAGCCCTACCCACTC (SEQ ID NO: 99) | CATGGCTGGTTATCTTGGCC (SEQ ID NO: 100) | 212
Family_15 | chr1 | 216144109 | USH2A-2 | G | A | GCTTGAAAGGCTAGCTGTGC (SEQ ID NO: 101) | TCATGCTGGAACTGTTGGGT (SEQ ID NO: 102) | 505
Family_15 | chr12 | 88512305 | CEP290 | T | | GCAGATCCACAATAGAACA (SEQ ID NO: 103) | CACTTAAAACAGCAGCAG (SEQ ID NO: 104) | 319
Family_15 | chr22 | 42523636 | CYP2D6 | C | A | CATCTGGGAAACAGTGCA (SEQ ID NO: 105) | ATGTCACGGGATGTCATA (SEQ ID NO: 106) | 360
Family_16 | chr4 | 9783901 | DRD5 | T | C | GGGGCAGTTCGCTCTATACC (SEQ ID NO: 107) | CGGTCCACGCTGATGACGC (SEQ ID NO: 108) | 378
Family_16 | chr16 | 55734106 | SLC6A2 | T | C | TTCTCTCCCTTCTCTGCCCA (SEQ ID NO: 109) | GACATCACAGTGAGCTGGGT (SEQ ID NO: 110) | 536
Family_19 | chr22 | 18905859 | PRODH | G | A | CATGACATAAAAGCTGAGG (SEQ ID NO: 111) | CCACAGGATGCCTATGA (SEQ ID NO: 112) | 322
Family_19 | chr7 | 150747934 | ASIC3-1 | CCCCAG | | CATCATCGATCAGCTGGGCT (SEQ ID NO: 113) | GGGTGGGCACAGTTCTTGTA (SEQ ID NO: 114) | 549
Family_19 | chr7 | 150746097 | ASIC3-2 | G | A | TAGCCCCCTGACTGACTCTC (SEQ ID NO: 115) | AGTCCAGCAGCATGTCATCC (SEQ ID NO: 116) | 560
Family_21 | chr8 | 141231575 | TRAPPC9-1 | C | T | AGCTTCACTGTGACGGCTTT (SEQ ID NO: 117) | AAAACAAAACCAGCCTGGGC (SEQ ID NO: 118) | 579
Family_21 | chr8 | 141468504 | TRAPPC9-2 | T | C | GAAGGAGGCCCAGTTCTGTC (SEQ ID NO: 119) | AGTCTGTAAGCCTCCCCCAT (SEQ ID NO: 120) | 518
Family_21 | chr11 | 113860274 | HTR3A | G | A | ACCATGTTCAGGTCACCACC (SEQ ID NO: 121) | AGGGTTCAGACCTTGGCTTG (SEQ ID NO: 122) | 539
Family_23 | chr7 | 153750096 | DPP6 | G | A | AGTGGGAACCGGAGAGA (SEQ ID NO: 123) | GGAACGTAAGGCGAATTCC (SEQ ID NO: 124) | 596
[Table 2]
Family | Chr | Start | Gene Name | RefSeq | Exon | Mutation | LEFT (SEQ ID NO) | RIGHT (SEQ ID NO) | PRODUCT SIZE
Family_1 | chr11 | 60704123 | ARHGAP32 | NM_014715 | exon13 | c.C4390T p.R1464C | CTGACCAGGAGGAACTGAGC (SEQ ID NO: 125) | GGCGCAAATGTCACAAACT (SEQ ID NO: 126) | 211
Family_1 | chr11 | 60704125 | ARHGAP32 | NM_014715 | exon13 | c.G4391A p.R1464H | CTGACCAGGAGGAACTGAGC (SEQ ID NO: 127) | GGCGCAAATGTCACAAACT (SEQ ID NO: 128) | 211
Family_1 | chr12 | 124323060 | DNAH10 | NM_207437 | exon28 | c.C4606T p.R1536C | GAACAGTGTCTCCGCTCTCC (SEQ ID NO: 129) | TTGAGGCTTTTCTGGCATTT (SEQ ID NO: 130) | 177
Family_1 | chr12 | 124315194 | DNAH10 | NM_207437 | exon25 | c.A4139T p.D1380V | AAATGACCGAAACGTTCACC (SEQ ID NO: 131) | CATACCACCACGCTCAGCTA (SEQ ID NO: 132) | 230
Family_3 | chr2 | 216239973 | FN1 | NM_212474 | exon36 | c.C5578T p.R1860W | AGCATGGAAGCAGCAATACC (SEQ ID NO: 133) | ATTGATGCACCATCCAACCT (SEQ ID NO: 134) | 197
Family_3 | chr9 | 116930998 | COL27A1 | NM_032888 | exon3 | c.C1163T p.T388I | CGCTCAACCATCACAGAAGA (SEQ ID NO: 135) | GAGACTCTGGCAGGAACTGG (SEQ ID NO: 136) | 201
Family_2 | chr17 | 80755631 | TBCD | NM_005993 | exon8 | c.772-2A>T | TTTTCAGATGAATTTTTGGGAGA (SEQ ID NO: 137) | GGGCAAACAGTCTTCACGTT (SEQ ID NO: 138) | 247
Family_6 | chr19 | 40900339 | PRX | NM_181882 | exon7 | c.G3920C p.R1307P | TTCCCCAGTGACCATCTCA (SEQ ID NO: 139) | GCGTACCTTCTGCCTCTCAC (SEQ ID NO: 140) | 235
Family_6 | chr19 | 40901579 | PRX | NM_181882 | exon7 | c.G2680A p.V894M | GAACTTGGAAGAGGGCTTGA (SEQ ID NO: 141) | TAGACCTGCCAGGAGCACTT (SEQ ID NO: 142) | 218
Family_11 | chr3 | 113890728 | DRD3 | NM_000796 | exon2 | c.G112A p.A38T | TATACCACCCAGGGCATCAC (SEQ ID NO: 143) | ACTACACCTGTGGGGCAGAG (SEQ ID NO: 144) | 229
Family_12 | chr19 | 8161788 | FBN3 | NM_032447 | exon42 | c.T5390C p.L1797P | GACCTGGACAGAGCCATACC (SEQ ID NO: 145) | CCCAGATGTCGATGAGTGTG (SEQ ID NO: 146) | 215
Family_12 | chr19 | 8151993 | FBN3 | NM_032447 | exon53 | c.G6722A p.R2241Q | AGTTTCCTGCACCCATGAAG (SEQ ID NO: 147) | AGTGTGCAGATGGTCAGCAG (SEQ ID NO: 148) | 162
Family_13 | chr9 | 116931124 | COL27A1 | NM_032888 | exon3 | c.C1289T p.P430L | CAGTTCCTGCCAGAGTCTCC (SEQ ID NO: 149) | CTGGCATGGCTGGTTATCTT (SEQ ID NO: 150) | 169
Family_13 | chr9 | 116994117 | COL27A1 | NM_032888 | exon16 | c.G2536T p.V846L | CATTTGCCCCCTTTTACAGA (SEQ ID NO: 151) | GCAGAGAAACCACAGTGCAA (SEQ ID NO: 152) | 226
Family_15 | chr9 | 141014669 | CACNA1B | NM_000718 | exon44 | c.G6083A p.R2028Q | TGACTGTGAGACCAGGATGG (SEQ ID NO: 153) | TGGTGCTGCAAAGATGAGTC (SEQ ID NO: 154) | 210
Family_15 | chr9 | 140772393 | CACNA1B | NM_000718 | exon1 | c.G8T p.R3L | ACGTGACCGGCCCCTTAT (SEQ ID NO: 155) | CGATCGATTGCTTGTAGAGGA (SEQ ID NO: 156) | 344
Family_16 | chr10 | 76781852 | KAT6B | NM_001256468 | exon16 | c.2686_2697del p.896_899del | CAGTAGGCAATCACCTGCAA (SEQ ID NO: 157) | TTGGGGGAGAGCTTTGAATA (SEQ ID NO: 158) | 242
Family_16 | chr21 | 47422538 | COL6A1 | NM_001848 | exon33 | c.G2348A p.R783Q | CTTGTCCCCAGAAAGACGAG (SEQ ID NO: 159) | GCGGTGACATTCTTCAGGA (SEQ ID NO: 160) | 233
Family_17 | chr11 | 124744033 | ROBO3 | NM_022370 | exon12 | c.G1852A p.G618S | GGAGTAGGCAGGTTGGGAGT (SEQ ID NO: 161) | CACTGCTCGAACCAGAAACA (SEQ ID NO: 162) | 172
Family_19 | chr22 | 18905859 | PRODH | NM_001195226 | exon11 | c.C1073T p.T358M | CTGCCCTGAGAAGACAGAGG (SEQ ID NO: 163) | CCACAGGATGCCTATGACAA (SEQ ID NO: 164) | 220
Family_20 | chr10 | 68040262 | CTNNA3 | NM_001127384 | exon13 | c.T1850C p.I617T | AGGCATTCCAGATGGTGAAG (SEQ ID NO: 165) | CAAGTGAATGTTGCCTTGGA (SEQ ID NO: 166) | 191
Family_21 | chr4 | 955317 | DGKQ | NM_001347 | exon21 | c.G2512C p.E838Q | GCTCACCATGTGCACGAC (SEQ ID NO: 167) | CTTCATCAACATCCCCAGGT (SEQ ID NO: 168) | 245
Family_21 | chr4 | 961785 | DGKQ | NM_001347 | exon6 | c.C694G p.P232A | AAGCTCTGCGTCTTGCTGA (SEQ ID NO: 169) | GTGGGGTCTTTCCCTGGAC (SEQ ID NO: 170) | 206
Family_22 | chr11 | 124764205 | ROBO4 | NM_001301088 | exon8 | c.A775G p.T259A | GCACTGCCCTCACCTAAAAG (SEQ ID NO: 171) | GCTGTTCACCTCTGCTTGTG (SEQ ID NO: 172) | 208
Family_23 | chr6 | 90428873 | MDN1 | NM_014611 | exon41 | c.C6039G p.I2013M | CCACGGGAAAGGACTGAGTA (SEQ ID NO: 173) | ACCCATACATGGGAACCAGA (SEQ ID NO: 174) | 181
Family_23 | chr6 | 90382295 | MDN1 | NM_014611 | exon81 | c.C13601G p.T4534S | TGCCTGATTTCAGACATACCA (SEQ ID NO: 175) | GTTGGACGAAGGATTTGTGG (SEQ ID NO: 176) | 165
Family_24 | chr21 | 47832788 | PCNT | NM_006031 | exon29 | c.C6032T p.A2011V | GTACTGGTTCCCAGCTCCAG (SEQ ID NO: 177) | AGGCGCATTTCATTTTTCAC (SEQ ID NO: 178) | 222
Family_24 | chr21 | 47847674 | PCNT | NM_006031 | exon34 | c.C7459G p.L2487V | TTCTGCAGGTTGTGCAAGAG (SEQ ID NO: 179) | GCAGAGCTGACACTCACCTG (SEQ ID NO: 180) | 154
Family_26 | chr12 | 96076512 | NTN4 | NM_021229 | exon7 | c.C1481T p.A494V | TCCCCTCATAGGATCCAAAA (SEQ ID NO: 181) | TGCACAATAAGAGCGAACCA (SEQ ID NO: 182) | 168
Fam_1 | chr21 | 35897642 | RCAN1 | NM_001285391 | exon1 | c.T71G p.L24R | CGTTAAGGAGCAGTCGGAAC (SEQ ID NO: 183) | TCAAGAGAGGTGGGGAAAAA (SEQ ID NO: 184) | 188
Fam_2 | chr10 | 76788690 | KAT6B | NM_001256468 | exon18 | c.3559_3561del p.1187_1187del | ACATGTGCCCCTGTAAGTCC (SEQ ID NO: 185) | TTTTCCGTGGAGATTTCTGG (SEQ ID NO: 186) | 212
Fam_4 | chr3 | 99509813 | COL8A1 | NM_020351 | exon3 | c.C287T p.A96V | AGATGCCCCACTTGCAGTAT (SEQ ID NO: 187) | TCCCCCTCTGATCCCATAAT (SEQ ID NO: 188) | 183
Fam_5 | chr2 | 216296589 | FN1 | NM_001306129 | exon4 | c.A514G p.N172D | CTAAGCATCCCAGCTCTTGC (SEQ ID NO: 189) | CATGAAGGGGGTCAGTCCTA (SEQ ID NO: 190) | 165
Fam_6 | chr11 | 124763789 | ROBO4 | NM_001301088 | exon9 | c.C1036T p.R346C | CCTGGTCAGAGATCCAAAGC (SEQ ID NO: 191) | CAGCTGAGGGCTACCTTGAA (SEQ ID NO: 192) | 213
Fam_7 | chr11 | 124765478 | ROBO4 | NM_001301088 | exon6 | c.G476T p.G159V | GCCAGAGGATGGTCTCACTT (SEQ ID NO: 193) | CGTTCCTGAGCTCTCTGACC (SEQ ID NO: 194) | 226
Fam_7 | chr11 | 124742934 | ROBO3 | NM_022370 | exon9 | c.C1485A p.D495E | GAGTGACTGGGAACCCTCAA (SEQ ID NO: 195) | GGCTACAGGCCCAGTGAGTA (SEQ ID NO: 196) | 208
Fam_8 | chr16 | 55690691 | SLC6A2 | NM_001043 | exon1 | c.A85G p.K29E | GACCGGTAAAGTTCCTCTCG (SEQ ID NO: 197) | ATCTTCTTGCCCCAGGTCTC (SEQ ID NO: 198) | 220
Fam_8 | chr11 | 124761429 | ROBO4 | NM_001301088 | exon12 | c.G1279A p.V427M | GAGGCTGTCTGAGCTGGAAC (SEQ ID NO: 199) | GATCTCAGGGATGGAAAGCA (SEQ ID NO: 200) | 249
Fam_9 | chr11 | 2186957 | TH | NM_000360 | exon11 | c.C1141T p.Q381X | GAGGACTGGGCAGAGACAAG (SEQ ID NO: 201) | ACTGGTTCACGGTGGAGTTC (SEQ ID NO: 202) | 244
Fam_10 | chr3 | 45814094 | SLC6A20 | NM_020208 | exon5 | c.C596T p.T199M | GCCCCTGATGAGGTAGATGA (SEQ ID NO: 203) | GAATCTCCATGCCTTTTCCA (SEQ ID NO: 204) | 199
Fam_12 | chr2 | 216244028 | FN1 | NM_001306131 | exon32 | c.C4904A p.P1635Q | TTCATTGGTCCGGTCTTCTC (SEQ ID NO: 205) | TTTTCCTTTTCCCCCATTTC (SEQ ID NO: 206) | 193
Fam_13 | chr11 | 124739454 | ROBO3 | NM_022370 | exon3 | c.C596G p.S199C | CCAGTCCTCCGTGATGATTT (SEQ ID NO: 207) | CCTATGTCCCCTCCCTTGTT (SEQ ID NO: 208) | 211
[Table 3]
Sample | Chr | Start | Gene | Ref | Alt | Left Primer
Family-1 | chr12 | 72335380 | TPH2 | C | A | Figure pat00004
Family-1 | chr3 | 88040035 | HTR1F | G | A | Figure pat00005
Family-2 | chr9 | 117070009 | COL27A1 | A | G | Figure pat00006
Family-3 | chr12 | 72335380 | TPH2 | C | A | Figure pat00007
Family-3 | chr9 | 116930998 | COL27A1 | C | T | Figure pat00008
Family-3 | chr6 | 38142846 | BTBD9 | G | C | Figure pat00009
Family-5 | chr2 | 113890284 | IL1RN | G | A | Figure pat00010
Family-5 | chr6 | 38142846 | BTBD9 | G | C | Figure pat00011
Family-6 | chr7 | 94218004 | SGCE | T | C | Figure pat00012
Family-6 | chrX | 153296684 | MECP2 | G | A | Figure pat00013
Family-11 | chr5 | 52240850 | ITGA1 | C | A | Figure pat00014
Family-11 | chr3 | 113890728 | DRD3 | C | T | Figure pat00015
Family-12 | chr7 | 94259133 | SGCE | C | T | Figure pat00016
Family-13 | chr9 | 116994117 | COL27A1 | G | T | Figure pat00017
Family-13 | chr9 | 116931124 | COL27A1 | C | T | Figure pat00018
Family-15 | chr1 | 216166484 | USH2A | A | T | Figure pat00019
Family-15 | chr1 | 216144109 | USH2A | G | A | Figure pat00020
Family-15 | chr12 | 88512305 | CEP290 | T | * | Figure pat00021
Family-16 | chr4 | 42523636 | DRD5 | T | C | Figure pat00022
Family-16 | chr16 | 55734106 | SLC6A2 | T | C | Figure pat00023
Family-19 | chr7 | 150747934 | ASIC3-1 | CCCCAG | * | Figure pat00024
Family-19 | chr7 | 150746097 | ASIC3-2 | G | A | Figure pat00025
Family-21 | chr8 | 141231575 | TRAPPC9 | C | T | Figure pat00026
Family-21 | chr11 | 113860274 | HTR3A | G | A | Figure pat00027
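Example 12. CNV gene analysis results for the Tourette syndrome gene panel
Families in which no causative variant had been found among the Tourette syndrome-related genes were analyzed for CNVs selected through a literature search (PubMed). In the in silico analysis of the Tourette syndrome gene panel, 4 families among all clinical samples had no causative variant of Tourette syndrome selected. CNV analysis of the 3 of these families that consisted of trio samples identified previously reported Tourette syndrome CNVs in 2 families. Taken together, the diagnostic results of the panel show that causative genes were selected on the basis of SNVs in 30 of the 34 families (88%), and that duplications or deletions of previously known CNV genes were observed in 2 of the 3 trio families analyzed for CNVs among the remaining 4 families. This confirmed that the diagnostic yield of the Tourette syndrome gene panel is about 94%. The results are shown in Table 4 below.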
[Table 4]
Index | Sample name | chr | CNV type | Start (bp) | End (bp) | Length (kb) | Raw coverage | copy_no | Gene
Family 5 | TN1512D0471 | chr1 | del | 17081128 | 17090975 | 9.848 | 3822 | 1.468 | MST1L
Family 5 | TN1512D0471 | chr1 | dup | 89476586 | 89477710 | 1.125 | 334 | 3.064 | GBP3
Family 5 | TN1512D0471 | chr1 | dup | 1.97E+08 | 1.97E+08 | 48.451 | 593 | 2.771 | CFHR3, CFHR1
Family 5 | TN1512D0471 | chr1 | del | 2.49E+08 | 2.49E+08 | 21.511 | 1146 | 1.541 | OR2T2, OR2T3
Family 5 | TN1512D0471 | chr2 | del | 2.42E+08 | 2.42E+08 | 0.678 | 200 | 1.258 | AQP12A
Family 5 | TN1512D0471 | chr3 | dup | 1.96E+08 | 1.96E+08 | 6.497 | 16770 | 2.805 | MUC4
Family 5 | TN1512D0471 | chr4 | del | 9245604 | 9251948 | 6.345 | 212 | 0.696 | USP17L17, USP17L18
Family 5 | TN1512D0471 | chr4 | del | 69337177 | 69434245 | 97.069 | 424 | 1.42 | TMPRSS11E, UGT2B17
Family 5 | TN1512D0471 | chr5 | del | 32101210 | 32174425 | 73.216 | 674 | 1.516 | PDZD2, GOLPH3
Family 5 | TN1512D0471 | chr5 | del | 1.3E+08 | 1.4E+08 | 3352.853 | 18571 | 1.904 | KLHL3
Family 3 | TN1711D1003 | chr10 | del | 6.5E+07 | 7E+07 | 5386.132 | 8532 | 1.853 | CTNNA3
Family 3 | TN1711D1003 | chr14 | del | 3.8E+07 | 4.8E+07 | 9596.145 | 13042 | 1.886 | FSCB
Family 3 | TN1711D1003 | chr15 | dup | 45398322 | 45440195 | 41.874 | 2247 | 2.303 | DUOXA1
Family 3 | TN1711D1003 | chr17 | dup | 7121559 | 7128586 | 7.028 | 1379 | 2.436 | DLG4, ACADVL
Family 3 | TN1711D1003 | chr17 | dup | 15468795 | 15496809 | 28.015 | 475 | 2.987 | CDRT1
Family 3 | TN1711D1003 | chr19 | dup | 580645 | 581591 | 0.947 | 227 | 3.22 | BSG
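The diagnostic yields quoted in Example 12 follow directly from the family counts given there. The short check below reproduces the arithmetic; the variable names are illustrative only.

```python
# Family counts taken from Example 12.
total_families = 34
solved_by_snv = 30  # families with a causative gene selected from SNVs
solved_by_cnv = 2   # trio families with a previously reported CNV

snv_yield = solved_by_snv / total_families
combined_yield = (solved_by_snv + solved_by_cnv) / total_families

if __name__ == "__main__":
    print(f"SNV only: {snv_yield:.1%}")
    print(f"SNV + CNV: {combined_yield:.1%}")
```

Rounded to whole percentages, these are the 88% and roughly 94% figures stated in the text.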
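II. Method for screening candidate genes for Tourette syndrome
Example 13. Development of an in silico analysis pipeline for the Tourette syndrome gene panel
The program used in this example provides a method for finding candidate causative genes of familial and non-familial Tourette syndrome. Specifically, it quantifies the nucleotide sequence information of a Tourette syndrome patient and of the patient's parents or siblings, analyzes the resulting patterns, and provides a list of candidate causative genes of Tourette syndrome. The way the program produces this list is described in Examples 13.1 to 13.7 below.
Example 13.1. Preprocessing of exome sequence data
The first step is preprocessing of the exome sequence data, which secures accurate data for selecting causative genes in Tourette syndrome patients. To obtain accurate variant information from the short reads produced by the sequencer, low-quality reads were removed using the Sickle program (https://github.com/najoshi/sickle). The program calculates the mean Q score of each read and, if this value is below the user-selected quality threshold (Q score < 20), trims the read one base at a time starting from the end. The Q score is then recalculated over the remaining bases, and this is repeated until the mean reaches the quality threshold. Trimming can start from either the 3' end or the 5' end, as chosen by the user. If the resulting read is shorter than the user-selected length threshold (50 bp), the read is discarded; otherwise, filtering of that read is complete.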
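The trimming rule described above can be written out as a short sketch. It follows the description in this example (mean Q score below 20, one base removed at a time, reads shorter than 50 bp discarded) and is not Sickle's own source code; the function name and its parameters are hypothetical.

```python
def trim_read(quals, min_mean_q=20, min_len=50, from_end="3p"):
    """Trim a read until its mean Q score reaches min_mean_q.

    `quals` is a list of per-base Phred scores. Bases are removed one at a
    time from the 3' end ("3p") or the 5' end ("5p"). Returns the remaining
    scores, or None when the trimmed read is shorter than min_len.
    """
    if from_end not in ("3p", "5p"):
        raise ValueError("from_end must be '3p' or '5p'")
    remaining = list(quals)
    while remaining and sum(remaining) / len(remaining) < min_mean_q:
        if from_end == "3p":
            remaining.pop()      # drop the last base
        else:
            remaining.pop(0)     # drop the first base
    return remaining if len(remaining) >= min_len else None


if __name__ == "__main__":
    # Toy read: 60 good bases followed by 40 poor ones.
    toy_quals = [30] * 60 + [2] * 40
    kept = trim_read(toy_quals)
    print(None if kept is None else len(kept))
```

In practice the same function would also be applied to the matching slice of the base sequence so that bases and qualities stay aligned.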
Example 13.2. Mapping to the human reference sequence
The next step is mapping to the human reference sequence. To efficiently map the short reads of each sample, corresponding to roughly 100-fold or more coverage of the whole genome, to the human reference sequence (NCBI build GRCh37, UCSC build hg19), Burrows-Wheeler Aligner (BWA) version 0.1.17 was used. The commands and options used were as follows:
bwa aln -I -t 3 -l 45 -k 2 ref.fa sample1_1.fq.gz > sample1_1.sai
bwa aln -I -t 3 -l 45 -k 2 ref.fa sample1_2.fq.gz > sample1_2.sai
bwa sampe -r '@RG\tID:TGP2010D0009\tSM:TGP2010D0009\tPL:Illumina' ref.fa sample1_1.sai sample1_2.sai sample1_1.fq.gz sample1_2.fq.gz > sample1.sam
Next, PCR duplicates were removed. For accurate variant discovery, PCR duplicate reads were removed using Samtools version 0.1.18. The command and options used were as follows: samtools rmdup sample1.sorted.bam sample1.sorted.rmdup.bam.
Example 13.3. Local realignment
The next step is local realignment, which was performed on the BAM file generated in the previous step using GATK Lite version 2.3.9. This step is designed to minimize the number of mismatching bases across all reads through local realignment of the reads, and consists of two sub-steps: determining suspicious short intervals that appear to need realignment (RealignerTargetCreator), and performing local realignment over those intervals (IndelRealigner). Here, local realignment was run on reads misaligned because of indels.
Example 13.4. Base recalibration
The next step is base recalibration. Using the BaseRecalibrator option provided by GATK Lite 2.3.9, a recalibration table was built from various user-specified covariates related to the properties of the reads, and the base quality scores were recalibrated accordingly. The covariates that the user can specify include the read group, the originally reported quality score, the machine cycle, and the nucleotide context. The command and options used in this step were as follows: java -jar GenomeAnalysisTKLite.jar -T BaseRecalibrator -R ref.fa -I sample1.sorted.rmdup.realign.bam -o out.grp --plot_pdf_file out.grp.pdf --knownSites dbsnp_137.hg19.vcf --disable_indel_quals.
Example 13.5. Variant discovery
The next step is variant discovery. Variants were called using the UnifiedGenotyper option provided by GATK Lite 2.3.9. This option uses a Bayesian genotype likelihood model to predict the most probable variant genotypes and positions.
Example 13.6. Variant filtering
The next step is variant filtering, which was performed using the VariantFiltration option provided by GATK Lite 2.3.9. This option evaluates the variant data against user-defined filters and records whether each variant fails the criteria in the 'FILTER' column of a new file, as PASS, HARD_TO_VALIDATE, DepthFilter, and so on.
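Downstream steps only need the variants marked PASS. A minimal sketch of reading the FILTER column from VCF text is shown below; it relies on the standard VCF column order (CHROM, POS, ID, REF, ALT, QUAL, FILTER, INFO) and uses invented toy records. The function name is hypothetical and this is not part of GATK.

```python
def passing_variants(vcf_lines):
    """Yield (chrom, pos, ref, alt) for records whose FILTER column is PASS."""
    for line in vcf_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header lines
        fields = line.rstrip("\n").split("\t")
        # FILTER is the 7th tab-separated column of a VCF record.
        if len(fields) > 6 and fields[6] == "PASS":
            yield fields[0], int(fields[1]), fields[3], fields[4]


if __name__ == "__main__":
    # Hypothetical toy records, not taken from the study data.
    toy_vcf = [
        "##fileformat=VCFv4.1",
        "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
        "chr1\t100\t.\tC\tA\t50\tPASS\t.",
        "chr1\t200\t.\tG\tT\t12\tHARD_TO_VALIDATE\t.",
        "chr2\t300\t.\tT\tC\t8\tDepthFilter\t.",
    ]
    print(list(passing_variants(toy_vcf)))
```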
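Example 13.7. Selection of sequence positions consistent within Tourette syndrome patients and within unaffected individuals
The final step uses the sequence information of the Tourette syndrome patient and of the patient's parents and siblings, at the SNVs that satisfied the above conditions, to select positions at which the sequences of the Tourette syndrome patients agree with one another and the sequences of the unaffected individuals agree with one another. The positions that cleanly separate the sequences of Tourette syndrome patients from those of unaffected individuals within the family under analysis are the SNV and CNV positions. The filtered positions were checked to see whether they lie in a coding region, and only sequences in coding regions were retained. The selected SNVs were converted into gene symbols. Additional information was provided for the genes corresponding to the selected SNVs, including a gene description, the phenotype, and PolyPhen-2 and SIFT scores. PolyPhen-2 and SIFT are programs that predict whether a sequence variant is likely to be disease-causing. The entire procedure was organized as a pipeline that can be run on LINUX. The analysis output lists all variants found in the Tourette syndrome gene panel, in a format intended to help identify the causative genes of Tourette syndrome patients.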
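The selection rule in this step, where patients share one pattern and unaffected relatives share a different one, can be illustrated with a small sketch. The genotype encoding, the member labels, and the requirement that at least one unaffected relative is present are assumptions made for the example; the function name is hypothetical.

```python
def segregates(genotypes, affected):
    """Check whether a position separates patients from unaffected relatives.

    `genotypes` maps a family member to a genotype string at one position;
    `affected` is the set of members with Tourette syndrome. The position
    qualifies when all patients share one genotype, all unaffected members
    share one genotype, and the two genotypes differ.
    """
    patient_patterns = {g for m, g in genotypes.items() if m in affected}
    control_patterns = {g for m, g in genotypes.items() if m not in affected}
    return (
        len(patient_patterns) == 1
        and len(control_patterns) == 1
        and patient_patterns != control_patterns
    )


if __name__ == "__main__":
    # Hypothetical trio: only the child is affected.
    trio = {"father": "C/C", "mother": "C/C", "child": "C/A"}
    print(segregates(trio, affected={"child"}))
```

Positions that pass this check would then go on to the coding-region test and gene-symbol conversion described above.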
<110> Korea Research Institute of Bioscience and Biotechnology <120> METHOD FOR IDENTIFYING CAUSATIVE GENES OF TOURETTE SYNDROME <130> FPD/201901-0012 <160> 208 <170> KoPatentIn 3.0 <210> 1 <211> 1860 <212> PRT <213> Artificial Sequence <220> <223> COL27A1 <400> 1 Met Gly Ala Gly Ser Ala Arg Gly Ala Arg Gly Thr Ala Ala Ala Ala 1 5 10 15 Ala Ala Arg Gly Gly Gly Phe Leu Phe Ser Trp Ile Leu Val Ser Phe 20 25 30 Ala Cys His Leu Ala Ser Thr Gln Gly Ala Pro Glu Asp Val Asp Ile 35 40 45 Leu Gln Arg Leu Gly Leu Ser Trp Thr Lys Ala Gly Ser Pro Ala Pro 50 55 60 Pro Gly Val Ile Pro Phe Gln Ser Gly Phe Ile Phe Thr Gln Arg Ala 65 70 75 80 Arg Leu Gln Ala Pro Thr Gly Thr Val Ile Pro Ala Ala Leu Gly Thr 85 90 95 Glu Leu Ala Leu Val Leu Ser Leu Cys Ser His Arg Val Asn His Ala 100 105 110 Phe Leu Phe Ala Val Arg Ser Gln Lys Arg Lys Leu Gln Leu Gly Leu 115 120 125 Gln Phe Leu Pro Gly Lys Thr Val Val His Leu Gly Ser Arg Arg Ser 130 135 140 Val Ala Phe Asp Leu Asp Met His Asp Gly Arg Trp His His Leu Ala 145 150 155 160 Leu Glu Leu Arg Gly Arg Thr Val Thr Leu Val Thr Ala Cys Gly Gln 165 170 175 Arg Arg Val Pro Val Leu Leu Pro Phe His Arg Asp Pro Ala Leu Asp 180 185 190 Pro Gly Gly Ser Phe Leu Phe Gly Lys Met Asn Pro His Ala Val Gln 195 200 205 Phe Glu Gly Ala Leu Cys Gln Phe Ser Ile Tyr Pro Val Thr Gln Val 210 215 220 Ala His Asn Tyr Cys Thr His Leu Arg Lys Gln Cys Gly Gln Ala Asp 225 230 235 240 Thr Tyr Gln Ser Pro Leu Gly Pro Leu Phe Ser Gln Asp Ser Gly Arg 245 250 255 Pro Phe Thr Phe Gln Ser Asp Leu Ala Leu Leu Gly Leu Glu Asn Leu 260 265 270 Thr Thr Ala Thr Pro Ala Leu Gly Ser Leu Pro Ala Gly Arg Gly Pro 275 280 285 Arg Gly Thr Val Ala Pro Ala Thr Pro Thr Lys Pro Gln Arg Thr Ser 290 295 300 Pro Thr Asn Pro His Gln His Met Ala Val Gly Gly Pro Ala Gln Thr 305 310 315 320 Pro Leu Leu Pro Ala Lys Leu Ser Ala Ser Asn Ala Leu Asp Pro Met 325 330 335 Leu Pro Ala Ser Val Gly Gly Ser Thr Arg Thr Pro Arg Pro Ala Ala 340 345 350 Ala Gln Pro Ser Gln Lys Ile Thr Ala Thr Lys Ile Pro Lys Ser Leu 
355 360 365 Pro Thr Lys Pro Ser Ala Pro Ser Thr Ser Ile Val Pro Ile Lys Ser 370 375 380 Pro His Pro Thr Gln Lys Thr Ala Pro Ser Ser Phe Thr Lys Ser Ala 385 390 395 400 Leu Pro Thr Gln Lys Gln Val Pro Pro Thr Ser Arg Pro Val Pro Ala 405 410 415 Arg Val Ser Arg Pro Ala Glu Lys Pro Ile Gln Arg Asn Pro Gly Met 420 425 430 Pro Arg Pro Pro Pro Pro Ser Thr Arg Pro Leu Pro Pro Thr Thr Ser 435 440 445 Ser Ser Lys Lys Pro Ile Pro Thr Leu Ala Arg Thr Glu Ala Lys Ile 450 455 460 Thr Ser His Ala Ser Lys Pro Ala Ser Ala Arg Thr Ser Thr His Lys 465 470 475 480 Pro Pro Pro Phe Thr Ala Leu Ser Ser Ser Pro Ala Pro Thr Pro Gly 485 490 495 Ser Thr Arg Ser Thr Arg Pro Pro Ala Thr Met Val Pro Pro Thr Ser 500 505 510 Gly Thr Ser Thr Pro Arg Thr Ala Pro Ala Val Pro Thr Pro Gly Ser 515 520 525 Ala Pro Thr Gly Ser Lys Lys Pro Ile Gly Ser Glu Ala Ser Lys Lys 530 535 540 Ala Gly Pro Lys Ser Ser Pro Arg Lys Pro Val Pro Leu Arg Pro Gly 545 550 555 560 Lys Ala Ala Arg Asp Val Pro Leu Ser Asp Leu Thr Thr Arg Pro Ser 565 570 575 Pro Arg Gln Pro Gln Pro Ser Gln Gln Thr Thr Pro Ala Leu Val Leu 580 585 590 Ala Pro Ala Gln Phe Leu Ser Ser Ser Pro Arg Pro Thr Ser Ser Gly 595 600 605 Tyr Ser Ile Phe His Leu Ala Gly Ser Thr Pro Phe Pro Leu Leu Met 610 615 620 Gly Pro Pro Gly Pro Lys Gly Asp Cys Gly Leu Pro Gly Pro Pro Gly 625 630 635 640 Leu Pro Gly Leu Pro Gly Ile Pro Gly Ala Arg Gly Pro Arg Gly Pro 645 650 655 Pro Gly Pro Tyr Gly Asn Pro Gly Leu Pro Gly Pro Pro Gly Ala Lys 660 665 670 Gly Gln Lys Gly Asp Pro Gly Leu Ser Pro Gly Lys Ala His Asp Gly 675 680 685 Ala Lys Gly Asp Met Gly Leu Pro Gly Leu Ser Gly Asn Pro Gly Pro 690 695 700 Pro Gly Arg Lys Gly His Lys Gly Tyr Pro Gly Pro Ala Gly His Pro 705 710 715 720 Gly Glu Gln Gly Gln Pro Gly Pro Glu Gly Ser Pro Gly Ala Lys Gly 725 730 735 Tyr Pro Gly Arg Gln Gly Leu Pro Gly Pro Val Gly Asp Pro Gly Pro 740 745 750 Lys Gly Ser Arg Gly Tyr Ile Gly Leu Pro Gly Leu Phe Gly Leu Pro 755 760 765 Gly Ser Asp Gly Glu Arg Gly Leu Pro Gly Val Pro Gly Lys Arg Gly 770 
775 780 Lys Met Gly Met Pro Gly Phe Pro Gly Val Phe Gly Glu Arg Gly Pro 785 790 795 800 Pro Gly Leu Asp Gly Asn Pro Gly Glu Leu Gly Leu Pro Gly Pro Pro 805 810 815 Gly Val Pro Gly Leu Ile Gly Asp Leu Gly Val Leu Gly Pro Ile Gly 820 825 830 Tyr Pro Gly Pro Lys Gly Met Lys Gly Leu Met Gly Ser Val Gly Glu 835 840 845 Pro Gly Leu Lys Gly Asp Lys Gly Glu Gln Gly Val Pro Gly Val Ser 850 855 860 Gly Asp Pro Gly Phe Gln Gly Asp Lys Gly Ser Gln Gly Leu Pro Gly 865 870 875 880 Phe Pro Gly Ala Arg Gly Lys Pro Gly Pro Leu Gly Lys Val Gly Asp 885 890 895 Lys Gly Ser Ile Gly Phe Pro Gly Pro Pro Gly Pro Glu Gly Phe Pro 900 905 910 Gly Asp Ile Gly Pro Pro Gly Asp Asn Gly Pro Glu Gly Met Lys Gly 915 920 925 Lys Pro Gly Ala Arg Gly Leu Pro Gly Pro Arg Gly Gln Leu Gly Pro 930 935 940 Glu Gly Asp Glu Gly Pro Met Gly Pro Pro Gly Ala Pro Gly Leu Glu 945 950 955 960 Gly Gln Pro Gly Arg Lys Gly Phe Pro Gly Arg Pro Gly Leu Asp Gly 965 970 975 Val Lys Gly Glu Pro Gly Asp Pro Gly Arg Pro Gly Pro Val Gly Glu 980 985 990 Gln Gly Phe Met Gly Phe Ile Gly Leu Val Gly Glu Pro Gly Ile Val 995 1000 1005 Gly Glu Lys Gly Asp Arg Gly Met Met Gly Pro Pro Gly Val Pro Gly 1010 1015 1020 Pro Lys Gly Ser Met Gly His Pro Gly Met Pro Gly Gly Met Gly Thr 1025 1030 1035 1040 Pro Gly Glu Pro Gly Pro Gln Gly Pro Pro Gly Ser Arg Gly Pro Pro 1045 1050 1055 Gly Met Arg Gly Ala Lys Gly Arg Arg Gly Pro Arg Gly Pro Asp Gly 1060 1065 1070 Pro Ala Gly Glu Gln Gly Ser Arg Gly Leu Lys Gly Pro Pro Gly Pro 1075 1080 1085 Gln Gly Arg Pro Gly Arg Pro Gly Gln Gln Gly Val Ala Gly Glu Arg 1090 1095 1100 Gly His Leu Gly Ser Arg Gly Phe Pro Gly Ile Pro Gly Pro Ser Gly 1105 1110 1115 1120 Pro Pro Gly Thr Lys Gly Leu Pro Gly Glu Pro Gly Pro Gln Gly Pro 1125 1130 1135 Gln Gly Pro Ile Gly Pro Pro Gly Glu Met Gly Pro Lys Gly Pro Pro 1140 1145 1150 Gly Ala Val Gly Glu Pro Gly Leu Pro Gly Glu Ala Gly Met Lys Gly 1155 1160 1165 Asp Leu Gly Pro Leu Gly Thr Pro Gly Glu Gln Gly Leu Ile Gly Gln 1170 1175 1180 Arg Gly Glu Pro Gly Leu Glu Gly 
Asp Ser Gly Pro Met Gly Pro Asp 1185 1190 1195 1200 Gly Leu Lys Gly Asp Arg Gly Asp Pro Gly Pro Asp Gly Glu His Gly 1205 1210 1215 Glu Lys Gly Gln Glu Gly Leu Met Gly Glu Asp Gly Pro Pro Gly Pro 1220 1225 1230 Pro Gly Val Thr Gly Val Arg Gly Pro Glu Gly Lys Ser Gly Lys Gln 1235 1240 1245 Gly Glu Lys Gly Arg Thr Gly Ala Lys Gly Ala Lys Gly Tyr Gln Gly 1250 1255 1260 Gln Leu Gly Glu Met Gly Val Pro Gly Asp Pro Gly Pro Pro Gly Thr 1265 1270 1275 1280 Pro Gly Pro Lys Gly Ser Arg Gly Ser Leu Gly Pro Thr Gly Ala Pro 1285 1290 1295 Gly Arg Met Gly Ala Gln Gly Glu Pro Gly Leu Ala Gly Tyr Asp Gly 1300 1305 1310 His Lys Gly Ile Val Gly Pro Leu Gly Pro Pro Gly Pro Lys Gly Glu 1315 1320 1325 Lys Gly Glu Gln Gly Glu Asp Gly Lys Ala Glu Gly Pro Pro Gly Pro 1330 1335 1340 Pro Gly Asp Arg Gly Pro Val Gly Asp Arg Gly Asp Arg Gly Glu Pro 1345 1350 1355 1360 Gly Asp Pro Gly Tyr Pro Gly Gln Glu Gly Val Gln Gly Leu Arg Gly 1365 1370 1375 Lys Pro Gly Gln Gln Gly Gln Pro Gly His Pro Gly Pro Arg Gly Trp 1380 1385 1390 Pro Gly Pro Lys Gly Ser Lys Gly Ala Glu Gly Pro Lys Gly Lys Gln 1395 1400 1405 Gly Lys Ala Gly Ala Pro Gly Arg Arg Gly Val Gln Gly Leu Gln Gly 1410 1415 1420 Leu Pro Gly Pro Arg Gly Val Val Gly Arg Gln Gly Leu Glu Gly Ile 1425 1430 1435 1440 Ala Gly Pro Asp Gly Leu Pro Gly Arg Asp Gly Gln Ala Gly Gln Gln 1445 1450 1455 Gly Glu Gln Gly Asp Asp Gly Asp Pro Gly Pro Met Gly Pro Ala Gly 1460 1465 1470 Lys Arg Gly Asn Pro Gly Val Ala Gly Leu Pro Gly Ala Gln Gly Pro 1475 1480 1485 Pro Gly Phe Lys Gly Glu Ser Gly Leu Pro Gly Gln Leu Gly Pro Pro 1490 1495 1500 Gly Lys Arg Gly Thr Glu Gly Arg Thr Gly Leu Pro Gly Asn Gln Gly 1505 1510 1515 1520 Glu Pro Gly Ser Lys Gly Gln Pro Gly Asp Ser Gly Glu Met Gly Phe 1525 1530 1535 Pro Gly Met Ala Gly Leu Phe Gly Pro Lys Gly Pro Pro Gly Asp Ile 1540 1545 1550 Gly Phe Lys Gly Ile Gln Gly Pro Arg Gly Pro Pro Gly Leu Met Gly 1555 1560 1565 Lys Glu Gly Ile Val Gly Pro Leu Gly Ile Leu Gly Pro Ser Gly Leu 1570 1575 1580 Pro Gly Pro Lys Gly Asp Lys Gly 
Ser Arg Gly Asp Trp Gly Leu Gln 1585 1590 1595 1600 Gly Pro Arg Gly Pro Pro Gly Pro Arg Gly Arg Pro Gly Pro Pro Gly 1605 1610 1615 Pro Pro Gly Gly Pro Ile Gln Leu Gln Gln Asp Asp Leu Gly Ala Ala 1620 1625 1630 Phe Gln Thr Trp Met Asp Thr Ser Gly Ala Leu Arg Pro Glu Ser Tyr 1635 1640 1645 Ser Tyr Pro Asp Arg Leu Val Leu Asp Gln Gly Gly Glu Ile Phe Lys 1650 1655 1660 Thr Leu His Tyr Leu Ser Asn Leu Ile Gln Ser Ile Lys Thr Pro Leu 1665 1670 1675 1680 Gly Thr Lys Glu Asn Pro Ala Arg Val Cys Arg Asp Leu Met Asp Cys 1685 1690 1695 Glu Gln Lys Met Val Asp Gly Thr Tyr Trp Val Asp Pro Asn Leu Gly 1700 1705 1710 Cys Ser Ser Asp Thr Ile Glu Val Ser Cys Asn Phe Thr His Gly Gly 1715 1720 1725 Gln Thr Cys Leu Lys Pro Ile Thr Ala Ser Lys Val Glu Phe Ala Ile 1730 1735 1740 Ser Arg Val Gln Met Asn Phe Leu His Leu Leu Ser Ser Glu Val Thr 1745 1750 1755 1760 Gln His Ile Thr Ile His Cys Leu Asn Met Thr Val Trp Gln Glu Gly 1765 1770 1775 Thr Gly Gln Thr Pro Ala Lys Gln Ala Val Arg Phe Arg Ala Trp Asn 1780 1785 1790 Gly Gln Ile Phe Glu Ala Gly Gly Gln Phe Arg Pro Glu Val Ser Met 1795 1800 1805 Asp Gly Cys Lys Val Gln Asp Gly Arg Trp His Gln Thr Leu Phe Thr 1810 1815 1820 Phe Arg Thr Gln Asp Pro Gln Gln Leu Pro Ile Ile Ser Val Asp Asn 1825 1830 1835 1840 Leu Pro Pro Ala Ser Ser Gly Lys Gln Tyr Arg Leu Glu Val Gly Pro 1845 1850 1855 Ala Cys Phe Leu 1860 <210> 2 <211> 7813 <212> DNA <213> Artificial Sequence <220> <223> COL27A1 <400> 2 ccttttcctc tcctccccca ggccggcggg gaggcagctt ccaccgccct ccgcgcgccc 60 tcacccggcc ttgctctgcc tccggggacc gccagcagcc cgcctccaaa agtttgatca 120 tctctctctc tctttttctt gcttcttctt cctttttggt ggaagcagaa aaggaccgag 180 gcaggggcga gcgcggcgcc cggactcctg ggaccatggg cctggcgcgg gcgcccgcgg 240 ggccccagcc gcgctgcctg cctgctcggg cgcccctggg cgcggggctg cgctgggggc 300 gcgggggccg cgcgctctaa gccggcctgg cgcggcgggg cggggggctg gcggccccat 360 ggggcgcgcc cacacttgcc ccccgggctc gggagcatga agtaggggcc tgccatggga 420 gcgggatcgg cgcggggggc ccgaggcaca gcggcggcgg cggcggcgcg cggggggggg 480 
tttctcttct cctggatctt agtctcgttt gcctgtcacc tggcctccac ccaaggagct 540 cctgaagatg tggacatcct ccagcggctg ggcctcagct ggacgaaggc cgggagccct 600 gcacccccgg gagtcattcc tttccagtcg ggcttcatct ttacgcagcg ggcccggctc 660 caggctccca cgggcaccgt cattcctgcc gccttgggca cagagctggc actggtgctg 720 agcctctgct cccaccgggt gaaccatgcc ttcctcttcg ctgtccgcag ccagaaacgc 780 aagctgcagc tgggcctgca gttcctcccc ggcaagacgg tcgtccacct cgggtcccgg 840 cgctcagtgg ccttcgacct cgacatgcac gacgggcgct ggcaccacct ggccctcgag 900 ctccgaggcc gcacagtcac tctggtgact gcctgcgggc agcgccgggt gcctgtcctg 960 ctgcctttcc acagggaccc tgcactcgac cctgggggct ccttcctctt tgggaagatg 1020 aacccgcatg cagtccagtt tgaaggtgct ctctgccagt tcagtatcta ccctgtgacg 1080 caggtcgctc acaattactg tacccacctg aggaagcagt gtggacaggc tgacacgtac 1140 cagtccccac tgggacctct cttctcccaa gactctggca gaccttttac cttccagtcc 1200 gacctcgccc tgctaggcct ggagaacttg accactgcca caccagccct ggggtcactg 1260 ccagcaggca ggggacccag ggggactgtg gcacccgcca cgcccaccaa gccccaaagg 1320 actagcccca caaaccctca ccagcatatg gcggtgggag gcccagccca aaccccgctg 1380 ctacctgcca agctgtcagc cagtaacgca cttgatccca tgctcccagc ctctgttggc 1440 ggctctacca gaacgcctcg ccctgcggcc gctcaaccat cacagaagat cacagccacc 1500 aaaatcccca aaagcctccc taccaagcct tcggcccctt ctacttcaat tgtgcccatc 1560 aaaagccccc atcctaccca gaaaacagct ccatcttcat ttacaaagtc agccctaccc 1620 actcagaagc aagtgccacc tacttcccgt ccagttcctg ccagagtctc ccgtcccgca 1680 gagaagccca tccagaggaa cccgggaatg cccaggcccc caccgcccag cacccggccc 1740 ctacctccta ccaccagctc ctctaaaaaa cccattccca cactagctcg gactgaggcc 1800 aagataacca gccatgccag taagccggcc tctgcccgca ccagcaccca caaacctccc 1860 ccatttactg ctttatcctc atctcctgcc cctactcctg gttctaccag gagtactcgg 1920 ccaccagcca cgatggtacc tccaacttcg ggcaccagca ctcccagaac agcacctgcc 1980 gtccccactc ctggctcagc tcccactgga agcaagaagc ccattggatc ggaagcctca 2040 aagaaagccg gacccaagag cagcccccgg aagcctgtcc ccctcagacc tgggaaggca 2100 gccagggatg tccccttgag cgatctgaca accaggccta gccccagaca gccccagccc 2160 agtcagcaga 
ccaccccggc cctggtattg gccccggcgc aattcctgtc ctccagcccc 2220 cggcccacga gcagtggcta ttcgatcttc cacctggcag gatctacgcc tttccctctg 2280 ctgatggggc ctccgggacc caagggagac tgtggcttgc cgggtccccc tgggctacct 2340 gggctacctg gaatccctgg tgcacgtggg cctcggggtc ctcctgggcc ttatggaaat 2400 ccaggtctcc ccggccctcc tggagccaaa ggacagaaag gggacccagg gctctcacca 2460 ggaaaggccc acgatggggc aaagggtgac atgggcttgc ctgggctctc cgggaatcca 2520 ggacctccgg gacgaaaggg acacaagggc tatcctggac cggcagggca ccccggagaa 2580 caggggcagc caggacctga gggcagccca ggggccaaag gttaccctgg caggcagggg 2640 ttacctggac cggtaggaga tcccggcccc aaaggcagca ggggctacat tgggctccca 2700 gggctcttcg gcctgccagg gtctgatgga gaacgaggcc tgcctggcgt tcctggcaag 2760 aggggcaaga tgggtatgcc ggggtttcct ggagtctttg gggaaagagg ccctcctgga 2820 ctggatggaa atcctggaga actgggcctg ccaggccccc ctggagtccc cggcctcatt 2880 ggtgacttag gagtgttggg tccgattggc tacccgggac ccaagggcat gaagggactg 2940 atgggcagcg tgggggagcc cggactgaaa ggtgataagg gtgaacaagg ggttccaggt 3000 gtgtcaggag atcccggatt ccaaggagac aaggggagcc aggggttgcc agggttcccc 3060 ggtgcacggg ggaagccagg gcctctgggc aaagtcggag acaaaggatc cattgggttt 3120 cccgggcccc ctggacccga gggattccca ggagacatcg gcccccctgg cgacaatggc 3180 ccagaaggca tgaagggtaa gcctggagcc cgaggcctgc cgggaccccg tgggcagctg 3240 gggcccgagg gagatgaggg acccatgggg ccgccagggg cccctggctt ggagggtcag 3300 cctggcagga aggggtttcc tgggaggccc ggcctggatg gcgtgaaggg ggaaccaggg 3360 gatcctggtc ggccggggcc tgtgggagag cagggattta tgggattcat tggtctggtc 3420 ggggagccag gaatcgtggg agaaaagggt gatcgtggca tgatgggacc cccaggcgtg 3480 cctggaccca aggggtcgat gggtcatcct ggaatgccag gtggtatggg gacccctgga 3540 gagcctggac cccagggtcc tccaggatct cgaggcccac caggcatgag gggagcaaag 3600 ggacgtcggg gcccccgagg accggacgga ccagctgggg agcaagggtc caggggcctg 3660 aagggccctc caggacccca gggcagaccg ggccggcctg gacagcaggg tgtggctggt 3720 gagcgaggcc acttgggctc gagaggcttt cctggcatcc cgggtccctc aggcccccca 3780 ggcaccaagg gcctcccagg agaaccgggc cctcagggac cccaggggcc aattgggcct 3840 ccaggagaga tgggacccaa 
ggggccgcct ggtgcagtgg gagaaccggg ccttcctggg 3900 gaagccggga tgaagggtga ccttggaccc ctgggcactc ctggggagca gggcctcatt 3960 gggcaacggg gagagccagg ccttgagggt gacagtggcc ccatgggacc tgatgggctg 4020 aagggggaca ggggagaccc agggcctgat ggagaacatg gcgagaaagg ccaggaaggg 4080 ctgatgggtg aggacgggcc ccccggcccc cctggcgtca ctggtgtccg gggtcctgaa 4140 ggaaaatcag ggaagcaagg cgagaagggc cgcactggag ccaagggtgc caagggctat 4200 caaggacagc tgggtgagat gggcgtccct ggagaccctg gaccccctgg cactccaggc 4260 cctaaagggt cccggggcag cctgggacca acgggtgctc cgggacgcat gggggcccaa 4320 ggagaaccgg gactggctgg ttatgatgga cacaaaggca ttgtgggacc ccttggacct 4380 cctggaccaa aaggcgaaaa gggggagcag ggcgaggacg gcaaggctga ggggccccct 4440 gggccacctg gagatcgggg ccctgtgggt gatcgaggag accgcgggga accgggagac 4500 cctgggtacc ctggacagga gggtgtgcaa ggcctccgtg gaaagccagg ccagcagggc 4560 caacccgggc atccgggacc ccgggggtgg ccgggaccca aaggatcgaa aggcgcagag 4620 ggaccaaagg gaaagcaagg caaggcaggg gccccaggcc ggaggggggt ccagggcctg 4680 caggggctgc cagggccccg gggcgtggtg gggagacagg gcctcgaggg catcgctgga 4740 ccagatgggc ttcctggcag ggacgggcaa gcaggacagc agggggagca gggagacgat 4800 ggggaccctg gccccatggg ccctgctggg aagagaggaa atccaggtgt ggccggctta 4860 cctggagcac agggaccccc aggattcaag ggtgagagtg ggttacccgg acagctgggt 4920 ccccctggca agcgaggaac agagggcaga acggggctcc ctggaaacca gggggagcct 4980 gggtccaaag gccagccggg cgactctggc gagatgggct tcccaggaat ggcaggtctc 5040 ttcggaccca agggcccgcc tggagacatt ggcttcaaag gcatccaggg ccctcggggg 5100 ccacctggct tgatgggaaa ggaaggcatc gtcgggcccc tcggaatcct gggaccttcg 5160 ggactcccgg gtccgaaggg tgacaaaggc agccgtgggg actggggatt gcaaggtccg 5220 aggggtcctc ccggccccag agggcggccc ggccccccgg gtcctccagg gggtcctatc 5280 caattgcaac aagatgatct tggggcagct ttccagacgt ggatggacac cagtggagca 5340 ctcaggccag agagttacag ctatccagac cggctggtgc tggaccaggg aggagagatc 5400 tttaaaacct tacactacct cagcaacctc atccagagca ttaagacgcc cctgggcacc 5460 aaagagaacc ccgcccgggt ctgcagggac ctcatggact gtgagcagaa gatggtggat 5520 ggtacctact gggtggatcc aaaccttggc 
tgctcctctg acaccatcga ggtctcctgc 5580 aacttcactc atggtggaca gacgtgtctc aagcccatca cggcctccaa ggtcgagttt 5640 gccatcagcc gggtccagat gaatttcctg cacctgctaa gctccgaggt gacccagcac 5700 atcaccatcc actgccttaa catgaccgtg tggcaggagg gcactgggca gaccccagcc 5760 aagcaggccg tacgcttccg ggcctggaat ggacagattt ttgaagctgg gggtcagttc 5820 cggcccgagg tgtccatgga tggctgcaag gtccaagatg gccgctggca tcagacactc 5880 ttcaccttcc ggacccaaga cccccaacag ctgcccatca tcagtgtgga caacctccct 5940 cctgcctcat cagggaagca gtaccgcctg gaagttggac ctgcgtgctt cctctgacct 6000 ctgacctcgt ggccactcta ggcctcacgg aggagggaag aggaagaggc aaggggaggg 6060 tactgagggg cagatggctc caggagaggc agctcccctg cccaagggtc cttgggcaga 6120 ccccagctgt tgtctgccca gtagaagtgg gtgggggtag gaggggatag ggtgtccttg 6180 ggaacaatgg atcccagctt agccccaaag accaaccaaa gagccagcca gagtaagctg 6240 gacctgcaac ctgcctgagc cccgtggcct ctcagctctg cggccacccc gttccctccc 6300 cagcttcctg cccaaagagc cccacattca agccaacttg agggaagggg gcgtctcgtc 6360 agctggtccc tgctagggag ctattgatgt gcaatattag aaaggagaca tgaaaaaagg 6420 agaaaaggaa agacagaagt gtatatatat attatttaaa caaacaaaaa gaaggtgcgt 6480 tactattttt ttttcacccg ggaaagaggt gagaggatgg gaaggagcag ccaggcgtgg 6540 gaagcggcga gatcctcggg ctgggggtgc ccacgtttgc tacctcccac tgtgaaatcg 6600 ctggtgctca caattgtctc tcacagtgta tgtgattttt ttaaggaaaa aaaaaaatcc 6660 ctatttaaga ttctgaaggt gctaccatta ttttgccaca gactttgaag aaacttttgg 6720 atgtggggca tcatccgcat ctttctctct cctccaaatg acaaagtttg gggaattttt 6780 gaattttcct agcatcgccc ttgtgctcat caggtaatct gctaaggagg aaaaaagaaa 6840 agaaaaaagg aaaaaaaaaa aaaaaaagca aaacaaaaac aaaaacaaaa accctaccag 6900 aaaccagaag tagagagatt taccatataa cttatggact ttgaaatgtc tgtcctttta 6960 aggcagcagg gaggcctggg tgcgaagcat gttggcttgg cccttcacgg tcctggaggg 7020 aggtgaggct ggccttggaa ggcgtgccct ggagaggtct tgggtgaaaa cttgaccttg 7080 aagaaaccaa tcacaaaagc ggcgttgggt cagggctagg cttagaggtg aagcatcaac 7140 atggaaccat ctcaggaagc cgcatcgcct cttccgaggt cctcacttcc aggagcctgt 7200 ccttgcaaga tgcaatcatc gttcctgctt tttcattgtc 
attaaattct gtagaaaccc 7260 attgtcatta gctccaagtg taaatttggg tcaaggagac agaataataa tgggaatctc 7320 ggagttcgac accatagtga cgttcagcgt cctctgaatt gtgctacatc agcgaacaag 7380 tcggcgcttg aattggattt tgaggttatt ttaaccatgg aattattttt atagaagggg 7440 aaaatgtatg tgaaagtctc tatttgtgta tttctctcct aaagttgtgt ctctttggga 7500 attggatttg atttttatta tttaatacct cactttggcc cgtcccccct cccaacactt 7560 ctgtatcctc gccctgccgc cccagcctgg acgctctgcg tggaagtgcg tgtttgtagc 7620 agctcgggcc tcatctcagc gctcggatcc ctcctgctgc cagaatccac tggcctctgt 7680 ctcattcttg ggttttcctg ctgtcttcgt ttacgtctct gtccacatgt cagtgtatta 7740 aaaccccaat gggttccgtt tctccttttc ccctctggat tttaaataaa tatttaaaac 7800 tgaggcaatg gaa 7813 <210> 3 <211> 583 <212> PRT <213> Artificial Sequence <220> <223> BTBD9 <400> 3 Met Cys Arg Ala Leu Leu Tyr Gly Gly Met Arg Glu Ser Gln Pro Glu 1 5 10 15 Ala Glu Ile Pro Leu Gln Asp Thr Thr Ala Glu Ala Phe Thr Met Leu 20 25 30 Leu Lys Tyr Ile Tyr Thr Gly Arg Ala Thr Leu Thr Asp Glu Lys Glu 35 40 45 Glu Val Leu Leu Asp Phe Leu Ser Leu Ala His Lys Tyr Gly Phe Pro 50 55 60 Glu Leu Glu Asp Ser Thr Ser Glu Tyr Leu Cys Thr Ile Leu Asn Ile 65 70 75 80 Gln Asn Val Cys Met Thr Phe Asp Val Ala Ser Leu Tyr Ser Leu Pro 85 90 95 Lys Leu Thr Cys Met Cys Cys Met Phe Met Asp Arg Asn Ala Gln Glu 100 105 110 Val Leu Ser Ser Glu Gly Phe Leu Ser Leu Ser Lys Thr Ala Leu Leu 115 120 125 Asn Ile Val Leu Arg Asp Ser Phe Ala Ala Pro Glu Lys Asp Ile Phe 130 135 140 Leu Ala Leu Leu Asn Trp Cys Lys His Asn Ser Lys Glu Asn His Ala 145 150 155 160 Glu Ile Met Gln Ala Val Arg Leu Pro Leu Met Ser Leu Thr Glu Leu 165 170 175 Leu Asn Val Val Arg Pro Ser Gly Leu Leu Ser Pro Asp Ala Ile Leu 180 185 190 Asp Ala Ile Lys Val Arg Ser Glu Ser Arg Asp Met Asp Leu Asn Tyr 195 200 205 Arg Gly Met Leu Ile Pro Glu Glu Asn Ile Ala Thr Met Lys Tyr Gly 210 215 220 Ala Gln Val Val Lys Gly Glu Leu Lys Ser Ala Leu Leu Asp Gly Asp 225 230 235 240 Thr Gln Asn Tyr Asp Leu Asp His Gly Phe Ser Arg His Pro Ile Asp 245 250 255 Asp Asp Cys Arg Ser Gly 
Ile Glu Ile Lys Leu Gly Gln Pro Ser Ile 260 265 270 Ile Asn His Ile Arg Ile Leu Leu Trp Asp Arg Asp Ser Arg Ser Tyr 275 280 285 Ser Tyr Phe Ile Glu Val Ser Met Asp Glu Leu Asp Trp Val Arg Val 290 295 300 Ile Asp His Ser Gln Tyr Leu Cys Arg Ser Trp Gln Lys Leu Tyr Phe 305 310 315 320 Pro Ala Arg Val Cys Ser Gly Asp Gly Val Ser Leu Trp Cys Pro Leu 325 330 335 Trp Ser Arg Thr Pro Glu Leu Lys Gln Ser Ser Leu Leu Gly Leu Pro 340 345 350 Lys Cys Arg Tyr Ile Arg Ile Val Gly Thr His Asn Thr Val Asn Lys 355 360 365 Ile Phe His Ile Val Ala Phe Glu Cys Met Phe Thr Asn Lys Thr Phe 370 375 380 Thr Leu Glu Lys Gly Leu Ile Val Pro Met Glu Asn Val Ala Thr Ile 385 390 395 400 Ala Asp Cys Ala Ser Val Ile Glu Gly Val Ser Arg Ser Arg Asn Ala 405 410 415 Leu Leu Asn Gly Asp Thr Lys Asn Tyr Asp Trp Asp Ser Gly Tyr Thr 420 425 430 Cys His Gln Leu Gly Ser Gly Ala Ile Val Val Gln Leu Ala Gln Pro 435 440 445 Tyr Met Ile Gly Ser Ile Arg Leu Leu Leu Trp Asp Cys Asp Asp Arg 450 455 460 Ser Tyr Ser Tyr Tyr Val Glu Val Ser Thr Asn Gln Gln Gln Trp Thr 465 470 475 480 Met Val Ala Asp Arg Thr Lys Val Ser Cys Lys Ser Trp Gln Ser Val 485 490 495 Thr Phe Glu Arg Gln Pro Ala Ser Phe Ile Arg Ile Val Gly Thr His 500 505 510 Asn Thr Ala Asn Glu Val Phe His Cys Val His Phe Glu Cys Pro Glu 515 520 525 Gln Gln Ser Ser Gln Lys Glu Glu Asn Ser Glu Glu Ser Gly Thr Gly 530 535 540 Asp Thr Ser Leu Ala Gly Gln Gln Leu Asp Ser His Ala Leu Arg Ala 545 550 555 560 Pro Ser Gly Ser Ser Leu Pro Ser Ser Pro Gly Ser Asn Ser Arg Ser 565 570 575 Pro Asn Arg Gln His Gln Ala 580 <210> 4 <211> 2034 <212> DNA <213> Artificial Sequence <220> <223> BTBD9 <400> 4 catgtttatt gttccaaggg agcctcatca aaaccacttt taaagcattg agaatccaaa 60 taaataccat ggatcataca ggattgggga taagtcttga ggtctccccg agaaggagaa 120 gttctttaat ttggtaattt tagggtaatg gttggatgct ctctggatta atggaaacct 180 tcaggggtac ttttaggcag agtataaccc agtgaagagc aaatcctatt tgagggtttc 240 ttcatgtgtt tcttccctga tgtgcagagc attattatat ggtggaatgc gagagtctca 300 gcctgaagca gaaattcctc 
tccaagacac cactgcagaa gcattcacaa tgctactcaa 360 atatatctac actgggcggg caacgctgac agatgagaag gaggaggtgc tgctggactt 420 tttgagcctg gctcataaat atggatttcc agagctagag gattctacct ctgagtatct 480 ctgcaccata cttaacattc agaatgtctg catgactttt gatgttgcca gtctctactc 540 acttcccaag ttaacttgta tgtgctgcat gtttatggat aggaatgctc aggaagtcct 600 ctcaagtgaa ggtttcctct ccctttctaa gacagcactt ttaaacatcg tgttaagaga 660 ctcatttgca gctcccgaaa aagatatttt cctagcctta ttaaactggt gtaagcacaa 720 ttcaaaggag aatcatgctg aaatcatgca ggctgtgcgt ttacctctca tgagcctcac 780 agagcttctg aatgttgtga ggccttcagg actgctgtct cctgatgcca tcctggatgc 840 cattaaagtg cgatctgaga gccgggatat ggacctcaat tatagaggca tgctcatacc 900 agaagaaaac attgcaacta tgaagtatgg agcccaagtt gtaaaggggg agctgaaatc 960 agccttatta gatggtgata ctcaaaatta tgatttggat catggatttt caaggcaccc 1020 aattgatgat gactgccgtt ccggcatcga gattaagcta ggtcagccat ccattatcaa 1080 tcacatacgg atactcttgt gggaccgaga tagccggtct tactcatact tcattgaagt 1140 gtcaatggat gaacttgatt gggtcagagt gatagatcat tcacaatatc tgtgtcgttc 1200 ttggcagaaa ttatattttc cagcccgtgt ctgcagtgga gatggagtct cactatggtg 1260 cccactctgg tctcgaactc ctgagctcaa gcaatcctcc ctccttggcc ttccaaagtg 1320 caggtatatt cgaattgttg ggactcacaa cacagtgaac aagatttttc acattgtggc 1380 ttttgaatgt atgtttacaa acaaaacctt cactcttgag aaggggctga tagttcccat 1440 ggagaatgtt gcaacaattg ctgattgtgc cagtgtgatt gaaggagtca gtcggagccg 1500 aaatgccttg ctgaatgggg acactaagaa ttatgactgg gattctggct acacatgtca 1560 ccagctagga agtggtgcga ttgtggttca gttggcacaa ccgtacatga ttgggtcaat 1620 acggttacta ctttgggatt gtgatgatcg aagctatagc tactacgttg aggtttctac 1680 caaccagcaa cagtggacca tggttgctga cagaactaaa gtctcctgca agtcctggca 1740 gtcagtaact tttgaaaggc agcctgcctc cttcatccgt atcgttggga cacacaacac 1800 agcaaatgag gtgttccact gtgtccactt tgagtgtcca gagcagcaga gcagccagaa 1860 ggaggaaaat agtgaggaat cggggacagg ggacaccagc ctggccggtc agcagctcga 1920 ctcccatgcg ctgcgggcgc ctagtggcag ctcactaccc tccagcccag gctccaactc 1980 acgctccccc aaccggcagc accaataaag gaggcagcgg 
gcctggtgtg actt 2034 <210> 5 <211> 437 <212> PRT <213> Artificial Sequence <220> <223> SGCE <400> 5 Met Gln Leu Pro Arg Trp Trp Glu Leu Gly Asp Pro Cys Ala Trp Thr 1 5 10 15 Gly Gln Gly Arg Gly Thr Arg Arg Met Ser Pro Ala Thr Thr Gly Thr 20 25 30 Phe Leu Leu Thr Val Tyr Ser Ile Phe Ser Lys Val His Ser Asp Arg 35 40 45 Asn Val Tyr Pro Ser Ala Gly Val Leu Phe Val His Val Leu Glu Arg 50 55 60 Glu Tyr Phe Lys Gly Glu Phe Pro Pro Tyr Pro Lys Pro Gly Glu Ile 65 70 75 80 Ser Asn Asp Pro Ile Thr Phe Asn Thr Asn Leu Met Gly Tyr Pro Asp 85 90 95 Arg Pro Gly Trp Leu Arg Tyr Ile Gln Arg Thr Pro Tyr Ser Asp Gly 100 105 110 Val Leu Tyr Gly Ser Pro Thr Ala Glu Asn Val Gly Lys Pro Thr Ile 115 120 125 Ile Glu Ile Thr Ala Tyr Asn Arg Arg Thr Phe Glu Thr Ala Arg His 130 135 140 Asn Leu Ile Ile Asn Ile Met Ser Ala Glu Asp Phe Pro Leu Pro Tyr 145 150 155 160 Gln Ala Glu Phe Phe Ile Lys Asn Met Asn Val Glu Glu Met Leu Ala 165 170 175 Ser Glu Val Leu Gly Asp Phe Leu Gly Ala Val Lys Asn Val Trp Gln 180 185 190 Pro Glu Arg Leu Asn Ala Ile Asn Ile Thr Ser Ala Leu Asp Arg Gly 195 200 205 Gly Arg Val Pro Leu Pro Ile Asn Asp Leu Lys Glu Gly Val Tyr Val 210 215 220 Met Val Gly Ala Asp Val Pro Phe Ser Ser Cys Leu Arg Glu Val Glu 225 230 235 240 Asn Pro Gln Asn Gln Leu Arg Cys Ser Gln Glu Met Glu Pro Val Ile 245 250 255 Thr Cys Asp Lys Lys Phe Arg Thr Gln Phe Tyr Ile Asp Trp Cys Lys 260 265 270 Ile Ser Leu Val Asp Lys Thr Lys Gln Val Ser Thr Tyr Gln Glu Val 275 280 285 Ile Arg Gly Glu Gly Ile Leu Pro Asp Gly Gly Glu Tyr Lys Pro Pro 290 295 300 Ser Asp Ser Leu Lys Ser Arg Asp Tyr Tyr Thr Asp Phe Leu Ile Thr 305 310 315 320 Leu Ala Val Pro Ser Ala Val Ala Leu Val Leu Phe Leu Ile Leu Ala 325 330 335 Tyr Ile Met Cys Cys Arg Arg Glu Gly Val Glu Lys Arg Asn Met Gln 340 345 350 Thr Pro Asp Ile Gln Leu Val His His Ser Ala Ile Gln Lys Ser Thr 355 360 365 Lys Glu Leu Arg Asp Met Ser Lys Asn Arg Glu Ile Ala Trp Pro Leu 370 375 380 Ser Thr Leu Pro Val Phe His Pro Val Thr Gly Glu Ile Ile Pro Pro 385 390 395 
400 Leu His Thr Asp Asn Tyr Asp Ser Thr Asn Met Pro Leu Met Gln Thr 405 410 415 Gln Gln Asn Leu Pro His Gln Thr Gln Ile Pro Gln Gln Gln Thr Thr 420 425 430 Gly Lys Trp Tyr Pro 435 <210> 6 <211> 1615 <212> DNA <213> Artificial Sequence <220> <223> SGCE <400> 6 gcctagccag gccaagaatg caattgcccc ggtggtggga gctgggagac ccctgtgctt 60 ggacgggaca gggtcggggg acacgcagga tgagccccgc gaccactggc acattcttgc 120 tgacagtgta cagtattttc tccaaggtac actccgatcg gaatgtatac ccatcagcag 180 gtgtcctctt tgttcatgtt ttggaaagag aatattttaa gggggaattt ccaccttacc 240 caaaacctgg cgagattagt aatgatccca taacatttaa tacaaattta atgggttacc 300 cagaccgacc tggatggctt cgatatatcc aaaggacacc atatagtgat ggagtcctat 360 atgggtcccc aacagctgaa aatgtgggga agccaacaat cattgagata actgcctaca 420 acaggcgcac ctttgagact gcaaggcata atttgataat taatataatg tctgcagaag 480 acttcccgtt gccatatcaa gcagaattct tcattaagaa tatgaatgta gaagaaatgt 540 tggccagtga ggttcttgga gactttcttg gcgcagtgaa aaatgtgtgg cagccagagc 600 gcctgaacgc cataaacatc acatcggccc tagacagggg tggcagggtg ccacttccca 660 ttaatgacct gaaggagggc gtttatgtca tggttggtgc agatgtcccg ttttcttctt 720 gtttacgaga agttgaaaat ccacagaatc aattgagatg tagtcaagaa atggagcctg 780 taataacatg tgataaaaaa tttcgtactc aattttacat tgactggtgc aaaatttcat 840 tggttgataa aacaaagcaa gtgtccacct atcaggaagt gattcgtgga gaggggattt 900 tacctgatgg tggagaatac aaaccccctt ctgattcttt gaaaagcaga gactattaca 960 cggatttcct aattacactg gctgtgccct cggcagtggc actggtcctt tttctaatac 1020 ttgcttatat catgtgctgc cgacgggaag gcgtggaaaa gagaaacatg caaacaccag 1080 acatccaact ggtccatcac agtgctattc agaaatctac caaggagctt cgagacatgt 1140 ccaagaatag agagatagca tggcccctgt caacgcttcc tgtgttccac cctgtgactg 1200 gggaaatcat acctccttta cacacagaca actatgatag cacaaacatg ccattgatgc 1260 aaacgcagca gaacttgcca catcagactc agattcccca acagcagact acaggtaaat 1320 ggtatccctg aagaaagaaa actgactgaa gcaatgaatt tataatcaga caatatagca 1380 gttacatcac atttcttttc tcttccaata atgcatgagc ttttctggca tatgttatgc 1440 atgttggcag tattaagtgt ataccaaata atacaacata 
actttcattt tactaatgta 1500 tttttttgta cttaaagcat ttttgacaat ttgtaaaaca ttgatgactt tatatttgtt 1560 acaataaaag ttgatcttta aaataaatat tattaatgaa gcctaaaaaa aaaaa 1615 <210> 7 <211> 486 <212> PRT <213> Artificial Sequence <220> <223> MECP2 <400> 7 Met Val Ala Gly Met Leu Gly Leu Arg Glu Glu Lys Ser Glu Asp Gln 1 5 10 15 Asp Leu Gln Gly Leu Lys Asp Lys Pro Leu Lys Phe Lys Lys Val Lys 20 25 30 Lys Asp Lys Lys Glu Glu Lys Glu Gly Lys His Glu Pro Val Gln Pro 35 40 45 Ser Ala His His Ser Ala Glu Pro Ala Glu Ala Gly Lys Ala Glu Thr 50 55 60 Ser Glu Gly Ser Gly Ser Ala Pro Ala Val Pro Glu Ala Ser Ala Ser 65 70 75 80 Pro Lys Gln Arg Arg Ser Ile Ile Arg Asp Arg Gly Pro Met Tyr Asp 85 90 95 Asp Pro Thr Leu Pro Glu Gly Trp Thr Arg Lys Leu Lys Gln Arg Lys 100 105 110 Ser Gly Arg Ser Ala Gly Lys Tyr Asp Val Tyr Leu Ile Asn Pro Gln 115 120 125 Gly Lys Ala Phe Arg Ser Lys Val Glu Leu Ile Ala Tyr Phe Glu Lys 130 135 140 Val Gly Asp Thr Ser Leu Asp Pro Asn Asp Phe Asp Phe Thr Val Thr 145 150 155 160 Gly Arg Gly Ser Pro Ser Arg Arg Glu Gln Lys Pro Pro Lys Lys Pro 165 170 175 Lys Ser Pro Lys Ala Pro Gly Thr Gly Arg Gly Arg Gly Arg Pro Lys 180 185 190 Gly Ser Gly Thr Thr Arg Pro Lys Ala Ala Thr Ser Glu Gly Val Gln 195 200 205 Val Lys Arg Val Leu Glu Lys Ser Pro Gly Lys Leu Leu Val Lys Met 210 215 220 Pro Phe Gln Thr Ser Pro Gly Gly Lys Ala Glu Gly Gly Gly Ala Thr 225 230 235 240 Thr Ser Thr Gln Val Met Val Ile Lys Arg Pro Gly Arg Lys Arg Lys 245 250 255 Ala Glu Ala Asp Pro Gln Ala Ile Pro Lys Lys Arg Gly Arg Lys Pro 260 265 270 Gly Ser Val Val Ala Ala Ala Ala Ala Glu Ala Lys Lys Lys Ala Val 275 280 285 Lys Glu Ser Ser Ile Arg Ser Val Gln Glu Thr Val Leu Pro Ile Lys 290 295 300 Lys Arg Lys Thr Arg Glu Thr Val Ser Ile Glu Val Lys Glu Val Val 305 310 315 320 Lys Pro Leu Leu Val Ser Thr Leu Gly Glu Lys Ser Gly Lys Gly Leu 325 330 335 Lys Thr Cys Lys Ser Pro Gly Arg Lys Ser Lys Glu Ser Ser Pro Lys 340 345 350 Gly Arg Ser Ser Ser Ala Ser Ser Pro Pro Lys Lys Glu His His His 355 360 365 His His 
His His Ser Glu Ser Pro Lys Ala Pro Val Pro Leu Leu Pro 370 375 380 Pro Leu Pro Pro Pro Pro Pro Glu Pro Glu Ser Ser Glu Asp Pro Thr 385 390 395 400 Ser Pro Pro Glu Pro Gln Asp Leu Ser Ser Ser Val Cys Lys Glu Glu 405 410 415 Lys Met Pro Arg Gly Gly Ser Leu Glu Ser Asp Gly Cys Pro Lys Glu 420 425 430 Pro Ala Lys Thr Gln Pro Ala Val Ala Thr Ala Ala Thr Ala Ala Glu 435 440 445 Lys Tyr Lys His Arg Gly Glu Gly Glu Arg Lys Asp Ile Val Ser Ser 450 455 460 Ser Met Pro Arg Pro Asn Arg Glu Glu Pro Val Asp Ser Arg Thr Pro 465 470 475 480 Val Thr Glu Arg Val Ser 485 <210> 8 <211> 10241 <212> DNA <213> Artificial Sequence <220> <223> MECP2 <400> 8 ccggcgtcgg cggcgcgcgc gctccctcct ctcggagaga gggctgtggt aaaagccgtc 60 cggaaaatgg ccgccgccgc cgccgccgcg ccgagcggag gaggaggagg aggcgaggag 120 gagagactgc tccataaaaa tacagactca ccagttcctg ctttgatgtg acatgtgact 180 ccccagaata caccttgctt ctgtagacca gctccaacag gattccatgg tagctgggat 240 gttagggctc agggaagaaa agtcagaaga ccaggacctc cagggcctca aggacaaacc 300 cctcaagttt aaaaaggtga agaaagataa gaaagaagag aaagagggca agcatgagcc 360 cgtgcagcca tcagcccacc actctgctga gcccgcagag gcaggcaaag cagagacatc 420 agaagggtca ggctccgccc cggctgtgcc ggaagcttct gcctccccca aacagcggcg 480 ctccatcatc cgtgaccggg gacccatgta tgatgacccc accctgcctg aaggctggac 540 acggaagctt aagcaaagga aatctggccg ctctgctggg aagtatgatg tgtatttgat 600 caatccccag ggaaaagcct ttcgctctaa agtggagttg attgcgtact tcgaaaaggt 660 aggcgacaca tccctggacc ctaatgattt tgacttcacg gtaactggga gagggagccc 720 ctcccggcga gagcagaaac cacctaagaa gcccaaatct cccaaagctc caggaactgg 780 cagaggccgg ggacgcccca aagggagcgg caccacgaga cccaaggcgg ccacgtcaga 840 gggtgtgcag gtgaaaaggg tcctggagaa aagtcctggg aagctccttg tcaagatgcc 900 ttttcaaact tcgccagggg gcaaggctga ggggggtggg gccaccacat ccacccaggt 960 catggtgatc aaacgccccg gcaggaagcg aaaagctgag gccgaccctc aggccattcc 1020 gaagaaacgg ggccgaaagc cggggagtgt ggtggcagcc gctgccgccg aggccaaaaa 1080 gaaagccgtg aaggagtctt ctatccgatc tgtgcaggag accgtactcc ccatcaagaa 1140 gcgcaagacc cgggagacgg 
tcagcatcga ggtcaaggaa gtggtgaagc ccctgctggt 1200 gtccaccctc ggtgagaaga gcgggaaagg actgaagacc tgtaagagcc ctgggcggaa 1260 aagcaaggag agcagcccca aggggcgcag cagcagcgcc tcctcacccc ccaagaagga 1320 gcaccaccac catcaccacc actcagagtc cccaaaggcc cccgtgccac tgctcccacc 1380 cctgccccca cctccacctg agcccgagag ctccgaggac cccaccagcc cccctgagcc 1440 ccaggacttg agcagcagcg tctgcaaaga ggagaagatg cccagaggag gctcactgga 1500 gagcgacggc tgccccaagg agccagctaa gactcagccc gcggttgcca ccgccgccac 1560 ggccgcagaa aagtacaaac accgagggga gggagagcgc aaagacattg tttcatcctc 1620 catgccaagg ccaaacagag aggagcctgt ggacagccgg acgcccgtga ccgagagagt 1680 tagctgactt tacacggagc ggattgcaaa gcaaaccaac aagaataaag gcagctgttg 1740 tctcttctcc ttatgggtag ggctctgaca aagcttcccg attaactgaa ataaaaaata 1800 tttttttttc tttcagtaaa cttagagttt cgtggcttca gggtgggagt agttggagca 1860 ttggggatgt ttttcttacc gacaagcaca gtcaggttga agacctaacc agggccagaa 1920 gtagctttgc acttttctaa actaggctcc ttcaacaagg cttgctgcag atactactga 1980 ccagacaagc tgttgaccag gcacctcccc tcccgcccaa acctttcccc catgtggtcg 2040 ttagagacag agcgacagag cagttgagag gacactcccg ttttcggtgc catcagtgcc 2100 ccgtctacag ctcccccagc tccccccacc tcccccactc ccaaccacgt tgggacaggg 2160 aggtgtgagg caggagagac agttggattc tttagagaag atggatatga ccagtggcta 2220 tggcctgtgc gatcccaccc gtggtggctc aagtctggcc ccacaccagc cccaatccaa 2280 aactggcaag gacgcttcac aggacaggaa agtggcacct gtctgctcca gctctggcat 2340 ggctaggagg ggggagtccc ttgaactact gggtgtagac tggcctgaac cacaggagag 2400 gatggcccag ggtgaggtgg catggtccat tctcaaggga cgtcctccaa cgggtggcgc 2460 tagaggccat ggaggcagta ggacaaggtg caggcaggct ggcctggggt caggccgggc 2520 agagcacagc ggggtgagag ggattcctaa tcactcagag cagtctgtga cttagtggac 2580 aggggagggg gcaaaggggg aggagaagaa aatgttcttc cagttacttt ccaattctcc 2640 tttagggaca gcttagaatt atttgcacta ttgagtcttc atgttcccac ttcaaaacaa 2700 acagatgctc tgagagcaaa ctggcttgaa ttggtgacat ttagtccctc aagccaccag 2760 atgtgacagt gttgagaact acctggattt gtatatatac ctgcgcttgt tttaaagtgg 2820 gctcagcaca tagggttccc acgaagctcc 
gaaactctaa gtgtttgctg caattttata 2880 aggacttcct gattggtttc tcttctcccc ttccatttct gccttttgtt catttcatcc 2940 tttcacttct ttcccttcct ccgtcctcct ccttcctagt tcatcccttc tcttccaggc 3000 agccgcggtg cccaaccaca cttgtcggct ccagtcccca gaactctgcc tgccctttgt 3060 cctcctgctg ccagtaccag ccccaccctg ttttgagccc tgaggaggcc ttgggctctg 3120 ctgagtccga cctggcctgt ctgtgaagag caagagagca gcaaggtctt gctctcctag 3180 gtagccccct cttccctggt aagaaaaagc aaaaggcatt tcccaccctg aacaacgagc 3240 cttttcaccc ttctactcta gagaagtgga ctggaggagc tgggcccgat ttggtagttg 3300 aggaaagcac agaggcctcc tgtggcctgc cagtcatcga gtggcccaac aggggctcca 3360 tgccagccga ccttgacctc actcagaagt ccagagtcta gcgtagtgca gcagggcagt 3420 agcggtacca atgcagaact cccaagaccc gagctgggac cagtacctgg gtccccagcc 3480 cttcctctgc tccccctttt ccctcggagt tcttcttgaa tggcaatgtt ttgcttttgc 3540 tcgatgcaga cagggggcca gaacaccaca catttcactg tctgtctggt ccatagctgt 3600 ggtgtagggg cttagaggca tgggcttgct gtgggttttt aattgatcag ttttcatgtg 3660 ggatcccatc tttttaacct ctgttcagga agtccttatc tagctgcata tcttcatcat 3720 attggtatat ccttttctgt gtttacagag atgtctctta tatctaaatc tgtccaactg 3780 agaagtacct tatcaaagta gcaaatgaga cagcagtctt atgcttccag aaacacccac 3840 aggcatgtcc catgtgagct gctgccatga actgtcaagt gtgtgttgtc ttgtgtattt 3900 cagttattgt ccctggcttc cttactatgg tgtaatcatg aaggagtgaa acatcataga 3960 aactgtctag cacttccttg ccagtcttta gtgatcagga accatagttg acagttccaa 4020 tcagtagctt aagaaaaaac cgtgtttgtc tcttctggaa tggttagaag tgagggagtt 4080 tgccccgttc tgtttgtaga gtctcatagt tggactttct agcatatatg tgtccatttc 4140 cttatgctgt aaaagcaagt cctgcaacca aactcccatc agcccaatcc ctgatccctg 4200 atcccttcca cctgctctgc tgatgacccc cccagcttca cttctgactc ttccccagga 4260 agggaagggg ggtcagaaga gagggtgagt cctccagaac tcttcctcca aggacagaag 4320 gctcctgccc ccatagtggc ctcgaactcc tggcactacc aaaggacact tatccacgag 4380 agcgcagcat ccgaccaggt tgtcactgag aagatgttta ttttggtcag ttgggttttt 4440 atgtattata cttagtcaaa tgtaatgtgg cttctggaat cattgtccag agctgcttcc 4500 ccgtcacctg ggcgtcatct ggtcctggta agaggagtgc 
gtggcccacc aggcccccct 4560 gtcacccatg acagttcatt cagggccgat ggggcagtcg tggttgggaa cacagcattt 4620 caagcgtcac tttatttcat tcgggcccca cctgcagctc cctcaaagag gcagttgccc 4680 agcctctttc ccttccagtt tattccagag ctgccagtgg ggcctgaggc tccttagggt 4740 tttctctcta tttccccctt tcttcctcat tccctcgtct ttcccaaagg catcacgagt 4800 cagtcgcctt tcagcaggca gccttggcgg tttatcgccc tggcaggcag gggccctgca 4860 gctctcatgc tgcccctgcc ttggggtcag gttgacagga ggttggaggg aaagccttaa 4920 gctgcaggat tctcaccagc tgtgtccggc ccagttttgg ggtgtgacct caatttcaat 4980 tttgtctgta cttgaacatt atgaagatgg gggcctcttt cagtgaattt gtgaacagca 5040 gaattgaccg acagctttcc agtacccatg gggctaggtc attaaggcca catccacagt 5100 ctcccccacc cttgttccag ttgttagtta ctacctcctc tcctgacaat actgtatgtc 5160 gtcgagctcc ccccaggtct acccctcccg gccctgcctg ctggtgggct tgtcatagcc 5220 agtgggattg ccggtcttga cagctcagtg agctggagat acttggtcac agccaggcgc 5280 tagcacagct cccttctgtt gatgctgtat tcccatatca aaagacacag gggacaccca 5340 gaaacgccac atcccccaat ccatcagtgc caaactagcc aacggcccca gcttctcagc 5400 tcgctggatg gcggaagctg ctactcgtga gcgccagtgc gggtgcagac aatcttctgt 5460 tgggtggcat cattccaggc ccgaagcatg aacagtgcac ctgggacagg gagcagcccc 5520 aaattgtcac ctgcttctct gcccagcttt tcattgctgt gacagtgatg gcgaaagagg 5580 gtaataacca gacacaaact gccaagttgg gtggagaaag gagtttcttt agctgacaga 5640 atctctgaat tttaaatcac ttagtaagcg gctcaagccc aggagggagc agagggatac 5700 gagcggagtc ccctgcgcgg gaccatctgg aattggttta gcccaagtgg agcctgacag 5760 ccagaactct gtgtcccccg tctaaccaca gctccttttc cagagcattc cagtcaggct 5820 ctctgggctg actgggccag gggaggttac aggtaccagt tctttaagaa gatctttggg 5880 catatacatt tttagcctgt gtcattgccc caaatggatt cctgtttcaa gttcacacct 5940 gcagattcta ggacctgtgt cctagacttc agggagtcag ctgtttctag agttcctacc 6000 atggagtggg tctggaggac ctgcccggtg ggggggcaga gccctgctcc ctccgggtct 6060 tcctactctt ctctctgctc tgacgggatt tgttgattct ctccattttg gtgtctttct 6120 cttttagata ttgtatcaat ctttagaaaa ggcatagtct acttgttata aatcgttagg 6180 atactgcctc ccccagggtc taaaattaca tattagaggg gaaaagctga 
acactgaagt 6240 cagttctcaa caatttagaa ggaaaaccta gaaaacattt ggcagaaaat tacatttcga 6300 tgtttttgaa tgaatacgag caagctttta caacagtgct gatctaaaaa tacttagcac 6360 ttggcctgag atgcctggtg agcattacag gcaaggggaa tctggaggta gccgacctga 6420 ggacatggct tctgaacctg tcttttggga gtggtatgga aggtggagcg ttcaccagtg 6480 acctggaagg cccagcacca ccctccttcc cactcttctc atcttgacag agcctgcccc 6540 agcgctgacg tgtcaggaaa acacccaggg aactaggaag gcacttctgc ctgaggggca 6600 gcctgccttg cccactcctg ctctgctcgc ctcggatcag ctgagccttc tgagctggcc 6660 tctcactgcc tccccaaggc cccctgcctg ccctgtcagg aggcagaagg aagcaggtgt 6720 gagggcagtg caaggaggga gcacaacccc cagctcccgc tccgggctcc gacttgtgca 6780 caggcagagc ccagaccctg gaggaaatcc tacctttgaa ttcaagaaca tttggggaat 6840 ttggaaatct ctttgccccc aaacccccat tctgtcctac ctttaatcag gtcctgctca 6900 gcagtgagag cagatgaggt gaaaaggcca agaggtttgg ctcctgccca ctgatagccc 6960 ctctccccgc agtgtttgtg tgtcaagtgg caaagctgtt cttcctggtg accctgatta 7020 tatccagtaa cacatagact gtgcgcatag gcctgctttg tctcctctat cctgggcttt 7080 tgttttgctt tttagttttg cttttagttt ttctgtccct tttatttaac gcaccgacta 7140 gacacacaaa gcagttgaat ttttatatat atatctgtat attgcacaat tataaactca 7200 ttttgcttgt ggctccacac acacaaaaaa agacctgtta aaattatacc tgttgcttaa 7260 ttacaatatt tctgataacc atagcatagg acaagggaaa ataaaaaaag aaaaaaaaga 7320 aaaaaaaacg acaaatctgt ctgctggtca cttcttctgt ccaagcagat tcgtggtctt 7380 ttcctcgctt ctttcaaggg ctttcctgtg ccaggtgaag gaggctccag gcagcaccca 7440 ggttttgcac tcttgtttct cccgtgcttg tgaaagaggt cccaaggttc tgggtgcagg 7500 agcgctccct tgacctgctg aagtccggaa cgtagtcggc acagcctggt cgccttccac 7560 ctctgggagc tggagtccac tggggtggcc tgactccccc agtccccttc ccgtgacctg 7620 gtcagggtga gcccatgtgg agtcagcctc gcaggcctcc ctgccagtag ggtccgagtg 7680 tgtttcatcc ttcccactct gtcgagcctg ggggctggag cggagacggg aggcctggcc 7740 tgtctcggaa cctgtgagct gcaccaggta gaacgccagg gaccccagaa tcatgtgcgt 7800 cagtccaagg ggtcccctcc aggagtagtg aagactccag aaatgtccct ttcttctccc 7860 ccatcctacg agtaattgca tttgcttttg taattcttaa tgagcaatat ctgctagaga 
7920 gtttagctgt aacagttctt tttgatcatc tttttttaat aattagaaac accaaaaaaa 7980 tccagaaact tgttcttcca aagcagagag cattataatc accagggcca aaagcttccc 8040 tccctgctgt cattgcttct tctgaggcct gaatccaaaa gaaaaacagc cataggccct 8100 ttcagtggcc gggctacccg tgagcccttc ggaggaccag ggctggggca gcctctgggc 8160 ccacatccgg ggccagctcc ggcgtgtgtt cagtgttagc agtgggtcat gatgctcttt 8220 cccacccagc ctgggatagg ggcagaggag gcgaggaggc cgttgccgct gatgtttggc 8280 cgtgaacagg tgggtgtctg cgtgcgtcca cgtgcgtgtt ttctgactga catgaaatcg 8340 acgcccgagt tagcctcacc cggtgacctc tagccctgcc cggatggagc ggggcccacc 8400 cggttcagtg tttctgggga gctggacagt ggagtgcaaa aggcttgcag aacttgaagc 8460 ctgctccttc ccttgctacc acggcctcct ttccgtttga tttgtcactg cttcaatcaa 8520 taacagccgc tccagagtca gtagtcaatg aatatatgac caaatatcac caggactgtt 8580 actcaatgtg tgccgagccc ttgcccatgc tgggctcccg tgtatctgga cactgtaacg 8640 tgtgctgtgt ttgctcccct tccccttcct tctttgccct ttacttgtct ttctggggtt 8700 tttctgtttg ggtttggttt ggtttttatt tctccttttg tgttccaaac atgaggttct 8760 ctctactggt cctcttaact gtggtgttga ggcttatatt tgtgtaattt ttggtgggtg 8820 aaaggaattt tgctaagtaa atctcttctg tgtttgaact gaagtctgta ttgtaactat 8880 gtttaaagta attgttccag agacaaatat ttctagacac tttttcttta caaacaaaag 8940 cattcggagg gagggggatg gtgactgaga tgagagggga gagctgaaca gatgacccct 9000 gcccagatca gccagaagcc acccaaagca gtggagccca ggagtcccac tccaagccag 9060 caagccgaat agctgatgtg ttgccacttt ccaagtcact gcaaaaccag gttttgttcc 9120 gcccagtgga ttcttgtttt gcttcccctc cccccgagat tattaccacc atcccgtgct 9180 tttaaggaaa ggcaagattg atgtttcctt gaggggagcc aggaggggat gtgtgtgtgc 9240 agagctgaag agctggggag aatggggctg ggcccaccca agcaggaggc tgggacgctc 9300 tgctgtgggc acaggtcagg ctaatgttgg cagatgcagc tcttcctgga caggccaggt 9360 ggtgggcatt ctctctccaa ggtgtgcccc gtgggcatta ctgtttaaga cacttccgtc 9420 acatcccacc ccatcctcca gggctcaaca ctgtgacatc tctattcccc accctcccct 9480 tcccagggca ataaaatgac catggagggg gcttgcactc tcttggctgt cacccgatcg 9540 ccagcaaaac ttagatgtga gaaaacccct tcccattcca tggcgaaaac atctccttag 9600 
aaaagccatt accctcatta ggcatggttt tgggctccca aaacacctga cagcccctcc 9660 ctcctctgag aggcggagag tgctgactgt agtgaccatt gcatgccggg tgcagcatct 9720 ggaagagcta ggcagggtgt ctgccccctc ctgagttgaa gtcatgctcc cctgtgccag 9780 cccagaggcc gagagctatg gacagcattg ccagtaacac aggccaccct gtgcagaagg 9840 gagctggctc cagcctggaa acctgtctga ggttgggaga ggtgcacttg gggcacaggg 9900 agaggccggg acacacttag ctggagatgt ctctaaaagc cctgtatcgt attcaccttc 9960 agtttttgtg ttttgggaca attactttag aaaataagta ggtcgtttta aaaacaaaaa 10020 ttattgattg cttttttgta gtgttcagaa aaaaggttct ttgtgtatag ccaaatgact 10080 gaaagcactg atatatttaa aaacaaaagg caatttatta aggaaatttg taccatttca 10140 gtaaacctgt ctgaatgtac ctgtatacgt ttcaaaaaca cccccccccc actgaatccc 10200 tgtaacctat ttattatata aagagtttgc cttataaatt t 10241 <210> 9 <211> 5202 <212> PRT <213> Artificial Sequence <220> <223> USH2A <400> 9 Met Asn Cys Pro Val Leu Ser Leu Gly Ser Gly Phe Leu Phe Gln Val 1 5 10 15 Ile Glu Met Leu Ile Phe Ala Tyr Phe Ala Ser Ile Ser Leu Thr Glu 20 25 30 Ser Arg Gly Leu Phe Pro Arg Leu Glu Asn Val Gly Ala Phe Lys Lys 35 40 45 Val Ser Ile Val Pro Thr Gln Ala Val Cys Gly Leu Pro Asp Arg Ser 50 55 60 Thr Phe Cys His Ser Ser Ala Ala Ala Glu Ser Ile Gln Phe Cys Thr 65 70 75 80 Gln Arg Phe Cys Ile Gln Asp Cys Pro Tyr Arg Ser Ser His Pro Thr 85 90 95 Tyr Thr Ala Leu Phe Ser Ala Gly Leu Ser Ser Cys Ile Thr Pro Asp 100 105 110 Lys Asn Asp Leu His Pro Asn Ala His Ser Asn Ser Ala Ser Phe Ile 115 120 125 Phe Gly Asn His Lys Ser Cys Phe Ser Ser Pro Pro Ser Pro Lys Leu 130 135 140 Met Ala Ser Phe Thr Leu Ala Val Trp Leu Lys Pro Glu Gln Gln Gly 145 150 155 160 Val Met Cys Val Ile Glu Lys Thr Val Asp Gly Gln Ile Val Phe Lys 165 170 175 Leu Thr Ile Ser Glu Lys Glu Thr Met Phe Tyr Tyr Arg Thr Val Asn 180 185 190 Gly Leu Gln Pro Pro Ile Lys Val Met Thr Leu Gly Arg Ile Leu Val 195 200 205 Lys Lys Trp Ile His Leu Ser Val Gln Val His Gln Thr Lys Ile Ser 210 215 220 Phe Phe Ile Asn Gly Val Glu Lys Asp His Thr Pro Phe Asn Ala Arg 225 230 235 240 Thr Leu Ser Gly 
Ser Ile Thr Asp Phe Ala Ser Gly Thr Val Gln Ile 245 250 255 Gly Gln Ser Leu Asn Gly Leu Glu Gln Phe Val Gly Arg Met Gln Asp 260 265 270 Phe Arg Leu Tyr Gln Val Ala Leu Thr Asn Arg Glu Ile Leu Glu Val 275 280 285 Phe Ser Gly Asp Leu Leu Arg Leu His Ala Gln Ser His Cys Arg Cys 290 295 300 Pro Gly Ser His Pro Arg Val His Pro Leu Ala Gln Arg Tyr Cys Ile 305 310 315 320 Pro Asn Asp Ala Gly Asp Thr Ala Asp Asn Arg Val Ser Arg Leu Asn 325 330 335 Pro Glu Ala His Pro Leu Ser Phe Val Asn Asp Asn Asp Val Gly Thr 340 345 350 Ser Trp Val Ser Asn Val Phe Thr Asn Ile Thr Gln Leu Asn Gln Gly 355 360 365 Val Thr Ile Ser Val Asp Leu Glu Asn Gly Gln Tyr Gln Val Phe Tyr 370 375 380 Ile Ile Ile Gln Phe Phe Ser Pro Gln Pro Thr Glu Ile Arg Ile Gln 385 390 395 400 Arg Lys Lys Glu Asn Ser Leu Asp Trp Glu Asp Trp Gln Tyr Phe Ala 405 410 415 Arg Asn Cys Gly Ala Phe Gly Met Lys Asn Asn Gly Asp Leu Glu Lys 420 425 430 Pro Asp Ser Val Asn Cys Leu Gln Leu Ser Asn Phe Thr Pro Tyr Ser 435 440 445 Arg Gly Asn Val Thr Phe Ser Ile Leu Thr Pro Gly Pro Asn Tyr Arg 450 455 460 Pro Gly Tyr Asn Asn Phe Tyr Asn Thr Pro Ser Leu Gln Glu Phe Val 465 470 475 480 Lys Ala Thr Gln Ile Arg Phe His Phe His Gly Gln Tyr Tyr Thr Thr 485 490 495 Glu Thr Ala Val Asn Leu Arg His Arg Tyr Tyr Ala Val Asp Glu Ile 500 505 510 Thr Ile Ser Gly Arg Cys Gln Cys His Gly His Ala Asp Asn Cys Asp 515 520 525 Thr Thr Ser Gln Pro Tyr Arg Cys Leu Cys Ser Gln Glu Ser Phe Thr 530 535 540 Glu Gly Leu His Cys Asp Arg Cys Leu Pro Leu Tyr Asn Asp Lys Pro 545 550 555 560 Phe Arg Gln Gly Asp Gln Val Tyr Ala Phe Asn Cys Lys Pro Cys Gln 565 570 575 Cys Asn Ser His Ser Lys Ser Cys His Tyr Asn Ile Ser Val Asp Pro 580 585 590 Phe Pro Phe Glu His Phe Arg Gly Gly Gly Gly Val Cys Asp Asp Cys 595 600 605 Glu His Asn Thr Thr Gly Arg Asn Cys Glu Leu Cys Lys Asp Tyr Phe 610 615 620 Phe Arg Gln Val Gly Ala Asp Pro Ser Ala Ile Asp Val Cys Lys Pro 625 630 635 640 Cys Asp Cys Asp Thr Val Gly Thr Arg Asn Gly Ser Ile Leu Cys Asp 645 650 655 Gln Ile Gly Gly Gln 
Cys Asn Cys Lys Arg His Val Ser Gly Arg Gln 660 665 670 Cys Asn Gln Cys Gln Asn Gly Phe Tyr Asn Leu Gln Glu Leu Asp Pro 675 680 685 Asp Gly Cys Ser Pro Cys Asn Cys Asn Thr Ser Gly Thr Val Asp Gly 690 695 700 Asp Ile Thr Cys His Gln Asn Ser Gly Gln Cys Lys Cys Lys Ala Asn 705 710 715 720 Val Ile Gly Leu Arg Cys Asp His Cys Asn Phe Gly Phe Lys Phe Leu 725 730 735 Arg Ser Phe Asn Asp Val Gly Cys Glu Pro Cys Gln Cys Asn Leu His 740 745 750 Gly Ser Val Asn Lys Phe Cys Asn Pro His Ser Gly Gln Cys Glu Cys 755 760 765 Lys Lys Glu Ala Lys Gly Leu Gln Cys Asp Thr Cys Arg Glu Asn Phe 770 775 780 Tyr Gly Leu Asp Val Thr Asn Cys Lys Ala Cys Asp Cys Asp Thr Ala 785 790 795 800 Gly Ser Leu Pro Gly Thr Val Cys Asn Ala Lys Thr Gly Gln Cys Ile 805 810 815 Cys Lys Pro Asn Val Glu Gly Arg Gln Cys Asn Lys Cys Leu Glu Gly 820 825 830 Asn Phe Tyr Leu Arg Gln Asn Asn Ser Phe Leu Cys Leu Pro Cys Asn 835 840 845 Cys Asp Lys Thr Gly Thr Ile Asn Gly Ser Leu Leu Cys Asn Lys Ser 850 855 860 Thr Gly Gln Cys Pro Cys Lys Leu Gly Val Thr Gly Leu Arg Cys Asn 865 870 875 880 Gln Cys Glu Pro His Arg Tyr Asn Leu Thr Ile Asp Asn Phe Gln His 885 890 895 Cys Gln Met Cys Glu Cys Asp Ser Leu Gly Thr Leu Pro Gly Thr Ile 900 905 910 Cys Asp Pro Ile Ser Gly Gln Cys Leu Cys Val Pro Asn Arg Gln Gly 915 920 925 Arg Arg Cys Asn Gln Cys Gln Pro Gly Phe Tyr Ile Ser Pro Gly Asn 930 935 940 Ala Thr Gly Cys Leu Pro Cys Ser Cys His Thr Thr Gly Ala Val Asn 945 950 955 960 His Ile Cys Asn Ser Leu Thr Gly Gln Cys Val Cys Gln Asp Ala Ser 965 970 975 Ile Ala Gly Gln Arg Cys Asp Gln Cys Lys Asp His Tyr Phe Gly Phe 980 985 990 Asp Pro Gln Thr Gly Arg Cys Gln Pro Cys Asn Cys His Leu Ser Gly 995 1000 1005 Ala Leu Asn Glu Thr Cys His Leu Val Thr Gly Gln Cys Phe Cys Lys 1010 1015 1020 Gln Phe Val Thr Gly Ser Lys Cys Asp Ala Cys Val Pro Ser Ala Ser 1025 1030 1035 1040 His Leu Asp Val Asn Asn Leu Leu Gly Cys Ser Lys Thr Pro Phe Gln 1045 1050 1055 Gln Pro Pro Pro Arg Gly Gln Val Gln Ser Ser Ser Ala Ile Asn Leu 1060 1065 1070 Ser Trp 
Ser Pro Pro Asp Ser Pro Asn Ala His Trp Leu Thr Tyr Ser 1075 1080 1085 Leu Leu Arg Asp Gly Phe Glu Ile Tyr Thr Thr Glu Asp Gln Tyr Pro 1090 1095 1100 Tyr Ser Ile Gln Tyr Phe Leu Asp Thr Asp Leu Leu Pro Tyr Thr Lys 1105 1110 1115 1120 Tyr Ser Tyr Tyr Ile Glu Thr Thr Asn Val His Gly Ser Thr Arg Ser 1125 1130 1135 Val Ala Val Thr Tyr Lys Thr Lys Pro Gly Val Pro Glu Gly Asn Leu 1140 1145 1150 Thr Leu Ser Tyr Ile Ile Pro Ile Gly Ser Asp Ser Val Thr Leu Thr 1155 1160 1165 Trp Thr Thr Leu Ser Asn Gln Ser Gly Pro Ile Glu Lys Tyr Ile Leu 1170 1175 1180 Ser Cys Ala Pro Leu Ala Gly Gly Gln Pro Cys Val Ser Tyr Glu Gly 1185 1190 1195 1200 His Glu Thr Ser Ala Thr Ile Trp Asn Leu Val Pro Phe Ala Lys Tyr 1205 1210 1215 Asp Phe Ser Val Gln Ala Cys Thr Ser Gly Gly Cys Leu His Ser Leu 1220 1225 1230 Pro Ile Thr Val Thr Thr Ala Gln Ala Pro Pro Gln Arg Leu Ser Pro 1235 1240 1245 Pro Lys Met Gln Lys Ile Ser Ser Thr Glu Leu His Val Glu Trp Ser 1250 1255 1260 Pro Pro Ala Glu Leu Asn Gly Ile Ile Ile Arg Tyr Glu Leu Tyr Met 1265 1270 1275 1280 Arg Arg Leu Arg Ser Thr Lys Glu Thr Thr Ser Glu Glu Ser Arg Val 1285 1290 1295 Phe Gln Ser Ser Gly Trp Leu Ser Pro His Ser Phe Val Glu Ser Ala 1300 1305 1310 Asn Glu Asn Ala Leu Lys Pro Pro Gln Thr Met Thr Thr Ile Thr Gly 1315 1320 1325 Leu Glu Pro Tyr Thr Lys Tyr Glu Phe Arg Val Leu Ala Val Asn Met 1330 1335 1340 Ala Gly Ser Val Ser Ser Ala Trp Val Ser Glu Arg Thr Gly Glu Ser 1345 1350 1355 1360 Ala Pro Val Phe Met Ile Pro Pro Ser Val Phe Pro Leu Ser Ser Tyr 1365 1370 1375 Ser Leu Asn Ile Ser Trp Glu Lys Pro Ala Asp Asn Val Thr Arg Gly 1380 1385 1390 Lys Val Val Gly Tyr Asp Ile Asn Met Leu Ser Glu Gln Ser Pro Gln 1395 1400 1405 Gln Ser Ile Pro Met Ala Phe Ser Gln Leu Leu His Thr Ala Lys Ser 1410 1415 1420 Gln Glu Leu Ser Tyr Thr Val Glu Gly Leu Lys Pro Tyr Arg Ile Tyr 1425 1430 1435 1440 Glu Phe Thr Ile Thr Leu Cys Asn Ser Val Gly Cys Val Thr Ser Ala 1445 1450 1455 Ser Gly Ala Gly Gln Thr Leu Ala Ala Ala Pro Ala Gln Leu Arg Pro 1460 1465 1470 Pro Leu 
Val Lys Gly Ile Asn Ser Thr Thr Ile His Leu Lys Trp Phe 1475 1480 1485 Pro Pro Glu Glu Leu Asn Gly Pro Ser Pro Ile Tyr Gln Leu Glu Arg 1490 1495 1500 Arg Glu Ser Ser Leu Pro Ala Leu Met Thr Thr Met Met Lys Gly Ile 1505 1510 1515 1520 Arg Phe Ile Gly Asn Gly Tyr Cys Lys Phe Pro Ser Ser Thr His Pro 1525 1530 1535 Val Asn Thr Asp Phe Thr Gly Ile Lys Ala Ser Phe Arg Thr Lys Val 1540 1545 1550 Pro Glu Gly Leu Ile Val Phe Ala Ala Ser Pro Gly Asn Gln Glu Glu 1555 1560 1565 Tyr Phe Ala Leu Gln Leu Lys Lys Gly Arg Leu Tyr Phe Leu Phe Asp 1570 1575 1580 Pro Gln Gly Ser Pro Val Glu Val Thr Thr Thr Asn Asp His Gly Lys 1585 1590 1595 1600 Gln Tyr Ser Asp Gly Lys Trp His Glu Ile Ile Ala Ile Arg His Gln 1605 1610 1615 Ala Phe Gly Gln Ile Thr Leu Asp Gly Ile Tyr Thr Gly Ser Ser Ala 1620 1625 1630 Ile Leu Asn Gly Ser Thr Val Ile Gly Asp Asn Thr Gly Val Phe Leu 1635 1640 1645 Gly Gly Leu Pro Arg Ser Tyr Thr Ile Leu Arg Lys Asp Pro Glu Ile 1650 1655 1660 Ile Gln Lys Gly Phe Val Gly Cys Leu Lys Asp Val His Phe Met Lys 1665 1670 1675 1680 Asn Tyr Asn Pro Ser Ala Ile Trp Glu Pro Leu Asp Trp Gln Ser Ser 1685 1690 1695 Glu Glu Gln Ile Asn Val Tyr Asn Ser Trp Glu Gly Cys Pro Ala Ser 1700 1705 1710 Leu Asn Glu Gly Ala Gln Phe Leu Gly Ala Gly Phe Leu Glu Leu His 1715 1720 1725 Pro Tyr Met Phe His Gly Gly Met Asn Phe Glu Ile Ser Phe Lys Phe 1730 1735 1740 Arg Thr Asp Gln Leu Asn Gly Leu Leu Leu Phe Val Tyr Asn Lys Asp 1745 1750 1755 1760 Gly Pro Asp Phe Leu Ala Met Glu Leu Lys Ser Gly Ile Leu Thr Phe 1765 1770 1775 Arg Leu Asn Thr Ser Leu Ala Phe Thr Gln Val Asp Leu Leu Leu Gly 1780 1785 1790 Leu Ser Tyr Cys Asn Gly Lys Trp Asn Lys Val Ile Ile Lys Lys Glu 1795 1800 1805 Gly Ser Phe Ile Ser Ala Ser Val Asn Gly Leu Met Lys His Ala Ser 1810 1815 1820 Glu Ser Gly Asp Gln Pro Leu Val Val Asn Ser Pro Val Tyr Val Gly 1825 1830 1835 1840 Gly Ile Pro Gln Glu Leu Leu Asn Ser Tyr Gln His Leu Cys Leu Glu 1845 1850 1855 Gln Gly Phe Gly Gly Cys Met Lys Asp Val Lys Phe Thr Arg Gly Ala 1860 1865 1870 Val Val 
Asn Leu Ala Ser Val Ser Ser Gly Ala Val Arg Val Asn Leu 1875 1880 1885 Asp Gly Cys Leu Ser Thr Asp Ser Ala Val Asn Cys Arg Gly Asn Asp 1890 1895 1900 Ser Ile Leu Val Tyr Gln Gly Lys Glu Gln Ser Val Tyr Glu Gly Gly 1905 1910 1915 1920 Leu Gln Pro Phe Thr Glu Tyr Leu Tyr Arg Val Ile Ala Ser His Glu 1925 1930 1935 Gly Gly Ser Val Tyr Ser Asp Trp Ser Arg Gly Arg Thr Thr Gly Ala 1940 1945 1950 Ala Pro Gln Ser Val Pro Thr Pro Ser Arg Val Arg Ser Leu Asn Gly 1955 1960 1965 Tyr Ser Ile Glu Val Thr Trp Asp Glu Pro Val Val Arg Gly Val Ile 1970 1975 1980 Glu Lys Tyr Ile Leu Lys Ala Tyr Ser Glu Asp Ser Thr Arg Pro Pro 1985 1990 1995 2000 Arg Met Pro Ser Ala Ser Ala Glu Phe Val Asn Thr Ser Asn Leu Thr 2005 2010 2015 Gly Ile Leu Thr Gly Leu Leu Pro Phe Lys Asn Tyr Ala Val Thr Leu 2020 2025 2030 Thr Ala Cys Thr Leu Ala Gly Cys Thr Glu Ser Ser His Ala Leu Asn 2035 2040 2045 Ile Ser Thr Pro Gln Glu Ala Pro Gln Glu Val Gln Pro Pro Val Ala 2050 2055 2060 Lys Ser Leu Pro Ser Ser Leu Leu Leu Ser Trp Asn Pro Pro Lys Lys 2065 2070 2075 2080 Ala Asn Gly Ile Ile Thr Gln Tyr Cys Leu Tyr Met Asp Gly Arg Leu 2085 2090 2095 Ile Tyr Ser Gly Ser Glu Glu Asn Tyr Thr Val Thr Asp Leu Ala Val 2100 2105 2110 Phe Thr Pro His Gln Phe Leu Leu Ser Ala Cys Thr His Val Gly Cys 2115 2120 2125 Thr Asn Ser Ser Trp Val Leu Leu Tyr Thr Ala Gln Leu Pro Pro Glu 2130 2135 2140 His Val Asp Ser Pro Val Leu Thr Val Leu Asp Ser Arg Thr Ile His 2145 2150 2155 2160 Ile Gln Trp Lys Gln Pro Arg Lys Ile Ser Gly Ile Leu Glu Arg Tyr 2165 2170 2175 Val Leu Tyr Met Ser Asn His Thr His Asp Phe Thr Ile Trp Ser Val 2180 2185 2190 Ile Tyr Asn Ser Thr Glu Leu Phe Gln Asp His Met Leu Gln Tyr Val 2195 2200 2205 Leu Pro Gly Asn Lys Tyr Leu Ile Lys Leu Gly Ala Cys Thr Gly Gly 2210 2215 2220 Gly Cys Thr Val Ser Glu Ala Ser Glu Ala Leu Thr Asp Glu Asp Ile 2225 2230 2235 2240 Pro Glu Gly Val Pro Ala Pro Lys Ala His Ser Tyr Ser Pro Asp Ser 2245 2250 2255 Phe Asn Val Ser Trp Thr Glu Pro Glu Tyr Pro Asn Gly Val Ile Thr 2260 2265 2270 Ser Tyr 
Gly Leu Tyr Leu Asp Gly Ile Leu Ile His Asn Ser Ser Glu 2275 2280 2285 Leu Ser Tyr Arg Ala Tyr Gly Phe Ala Pro Trp Ser Leu His Ser Phe 2290 2295 2300 Arg Val Gln Ala Cys Thr Ala Lys Gly Cys Ala Leu Gly Pro Leu Val 2305 2310 2315 2320 Glu Asn Arg Thr Leu Glu Ala Pro Pro Glu Gly Thr Val Asn Val Phe 2325 2330 2335 Val Lys Thr Gln Gly Ser Arg Lys Ala His Val Arg Trp Glu Ala Pro 2340 2345 2350 Phe Arg Pro Asn Gly Leu Leu Thr His Ser Val Leu Phe Thr Gly Ile 2355 2360 2365 Phe Tyr Val Asp Pro Val Gly Asn Asn Tyr Thr Leu Leu Asn Val Thr 2370 2375 2380 Lys Val Met Tyr Ser Gly Glu Glu Thr Asn Leu Trp Val Leu Ile Asp 2385 2390 2395 2400 Gly Leu Val Pro Phe Thr Asn Tyr Thr Val Gln Val Asn Ile Ser Asn 2405 2410 2415 Ser Gln Gly Ser Leu Ile Thr Asp Pro Ile Thr Ile Ala Met Pro Pro 2420 2425 2430 Gly Ala Pro Asp Gly Val Leu Pro Pro Arg Leu Ser Ser Ala Thr Pro 2435 2440 2445 Thr Ser Leu Gln Val Val Trp Ser Thr Pro Ala Arg Asn Asn Ala Pro 2450 2455 2460 Gly Ser Pro Arg Tyr Gln Leu Gln Met Arg Ser Gly Asp Ser Thr His 2465 2470 2475 2480 Gly Phe Leu Glu Leu Phe Ser Asn Pro Ser Ala Ser Leu Ser Tyr Glu 2485 2490 2495 Val Ser Asp Leu Gln Pro Tyr Thr Glu Tyr Met Phe Arg Leu Val Ala 2500 2505 2510 Ser Asn Gly Phe Gly Ser Ala His Ser Ser Trp Ile Pro Phe Met Thr 2515 2520 2525 Ala Glu Asp Lys Pro Gly Pro Val Val Pro Pro Ile Leu Leu Asp Val 2530 2535 2540 Lys Ser Arg Met Met Leu Val Thr Trp Gln His Pro Arg Lys Ser Asn 2545 2550 2555 2560 Gly Val Ile Thr His Tyr Asn Ile Tyr Leu His Gly Arg Leu Tyr Leu 2565 2570 2575 Arg Thr Pro Gly Asn Val Thr Asn Cys Thr Val Met His Leu His Pro 2580 2585 2590 Tyr Thr Ala Tyr Lys Phe Gln Val Glu Ala Cys Thr Ser Lys Gly Cys 2595 2600 2605 Ser Leu Ser Pro Glu Ser Gln Thr Val Trp Thr Leu Pro Gly Ala Pro 2610 2615 2620 Glu Gly Ile Pro Ser Pro Glu Leu Phe Ser Asp Thr Pro Thr Ser Val 2625 2630 2635 2640 Ile Ile Ser Trp Gln Pro Pro Thr His Pro Asn Gly Leu Val Glu Asn 2645 2650 2655 Phe Thr Ile Glu Arg Arg Val Lys Gly Lys Glu Glu Val Thr Thr Leu 2660 2665 2670 Val Thr 
Leu Pro Arg Ser His Ser Met Arg Phe Ile Asp Lys Thr Ser 2675 2680 2685 Ala Leu Ser Pro Trp Thr Lys Tyr Glu Tyr Arg Val Leu Met Ser Thr 2690 2695 2700 Leu His Gly Gly Thr Asn Ser Ser Ala Trp Val Glu Val Thr Thr Arg 2705 2710 2715 2720 Pro Ser Arg Pro Ala Gly Val Gln Pro Pro Val Val Thr Val Leu Glu 2725 2730 2735 Pro Asp Ala Val Gln Val Thr Trp Lys Pro Pro Leu Ile Gln Asn Gly 2740 2745 2750 Asp Ile Leu Ser Tyr Glu Ile His Met Pro Asp Pro His Ile Thr Leu 2755 2760 2765 Thr Asn Val Thr Ser Ala Val Leu Ser Gln Lys Val Thr His Leu Ile 2770 2775 2780 Pro Phe Thr Asn Tyr Ser Val Thr Ile Val Ala Cys Ser Gly Gly Asn 2785 2790 2795 2800 Gly Tyr Leu Gly Gly Cys Thr Glu Ser Leu Pro Thr Tyr Val Thr Thr 2805 2810 2815 His Pro Thr Val Pro Gln Asn Val Gly Pro Leu Ser Val Ile Pro Leu 2820 2825 2830 Ser Glu Ser Tyr Val Val Ile Ser Trp Gln Pro Pro Ser Lys Pro Asn 2835 2840 2845 Gly Pro Asn Leu Arg Tyr Glu Leu Leu Arg Arg Lys Ile Gln Gln Pro 2850 2855 2860 Leu Ala Ser Asn Pro Pro Glu Asp Leu Asn Arg Trp His Asn Ile Tyr 2865 2870 2875 2880 Ser Gly Thr Gln Trp Leu Tyr Glu Asp Lys Gly Leu Ser Arg Phe Thr 2885 2890 2895 Thr Tyr Glu Tyr Met Leu Phe Val His Asn Ser Val Gly Phe Thr Pro 2900 2905 2910 Ser Arg Glu Val Thr Val Thr Thr Leu Ala Gly Leu Pro Glu Arg Gly 2915 2920 2925 Ala Asn Leu Thr Ala Ser Val Leu Asn His Thr Ala Ile Asp Val Arg 2930 2935 2940 Trp Ala Lys Pro Thr Val Gln Asp Leu Gln Gly Glu Val Glu Tyr Tyr 2945 2950 2955 2960 Thr Leu Phe Trp Ser Ser Ala Thr Ser Asn Asp Ser Leu Lys Ile Leu 2965 2970 2975 Pro Asp Val Asn Ser His Val Ile Gly His Leu Lys Pro Asn Thr Glu 2980 2985 2990 Tyr Trp Ile Phe Ile Ser Val Phe Asn Gly Val His Ser Ile Asn Ser 2995 3000 3005 Ala Gly Leu His Ala Thr Thr Cys Asp Gly Glu Pro Gln Gly Met Leu 3010 3015 3020 Pro Pro Glu Val Val Ile Ile Asn Ser Thr Ala Val Arg Val Ile Trp 3025 3030 3035 3040 Thr Ser Pro Ser Asn Pro Asn Gly Val Val Thr Glu Tyr Ser Ile Tyr 3045 3050 3055 Val Asn Asn Lys Leu Tyr Lys Thr Gly Met Asn Val Pro Gly Ser Phe 3060 3065 3070 Ile Leu 
Arg Asp Leu Ser Pro Phe Thr Ile Tyr Asp Ile Gln Val Glu 3075 3080 3085 Val Cys Thr Ile Tyr Ala Cys Val Lys Ser Asn Gly Thr Gln Ile Thr 3090 3095 3100 Thr Val Glu Asp Thr Pro Ser Asp Ile Pro Thr Pro Thr Ile Arg Gly 3105 3110 3115 3120 Ile Thr Ser Arg Ser Leu Gln Ile Asp Trp Val Ser Pro Arg Lys Pro 3125 3130 3135 Asn Gly Ile Ile Leu Gly Tyr Asp Leu Leu Trp Lys Thr Trp Tyr Pro 3140 3145 3150 Cys Ala Lys Thr Gln Lys Leu Val Gln Asp Gln Ser Asp Glu Leu Cys 3155 3160 3165 Lys Ala Val Arg Cys Gln Lys Pro Glu Ser Ile Cys Gly His Ile Cys 3170 3175 3180 Tyr Ser Ser Glu Ala Lys Val Cys Cys Asn Gly Val Leu Tyr Asn Pro 3185 3190 3195 3200 Lys Pro Gly His Arg Cys Cys Glu Glu Lys Tyr Ile Pro Phe Val Leu 3205 3210 3215 Asn Ser Thr Gly Val Cys Cys Gly Gly Arg Ile Gln Glu Ala Gln Pro 3220 3225 3230 Asn His Gln Cys Cys Ser Gly Tyr Tyr Ala Arg Ile Leu Pro Gly Glu 3235 3240 3245 Val Cys Cys Pro Asp Glu Gln His Asn Arg Val Ser Val Gly Ile Gly 3250 3255 3260 Asp Ser Cys Cys Gly Arg Met Pro Tyr Ser Thr Ser Gly Asn Gln Ile 3265 3270 3275 3280 Cys Cys Ala Gly Arg Leu His Asp Gly His Gly Gln Lys Cys Cys Gly 3285 3290 3295 Arg Gln Ile Val Ser Asn Asp Leu Glu Cys Cys Gly Gly Glu Glu Gly 3300 3305 3310 Val Val Tyr Asn Arg Leu Pro Gly Met Phe Cys Cys Gly Gln Asp Tyr 3315 3320 3325 Val Asn Met Ser Asp Thr Ile Cys Cys Ser Ala Ser Ser Gly Glu Ser 3330 3335 3340 Lys Ala His Ile Lys Lys Asn Asp Pro Val Pro Val Lys Cys Cys Glu 3345 3350 3355 3360 Thr Glu Leu Ile Pro Lys Ser Gln Lys Cys Cys Asn Gly Val Gly Tyr 3365 3370 3375 Asn Pro Leu Lys Tyr Val Cys Ser Asp Lys Ile Ser Thr Gly Met Met 3380 3385 3390 Met Lys Glu Thr Lys Glu Cys Arg Ile Leu Cys Pro Ala Ser Met Glu 3395 3400 3405 Ala Thr Glu His Cys Gly Arg Cys Asp Phe Asn Phe Thr Ser His Ile 3410 3415 3420 Cys Thr Val Ile Arg Gly Ser His Asn Ser Thr Gly Lys Ala Ser Ile 3425 3430 3435 3440 Glu Glu Met Cys Ser Ser Ala Glu Glu Thr Ile His Thr Gly Ser Val 3445 3450 3455 Asn Thr Tyr Ser Tyr Thr Asp Val Asn Leu Lys Pro Tyr Met Thr Tyr 3460 3465 3470 Glu Tyr 
Arg Ile Ser Ala Trp Asn Ser Tyr Gly Arg Gly Leu Ser Lys 3475 3480 3485 Ala Val Arg Ala Arg Thr Lys Glu Asp Val Pro Gln Gly Val Ser Pro 3490 3495 3500 Pro Thr Trp Thr Lys Ile Asp Asn Leu Glu Asp Thr Ile Val Leu Asn 3505 3510 3515 3520 Trp Arg Lys Pro Ile Gln Ser Asn Gly Pro Ile Ile Tyr Tyr Ile Leu 3525 3530 3535 Leu Arg Asn Gly Ile Glu Arg Phe Arg Gly Thr Ser Leu Ser Phe Ser 3540 3545 3550 Asp Lys Glu Gly Ile Gln Pro Phe Gln Glu Tyr Ser Tyr Gln Leu Lys 3555 3560 3565 Ala Cys Thr Val Ala Gly Cys Ala Thr Ser Ser Lys Val Val Ala Ala 3570 3575 3580 Thr Thr Gln Gly Val Pro Glu Ser Ile Leu Pro Pro Ser Ile Thr Ala 3585 3590 3595 3600 Leu Ser Ala Val Ala Leu His Leu Ser Trp Ser Val Pro Glu Lys Ser 3605 3610 3615 Asn Gly Val Ile Lys Glu Tyr Gln Ile Arg Gln Val Gly Lys Gly Leu 3620 3625 3630 Ile His Thr Asp Thr Thr Asp Arg Arg Gln His Thr Val Thr Gly Leu 3635 3640 3645 Gln Pro Tyr Thr Asn Tyr Ser Phe Thr Leu Thr Ala Cys Thr Ser Ala 3650 3655 3660 Gly Cys Thr Ser Ser Glu Pro Phe Leu Gly Gln Thr Leu Gln Ala Ala 3665 3670 3675 3680 Pro Glu Gly Val Trp Val Thr Pro Arg His Ile Ile Ile Asn Ser Thr 3685 3690 3695 Thr Val Glu Leu Tyr Trp Ser Leu Pro Glu Lys Pro Asn Gly Leu Val 3700 3705 3710 Ser Gln Tyr Gln Leu Ser Arg Asn Gly Asn Leu Leu Phe Leu Gly Gly 3715 3720 3725 Ser Glu Glu Gln Asn Phe Thr Asp Lys Asn Leu Glu Pro Asn Ser Arg 3730 3735 3740 Tyr Thr Tyr Lys Leu Glu Val Lys Thr Gly Gly Gly Ser Ser Ala Ser 3745 3750 3755 3760 Asp Asp Tyr Ile Val Gln Thr Pro Met Ser Thr Pro Glu Glu Ile Tyr 3765 3770 3775 Pro Pro Tyr Asn Ile Thr Val Ile Gly Pro Tyr Ser Ile Phe Val Ala 3780 3785 3790 Trp Ile Pro Pro Gly Ile Leu Ile Pro Glu Ile Pro Val Glu Tyr Asn 3795 3800 3805 Val Leu Leu Asn Asp Gly Ser Val Thr Pro Leu Ala Phe Ser Val Gly 3810 3815 3820 His His Gln Ser Thr Leu Leu Glu Asn Leu Thr Pro Phe Thr Gln Tyr 3825 3830 3835 3840 Glu Ile Arg Ile Gln Ala Cys Gln Asn Gly Ser Cys Gly Val Ser Ser 3845 3850 3855 Arg Met Phe Val Lys Thr Pro Glu Ala Ala Pro Met Asp Leu Asn Ser 3860 3865 3870 Pro Val 
Leu Lys Ala Leu Gly Ser Ala Cys Ile Glu Ile Lys Trp Met 3875 3880 3885 Pro Pro Glu Lys Pro Asn Gly Ile Ile Ile Asn Tyr Phe Ile Tyr Arg 3890 3895 3900 Arg Pro Ala Gly Ile Glu Glu Glu Ser Val Leu Phe Val Trp Ser Glu 3905 3910 3915 3920 Gly Ala Leu Glu Phe Met Asp Glu Gly Asp Thr Leu Arg Pro Phe Thr 3925 3930 3935 Leu Tyr Glu Tyr Arg Val Arg Ala Cys Asn Ser Lys Gly Ser Val Glu 3940 3945 3950 Ser Leu Trp Ser Leu Thr Gln Thr Leu Glu Ala Pro Pro Gln Asp Phe 3955 3960 3965 Pro Ala Pro Trp Ala Gln Ala Thr Ser Ala His Ser Val Leu Leu Asn 3970 3975 3980 Trp Thr Lys Pro Glu Ser Pro Asn Gly Ile Ile Ser His Tyr Arg Val 3985 3990 3995 4000 Val Tyr Gln Glu Arg Pro Asp Asp Pro Thr Phe Asn Ser Pro Thr Val 4005 4010 4015 His Ala Phe Thr Val Lys Gly Thr Ser His Gln Ala His Leu Tyr Gly 4020 4025 4030 Leu Glu Pro Phe Thr Thr Tyr Arg Ile Gly Val Val Ala Ala Asn His 4035 4040 4045 Ala Gly Glu Ile Leu Ser Pro Trp Thr Leu Ile Gln Thr Leu Glu Ser 4050 4055 4060 Ser Pro Ser Gly Leu Arg Asn Phe Ile Val Glu Gln Lys Glu Asn Gly 4065 4070 4075 4080 Arg Ala Leu Leu Leu Gln Trp Ser Glu Pro Met Arg Thr Asn Gly Val 4085 4090 4095 Ile Lys Thr Tyr Asn Ile Phe Ser Asp Gly Phe Leu Glu Tyr Ser Gly 4100 4105 4110 Leu Asn Arg Gln Phe Leu Phe Arg Arg Leu Asp Pro Phe Thr Leu Tyr 4115 4120 4125 Thr Leu Thr Leu Glu Ala Cys Thr Arg Ala Gly Cys Ala His Ser Ala 4130 4135 4140 Pro Gln Pro Leu Trp Thr Asp Glu Ala Pro Pro Asp Ser Gln Leu Ala 4145 4150 4155 4160 Pro Thr Val His Ser Val Lys Ser Thr Ser Val Glu Leu Ser Trp Ser 4165 4170 4175 Glu Pro Val Asn Pro Asn Gly Lys Ile Ile Arg Tyr Glu Val Ile Arg 4180 4185 4190 Arg Cys Phe Glu Gly Lys Ala Trp Gly Asn Gln Thr Ile Gln Ala Asp 4195 4200 4205 Glu Lys Ile Val Phe Thr Glu Tyr Asn Thr Glu Arg Asn Thr Phe Met 4210 4215 4220 Tyr Asn Asp Thr Gly Leu Gln Pro Trp Thr Gln Cys Glu Tyr Lys Ile 4225 4230 4235 4240 Tyr Thr Trp Asn Ser Ala Gly His Thr Cys Ser Ser Trp Asn Val Val 4245 4250 4255 Arg Thr Leu Gln Ala Pro Pro Glu Gly Leu Ser Pro Pro Val Ile Ser 4260 4265 4270 Tyr Val 
Ser Met Asn Pro Gln Lys Leu Leu Ile Ser Trp Ile Pro Pro 4275 4280 4285 Glu Gln Ser Asn Gly Ile Ile Gln Ser Tyr Arg Leu Gln Arg Asn Glu 4290 4295 4300 Met Leu Tyr Pro Phe Ser Phe Asp Pro Val Thr Phe Asn Tyr Thr Asp 4305 4310 4315 4320 Glu Glu Leu Leu Pro Phe Ser Thr Tyr Ser Tyr Ala Leu Gln Ala Cys 4325 4330 4335 Thr Ser Gly Gly Cys Ser Thr Ser Lys Pro Thr Ser Ile Thr Thr Leu 4340 4345 4350 Glu Ala Ala Pro Ser Glu Val Ser Pro Pro Asp Leu Trp Ala Val Ser 4355 4360 4365 Ala Thr Gln Met Asn Val Cys Trp Ser Pro Pro Thr Val Gln Asn Gly 4370 4375 4380 Lys Ile Thr Lys Tyr Leu Val Arg Tyr Asp Asn Lys Glu Ser Leu Ala 4385 4390 4395 4400 Gly Gln Gly Leu Cys Leu Leu Val Ser His Leu Lys Pro Tyr Ser Gln 4405 4410 4415 Tyr Asn Phe Ser Leu Val Ala Cys Thr Asn Gly Gly Cys Thr Ala Ser 4420 4425 4430 Val Ser Lys Ser Ala Trp Thr Met Glu Ala Leu Pro Glu Asn Met Asp 4435 4440 4445 Ser Pro Thr Leu Gln Val Thr Gly Ser Glu Ser Ile Glu Ile Thr Trp 4450 4455 4460 Lys Pro Pro Arg Asn Pro Asn Gly Gln Ile Arg Ser Tyr Glu Leu Arg 4465 4470 4475 4480 Arg Asp Gly Thr Ile Val Tyr Thr Gly Leu Glu Thr Arg Tyr Arg Asp 4485 4490 4495 Phe Thr Leu Thr Pro Gly Val Glu Tyr Ser Tyr Thr Val Thr Ala Ser 4500 4505 4510 Asn Ser Gln Gly Gly Ile Leu Ser Pro Leu Val Lys Asp Arg Thr Ser 4515 4520 4525 Pro Ser Ala Pro Ser Gly Met Glu Pro Pro Lys Leu Gln Ala Arg Gly 4530 4535 4540 Pro Gln Glu Ile Leu Val Asn Trp Asp Pro Pro Val Arg Thr Asn Gly 4545 4550 4555 4560 Asp Ile Ile Asn Tyr Thr Leu Phe Ile Arg Glu Leu Phe Glu Arg Glu 4565 4570 4575 Thr Lys Ile Ile His Ile Asn Thr Thr His Asn Ser Phe Gly Met Gln 4580 4585 4590 Ser Tyr Ile Val Asn Gln Leu Lys Pro Phe His Arg Tyr Glu Ile Arg 4595 4600 4605 Ile Gln Ala Cys Thr Thr Leu Gly Cys Ala Ser Ser Asp Trp Thr Phe 4610 4615 4620 Ile Gln Thr Pro Glu Ile Ala Pro Leu Met Gln Pro Pro Pro His Leu 4625 4630 4635 4640 Glu Val Gln Met Ala Pro Gly Gly Phe Gln Pro Thr Val Ser Leu Leu 4645 4650 4655 Trp Thr Gly Pro Leu Gln Pro Asn Gly Lys Val Leu Tyr Tyr Glu Leu 4660 4665 4670 Tyr Arg 
Arg Gln Ile Ala Thr Gln Pro Arg Lys Ser Asn Pro Val Leu 4675 4680 4685 Ile Tyr Asn Gly Ser Ser Thr Ser Phe Ile Asp Ser Glu Leu Leu Pro 4690 4695 4700 Phe Thr Glu Tyr Glu Tyr Gln Val Trp Ala Val Asn Ser Ala Gly Lys 4705 4710 4715 4720 Ala Pro Ser Ser Trp Thr Trp Cys Arg Thr Gly Pro Ala Pro Pro Glu 4725 4730 4735 Gly Leu Arg Ala Pro Thr Phe His Val Ile Ser Ser Thr Gln Ala Val 4740 4745 4750 Val Asn Ile Ser Ala Pro Gly Lys Pro Asn Gly Ile Val Ser Leu Tyr 4755 4760 4765 Arg Leu Phe Ser Ser Ser Ala His Gly Ala Glu Thr Val Leu Ser Glu 4770 4775 4780 Gly Met Ala Thr Gln Gln Thr Leu His Gly Leu Gln Ala Phe Thr Asn 4785 4790 4795 4800 Tyr Ser Ile Gly Val Glu Ala Cys Thr Cys Phe Asn Cys Cys Ser Lys 4805 4810 4815 Gly Pro Thr Ala Glu Leu Arg Thr His Pro Ala Pro Pro Ser Gly Leu 4820 4825 4830 Ser Ser Pro Gln Ile Gly Thr Leu Ala Ser Arg Thr Ala Ser Phe Arg 4835 4840 4845 Trp Ser Pro Pro Met Phe Pro Asn Gly Val Ile His Ser Tyr Glu Leu 4850 4855 4860 Gln Phe His Val Ala Cys Pro Pro Asp Ser Ala Leu Pro Cys Thr Pro 4865 4870 4875 4880 Ser Gln Ile Glu Thr Lys Tyr Thr Gly Leu Gly Gln Lys Ala Ser Leu 4885 4890 4895 Gly Gly Leu Gln Pro Tyr Thr Thr Tyr Lys Leu Arg Val Val Ala His 4900 4905 4910 Asn Glu Val Gly Ser Thr Ala Ser Glu Trp Ile Ser Phe Thr Thr Gln 4915 4920 4925 Lys Glu Leu Pro Gln Tyr Arg Ala Pro Phe Ser Val Asp Ser Asn Leu 4930 4935 4940 Ser Val Val Cys Val Asn Trp Ser Asp Thr Phe Leu Leu Asn Gly Gln 4945 4950 4955 4960 Leu Lys Glu Tyr Val Leu Thr Asp Gly Gly Arg Arg Val Tyr Ser Gly 4965 4970 4975 Leu Asp Thr Thr Leu Tyr Ile Pro Arg Thr Ala Asp Lys Thr Phe Phe 4980 4985 4990 Phe Gln Val Ile Cys Thr Thr Asp Glu Gly Ser Val Lys Thr Pro Leu 4995 5000 5005 Ile Gln Tyr Asp Thr Ser Thr Gly Leu Gly Leu Val Leu Thr Thr Pro 5010 5015 5020 Gly Lys Lys Lys Gly Ser Arg Ser Lys Ser Thr Glu Phe Tyr Ser Glu 5025 5030 5035 5040 Leu Trp Phe Ile Val Leu Met Ala Met Leu Gly Leu Ile Leu Leu Ala 5045 5050 5055 Ile Phe Leu Ser Leu Ile Leu Gln Arg Lys Ile His Lys Glu Pro Tyr 5060 5065 5070 Ile Arg 
Glu Arg Pro Pro Leu Val Pro Leu Gln Lys Arg Met Ser Pro 5075 5080 5085 Leu Asn Val Tyr Pro Pro Gly Glu Asn His Met Gly Leu Ala Asp Thr 5090 5095 5100 Lys Ile Pro Arg Ser Gly Thr Pro Val Ser Ile Arg Ser Asn Arg Ser 5105 5110 5115 5120 Ala Cys Val Leu Arg Ile Pro Ser Gln Asn Gln Thr Ser Leu Thr Tyr 5125 5130 5135 Ser Gln Gly Ser Leu His Arg Ser Val Ser Gln Leu Met Asp Ile Gln 5140 5145 5150 Asp Lys Lys Val Leu Met Asp Asn Ser Leu Trp Glu Ala Ile Met Gly 5155 5160 5165 His Asn Ser Gly Leu Tyr Val Asp Glu Glu Asp Leu Met Asn Ala Ile 5170 5175 5180 Lys Asp Phe Ser Ser Val Thr Lys Glu Arg Thr Thr Phe Thr Asp Thr 5185 5190 5195 5200 His Leu <210> 10 <211> 25983 <212> DNA <213> Artificial Sequence <220> <223> USH2A <400> 10 tgtttgctct gcagaatact ttacctgggc acccaagtct tccttccagc attcctgctg 60 ctacagccta tttgctgagt aaccaggggt tacagcagcg ttgccaggca acgagggaca 120 gcggtcctgt tgaagagcca tttgtcacac tgaggggact ggttgaaatg caataaagaa 180 atgnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 240 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnataccag cagctactca 300 tgtcttcgcc attgctaaga acgtcgttgg tattacctta ctctgagaac gtgtctgcag 360 tttccagaaa atggagtatc gcaacatcac ttaaagtacc ctgcttcaaa gtattgctgg 420 caagtggcgt gggcctgatt atttatttag aaatgcttta tcaggaggag aatgcttttt 480 tgtaaacatg aattgcccag ttctttcatt gggctctggc ttcttgtttc aggtcattga 540 aatgttgatc tttgcctatt ttgcttcaat atccttgact gagtcacgag gtcttttccc 600 aaggctggag aacgtgggag ctttcaagaa agtttccatc gtgccaaccc aagcagtatg 660 tggactccca gaccgaagca ctttttgtca cagctctgct gctgctgaaa gtattcagtt 720 ctgtacccag cggttttgta ttcaggattg cccatacaga tcttcacacc ctacctacac 780 tgcccttttc tcagcaggcc tcagtagctg catcacacca gacaagaatg atctgcatcc 840 taacgcccat agcaattctg caagttttat ttttggaaat cacaagagct gcttttcttc 900 tcctccttct ccaaagctga tggcatcatt taccttagct gtatggctga aacctgagca 960 acaaggtgta atnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 1020 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nngtgtgtta 1080 tagaaaagac 
agtagatggg cagattgtgt tcaaacttac aatatctgag aaagagacca 1140 tgttttatta tcgcacagta aatggtttgc aacctccaat aaaagtaatg acactgggga 1200 gaattcttgt gaagaaatgg attcatctta gtgtgcagnn nnnnnnnnnn nnnnnnnnnn 1260 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 1320 nnnnnnnnnn nnnnnnnngt gcatcagaca aaaatcagct tctttatcaa tggcgtggag 1380 aaggatcata cacctttcaa tgcaagaact ctaagtggtt caattacaga ttttgcatct 1440 ggtactgtgc aaataggaca gagtttaaat gnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 1500 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 1560 nnnnnnnnnn ngtttagagc agtttgtcgg aagaatgcaa gattttcgat tataccaagt 1620 ggcacttaca aacagnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 1680 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnagaga 1740 ttctggaagt cttctctgga gatcttctca gattgcatgc ccaatcacat tgccgttgcc 1800 ctggcagcca cccgcgggtc caccctttgg cacagcggta ctgcattcct aatgatgcag 1860 gagacacagc tgataataga gtgtcacggt tgaatcctga agcccatcct ctctcttttg 1920 tcaatgataa tgatgttggt acttcatggg tttcaaatgt gtttacaaac attacacagc 1980 ttaatcaagg agtgactatt tcagttgatt tggaaaatgg acagtatcag nnnnnnnnnn 2040 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 2100 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn gtgttttata ttatcattca gttctttagt 2160 ccacaaccaa cggaaataag gattcaaagg aagaaggaaa atagtttaga ttgggaggac 2220 tggcaatatt ttgccaggaa ttgtggtgct tttggaatga aaaacaatgg agatttggaa 2280 aaacctgatt ctgtcaactg tcttcagctt tccaannnnn nnnnnnnnnn nnnnnnnnnn 2340 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 2400 nnnnnnnnnn nnnnntttta ctccatattc ccgtggcaat gtcacattta gcatcctgac 2460 acctggacca aattatcgtc ctggatacaa taacttctat aataccccat ctcttcaaga 2520 gttcgtaaaa gccacgcaaa taaggtttca ttttcatggg cagtactata caactgagac 2580 tgctgttaac ctcagacaca gatattatgc agtggacgaa atcaccatta gtgggagnnn 2640 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 2700 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnatg tcagtgccat ggtcatgccg 2760 ataactgcga cacaacaagc 
cagccatata gatgcctctg ctcccaggag agcttcactg 2820 aaggacttca tnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 2880 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn ntgtgatcgc 2940 tgcttgcctc tttataatga caagcctttc cgccaaggtg atcaagttta cgctttcaat 3000 tgtaaacctt gtcaatgcaa cagccattcc aaaagctgcc attacaacat ctctgtagac 3060 ccatttcctt ttgagcactt cagaggggga ggaggagttt gtgatgattg tgagcataac 3120 actacagnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 3180 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnngaa ggaactgtga 3240 gctgtgcaag gattactttt tccgacaagt tggtgcagat ccttcggcca tagatgtttg 3300 caaaccctgt gactgtgata cagttggcac tagaaatggt agcattcttt gtgatcagnn 3360 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 3420 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnat tggaggacag tgtaattgta 3480 agagacacgt gtctggcagg cagtgcaatc agtgccagaa tggattctac aatctacaag 3540 agttggatcc tgatggctgc agtccctgta actgcaatac ctctgggaca gtggatggag 3600 atattacctg tcaccaaaat tcaggccagt gcaagtgcaa agcaaacgtt attgnnnnnn 3660 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 3720 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnggctta ggtgtgatca ttgcaatttt 3780 ggatttaaat ttctccgaag ctttaatgat gttggatgtg agccctgcca gtgtaacctc 3840 catggctcag tgaacaaatt ctgcaatcct cactctgggc agtgtgagtg caaaaaagaa 3900 gccaaaggac ttcagtgtga cacctgcaga gaaaactttt atgggttaga tgtcaccaat 3960 tgtaaggcct gtgactgtga cacagctgga tccctccctg ggactgtctg taatgctaag 4020 acagggcagt gcatctgcaa gcccaatgtt gaagggagac agtgcaataa atgtttggag 4080 ggaaacttct acctacggca aaataattct ttcctctgtc tgccttgcaa ctgtgataag 4140 actgggacaa taaatggctc tctgctgtgt aacaaatcaa caggacaatg tccttgcaaa 4200 ttaggggtaa caggtcttcg ctgtaatcag tgtgagcctc acaggtacaa tttgaccatt 4260 gacaattttc aacactgcca gatgtgtgag tgtgattcct tggggacatt acctgggacc 4320 atttgtgacc caatcagtgg ccagtgcctg tgtgtgccta atcgtcaagg aagaaggtgt 4380 aatcagtgtc aaccagnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 4440 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 
nnnnnnnnnn nnnnnnnnnn nnnnnngttt 4500 ttatatttct ccaggcaatg ccactggctg cctgccatgc tcatgccata caactggtgc 4560 agttaatcac atctgtaata gcctgactgg tcagtgtgtt tgccaagatg cttccattgc 4620 tgggcaacgt tgtgaccaat gcaaagacca ttactttgga tttgatcctc agactggaag 4680 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 4740 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn atgtcagcct tgtaattgtc 4800 atctctcagg agccttgaat gaaacctgtc acttggtcac aggccagtgt ttctgtaaac 4860 aatttgtcac tggctcaaag tgtgatgctt gtgttcccag tgcaagccac ttggatgtca 4920 acaatctatt gggttgcagc aaaannnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 4980 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 5040 nnnnctccat tccagcaacc tccgcccaga ggacaagttc aaagttcttc tgctatcaat 5100 ctctcctgga gtccacctga ttctccaaat gcccactggc ttacttacag tttactcagg 5160 gatggttttg aaatctacac aacagaggat caatacccat acannnnnnn nnnnnnnnnn 5220 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 5280 nnnnnnnnnn nnnnnnnnnn nnngtattca atacttctta gacacagacc tgttaccata 5340 taccaaatat tcctattaca ttgagaccac caatgtgcat ggttcaacaa ggagtgtagc 5400 tgtcacttac aagacaaaac caggggtccc agagggaaac ttgactttaa gttatatcat 5460 tcctattggc tcagactctg tgacacttac ctggacaaca ctctcaaatc aatctggtcc 5520 catagagaaa tatattttgt cctgtgcccc tttggctggt ggtcagccat gtgtttccta 5580 cgaaggtcat gaaacctcag ctaccatctg gaatctggtt ccatttgcca agtacgattt 5640 ttctgtacag gcgtgtacta gcgggggctg tttacacagc ttgcccatta cagtgaccac 5700 agcccaggcc cctccccaaa gactaagtcc acctaagatg cagaaaatca gttctacaga 5760 acttcatgta gaatggtctc caccagcgga actaaatgnn nnnnnnnnnn nnnnnnnnnn 5820 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 5880 nnnnnnnnnn nnnnnnnnga ataattataa gatatgaact atacatgaga agactgagat 5940 ctactaaaga aaccacatct gaggaaagtc gagtttttca gagcagtggt tggctcagtc 6000 ctcattcatt tgtagaatcg gccaatgaaa atgcattaaa acctcctcaa acaatgacaa 6060 ccatcactgg cttggagcca tacaccaagt atgagttcag agtcttagct gtgaatatgg 6120 ctggaagtgt gtcttctgcc tgggtctcag aaagaacggg 
agaatcagnn nnnnnnnnnn 6180 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 6240 nnnnnnnnnn nnnnnnnnnn nnnnnnnnca cctgtattca tgatccctcc ttcagtcttt 6300 cccctctctt cgtactctct caatatctcc tgggagaagc cagcagataa tgttacaaga 6360 ggaaaagttg tggggtatga catcaatatg ctttctgaac aatcacctca acagtctatt 6420 cccatggcgt tttcacagnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 6480 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnct 6540 gttgcacact gctaaatccc aagaactatc ttacactgta gaaggactga aaccttatag 6600 gatatatgag tttactatta ctctctgcaa ttcagttggt tgtgtgacca gtgcttcggg 6660 agcaggacaa actttagcag cagnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 6720 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 6780 nnncaccagc acaactgagg ccacctctgg ttaaaggaat caacagcaca acaatccatc 6840 ttaagtggtt tccacctgaa gaactgaatg gaccctctcc tatatatcag ctggaaagga 6900 gagagtcatc tctaccagct ctgatgacca cgatgatgaa aggaatccgt ttcataggaa 6960 atgggtattg taaatttccc agctccactc acccagtcaa tacagacttc actgnnnnnn 7020 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 7080 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnngcatta aggccagctt tcgaacaaaa 7140 gtgcctgaag gtttgattgt ctttgcagca tcacctggca atcaggaaga gtattttgca 7200 cttcagttga agaagggacg tctttatttt ctttttgatc ctcagnnnnn nnnnnnnnnn 7260 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 7320 nnnnnnnnnn nnnnnnnnnn nnnnngggtc accagtggaa gtaactacaa ctaatgatca 7380 tggcaaacaa tatagtgatg gaaaatggca tgaaataatt gctattaggc atcaggcttt 7440 tggccaaatc actctggatg ggatatatac agnnnnnnnn nnnnnnnnnn nnnnnnnnnn 7500 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 7560 nnnnnnnnnn nngttcctct gccatcctga atggtagtac tgttattgga gataacacag 7620 gagtctttct gggagggctc ccgcgaagtt ataccatcct caggaaggat cctgnnnnnn 7680 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 7740 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnagataa tccaaaaagg ttttgtgggc 7800 tgtctcaagg atgtacattt tatgaagaat tacaatccgt cagctatttg 
ggaacctctg 7860 gattggcaga gttctgaaga acaaatcaac gtgtataaca gctgggaggg atgtcccgct 7920 tcattaaatg agggagctca gttcctagga gcagnnnnnn nnnnnnnnnn nnnnnnnnnn 7980 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 8040 nnnnnnnnnn nnnnggttcc tggaacttca tccatatatg tttcatggtg gaatgaactt 8100 tgagatttcc tttaagttca gaactgacca attaaatgga ttgcttcttt tcgtttataa 8160 caaagatgga cctgattttc ttgctnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 8220 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 8280 nnnnnatgga gctgaaaagt ggaatattga ccttccggtt aaataccagt cttgccttta 8340 cacaagtgga tctattgctg gggctatcct attgtaatgg aaagtggaat aaagtcatta 8400 ttaaaaagga aggctctttc atatcagcaa gtgtgaatgg actgatgaag catgcatcgg 8460 agtccggaga ccagccactg gtggtgaatt caccagttta tgtgggagga atcccacagg 8520 aactgctgaa ctcttatcaa catttgtgtt tggaacaagn nnnnnnnnnn nnnnnnnnnn 8580 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 8640 nnnnnnnnnn nnnnnnnnng tttcggtggt tgcatgaagg atgttaaatt tacacggggt 8700 gctgtcgtta acttggcatc tgtgtccagc ggtgctgtca gagtcaatct ggatggatgc 8760 ctatcaactg acagtgctgt taactgcagg ggaaatgact ccatcctggt ttaccaggga 8820 aaagagcaga gtgtttacga gggtggtctc cagcctttta cagnnnnnnn nnnnnnnnnn 8880 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 8940 nnnnnnnnnn nnnnnnnnnn nnnaatacct gtatcgagtg atagcctcgc atgaaggagg 9000 ttcagtatat agtgattgga gtcgaggacg tacaacagga gcagnnnnnn nnnnnnnnnn 9060 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 9120 nnnnnnnnnn nnnnnnnnnn nnnnctccac aaagtgtgcc aactccctca agagtccgca 9180 gcttaaatgg atacagcatt gaggtgacct gggatgaacc tgttgtcaga ggtgtaattg 9240 agaagtacat tctgaaagcc tatagtgagg acagcacccg tccaccccgc atgccctctg 9300 ccagtgctga atttgtcaat acaagcaacc tcacagnnnn nnnnnnnnnn nnnnnnnnnn 9360 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 9420 nnnnnnnnnn nnnnnngcat attgacaggc ttgctaccct tcaaaaacta tgcagtaacc 9480 ctaactgctt gcactttggc tggctgtact gagagctcac atgcattgaa catctctact 
9540 ccacaagaag nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 9600 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn ccccacaaga 9660 ggttcagcca ccagtagcca aatcccttcc cagttctttg ctgctctcct ggaacccacc 9720 caaaaaggca aatggtatta taactcagta ctgtttatac atggatggga ggctgatcta 9780 ttcaggcagt gaggagaact acacagtcac agnnnnnnnn nnnnnnnnnn nnnnnnnnnn 9840 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 9900 nnnnnnnnnn nnatttagca gtatttacac cccaccagtt tctactaagt gcatgcacac 9960 atgtgggctg tacaaacagt tcctgggtcc tactgtacac agcacagctg ccaccagaac 10020 acgtggattc cccagttctg actgtcctgg attctagaac tatacacata cannnnnnnn 10080 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 10140 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nngtggaaac aaccaagaaa aataagtggg 10200 attctggaac gctatgtatt atatatgtca aaccatacac atgattttac aatttggagt 10260 gtcatctata acagtacaga acttttccag gatcatatgc tacaatacgt tttacctggt 10320 aataaatatc tcatcaagct gggannnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 10380 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 10440 nnnngcttgc acaggtggtg ggtgcacagt gagtgaggcc agtgaggccc taactgacga 10500 ggacataccc gaaggcgtgc cagcccccaa agcccactca tattcacctg actcctttaa 10560 tgtctcctgg actgagcctg aatatccgaa tgnnnnnnnn nnnnnnnnnn nnnnnnnnnn 10620 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 10680 nnnnnnnnnn nngtgttatc acgagttatg gattatatct agatggtata ttaatccaca 10740 attcctcaga actcagctat cgtgcttacg gatttgctcc ttggagttta cattccttca 10800 gagtccaagc atgcacggcc aaaggttgtg ctctgggccc actgnnnnnn nnnnnnnnnn 10860 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 10920 nnnnnnnnnn nnnnnnnnnn nnnngtggaa aatcgaactc tagaagctcc tcctgaagga 10980 acagtaaatg tgtttgtcaa aacacaggga tcccggaaag cccacgtgag gtgggaagca 11040 ccttttcgcc ctaatggact cttaacacac tcagtccttt tcactgggat attctatgta 11100 gacccagnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 11160 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnntag 
gtaataacta 11220 cacccttctg aatgtcacaa aagtcatgta cagcggagaa gagacaaacc tttgggtgct 11280 catcgatggg ctggttcctt ttaccaacta tactgtacaa gtgaatattt caaatagcca 11340 aggcagcttg ataactgatc ctataacaat tgcaatgcct ccaggagnnn nnnnnnnnnn 11400 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 11460 nnnnnnnnnn nnnnnnnnnn nnnnnnnctc cagatggcgt gctgcctccc aggctttcat 11520 ctgccactcc aaccagtctt caggttgtct ggtctacacc agctcgtaat aacgctcctg 11580 gctctcccag ataccaactc cagatgaggt ctggcgactc cacccatgga tttctagann 11640 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 11700 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnngt tattttccaa tccttctgca 11760 tcgttaagct atgaagtgag tgatctccaa ccgtacacag agtatatgtt tcggttggtt 11820 gcctccaatg gatttggcag tgcacatagt tcttggattc cattcatgac cgcagaggac 11880 annnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 11940 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn naacctggac ctgtagttcc 12000 tccgattctt ctggatgtga agtcaagaat gatgttggtc acctggcagc atcctagaaa 12060 atccaatggg gttattaccc attataacat ttatctacat ggccgtctat acttgagaac 12120 tcctggaaat gtcactaatt gcacagtgat gcatttacac ccatacactg cctataagtt 12180 tcaggtagaa gcctgcactt caaaaggatg ttccctttca ccagagtccc agactgtatg 12240 gacactccca ggggcaccgg aagggatccc aagtccagag ctgttctctg atactccaac 12300 atctgtgatt atatcttggc aaccccctac ccaccccaat ggcttggtgg agaatttcac 12360 aattgagaga agagtcaaag gaaaggaaga agttactacc ctggtgactc tcccgaggag 12420 tcattccatg aggtttattg acaagacttc tgctcttagc ccatggacaa aatatgaata 12480 tcgggtactg atgagcactc ttcatggagg cacaaacagc agtgcttggg tagaagttac 12540 cacaagaccc tcacgacctg ctggggtgca gccacctgtg gtgacagtgc tggaacccga 12600 tgcagtccag nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 12660 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn gtcacttgga 12720 aacccccact catccagaac ggagacatac ttagctatga gattcacatg cctgaccctc 12780 acatcacttt aaccaatgtg acttccgcag tgttaagtca aaaagttact catctgattc 12840 ctttcactaa ttattctgtc accattgttg 
cttgctcagg gggtaatggg taccttggag 12900 ggtgcacaga gagtttacct acctatgtta ccactcaccc caccgtacct cagaatgttg 12960 gcccattgtc tgtgattcca ctaagtgaat catatgttgt gatttcttgg caaccaccat 13020 ccaagccaaa tggacctaat ttgagnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 13080 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 13140 nnnnnatatg agcttctgag acgtaaaatc cagcagccac ttgcatcaaa tcccccagaa 13200 gatttaaatc ggtggcacaa tatttattca ggaactcagt ggctttatga agataagggt 13260 cttagcagnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 13320 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnngt ttacaaccta 13380 tgaatatatg ctcttcgtac acaacagtgt gggttttaca ccgagccgag aagtgactgt 13440 gacaacgtta gctggtcttc cagagagagg agccaatctc actgcgagtg tccttaacca 13500 cacagccatc gacgtgaggt gggctaaacc aannnnnnnn nnnnnnnnnn nnnnnnnnnn 13560 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 13620 nnnnnnnnnn nnctgttcaa gacctacaag gtgaagttga atattacaca cttttttgga 13680 gttctgctac ctcaaacgac tctctaaaaa tcttgccaga tgtaaactct catgtcattg 13740 gccacctaaa gccaaacaca gagtattgga tctttatctc tgtcttcaat ggagtccaca 13800 gcatcaacag tgcaggactt catgcaacca cttgcgatgg ggnnnnnnnn nnnnnnnnnn 13860 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 13920 nnnnnnnnnn nnnnnnnnnn nnagcctcag ggcatgcttc ctccagaggt tgtcatcatc 13980 aacagtacag ctgtacgtgt catctggaca tctccttcaa acccaaatgg tgttgtcact 14040 gagtattcta tctatgtaaa taataagctc tacaagactg gaatgaatgt gcctgggtcg 14100 tttattctga gagacctgtc tcccttcact atctatgaca ttcagnnnnn nnnnnnnnnn 14160 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 14220 nnnnnnnnnn nnnnnnnnnn nnnnngttga agtctgcaca atatatgcct gcgtgaaaag 14280 caatggaacc caaattacca ctgtggaaga cactccaagt gatataccaa cacccacaat 14340 tcgtggcatc acttcaagnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 14400 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnat 14460 ctcttcaaat tgattgggtg tctccacgga agccaaatgg catcattctt ggatatgatc 14520 tcctatggaa 
aacatggtat ccatgcgcta aaactcaaaa gttagtgcag gatcagagtg 14580 atgagctctg caaggcagtg aggtgtcaaa aacctgaatc tatctgtgga cacatttgct 14640 attcttctga agctaagnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 14700 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnngtt 14760 tgttgtaacg gagtgctcta taaccccaag cctggacatc gctgttgtga agaaaagtat 14820 atcccgtttg ttctgaattc tactggagtt tgttgtggtg gccgaataca ggaggcacaa 14880 ccaaatcatc agtgctgctc tgggtattac gctagaattc taccagnnnn nnnnnnnnnn 14940 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 15000 nnnnnnnnnn nnnnnnnnnn nnnnnngtga agtatgctgt ccagatgaac agcacaatcg 15060 ggtttctgtt ggcattggtg attcctgctg tggcagaatg ccgtactcca cctcaggaaa 15120 ccagatttgc tgtgctggga ggcttcatga tggccatggc cagaagtgct gtggcagaca 15180 gattgtgagc aacgatttag agtgttgtgg tggagaagaa ggagtggtgt acaatcgcct 15240 tccagnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 15300 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnngtatg ttctgttgtg 15360 ggcaggatta tgtgaatatg tcagatacca tatgctgctc agcttccagt ggagagtcta 15420 aagcacatat taaaaagaat gacccggtgc cagtaaaatg ctgtgagact gaacttattc 15480 caaagagcca gaaatgctgt aatggagttg gatataatcc tttgaaatat gtttgctctg 15540 acaagatttc aactggaatg atgatgaagn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 15600 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 15660 nnnnnnnnng aaaccaaaga gtgcaggatc ctctgcccag catctatgga agccacagaa 15720 cattgtggca ggtgtgactt caactttacc agccacattt gcactgtgat aagagggtct 15780 cacaattcca cagggaaggc atcaattgaa gaaatgtgtt catctgccga agaaaccatt 15840 catacaggga gtgtaaacac gtactcttac acagnnnnnn nnnnnnnnnn nnnnnnnnnn 15900 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 15960 nnnnnnnnnn nnnnatgtga acctcaagcc ctacatgaca tatgagtaca ggatttctgc 16020 ctggaacagc tatgggcgag gactcagcaa agctgtgaga gccagaacaa aagaagatgt 16080 gcctcaagga gtgagtcccc ctacgtggac caaaatagac aatcttgaag atacaattgt 16140 cttaaactgg agaaaaccta tacaatcaaa tgnnnnnnnn nnnnnnnnnn nnnnnnnnnn 
16200 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 16260 nnnnnnnnnn nngtcctatt atttactaca tccttcttcg aaatggaatt gaacgttttc 16320 ggggaacatc actgagcttc tctgataaag agggaattca accatttcag gaatattcat 16380 atcagctgaa agcttgcacg gttgctggct gtgccaccag tagcaagnnn nnnnnnnnnn 16440 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 16500 nnnnnnnnnn nnnnnnnnnn nnnnnnngta gttgcagcta ctacccaagg agttccggag 16560 agcatcctgc caccaagcat cacagcccta agtgcagtgg ctctgcatct gagctggagt 16620 gtccctgaga aatcaaacgg cgtcattaaa gagtaccaga tcaggcaggt tgggaaaggt 16680 ctcatccaca ctgacaccac tgacaggaga cagcatacgg tcacagnnnn nnnnnnnnnn 16740 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 16800 nnnnnnnnnn nnnnnnnnnn nnnnnngtct ccagccatac accaactaca gcttcactct 16860 tacagcttgt acatctgctg ggtgcacttc aagcgagcct tttctaggtc agacactgca 16920 ggcagctcct gaagnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 16980 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnngagttt 17040 gggtgacacc tcgacacatt atcatcaatt ctacaacagt ggaattatat tggagtctgc 17100 cagaaaagcc caatggcctc gtttctcaat atcaattgag tcgtaatgga aacttgcttt 17160 tcctgggtgg cagtgaggag cagaatttca ctgataaaaa cctggagccc aatagcagnn 17220 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 17280 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnat acacttacaa gttagaagtc 17340 aaaactggag gtggcagcag tgctagtgat gattacattg ttcaaacacc tatgtcaaca 17400 ccagaagaaa tctatcctcc atataatatc acagtaattg ggccttattc tatatttgta 17460 gcttggatac caccagnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 17520 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnggat 17580 cctcatcccc gaaattcctg tggagtacaa tgtcttactc aatgatggaa gtgtaacacc 17640 tctggccttc tccgttggtc atcatcaatc cacccttctg gaaaatttga ctccattcac 17700 acagtatgag ataaggatac aagcatgtca aaatgnnnnn nnnnnnnnnn nnnnnnnnnn 17760 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 17820 nnnnnnnnnn nnnnngaagt tgtggagtta gcagtaggat 
gtttgtcaaa acacctgaag 17880 cagccccaat ggatcttaat tctcctgttc ttaaggcact ggggtcagct tgcatagaga 17940 ttaagtggat gccacctgaa aaaccaaatg gaatcatcat caactacttt atttacagnn 18000 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 18060 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnac gccctgctgg cattgaagag 18120 gagtctgttt tatttgtctg gtcagaagga gcccttgaat ttatggatga aggagacacc 18180 ctgaggcctt tcacactcta cgaatatcgg gtcagagcct gtaactccaa gggttcagtg 18240 gagagtctgt ggtcattaac acaaactctg gaagctccac ctcaagattt tccagctcct 18300 tgggctcaag ccacgagtgc tcattcagtt ctgttgaatt ggacaaagcc agaatctccc 18360 aatggcatta tctcccatta ccgtgtggtc taccaggaga gacccgacga tcctacattt 18420 aacagcccta ccgtgcatgc tttcacagtg aagnnnnnnn nnnnnnnnnn nnnnnnnnnn 18480 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 18540 nnnnnnnnnn nnnggaacaa gccatcaagc ccacctgtac gggttagaac cattcacaac 18600 atatcgcatt ggtgttgtgg ctgcaaacca tgcaggagaa attttaagcc cttggactct 18660 gattcaaacc ttagaatctt ccccaagtgg actgagaaac tttatagtag aacagaaaga 18720 gaatggccgg gcattgctac tacagtggtc agaacctatg agaaccaatg gtgtgattaa 18780 gnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 18840 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nacatacaac atcttcagtg 18900 acgggttcct ggagtactct ggtttgaatc gtcagtttct cttccgccgc ctggatcctt 18960 tcactctcta cacactgacc ctggaggcct gcaccagagc aggttgtgca cactcggcgc 19020 ctcagcctct gtggacagat gaagcccctc cagactctca gctggctcct actgtccact 19080 ctgtgaagtc caccagtgtt gagctgagct ggtctgagcc tgttaaccca aatggaaaaa 19140 taattcgcta tgaagtgatt cgcagatgct tcgagggaaa agcttgggga aatcagacga 19200 tccaggccga cgagaaaatt gttttcacag aatataacac tgaaaggaat acatttatgt 19260 ataatgacac aggtttgcaa ccatggacgc agtgtgaata taaaatctac acttggaatt 19320 cagctgggca tacctgtagc tcttggaatg tggtgaggac attgcaagca cctccagaag 19380 gtctctctcc acctgtgata tcctatgttt ctatgaatcc ccaaaaactg ctgatttcct 19440 ggatcccacc agaacagtct aatggtatta tccagtccta taggcttcaa aggaatgaaa 19500 tgctctatcc ttttagcttt 
gatcctgtga ctttcaatta cactgatgaa gagcttcttc 19560 ctttttccac ctatagctat gcactccaag cctgcacgag tggaggatgc tccaccagca 19620 aacccaccag catcacaact ctggaggctg ctccatcaga agtcagccct ccagatcttt 19680 gggccgtcag tgccactcaa atgaatgtat gttggtcacc gcccacagtg caaaatggaa 19740 agattactaa atatttagtt agatatgata ataaagagtc ccttgctggc cagggcctgt 19800 gcctgctggt ttcccacctg aagccttact ctcagtataa cttctccctt gtagcctgca 19860 cgaatggagg ttgcacagct agtgtgtcaa aatctgcctg gacaatggag gccctgccag 19920 agaacatgga ctctccaaca ttgcaagtca caggctcaga atcaatagaa atcacctgga 19980 aacctccaag aaacccaaat ggccagatca gaagttatga acttaggagg gatggaacca 20040 ttgtatatac aggcttggaa acacgctatc gtgattttac tctcacccca ggtgtggagt 20100 atagctacac agtaactgcc agcaacagcc aagggggtat tttgagtcct cttgtcaaag 20160 atcgaaccag cccctcagca ccctcaggga tggaacctcc aaaattgcag gccaggggtc 20220 ctcaggagat cttagtgaac tgggaccctc cagtgagaac aaatggtgat atcatcaatt 20280 ataccctctt catccgtgaa ctatttgaaa gagaaactaa aatcatacac ataaacacaa 20340 ctcataattc ttttggtatg cagtcatata tagtaaacca gctgaagcca tttcacagnn 20400 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 20460 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnngt atgaaatacg aattcaagcg 20520 tgcaccaccc tgggatgtgc atcaagtgac tggacattca tacagacccc tgagattgca 20580 cctttgatgc aaccccctcc acatctggag gtacaaatgg ctccaggagg attccagcca 20640 actgtttctc ttttgtggac aggaccgctg cagccaaatg gaaaagtttt gtattacgaa 20700 ttatacagaa gacaaatagc aactcagcct agaaaatcca atccagtcct aatctataac 20760 ggaagctcaa catcttttat agattccgaa ctattgcctt tcacagagta tgagtatcag 20820 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 20880 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn gtctgggcag tgaattctgc 20940 aggaaaagcc cccagtagct ggacatggtg cagaaccggg ccagccccac cagaaggtct 21000 cagagccccc acgttccatg tgatctcttc tacccaagca gtggtcaaca tcagtgcccc 21060 tgggaagccc aacgggatcg tcagtctcta caggctgttc tccagcagcg cccatggggc 21120 tgagacagtg nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 21180 
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn ctatccgaag 21240 gcatggccac ccagcagact ctccatggcc ttcaagcctt cactaactac tctattggag 21300 tagaggcctg cacctgcttc aactgttgca gcaaaggacc gacagctgaa ctgagaaccc 21360 atcctgcccc accctcagga ctgtcctctc cacaaatcgg gacgctggcc tcaaggacgg 21420 cctccttccg gtggagtccc cccatgttcc ccaatggtgt cattcacagn nnnnnnnnnn 21480 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 21540 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnc tatgaactcc aattccacgt ggcttgccct 21600 cctgactcag ccctcccctg tactcccagc caaatagaaa caaagtacac ggggctgggg 21660 cagaaagcca gccttggggg tctccagccc tacaccacat acaagctgag agtggtggca 21720 cacaacgagg tgggcagtac ggcttccgag tggatcagtt tcaccaccca aaaagaatnn 21780 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 21840 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnntg cctcagtacc gagccccatt 21900 ttcggtggac agcaatttgt ctgtggtgtg tgtgaactgg agtgacacct tcctcctgaa 21960 cggccaactg aaggagtacg tgttaaccga cggagggcga cgcgtgtaca gcggcttgga 22020 caccaccctc tacataccga gaacggcgga caaaannnnn nnnnnnnnnn nnnnnnnnnn 22080 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 22140 nnnnnnnnnn nnnnnccttc tttttccagg tcatctgcac gactgacgaa ggaagtgtta 22200 agacgccgtt gatccaatat gatacctcta ctggacttgn nnnnnnnnnn nnnnnnnnnn 22260 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 22320 nnnnnnnnnn nnnnnnnnng cttggtccta acaactcctg ggaaaaagaa gggatcgcgg 22380 agcaaaagca cagagttcta cagcgagctg tggttcatag tgttaatggc gatgctgggc 22440 ttgatcttgt tggccatttt tctgtccctg atactacaaa gaaaaatcca caaagagcca 22500 tatatcagag aaagacctcc cttggtacct cttcagaaga ggatgtctcc attgaatgtt 22560 tacccaccgg gggaaaacca tatgnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 22620 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 22680 nnnngggtta gccgatacca aaattccccg gtctgggaca cctgtgagta tccgcagcaa 22740 ccggagtgca tgtgtcctgc gcatcccgag tcaaaaccaa accagcctaa cctactccca 22800 gggttctctt caccgcagcg tcagccagct catggacatt caagacaaga 
aagtcttgat 22860 ggacaactca ctgtgggaag ccatcatggg ccacaacagt ggactgnnnn nnnnnnnnnn 22920 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 22980 nnnnnnnnnn nnnnnnnnnn nnnnnntatg tggatgaaga ggacctgatg aacgccatca 23040 aggatttcag ctcagtgact aaggaacgca ccacattcac agacacccac ctgtaaagga 23100 tggaaaccca gaagacgtaa ccctggaatg caaggtctgc acccatttcc tcctgggtta 23160 tcactcacac atcataaatg ctgaaaagcc attgtttatt atcctataat tctttaaaga 23220 aatgatgact gtttttgaaa gtgttccttc ctaatagagg tctaagaaat gatatttttc 23280 tcatcttaaa tgagagagaa tattcatatg aaaatacttg atttgctctt attttgtaga 23340 agacaaagaa gtatgtaatt gtcacttggt tctgtttggc agtgatgctc ctggttaact 23400 gaataatcag tggcaatttc aagatggctc acagttgtta gaagtagtaa gttagttact 23460 ggctcaaaaa tgattctgtt gaaaggatgt cactgctgtt catttctatc tgccatttct 23520 gtcagggttg acacaatcct gcaagaatag ttattctaat gatcacagct gctaaatgaa 23580 tcccaaactt tgcaccaggt cgacaaactt ttctgaaggt tctatttatt taccatacat 23640 agggttactt accaaacttt ttgacaaggc tgaaggttct atttatttac aatacatagg 23700 gttactcacc aaactttttg acaaggcaac acataactta cacataaatg tctctgttct 23760 tgcatttatg aattttccaa aaatctaagg agtaaacagc ttatttatac attttgagga 23820 gaaaacaaag tgtttcacta ggaacacctc tacttgaacc aatgttttta tttcatatat 23880 tttatagttt tgaaactagt ttctcataaa attctgtcaa ttcactgaat atcagagaat 23940 actgacatct tcaacctagc acatttcaaa tggaaactac tgttctattt gcaatattag 24000 gctgcgtgaa attttaaaag gaaaaatgta tctgttcctt ctagcattaa catatataca 24060 tgtagagaca agactatacc tatgtgtata tatatgtata tcatgtatat attactctgc 24120 actatatccc ttctttttgg agaactagcc attattttag ccacagaatc agtaagaaca 24180 gatgatatgc aacagtacca attacggttc aaaaatgtct gtcacctgct ctagttggat 24240 tacaaagtca ttggtgaaag tcctatggca agaaaaattt tcttgcaaat catccacata 24300 aaatcagata tttaaatttg ttcttcatgg aaaacagagt aagaaaacct cttgtcttcc 24360 ttcatcctta aaggtctttg tgaccccagg aaaatattga ctctgtctaa cacacaatag 24420 tcacaatact ttttgtgaat ctacaaccag agacaggcaa aaacttgtaa agtaagggat 24480 agtcttactt attctgcctg aaaacaatgt 
attaccccag ggcccaacag taaaagattg 24540 tggacttttt gggtattgag atttcatcta gctctgtgag agagcagctc ctcagactga 24600 ccaactccta gacaaagttt gccaaccata agtgtcaaaa gcacaggcca gtattaagca 24660 gaagttctac caccttatta gaactgctat aaacaaaagc atctgaaata attgtgcaca 24720 tctggcagtg actgtagaaa atacgaaata tatatttctc gccaagtttt tatactttct 24780 gaaatgaaaa cataggattg actagtttac tggtttttat tcccatatgc cgattctggg 24840 acaataaagt tgtttaaagc tggcacaaat aagcattaac caaggctgtg tccaccttct 24900 gtgagctact taaggtatat aggaaaggag tggtcacaaa cttgcatcct aatccttggt 24960 ggactcttct aagaatacag tttgctagtc acaaagaata gtctacaaat atgctttgct 25020 aggttcagaa gattgagttt atcctgattt ttgaaaaatt aaccaggtat ctttatcact 25080 gtgtattttt ccaagcacag tataaaattt taacaacgca caaaaaaata cagaactgca 25140 ggggatttta tcttggatca ttatccattt aatcatctaa ttagacatga actcagttag 25200 ctgaatcatt tacattttga ctccatagct tagggcagac agaagcctgt atggcttctg 25260 cccagaactc tgtcccctgc tacatgtcta agtttacttg tatttatttc agagaagaac 25320 tctaagatgt tgctttgcta ctttaagtgg tattgcgtgc caagcctcta ttatacaaac 25380 catgcagact cgcctctaga gattctgatt cggttgatct ggggtgtgtg gctgaggcat 25440 cagtactttt taaagcttcc aggtgttcta atgttgagac ccactgatgt tccacaatct 25500 ggaagaaatc atgtacagga ataatatgct atgcacaggg actatgctcc ttggctcacc 25560 ccttctccct tataaacaat gagcagttct tgatgaacct ctttaaattt aaatctcctg 25620 actcacattt taccaattgt acatgccaca ttctcagctt acgaactacc atgttttgtt 25680 attcttaata tcaactgttt ggtaagagta cagttgtttt tatacactct aagaaatgtg 25740 tttataatct actgtaattt ccactaaatg gaacccaaat attaatgtta tggtaccata 25800 tactgatgta aaaatcatgc tggcatccat gaacacaccg gtaaataaaa catagtccaa 25860 gtggaagaat tcattaataa ggaactttta attatgtcac aaatgaatag ttggtttcca 25920 atgcacaaat atcatgtaaa ctaatctaaa gatggtttgc ttaataaata tttgaatgtg 25980 acc 25983 <210> 11 <211> 805 <212> PRT <213> Artificial Sequence <220> <223> CEP290 <400> 11 Met Pro Pro Asn Ile Asn Trp Lys Glu Ile Met Lys Val Asp Pro Asp 1 5 10 15 Asp Leu Pro Arg Gln Glu Glu Leu Ala Asp Asn Leu Leu Ile Ser Leu 
20 25 30 Ser Lys Val Glu Val Asn Glu Leu Lys Ser Glu Lys Gln Glu Asn Val 35 40 45 Ile His Leu Phe Arg Ile Thr Gln Ser Leu Met Lys Met Lys Ala Gln 50 55 60 Glu Val Glu Leu Ala Leu Glu Glu Val Glu Lys Ala Gly Glu Glu Gln 65 70 75 80 Ala Lys Phe Glu Asn Gln Leu Lys Thr Lys Val Met Lys Leu Glu Asn 85 90 95 Glu Leu Glu Met Ala Gln Gln Ser Ala Gly Gly Arg Asp Thr Arg Phe 100 105 110 Leu Arg Asn Glu Ile Cys Gln Leu Glu Lys Gln Leu Glu Gln Lys Asp 115 120 125 Arg Glu Leu Glu Asp Met Glu Lys Glu Leu Glu Lys Glu Lys Lys Val 130 135 140 Asn Glu Gln Leu Ala Leu Arg Asn Glu Glu Ala Glu Asn Glu Asn Ser 145 150 155 160 Lys Leu Arg Arg Glu Asn Lys Arg Leu Lys Lys Lys Asn Glu Gln Leu 165 170 175 Cys Gln Asp Ile Ile Asp Tyr Gln Lys Gln Ile Asp Ser Gln Lys Glu 180 185 190 Thr Leu Leu Ser Arg Arg Gly Glu Asp Ser Asp Tyr Arg Ser Gln Leu 195 200 205 Ser Lys Lys Asn Tyr Glu Leu Ile Gln Tyr Leu Asp Glu Ile Gln Thr 210 215 220 Leu Thr Glu Ala Asn Glu Lys Ile Glu Val Gln Asn Gln Glu Met Arg 225 230 235 240 Lys Asn Leu Glu Glu Ser Val Gln Glu Met Glu Lys Met Thr Asp Glu 245 250 255 Tyr Asn Arg Met Lys Ala Ile Val His Gln Thr Asp Asn Val Ile Asp 260 265 270 Gln Leu Lys Lys Glu Asn Asp His Tyr Gln Leu Gln Val Gln Glu Leu 275 280 285 Thr Asp Leu Leu Lys Ser Lys Asn Glu Glu Asp Asp Pro Ile Met Val 290 295 300 Ala Val Asn Ala Lys Val Glu Glu Trp Lys Leu Ile Leu Ser Ser Lys 305 310 315 320 Asp Asp Glu Ile Ile Glu Tyr Gln Gln Met Leu His Asn Leu Arg Glu 325 330 335 Lys Leu Lys Asn Ala Gln Leu Asp Ala Asp Lys Ser Asn Val Met Ala 340 345 350 Leu Gln Gln Gly Ile Gln Glu Arg Asp Ser Gln Ile Lys Met Leu Thr 355 360 365 Glu Gln Val Glu Gln Tyr Thr Lys Glu Met Glu Lys Asn Thr Cys Ile 370 375 380 Ile Glu Asp Leu Lys Asn Glu Leu Gln Arg Asn Lys Gly Ala Ser Thr 385 390 395 400 Leu Ser Gln Gln Thr His Met Lys Ile Gln Ser Thr Leu Asp Ile Leu 405 410 415 Lys Glu Lys Thr Lys Glu Ala Glu Arg Thr Ala Glu Leu Ala Glu Ala 420 425 430 Asp Ala Arg Glu Lys Asp Lys Glu Leu Val Glu Ala Leu Lys Arg Leu 435 440 445 Lys Asp 
Tyr Glu Ser Gly Val Tyr Gly Leu Glu Asp Ala Val Val Glu 450 455 460 Ile Lys Asn Cys Lys Asn Gln Ile Lys Ile Arg Asp Arg Glu Ile Glu 465 470 475 480 Ile Leu Thr Lys Glu Ile Asn Lys Leu Glu Leu Lys Ile Ser Asp Phe 485 490 495 Leu Asp Glu Asn Glu Ala Leu Arg Glu Arg Val Gly Leu Glu Pro Lys 500 505 510 Thr Met Ile Asp Leu Thr Glu Phe Arg Asn Ser Lys His Leu Lys Gln 515 520 525 Gln Gln Tyr Arg Ala Glu Asn Gln Ile Leu Leu Lys Glu Ile Glu Ser 530 535 540 Leu Glu Glu Glu Arg Leu Asp Leu Lys Lys Lys Ile Arg Gln Met Ala 545 550 555 560 Gln Glu Arg Gly Lys Arg Ser Ala Thr Ser Gly Leu Thr Thr Glu Asp 565 570 575 Leu Asn Leu Thr Glu Asn Ile Ser Gln Gly Asp Arg Ile Ser Glu Arg 580 585 590 Lys Leu Asp Leu Leu Ser Leu Lys Asn Met Ser Glu Ala Gln Ser Lys 595 600 605 Asn Glu Phe Leu Ser Arg Glu Leu Ile Glu Lys Glu Arg Asp Leu Glu 610 615 620 Arg Ser Arg Thr Val Ile Ala Lys Phe Gln Asn Lys Leu Lys Glu Leu 625 630 635 640 Val Glu Glu Asn Lys Gln Leu Glu Glu Gly Met Lys Glu Ile Leu Gln 645 650 655 Ala Ile Lys Glu Met Gln Lys Asp Pro Asp Val Lys Gly Gly Glu Thr 660 665 670 Ser Leu Ile Ile Pro Ser Leu Glu Arg Leu Val Asn Ala Ile Glu Ser 675 680 685 Lys Asn Ala Glu Gly Ile Phe Asp Ala Ser Leu His Leu Lys Ala Gln 690 695 700 Val Asp Gln Leu Thr Gly Arg Asn Glu Glu Leu Arg Gln Glu Leu Arg 705 710 715 720 Glu Ser Arg Lys Glu Ala Ile Asn Tyr Ser Gln Gln Leu Ala Lys Ala 725 730 735 Asn Leu Lys Ile Asp His Leu Glu Lys Glu Thr Ser Leu Leu Arg Gln 740 745 750 Ser Glu Gly Ser Asn Val Val Phe Lys Gly Ile Asp Leu Pro Asp Gly 755 760 765 Ile Ala Pro Ser Ser Ala Ser Ile Ile Asn Ser Gln Asn Glu Tyr Leu 770 775 780 Ile His Leu Leu Gln Glu Leu Glu Asn Lys Glu Lys Lys Val Lys Glu 785 790 795 800 Phe Arg Arg Phe Ser 805 <210> 12 <211> 2846 <212> DNA <213> Artificial Sequence <220> <223> CEP290 <400> 12 ggcccgcggc cgggtccagc ttggtggttg cggtagtgag aggcctccgc tggttgccag 60 gcttggtcta gaggtggagc acagtgaaag aattcaagat gccacctaat ataaactgga 120 aagaaataat gaaagttgac ccagatgacc tgccccgtca agaagaactg gcagataatt 
180 tattgatttc cttatccaag gtggaagtaa atgagctaaa aagtgaaaag caagaaaatg 240 tgatacacct tttcagaatt actcagtcac taatgaagat gaaagctcaa gaagtggagc 300 tggctttgga agaagtagaa aaagctggag aagaacaagc aaaatttgaa aatcaattaa 360 aaactaaagt aatgaaactg gaaaatgaac tggagatggc tcagcagtct gcaggtggac 420 gagatactcg gtttttacgt aatgaaattt gccaacttga aaaacaatta gaacaaaaag 480 atagagaatt ggaggacatg gaaaaggagt tggagaaaga gaagaaagtt aatgagcaat 540 tggctcttcg aaatgaggag gcagaaaatg aaaacagcaa attaagaaga gagaacaaac 600 gtctaaagaa aaagaatgaa caactttgtc aggatattat tgactaccag aaacaaatag 660 attcacagaa agaaacactt ttatcaagaa gaggggaaga cagtgactac cgatcacagt 720 tgtctaaaaa aaactatgag cttatccaat atcttgatga aattcagact ttaacagaag 780 ctaatgagaa aattgaagtt cagaatcaag aaatgagaaa aaatttagaa gagtctgtac 840 aggaaatgga gaagatgact gatgaatata atagaatgaa agctattgtg catcagacag 900 ataatgtaat agatcagtta aaaaaagaaa acgatcatta tcaacttcaa gtgcaggagc 960 ttacagatct tctgaaatca aaaaatgaag aagatgatcc aattatggta gctgtcaatg 1020 caaaagtaga agaatggaag ctaattttgt cttctaaaga tgatgaaatt attgagtatc 1080 agcaaatgtt acataaccta agggagaaac ttaagaatgc tcagcttgat gctgataaaa 1140 gtaatgttat ggctctacag cagggtatac aggaacgaga cagtcaaatt aagatgctca 1200 ccgaacaagt agaacaatat acaaaagaaa tggaaaagaa tacttgtatt attgaagatt 1260 tgaaaaatga gctccaaaga aacaaaggtg cttcaaccct ttctcaacag actcatatga 1320 aaattcagtc aacgttagac attttaaaag agaaaactaa agaggctgag agaacagctg 1380 aactggctga ggctgatgct agggaaaagg ataaagaatt agttgaggct ctgaagaggt 1440 taaaagatta tgaatcggga gtatatggtt tagaagatgc tgtcgttgaa ataaagaatt 1500 gtaaaaacca aattaaaata agagatcgag agattgaaat attaacaaag gaaatcaata 1560 aacttgaatt gaagatcagt gatttccttg atgaaaatga ggcacttaga gagcgtgtgg 1620 gccttgaacc aaagacaatg attgatttaa ctgaatttag aaatagcaaa cacttaaaac 1680 agcagcagta cagagctgaa aaccagattc ttttgaaaga gattgaaagt ctagaggaag 1740 aacgacttga tctgaaaaaa aaaattcgtc aaatggctca agaaagagga aaaagaagtg 1800 caacttcagg attaaccact gaggacctga acctaactga aaacatttct caaggagata 1860 gaataagtga 
aagaaaattg gatttattga gcctcaaaaa tatgagtgaa gcacaatcaa 1920 agaatgaatt tctttcaaga gaactaattg aaaaagaaag agatttagaa aggagtagga 1980 cagtgatagc caaatttcag aataaattaa aagaattagt tgaagaaaat aagcaacttg 2040 aagaaggtat gaaagaaata ttgcaagcaa ttaaggaaat gcagaaagat cctgatgtta 2100 aaggaggaga aacatctcta attatcccta gccttgaaag actagttaat gctatagaat 2160 caaagaatgc agaaggaatc tttgatgcga gtctgcattt gaaagcccaa gttgatcagc 2220 ttaccggaag aaatgaagaa ttaagacagg agctcaggga atctcggaaa gaggctataa 2280 attattcaca gcagttggca aaagctaatt taaagataga ccatcttgaa aaagaaacta 2340 gtcttttacg acaatcagaa ggatcgaatg ttgtttttaa aggaattgac ttacctgatg 2400 ggatagcacc atctagtgcc agtatcatta attctcagaa tgaatattta atacatttgt 2460 tacaggaact agaaaataaa gaaaaaaaag ttaaagaatt tagaagattc tcttgaagat 2520 tacaacagaa aatttgctgt aattcgtcat caacaaagtt tgttgtataa agaataccta 2580 agtgaaaagg agacctggaa aacagaatct aaaacaataa aagaggaaaa gagaaaactt 2640 gaggatcaag tccaacaaga tgctataaaa gtaaaagaat ataataattt gctcaatgct 2700 cttcagatgg attcggatga aatgaaaaaa atacttgcag aaaatagtag gaaaattact 2760 gttttgcaag tgaatgaaaa atcacttata aggcaatata caaccttagt agaattggag 2820 cgacaactta gaaaaaaaaa aaaaaa 2846 <210> 13 <211> 477 <212> PRT <213> Artificial Sequence <220> <223> DRD5 <400> 13 Met Leu Pro Pro Gly Ser Asn Gly Thr Ala Tyr Pro Gly Gln Phe Ala 1 5 10 15 Leu Tyr Gln Gln Leu Ala Gln Gly Asn Ala Val Gly Gly Ser Ala Gly 20 25 30 Ala Pro Pro Leu Gly Pro Ser Gln Val Val Thr Ala Cys Leu Leu Thr 35 40 45 Leu Leu Ile Ile Trp Thr Leu Leu Gly Asn Val Leu Val Cys Ala Ala 50 55 60 Ile Val Arg Ser Arg His Leu Arg Ala Asn Met Thr Asn Val Phe Ile 65 70 75 80 Val Ser Leu Ala Val Ser Asp Leu Phe Val Ala Leu Leu Val Met Pro 85 90 95 Trp Lys Ala Val Ala Glu Val Ala Gly Tyr Trp Pro Phe Gly Ala Phe 100 105 110 Cys Asp Val Trp Val Ala Phe Asp Ile Met Cys Ser Thr Ala Ser Ile 115 120 125 Leu Asn Leu Cys Val Ile Ser Val Asp Arg Tyr Trp Ala Ile Ser Arg 130 135 140 Pro Phe Arg Tyr Lys Arg Lys Met Thr Gln Arg Met Ala Leu Val Met 145 150 155 160 Val Gly 
Leu Ala Trp Thr Leu Ser Ile Leu Ile Ser Phe Ile Pro Val 165 170 175 Gln Leu Asn Trp His Arg Asp Gln Ala Ala Ser Trp Gly Gly Leu Asp 180 185 190 Leu Pro Asn Asn Leu Ala Asn Trp Thr Pro Trp Glu Glu Asp Phe Trp 195 200 205 Glu Pro Asp Val Asn Ala Glu Asn Cys Asp Ser Ser Leu Asn Arg Thr 210 215 220 Tyr Ala Ile Ser Ser Ser Leu Ile Ser Phe Tyr Ile Pro Val Ala Ile 225 230 235 240 Met Ile Val Thr Tyr Thr Arg Ile Tyr Arg Ile Ala Gln Val Gln Ile 245 250 255 Arg Arg Ile Ser Ser Leu Glu Arg Ala Ala Glu His Ala Gln Ser Cys 260 265 270 Arg Ser Ser Ala Ala Cys Ala Pro Asp Thr Ser Leu Arg Ala Ser Ile 275 280 285 Lys Lys Glu Thr Lys Val Leu Lys Thr Leu Ser Val Ile Met Gly Val 290 295 300 Phe Val Cys Cys Trp Leu Pro Phe Phe Ile Leu Asn Cys Met Val Pro 305 310 315 320 Phe Cys Ser Gly His Pro Glu Gly Pro Pro Ala Gly Phe Pro Cys Val 325 330 335 Ser Glu Thr Thr Phe Asp Val Phe Val Trp Phe Gly Trp Ala Asn Ser 340 345 350 Ser Leu Asn Pro Val Ile Tyr Ala Phe Asn Ala Asp Phe Gln Lys Val 355 360 365 Phe Ala Gln Leu Leu Gly Cys Ser His Phe Cys Ser Arg Thr Pro Val 370 375 380 Glu Thr Val Asn Ile Ser Asn Glu Leu Ile Ser Tyr Asn Gln Asp Ile 385 390 395 400 Val Phe His Lys Glu Ile Ala Ala Ala Tyr Ile His Met Met Pro Asn 405 410 415 Ala Val Thr Pro Gly Asn Arg Glu Val Asp Asn Asp Glu Glu Glu Gly 420 425 430 Pro Phe Asp Arg Met Phe Gln Ile Tyr Gln Thr Ser Pro Asp Gly Asp 435 440 445 Pro Val Ala Glu Ser Val Trp Glu Leu Asp Cys Glu Gly Glu Ile Ser 450 455 460 Leu Asp Lys Ile Thr Pro Phe Thr Pro Asn Gly Phe His 465 470 475 <210> 14 <211> 1673 <212> DNA <213> Artificial Sequence <220> <223> DRD5 <400> 14 cccggcgcag ctcatggtga gcgcctctgg ggctcgaggg tcccttggct gagggggcgc 60 atcctcgggg tgcccgatgg ggctgcctgg gggtcgcagg gctgaagttg ggatcgcgca 120 caaaccgacc ctgcagtcca gcccgaaatg ctgccgccag gcagcaacgg caccgcgtac 180 ccggggcagt tcgctctata ccagcagctg gcgcagggga acgccgtggg gggctcggcg 240 ggggcaccgc cactggggcc ctcacaggtg gtcaccgcct gcctgctgac cctactcatc 300 atctggaccc tgctgggcaa cgtgctggtg tgcgcagcca tcgtgcggag 
ccgccacctg 360 cgcgccaaca tgaccaacgt cttcatcgtg tctctggccg tgtctgacct tttcgtggcg 420 ctgctggtca tgccctggaa ggcagtcgcc gaggtggccg gttactggcc ctttggagcg 480 ttctgcgacg tctgggtggc cttcgacatc atgtgctcca ctgcctccat cctgaacctg 540 tgcgtcatca gcgtggaccg ctactgggcc atctccaggc ccttccgcta caagcgcaag 600 atgactcagc gcatggcctt ggtcatggtc ggcctggcat ggaccttgtc catcctcatc 660 tccttcattc cggtccagct caactggcac agggaccagg cggcctcttg gggcgggctg 720 gacctgccaa acaacctggc caactggacg ccctgggagg aggacttttg ggagcccgac 780 gtgaatgcag agaactgtga ctccagcctg aatcgaacct acgccatctc ttcctcgctc 840 atcagcttct acatccccgt tgccatcatg atcgtgacct acacgcgcat ctaccgcatc 900 gcccaggtgc agatccgcag gatttcctcc ctggagaggg ccgcagagca cgcgcagagc 960 tgccggagca gcgcagcctg cgcgcccgac accagcctgc gcgcttccat caagaaggag 1020 accaaggttc tcaagaccct gtcggtgatc atgggggtct tcgtgtgttg ctggctgccc 1080 ttcttcatcc ttaactgcat ggtccctttc tgcagtggac accctgaagg ccctccggcc 1140 ggcttcccct gcgtcagtga gaccaccttc gacgtcttcg tctggttcgg ctgggctaac 1200 tcctcactca accccgtcat ctatgccttc aacgccgact ttcagaaggt gtttgcccag 1260 ctgctggggt gcagccactt ctgctcccgc acgccggtgg agacggtgaa catcagcaat 1320 gagctcatct cctacaacca agacatcgtc ttccacaagg aaatcgcagc tgcctacatc 1380 cacatgatgc ccaacgccgt tacccccggc aaccgggagg tggacaacga cgaggaggag 1440 ggtcctttcg atcgcatgtt ccagatctat cagacgtccc cagatggtga ccctgttgct 1500 gagtctgtct gggagctgga ctgcgagggg gagatttctt tagacaaaat aacacctttc 1560 accccgaatg gattccatta aactgcatta agaaaccccc tcatggatct gcataaccgc 1620 acagacactg acaagcacgc acacacacgc aaatacatgc ctttccagta ctg 1673 <210> 15 <211> 531 <212> PRT <213> Artificial Sequence <220> <223> ASIC3-1 <400> 15 Met Lys Pro Thr Ser Gly Pro Glu Glu Ala Arg Arg Pro Ala Ser Asp 1 5 10 15 Ile Arg Val Phe Ala Ser Asn Cys Ser Met His Gly Leu Gly His Val 20 25 30 Phe Gly Pro Gly Ser Leu Ser Leu Arg Arg Gly Met Trp Ala Ala Ala 35 40 45 Val Val Leu Ser Val Ala Thr Phe Leu Tyr Gln Val Ala Glu Arg Val 50 55 60 Arg Tyr Tyr Arg Glu Phe His His Gln Thr Ala Leu Asp Glu Arg 
Glu 65 70 75 80 Ser His Arg Leu Ile Phe Pro Ala Val Thr Leu Cys Asn Ile Asn Pro 85 90 95 Leu Arg Arg Ser Arg Leu Thr Pro Asn Asp Leu His Trp Ala Gly Ser 100 105 110 Ala Leu Leu Gly Leu Asp Pro Ala Glu His Ala Ala Phe Leu Arg Ala 115 120 125 Leu Gly Arg Pro Pro Ala Pro Pro Gly Phe Met Pro Ser Pro Thr Phe 130 135 140 Asp Met Ala Gln Leu Tyr Ala Arg Ala Gly His Ser Leu Asp Asp Met 145 150 155 160 Leu Leu Asp Cys Arg Phe Arg Gly Gln Pro Cys Gly Pro Glu Asn Phe 165 170 175 Thr Thr Ile Phe Thr Arg Met Gly Lys Cys Tyr Thr Phe Asn Ser Gly 180 185 190 Ala Asp Gly Ala Glu Leu Leu Thr Thr Thr Arg Gly Gly Met Gly Asn 195 200 205 Gly Leu Asp Ile Met Leu Asp Val Gln Gln Glu Glu Tyr Leu Pro Val 210 215 220 Trp Arg Asp Asn Glu Glu Thr Pro Phe Glu Val Gly Ile Arg Val Gln 225 230 235 240 Ile His Ser Gln Glu Glu Pro Pro Ile Ile Asp Gln Leu Gly Leu Gly 245 250 255 Val Ser Pro Gly Tyr Gln Thr Phe Val Ser Cys Gln Gln Gln Gln Leu 260 265 270 Ser Phe Leu Pro Pro Pro Trp Gly Asp Cys Ser Ser Ala Ser Leu Asn 275 280 285 Pro Asn Tyr Glu Pro Glu Pro Ser Asp Pro Leu Gly Ser Pro Ser Pro 290 295 300 Ser Pro Ser Pro Pro Tyr Thr Leu Met Gly Cys Arg Leu Ala Cys Glu 305 310 315 320 Thr Arg Tyr Val Ala Arg Lys Cys Gly Cys Arg Met Val Tyr Met Pro 325 330 335 Gly Asp Val Pro Val Cys Ser Pro Gln Gln Tyr Lys Asn Cys Ala His 340 345 350 Pro Ala Ile Asp Ala Met Leu Arg Lys Asp Ser Cys Ala Cys Pro Asn 355 360 365 Pro Cys Ala Ser Thr Arg Tyr Ala Lys Glu Leu Ser Met Val Arg Ile 370 375 380 Pro Ser Arg Ala Ala Ala Arg Phe Leu Ala Arg Lys Leu Asn Arg Ser 385 390 395 400 Glu Ala Tyr Ile Ala Glu Asn Val Leu Ala Leu Asp Ile Phe Phe Glu 405 410 415 Ala Leu Asn Tyr Glu Thr Val Glu Gln Lys Lys Ala Tyr Glu Met Ser 420 425 430 Glu Leu Leu Gly Asp Ile Gly Gly Gln Met Gly Leu Phe Ile Gly Ala 435 440 445 Ser Leu Leu Thr Ile Leu Glu Ile Leu Asp Tyr Leu Cys Glu Val Phe 450 455 460 Arg Asp Lys Val Leu Gly Tyr Phe Trp Asn Arg Gln His Ser Gln Arg 465 470 475 480 His Ser Ser Thr Asn Leu Leu Gln Glu Gly Leu Gly Ser His Arg Thr 
485 490 495 Gln Val Pro His Leu Ser Leu Gly Pro Arg Pro Pro Thr Pro Pro Cys 500 505 510 Ala Val Thr Lys Thr Leu Ser Ala Ser His Arg Thr Cys Tyr Leu Val 515 520 525 Thr Gln Leu 530 <210> 16 <211> 2053 <212> DNA <213> Artificial Sequence <220> <223> ASIC3-1 <400> 16 ctctgcagca gcgccggctc agcaccgccg gctcagcacc gctccgcagc ccctgcctgc 60 cacggtcagc tacgtcccac ctggtctgct gcggagtccc cagcccagtg cctagcccag 120 tggagccacc gcctgttcct cgggaaggaa cagtgggacc tgaccggcca gatcacctcc 180 tccaatcctg ccaggctagt gcctccctgc cttccaacct tggctgtctc ccaccctctc 240 ttctcctctc cttgcctggc ctcctgaatc ctatcttagc ctccttagcc ccctgactga 300 ctctctctcg cttcttccaa gcctctgtag ctggttccgc tcctgggttc tggccatgaa 360 gcccacctca ggcccagagg aggcccggcg gccagcctcg gacatccgcg tgttcgccag 420 caactgctcg atgcacgggc tgggccacgt cttcgggcca ggcagcctga gcctgcgccg 480 ggggatgtgg gcagcggccg tggtcctgtc agtggccacc ttcctctacc aggtggctga 540 gagggtgcgc tactacaggg agttccacca ccagactgcc ctggatgagc gagaaagcca 600 ccggctcatc ttcccggctg tcaccctgtg caacatcaac ccactgcgcc gctcgcgcct 660 aacgcccaac gacctgcact gggctgggtc tgcgctgctg ggcctggatc ccgcagagca 720 cgccgccttc ctgcgcgccc tgggccggcc ccctgcaccg cccggcttca tgcccagtcc 780 cacctttgac atggcgcaac tctatgcccg tgctgggcac tccctggatg acatgctgct 840 ggactgtcgc ttccgtggcc aaccttgtgg gcctgagaac ttcaccacga tcttcacccg 900 gatgggaaag tgctacacat ttaactctgg cgctgatggg gcagagctgc tcaccactac 960 taggggtggc atgggcaatg ggctggacat catgctggac gtgcagcagg aggaatatct 1020 acctgtgtgg agggacaatg aggagacccc gtttgaggtg gggatccgag tgcagatcca 1080 cagccaggag gagccgccca tcatcgatca gctgggcttg ggggtgtccc cgggctacca 1140 gacctttgtt tcttgccagc agcagcagct gagcttcctg ccaccgccct ggggcgattg 1200 cagttcagca tctctgaacc ccaactatga gccagagccc tctgatcccc taggctcccc 1260 cagccccagc cccagccctc cctataccct tatggggtgt cgcctggcct gcgaaacccg 1320 ctacgtggct cggaagtgcg gctgccgaat ggtgtacatg ccaggcgacg tgccagtgtg 1380 cagcccccag cagtacaaga actgtgccca cccggccata gatgccatgc ttcgcaagga 1440 ctcgtgcgcc tgccccaacc cgtgcgccag cacgcgctac 
gccaaggagc tctccatggt 1500 gcggatcccg agccgcgccg ccgcgcgctt cctggcccgg aagctcaacc gcagcgaggc 1560 ctacatcgcg gagaacgtgc tggccctgga catcttcttt gaggccctca actatgagac 1620 cgtggagcag aagaaggcct atgagatgtc agagctgctt ggtgacattg ggggccagat 1680 ggggctgttc atcggggcca gcctgctcac catcctcgag atcctagact acctctgtga 1740 ggtgttccga gacaaggtcc tgggatattt ctggaaccga cagcactccc aaaggcactc 1800 cagcaccaat ctgcttcagg aagggctggg cagccatcga acccaagttc cccacctcag 1860 cctgggcccc agacctccca cccctccctg tgccgtcacc aagactctct ccgcctccca 1920 ccgcacctgc taccttgtca cacagctcta gacctgctgt ctgtgtcctc ggagccccgc 1980 cctgacatcc tggacatgcc tagcctgcac gtagcttttc cgtcttcacc ccaaataaag 2040 tcctaatgca tca 2053 <210> 17 <211> 549 <212> PRT <213> Artificial Sequence <220> <223> ASIC3-2 <400> 17 Met Lys Pro Thr Ser Gly Pro Glu Glu Ala Arg Arg Pro Ala Ser Asp 1 5 10 15 Ile Arg Val Phe Ala Ser Asn Cys Ser Met His Gly Leu Gly His Val 20 25 30 Phe Gly Pro Gly Ser Leu Ser Leu Arg Arg Gly Met Trp Ala Ala Ala 35 40 45 Val Val Leu Ser Val Ala Thr Phe Leu Tyr Gln Val Ala Glu Arg Val 50 55 60 Arg Tyr Tyr Arg Glu Phe His His Gln Thr Ala Leu Asp Glu Arg Glu 65 70 75 80 Ser His Arg Leu Ile Phe Pro Ala Val Thr Leu Cys Asn Ile Asn Pro 85 90 95 Leu Arg Arg Ser Arg Leu Thr Pro Asn Asp Leu His Trp Ala Gly Ser 100 105 110 Ala Leu Leu Gly Leu Asp Pro Ala Glu His Ala Ala Phe Leu Arg Ala 115 120 125 Leu Gly Arg Pro Pro Ala Pro Pro Gly Phe Met Pro Ser Pro Thr Phe 130 135 140 Asp Met Ala Gln Leu Tyr Ala Arg Ala Gly His Ser Leu Asp Asp Met 145 150 155 160 Leu Leu Asp Cys Arg Phe Arg Gly Gln Pro Cys Gly Pro Glu Asn Phe 165 170 175 Thr Thr Ile Phe Thr Arg Met Gly Lys Cys Tyr Thr Phe Asn Ser Gly 180 185 190 Ala Asp Gly Ala Glu Leu Leu Thr Thr Thr Arg Gly Gly Met Gly Asn 195 200 205 Gly Leu Asp Ile Met Leu Asp Val Gln Gln Glu Glu Tyr Leu Pro Val 210 215 220 Trp Arg Asp Asn Glu Glu Thr Pro Phe Glu Val Gly Ile Arg Val Gln 225 230 235 240 Ile His Ser Gln Glu Glu Pro Pro Ile Ile Asp Gln Leu Gly Leu Gly 245 250 255 Val Ser Pro Gly Tyr 
Gln Thr Phe Val Ser Cys Gln Gln Gln Gln Leu 260 265 270 Ser Phe Leu Pro Pro Pro Trp Gly Asp Cys Ser Ser Ala Ser Leu Asn 275 280 285 Pro Asn Tyr Glu Pro Glu Pro Ser Asp Pro Leu Gly Ser Pro Ser Pro 290 295 300 Ser Pro Ser Pro Pro Tyr Thr Leu Met Gly Cys Arg Leu Ala Cys Glu 305 310 315 320 Thr Arg Tyr Val Ala Arg Lys Cys Gly Cys Arg Met Val Tyr Met Pro 325 330 335 Gly Asp Val Pro Val Cys Ser Pro Gln Gln Tyr Lys Asn Cys Ala His 340 345 350 Pro Ala Ile Asp Ala Met Leu Arg Lys Asp Ser Cys Ala Cys Pro Asn 355 360 365 Pro Cys Ala Ser Thr Arg Tyr Ala Lys Glu Leu Ser Met Val Arg Ile 370 375 380 Pro Ser Arg Ala Ala Ala Arg Phe Leu Ala Arg Lys Leu Asn Arg Ser 385 390 395 400 Glu Ala Tyr Ile Ala Glu Asn Val Leu Ala Leu Asp Ile Phe Phe Glu 405 410 415 Ala Leu Asn Tyr Glu Thr Val Glu Gln Lys Lys Ala Tyr Glu Met Ser 420 425 430 Glu Leu Leu Gly Asp Ile Gly Gly Gln Met Gly Leu Phe Ile Gly Ala 435 440 445 Ser Leu Leu Thr Ile Leu Glu Ile Leu Asp Tyr Leu Cys Glu Val Phe 450 455 460 Arg Asp Lys Val Leu Gly Tyr Phe Trp Asn Arg Gln His Ser Gln Arg 465 470 475 480 His Ser Ser Thr Asn Leu Leu Gln Glu Gly Leu Gly Ser His Arg Thr 485 490 495 Gln Val Pro His Leu Ser Leu Gly Pro Ser Thr Leu Leu Cys Ser Glu 500 505 510 Asp Leu Pro Pro Leu Pro Val Pro Ser Pro Arg Leu Ser Pro Pro Pro 515 520 525 Thr Ala Pro Ala Thr Leu Ser His Ser Ser Arg Pro Ala Val Cys Val 530 535 540 Leu Gly Ala Pro Pro 545 <210> 18 <211> 2314 <212> DNA <213> Artificial Sequence <220> <223> ASIC3-2 <400> 18 tacaggggtt gcaactggga gcctaggggg ccccaaggca tctccaggcc caatctacct 60 ctgggctttt ctcaagctct ccctaggatt actgcggttt cctcctggcg cctctcgtct 120 tggacagcca tgcccccctc catgctgcac taatggctca gcctggggcc ctagggacct 180 ctcctacccc ccagactgct ctgtcggccc cctttccccc ctactgctga aacccaatcc 240 tctgcagcag cgccggctca gcaccgccgg ctcagcaccg ctccgcagcc cctgcctgcc 300 acggtcagct acgtcccacc tggtctgctg cggagtcccc agcccagtgc ctagcccagt 360 ggagccaccg cctgttcctc gggaaggaac agtgggacct gaccggccag atcacctcct 420 ccaatcctgc caggctagtg cctccctgcc 
ttccaacctt ggctgtctcc caccctctct 480 tctcctctcc ttgcctggcc tcctgaatcc tatcttagcc tccttagccc cctgactgac 540 tctctctcgc ttcttccaag cctctgtagc tggttccgct cctgggttct ggccatgaag 600 cccacctcag gcccagagga ggcccggcgg ccagcctcgg acatccgcgt gttcgccagc 660 aactgctcga tgcacgggct gggccacgtc ttcgggccag gcagcctgag cctgcgccgg 720 gggatgtggg cagcggccgt ggtcctgtca gtggccacct tcctctacca ggtggctgag 780 agggtgcgct actacaggga gttccaccac cagactgccc tggatgagcg agaaagccac 840 cggctcatct tcccggctgt caccctgtgc aacatcaacc cactgcgccg ctcgcgccta 900 acgcccaacg acctgcactg ggctgggtct gcgctgctgg gcctggatcc cgcagagcac 960 gccgccttcc tgcgcgccct gggccggccc cctgcaccgc ccggcttcat gcccagtccc 1020 acctttgaca tggcgcaact ctatgcccgt gctgggcact ccctggatga catgctgctg 1080 gactgtcgct tccgtggcca accttgtggg cctgagaact tcaccacgat cttcacccgg 1140 atgggaaagt gctacacatt taactctggc gctgatgggg cagagctgct caccactact 1200 aggggtggca tgggcaatgg gctggacatc atgctggacg tgcagcagga ggaatatcta 1260 cctgtgtgga gggacaatga ggagaccccg tttgaggtgg ggatccgagt gcagatccac 1320 agccaggagg agccgcccat catcgatcag ctgggcttgg gggtgtcccc gggctaccag 1380 acctttgttt cttgccagca gcagcagctg agcttcctgc caccgccctg gggcgattgc 1440 agttcagcat ctctgaaccc caactatgag ccagagccct ctgatcccct aggctccccc 1500 agccccagcc ccagccctcc ctataccctt atggggtgtc gcctggcctg cgaaacccgc 1560 tacgtggctc ggaagtgcgg ctgccgaatg gtgtacatgc caggcgacgt gccagtgtgc 1620 agcccccagc agtacaagaa ctgtgcccac ccggccatag atgccatgct tcgcaaggac 1680 tcgtgcgcct gccccaaccc gtgcgccagc acgcgctacg ccaaggagct ctccatggtg 1740 cggatcccga gccgcgccgc cgcgcgcttc ctggcccgga agctcaaccg cagcgaggcc 1800 tacatcgcgg agaacgtgct ggccctggac atcttctttg aggccctcaa ctatgagacc 1860 gtggagcaga agaaggccta tgagatgtca gagctgcttg gtgacattgg gggccagatg 1920 gggctgttca tcggggccag cctgctcacc atcctcgaga tcctagacta cctctgtgag 1980 gtgttccgag acaaggtcct gggatatttc tggaaccgac agcactccca aaggcactcc 2040 agcaccaatc tgcttcagga agggctgggc agccatcgaa cccaagttcc ccacctcagc 2100 ctgggcccca gcactctgct ctgttccgaa gacctcccac 
ccctccctgt gccgtcacca 2160 agactctctc cgcctcccac cgcacctgct accttgtcac acagctctag acctgctgtc 2220 tgtgtcctcg gagccccgcc ctgacatcct ggacatgcct agcctgcacg tagcttttcc 2280 gtcttcaccc caaataaagt cctaatgcat cagc 2314 <210> 19 <211> 1246 <212> PRT <213> Artificial Sequence <220> <223> TRAPPC9 <400> 19 Met Val Pro Ala Gly Asp Gln Asp Arg Ala Pro His Arg Gly Lys Pro 1 5 10 15 Ala Gln Ala Gly Ala Arg Thr Ser Arg Ala Ser Arg Ala Leu Arg Ser 20 25 30 Trp Arg Arg Ser Gln Ala Ala Arg Ala Thr Val Thr His Pro Arg Gly 35 40 45 Gly His Asp Arg Gly Ser His Gly Gly Tyr Arg Glu Gly His Arg Gly 50 55 60 Cys Arg Arg Asp Pro Gln Trp Ala Ser Ala Gly Pro Pro Pro Leu Ser 65 70 75 80 Phe Thr Glu Glu Val Lys Phe Glu Leu Arg Ala Leu Lys Asp Trp Asp 85 90 95 Phe Lys Met Ser Val Pro Asp Tyr Met Gln Cys Ala Glu Asp His Gln 100 105 110 Thr Leu Leu Val Val Val Gln Pro Val Gly Ile Val Ser Glu Glu Asn 115 120 125 Phe Phe Arg Ile Tyr Lys Arg Ile Cys Ser Val Ser Gln Ile Ser Val 130 135 140 Arg Asp Ser Gln Arg Val Leu Tyr Ile Arg Tyr Arg His His Tyr Pro 145 150 155 160 Pro Glu Asn Asn Glu Trp Gly Asp Phe Gln Thr His Arg Lys Val Val 165 170 175 Gly Leu Ile Thr Ile Thr Asp Cys Phe Ser Ala Lys Asp Trp Pro Gln 180 185 190 Thr Phe Glu Lys Phe His Val Gln Lys Glu Ile Tyr Gly Ser Thr Leu 195 200 205 Tyr Asp Ser Arg Leu Phe Val Phe Gly Leu Gln Gly Glu Ile Val Glu 210 215 220 Gln Pro Arg Thr Asp Val Ala Phe Tyr Pro Asn Tyr Glu Asp Cys Gln 225 230 235 240 Thr Val Glu Lys Arg Ile Glu Asp Phe Ile Glu Ser Leu Phe Ile Val 245 250 255 Leu Glu Ser Lys Arg Leu Asp Arg Ala Thr Asp Lys Ser Gly Asp Lys 260 265 270 Ile Pro Leu Leu Cys Val Pro Phe Glu Lys Lys Asp Phe Val Gly Leu 275 280 285 Asp Thr Asp Ser Arg His Tyr Lys Lys Arg Cys Gln Gly Arg Met Arg 290 295 300 Lys His Val Gly Asp Leu Cys Leu Gln Ala Gly Met Leu Gln Asp Ser 305 310 315 320 Leu Val His Tyr His Met Ser Val Glu Leu Leu Arg Ser Val Asn Asp 325 330 335 Phe Leu Trp Leu Gly Ala Ala Leu Glu Gly Leu Cys Ser Ala Ser Val 340 345 350 Ile Tyr His Tyr Pro Gly Gly Thr 
Gly Gly Lys Ser Gly Ala Arg Arg 355 360 365 Phe Gln Gly Ser Thr Leu Pro Ala Glu Ala Ala Asn Arg His Arg Pro 370 375 380 Gly Ala Gln Glu Val Leu Ile Asp Pro Gly Ala Leu Thr Thr Asn Gly 385 390 395 400 Ile Asn Pro Asp Thr Ser Thr Glu Ile Gly Arg Ala Lys Asn Cys Leu 405 410 415 Ser Pro Glu Asp Ile Ile Asp Lys Tyr Lys Glu Ala Ile Ser Tyr Tyr 420 425 430 Ser Lys Tyr Lys Asn Ala Gly Val Ile Glu Leu Glu Ala Cys Ile Lys 435 440 445 Ala Val Arg Val Leu Ala Ile Gln Lys Arg Ser Met Glu Ala Ser Glu 450 455 460 Phe Leu Gln Asn Ala Val Tyr Ile Asn Leu Arg Gln Leu Ser Glu Glu 465 470 475 480 Glu Lys Ile Gln Arg Tyr Ser Ile Leu Ser Glu Leu Tyr Glu Leu Ile 485 490 495 Gly Phe His Arg Lys Ser Ala Phe Phe Lys Arg Val Ala Ala Met Gln 500 505 510 Cys Val Ala Pro Ser Ile Ala Glu Pro Gly Trp Arg Ala Cys Tyr Lys 515 520 525 Leu Leu Leu Glu Thr Leu Pro Gly Tyr Ser Leu Ser Leu Asp Pro Lys 530 535 540 Asp Phe Ser Arg Gly Thr His Arg Gly Trp Ala Ala Val Gln Met Arg 545 550 555 560 Leu Leu His Glu Leu Val Tyr Ala Ser Arg Arg Met Gly Asn Pro Ala 565 570 575 Leu Ser Val Arg His Leu Ser Phe Leu Leu Gln Thr Met Leu Asp Phe 580 585 590 Leu Ser Asp Gln Glu Lys Lys Asp Val Ala Gln Ser Leu Glu Asn Tyr 595 600 605 Thr Ser Lys Cys Pro Gly Thr Met Glu Pro Ile Ala Leu Pro Gly Gly 610 615 620 Leu Thr Leu Pro Pro Val Pro Phe Thr Lys Leu Pro Ile Val Arg His 625 630 635 640 Val Lys Leu Leu Asn Leu Pro Ala Ser Leu Arg Pro His Lys Met Lys 645 650 655 Ser Leu Leu Gly Gln Asn Val Ser Thr Lys Ser Pro Phe Ile Tyr Ser 660 665 670 Pro Ile Ile Ala His Asn Arg Gly Glu Glu Arg Asn Lys Lys Ile Asp 675 680 685 Phe Gln Trp Val Gln Gly Asp Val Cys Glu Val Gln Leu Met Val Tyr 690 695 700 Asn Pro Met Pro Phe Glu Leu Arg Val Glu Asn Met Gly Leu Leu Thr 705 710 715 720 Ser Gly Val Glu Phe Glu Ser Leu Pro Ala Ala Leu Ser Leu Pro Ala 725 730 735 Glu Ser Gly Leu Tyr Pro Val Thr Leu Val Gly Val Pro Gln Thr Thr 740 745 750 Gly Thr Ile Thr Val Asn Gly Tyr His Thr Thr Val Phe Gly Val Phe 755 760 765 Ser Asp Cys Leu Leu Asp Asn Leu Pro 
Gly Ile Lys Thr Ser Gly Ser 770 775 780 Thr Val Glu Val Ile Pro Ala Leu Pro Arg Leu Gln Ile Ser Thr Ser 785 790 795 800 Leu Pro Arg Ser Ala His Ser Leu Gln Pro Ser Ser Gly Asp Glu Ile 805 810 815 Ser Thr Asn Val Ser Val Gln Leu Tyr Asn Gly Glu Ser Gln Gln Leu 820 825 830 Ile Ile Lys Leu Glu Asn Ile Gly Met Glu Pro Leu Glu Lys Leu Glu 835 840 845 Val Thr Ser Lys Val Leu Thr Thr Lys Glu Lys Leu Tyr Gly Asp Phe 850 855 860 Leu Ser Trp Lys Leu Glu Glu Thr Leu Ala Gln Phe Pro Leu Gln Pro 865 870 875 880 Gly Lys Val Ala Thr Phe Thr Ile Asn Ile Lys Val Lys Leu Asp Phe 885 890 895 Ser Cys Gln Glu Asn Leu Leu Gln Asp Leu Ser Asp Asp Gly Ile Ser 900 905 910 Val Ser Gly Phe Pro Leu Ser Ser Pro Phe Arg Gln Val Val Arg Pro 915 920 925 Arg Val Glu Gly Lys Pro Val Asn Pro Pro Glu Ser Asn Lys Ala Gly 930 935 940 Asp Tyr Ser His Val Lys Thr Leu Glu Ala Val Leu Asn Phe Lys Tyr 945 950 955 960 Ser Gly Gly Pro Gly His Thr Glu Gly Tyr Tyr Arg Asn Leu Ser Leu 965 970 975 Gly Leu His Val Glu Val Glu Pro Ser Val Phe Phe Thr Arg Val Ser 980 985 990 Thr Leu Pro Ala Thr Ser Thr Arg Gln Cys His Leu Leu Leu Asp Val 995 1000 1005 Phe Asn Ser Thr Glu His Glu Leu Thr Val Ser Thr Arg Ser Ser Glu 1010 1015 1020 Ala Leu Ile Leu His Ala Gly Glu Cys Gln Arg Met Ala Ile Gln Val 1025 1030 1035 1040 Asp Lys Phe Asn Phe Glu Ser Phe Pro Glu Ser Pro Gly Glu Lys Gly 1045 1050 1055 Gln Phe Ala Asn Pro Lys Gln Leu Glu Glu Glu Arg Arg Glu Ala Arg 1060 1065 1070 Gly Leu Glu Ile His Ser Lys Leu Gly Ile Cys Trp Arg Ile Pro Ser 1075 1080 1085 Leu Lys Arg Ser Gly Glu Ala Ser Val Glu Gly Leu Leu Asn Gln Leu 1090 1095 1100 Val Leu Glu His Leu Gln Leu Ala Pro Leu Gln Trp Asp Val Leu Val 1105 1110 1115 1120 Asp Gly Gln Pro Cys Asp Arg Glu Ala Val Ala Ala Cys Gln Val Gly 1125 1130 1135 Asp Pro Val Arg Leu Glu Val Arg Leu Thr Asn Arg Ser Pro Arg Ser 1140 1145 1150 Val Gly Pro Phe Ala Leu Thr Val Val Pro Phe Gln Asp His Gln Asn 1155 1160 1165 Gly Val His Asn Tyr Asp Leu His Asp Thr Val Ser Phe Val Gly Ser 1170 1175 1180 
Ser Thr Phe Tyr Leu Asp Ala Val Gln Pro Ser Gly Gln Ser Ala Cys 1185 1190 1195 1200 Leu Gly Ala Leu Leu Phe Leu Tyr Thr Gly Asp Phe Phe Leu His Ile 1205 1210 1215 Arg Phe His Glu Asp Ser Thr Ser Lys Glu Leu Pro Pro Ser Trp Phe 1220 1225 1230 Cys Leu Pro Ser Val His Val Cys Ala Leu Glu Ala Gln Ala 1235 1240 1245 <210> 20 <211> 7099 <212> DNA <213> Artificial Sequence <220> <223> TRAPPC9 <400> 20 aaagtcggga gtgccatggt gccagctggg gatcaagacc gcgcgccaca cagggggaag 60 ccggcccagg ctggggctcg cacctcacgt gcctcccggg ccctgcgatc ctggaggcgc 120 tcccaggccg cgcgcgccac ggtcacccac ccacgtgggg ggcacgaccg tgggagtcac 180 ggggggtacc gtgagggtca cagggggtgc cgcagggatc cacagtgggc ttccgcgggg 240 cctccacccc tgagcttcac agaggaagtg aaatttgagc tgcgcgccct gaaggactgg 300 gacttcaaaa tgagcgtccc tgactacatg cagtgtgctg aggaccacca gacgctgctc 360 gtggtggtcc agcctgtggg catcgtctcc gaggagaact tcttcaggat ctataagagg 420 atttgctctg tgagtcagat cagcgtgcgg gactcccagc gagtcctcta catccgctac 480 aggcaccact acccacccga gaacaacgag tggggtgact tccagaccca ccgcaaagtc 540 gtgggcctca tcaccatcac agactgcttc tcggccaagg actggccaca gacctttgag 600 aagttccacg tgcagaagga gatctacggc tccacactgt atgactcccg gctctttgtc 660 ttcgggctgc agggggagat cgtggagcag ccgcgcaccg acgtggcttt ctaccccaac 720 tacgaggact gccagacggt ggagaagaga atcgaggact tcatcgagtc actgttcatc 780 gtgctggagt ccaagcgtct ggacagagcc acagacaagt ctggggataa gatccccctt 840 ctctgtgtcc cgtttgagaa aaaggacttt gtaggactgg acacagacag cagacattac 900 aagaagcggt gccaaggccg catgcggaag cacgtggggg acctgtgcct gcaggcaggg 960 atgctgcagg actccctggt gcattaccac atgtcggtgg agctgctgcg ttctgtgaat 1020 gactttctgt ggcttggagc tgccctggaa ggattgtgtt cagcttctgt catctatcac 1080 tatcctggtg gaactggtgg gaagagtgga gctcggaggt tccagggcag cacccttcct 1140 gctgaagcag ccaatagaca ccggccaggg gcacaggaag ttctcattga tccaggtgcc 1200 ctcaccacca atggcatcaa ccctgacacc agtactgaga tcggacgtgc taagaactgc 1260 cttagccctg aagacataat tgacaagtat aaagaggcga tttcctatta cagcaagtat 1320 aagaatgcgg gagtgattga gttggaagcg tgcatcaagg ctgtacgtgt 
ccttgcaatt 1380 cagaaacgga gcatggaagc atcagaattt cttcagaatg cagtttacat taaccttcga 1440 cagctttctg aggaagagaa aattcagcgc tacagcatcc tctccgagct ctatgagctg 1500 atcggcttcc atcgcaagtc tgcgttcttc aagcgcgtgg ccgccatgca gtgcgtggcc 1560 ccaagcatcg cggagcctgg gtggagggcc tgctacaaac tcctcctgga aacgctgccc 1620 ggctacagtc tgtcgctgga tcccaaagat ttcagcagag gcacgcacag aggctgggct 1680 gcggtccaga tgcgtttgct ccatgaattg gtctacgcct cccgaaggat ggggaaccct 1740 gccctctctg tcagacacct gtccttcctt ctacagacca tgctggactt cttgtcggat 1800 caggaaaaga aagatgtggc ccaaagccta gagaactata cgtccaagtg tcctgggacc 1860 atggagccca tcgccctccc tggcggcctc accctgccac cggtgccctt caccaagctt 1920 cccatcgtca ggcatgtgaa actattgaac cttcctgcta gcctccggcc acacaaaatg 1980 aaaagcttgc tgggtcagaa cgtgtcaacc aaaagtcctt tcatctattc accaattatc 2040 gcacacaacc gtggagaaga gcggaacaag aaaatagatt tccagtgggt tcaaggagat 2100 gtgtgtgaag ttcagctgat ggtatataac ccaatgccgt ttgaacttcg agttgaaaac 2160 atggggctgc tcaccagcgg agtggagttc gagtctctcc ctgcggcgct ttctcttccg 2220 gctgaatctg gtctgtaccc agtgacgctc gtcggggtcc cgcagacgac tggaacgatt 2280 actgtgaacg gttaccatac cacggtcttc ggtgtgttca gtgactgttt gctggataac 2340 ctgccgggaa taaaaaccag tggctccaca gtggaagtca ttcccgcgtt gccaagactg 2400 cagatcagca cctctctgcc cagatctgca cattcattgc aaccttcttc tggtgatgaa 2460 atatctacta atgtatctgt ccagctttac aatggagaaa gtcagcaact aatcattaaa 2520 ttggaaaata ttggaatgga accattggag aaactggagg tcacctcgaa agttctcacc 2580 actaaagaaa aattgtatgg cgacttcttg agctggaagc tagaggaaac ccttgcccag 2640 ttccctttgc agcctgggaa ggtggccacg ttcacaatca acatcaaagt gaagctggat 2700 ttctcctgcc aggagaatct cctgcaggat ctcagtgatg atggaatcag tgtgagtggc 2760 tttcccctgt ccagtccttt tcggcaggtc gttcggcccc gagtggaggg caaacctgtg 2820 aacccacccg agagcaacaa agcaggcgac tacagccacg tgaagaccct ggaagctgtc 2880 ctgaatttca aatactctgg aggcccgggc cacactgaag gatattacag gaatctctcc 2940 ctggggctgc atgtagaagt cgagccgtct gtatttttca cccgagtcag caccctccca 3000 gcaaccagta cccggcagtg tcacctgctc ctggatgtct tcaactccac cgagcatgag 
3060 ctgaccgtca gcaccaggag cagcgaggca ctcatcctgc acgccggtga gtgccagcga 3120 atggctattc aagtggacaa gttcaacttt gagagtttcc cggagtcccc tggggagaag 3180 gggcaatttg caaaccccaa gcagctggag gaagagcggc gggaagcccg aggcctggag 3240 atccacagca agctgggcat ctgctggaga atcccctccc tgaagcgcag tggcgaggcg 3300 agtgtggaag gactcctgaa ccagctcgtc ctggagcacc tgcagctggc gcctctgcag 3360 tgggatgtgc tggtggacgg acagccatgt gaccgcgagg ctgtggcggc ctgccaggtg 3420 ggcgaccccg tgcgcctgga ggtgcggctg accaaccgga gcccgcgcag cgtagggccc 3480 ttcgccctca ctgtggtccc cttccaggac caccagaacg gcgtgcacaa ctacgacctg 3540 cacgacaccg tctccttcgt gggctccagc accttctacc tcgacgcggt gcagccgtcc 3600 ggccagtcgg cctgcctcgg ggccctcctc ttcctctaca cgggagactt cttcctccac 3660 atccggttcc acgaggacag caccagcaag gagctgccac cctcttggtt ctgcctgccc 3720 agtgtgcacg tgtgtgccct ggaggcgcag gcctgagccc gcctacttcc gtccctcttt 3780 ctgcagggcc agaggtgacc ctgcctggcc tcccacaccc cctgcaatga gcaaggcctt 3840 cactgcagcc ccatctcctc ctcctccccc agacccctcc cagccctctc ctcctgttcc 3900 tcctgtagca tctttgctgg gctacgcaga agccccggac atggcagccc caccccatgc 3960 cacgcccctt cctacactgt tccctggacc atacacaggc tgaagcagag gaaatcccaa 4020 agcgggtgcc catccagccc aggtcccagg atccctgcac ccatttctgt gacctggggc 4080 cccagccgtg ctgtgctgct catcccagca gagggacctc cctcgtccag cgacttccct 4140 ttggccatag aaagaaatgg tgagcatgag actgggcaca gcctgagggc gtgggcagct 4200 tcccaccctc cctgggcctt ggaatccccc aaggctggtt ttcttcctgg agacccccat 4260 gggcaacttg gcaggagaga tggtgccgta ggaggtcgtg gatggttgat gccaagagag 4320 gccctccacc cgtggtgggc aaatgtccag gcctgggctg gcagcccagg gctgtttctg 4380 ggtgctccct ggccccaggg tggcgtctgg ttaccatggc tgtgtgtgtc catgtctgca 4440 agcagttctt caataaatgg cctgcctccc cctccctgcc tccctgcatc tgctagccca 4500 gtgcagtccg gggcccccac ccagcccgtc agcccccacc tcaggtggct ggcttcccag 4560 cagaagccgg acccaggaag ggacgggttc tgagttagag atctcacata agcaaacgct 4620 gagacaggaa tctggtcacc agcagcttgt ctgggaggtg gaaggaggcc tgcaggggag 4680 ggagatcacc agtgaaggag tgtgactggg cccgagctgc cgccacagtg ggcagctggc 4740 
cctgcgttcc tcagggagga ccggagagag aaagcacacg gctcacatct cccttgaggg 4800 ccagagctgg ggtgtgtcta cactacccac aagagtcctg ttctgggggc aggcttctgg 4860 ctgtgttgac ccttagctcc ccaggtccca gggaggataa acagcccttg gtccccagca 4920 gataccaggc tcgtcaggtg cccgttgggc actgaaactc ccatggagta ttgggggaca 4980 gagagcgggg cttcctaaag gccagttggc caaacaggca gagccaggaa gtggctgccc 5040 tggcctccca tggggcagag tcatgttggc atcaggaggc ctgcggggct gagggaactt 5100 cctgaggacc tgaagtccca ggcccaaacc tccctcgctg ggagcaaggt caccctgtgg 5160 cctccggcct aaggaacaaa ttttttgttc catctgtggc ctcctgttgt cgctcagtga 5220 atgatgggag cacactgtga tgtggtgtgg gctgggggca tgtgggggcg ggtgtccagt 5280 ggccctcagt tcctggagct cattcagcat ctgctcgagg ctccccaggg aaggcagccc 5340 cagaaggtct ggctgcagta gggggtggag accccagggc ggtctcttca gcccttccgt 5400 tcaggatgcc tcagcgtagt tgagggcctg gccacctggg ctgtttctga atggaacaaa 5460 cagcatccat gtgtgtcaac ctcaacagag cccaaaaata cgccattgaa tgaaacaaca 5520 cattgcagga tgatgggctc agcccaacct cacttgtgta gacaatgtcc ccccaaatta 5580 taaattttct atgcctcaga acgtgtctat aacaacatgt ttaaaaggta gaaaggatac 5640 acatcccctg tagaagagtg acctgaactg gagttgcatt tgacatctga tgtcagcatg 5700 agggtgctgg tcaccgggga cttgagtacg tactttcctt tcaaaagcac aatagatcaa 5760 gatacaggta cacacatagc tacagatata gatccagcca cagatccaga ttagatatag 5820 atgcatatag aagcctagaa aagccaatat ggaaaactct ggcagtgttt gattctgagt 5880 ggcaagcata ttggtgactg taattatttc ttgcagtttt ctatattttt tagtttccca 5940 aactgcaaaa aatacccaca acagttacca gagaaaatgc tggatgccac aagaaggaag 6000 aacctcaagg ggcaggaggc aaagttgtct tcttggagga gggggtgtta gaaataaccc 6060 aggcctcctg gaaaatgacc caaatacagc ctggctctgc cagccccagg cccttcccca 6120 cccccagggc accctttctg tcccagtaga gaaggtgacc tggagtcagg ccttgtgtgt 6180 gctccaagcg tttggagctt taccgtggga tgtggggagc caggggtgtt gtttgctggc 6240 atttctgtgg ctgggtcttt ctggctgtgg agctgtgttt gtgggcagtg gctgagatgg 6300 aggacctggg gcaggtgtct cagtgacagc gcagtatcct ccagcctctt ccagggcccc 6360 acgctactgg ctacctggca atacagtcca gcagttgctg cttctctagg cccctggtgc 6420 attcagaaac 
ctcctgaagg ccagcggagg gtaagccagg aacagatcat gtccattgca 6480 cttcactagc tgagcgagtt ggttctgccc ttctgatcct tggtccctca gggccaggac 6540 aggcccggtg accccataat catgcagccg tcatgctccg tcatctgaaa ttcacaagga 6600 aggcaaggat tcaaaggaag atgtctgaga tgctgaactc cagctcagcc acttgtcaac 6660 tccttgacct tggacaagtt acttaaatta agccttgatt ttttttcatc tgaataaata 6720 gagataaaaa tacttatttc atggacactt tcatatcagt ttatttgatt taacctccta 6780 gatttttttt tagcacatga tagcaacttc ttaataaata ctgggtccct tctcctagcc 6840 ttcccattgc tgttcttttt aattccatcc acatgtcctg tgcctgaaat atcctgtgcc 6900 atgtgcctga aattccattc ttctcctgga acccttaaca ggccacattg ctatgctccg 6960 ttagtcccct gcatggctta atgtattagg taacatgaag catgtaagga agaactttct 7020 aacacaggtg ctgaaatggg acaggaaatg ctgatggagt gagctctttg tcaatagaag 7080 tattcaagca aaaaaaaaa 7099 <210> 21 <211> 689 <212> PRT <213> Artificial Sequence <220> <223> MST1L <400> 21 Met Ala Pro Ala Pro Val Thr Leu Leu Ala Pro Gly Ala Ala Ser Ser 1 5 10 15 Met Ser Cys Ser Gln Pro Gly Gln Arg Ser Pro Ser Asn Asp Phe Gln 20 25 30 Val Leu Arg Gly Thr Glu Leu Gln His Leu Leu His Ala Val Val Pro 35 40 45 Gly Pro Trp Gln Glu Asp Val Ala Asp Ala Glu Glu Cys Ala Gly Arg 50 55 60 Cys Gly Pro Leu Met Asp Cys Trp Ala Phe His Tyr Asn Val Ser Ser 65 70 75 80 His Gly Cys Gln Leu Leu Pro Trp Thr Gln His Ser Pro His Ser Arg 85 90 95 Leu Trp His Ser Gly Arg Cys Asp Leu Phe Gln Glu Lys Gly Glu Trp 100 105 110 Gly Tyr Met Pro Thr Leu Arg Asn Gly Leu Glu Glu Asn Phe Cys Arg 115 120 125 Asn Pro Asp Gly Asp Pro Gly Gly Pro Trp Cys His Thr Thr Asp Pro 130 135 140 Ala Val Arg Phe Gln Ser Cys Ser Ile Lys Ser Cys Arg Val Ala Ala 145 150 155 160 Cys Val Trp Cys Asn Gly Glu Glu Tyr Arg Gly Ala Val Asp Arg Thr 165 170 175 Glu Ser Gly Arg Glu Cys Gln Arg Trp Asp Leu Gln His Pro His Gln 180 185 190 His Pro Phe Glu Pro Gly Lys Phe Leu Asp Gln Gly Leu Asp Asp Asn 195 200 205 Tyr Cys Arg Asn Pro Asp Gly Ser Glu Arg Pro Trp Cys Tyr Thr Thr 210 215 220 Asp Pro Gln Ile Glu Arg Glu Phe Cys Asp Leu Pro Arg Cys Gly Ser 225 
230 235 240 Glu Ala Gln Pro Arg Gln Glu Ala Thr Ser Val Ser Cys Phe Arg Gly 245 250 255 Lys Gly Glu Gly Tyr Arg Gly Thr Ala Asn Thr Thr Thr Ala Ala Tyr 260 265 270 Leu Ala Ser Val Gly Thr Arg Lys Ser His Ile Ser Thr Asp Leu Arg 275 280 285 Gln Lys Asn Thr Arg Ala Ser Glu Val Gly Gly Gly Ala Gly Val Gly 290 295 300 Thr Cys Cys Cys Gly Asp Leu Arg Glu Asn Phe Cys Trp Asn Leu Asp 305 310 315 320 Gly Ser Glu Ala Pro Trp Cys Phe Thr Leu Arg Pro Gly Thr Arg Val 325 330 335 Gly Phe Cys Tyr Gln Ile Arg Arg Cys Thr Asp Asp Val Arg Pro Gln 340 345 350 Asp Cys Tyr His Gly Ala Gly Glu Gln Tyr Arg Gly Thr Val Ser Lys 355 360 365 Thr Arg Lys Gly Val Gln Cys Gln Arg Trp Ser Ala Glu Thr Pro His 370 375 380 Lys Leu Gln Ala Leu Thr Leu Gly Arg His Ala Leu Met Ser Gly Thr 385 390 395 400 Arg Ala Trp Lys Trp Leu Arg Leu Pro Cys His Asp Phe Ala Pro Ala 405 410 415 Pro Ala Ser Val His Ile Tyr Leu Arg Thr Ala Cys Thr Thr Gly Gly 420 425 430 Glu Leu Leu Pro Asp Pro Asp Gly Asp Ser His Gly Pro Trp Cys Tyr 435 440 445 Thr Met Asp Pro Arg Thr Pro Phe Asp Tyr Cys Ala Leu Arg Arg Cys 450 455 460 Asp Gln Val Gln Phe Glu Lys Cys Gly Lys Arg Val Asp Arg Leu Asp 465 470 475 480 Gln Arg Arg Ser Lys Leu Arg Val Ala Gly Gly His Pro Gly Asn Ser 485 490 495 Pro Trp Thr Val Ser Leu Arg Asn Arg His Met Pro Leu Thr Gly Tyr 500 505 510 Glu Val Trp Leu Gly Thr Leu Phe Gln Asn Pro Gln His Gly Glu Pro 515 520 525 Gly Leu Gln Arg Val Pro Val Ala Lys Met Leu Cys Gly Pro Ser Gly 530 535 540 Ser Gln Leu Val Leu Leu Lys Leu Glu Arg Ser Val Thr Leu Asn Gln 545 550 555 560 Arg Val Ala Leu Ile Cys Leu Pro Pro Glu Trp Tyr Val Val Pro Pro 565 570 575 Gly Thr Lys Cys Glu Ile Ala Gly Trp Gly Glu Thr Lys Gly Thr Gly 580 585 590 Asn Asp Thr Val Leu Asn Val Ala Leu Leu Asn Val Ile Ser Asn Gln 595 600 605 Glu Cys Asn Ile Lys His Arg Gly His Val Arg Glu Ser Glu Met Cys 610 615 620 Thr Glu Gly Leu Leu Ala Pro Val Gly Ala Cys Glu Gly Asp Tyr Gly 625 630 635 640 Gly Pro Leu Ala Cys Phe Thr His Asn Cys Trp Val Leu Lys Gly Ile 645 
650 655 Arg Ile Pro Asn Arg Val Cys Thr Arg Ser Arg Trp Pro Ala Val Phe 660 665 670 Thr Arg Val Ser Val Phe Val Asp Trp Ile His Lys Val Met Arg Leu 675 680 685 Gly <210> 22 <211> 4668 <212> DNA <213> Artificial Sequence <220> <223> MST1L <400> 22 atggcgcctg ccccagtcac cctgctggcc cctggggcag catcctcaat gtcttgcagc 60 cagcccgggc agcgctcgcc atcgaatgac ttccaggtgc tccggggcac agagctacag 120 cacctgctac atgcggtggt gcccgggcct tggcaggagg atgtggcaga tgctgaagag 180 tgtgctggtc gctgtgggcc cttaatggac tgctgggcgt tccactacaa tgtgagcagc 240 catggttgcc aactgctgcc atggactcaa cactcgcccc actcaaggct gtggcattct 300 gggcgctgtg acctcttcca ggagaaaggc gagtgggggt acatgcccac gctccggaat 360 ggcctggaag agaacttctg ccgtaaccct gatggcgacc ccggaggtcc ttggtgccac 420 acaacagacc ctgccgtgcg cttccagagc tgcagcatca aatcctgccg ggtggccgcg 480 tgtgtctggt gcaatggcga ggaataccgc ggcgcggtag accgcaccga gtcagggcgc 540 gagtgccagc gctgggatct tcagcacccg caccagcacc ccttcgagcc gggcaagttc 600 ctcgaccaag gtctggacga caactattgc cggaatcctg acggctccga gcggccatgg 660 tgctacacta cggatccgca gatcgagcga gagttctgtg acctcccccg ctgcgggtcc 720 gaggcacagc cccgccaaga ggccacaagt gtcagctgct tccgcgggaa gggtgagggc 780 taccggggca cagccaatac caccaccgcg gcgtaccttg ccagcgttgg gacgcgcaaa 840 tcccacatca gcaccgattt acgccagaaa aatacgcgtg caagtgaggt gggcgggggg 900 gcgggcgttg ggacgtgctg ctgcggagac cttcgggaga acttctgctg gaacctcgac 960 ggctcagagg cgccctggtg cttcaccctg cggcccggca cgcgcgtggg cttttgctac 1020 cagatccggc gttgtacaga cgacgtgcgg ccccaggact gctaccacgg cgcgggggag 1080 cagtaccgcg gcacggtcag caagacccgc aagggtgtcc agtgccagcg ctggtccgct 1140 gagacgccgc acaagctgca ggccctaacc ctggggcggc atgctttgat gtctgggacc 1200 agagcctgga aatggttgag actaccctgc cacgattttg ctcccgctcc cgcctcggtt 1260 cacatttacc tccgaaccgc atgcacaact ggaggagaac ttctgccaga cccagatggg 1320 gatagccatg ggccctggtg ctacacgatg gacccaagga ccccattcga ctactgtgcc 1380 ctgcgacgct gcgaccaggt gcagtttgag aagtgtggca agagggtgga tcggctggat 1440 cagcgtcgtt ccaagctgcg cgtggctggg ggccatccgg gcaactcacc ctggacagtc 
1500 agcttgcgga atcgccatat gcctctcacg ggctatgagg tatggttggg caccctgttc 1560 cagaacccac aacatggaga gccaggccta cagcgggtcc cagtagccaa gatgctgtgt 1620 gggccctcag gctcccagct tgtcctgctc aagctggaga gatctgtgac cctgaaccag 1680 cgtgtggccc tgatctgcct gccgcctgaa tggtatgtgg tgcctccagg gaccaagtgt 1740 gagattgcag gctggggtga gaccaaaggt acgggtaatg acacagtcct aaatgtggcc 1800 ttgctgaacg tcatctccaa ccaggagtgt aacatcaagc accgaggaca tgtgcgggag 1860 agcgagatgt gcactgaggg actgttggcc cctgtggggg cctgtgaggg tgactacggg 1920 ggcccacttg cctgctttac ccacaactgc tgggtcctga aaggaattag aatccccaac 1980 cgagtatgca caaggtcgcg ctggccagcc gtcttcacgc gtgtctctgt gtttgtggac 2040 tggattcaca aggtcatgag actgggttag gcccagcctt gacgccatat gctttgggga 2100 ggacaaaact tgtaagtaca gtcaaggaca agacttgtac tcaaggttga gatttaataa 2160 aattaatatt tttactactt caccaaggac tttcttaaac gaaaatggtt tttccccctg 2220 caagtaaaca gtaatgaaga agagaattat tcctagtgca gtttgttttc atggtcttaa 2280 tttttgctaa gactccactg tttttgcctt atcaatacaa gtgccaacac agtgaaaagg 2340 caaatatcat cttagtatta ctctgaaaat agttctgagc taatggccta ctgaaaggaa 2400 aagagtggct cctgctattc tattagactt attacaatta tcttaagtat tctttctacc 2460 ctcctttaat tgaatggaaa cagggatgga ttggaagagc tgtttttctc ctttctttcc 2520 cccggcaata tttaccattt aatgccactt actaacactc aaagaaacaa aaccaaactt 2580 ctcaattgac agtgcagtga cccaacaaag acacgggttc ttgaattcaa agtggagcag 2640 gagagacggt aaatacacat ttactttaat atatatatat ttattattta tgtgtttaaa 2700 gcacaaatta gtttggtaaa aaacatctca tgtctgtttt atttccacat ccctgagact 2760 gacaatggga tgcctatcaa ttaattcatt tagagagcca tacaccacaa gaaacaaatt 2820 atttgtcctc tggagcttgt cacaggggga tttttaaaaa accattaaac agaaagacaa 2880 ctgtgcatct tagaaagata aaaggccaat tcttcctctc cggctgatag gttcttaata 2940 atagtgatat ctactaataa ggtgttttac atagtgtaaa gcatgttcac atacaaatta 3000 cttagcctct ttgagcctca gttttcttat atgtaaaact ggattaatag tacattttgt 3060 gtttaaaaag ataatgtata tgaagtgttt accatttttg cttggcatct agttcagttc 3120 tcagtaactg atgtggtggt ggtggtggtc atagtagcag taagatccgt agtaatagta 3180 
gcagcagttg ttttagaaat tagtaactga ggcctggcaa agttaaaggc tctttcatta 3240 acacccagag gggaagaaat gaagctggtc ttcagaggca ggctattttc actctgtgtc 3300 ccaaattttc ccccctagac cgtttttata cttctggggc ctcagaaaat attctcagct 3360 attctgttag cttgatctcc taccatctga gagtgggctt ccttcaaaca accaaatttc 3420 caggtatttc taaactgccc ttcccctaca ccattctttg gttcagtatt tcaagacccc 3480 taagagaaat ggtacattta catgtaagca caggatagtg aagtatttac aacaagtgct 3540 ttggagccag caaatatgaa tcagaatcca gctttccttt cctacataca tgacattggg 3600 cagctaattt ctaagatttt acttctttat ctatgaaagt ggagtactag tacttgctct 3660 gtgcaactgt gatggttgtt acatgaggta gcatctagaa gcagcttgca cattgccaga 3720 cacccagtgg aaggtcaatg aatgactatt tgaggactaa ctattacaga aatgtttact 3780 cttctgagtc ctgatttcta gtctcctgga ctaaataggt tcactgtttt cctcccggtt 3840 cagtttccag acacatcaca gaattataag aatattaaaa actcaggctt atacctacac 3900 aggattttct ataaccctct ttctgctttg agctcctaaa ggtatttcat agaaaaatga 3960 ccttattttt aaatagaggg ggcagttgaa aatcagtgaa cgggcctacc ccctaatgat 4020 ttttttctca gacctaatta taataattag cattataaag tgctaattat ctttggacac 4080 agaggacctg cacaccagag acagaggtcc gcattaagta aagtggattt cactttcttc 4140 agttgtgaga tttctctttt ttcttctttg taatgatgca aagatatatc ttccaccaag 4200 cctcatttaa aagctttttc cagttaagga aactatctct tggccatcca cagccagact 4260 gcatattgag attatggata ttcaaagaaa ttgtctttcc tttgtatatt gtcataactt 4320 tttgtgaaat gttcgtttta tagttccagg ccagcaccta gaacctggct agaataaaaa 4380 actgcagaaa tcatgagttt cttgtttgga tgaaagagca cacctattaa caaatgatag 4440 acggctatcc tactgtgagt cctgaaaact ggtggtgtga ttgttgaatg ggttaggggt 4500 atagcagaga aactcagtgt gggctacata caatttcagc ttgaatcaca cttaacagat 4560 cctctgttcc aaccatttaa atttacaaag aagaaactaa ggcacagaac tacttgagaa 4620 gagaagcaga attgaaaact agagctcctg attgttctca aaataatt 4668 <210> 23 <211> 595 <212> PRT <213> Artificial Sequence <220> <223> GBP3 <400> 23 Met Ala Pro Glu Ile His Met Thr Gly Pro Met Cys Leu Ile Glu Asn 1 5 10 15 Thr Asn Gly Glu Leu Val Ala Asn Pro Glu Ala Leu Lys Ile Leu Ser 20 25 30 Ala Ile 
Thr Gln Pro Val Val Val Val Ala Ile Val Gly Leu Tyr Arg 35 40 45 Thr Gly Lys Ser Tyr Leu Met Asn Lys Leu Ala Gly Lys Asn Lys Gly 50 55 60 Phe Ser Leu Gly Ser Thr Val Lys Ser His Thr Lys Gly Ile Trp Met 65 70 75 80 Trp Cys Val Pro His Pro Lys Lys Pro Glu His Thr Leu Val Leu Leu 85 90 95 Asp Thr Glu Gly Leu Gly Asp Val Lys Lys Gly Asp Asn Gln Asn Asp 100 105 110 Ser Trp Ile Phe Thr Leu Ala Val Leu Leu Ser Ser Thr Leu Val Tyr 115 120 125 Asn Ser Met Gly Thr Ile Asn Gln Gln Ala Met Asp Gln Leu Tyr Tyr 130 135 140 Val Thr Glu Leu Thr His Arg Ile Arg Ser Lys Ser Ser Pro Asp Glu 145 150 155 160 Asn Glu Asn Glu Asp Ser Ala Asp Phe Val Ser Phe Phe Pro Asp Phe 165 170 175 Val Trp Thr Leu Arg Asp Phe Ser Leu Asp Leu Glu Ala Asp Gly Gln 180 185 190 Pro Leu Thr Pro Asp Glu Tyr Leu Glu Tyr Ser Leu Lys Leu Thr Gln 195 200 205 Gly Thr Ser Gln Lys Asp Lys Asn Phe Asn Leu Pro Gln Leu Cys Ile 210 215 220 Trp Lys Phe Phe Pro Lys Lys Lys Cys Phe Val Phe Asp Leu Pro Ile 225 230 235 240 His Arg Arg Lys Leu Ala Gln Leu Glu Lys Leu Gln Asp Glu Glu Leu 245 250 255 Asp Pro Glu Phe Val Gln Gln Val Ala Asp Phe Cys Ser Tyr Ile Phe 260 265 270 Ser Asn Ser Lys Thr Lys Thr Leu Ser Gly Gly Ile Lys Val Asn Gly 275 280 285 Pro Cys Leu Glu Ser Leu Val Leu Thr Tyr Ile Asn Ala Ile Ser Arg 290 295 300 Gly Asp Leu Pro Cys Met Glu Asn Ala Val Leu Ala Leu Ala Gln Ile 305 310 315 320 Glu Asn Ser Ala Ala Val Gln Lys Ala Ile Ala His Tyr Asp Gln Gln 325 330 335 Met Gly Gln Lys Val Gln Leu Pro Ala Glu Thr Leu Gln Glu Leu Leu 340 345 350 Asp Leu His Arg Val Ser Glu Arg Glu Ala Thr Glu Val Tyr Met Lys 355 360 365 Asn Ser Phe Lys Asp Val Asp His Leu Phe Gln Lys Lys Leu Ala Ala 370 375 380 Gln Leu Asp Lys Lys Arg Asp Asp Phe Cys Lys Gln Asn Gln Glu Ala 385 390 395 400 Ser Ser Asp Arg Cys Ser Ala Leu Leu Gln Val Ile Phe Ser Pro Leu 405 410 415 Glu Glu Glu Val Lys Ala Gly Ile Tyr Ser Lys Pro Gly Gly Tyr Cys 420 425 430 Leu Phe Ile Gln Lys Leu Gln Asp Leu Glu Lys Lys Tyr Tyr Glu Glu 435 440 445 Pro Arg Lys Gly Ile Gln 
Ala Glu Glu Ile Leu Gln Thr Tyr Leu Lys 450 455 460 Ser Lys Glu Ser Val Thr Asp Ala Ile Leu Gln Thr Asp Gln Ile Leu 465 470 475 480 Thr Glu Lys Glu Lys Glu Ile Glu Val Glu Cys Val Lys Ala Glu Ser 485 490 495 Ala Gln Ala Ser Ala Lys Met Val Glu Glu Met Gln Ile Lys Tyr Gln 500 505 510 Gln Met Met Glu Glu Lys Glu Lys Ser Tyr Gln Glu His Val Lys Gln 515 520 525 Leu Thr Glu Lys Met Glu Arg Glu Arg Ala Gln Leu Leu Glu Glu Gln 530 535 540 Glu Lys Thr Leu Thr Ser Lys Leu Gln Glu Gln Ala Arg Val Leu Lys 545 550 555 560 Glu Arg Cys Gln Gly Glu Ser Thr Gln Leu Gln Asn Glu Ile Gln Lys 565 570 575 Leu Gln Lys Thr Leu Lys Lys Lys Thr Lys Arg Tyr Met Ser His Lys 580 585 590 Leu Lys Ile 595 <210> 24 <211> 2332 <212> DNA <213> Artificial Sequence <220> <223> GBP3 <400> 24 cagcgatcca gcgaaagaaa agagaagtga cagaaacaac tttacctgga ctgaagataa 60 aagcacagac aagagaacaa tgccctggac atggctccag agatccacat gacaggccca 120 atgtgcctca ttgagaacac taatggggaa ctggtggcga atccagaagc tctgaaaatc 180 ctgtctgcca ttacacagcc tgtggtggtg gtggcaattg tgggcctcta ccgcacagga 240 aaatcctacc tgatgaacaa gctagctggg aagaataagg gcttctctct gggctccaca 300 gtgaaatctc acaccaaagg aatctggatg tggtgtgtgc ctcaccccaa aaagccagaa 360 cacaccttag tcctgcttga cactgagggc ctgggagatg taaagaaggg tgacaaccag 420 aatgactcct ggatcttcac cctggccgtc ctcctgagca gcactctcgt gtacaatagc 480 atgggaacca tcaaccagca ggctatggac caactgtact atgtgacaga gctgacacat 540 cgaatccgat caaaatcctc acctgatgag aatgagaatg aggattcagc tgactttgtg 600 agcttcttcc cagattttgt gtggacactg agagatttct ccctggactt ggaagcagat 660 ggacaacccc tcacaccaga tgagtacctg gagtattccc tgaagctaac gcaaggtacc 720 agtcaaaaag ataaaaattt taatctgccc caactctgta tctggaagtt cttcccaaag 780 aaaaaatgtt ttgtcttcga tctgcccatt caccgcagga agcttgccca gcttgagaaa 840 ctacaagatg aagagctgga ccctgaattt gtgcaacaag tagcagactt ctgttcctac 900 atctttagca attccaaaac taaaactctt tcaggaggca tcaaggtcaa tgggccttgt 960 ctagagagcc tagtgctgac ctatatcaat gctatcagca gaggggatct gccctgcatg 1020 gagaacgcag tcctggcctt ggcccagata gagaactcag 
ccgcagtgca aaaggctatt 1080 gcccactatg accagcagat gggccagaag gtgcagctgc ccgcagaaac cctccaggag 1140 ctgctggacc tgcacagggt tagtgagagg gaggccactg aagtctatat gaagaactct 1200 ttcaaggatg tggaccatct gtttcaaaag aaattagcgg cccagctaga caaaaagcgg 1260 gatgactttt gtaaacagaa tcaagaagca tcatcagatc gttgctcagc tttacttcag 1320 gtcattttca gtcctctaga agaagaagtg aaggcgggaa tttattcgaa accagggggc 1380 tattgtctct ttattcagaa gctacaagac ctggagaaaa agtactatga ggaaccaagg 1440 aaggggatac aggctgaaga gattctgcag acatacttga aatccaagga gtctgtgacc 1500 gatgcaattc tacagacaga ccagattctc acagaaaagg aaaaggagat tgaagtggaa 1560 tgtgtaaaag ctgaatctgc acaggcttca gcaaaaatgg tggaggaaat gcaaataaag 1620 tatcagcaga tgatggaaga gaaagagaag agttatcaag aacatgtgaa acaattgact 1680 gagaagatgg agagggagag ggcccagttg ctggaagagc aagagaagac cctcactagt 1740 aaacttcagg aacaggcccg agtactaaag gagagatgcc aaggtgaaag tacccaactt 1800 caaaatgaga tacaaaagct acagaagacc ctgaaaaaaa aaaccaagag atatatgtcg 1860 cataagctaa agatctaaac aacagagctt ttctgtcatc ctaacccaag gcataactga 1920 aacaatttta gaatttggaa caagtgtcac tatatttgat aataattaga tcttgcatca 1980 taacactaaa agtttacaag aacatgcagt tcaatgatca aaatcatgtt ttttccttaa 2040 aaagattgta aattgtgcaa caaagatgca tttacctctg taccaacaga ggagggatca 2100 tgagttgcca ccactcagaa gtttattctt ccagacgacc agtggatact gaggaaagtc 2160 ttaggtaaaa atcttgggac atatttgggc actggtttgg ccaagtgtac aatgggtccc 2220 aatatcagaa acaaccatcc tagcttccta gggaagacag tgtacagttc tccattatat 2280 caaggctaca aggtctatga gcaataatgt gatttctgga cattgcccat gg 2332 <210> 25 <211> 232 <212> PRT <213> Artificial Sequence <220> <223> CFHR3 <400> 25 Met Leu Leu Leu Ile Asn Val Ile Leu Thr Leu Trp Val Ser Cys Ala 1 5 10 15 Asn Gly Gln Val Lys Pro Cys Asp Phe Pro Asp Ile Lys His Gly Gly 20 25 30 Leu Phe His Glu Asn Met Arg Arg Pro Tyr Phe Pro Val Ala Val Gly 35 40 45 Lys Tyr Tyr Ser Tyr Tyr Cys Asp Glu His Phe Glu Thr Pro Ser Gly 50 55 60 Ser Tyr Trp Asp Tyr Ile His Cys Thr Gln Asn Gly Trp Ser Pro Ala 65 70 75 80 Val Pro Cys Leu Arg Lys Cys Tyr Phe 
Pro Tyr Leu Glu Asn Gly Tyr 85 90 95 Asn Gln Asn Tyr Gly Arg Lys Phe Val Gln Gly Asn Ser Thr Glu Val 100 105 110 Ala Cys His Pro Gly Tyr Gly Leu Pro Lys Ala Gln Thr Thr Val Thr 115 120 125 Cys Thr Glu Lys Gly Trp Ser Pro Thr Pro Arg Cys Ile Arg Val Arg 130 135 140 Thr Cys Ser Lys Ser Asp Ile Glu Ile Glu Asn Gly Phe Ile Ser Glu 145 150 155 160 Ser Ser Ser Ile Tyr Ile Leu Asn Lys Glu Ile Gln Tyr Lys Cys Lys 165 170 175 Pro Gly Tyr Ala Thr Ala Asp Gly Asn Ser Ser Gly Ser Ile Thr Cys 180 185 190 Leu Gln Asn Gly Trp Ser Ala Gln Pro Ile Cys Ile Thr Ala Cys Ile 195 200 205 Ala Phe Arg Ala His Ala Gln Lys Ser Cys Thr Cys Arg Gly Arg Asn 210 215 220 Glu Cys Leu Ile Leu Asn Phe Cys 225 230 <210> 26 <211> 1995 <212> DNA <213> Artificial Sequence <220> <223> CFHR3 <400> 26 aataatgaaa ggtttcaaac cccaaacagt gcaactgaaa cttttgtatt agcatactac 60 tgagaatatc taacatgttg ttactaatca atgtcattct gaccttgtgg gtttcctgtg 120 ctaatggaca agtgaaacct tgtgattttc cagacattaa acatggaggt ctatttcatg 180 agaatatgcg tagaccatac tttccagtag ctgtaggaaa atattactcc tattactgtg 240 atgaacattt tgagactcct tcaggaagtt actgggatta cattcattgc acacaaaatg 300 ggtggtcacc agcagtacca tgtctcagaa aatgttattt tccttatttg gaaaatggat 360 ataatcaaaa ttatggaaga aagtttgtac agggtaactc tacagaagtt gcctgccatc 420 ctggctacgg tcttccaaaa gcgcagacca cagttacatg tacggagaaa ggctggtctc 480 ctactcccag atgcatccgt gtcagaacat gctcaaaatc agatatagaa attgaaaatg 540 gattcatttc cgaatcttcc tctatttata ttttaaataa agaaatacaa tataaatgta 600 aaccaggata tgcaacagca gatggaaatt cttcaggatc aattacatgt ttgcaaaatg 660 gatggtcagc acaaccaatt tgcattactg cttgtattgc attccgtgct cacgctcaga 720 aaagttgtac atgtcggggg agaaatgagt gcttaattct gaatttctgc tagcgtcagg 780 agaatcagac cttaataatt tgtatcaatg atgctactga ggatatccaa tcaaaaaatt 840 atctctaccc tattgtttac tacagagaaa acaagtaaag gaaaagggta agtgggtggg 900 ctgaatgttt acaaacctca tacttgccaa tggattcttt acatgtaaag ttctctgaac 960 gtgctcgacc tttacttagt atgataaaga gaatgcataa tgaacaaagg aaatctttca 1020 gtggcaaaag ttgatttttt ttctttcctc 
tttatatatt cacaaagagt tttaaagata 1080 aattgctgaa tgtattttta acccaaatac tgttgtataa tttacatact cccaaatcca 1140 cttcattttc aagggtacaa ttcaactatt tttagtcatg agataatatg gaaccttgat 1200 acataataca acatggaaga ctctcaaaaa tattaagcta agagagaaaa aagatacact 1260 tacaaagtga ttatctcaat gttatcatga gttttggggg ttatatgaat tcctacattt 1320 ctagaatact gttttcaatt tctatactta tcaagggctc tgtgtaaaga aaaaggtgta 1380 tacttcaatt ttactcagct ttgataaatg atctacttta aaacttgtga aataaaagac 1440 aggatgcttc ataaatttta tagaacttat atccaattaa tataattttt atctaatata 1500 tacaccctta ctgttagtaa atcgttacat aactacataa atggttacat taacttttaa 1560 attcacaaat ttaaaagcgg tttaagctcc gtgacacatt ggactacgaa tgctacgatg 1620 gatatggaat cagttatgga aacaccacag gttccatagt gtgtggtgaa gatgggtagt 1680 cccatttccc aacatgttat agtaagtatt ttattcaagt attttttatt agaattaaat 1740 aaaataataa atagacacct acatatgtat atgtacacat atgtgtgtac atatatgtac 1800 atatatatgt agtcctccta tgagtgtgaa ttatcttgag acttaaaaag aaaaaacaac 1860 gttgaaaatg cagatgtctt cctaagaaat caaataagat acagttaaga gtatataaaa 1920 agctttattt agaaagtttc caataagact attgattttt ccccaaaaaa aaaaaaaaaa 1980 aaaaaaaaaa aaaaa 1995 <210> 27 <211> 330 <212> PRT <213> Artificial Sequence <220> <223> CFHR1 <400> 27 Met Trp Leu Leu Val Ser Val Ile Leu Ile Ser Arg Ile Ser Ser Val 1 5 10 15 Gly Gly Glu Ala Thr Phe Cys Asp Phe Pro Lys Ile Asn His Gly Ile 20 25 30 Leu Tyr Asp Glu Glu Lys Tyr Lys Pro Phe Ser Gln Val Pro Thr Gly 35 40 45 Glu Val Phe Tyr Tyr Ser Cys Glu Tyr Asn Phe Val Ser Pro Ser Lys 50 55 60 Ser Phe Trp Thr Arg Ile Thr Cys Thr Glu Glu Gly Trp Ser Pro Thr 65 70 75 80 Pro Lys Cys Leu Arg Leu Cys Phe Phe Pro Phe Val Glu Asn Gly His 85 90 95 Ser Glu Ser Ser Gly Gln Thr His Leu Glu Gly Asp Thr Val Gln Ile 100 105 110 Ile Cys Asn Thr Gly Tyr Arg Leu Gln Asn Asn Glu Asn Asn Ile Ser 115 120 125 Cys Val Glu Arg Gly Trp Ser Thr Pro Pro Lys Cys Arg Ser Thr Asp 130 135 140 Thr Ser Cys Val Asn Pro Pro Thr Val Gln Asn Ala His Ile Leu Ser 145 150 155 160 Arg Gln Met Ser Lys Tyr Pro Ser Gly Glu 
Arg Val Arg Tyr Glu Cys 165 170 175 Arg Ser Pro Tyr Glu Met Phe Gly Asp Glu Glu Val Met Cys Leu Asn 180 185 190 Gly Asn Trp Thr Glu Pro Pro Gln Cys Lys Asp Ser Thr Gly Lys Cys 195 200 205 Gly Pro Pro Pro Pro Ile Asp Asn Gly Asp Ile Thr Ser Phe Pro Leu 210 215 220 Ser Val Tyr Ala Pro Ala Ser Ser Val Glu Tyr Gln Cys Gln Asn Leu 225 230 235 240 Tyr Gln Leu Glu Gly Asn Lys Arg Ile Thr Cys Arg Asn Gly Gln Trp 245 250 255 Ser Glu Pro Pro Lys Cys Leu His Pro Cys Val Ile Ser Arg Glu Ile 260 265 270 Met Glu Asn Tyr Asn Ile Ala Leu Arg Trp Thr Ala Lys Gln Lys Leu 275 280 285 Tyr Leu Arg Thr Gly Glu Ser Ala Glu Phe Val Cys Lys Arg Gly Tyr 290 295 300 Arg Leu Ser Ser Arg Ser His Thr Leu Arg Thr Thr Cys Trp Asp Gly 305 310 315 320 Lys Leu Glu Tyr Pro Thr Cys Ala Lys Arg 325 330 <210> 28 <211> 1320 <212> DNA <213> Artificial Sequence <220> <223> CFHR1 <400> 28 ggggacactg aaattcaaag tcatgctcat aactgttaat gaaagcagat tcaaagcaac 60 accaccacca ctgaagtatt tttggttata taagattgga actaccaagc atgtggctcc 120 tggtcagtgt aattctaatc tcacggatat cctctgttgg gggagaagca acattttgtg 180 attttccaaa aataaaccat ggaattctat atgatgaaga aaaatataag ccattttccc 240 aggttcctac aggggaagtt ttctattact cctgtgaata taattttgtg tctccttcaa 300 aatcattttg gactcgcata acatgcacag aagaaggatg gtcaccaaca ccaaagtgtc 360 tcagactgtg tttctttcct tttgtggaaa atggtcattc tgaatcttca ggacaaacac 420 atctggaagg tgatactgtg caaattattt gcaacacagg atacaggctt caaaacaatg 480 agaacaacat ttcatgtgta gaacggggct ggtccacccc tcccaaatgc aggtccactg 540 acacttcctg tgtgaatccg cccacagtac aaaatgctca tatactgtcg agacagatga 600 gtaaatatcc atctggtgag agagtacgtt atgaatgtag gagcccttat gaaatgtttg 660 gggatgaaga agtgatgtgt ttaaatggaa actggacaga accacctcaa tgcaaagatt 720 ctacgggaaa atgtgggccc cctccaccta ttgacaatgg ggacattact tcattcccgt 780 tgtcagtata tgctccagct tcatcagttg agtaccaatg ccagaacttg tatcaacttg 840 agggtaacaa gcgaataaca tgtagaaatg gacaatggtc agaaccacca aaatgcttac 900 atccgtgtgt aatatcccga gaaattatgg aaaattataa catagcatta aggtggacag 960 ccaaacagaa gctttatttg 
agaacaggtg aatcagctga atttgtgtgt aaacggggat 1020 atcgtctttc atcacgttct cacacattgc gaacaacatg ttgggatggg aaactggagt 1080 atccaacttg tgcaaaaaga tagaatcaat cataaaatgc acacctttat tcagaacttt 1140 agtattaaat cagttcttaa tttcattttt aagtattgtt ttactccttt ttattcatac 1200 gtaaaatttt ggattaattt gtgaaaatgt aattataagc tgagaccggt ggctctcttc 1260 ttaaaagcac catattaaaa cttggaaaac caaaaaaaaa aaaaaaaaaa aaaaaaaaaa 1320 1320 <210> 29 <211> 322 <212> PRT <213> Artificial Sequence <220> <223> OR2T2 <400> 29 Met Glu Gly Leu Leu Gln Asn Ser Thr Asn Phe Val Leu Thr Gly Leu 1 5 10 15 Ile Thr His Pro Ala Phe Pro Gly Leu Leu Phe Ala Ile Val Phe Ser 20 25 30 Ile Phe Val Val Ala Ile Thr Ala Asn Leu Val Met Ile Leu Leu Ile 35 40 45 His Met Asp Ser Arg Leu His Thr Pro Met Tyr Phe Leu Leu Ser Gln 50 55 60 Leu Ser Ile Met Asp Thr Ile Tyr Ile Cys Ile Thr Val Pro Lys Met 65 70 75 80 Leu Gln Asp Leu Leu Ser Lys Asp Lys Thr Ile Ser Phe Leu Gly Cys 85 90 95 Ala Val Gln Ile Phe Leu Tyr Leu Thr Leu Ile Gly Gly Glu Phe Phe 100 105 110 Leu Leu Gly Leu Met Ala Tyr Asp Arg Tyr Val Ala Val Cys Asn Pro 115 120 125 Leu Arg Tyr Pro Leu Leu Met Asn Arg Arg Val Cys Leu Phe Met Val 130 135 140 Val Gly Ser Trp Val Gly Gly Ser Leu Asp Gly Phe Met Leu Thr Pro 145 150 155 160 Val Thr Met Ser Phe Pro Phe Cys Arg Ser Arg Glu Ile Asn His Phe 165 170 175 Phe Cys Glu Ile Pro Ala Val Leu Lys Leu Ser Cys Thr Asp Thr Ser 180 185 190 Leu Tyr Glu Thr Leu Met Tyr Ala Cys Cys Val Leu Met Leu Leu Ile 195 200 205 Pro Leu Ser Val Ile Ser Val Ser Tyr Thr His Ile Leu Leu Thr Val 210 215 220 His Arg Met Asn Ser Ala Glu Gly Arg Arg Lys Ala Phe Ala Thr Cys 225 230 235 240 Ser Ser His Ile Met Val Val Ser Val Phe Tyr Gly Ala Ala Phe Tyr 245 250 255 Thr Asn Val Leu Pro His Ser Tyr His Thr Pro Glu Lys Asp Lys Val 260 265 270 Val Ser Ala Phe Tyr Thr Ile Leu Thr Pro Met Leu Asn Pro Leu Ile 275 280 285 Tyr Ser Leu Arg Asn Lys Asp Val Ala Ala Ala Leu Arg Lys Val Leu 290 295 300 Gly Arg Cys Gly Ser Ser Gln Ser Ile Arg Val Ala Thr Val Ile Arg 305 
310 315 320 Lys Gly <210> 30 <211> 969 <212> DNA <213> Artificial Sequence <220> <223> OR2T2 <400> 30 atggagggtc ttctccagaa ctccactaac ttcgtcctca caggcctcat cacccatcct 60 gccttccccg ggcttctctt tgcaatagtc ttctccatct ttgtggtggc tataacagcc 120 aacttggtca tgattctgct catccacatg gactcccgcc tccacacacc catgtacttc 180 ttgctcagcc agctctccat catggatacc atctacatct gtatcactgt ccccaagatg 240 ctccaggacc tcctgtccaa ggacaagacc atttccttcc tgggctgtgc agttcagatc 300 ttcctctacc tgaccctgat tggaggggaa ttcttcctgc tgggtctcat ggcctatgac 360 cgctatgtgg ctgtgtgcaa ccctctacgg taccctctcc tcatgaaccg cagggtttgc 420 ttattcatgg tggtcggctc ctgggttggt ggttccttgg atgggttcat gctgactcct 480 gtcactatga gtttcccctt ctgtagatcc cgagagatca atcacttttt ctgtgagatc 540 ccagccgtgc tgaagttgtc ttgcacagac acgtcactct atgagaccct gatgtatgcc 600 tgctgcgtgc tgatgctgct tatccctcta tctgtcatct ctgtctccta cacgcacatc 660 ctcctgactg tccacaggat gaactctgct gagggccggc gcaaagcctt tgctacgtgt 720 tcctcccaca ttatggtggt gagcgttttc tacggggcag ccttctacac caacgtgctg 780 ccccactcct accacactcc agagaaagat aaagtggtgt ctgccttcta caccatcctc 840 acccccatgc tcaacccact catctacagc ttgaggaata aagatgtggc tgcagctctg 900 aggaaagtac tagggagatg tggttcctcc cagagcatca gggtggcgac tgtgatcagg 960 aagggctag 969 <210> 31 <211> 318 <212> PRT <213> Artificial Sequence <220> <223> OR2T3 <400> 31 Met Cys Ser Gly Asn Gln Thr Ser Gln Asn Gln Thr Ala Ser Thr Asp 1 5 10 15 Phe Thr Leu Thr Gly Leu Phe Ala Glu Ser Lys His Ala Ala Leu Leu 20 25 30 Tyr Thr Val Thr Phe Leu Leu Phe Leu Met Ala Leu Thr Gly Asn Ala 35 40 45 Leu Leu Ile Leu Leu Ile His Ser Glu Pro Arg Leu His Thr Pro Met 50 55 60 Tyr Phe Phe Ile Ser Gln Leu Ala Leu Met Asp Leu Met Tyr Leu Cys 65 70 75 80 Val Thr Val Pro Lys Met Leu Val Gly Gln Val Thr Gly Asp Asp Thr 85 90 95 Ile Ser Pro Ser Gly Cys Gly Ile Gln Met Phe Phe Tyr Leu Thr Leu 100 105 110 Ala Gly Ala Glu Val Phe Leu Leu Ala Ala Met Ala Tyr Asp Arg Tyr 115 120 125 Ala Ala Val Cys Arg Pro Leu His Tyr Pro Leu Leu Met Asn Gln Arg 130 135 140 Val Cys Gln 
Leu Leu Val Ser Ala Cys Trp Val Leu Gly Met Val Asp 145 150 155 160 Gly Leu Leu Leu Thr Pro Ile Thr Met Ser Phe Pro Phe Cys Gln Ser 165 170 175 Arg Lys Ile Leu Ser Phe Phe Cys Glu Thr Pro Ala Leu Leu Lys Leu 180 185 190 Ser Cys Ser Asp Val Ser Leu Tyr Lys Thr Leu Met Tyr Leu Cys Cys 195 200 205 Ile Leu Met Leu Leu Ala Pro Ile Met Val Ile Ser Ser Ser Tyr Thr 210 215 220 Leu Ile Leu His Leu Ile His Arg Met Asn Ser Ala Ala Gly His Arg 225 230 235 240 Lys Ala Leu Ala Thr Cys Ser Ser His Met Ile Ile Val Leu Leu Leu 245 250 255 Phe Gly Ala Ser Phe Tyr Thr Tyr Met Leu Pro Ser Ser Tyr His Thr 260 265 270 Ala Glu Gln Asp Met Met Val Ser Ala Phe Tyr Thr Ile Phe Thr Pro 275 280 285 Val Leu Asn Pro Leu Ile Tyr Ser Leu Arg Asn Lys Asp Val Thr Arg 290 295 300 Ala Leu Arg Ser Met Met Gln Ser Arg Met Asn Gln Glu Lys 305 310 315 <210> 32 <211> 957 <212> DNA <213> Artificial Sequence <220> <223> OR2T3 <400> 32 atgtgctcag ggaatcagac ttctcagaat caaacagcaa gcactgattt caccctcacg 60 ggactctttg ctgagagcaa gcatgctgcc ctcctctaca ccgtgacctt ccttcttttc 120 ttgatggccc tcactgggaa tgccctcctc atcctcctca tccactcaga gccccgcctc 180 cacaccccca tgtacttctt catcagccag ctcgcgctca tggatctcat gtacctatgc 240 gtgactgtgc ccaagatgct tgtgggccag gtcactggag atgataccat ttccccgtca 300 ggctgtggga tccagatgtt cttctacctg accctggctg gagctgaggt tttcctcctg 360 gctgccatgg cctatgaccg atatgctgct gtttgcagac ctctccatta cccactgctg 420 atgaaccaga gggtgtgcca gctcctggtg tcagcctgct gggttttggg aatggttgat 480 ggtttgttgc tcacccccat taccatgagc ttcccctttt gccagtctag gaaaatcctg 540 agttttttct gtgagactcc tgccctgctg aagctctcct gctctgacgt ctccctctat 600 aagacgctca tgtacctgtg ctgcatcctc atgcttctcg cccccatcat ggtcatctcc 660 agctcataca ccctcatcct gcatctcatc cacaggatga attctgccgc cggccacagg 720 aaggccttgg ccacctgctc ctcccacatg atcatagtgc tgctgctctt cggtgcttcc 780 ttctacacct acatgctccc gagttcctac cacacagctg agcaggacat gatggtgtct 840 gccttttaca ccatcttcac tcctgtgctg aaccccctca tttacagtct ccgcaacaaa 900 gatgtcacca gggctctgag gagcatgatg cagtcaagaa 
tgaaccaaga aaagtag 957 <210> 33 <211> 295 <212> PRT <213> Artificial Sequence <220> <223> AQP12A <400> 33 Met Ala Gly Leu Asn Val Ser Leu Ser Phe Phe Phe Ala Thr Phe Ala 1 5 10 15 Leu Cys Glu Ala Ala Arg Arg Ala Ser Lys Ala Leu Leu Pro Val Gly 20 25 30 Ala Tyr Glu Val Phe Ala Arg Glu Ala Met Arg Thr Leu Val Glu Leu 35 40 45 Gly Pro Trp Ala Gly Asp Phe Gly Pro Asp Leu Leu Leu Thr Leu Leu 50 55 60 Phe Leu Leu Phe Leu Ala His Gly Val Thr Leu Asp Gly Ala Ser Ala 65 70 75 80 Asn Pro Thr Val Ser Leu Gln Glu Phe Leu Met Ala Glu Gln Ser Leu 85 90 95 Pro Gly Thr Leu Leu Lys Leu Ala Ala Gln Gly Leu Gly Met Gln Ala 100 105 110 Ala Cys Thr Leu Met Arg Leu Cys Trp Ala Trp Glu Leu Ser Asp Leu 115 120 125 His Leu Leu Gln Ser Leu Met Ala Gln Ser Cys Ser Ser Ala Leu Arg 130 135 140 Thr Ser Val Pro His Gly Ala Leu Val Glu Ala Ala Cys Ala Phe Cys 145 150 155 160 Phe His Leu Thr Leu Leu His Leu Arg His Ser Pro Pro Ala Tyr Ser 165 170 175 Gly Pro Ala Val Ala Leu Leu Val Thr Val Thr Ala Tyr Thr Ala Gly 180 185 190 Pro Phe Thr Ser Ala Phe Phe Asn Pro Ala Leu Ala Ala Ser Val Thr 195 200 205 Phe Ala Cys Ser Gly His Thr Leu Leu Glu Tyr Val Gln Val Tyr Trp 210 215 220 Leu Gly Pro Leu Thr Gly Met Val Leu Ala Val Leu Leu His Gln Gly 225 230 235 240 Arg Leu Pro His Leu Phe Gln Arg Asn Leu Phe Tyr Gly Gln Lys Asn 245 250 255 Lys Tyr Arg Ala Pro Arg Gly Lys Pro Ala Pro Ala Ser Gly Asp Thr 260 265 270 Gln Thr Pro Ala Lys Gly Ser Ser Val Arg Glu Pro Gly Arg Ser Gly 275 280 285 Val Glu Gly Pro His Ser Ser 290 295 <210> 34 <211> 1097 <212> DNA <213> Artificial Sequence <220> <223> AQP12A <400> 34 gaaccagcca gctcctgctc tgtcccctca ggtgtcctgc aggcacagct cctcgggggg 60 cccaggccga tggcaggtct taacgtgtcc ctctccttct tctttgccac cttcgccctc 120 tgtgaggcgg ccaggcgggc ctccaaggcc ctgctcccag tgggcgccta tgaagtcttc 180 gcccgggagg cgatgaggac gctggtcgag ctcgggccct gggctgggga ctttgggcct 240 gacctgctgc tcaccctgct cttcctgctc ttcctggcgc acggggtcac cttggacggg 300 gcctcggcca accccactgt gtccctgcag gagttcctca tggccgagca 
gtctctgcct 360 ggcacgctgt tgaagctggc ggcacagggg ctgggcatgc aggccgcctg caccctgatg 420 cgcctctgct gggcctggga gctcagtgac ctgcacctgc tgcagagcct catggcccag 480 agctgcagct cggccctgcg cacatccgtg ccccacgggg cgcttgtgga ggccgcctgc 540 gccttttgtt tccatctgac cctcctgcac ctgcggcaca gtcctcccgc ctacagcggg 600 cccgctgtgg ctctgttggt caccgtcacg gcctacacgg ccgggccctt cacgtctgcc 660 ttcttcaacc ctgccctggc cgcctctgtg acctttgcct gctcgggaca caccttactg 720 gagtacgtgc aggtgtactg gctgggccct ctgacaggga tggtcctggc tgtgctgctg 780 caccagggcc gccttcccca ccttttccag aggaacctgt tctacggcca gaagaacaag 840 taccgagcac cccgagggaa gccggccccg gcctcagggg acacccagac ccctgcaaag 900 gggtccagtg tccgggagcc tgggcgcagt ggtgttgagg ggccacattc cagctgagtg 960 gccttgctct gtgtgagccc cgtgcgaggg ccctgcttgt agctggaccc tggaaccttc 1020 tgtagctaag agggaatcct ggccccctcc ccagaagcca tttgtcaata aaccatttct 1080 aagaaaaaaa aaaaaaa 1097 <210> 35 <211> 2169 <212> PRT <213> Artificial Sequence <220> <223> MUC4 <400> 35 Met Lys Gly Ala Arg Trp Arg Arg Val Pro Trp Val Ser Leu Ser Cys 1 5 10 15 Leu Cys Leu Cys Leu Leu Pro His Val Val Pro Gly Thr Thr Glu Asp 20 25 30 Thr Leu Ile Thr Gly Ser Lys Thr Pro Ala Pro Val Thr Ser Thr Gly 35 40 45 Ser Thr Thr Ala Thr Leu Glu Gly Gln Ser Thr Ala Ala Ser Ser Arg 50 55 60 Thr Ser Asn Gln Asp Ile Ser Ala Ser Ser Gln Asn His Gln Thr Lys 65 70 75 80 Ser Thr Glu Thr Thr Ser Lys Ala Gln Thr Asp Thr Leu Thr Gln Met 85 90 95 Met Thr Ser Thr Leu Phe Ser Ser Pro Ser Val His Asn Val Met Glu 100 105 110 Thr Val Thr Gln Glu Thr Ala Pro Pro Asp Glu Met Thr Thr Ser Phe 115 120 125 Pro Ser Ser Val Thr Asn Thr Leu Met Met Thr Ser Lys Thr Ile Thr 130 135 140 Met Thr Thr Ser Thr Asp Ser Thr Leu Gly Asn Thr Glu Glu Thr Ser 145 150 155 160 Thr Ala Gly Thr Glu Ser Ser Thr Pro Val Thr Ser Ala Val Ser Ile 165 170 175 Thr Ala Gly Gln Glu Gly Gln Ser Arg Thr Thr Ser Trp Arg Thr Ser 180 185 190 Ile Gln Asp Thr Ser Ala Ser Ser Gln Asn His Trp Thr Arg Ser Thr 195 200 205 Gln Thr Thr Arg Glu Ser Gln Thr Ser Thr Leu Thr His Arg 
Thr Thr 210 215 220 Ser Thr Pro Ser Phe Ser Pro Ser Val His Asn Val Thr Gly Thr Val 225 230 235 240 Ser Gln Lys Thr Ser Pro Ser Gly Glu Thr Ala Thr Ser Ser Leu Cys 245 250 255 Ser Val Thr Asn Thr Ser Met Met Thr Ser Glu Lys Ile Thr Val Thr 260 265 270 Thr Ser Thr Gly Ser Thr Leu Gly Asn Pro Gly Glu Thr Ser Ser Val 275 280 285 Pro Val Thr Gly Ser Leu Met Pro Val Thr Ser Ala Ala Leu Val Thr 290 295 300 Val Asp Pro Glu Gly Gln Ser Pro Ala Thr Phe Ser Arg Thr Ser Thr 305 310 315 320 Gln Asp Thr Thr Ala Phe Ser Lys Asn His Gln Thr Gln Ser Val Glu 325 330 335 Thr Thr Arg Val Ser Gln Ile Asn Thr Leu Asn Thr Leu Thr Pro Val 340 345 350 Thr Thr Ser Thr Val Leu Ser Ser Pro Ser Gly Phe Asn Pro Ser Gly 355 360 365 Thr Val Ser Gln Glu Thr Phe Pro Ser Gly Glu Thr Thr Ile Ser Ser 370 375 380 Pro Ser Ser Val Ser Asn Thr Phe Leu Val Thr Ser Lys Val Phe Arg 385 390 395 400 Met Pro Ile Ser Arg Asp Ser Thr Leu Gly Asn Thr Glu Glu Thr Ser 405 410 415 Leu Ser Val Ser Gly Thr Ile Ser Ala Ile Thr Ser Lys Val Ser Thr 420 425 430 Ile Trp Trp Ser Asp Thr Leu Ser Thr Ala Leu Ser Pro Ser Ser Leu 435 440 445 Pro Pro Lys Ile Ser Thr Ala Phe His Thr Gln Gln Ser Glu Gly Ala 450 455 460 Glu Thr Thr Gly Arg Pro His Glu Arg Ser Ser Phe Ser Pro Gly Val 465 470 475 480 Ser Gln Glu Ile Phe Thr Leu His Glu Thr Thr Thr Trp Pro Ser Ser 485 490 495 Phe Ser Ser Lys Gly His Thr Thr Trp Ser Gln Thr Glu Leu Pro Ser 500 505 510 Thr Ser Thr Gly Ala Ala Thr Arg Leu Val Thr Gly Asn Pro Ser Thr 515 520 525 Arg Ala Ala Gly Thr Ile Pro Arg Val Pro Ser Lys Val Ser Ala Ile 530 535 540 Gly Glu Pro Gly Glu Pro Thr Thr Tyr Ser Ser His Ser Thr Thr Leu 545 550 555 560 Pro Lys Thr Thr Gly Ala Gly Ala Gln Thr Gln Trp Thr Gln Glu Thr 565 570 575 Gly Thr Thr Gly Glu Ala Leu Leu Ser Ser Pro Ser Tyr Ser Val Ile 580 585 590 Gln Met Ile Lys Thr Ala Thr Ser Pro Ser Ser Ser Pro Met Leu Asp 595 600 605 Arg His Thr Ser Gln Gln Ile Thr Thr Ala Pro Ser Thr Asn His Ser 610 615 620 Thr Ile His Ser Thr Ser Thr Ser Pro Gln Glu Ser Pro Ala Val 
Ser 625 630 635 640 Gln Arg Gly His Thr Arg Ala Pro Gln Thr Thr Gln Glu Ser Gln Thr 645 650 655 Thr Arg Ser Val Ser Pro Met Thr Asp Thr Lys Thr Val Thr Thr Pro 660 665 670 Gly Ser Ser Phe Thr Ala Ser Gly His Ser Pro Ser Glu Ile Val Pro 675 680 685 Gln Asp Ala Pro Thr Ile Ser Ala Ala Thr Thr Phe Ala Pro Ala Pro 690 695 700 Thr Gly Asn Gly His Thr Thr Gln Ala Pro Thr Thr Ala Leu Gln Ala 705 710 715 720 Ala Pro Ser Ser His Asp Ala Thr Leu Gly Pro Ser Gly Gly Thr Ser 725 730 735 Leu Ser Lys Thr Gly Ala Leu Thr Leu Ala Asn Ser Val Val Ser Thr 740 745 750 Pro Gly Gly Pro Glu Gly Gln Trp Thr Ser Ala Ser Ala Ser Thr Ser 755 760 765 Pro Asp Thr Ala Ala Ala Met Thr His Thr His Gln Ala Glu Ser Thr 770 775 780 Glu Ala Ser Gly Gln Thr Gln Thr Ser Glu Pro Ala Ser Ser Gly Ser 785 790 795 800 Arg Thr Thr Ser Ala Gly Thr Ala Thr Pro Ser Ser Ser Gly Ala Ser 805 810 815 Gly Thr Thr Pro Ser Gly Ser Glu Gly Ile Ser Thr Ser Gly Glu Thr 820 825 830 Thr Arg Phe Ser Ser Asn Pro Ser Arg Asp Ser His Thr Thr Gln Ser 835 840 845 Thr Thr Glu Leu Leu Ser Ala Ser Ala Ser His Gly Ala Ile Pro Val 850 855 860 Ser Thr Gly Met Ala Ser Ser Ile Val Pro Gly Thr Phe His Pro Thr 865 870 875 880 Leu Ser Glu Ala Ser Thr Ala Gly Arg Pro Thr Gly Gln Ser Ser Pro 885 890 895 Thr Ser Pro Ser Ala Ser Pro Gln Glu Thr Ala Ala Ile Ser Arg Met 900 905 910 Ala Gln Thr Gln Arg Thr Gly Thr Ser Arg Gly Ser Asp Thr Ile Ser 915 920 925 Leu Ala Ser Gln Ala Thr Asp Thr Phe Ser Thr Val Pro Pro Thr Pro 930 935 940 Pro Ser Ile Thr Ser Ser Gly Leu Thr Ser Pro Gln Thr Gln Thr His 945 950 955 960 Thr Leu Ser Pro Ser Gly Ser Gly Lys Thr Phe Thr Thr Ala Leu Ile 965 970 975 Ser Asn Ala Thr Pro Leu Pro Val Thr Ser Thr Ser Ser Ala Ser Thr 980 985 990 Gly His Ala Thr Pro Leu Ala Val Ser Ser Ala Thr Ser Ala Ser Thr 995 1000 1005 Val Ser Ser Asp Ser Pro Leu Lys Met Glu Thr Ser Gly Met Thr Thr 1010 1015 1020 Pro Ser Leu Lys Thr Asp Gly Gly Arg Arg Thr Ala Thr Ser Pro Pro 1025 1030 1035 1040 Pro Thr Thr Ser Gln Thr Ile Ile Ser Thr Ile Pro 
Ser Thr Ala Met 1045 1050 1055 His Thr Arg Ser Thr Ala Ala Pro Ile Pro Ile Leu Pro Glu Arg Gly 1060 1065 1070 Val Ser Leu Phe Pro Tyr Gly Ala Asp Ala Gly Asp Leu Glu Phe Val 1075 1080 1085 Arg Arg Thr Val Asp Phe Thr Ser Pro Leu Phe Lys Pro Ala Thr Gly 1090 1095 1100 Phe Pro Leu Gly Ser Ser Leu Arg Asp Ser Leu Tyr Phe Thr Asp Asn 1105 1110 1115 1120 Gly Gln Ile Ile Phe Pro Glu Ser Asp Tyr Gln Ile Phe Ser Tyr Pro 1125 1130 1135 Asn Pro Leu Pro Thr Gly Phe Thr Gly Arg Asp Pro Val Ala Leu Val 1140 1145 1150 Ala Pro Phe Trp Asp Asp Ala Asp Phe Ser Thr Gly Arg Gly Thr Thr 1155 1160 1165 Phe Tyr Gln Glu Tyr Glu Thr Phe Tyr Gly Glu His Ser Leu Leu Val 1170 1175 1180 Gln Gln Ala Glu Ser Trp Ile Arg Lys Ile Thr Asn Asn Gly Gly Tyr 1185 1190 1195 1200 Lys Ala Arg Trp Ala Leu Lys Val Thr Trp Val Asn Ala His Ala Tyr 1205 1210 1215 Pro Ala Gln Trp Thr Leu Gly Ser Asn Thr Tyr Gln Ala Ile Leu Ser 1220 1225 1230 Thr Asp Gly Ser Arg Ser Tyr Ala Leu Phe Leu Tyr Gln Ser Gly Gly 1235 1240 1245 Met Gln Trp Asp Val Ala Gln Arg Ser Gly Lys Pro Val Leu Met Gly 1250 1255 1260 Phe Ser Ser Gly Asp Gly Phe Phe Glu Asn Ser Pro Leu Met Ser Gln 1265 1270 1275 1280 Pro Val Trp Glu Arg Tyr Arg Pro Asp Arg Phe Leu Asn Ser Asn Ser 1285 1290 1295 Gly Leu Gln Gly Leu Gln Phe Tyr Gly Leu His Arg Glu Glu Arg Pro 1300 1305 1310 Asn Tyr Arg Leu Glu Cys Leu Gln Trp Leu Lys Ser Gln Pro Arg Trp 1315 1320 1325 Pro Ser Trp Gly Trp Asn Gln Val Ser Cys Pro Cys Ser Trp Gln Gln 1330 1335 1340 Gly Arg Arg Asp Leu Arg Phe Gln Pro Val Ser Ile Gly Arg Trp Gly 1345 1350 1355 1360 Leu Gly Ser Arg Gln Leu Cys Ser Phe Thr Ser Trp Arg Gly Gly Val 1365 1370 1375 Cys Cys Ser Tyr Gly Pro Trp Gly Glu Phe Arg Glu Gly Trp His Val 1380 1385 1390 Gln Arg Pro Trp Gln Leu Ala Gln Glu Leu Glu Pro Gln Ser Trp Cys 1395 1400 1405 Cys Arg Trp Asn Asp Lys Pro Tyr Leu Cys Ala Leu Tyr Gln Gln Arg 1410 1415 1420 Arg Pro His Val Gly Cys Ala Thr Tyr Arg Pro Pro Gln Pro Ala Trp 1425 1430 1435 1440 Met Phe Gly Asp Pro His Ile Thr Thr Leu Asp Gly 
Val Ser Tyr Thr 1445 1450 1455 Phe Asn Gly Leu Gly Asp Phe Leu Leu Val Gly Ala Gln Asp Gly Asn 1460 1465 1470 Ser Ser Phe Leu Leu Gln Gly Arg Thr Ala Gln Thr Gly Ser Ala Gln 1475 1480 1485 Ala Thr Asn Phe Ile Ala Phe Ala Ala Gln Tyr Arg Ser Ser Ser Leu 1490 1495 1500 Gly Pro Val Thr Val Gln Trp Leu Leu Glu Pro His Asp Ala Ile Arg 1505 1510 1515 1520 Val Leu Leu Asp Asn Gln Thr Val Thr Phe Gln Pro Asp His Glu Asp 1525 1530 1535 Gly Gly Gly Gln Glu Thr Phe Asn Ala Thr Gly Val Leu Leu Ser Arg 1540 1545 1550 Asn Gly Ser Glu Ala Ser Ala Ser Phe Asp Gly Trp Ala Thr Val Ser 1555 1560 1565 Val Ile Ala Leu Ser Asn Ile Leu His Ser Ser Ala Ser Leu Pro Pro 1570 1575 1580 Glu Tyr Gln Asn Arg Thr Glu Gly Leu Leu Gly Val Trp Asn Asn Asn 1585 1590 1595 1600 Pro Glu Asp Asp Phe Arg Met Pro Asn Gly Ser Thr Ile Pro Pro Gly 1605 1610 1615 Ser Pro Glu Glu Met Leu Phe His Phe Gly Met Thr Trp Gln Ile Asn 1620 1625 1630 Gly Thr Gly Leu Leu Gly Lys Arg Asn Asp Gln Leu Pro Ser Asn Phe 1635 1640 1645 Thr Pro Val Phe Tyr Ser Gln Leu Gln Lys Asn Ser Ser Trp Ala Glu 1650 1655 1660 His Leu Ile Ser Asn Cys Asp Gly Asp Ser Ser Cys Ile Tyr Asp Thr 1665 1670 1675 1680 Leu Ala Leu Arg Asn Ala Ser Ile Gly Leu His Thr Arg Glu Val Ser 1685 1690 1695 Lys Asn Tyr Glu Gln Ala Asn Ala Thr Leu Asn Gln Tyr Pro Pro Ser 1700 1705 1710 Ile Asn Gly Gly Arg Val Ile Glu Ala Tyr Lys Gly Gln Thr Thr Leu 1715 1720 1725 Ile Gln Tyr Thr Ser Asn Ala Glu Asp Ala Asn Phe Thr Leu Arg Asp 1730 1735 1740 Ser Cys Thr Asp Leu Glu Leu Phe Glu Asn Gly Thr Leu Leu Trp Thr 1745 1750 1755 1760 Pro Lys Ser Leu Glu Pro Phe Thr Leu Glu Ile Leu Ala Arg Ser Ala 1765 1770 1775 Lys Ile Gly Leu Ala Ser Ala Leu Gln Pro Arg Thr Val Val Cys His 1780 1785 1790 Cys Asn Ala Glu Ser Gln Cys Leu Tyr Asn Gln Thr Ser Arg Val Gly 1795 1800 1805 Asn Ser Ser Leu Glu Val Ala Gly Cys Lys Cys Asp Gly Gly Thr Phe 1810 1815 1820 Gly Arg Tyr Cys Glu Gly Ser Glu Asp Ala Cys Glu Glu Pro Cys Phe 1825 1830 1835 1840 Pro Ser Val His Cys Val Pro Gly Lys Gly Cys Glu 
Ala Cys Pro Pro 1845 1850 1855 Asn Leu Thr Gly Asp Gly Arg His Cys Ala Ala Leu Gly Ser Ser Phe 1860 1865 1870 Leu Cys Gln Asn Gln Ser Cys Pro Val Asn Tyr Cys Tyr Asn Gln Gly 1875 1880 1885 His Cys Tyr Ile Ser Gln Thr Leu Gly Cys Gln Pro Met Cys Thr Cys 1890 1895 1900 Pro Pro Ala Phe Thr Asp Ser Arg Cys Phe Leu Ala Gly Asn Asn Phe 1905 1910 1915 1920 Ser Pro Thr Val Asn Leu Glu Leu Pro Leu Arg Val Ile Gln Leu Leu 1925 1930 1935 Leu Ser Glu Glu Glu Asn Ala Ser Met Ala Glu Val Asn Ala Ser Val 1940 1945 1950 Ala Tyr Arg Leu Gly Thr Leu Asp Met Arg Ala Phe Leu Arg Asn Ser 1955 1960 1965 Gln Val Glu Arg Ile Asp Ser Ala Ala Pro Ala Ser Gly Ser Pro Ile 1970 1975 1980 Gln His Trp Met Val Ile Ser Glu Phe Gln Tyr Arg Pro Arg Gly Pro 1985 1990 1995 2000 Val Ile Asp Phe Leu Asn Asn Gln Leu Leu Ala Ala Val Val Glu Ala 2005 2010 2015 Phe Leu Tyr His Val Pro Arg Arg Ser Glu Glu Pro Arg Asn Asp Val 2020 2025 2030 Val Phe Gln Pro Ile Ser Gly Glu Asp Val Arg Asp Val Thr Ala Leu 2035 2040 2045 Asn Val Ser Thr Leu Lys Ala Tyr Phe Arg Cys Asp Gly Tyr Lys Gly 2050 2055 2060 Tyr Asp Leu Val Tyr Ser Pro Gln Ser Gly Phe Thr Cys Val Ser Pro 2065 2070 2075 2080 Cys Ser Arg Gly Tyr Cys Asp His Gly Gly Gln Cys Gln His Leu Pro 2085 2090 2095 Ser Gly Pro Arg Cys Ser Cys Val Ser Phe Ser Ile Tyr Thr Ala Trp 2100 2105 2110 Gly Glu His Cys Glu His Leu Ser Met Lys Leu Asp Ala Phe Phe Gly 2115 2120 2125 Ile Phe Phe Gly Ala Leu Gly Gly Leu Leu Leu Leu Gly Val Gly Thr 2130 2135 2140 Phe Val Val Leu Arg Phe Trp Gly Cys Ser Gly Ala Arg Phe Ser Tyr 2145 2150 2155 2160 Phe Leu Asn Ser Ala Glu Ala Leu Pro 2165 <210> 36 <211> 9579 <212> DNA <213> Artificial Sequence <220> <223> MUC4 <400> 36 gtctgctcct cacactgcag ctgctgggcc gtggagcttc cccgggagcc agggggactt 60 ttgccgcagc catgaagggg gcacgctgga ggagggtccc ctgggtgtcc ctgagctgcc 120 tgtgtctctg cctccttccg catgtggtcc caggtaagtg atgnnnnnnn nnnnnnnnnn 180 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 240 nnnnnnnnnn nnnnnnnnnn nnnttcactc caggaaccac 
agaggacaca ttaataactg 300 gaagtaaaac tcctgcccca gtcacctcaa caggctcaac aacagcgaca ctagagggac 360 aatcaactgc agcttcttca aggacctcta atcaggacat atcagcttca tctcagaacc 420 accagactaa gagcacggag accaccagca aagctcaaac cgacaccctc acgcagatga 480 tgacatcaac tcttttttct tccccaagtg tacacaatgt gatggagact gttacgcagg 540 agacagctcc tccagatgaa atgaccacat catttccctc cagtgtcacc aacacactca 600 tgatgacatc aaagactata acaatgacaa cctccacaga ctccactctt ggaaacacag 660 aagagacatc aacagcagga actgaaagtt ctaccccagt gacctcagca gtctcaataa 720 cagctggaca ggaaggacaa tcacgaacaa cttcctggag gacctctatc caagacacat 780 cagcttcttc tcagaaccac tggactcgga gcacgcagac caccagggaa tctcaaacca 840 gcaccctaac acacagaacc acttcaactc cttctttctc tccaagtgta cacaatgtga 900 cagggactgt ttctcagaag acatctcctt caggtgaaac agctacctca tccctctgta 960 gtgtcacaaa cacatccatg atgacatcag agaagataac agtgacaacc tccacaggct 1020 ccactcttgg aaacccaggg gagacatcat cagtacctgt tactggaagt cttatgccag 1080 tcacctcagc agccttagta acagttgatc cagaaggaca atcaccagca actttctcaa 1140 ggacttctac tcaggacaca acagcttttt ctaagaacca ccagactcag agcgtggaga 1200 ccaccagagt atctcaaatc aacaccctca acaccctcac accggttaca acatcaactg 1260 ttttatcctc accaagtgga ttcaacccaa gtggaacagt ttctcaggag acattccctt 1320 ctggtgaaac aaccatctca tccccttcca gtgtcagcaa tacattcctg gtaacatcaa 1380 aggtgttcag aatgccaatc tccagagact ctactcttgg aaacacagag gagacatcac 1440 tatctgtaag tggaaccatt tctgcaatca cttccaaagt ttcaaccata tggtggtcag 1500 acactctgtc aacagcactc tcccccagtt ctctacctcc aaaaatatcc acagctttcc 1560 acacccagca gagtgaaggt gcagagacca caggacggcc tcatgagagg agctcattct 1620 ctccaggtgt gtctcaagaa atatttactc tacatgaaac aacaacatgg ccttcctcat 1680 tctccagcaa aggccacaca acttggtcac aaacagaact gccctcaaca tcaacaggtg 1740 ctgccactag gcttgtcaca ggaaatccat ctacaagggc agctggcact attccaaggg 1800 tcccctctaa ggtctcagca ataggggaac caggagagcc caccacatac tcctcccaca 1860 gcacaactct cccaaaaaca acaggggcag gcgcccagac acaatggaca caagaaacgg 1920 ggaccactgg agaggctctt ctcagcagcc caagctatag tgtgattcag atgataaaaa 
1980 cggccacatc cccatcttct tcacctatgc tggatagaca cacatcacaa caaattacaa 2040 cggcaccatc aacaaatcat tcaacaatac attccacaag cacctctcct caggaatcac 2100 cagctgtttc ccaaaggggt cacactcgag ccccgcagac cacacaagaa tcacaaacca 2160 cgaggtccgt ctcccccatg actgacacca agacagtcac caccccaggt tcttccttca 2220 cagccagtgg gcactcgccc tcagaaattg ttcctcagga cgcacccacc ataagtgcag 2280 caacaacctt tgccccagct cccaccggga atggtcacac aacccaggcc ccgaccacag 2340 cactgcaggc agcacccagc agccatgatg ccaccctggg gccctcagga ggcacgtcac 2400 tttccaaaac aggtgccctt actctggcca actctgtagt gtcaacacca gggggcccag 2460 aaggacaatg gacatcagcc tctgccagca cctcacctga cacagcagca gccatgaccc 2520 atacccacca ggctgagagc acagaggcct ctggacaaac acagaccagc gaaccggcct 2580 cctcagggtc acgaaccacc tcagcgggca cagctacccc ttcctcatcc ggggcgagtg 2640 gcacaacacc ttcaggaagc gaaggaatat ccacctcagg agagacgaca aggttttcat 2700 caaacccctc cagggacagt cacacaaccc agtcaacaac cgaattgctg tccgcctcag 2760 ccagtcatgg tgccatccca gtaagcacag gaatggcgtc ttcgatcgtc cccggcacct 2820 ttcatcccac cctctctgag gcctccactg cagggagacc gacaggacag tcaagcccaa 2880 cttctcccag tgcctctcct caggagacag ccgccatttc ccggatggcc cagactcaga 2940 ggacaggaac cagcagaggg tctgacacta tcagcctggc gtcccaggca accgacacct 3000 tctcaacagt cccacccaca cctccatcga tcacatccag tgggcttaca tctccacaaa 3060 cccagaccca cactctgtca ccttcagggt ctggtaaaac cttcaccacg gccctcatca 3120 gcaacgccac ccctcttcct gtcaccagca cctcctcagc ctccacaggt cacgccaccc 3180 ctcttgctgt cagcagtgct acctcagctt ccacagtatc ctcggactcc cctctgaaga 3240 tggaaacatc aggtagctgc cannnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 3300 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 3360 nnacgtgtcc aggaatgaca acaccgtcac tgaagacaga cggtgggaga cgcacagcca 3420 catcaccacc ccccacaacc tcccagacca tcatttccac cattcccagc actgccatgc 3480 acacccgctc cacagctgcc cccatcccca tcctgcctga gagaggtgag gccatnnnnn 3540 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 3600 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnncctgg cccaggagtt tccctcttcc 3660 
cctatggggc agacgccggg gacctggagt tcgtcaggag gaccgtggac ttcacctccc 3720 cactcttcaa gccggcgact ggcttccccc ttggctcctc tctccgtgat tccctctacg 3780 tgagtccggn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 3840 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnna tgtgctcagt 3900 tcacagacaa tggccagatc atcttcccag agtcagacta ccagattttc tcctacccca 3960 acccactccc aacaggcttc acaggccggg accctgtggc cctggtggct ccgttctggg 4020 acgatgctga cttctccact ggtcggggga ccacatttta tcaggtgagc ctttnnnnnn 4080 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 4140 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnncctttc ctaggaatac gagacgttct 4200 atggtgaaca cagcctgcta gtccagcagg ccgagtcttg gattagaaag atcacaaaca 4260 acgggggcta caaggccagg tgggccctaa aggtcacgtg ggtcaatgcc cacgcctatc 4320 ctgcccagtg gaccctcggg gtgagtagac nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 4380 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 4440 nnnnnnnnnn ccacccccag agcaacacct accaagccat cctctccacg gacgggagca 4500 ggtcctatgc cctgtttctc taccagagcg gtgggatgca gtgggacgtg gcccagcgct 4560 caggcaagcc ggtgctcatg ggcttctcta ggtaggatgg gnnnnnnnnn nnnnnnnnnn 4620 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 4680 nnnnnnnnnn nnnnnnnnnn ntttcctgca gtggagatgg ctttttcgaa aacagcccac 4740 tgatgtccca gccagtgtgg gagaggtatc gccctgatag attcctgaat tccaactcag 4800 gtaaaagtgc nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 4860 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn ccgacctcag 4920 gcctccaagg gctgcagttc tacgggctac accgggaaga aaggcccaac taccgtctcg 4980 agtgcctgca gtggctgaag agccagcctc ggtggcccag ctggggctgg aaccaggtct 5040 cctgcccttg ttcctggcag cagggacgac gggacttacg attccaaccc gtcagcatag 5100 gtgacacctc nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 5160 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn cttgtttcag 5220 gtcgctgggg cctcggcagt aggcagctgt gcagcttcac ctcttggcga ggaggcgtgt 5280 gctgcagcta cgggccctgg ggagagtttc gtgaaggctg gcacgtgcag cgtccttggc 5340 agttgggtga 
tctcaannnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 5400 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnttct 5460 ccgcagccca ggaactggag ccacagagct ggtgctgccg ctggaatgac aagccctacc 5520 tctgtgccct gtaccagcag aggcggcccc acgtgggctg tgctacatac aggcccccac 5580 agcccggtga gcgacannnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 5640 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnttcc 5700 ttccagcctg gatgttcggg gacccccaca tcaccacctt ggatggtgtc agttacacct 5760 tcaatgggct gggggacttc ctgctggtcg gggcccaaga cgggaactcc tccttcctgc 5820 ttcagggccg caccgcccag actggctcag cccaggccac caacttcatc gcctttgcgg 5880 ctcagtaccg ctccagcagc ctgggccccg tcacggtgag tgaggnnnnn nnnnnnnnnn 5940 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 6000 nnnnnnnnnn nnnnnnnnnn nnnnnctcct tccaggtcca atggctcctt gagcctcacg 6060 acgcaatccg tgtcctgctg gataaccaga ctgtgacatt tcagcctgac catgaagacg 6120 gcggaggtag gttgggnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 6180 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnngatg 6240 ctccaggcca ggagacgttc aacgccaccg gagtcctcct gagccgcaac ggctctgagg 6300 cctccgccag cttcgacggc tgggccaccg tctcggtgat cgcgctctcc aacatcctcc 6360 actcctccgc cagcctcccg cccgagtacc agaaccgcac ggaggggctc ctgggtgagg 6420 gcggnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 6480 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnntgtccc tcaggggtct 6540 ggaataacaa tccagaggac gacttcagga tgcccaatgg ctccaccatt cccccaggga 6600 gccctgagga gatgcttttc cactttggaa tgacctgtga gtctggnnnn nnnnnnnnnn 6660 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 6720 nnnnnnnnnn nnnnnnnnnn nnnnnntgtg ttacagggca gatcaacggg acaggcctcc 6780 ttggcaagag gaatgaccag ctgccttcca acttcacccc tgttttctac tcacaactgc 6840 aaaaaaacag ctcctgggct gaacatttga tctccaactg tgacggagat agctcatgca 6900 tctatgacac cctggccctg cgcaacgcaa gcatcggact tcacacgagg gaagtcagta 6960 aaaactacga gcaggcgaac gccaccctca gtaagtggcc nnnnnnnnnn nnnnnnnnnn 7020 nnnnnnnnnn nnnnnnnnnn 
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 7080 nnnnnnnnnn nnnnnnnnnn tgtgtttcag atcagtaccc gccctccatc aatggtggtc 7140 gtgtgattga agcctacaag gggcagacca cgctgattca gtacaccagc aatgctgagg 7200 atgccaactt cacgctcaga gacagctgca ccgacttgga gctctttggt aggactatnn 7260 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 7320 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnncc cggggcagag aatgggacgt 7380 tgctgtggac acccaagtcg ctggagccat tcactctgga gattctagca agaagtgcca 7440 agattggctt ggcatctgca ctccagccca ggactgtggt ctgccattgc aatgcagaga 7500 gccagtgttt gtacaatcag accagcaggg tgggcaactc ctccctggag gtgagtgttg 7560 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 7620 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn cctcctccag gtggctggct 7680 gcaagtgtga cgggggcacc ttcggccgct actgcgaggg ctccgaggat gcctgtgagg 7740 agccgtgctt cccgagtgtc cactgcgttc ctgggaaggg ctgcgaggcc tgccctccaa 7800 acctgactgg ggatgggcgg cactgtgcgg gtgagccggg nnnnnnnnnn nnnnnnnnnn 7860 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 7920 nnnnnnnnnn nnnnnnnnnn cactctgcag ctctggggag ctctttcctg tgtcagaacc 7980 agtcctgccc tgtgaattac tgctacaatc aaggccactg ctacatctcc cagactctgg 8040 gctgtcagcc catgtgcacc tgccccccag ccttcactga cagccgctgc ttcctggctg 8100 ggaacaactt cagtccaact gtcaacctag gtaccgccag nnnnnnnnnn nnnnnnnnnn 8160 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 8220 nnnnnnnnnn nnnnnnnnnn ccatctccag aacttccctt aagagtcatc cagctcttgc 8280 tcagtgaaga ggaaaatgcc tccatggcag aggtcaacgc ctcggtcagt gctgnnnnnn 8340 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 8400 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnntctaac ctaggtggca tacagactgg 8460 ggaccctgga catgcgggcc tttctccgca acagccaagt ggaacgaatg taagtgggan 8520 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 8580 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnt cccacacagc gattctgcag 8640 caccggcctc gggaagcccc atccaacact ggatggtcat ctcggagttc cagtaccgcc 8700 ctcggggccc ggtcattgac ttcctgaaca 
accagctgct ggccgcggtg gtggaggcgt 8760 tcttatacca cgttccacgg aggagtgagg agcccaggaa cgacgtggtc ttccagccca 8820 tttccgggga agacgtgcgc gatgtgacag cccgtgagtc cgtnnnnnnn nnnnnnnnnn 8880 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 8940 nnnnnnnnnn nnnnnnnnnn nnnttcccga cagtgaacgt gagcacgctg aaggcttact 9000 tcagatgcga tggctacaag ggctacgacc tggtctacag cccccagagc ggcttcacct 9060 gcgtgtcccc gtgcagtagg ggctactgtg accatggagg ccagtgccag cacctgccca 9120 gtgggccccg ctgcaggtgc atagggnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 9180 nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 9240 nnnnnntggt caccagctgt gtgtccttct ccatctacac ggcctggggc gagcactgtg 9300 agcacctgag catgaaactc gacgcgttct tcggcatctt ctttggggcc ctgggcggcc 9360 tcttgctgct gggggtcggg acgttcgtgg tcctgcgctt ctggggttgc tccggggcca 9420 ggttctccta tttcctgaac tcagctgagg ccttgccttg aaggggcagc tgtggcctag 9480 gctacctcaa gactcacctc atccttaccg cacatttaag gcgccattgc ttttgggaga 9540 ctggaaaagg gaaggtgact gaaggctgtc aggattctt 9579 <210> 37 <211> 530 <212> PRT <213> Artificial Sequence <220> <223> USP17L17 <400> 37 Met Glu Asp Asp Ser Leu Tyr Leu Gly Gly Glu Trp Gln Phe Asn His 1 5 10 15 Phe Ser Lys Leu Thr Ser Ser Arg Pro Asp Ala Ala Phe Ala Glu Ile 20 25 30 Gln Arg Thr Ser Leu Pro Glu Lys Ser Pro Leu Ser Cys Glu Thr Arg 35 40 45 Val Asp Leu Cys Asp Asp Leu Ala Pro Val Ala Arg Gln Leu Ala Pro 50 55 60 Arg Glu Lys Leu Pro Leu Ser Ser Arg Arg Pro Ala Ala Val Gly Ala 65 70 75 80 Gly Leu Gln Asn Met Gly Asn Thr Cys Tyr Val Asn Ala Ser Leu Gln 85 90 95 Cys Leu Thr Tyr Thr Pro Pro Leu Ala Asn Tyr Met Leu Ser Arg Glu 100 105 110 His Ser Gln Thr Cys His Arg His Lys Gly Cys Met Leu Cys Thr Met 115 120 125 Gln Ala His Ile Thr Arg Ala Leu His Asn Pro Gly His Val Ile Gln 130 135 140 Pro Ser Gln Ala Leu Ala Ala Gly Phe His Arg Gly Lys Gln Glu Asp 145 150 155 160 Ala His Glu Phe Leu Met Phe Thr Val Asp Ala Met Lys Lys Ala Cys 165 170 175 Leu Pro Gly His Lys Gln Val Asp His His Ser Lys Asp Thr Thr Leu 180 185 190 Ile 
His Gln Ile Phe Gly Gly Tyr Trp Arg Ser Gln Ile Lys Cys Leu 195 200 205 His Cys His Gly Ile Ser Asp Thr Phe Asp Pro Tyr Leu Asp Ile Ala 210 215 220 Leu Asp Ile Gln Ala Ala Gln Ser Val Gln Gln Ala Leu Glu Gln Leu 225 230 235 240 Val Lys Pro Glu Glu Leu Asn Gly Glu Asn Ala Tyr His Cys Gly Val 245 250 255 Cys Leu Gln Arg Ala Pro Ala Ser Lys Thr Leu Thr Leu His Thr Ser 260 265 270 Ala Lys Val Leu Ile Leu Val Leu Lys Arg Phe Ser Asp Val Thr Gly 275 280 285 Asn Lys Ile Ala Lys Asn Val Gln Tyr Pro Glu Cys Leu Asp Met Gln 290 295 300 Pro Tyr Met Ser Gln Gln Asn Thr Gly Pro Leu Val Tyr Val Leu Tyr 305 310 315 320 Ala Val Leu Val His Ala Gly Trp Ser Cys His Asn Gly His Tyr Phe 325 330 335 Ser Tyr Val Lys Ala Gln Glu Gly Gln Trp Tyr Lys Met Asp Asp Ala 340 345 350 Glu Val Thr Ala Ala Ser Ile Thr Ser Val Leu Ser Gln Gln Ala Tyr 355 360 365 Val Leu Phe Tyr Ile Gln Lys Ser Glu Trp Glu Arg His Ser Glu Ser 370 375 380 Val Ser Arg Gly Arg Glu Pro Arg Ala Leu Gly Ala Glu Asp Thr Asp 385 390 395 400 Arg Arg Ala Thr Gln Gly Glu Leu Lys Arg Asp His Pro Cys Leu Gln 405 410 415 Ala Pro Glu Leu Asp Glu His Leu Val Glu Arg Ala Thr Gln Glu Ser 420 425 430 Thr Leu Asp His Trp Lys Phe Leu Gln Glu Gln Asn Lys Thr Lys Pro 435 440 445 Glu Phe Asn Val Arg Lys Val Glu Gly Thr Leu Pro Pro Asp Val Leu 450 455 460 Val Ile His Gln Ser Lys Tyr Lys Cys Gly Met Lys Asn His His Pro 465 470 475 480 Glu Gln Gln Ser Ser Leu Leu Asn Leu Ser Ser Ser Thr Pro Thr His 485 490 495 Gln Glu Ser Met Asn Thr Gly Thr Leu Ala Ser Leu Arg Gly Arg Ala 500 505 510 Arg Arg Ser Lys Gly Lys Asn Lys His Ser Lys Arg Ala Leu Leu Val 515 520 525 Cys Gln 530 <210> 38 <211> 1593 <212> DNA <213> Artificial Sequence <220> <223> USP17L17 <400> 38 atggaggacg actcactcta cttgggaggt gagtggcagt tcaaccactt ttcaaaactc 60 acatcttctc ggcccgatgc agcttttgct gaaatccagc ggacttctct ccctgagaag 120 tcaccactct catgtgagac ccgtgtcgac ctctgtgatg atttggctcc tgtggcaaga 180 cagcttgctc ccagggagaa gcttcctctg agtagcagga gacctgctgc ggtgggggct 240 gggctccaga 
atatgggaaa tacctgctac gtgaacgctt ccttgcagtg cctgacatac 300 acaccgcccc ttgccaacta catgctgtcc cgggagcact ctcaaacgtg tcatcgtcac 360 aagggctgca tgctctgtac gatgcaagct cacatcacac gggccctcca caatcctggc 420 cacgtcatcc agccctcaca ggcattggct gctggcttcc atagaggcaa gcaggaagat 480 gcccatgaat ttctcatgtt cactgtggat gccatgaaaa aggcatgcct tcccgggcac 540 aagcaggtag atcatcactc taaggacacc accctcatcc accaaatatt tggaggctac 600 tggagatctc aaatcaagtg tctccactgc cacggcattt cagacacttt tgacccttac 660 ctggacatcg ccctggatat ccaggcagct cagagtgtcc agcaagcttt ggaacagttg 720 gtgaagcccg aagaactcaa tggagagaat gcctatcatt gtggtgtttg tctccagagg 780 gcgccggcct ccaagacgtt aactttacac acctctgcca aggtcctcat ccttgtattg 840 aagagattct ccgatgtgac aggcaacaag attgccaaga atgtgcaata tcctgagtgc 900 cttgacatgc agccatacat gtctcagcag aacacaggac ctcttgtcta tgtcctctat 960 gctgtgctgg tccacgctgg gtggagttgt cacaacggac attacttctc ttatgtcaaa 1020 gctcaagaag gccaatggta taaaatggat gatgccgagg tcaccgccgc tagcatcact 1080 tctgtcctga gtcaacaggc ctacgtcctc ttttacatcc agaagagtga atgggaaaga 1140 cacagtgaga gtgtgtcaag aggcagggaa ccaagagccc ttggcgcaga agacacagac 1200 aggcgagcaa cgcaaggaga gctcaagaga gaccacccct gcctccaggc ccccgagttg 1260 gacgagcact tggtggaaag agccactcag gaaagcacct tagaccactg gaaattcctt 1320 caagagcaaa acaaaacgaa gcctgagttc aacgtcagaa aagtcgaagg taccctgcct 1380 cccgacgtac ttgtgattca tcaatcaaaa tacaagtgtg ggatgaagaa ccatcatcct 1440 gaacagcaaa gctccctgct aaacctctct tcgtcgaccc cgacacatca ggagtccatg 1500 aacactggca cactcgcttc cctgcgaggg agggccagga gatccaaagg gaagaacaaa 1560 cacagcaaga gggctctgct tgtgtgccag tga 1593 <210> 39 <211> 530 <212> PRT <213> Artificial Sequence <220> <223> USP17L18 <400> 39 Met Glu Asp Asp Ser Leu Tyr Leu Gly Gly Glu Trp Gln Phe Asn His 1 5 10 15 Phe Ser Lys Leu Thr Ser Ser Arg Pro Asp Ala Ala Phe Ala Glu Ile 20 25 30 Gln Arg Thr Ser Leu Pro Glu Lys Ser Pro Leu Ser Cys Glu Thr Arg 35 40 45 Val Asp Leu Cys Asp Asp Leu Ala Pro Val Ala Arg Gln Leu Ala Pro 50 55 60 Arg Glu Lys Leu Pro Leu Ser Ser Arg Arg 
Pro Ala Ala Val Gly Ala 65 70 75 80 Gly Leu Gln Asn Met Gly Asn Thr Cys Tyr Val Asn Ala Ser Leu Gln 85 90 95 Cys Leu Thr Tyr Thr Pro Pro Leu Ala Asn Tyr Met Leu Ser Arg Glu 100 105 110 His Ser Gln Thr Cys His Arg His Lys Gly Cys Met Leu Cys Thr Met 115 120 125 Gln Ala His Ile Thr Arg Ala Leu His Asn Pro Gly His Val Ile Gln 130 135 140 Pro Ser Gln Ala Leu Ala Ala Gly Phe His Arg Gly Lys Gln Glu Asp 145 150 155 160 Ala His Glu Phe Leu Met Phe Thr Val Asp Ala Met Lys Lys Ala Cys 165 170 175 Leu Pro Gly His Lys Gln Val Asp His His Ser Lys Asp Thr Thr Leu 180 185 190 Ile His Gln Ile Phe Gly Gly Tyr Trp Arg Ser Gln Ile Lys Cys Leu 195 200 205 His Cys His Gly Ile Ser Asp Thr Phe Asp Pro Tyr Leu Asp Ile Ala 210 215 220 Leu Asp Ile Gln Ala Ala Gln Ser Val Gln Gln Ala Leu Glu Gln Leu 225 230 235 240 Val Lys Pro Glu Glu Leu Asn Gly Glu Asn Ala Tyr His Cys Gly Val 245 250 255 Cys Leu Gln Arg Ala Pro Ala Ser Lys Thr Leu Thr Leu His Thr Ser 260 265 270 Ala Lys Val Leu Ile Leu Val Leu Lys Arg Phe Ser Asp Val Thr Gly 275 280 285 Asn Lys Ile Ala Lys Asn Val Gln Tyr Pro Glu Cys Leu Asp Met Gln 290 295 300 Pro Tyr Met Ser Gln Thr Asn Thr Gly Pro Leu Val Tyr Val Leu Tyr 305 310 315 320 Ala Val Leu Val His Ala Gly Trp Ser Cys His Asn Gly His Tyr Phe 325 330 335 Ser Tyr Val Lys Ala Gln Glu Gly Gln Trp Tyr Lys Met Asp Asp Ala 340 345 350 Glu Val Thr Ala Ser Ser Ile Thr Ser Val Leu Ser Gln Gln Ala Tyr 355 360 365 Val Leu Phe Tyr Ile Gln Lys Ser Glu Trp Glu Arg His Ser Glu Ser 370 375 380 Val Ser Arg Gly Arg Glu Pro Arg Ala Leu Gly Ala Glu Asp Thr Asp 385 390 395 400 Arg Arg Ala Lys Gln Gly Glu Leu Lys Arg Asp His Pro Cys Leu Gln 405 410 415 Ala Pro Glu Leu Asp Glu His Leu Val Glu Arg Ala Thr Gln Glu Ser 420 425 430 Thr Leu Asp His Trp Lys Phe Leu Gln Glu Gln Asn Lys Thr Lys Pro 435 440 445 Glu Phe Asn Val Arg Lys Val Glu Gly Thr Leu Pro Pro Asp Val Leu 450 455 460 Val Ile His Gln Ser Lys Tyr Lys Cys Gly Met Lys Asn His His Pro 465 470 475 480 Glu Gln Gln Ser Ser Leu Leu Asn Leu Ser Ser 
Thr Thr Pro Thr His 485 490 495 Gln Glu Ser Met Asn Thr Gly Thr Leu Ala Ser Leu Arg Gly Arg Ala 500 505 510 Arg Arg Ser Lys Gly Lys Asn Lys His Ser Lys Arg Ala Leu Leu Val 515 520 525 Cys Gln 530 <210> 40 <211> 1593 <212> DNA <213> Artificial Sequence <220> <223> USP17L18 <400> 40 atggaggacg actcactcta cttgggaggt gagtggcagt tcaaccactt ttcaaaactc 60 acatcttctc ggcccgatgc agcttttgct gaaatccagc ggacttctct ccctgagaag 120 tcaccactct catgtgagac ccgtgtcgac ctctgtgatg atttggctcc tgtggcaaga 180 cagcttgctc ccagggagaa gcttcctctg agtagcagga gacctgctgc ggtgggggct 240 gggctccaga atatgggaaa tacctgctac gtgaacgctt ccttgcagtg cctgacatac 300 acaccgcccc ttgccaacta catgctgtcc cgggagcact ctcaaacgtg tcatcgtcac 360 aagggctgta tgctctgtac gatgcaagct cacatcacac gggccctcca caatcctggc 420 cacgtcatcc agccctcaca ggcattggct gctggcttcc atagaggcaa gcaggaagat 480 gcccatgaat ttctcatgtt cactgtggat gccatgaaaa aggcatgcct tcccgggcac 540 aagcaggtgg atcatcactc taaggacacc accctcatcc accaaatatt tggaggctac 600 tggagatctc aaatcaagtg tctccactgc cacggcattt cagacacttt tgacccttac 660 ctggacatcg ccctggatat ccaggcagct cagagtgtcc agcaagcttt ggaacagttg 720 gtgaagcccg aagaactcaa tggagagaat gcctatcatt gtggtgtttg tctccagagg 780 gcgccggcct ccaagacgtt aactttacac acctctgcca aggtcctcat ccttgtattg 840 aagagattct ccgatgtcac aggcaacaag attgccaaga atgtgcaata tcctgagtgc 900 cttgacatgc agccatacat gtctcagacg aacacaggac ctctcgtcta tgtcctctat 960 gctgtgctgg tccacgctgg gtggagttgt cacaacggac attacttctc ttatgtcaaa 1020 gctcaagaag gccagtggta taaaatggat gatgccgagg tcaccgcctc tagcatcact 1080 tctgtcctga gtcaacaggc ctacgtcctc ttttacatcc agaagagtga atgggaaaga 1140 cacagtgaga gtgtgtcaag aggcagggaa ccaagagccc ttggcgcaga agacacagac 1200 aggcgagcaa agcaaggaga gctcaagaga gaccacccct gcctccaggc ccccgagttg 1260 gacgagcact tggtggaaag agccactcag gaaagcacct tagaccactg gaaattcctt 1320 caagagcaaa acaaaacgaa gcctgagttc aacgtcagaa aagtcgaagg taccctgcct 1380 cccgacgtac ttgtgattca tcaatcaaaa tacaagtgtg ggatgaagaa ccatcatcct 1440 gaacagcaaa gctccctgct 
aaacctctct tcgacgaccc cgacacatca ggagtccatg 1500 aacactggca cactcgcttc cctgcgaggg agggccagga gatccaaagg gaagaacaaa 1560 cacagcaaga gggctctgct tgtgtgccag tga 1593 <210> 41 <211> 423 <212> PRT <213> Artificial Sequence <220> <223> TMPRSS11E <400> 41 Met Met Tyr Arg Pro Asp Val Val Arg Ala Arg Lys Arg Val Cys Trp 1 5 10 15 Glu Pro Trp Val Ile Gly Leu Val Ile Phe Ile Ser Leu Ile Val Leu 20 25 30 Ala Val Cys Ile Gly Leu Thr Val His Tyr Val Arg Tyr Asn Gln Lys 35 40 45 Lys Thr Tyr Asn Tyr Tyr Ser Thr Leu Ser Phe Thr Thr Asp Lys Leu 50 55 60 Tyr Ala Glu Phe Gly Arg Glu Ala Ser Asn Asn Phe Thr Glu Met Ser 65 70 75 80 Gln Arg Leu Glu Ser Met Val Lys Asn Ala Phe Tyr Lys Ser Pro Leu 85 90 95 Arg Glu Glu Phe Val Lys Ser Gln Val Ile Lys Phe Ser Gln Gln Lys 100 105 110 His Gly Val Leu Ala His Met Leu Leu Ile Cys Arg Phe His Ser Thr 115 120 125 Glu Asp Pro Glu Thr Val Asp Lys Ile Val Gln Leu Val Leu His Glu 130 135 140 Lys Leu Gln Asp Ala Val Gly Pro Pro Lys Val Asp Pro His Ser Val 145 150 155 160 Lys Ile Lys Lys Ile Asn Lys Thr Glu Thr Asp Ser Tyr Leu Asn His 165 170 175 Cys Cys Gly Thr Arg Arg Ser Lys Thr Leu Gly Gln Ser Leu Arg Ile 180 185 190 Val Gly Gly Thr Glu Val Glu Glu Gly Glu Trp Pro Trp Gln Ala Ser 195 200 205 Leu Gln Trp Asp Gly Ser His Arg Cys Gly Ala Thr Leu Ile Asn Ala 210 215 220 Thr Trp Leu Val Ser Ala Ala His Cys Phe Thr Thr Tyr Lys Asn Pro 225 230 235 240 Ala Arg Trp Thr Ala Ser Phe Gly Val Thr Ile Lys Pro Ser Lys Met 245 250 255 Lys Arg Gly Leu Arg Arg Ile Ile Val His Glu Lys Tyr Lys His Pro 260 265 270 Ser His Asp Tyr Asp Ile Ser Leu Ala Glu Leu Ser Ser Pro Val Pro 275 280 285 Tyr Thr Asn Ala Val His Arg Val Cys Leu Pro Asp Ala Ser Tyr Glu 290 295 300 Phe Gln Pro Gly Asp Val Met Phe Val Thr Gly Phe Gly Ala Leu Lys 305 310 315 320 Asn Asp Gly Tyr Ser Gln Asn His Leu Arg Gln Ala Gln Val Thr Leu 325 330 335 Ile Asp Ala Thr Thr Cys Asn Glu Pro Gln Ala Tyr Asn Asp Ala Ile 340 345 350 Thr Pro Arg Met Leu Cys Ala Gly Ser Leu Glu Gly Lys Thr Asp Ala 355 360 365 Cys 
Gln Gly Asp Ser Gly Gly Pro Leu Val Ser Ser Asp Ala Arg Asp 370 375 380 Ile Trp Tyr Leu Ala Gly Ile Val Ser Trp Gly Asp Glu Cys Ala Lys 385 390 395 400 Pro Asn Lys Pro Gly Val Tyr Thr Arg Val Thr Ala Leu Arg Asp Trp 405 410 415 Ile Thr Ser Lys Thr Gly Ile 420 <210> 42 <211> 1358 <212> DNA <213> Artificial Sequence <220> <223> TMPRSS11E <400> 42 ggactcttca ttgctggttg gcaatgatgt atcggccaga tgtggtgagg gctaggaaaa 60 gagtttgttg ggaaccctgg gttatcggcc tcgtcatctt catatccctg attgtcctgg 120 cagtgtgcat tggactcact gttcattatg tgagatataa tcaaaagaag acctacaatt 180 actatagcac attgtcattt acaactgaca aactatatgc tgagtttggc agagaggctt 240 ctaacaattt tacagaaatg agccagagac ttgaatcaat ggtgaaaaat gcattttata 300 aatctccatt aagggaagaa tttgtcaagt ctcaggttat caagttcagt caacagaagc 360 atggagtgtt ggctcatatg ctgttgattt gtagatttca ctctactgag gatcctgaaa 420 ctgtagataa aattgttcaa cttgttttac atgaaaagct gcaagatgct gtaggacccc 480 ctaaagtaga tcctcactca gttaaaatta aaaaaatcaa caagacagaa acagacagct 540 atctaaacca ttgctgcgga acacgaagaa gtaaaactct aggtcagagt ctcaggatcg 600 ttggtgggac agaagtagaa gagggtgaat ggccctggca ggctagcctg cagtgggatg 660 ggagtcatcg ctgtggagca accttaatta atgccacatg gcttgtgagt gctgctcact 720 gttttacaac atataagaac cctgccagat ggactgcttc ctttggagta acaataaaac 780 cttcgaaaat gaaacggggt ctccggagaa taattgtcca tgaaaaatac aaacacccat 840 cacatgacta tgatatttct cttgcagagc tttctagccc tgttccctac acaaatgcag 900 tacatagagt ttgtctccct gatgcatcct atgagtttca accaggtgat gtgatgtttg 960 tgacaggatt tggagcactg aaaaatgatg gttacagtca aaatcatctt cgacaagcac 1020 aggtgactct catagacgct acaacttgca atgaacctca agcttacaat gacgccataa 1080 ctcctagaat gttatgtgct ggctccttag aaggaaaaac agatgcatgc cagggtgact 1140 ctggaggacc actggttagt tcagatgcta gagatatctg gtaccttgct ggaatagtga 1200 gctggggaga tgaatgtgcg aaacccaaca agcctggtgt ttatactaga gttacggcct 1260 tgcgggactg gattacttca aaaactggta tctaagagac aaaagcctca tggaacagat 1320 aacatttttt tttgtttttt gggtgtggag gccatttt 1358 <210> 43 <211> 530 <212> PRT <213> Artificial Sequence <220> 
<223> UGT2B17 <400> 43 Met Ser Leu Lys Trp Met Ser Val Phe Leu Leu Met Gln Leu Ser Cys 1 5 10 15 Tyr Phe Ser Ser Gly Ser Cys Gly Lys Val Leu Val Trp Pro Thr Glu 20 25 30 Tyr Ser His Trp Ile Asn Met Lys Thr Ile Leu Glu Glu Leu Val Gln 35 40 45 Arg Gly His Glu Val Ile Val Leu Thr Ser Ser Ala Ser Ile Leu Val 50 55 60 Asn Ala Ser Lys Ser Ser Ala Ile Lys Leu Glu Val Tyr Pro Thr Ser 65 70 75 80 Leu Thr Lys Asn Asp Leu Glu Asp Phe Phe Met Lys Met Phe Asp Arg 85 90 95 Trp Thr Tyr Ser Ile Ser Lys Asn Thr Phe Trp Ser Tyr Phe Ser Gln 100 105 110 Leu Gln Glu Leu Cys Trp Glu Tyr Ser Asp Tyr Asn Ile Lys Leu Cys 115 120 125 Glu Asp Ala Val Leu Asn Lys Lys Leu Met Arg Lys Leu Gln Glu Ser 130 135 140 Lys Phe Asp Val Leu Leu Ala Asp Ala Val Asn Pro Cys Gly Glu Leu 145 150 155 160 Leu Ala Glu Leu Leu Asn Ile Pro Phe Leu Tyr Ser Leu Arg Phe Ser 165 170 175 Val Gly Tyr Thr Val Glu Lys Asn Gly Gly Gly Phe Leu Phe Pro Pro 180 185 190 Ser Tyr Val Pro Val Val Met Ser Glu Leu Ser Asp Gln Met Ile Phe 195 200 205 Met Glu Arg Ile Lys Asn Met Ile Tyr Met Leu Tyr Phe Asp Phe Trp 210 215 220 Phe Gln Ala Tyr Asp Leu Lys Lys Trp Asp Gln Phe Tyr Ser Glu Val 225 230 235 240 Leu Gly Arg Pro Thr Thr Leu Phe Glu Thr Met Gly Lys Ala Glu Met 245 250 255 Trp Leu Ile Arg Thr Tyr Trp Asp Phe Glu Phe Pro Arg Pro Phe Leu 260 265 270 Pro Asn Val Asp Phe Val Gly Gly Leu His Cys Lys Pro Ala Lys Pro 275 280 285 Leu Pro Lys Glu Met Glu Glu Phe Val Gln Ser Ser Gly Glu Asn Gly 290 295 300 Ile Val Val Phe Ser Leu Gly Ser Met Ile Ser Asn Met Ser Glu Glu 305 310 315 320 Ser Ala Asn Met Ile Ala Ser Ala Leu Ala Gln Ile Pro Gln Lys Val 325 330 335 Leu Trp Arg Phe Asp Gly Lys Lys Pro Asn Thr Leu Gly Ser Asn Thr 340 345 350 Arg Leu Tyr Lys Trp Leu Pro Gln Asn Asp Leu Leu Gly His Pro Lys 355 360 365 Thr Lys Ala Phe Ile Thr His Gly Gly Thr Asn Gly Ile Tyr Glu Ala 370 375 380 Ile Tyr His Gly Ile Pro Met Val Gly Ile Pro Leu Phe Ala Asp Gln 385 390 395 400 His Asp Asn Ile Ala His Met Lys Ala Lys Gly Ala Ala Leu Ser Val 405 410 
415 Asp Ile Arg Thr Met Ser Ser Arg Asp Leu Leu Asn Ala Leu Lys Ser 420 425 430 Val Ile Asn Asp Pro Ile Tyr Lys Glu Asn Ile Met Lys Leu Ser Arg 435 440 445 Ile His His Asp Gln Pro Val Lys Pro Leu Asp Arg Ala Val Phe Trp 450 455 460 Ile Glu Phe Val Met Arg His Lys Gly Ala Lys His Leu Arg Val Ala 465 470 475 480 Ala His Asn Leu Thr Trp Ile Gln Tyr His Ser Leu Asp Val Ile Ala 485 490 495 Phe Leu Leu Ala Cys Val Ala Thr Met Ile Phe Met Ile Thr Lys Cys 500 505 510 Cys Leu Phe Cys Phe Arg Lys Leu Ala Lys Thr Gly Lys Lys Lys Lys 515 520 525 Arg Asp 530 <210> 44 <211> 2099 <212> DNA <213> Artificial Sequence <220> <223> UGT2B17 <400> 44 gaaagaaaca acaactggaa aagaagcact gcataagacc aggatgtctc tgaaatggat 60 gtcagtcttt ctgctgatgc agctcagttg ttactttagc tctgggagtt gtggaaaggt 120 gctggtgtgg cccacagaat acagccattg gataaatatg aagacaatcc tggaagagct 180 tgttcagagg ggtcatgagg tgattgtgtt gacatcttcg gcttctattc ttgtcaatgc 240 cagtaaatca tctgctatta aattagaagt ttatcctaca tctttaacta aaaatgattt 300 ggaagatttt tttatgaaaa tgttcgatag atggacatat agtatttcaa aaaatacatt 360 ttggtcatat ttttcacaac tacaagaatt gtgttgggaa tattctgact ataatataaa 420 gctctgtgaa gatgcagttt tgaacaagaa acttatgaga aaactacaag agtcaaaatt 480 tgatgtcctt ctggcagatg ccgttaatcc ctgtggtgag ctgctggctg agctacttaa 540 catacccttt ctgtacagtc tccgcttctc tgttggctac acagttgaga agaatggtgg 600 aggatttctg ttccctcctt cctatgtacc tgttgttatg tcagaattaa gtgatcaaat 660 gattttcatg gagaggataa aaaatatgat atatatgctt tattttgact tttggtttca 720 agcatatgat ctgaagaagt gggaccagtt ttatagtgaa gttctaggaa gacccactac 780 attatttgag acaatgggga aagctgaaat gtggctcatt cgaacctatt gggattttga 840 atttcctcgc ccattcttac caaatgttga ttttgttgga ggacttcact gtaaaccagc 900 caaacccttg cctaaggaaa tggaagagtt tgtgcagagc tctggagaaa atggtattgt 960 ggtgttttct ctggggtcga tgatcagtaa catgtcagaa gaaagtgcca acatgattgc 1020 atcagccctt gcccagatcc cacaaaaggt tctatggaga tttgatggca agaagccaaa 1080 tactttaggt tccaatactc gactgtataa gtggttaccc cagaatgacc ttcttggtca 1140 tcccaaaacc aaagctttta 
taactcatgg tggaaccaat ggcatctatg aggcaatcta 1200 ccatgggatc cctatggtgg gcattccctt gtttgcggat caacatgata acattgctca 1260 catgaaagcc aagggagcag ccctcagtgt ggacatcagg accatgtcaa gtagagattt 1320 gctcaatgca ttgaagtcag tcattaatga ccctatctat aaagagaata tcatgaaatt 1380 atcaagaatt catcatgatc aaccggtgaa gcccctggat cgagcagtct tctggattga 1440 gtttgtcatg cgccataaag gagccaagca ccttcgggtc gcagcccaca acctcacctg 1500 gatccagtac cactctttgg atgtgatagc attcctgctg gcctgcgtgg caactatgat 1560 atttatgatc acaaaatgtt gcctgttttg tttccgaaag cttgccaaaa caggaaagaa 1620 gaagaaaagg gattagttat atcaaaagcc tgaagtggaa tgaccaaaag atgggactcc 1680 tcctttattc cagcatggag ggttttaaat ggaggatttc ctttttcctg cgacaaaacg 1740 tcttttcaca acttaccctg ttaagtcaaa atttattttc caggaattta atatgtactt 1800 tagttggaat tattctatgt caatgatttt taagctatga aaaataataa tataaaacct 1860 tatgggctta tattgaaatt tattattcta atccaaaagt taccccacac aaaagttact 1920 gagcttcctt atgtttcaca cattgtattt gaacacaaaa cattaacaac tccactcata 1980 gtatcaacat tgttttgcaa atactcagaa tattttggct tcattttgag cagaattttt 2040 gtttttaatt ttgccaatga aatcttcaat aattaaaaaa aaaaaaaaaa aaaaaaaaa 2099 <210> 45 <211> 2839 <212> PRT <213> Artificial Sequence <220> <223> PDZD2 <400> 45 Met Pro Ile Thr Gln Asp Asn Ala Val Leu His Leu Pro Leu Leu Tyr 1 5 10 15 Gln Trp Leu Gln Asn Ser Leu Gln Glu Gly Gly Asp Gly Pro Glu Gln 20 25 30 Arg Leu Cys Gln Ala Ala Ile Gln Lys Leu Gln Glu Tyr Ile Gln Leu 35 40 45 Asn Phe Ala Val Asp Glu Ser Thr Val Pro Pro Asp His Ser Pro Pro 50 55 60 Glu Met Glu Ile Cys Thr Val Tyr Leu Thr Lys Glu Leu Gly Asp Thr 65 70 75 80 Glu Thr Val Gly Leu Ser Phe Gly Asn Ile Pro Val Phe Gly Asp Tyr 85 90 95 Gly Glu Lys Arg Arg Gly Gly Lys Lys Arg Lys Thr His Gln Gly Pro 100 105 110 Val Leu Asp Val Gly Cys Ile Trp Val Thr Glu Leu Arg Lys Asn Ser 115 120 125 Pro Ala Gly Lys Ser Gly Lys Val Arg Leu Arg Asp Glu Ile Leu Ser 130 135 140 Leu Asn Gly Gln Leu Met Val Gly Val Asp Val Ser Gly Ala Ser Tyr 145 150 155 160 Leu Ala Glu Gln Cys Trp Asn Gly Gly Phe Ile Tyr Leu 
Ile Met Leu 165 170 175 Arg Arg Phe Lys His Lys Ala His Ser Thr Tyr Asn Gly Asn Ser Ser 180 185 190 Asn Ser Ser Glu Pro Gly Glu Thr Pro Thr Leu Glu Leu Gly Asp Arg 195 200 205 Thr Ala Lys Lys Gly Lys Arg Thr Arg Lys Phe Gly Val Ile Ser Arg 210 215 220 Pro Pro Ala Asn Lys Ala Pro Glu Glu Ser Lys Gly Ser Ala Gly Cys 225 230 235 240 Glu Val Ser Ser Asp Pro Ser Thr Glu Leu Glu Asn Gly Pro Asp Pro 245 250 255 Glu Leu Gly Asn Gly His Val Phe Gln Leu Glu Asn Gly Pro Asp Ser 260 265 270 Leu Lys Glu Val Ala Gly Pro His Leu Glu Arg Ser Glu Val Asp Arg 275 280 285 Gly Thr Glu His Arg Ile Pro Lys Thr Asp Ala Pro Leu Thr Thr Ser 290 295 300 Asn Asp Lys Arg Arg Phe Ser Lys Gly Gly Lys Thr Asp Phe Gln Ser 305 310 315 320 Ser Asp Cys Leu Ala Arg Glu Glu Val Gly Arg Ile Trp Lys Met Glu 325 330 335 Leu Leu Lys Glu Ser Asp Gly Leu Gly Ile Gln Val Ser Gly Gly Arg 340 345 350 Gly Ser Lys Arg Ser Pro His Ala Ile Val Val Thr Gln Val Lys Glu 355 360 365 Gly Gly Ala Ala His Arg Asp Gly Arg Leu Ser Leu Gly Asp Glu Leu 370 375 380 Leu Val Ile Asn Gly His Leu Leu Val Gly Leu Ser His Glu Glu Ala 385 390 395 400 Val Ala Ile Leu Arg Ser Ala Thr Gly Met Val Gln Leu Val Val Ala 405 410 415 Ser Lys Glu Asn Ser Ala Glu Asp Leu Leu Arg Leu Thr Ser Lys Ser 420 425 430 Leu Pro Asp Leu Thr Ser Ser Val Glu Asp Val Ser Ser Trp Thr Asp 435 440 445 Asn Glu Asp Gln Glu Ala Asp Gly Glu Glu Asp Glu Gly Thr Ser Ser 450 455 460 Ser Val Gln Arg Ala Met Pro Gly Thr Asp Glu Pro Gln Asp Val Cys 465 470 475 480 Gly Ala Glu Glu Ser Lys Gly Asn Leu Glu Ser Pro Lys Gln Gly Ser 485 490 495 Asn Lys Ile Lys Leu Lys Ser Arg Leu Ser Gly Gly Val His Arg Leu 500 505 510 Glu Ser Val Glu Glu Tyr Asn Glu Leu Met Val Arg Asn Gly Asp Pro 515 520 525 Arg Ile Arg Met Leu Glu Val Ser Arg Asp Gly Arg Lys His Ser Leu 530 535 540 Pro Gln Leu Leu Asp Ser Ser Ser Ala Ser Gln Glu Tyr His Ile Val 545 550 555 560 Lys Lys Ser Thr Arg Ser Leu Ser Thr Thr Gln Val Glu Ser Pro Trp 565 570 575 Arg Leu Ile Arg Pro Ser Val Ile Ser Ile Ile Gly Leu Tyr 
Lys Glu 580 585 590 Lys Gly Lys Gly Leu Gly Phe Ser Ile Ala Gly Gly Arg Asp Cys Ile 595 600 605 Arg Gly Gln Met Gly Ile Phe Val Lys Thr Ile Phe Pro Asn Gly Ser 610 615 620 Ala Ala Glu Asp Gly Arg Leu Lys Glu Gly Asp Glu Ile Leu Asp Val 625 630 635 640 Asn Gly Ile Pro Ile Lys Gly Leu Thr Phe Gln Glu Ala Ile His Thr 645 650 655 Phe Lys Gln Ile Arg Ser Gly Leu Phe Val Leu Thr Val Arg Thr Lys 660 665 670 Leu Val Ser Pro Ser Leu Thr Pro Cys Ser Thr Pro Thr His Met Ser 675 680 685 Arg Ser Ala Ser Pro Asn Phe Asn Thr Ser Gly Gly Ala Ser Ala Gly 690 695 700 Gly Ser Asp Glu Gly Ser Ser Ser Ser Leu Gly Arg Lys Thr Pro Gly 705 710 715 720 Pro Lys Asp Arg Ile Val Met Glu Val Thr Leu Asn Lys Glu Pro Arg 725 730 735 Val Gly Leu Gly Ile Gly Ala Cys Cys Leu Ala Leu Glu Asn Ser Pro 740 745 750 Pro Gly Ile Tyr Ile His Ser Leu Ala Pro Gly Ser Val Ala Lys Met 755 760 765 Glu Ser Asn Leu Ser Arg Gly Asp Gln Ile Leu Glu Val Asn Ser Val 770 775 780 Asn Val Arg His Ala Ala Leu Ser Lys Val His Ala Ile Leu Ser Lys 785 790 795 800 Cys Pro Pro Gly Pro Val Arg Leu Val Ile Gly Arg His Pro Asn Pro 805 810 815 Lys Val Ser Glu Gln Glu Met Asp Glu Val Ile Ala Arg Ser Thr Tyr 820 825 830 Gln Glu Ser Lys Glu Ala Asn Ser Ser Pro Gly Leu Gly Thr Pro Leu 835 840 845 Lys Ser Pro Ser Leu Ala Lys Lys Asp Ser Leu Ile Ser Glu Ser Glu 850 855 860 Leu Ser Gln Tyr Phe Ala His Asp Val Pro Gly Pro Leu Ser Asp Phe 865 870 875 880 Met Val Ala Gly Ser Glu Asp Glu Asp His Pro Gly Ser Gly Cys Ser 885 890 895 Thr Ser Glu Glu Gly Ser Leu Pro Pro Ser Thr Ser Thr His Lys Glu 900 905 910 Pro Gly Lys Pro Arg Ala Asn Ser Leu Val Thr Leu Gly Ser His Arg 915 920 925 Ala Ser Gly Leu Phe His Lys Gln Val Thr Val Ala Arg Gln Ala Ser 930 935 940 Leu Pro Gly Ser Pro Gln Ala Leu Arg Asn Pro Leu Leu Arg Gln Arg 945 950 955 960 Lys Val Gly Cys Tyr Asp Ala Asn Asp Ala Ser Asp Glu Glu Glu Phe 965 970 975 Asp Arg Glu Gly Asp Cys Ile Ser Leu Pro Gly Ala Leu Pro Gly Pro 980 985 990 Ile Arg Pro Leu Ser Glu Asp Asp Pro Arg Arg Val Ser Ile Ser 
Ser 995 1000 1005 Ser Lys Gly Met Asp Val His Asn Gln Glu Glu Arg Pro Arg Lys Thr 1010 1015 1020 Leu Val Ser Lys Ala Ile Ser Ala Pro Leu Leu Gly Ser Ser Val Asp 1025 1030 1035 1040 Leu Glu Glu Ser Ile Pro Glu Gly Met Val Asp Ala Ala Ser Tyr Ala 1045 1050 1055 Ala Asn Leu Thr Asp Ser Ala Glu Ala Pro Lys Gly Ser Pro Gly Ser 1060 1065 1070 Trp Trp Lys Lys Glu Leu Ser Gly Ser Ser Ser Ala Pro Lys Leu Glu 1075 1080 1085 Tyr Thr Val Arg Thr Asp Thr Gln Ser Pro Thr Asn Thr Gly Ser Pro 1090 1095 1100 Ser Ser Pro Gln Gln Lys Ser Glu Gly Leu Gly Ser Arg His Arg Pro 1105 1110 1115 1120 Val Ala Arg Val Ser Pro His Cys Lys Arg Ser Glu Ala Glu Ala Lys 1125 1130 1135 Pro Ser Gly Ser Gln Thr Val Asn Leu Thr Gly Arg Ala Asn Asp Pro 1140 1145 1150 Cys Asp Leu Asp Ser Arg Val Gln Ala Thr Ser Val Lys Val Thr Val 1155 1160 1165 Ala Gly Phe Gln Pro Gly Gly Ala Val Glu Lys Glu Ser Leu Gly Lys 1170 1175 1180 Leu Thr Thr Gly Asp Ala Cys Val Ser Thr Ser Cys Glu Leu Ala Ser 1185 1190 1195 1200 Ala Leu Ser His Leu Asp Ala Ser His Leu Thr Glu Asn Leu Pro Lys 1205 1210 1215 Ala Ala Ser Glu Leu Gly Gln Gln Pro Met Thr Glu Leu Asp Ser Ser 1220 1225 1230 Ser Asp Leu Ile Ser Ser Pro Gly Lys Lys Gly Ala Ala His Pro Asp 1235 1240 1245 Pro Ser Lys Thr Ser Val Asp Thr Gly Gln Val Ser Arg Pro Glu Asn 1250 1255 1260 Pro Ser Gln Pro Ala Ser Pro Arg Val Thr Lys Cys Lys Ala Arg Ser 1265 1270 1275 1280 Pro Val Arg Leu Pro His Glu Gly Ser Pro Ser Pro Gly Glu Lys Ala 1285 1290 1295 Ala Ala Pro Pro Asp Tyr Ser Lys Thr Arg Ser Ala Ser Glu Thr Ser 1300 1305 1310 Thr Pro His Asn Thr Arg Arg Val Ala Ala Leu Arg Gly Ala Gly Pro 1315 1320 1325 Gly Ala Glu Gly Met Thr Pro Ala Gly Ala Val Leu Pro Gly Asp Pro 1330 1335 1340 Leu Thr Ser Gln Glu Gln Arg Gln Gly Ala Pro Gly Asn His Ser Lys 1345 1350 1355 1360 Ala Leu Glu Met Thr Gly Ile His Ala Pro Glu Ser Ser Gln Glu Pro 1365 1370 1375 Ser Leu Leu Glu Gly Ala Asp Ser Val Ser Ser Arg Ala Pro Gln Ala 1380 1385 1390 Ser Leu Ser Met Leu Pro Ser Thr Asp Asn Thr Lys Glu Ala Cys 
Gly 1395 1400 1405 His Val Ser Gly His Cys Cys Pro Gly Gly Ser Arg Glu Ser Pro Val 1410 1415 1420 Thr Asp Ile Asp Ser Phe Ile Lys Glu Leu Asp Ala Ser Ala Ala Arg 1425 1430 1435 1440 Ser Pro Ser Ser Gln Thr Gly Asp Ser Gly Ser Gln Glu Gly Ser Ala 1445 1450 1455 Gln Gly His Pro Pro Ala Gly Ala Gly Gly Gly Ser Ser Cys Arg Ala 1460 1465 1470 Glu Pro Val Pro Gly Gly Gln Thr Ser Ser Pro Arg Arg Ala Trp Ala 1475 1480 1485 Ala Gly Ala Pro Ala Tyr Pro Gln Trp Ala Ser Gln Pro Ser Val Leu 1490 1495 1500 Asp Ser Ile Asn Pro Asp Lys His Phe Thr Val Asn Lys Asn Phe Leu 1505 1510 1515 1520 Ser Asn Tyr Ser Arg Asn Phe Ser Ser Phe His Glu Asp Ser Thr Ser 1525 1530 1535 Leu Ser Gly Leu Gly Asp Ser Thr Glu Pro Ser Leu Ser Ser Met Tyr 1540 1545 1550 Gly Asp Ala Glu Asp Ser Ser Ser Asp Pro Glu Ser Leu Thr Glu Ala 1555 1560 1565 Pro Arg Ala Ser Ala Arg Asp Gly Trp Ser Pro Pro Arg Ser Arg Val 1570 1575 1580 Ser Leu His Lys Glu Asp Pro Ser Glu Ser Glu Glu Glu Gln Ile Glu 1585 1590 1595 1600 Ile Cys Ser Thr Arg Gly Cys Pro Asn Pro Pro Ser Ser Pro Ala His 1605 1610 1615 Leu Pro Thr Gln Ala Ala Ile Cys Pro Ala Ser Ala Lys Val Leu Ser 1620 1625 1630 Leu Lys Tyr Ser Thr Pro Arg Glu Ser Val Ala Ser Pro Arg Glu Lys 1635 1640 1645 Ala Ala Cys Leu Pro Gly Ser Tyr Thr Ser Gly Pro Asp Ser Ser Gln 1650 1655 1660 Pro Ser Ser Leu Leu Glu Met Ser Ser Gln Glu His Glu Thr His Ala 1665 1670 1675 1680 Asp Ile Ser Thr Ser Gln Asn His Arg Pro Ser Cys Ala Glu Glu Thr 1685 1690 1695 Thr Glu Val Thr Ser Ala Ser Ser Ala Met Glu Asn Ser Pro Leu Ser 1700 1705 1710 Lys Val Ala Arg His Phe His Ser Pro Pro Ile Ile Leu Ser Ser Pro 1715 1720 1725 Asn Met Val Asn Gly Leu Glu His Asp Leu Leu Asp Asp Glu Thr Leu 1730 1735 1740 Asn Gln Tyr Glu Thr Ser Ile Asn Ala Ala Ala Ser Leu Ser Ser Phe 1745 1750 1755 1760 Ser Val Asp Val Pro Lys Asn Gly Glu Ser Val Leu Glu Asn Leu His 1765 1770 1775 Ile Ser Glu Ser Gln Asp Leu Asp Asp Leu Leu Gln Lys Pro Lys Met 1780 1785 1790 Ile Ala Arg Arg Pro Ile Met Ala Trp Phe Lys Glu Ile Asn Lys 
His 1795 1800 1805 Asn Gln Gly Thr His Leu Arg Ser Lys Thr Glu Lys Glu Gln Pro Leu 1810 1815 1820 Met Pro Ala Arg Ser Pro Asp Ser Lys Ile Gln Met Val Ser Ser Ser 1825 1830 1835 1840 Gln Lys Lys Gly Val Thr Val Pro His Ser Pro Pro Gln Pro Lys Thr 1845 1850 1855 Asn Leu Glu Asn Lys Asp Leu Ser Lys Lys Ser Pro Ala Glu Met Leu 1860 1865 1870 Leu Thr Asn Gly Gln Lys Ala Lys Cys Gly Pro Lys Leu Lys Arg Leu 1875 1880 1885 Ser Leu Lys Gly Lys Ala Lys Val Asn Ser Glu Ala Pro Ala Ala Asn 1890 1895 1900 Ala Val Lys Ala Gly Gly Thr Asp His Arg Lys Pro Leu Ile Ser Pro 1905 1910 1915 1920 Gln Thr Ser His Lys Thr Leu Ser Lys Ala Val Ser Gln Arg Leu His 1925 1930 1935 Val Ala Asp His Glu Asp Pro Asp Arg Asn Thr Thr Ala Ala Pro Arg 1940 1945 1950 Ser Pro Gln Cys Val Leu Glu Ser Lys Pro Pro Leu Ala Thr Ser Gly 1955 1960 1965 Pro Leu Lys Pro Ser Val Ser Asp Thr Ser Ile Arg Thr Phe Val Ser 1970 1975 1980 Pro Leu Thr Ser Pro Lys Pro Val Pro Glu Gln Gly Met Trp Ser Arg 1985 1990 1995 2000 Phe His Met Ala Val Leu Ser Glu Pro Asp Arg Gly Cys Pro Thr Thr 2005 2010 2015 Pro Lys Ser Pro Lys Cys Arg Ala Glu Gly Arg Ala Pro Arg Ala Asp 2020 2025 2030 Ser Gly Pro Val Ser Pro Ala Ala Ser Arg Asn Gly Met Ser Val Ala 2035 2040 2045 Gly Asn Arg Gln Ser Glu Pro Arg Leu Ala Ser His Val Ala Ala Asp 2050 2055 2060 Thr Ala Gln Pro Arg Pro Thr Gly Glu Lys Gly Gly Asn Ile Met Ala 2065 2070 2075 2080 Ser Asp Arg Leu Glu Arg Thr Asn Gln Leu Lys Ile Val Glu Ile Ser 2085 2090 2095 Ala Glu Ala Val Ser Glu Thr Val Cys Gly Asn Lys Pro Ala Glu Ser 2100 2105 2110 Asp Arg Arg Gly Gly Cys Leu Ala Gln Gly Asn Cys Gln Glu Lys Ser 2115 2120 2125 Glu Ile Arg Leu Tyr Arg Gln Val Ala Glu Ser Ser Thr Ser His Pro 2130 2135 2140 Ser Ser Leu Pro Ser His Ala Ser Gln Ala Glu Gln Glu Met Ser Arg 2145 2150 2155 2160 Ser Phe Ser Met Ala Lys Leu Ala Ser Ser Ser Ser Ser Leu Gln Thr 2165 2170 2175 Ala Ile Arg Lys Ala Glu Tyr Ser Gln Gly Lys Ser Ser Leu Met Ser 2180 2185 2190 Asp Ser Arg Gly Val Pro Arg Asn Ser Ile Pro Gly Gly Pro Ser 
Gly 2195 2200 2205 Glu Asp His Leu Tyr Phe Thr Pro Arg Pro Ala Thr Arg Thr Tyr Ser 2210 2215 2220 Met Pro Ala Gln Phe Ser Ser His Phe Gly Arg Glu Gly His Pro Pro 2225 2230 2235 2240 His Ser Leu Gly Arg Ser Arg Asp Ser Gln Val Pro Val Thr Ser Ser 2245 2250 2255 Val Val Pro Glu Ala Lys Ala Ser Arg Gly Gly Leu Pro Ser Leu Ala 2260 2265 2270 Asn Gly Gln Gly Ile Tyr Ser Val Lys Pro Leu Leu Asp Thr Ser Arg 2275 2280 2285 Asn Leu Pro Ala Thr Asp Glu Gly Asp Ile Ile Ser Val Gln Glu Thr 2290 2295 2300 Ser Cys Leu Val Thr Asp Lys Ile Lys Val Thr Arg Arg His Tyr Cys 2305 2310 2315 2320 Tyr Glu Gln Asn Trp Pro His Glu Ser Thr Ser Phe Phe Ser Val Lys 2325 2330 2335 Gln Arg Ile Lys Ser Phe Glu Asn Leu Ala Asn Ala Asp Arg Pro Val 2340 2345 2350 Ala Lys Ser Gly Ala Ser Pro Phe Leu Ser Val Ser Ser Lys Pro Pro 2355 2360 2365 Ile Gly Arg Arg Ser Ser Gly Ser Ile Val Ser Gly Ser Leu Gly His 2370 2375 2380 Pro Gly Asp Ala Ala Ala Arg Leu Leu Arg Arg Ser Leu Ser Ser Cys 2385 2390 2395 2400 Ser Glu Asn Gln Ser Glu Ala Gly Thr Leu Leu Pro Gln Met Ala Lys 2405 2410 2415 Ser Pro Ser Ile Met Thr Leu Thr Ile Ser Arg Gln Asn Pro Pro Glu 2420 2425 2430 Thr Ser Ser Lys Gly Ser Asp Ser Glu Leu Lys Lys Ser Leu Gly Pro 2435 2440 2445 Leu Gly Ile Pro Thr Pro Thr Met Thr Leu Ala Ser Pro Val Lys Arg 2450 2455 2460 Asn Lys Ser Ser Val Arg His Thr Gln Pro Ser Pro Val Ser Arg Ser 2465 2470 2475 2480 Lys Leu Gln Glu Leu Arg Ala Leu Ser Met Pro Asp Leu Asp Lys Leu 2485 2490 2495 Cys Ser Glu Asp Tyr Ser Ala Gly Pro Ser Ala Val Leu Phe Lys Thr 2500 2505 2510 Glu Leu Glu Ile Thr Pro Arg Arg Ser Pro Gly Pro Pro Ala Gly Gly 2515 2520 2525 Val Ser Cys Pro Glu Lys Gly Gly Asn Arg Ala Cys Pro Gly Gly Ser 2530 2535 2540 Gly Pro Lys Thr Ser Ala Ala Glu Thr Pro Ser Ser Ala Ser Asp Thr 2545 2550 2555 2560 Gly Glu Ala Ala Gln Asp Leu Pro Phe Arg Arg Ser Trp Ser Val Asn 2565 2570 2575 Leu Asp Gln Leu Leu Val Ser Ala Gly Asp Gln Gln Arg Leu Gln Ser 2580 2585 2590 Val Leu Ser Ser Val Gly Ser Lys Ser Thr Ile Leu Thr Leu Ile 
Gln 2595 2600 2605 Glu Ala Lys Ala Gln Ser Glu Asn Glu Glu Asp Val Cys Phe Ile Val 2610 2615 2620 Leu Asn Arg Lys Glu Gly Ser Gly Leu Gly Phe Ser Val Ala Gly Gly 2625 2630 2635 2640 Thr Asp Val Glu Pro Lys Ser Ile Thr Val His Arg Val Phe Ser Gln 2645 2650 2655 Gly Ala Ala Ser Gln Glu Gly Thr Met Asn Arg Gly Asp Phe Leu Leu 2660 2665 2670 Ser Val Asn Gly Ala Ser Leu Ala Gly Leu Ala His Gly Asn Val Leu 2675 2680 2685 Lys Val Leu His Gln Ala Gln Leu His Lys Asp Ala Leu Val Val Ile 2690 2695 2700 Lys Lys Gly Met Asp Gln Pro Arg Pro Ser Ala Arg Gln Glu Pro Pro 2705 2710 2715 2720 Thr Ala Asn Gly Lys Gly Leu Leu Ser Arg Lys Thr Ile Pro Leu Glu 2725 2730 2735 Pro Gly Ile Gly Arg Ser Val Ala Val His Asp Ala Leu Cys Val Glu 2740 2745 2750 Val Leu Lys Thr Ser Ala Gly Leu Gly Leu Ser Leu Asp Gly Gly Lys 2755 2760 2765 Ser Ser Val Thr Gly Asp Gly Pro Leu Val Ile Lys Arg Val Tyr Lys 2770 2775 2780 Gly Gly Ala Ala Glu Gln Ala Gly Ile Ile Glu Ala Gly Asp Glu Ile 2785 2790 2795 2800 Leu Ala Ile Asn Gly Lys Pro Leu Val Gly Leu Met His Phe Asp Ala 2805 2810 2815 Trp Asn Ile Met Lys Ser Val Pro Glu Gly Pro Val Gln Leu Leu Ile 2820 2825 2830 Arg Lys His Arg Asn Ser Ser 2835 <210> 46 <211> 11984 <212> DNA <213> Artificial Sequence <220> <223> PDZD2 <400> 46 gaggctcggc ggatcccctg cgcagcgagg cgaggagcgg accccagcgc cggtgcgtgc 60 cggccccggg cagcgggacg cggcggggcg gcggctgcag gcagccgagg agccgcaggc 120 cgaacccaag gcaccgggat tgcgcctccc gcggctgccg gcgaaccgcg gctctgcagc 180 tcggggcagg cgcggcggcg gcaccggtgg tggccgcggt ggcggcagct gcgcggggac 240 ccgccgggcg gcgcctgggt ctggacgcgc gaggaagccg cgggagcctc ggccaagccg 300 cgagcaggtg tgaatgagcc cagggaagga cacacggcca ctgctggagg gatcctccat 360 tcctgtgtca tttgcatggg tcctgctgtg aaatgaacct ggcagggact tgttagacac 420 ttccttcctt ccctcattga gcactccagt gccattgttc cacagttgtt ctaattgggt 480 cctagcttcc tcctgccaag gcaaacagca tagtctcgag taggtgtccc taggctcatc 540 tgccagcctg aacatgaaca caggcaaagc tgatgatggc cagggacccc aggggacgtg 600 gggccctgtg gggtctggcc cccaggagca agacctctga 
tgatgctggt gtctgggagt 660 gagcaccatg cccatcaccc aggacaatgc cgtgctgcac ctgcccctcc tctaccagtg 720 gctgcagaac agcctgcagg aaggtgggga tgggccggag cagcggctct gccaggcggc 780 catccagaag ctgcaggagt acatccagct gaactttgct gtggatgaga gtacggtccc 840 acctgatcac agcccccccg aaatggagat ctgtactgtg tacctcacca aggagctggg 900 ggacacagag actgtgggcc tgagttttgg gaacatccct gttttcgggg actatggtga 960 aaagcgcagg gggggcaaga agaggaaaac ccaccagggt cctgtgctgg atgtgggctg 1020 catctgggtg acagagctga ggaagaacag cccagcaggg aagagtggga aggtccgact 1080 gcgggatgag atcctctcac tgaatgggca gctgatggtt ggagttgatg tcagtggggc 1140 cagttacctg gctgagcagt gctggaatgg cggctttatc tacctgatca tgctgcgtcg 1200 ctttaagcac aaagcccact ccacttataa tggcaacagt agcaacagct ctgaaccagg 1260 agaaacacct accttggagc tgggtgaccg aactgcgaaa aaggggaaac gaaccagaaa 1320 gtttggggtc atctccaggc ctcctgccaa caaggcccct gaagaatcca agggcagcgc 1380 tggctgtgag gtgtccagtg accccagcac tgagctggag aacggccctg accctgaact 1440 tggaaacggc catgtctttc agctagaaaa tggcccagat tctctcaagg aggtggctgg 1500 accccatcta gagaggtcag aagtggacag agggacagag catagaattc caaagacaga 1560 tgctcctctg accacaagca atgacaaacg ccgcttctca aaaggtggga agacggactt 1620 ccaatcgagt gactgcctgg cacgggagga agttggccga atatggaaga tggagctgct 1680 caaagaatcg gatgggctgg gaattcaggt tagtggaggc cgaggatcaa agcgctcacc 1740 tcacgctatc gttgtcactc aagtgaagga aggaggtgcc gctcacaggg atggcaggct 1800 gtccttagga gatgagctgc tggtaatcaa tggtcattta ctggtcgggc tctcccacga 1860 ggaagcagtg gccattcttc gctccgccac gggaatggtg cagcttgtgg tggccagcaa 1920 ggaaaactcc gcagaggacc tcctcaggtt aacatctaag agcttgccag atctgaccag 1980 ctcggtagaa gatgtgtcct cctggactga taacgaagac caggaggcag acggggaaga 2040 ggacgaagga accagctctt ctgtccagag agcaatgcct gggacagatg aaccccaaga 2100 tgtgtgcggt gctgaggaat ccaaggggaa cttggaaagt cccaaacagg gcagcaataa 2160 aatcaagctc aagagtcgcc tttcaggggg tgtacaccgc cttgagtcag ttgaagaata 2220 taacgagctg atggtgcgga atggggaccc ccggatccgg atgttggagg tctcccgaga 2280 tggccggaaa cactccctcc cgcagctgct ggactcttcc agtgcctcac 
aggaatacca 2340 cattgtgaag aagtctaccc gctccttaag cacgactcag gtggaatctc cttggaggct 2400 cattcggcca tccgtcatct cgatcattgg gttgtacaaa gaaaaaggca agggccttgg 2460 ctttagtatt gctggaggtc gagactgcat tcgtggacag atggggattt ttgtcaagac 2520 catcttccca aatggatcag ctgcagagga cggaagactt aaagaagggg atgaaatcct 2580 agatgtaaat ggaataccaa taaagggctt gacatttcaa gaagccattc atacctttaa 2640 gcaaatccgg agtggattat ttgttttaac ggtacgcaca aagttggtga gccccagcct 2700 cacaccctgc tcgacaccca cacacatgag cagatccgcc tccccgaact tcaataccag 2760 tgggggagcc tcagcgggag gttccgatga aggcagttct tcatccctgg gtcggaagac 2820 ccctgggccc aaggacagga tcgtcatgga agtaacactc aacaaagagc caagagttgg 2880 attaggcatt ggtgcctgct gcttggctct ggaaaacagt cctcctggca tctacattca 2940 cagccttgct ccaggatcag tggccaagat ggagagcaac ctgagccgcg gggatcaaat 3000 cctggaagtg aactccgtca acgtccgcca tgctgcttta agcaaagtcc acgccatctt 3060 gagtaaatgc cctccaggac ccgttcgcct tgtcatcggc cggcacccta atccaaaggt 3120 ttccgagcag gaaatggatg aagtcatagc acgcagcact tatcaggaga gcaaagaggc 3180 caattcctct cctggcttag gtaccccctt gaagagtccc tctcttgcaa aaaaggactc 3240 ccttatttct gaatctgaac tctcccagta ctttgcccac gatgtccctg gccccttgtc 3300 agacttcatg gtggccggtt ctgaggacga ggatcacccg ggaagtggct gcagcacgtc 3360 ggaggagggc agcctgcctc ccagcacctc cactcacaag gagcctggaa aacccagagc 3420 caacagcctc gtgactcttg ggagccatcg ggcttctggg ctcttccaca agcaggtgac 3480 agttgccaga caagccagtc tccccggaag cccacaggcc ctccgaaacc ctctcctccg 3540 ccagaggaag gtaggctgct acgatgccaa cgatgccagt gatgaggaag agtttgacag 3600 agaaggggac tgcatttcac tcccaggggc cctcccgggt cccatcaggc ctctgtcaga 3660 ggatgacccg aggcgtgtct caatttcctc ttccaagggc atggacgtcc acaaccaaga 3720 ggaacgaccc cggaaaacac tggtgagcaa ggccatctcg gcacctcttc ttggtagctc 3780 agtggactta gaggagagta tcccagaggg catggtggat gctgcgtcct atgcagccaa 3840 cctcacggac tctgcagagg cccccaaggg gagccctgga agctggtgga agaaggaact 3900 gtcaggatca agtagcgcac ccaaattgga atacacagtc cgtacagaca cccagagtcc 3960 gacgaacact gggagcccca gttcccccca gcagaaaagt gaaggcctgg gctccaggca 
4020 cagaccagtg gccagggtaa gcccccactg caagagatcc gaggctgagg ccaagcccag 4080 tggctcacag acagtgaacc tgactggcag agccaatgat ccatgcgatc tggactcgag 4140 agtccaggcc acttctgtca aagtgactgt cgctggcttt cagccaggtg gagctgtgga 4200 gaaggaatct ctgggaaagc tgaccactgg agatgcttgt gtctctacca gctgtgaact 4260 agccagtgct ctgtcccatc tggatgccag ccacctcaca gagaacctgc ccaaagctgc 4320 atcagagctg gggcaacaac ccatgactga actggacagc tcctcagacc tcatctcttc 4380 cccagggaag aagggggccg ctcatcctga ccccagcaag acctctgtag acacagggca 4440 agtcagtcgg ccagagaatc ccagccagcc tgcatcgccc agggtcacca agtgcaaggc 4500 caggtctcca gtcaggctcc cccatgaggg cagcccctcc ccgggggaga aagcagcggc 4560 tccccctgac tacagcaaga ctcgatcagc atcggaaacc agcacacccc acaataccag 4620 gagggtggct gccctcaggg gagcgggacc tggagcagag ggaatgacac cagctggtgc 4680 tgtcctgcca ggagaccccc tcacatccca ggagcagaga cagggagctc caggtaacca 4740 cagtaaggct ctggaaatga caggaatcca tgcacctgaa agctcccagg agccttccct 4800 gctggaggga gcagattctg tgtcctcaag ggcaccgcag gccagcctct ccatgctgcc 4860 atccactgac aacaccaaag aagcatgtgg ccatgtctcg gggcactgct gcccaggggg 4920 gagtagagag agccctgtga cggacattga cagcttcatc aaggagctgg atgcttctgc 4980 agcaaggtct ccgtcttccc agacggggga cagtggctct caggagggca gtgctcaggg 5040 ccacccacca gccggggctg gaggtgggag ctcctgccgt gccgaaccag tcccgggggg 5100 ccagacctcc tccccgagga gggcctgggc tgctggtgcc cccgcctacc cacaatgggc 5160 ctcccagcct tcggttttag attcaattaa tcccgacaaa cattttactg tgaacaaaaa 5220 ctttctgagc aactactcta gaaattttag cagttttcat gaagacagca cctccctatc 5280 aggcctgggt gacagcacgg agccgtctct gtcatccatg tatggcgatg ctgaggattc 5340 ttcttctgac cctgagtcac tcactgaagc cccacgagct tctgccaggg acggctggtc 5400 ccctcctcgt tcccgtgtgt ctttgcacaa ggaagatcct tcggagtcag aagaggaaca 5460 gattgagatt tgttccacac gtggctgccc caatccaccc tcgagtcctg ctcatcttcc 5520 cacccaggct gccatctgtc ctgcctcagc caaagttctg tcattaaaat acagcactcc 5580 gagagagtcg gtggccagtc cccgtgagaa ggccgcctgc ttgccaggct catacacttc 5640 aggcccagac tcttcccagc catcatcact cttggagatg agctctcagg agcatgaaac 5700 
tcatgcggac ataagcactt cacagaacca caggccctcg tgtgcagaag aaaccacaga 5760 agtcaccagc gctagctcag ccatggaaaa cagtccgctg tctaaagtag ccaggcattt 5820 tcacagtccg cccatcattc tcagctcccc caacatggta aatggcttgg aacatgacct 5880 gctagatgac gaaaccctga atcaatacga aacaagcatt aatgcagctg ccagtctgtc 5940 ctccttcagt gtggatgtcc ctaagaatgg agaatctgtt ttggaaaacc tccacatctc 6000 tgaaagtcaa gacctggatg acttgctaca gaaaccaaaa atgatcgcta ggaggcccat 6060 catggcctgg tttaaagaaa taaataaaca taaccaaggc acacatttga ggagcaaaac 6120 cgagaaggaa caacctctaa tgcctgccag aagtcccgac tccaagattc agatggtgag 6180 ttcaagccaa aaaaagggcg ttactgtgcc tcatagccct cctcagccga aaacaaacct 6240 ggaaaataag gacctgtcta agaagagtcc ggcagaaatg cttctgacta atggtcagaa 6300 ggcaaagtgt ggtccgaagc tgaagaggct cagcctcaag ggcaaggcca aagtcaactc 6360 tgaggcccct gctgcgaatg ctgtgaaggc tggggggacg gaccacagga aacccttgat 6420 ctcaccccag acctcccaca aaacactttc taaggcagtg tcacagcggc tccatgtagc 6480 cgaccacgag gaccctgaca gaaacaccac agctgccccc aggtcccccc agtgtgtgct 6540 ggaaagcaag ccacctcttg ccacctctgg gccactgaaa ccctcagtgt ctgacacgag 6600 catcaggaca tttgtctcgc ccctgacctc tcccaagcct gttcctgagc aaggcatgtg 6660 gagcaggttc cacatggctg tcctctctga acccgacaga ggttgcccaa ccacccctaa 6720 atctcctaag tgtagagcag agggcagggc gccccgtgct gactccgggc cggtgagtcc 6780 ggcagcgtct aggaacggca tgtccgtggc agggaacaga cagagtgagc cgcgcctggc 6840 cagccatgtg gcagcagaca cagcccaacc caggccgact ggcgaaaaag gaggcaacat 6900 aatggccagc gatcgcctcg aaagaacaaa ccagctgaaa atcgtggaga tttctgctga 6960 agcagtgtca gagactgtat gtggtaacaa gccagctgaa agcgacagac ggggagggtg 7020 cttggcccag ggcaactgtc aggagaagag tgaaatcagg ctctatcgcc aggtcgcaga 7080 atcatccaca agtcatccat cctcactccc atctcatgcc tcccaggcag agcaggaaat 7140 gtcacgatca ttcagcatgg caaaactggc gtcctcctcc tcctcccttc aaacagccat 7200 tagaaaggca gaatactccc agggaaaatc aagcctgatg tcagactccc gaggggtgcc 7260 cagaaacagc attccagggg gcccctcggg ggaggaccat ctctacttca ccccaaggcc 7320 agcgaccagg acctactcca tgccagccca gttctcaagc cattttggac gggagggtca 7380 ccccccacac 
agcctgggtc gctctcggga cagccaggtc cctgtgacaa gcagtgttgt 7440 ccccgaggca aaggcatcca gaggtggtct tcccagcctg gctaatggac agggcatata 7500 tagtgtaaag ccgctgctgg acacatcgag gaatcttcca gccacagatg aaggggatat 7560 catttcagtc caggagacga gctgcctagt cacagacaaa atcaaagtca ccagacgaca 7620 ctactgctat gagcagaact ggccccatga atctacctca tttttctctg tgaagcagcg 7680 gatcaagtct tttgagaacc tggccaatgc tgaccggcct gtagccaagt ccggggcttc 7740 cccatttttg tcggtgagct ccaagcctcc cattgggagg cggtcttccg gcagcattgt 7800 ttccgggagc ctgggccacc caggtgacgc agcagcaagg ttgttgagac gcagcttgag 7860 ttcctgcagc gaaaaccaaa gcgaagccgg caccctcctg ccccagatgg ccaagtctcc 7920 ctcaatcatg acactgacca tctctcggca gaacccacca gagaccagta gcaagggctc 7980 tgattcggaa ctaaagaaat cacttggtcc tttgggaatt cccaccccaa cgatgaccct 8040 ggcttctcct gttaagagga acaagtcctc ggtacgccac acgcagccct cgcccgtgtc 8100 ccgctccaag ctccaggagc tgagagcctt gagcatgcct gaccttgaca agctctgcag 8160 cgaggattac tcagcagggc cgagcgccgt gctcttcaaa actgagctgg agatcacccc 8220 caggaggtca cctggccctc ctgctggagg cgtttcgtgt cccgagaagg gcgggaacag 8280 ggcctgtcca ggaggaagtg gccctaaaac cagtgctgct gagacaccca gttcagccag 8340 tgatacgggt gaagctgccc aggatctgcc ttttagaaga agctggtcag ttaatttgga 8400 tcaacttcta gtctcagcgg gggaccagca aagattacag tctgttttat cgtcagtggg 8460 atcgaaatct accatcctaa ctctcattca ggaagcgaaa gcacaatcag agaatgaaga 8520 agatgtttgc ttcatagtct tgaatagaaa agaaggctca ggtctgggat tcagtgtggc 8580 aggagggaca gatgtggagc caaaatcaat cacggtccac agggtgtttt ctcagggggc 8640 ggcttctcag gaagggacta tgaaccgagg ggatttcctt ctgtcagtca acggcgcctc 8700 actggctggc ttagcccacg ggaatgtcct gaaggttctg caccaggcac agctgcacaa 8760 agatgccctc gtggtcatca agaaagggat ggatcagccc aggccctctg cccggcagga 8820 gcctcccaca gccaatggga agggtttgct gtccagaaag accatccccc tggagcctgg 8880 cattgggaga agtgtggctg tacacgatgc tctgtgtgtt gaagtgctga agacctcggc 8940 tgggctggga ctgagtctgg atgggggaaa atcatcggtg acgggagatg ggcccttggt 9000 cattaaaaga gtgtacaaag gtggtgcggc tgaacaagct ggaataatag aagctggaga 9060 tgaaattctt gctattaatg 
ggaaacctct ggttgggctc atgcactttg atgcctggaa 9120 tattatgaag tctgtcccag aaggacctgt gcagttatta attagaaagc ataggaattc 9180 ttcatgaatt ttaacaagaa tcattttctc agttctcttc tttctttagc aaatcagagt 9240 gacttcttta aaccacaggt tgttgaaatg gccaacactg gtacagacac ggactataaa 9300 aatctccaag cttgtgctta cacatgaagc ctgacttaac tgtatgtgca acagcaatga 9360 aattaactcc agaagccttc cacctgcgtc acccaggccg ggagggttcc ttcgttccag 9420 tgcctgtccc ctacctttat gttatgttta ctgatgggga tacaagatgt gacacaccct 9480 tctttatttg aaacaaacaa acatttagct agacctttgc ttccttcttg ccagctctcc 9540 caacataccc aatcctggtg atcagggaac taaaagtctg agggggacac aaatgtcaca 9600 cctaagagga caatcaatca ttttgtatga ttttgtaagt aaatgacaga atgcttttag 9660 gcacattcaa tggaaggagg agatgtaggt ctgtatatgt taccctgaaa agagaataag 9720 acttacttaa aaaaatgaat tatgacctgt taggctgagc tcaggaattg tccaaaaagg 9780 aaaaagcaaa ataattaatt gagagtattt tttagtgagt gtaatgtata atgtacgtat 9840 gcaaagttca actcaatagg ttattgatca ccatgaagta ttgatcattt tctatctcaa 9900 aagtgtaagc cataaggctg ttttacagaa tagcacttct gataagctgt attaaatagc 9960 catgagcttc actgcttaga gggagcagaa aggtcaacat ctaaaagcac cttacaacta 10020 gtttttgaac ctgtcttgat aagtgcttga attcaagact ggtcagtcca agagcagaca 10080 aaaatatcac aagtcagtca gtcactgggt ttccatttct gaattttatg cactccaacc 10140 atgaatttaa actaaatttt tagaaatcaa gtatctttct aagtgtcctt ggatttatag 10200 acaatgtatg tacaatccaa atagaggagc ttaatggaat ccttttagga gactggttgg 10260 tttttttccc tctttcccaa catgtttaag aaatgtaaca ttctaagtat tggatctctt 10320 ttcttgacct agtataatga caactgcagt gacttaagtt tttgctgttt tcgttttccc 10380 gctttgcaat ttcctccttt tgccaaaaat gttttcctac agaagactgt cgtgactcac 10440 gctacttggg aaactcactc tggccactcc tcctctggtg gcatgagctg cttcccagta 10500 gctattccga ttggatattc cgttcgtcgt cacatagctg gcttttctct cctcatgatg 10560 taccttattt tcttaggtaa ataattccaa actctcatcg ggtcataaag aggaggagaa 10620 acagggtgag tcaaggtaaa ggagcagaaa tgtagttaca agccaggtcg tcttcagtgg 10680 cacaaaccaa cccgttgagc cctgacaaca tgagtggaga gtgcatttgc catacctgtg 10740 tgcatgacac 
taagatttta tgttggagat acttctttaa ataacctaca gcttgggtct 10800 atggctgtga cccccagatt catggagggg ctttagccat cagctttgta catcatcatt 10860 tttctgaatg accaatccca ctaaacatct ttgaagtcgg cctagagagg tccttcagat 10920 gagagagaaa tagctggctt gtctgagtcc agatttctca tcaactggca atacaaagga 10980 aaatatggta caggagttag ttagaaaggt cttattgatt ttacttctac ttttcactac 11040 agttacaggt agaatactgt aggaagtcag tgcaaggtgc atgcttgatt gatagatatt 11100 gattgattgt ttttcagtct ctggggtcag ttttgtggtt tctgctttct tgcctaaatc 11160 aaagactatt tcaagtcaac aacactgaaa actgcttttc gcctccactc ttacagctgt 11220 gcctaataat aattaattaa taaacgcaca gccctatgtg aacagacagg aatttcttgt 11280 gcaatgtgga gcaaatggaa tggtctcctt ccgcaagtct ttttaatcct catatctgga 11340 gtacaagggt agacctctgg cttaccacat acactatgct aaagtcatca gccactgcta 11400 ctacatcttg ccagaaggtt tccctcgcca acaaacagtt gaaatttaag ggaagaagca 11460 aaagctaaac tgtctttgac cctaagatag atagaaagct atttatttgt cttcagtgtt 11520 caaggcatga ctagtatttc taattagcct aataaattcc cacactttct gaagtgaaca 11580 ctaatggtat tgtcctacta aaactgtcat tgtttctttt tttttaactg gtcagtcatt 11640 cacaataagc tatgagggta aataaatatg tgttataaca agtaaaccgt agttgcaaga 11700 atataccatg aagattaaag taggctgggt ttcatttcca tcttcccaca catctcattg 11760 aatttgatgg ttgacttaat tggcaccata actttgtatg atattataca ttaaccttta 11820 tttatgtaaa gtaaaatgcc ttatatatta aagagtaagt gcaataatat gaaatagcct 11880 gtacatttta aaaatgttgt caccaagtta tataaatcca catctctgta aacaaccttt 11940 tttaagtaat tttaaaaaaa ataaacactc tgcttactac ttga 11984 <210> 47 <211> 298 <212> PRT <213> Artificial Sequence <220> <223> GOLPH3 <400> 47 Met Thr Ser Leu Thr Gln Arg Ser Ser Gly Leu Val Gln Arg Arg Thr 1 5 10 15 Glu Ala Ser Arg Asn Ala Ala Asp Lys Glu Arg Ala Ala Gly Gly Gly 20 25 30 Ala Gly Ser Ser Glu Asp Asp Ala Gln Ser Arg Arg Asp Glu Gln Asp 35 40 45 Asp Asp Asp Lys Gly Asp Ser Lys Glu Thr Arg Leu Thr Leu Met Glu 50 55 60 Glu Val Leu Leu Leu Gly Leu Lys Asp Arg Glu Gly Tyr Thr Ser Phe 65 70 75 80 Trp Asn Asp Cys Ile Ser Ser Gly Leu Arg Gly Cys Met Leu Ile Glu 
85 90 95 Leu Ala Leu Arg Gly Arg Leu Gln Leu Glu Ala Cys Gly Met Arg Arg 100 105 110 Lys Ser Leu Leu Thr Arg Lys Val Ile Cys Lys Ser Asp Ala Pro Thr 115 120 125 Gly Asp Val Leu Leu Asp Glu Ala Leu Lys His Val Lys Glu Thr Gln 130 135 140 Pro Pro Glu Thr Val Gln Asn Trp Ile Glu Leu Leu Ser Gly Glu Thr 145 150 155 160 Trp Asn Pro Leu Lys Leu His Tyr Gln Leu Arg Asn Val Arg Glu Arg 165 170 175 Leu Ala Lys Asn Leu Val Glu Lys Gly Val Leu Thr Thr Glu Lys Gln 180 185 190 Asn Phe Leu Leu Phe Asp Met Thr Thr His Pro Leu Thr Asn Asn Asn 195 200 205 Ile Lys Gln Arg Leu Ile Lys Lys Val Gln Glu Ala Val Leu Asp Lys 210 215 220 Trp Val Asn Asp Pro His Arg Met Asp Arg Arg Leu Leu Ala Leu Ile 225 230 235 240 Tyr Leu Ala His Ala Ser Asp Val Leu Glu Asn Ala Phe Ala Pro Leu 245 250 255 Leu Asp Glu Gln Tyr Asp Leu Ala Thr Lys Arg Val Arg Gln Leu Leu 260 265 270 Asp Leu Asp Pro Glu Val Glu Cys Leu Lys Ala Asn Thr Asn Glu Val 275 280 285 Leu Trp Ala Val Val Ala Ala Phe Thr Lys 290 295 <210> 48 <211> 2678 <212> DNA <213> Artificial Sequence <220> <223> GOLPH3 <400> 48 atattggaaa ggcgccgccg ccgcctccgc cttggagctc ggggtgtttc ggggactgcg 60 gccacaggca ggaaggcgct cctctcctgc cccgccgacg cccggccagc ccgcttcgcc 120 ctgacctgtt tcctcatgac tgcccccggc cctgctgccg acggacgtcg ccccggcgtc 180 cggatttaac acggaaaccc ggatcggagg ccgcgcgggg aggaggaggg cgacccggtc 240 ggtcctgcga ccctctcggc ccggctcggc gcctcggcgg gagccatgac ctcgctgacc 300 cagcgcagct ccggcctggt gcagcggcgc accgaggcct cccgcaacgc cgccgacaag 360 gagcgggcgg cgggcggcgg cgccggcagc agcgaggacg acgcgcagag ccgccgcgac 420 gagcaggacg acgacgacaa gggcgactcc aaggaaacgc ggctgaccct gatggaggaa 480 gtgctcctgc tgggcctcaa ggaccgcgag ggttacacat cattttggaa tgactgtata 540 tcatctggat tacgtggctg tatgttaatt gaattagcat tgagaggaag gttacaacta 600 gaggcttgtg gaatgagacg taaaagtcta ttaacaagaa aggtaatctg taagtcagat 660 gctccaacag gggatgttct tcttgatgaa gctctgaagc atgttaagga aactcagcct 720 ccagaaacgg tccagaactg gattgaatta cttagtggtg agacatggaa tccattaaaa 780 ttgcattatc agttaagaaa 
tgtacgggaa cgattagcta aaaacctggt ggaaaagggt 840 gtattgacaa cagagaaaca gaacttccta ctttttgaca tgacaacaca tcccctcacc 900 aataacaaca ttaagcagcg cctcatcaag aaagtacagg aagccgttct tgacaaatgg 960 gtgaatgacc ctcaccgcat ggacaggcgc ttgctggccc tcatttacct ggctcatgcc 1020 tcggacgtcc tggagaatgc ttttgctcct cttctggacg agcagtatga tttggctacc 1080 aagagagtgc ggcagcttct cgacttagac cctgaagtgg aatgtctgaa ggccaacacc 1140 aatgaggttc tgtgggcggt ggtggcggcg ttcaccaagt aactctgctc ggggtgaacc 1200 attctccttt ctctcaagta aaccagtagt ttttcttctg ttgacttctg gttttctgta 1260 atttgtactt tcccacacta taattggctt ctgttttaca aaatggtggg tggctttttc 1320 ttttttgtac gtgtacagga ttctgctggt acgagaggcc ttcctctttc tgtttttaaa 1380 aaaagtttta ctgccatatt ggcattccat tccctgttgc catcctcact gttacctgtt 1440 ttgggtttct ggtctacttt gactttcaaa gtacctccag cctcctcata cgcacagctt 1500 ttggatgacc tcagcttgag tttctccata tgtgcatgta catctagcat tctgcctaca 1560 gttcagacag aagtcacaaa aaggccttca actcaccaaa ggtaaatatc tgtatctatt 1620 aggacatttt ttacatagac ttcagttgag atgtatactt agcaaaatta tttttaaatt 1680 gaaacagcac agtaaatact taatataaaa tgtcccttgg attttgcttc ccatgtaaat 1740 ctattgtatt attacacttg ttataatttt aactataaag gtccaattgt ttcacagagc 1800 cagtttggga tgggctgcat tccatttatg ctgtatatag tttgaattat atataaatta 1860 ccccttcttc tggccacccc tgctcccatc ttagtatttt gcaagatcta atcagttgta 1920 cacctggtgc ccctcgcttg cttcaatcat ggttatttga tggcaaaatc gacctcttgt 1980 cgctgaagga gagagaaaag atgtgtgtct gattggtcct gggatttttt gagctgtgcc 2040 atttatggta ctctttgcct atgcatcccc ttgttagatt ttttttaaat tttatcttac 2100 tgtttttata atttctattg ggaagaggct tgtgaccagt accaatcttg agtttctttt 2160 tctgtccaca agtaaattaa tatctgctct gaaatgtcat ttatctactc acacattctt 2220 ggggaaaaaa atcaaatgtc agtcctagca gatgttgcat gtaaattggt agcaagtaat 2280 gattacaacc cagaggatta agaattttgt aacagaaagc tctatgtttt aattttttat 2340 atacaattag gataattagc attgtcagac tataaacctt tgctttttaa agtttatttt 2400 tactatttct ttatcacttt attgtatcat caccattggt ttcataatgt aaatactata 2460 tgttgaacaa attaaatgtc aaaatttttt 
attaccatag tccatgttaa tagtggggct 2520 ttcaggtgtt tagagatttt ttttgttgtt gttaacattc attgcaaaag tactagatgg 2580 tgtataactc tagagttgaa ttttaaggga ttccctaata tgtatactat ctttttatct 2640 gaagtaataa ataaacaatg atcttgaaag tgcctgaa 2678 <210> 49 <211> 587 <212> PRT <213> Artificial Sequence <220> <223> KLHL3 <400> 49 Met Glu Gly Glu Ser Val Lys Leu Ser Ser Gln Thr Leu Ile Gln Ala 1 5 10 15 Gly Asp Asp Glu Lys Asn Gln Arg Thr Ile Thr Val Asn Pro Ala His 20 25 30 Met Gly Lys Ala Phe Lys Val Met Asn Glu Leu Arg Ser Lys Gln Leu 35 40 45 Leu Cys Asp Val Met Ile Val Ala Glu Asp Val Glu Ile Glu Ala His 50 55 60 Arg Val Val Leu Ala Ala Cys Ser Pro Tyr Phe Cys Ala Met Phe Thr 65 70 75 80 Gly Asp Met Ser Glu Ser Lys Ala Lys Lys Ile Glu Ile Lys Asp Val 85 90 95 Asp Gly Gln Thr Leu Ser Lys Leu Ile Asp Tyr Ile Tyr Thr Ala Glu 100 105 110 Ile Glu Val Thr Glu Glu Asn Val Gln Val Leu Leu Pro Ala Ala Ser 115 120 125 Leu Leu Gln Leu Met Asp Val Arg Gln Asn Cys Cys Asp Phe Leu Gln 130 135 140 Ser Gln Leu His Pro Thr Asn Cys Leu Gly Ile Arg Ala Phe Ala Asp 145 150 155 160 Val His Thr Cys Thr Asp Leu Leu Gln Gln Ala Asn Ala Tyr Ala Glu 165 170 175 Gln His Phe Pro Glu Val Met Leu Gly Glu Glu Phe Leu Ser Leu Ser 180 185 190 Leu Asp Gln Val Cys Ser Leu Ile Ser Ser Asp Lys Leu Thr Val Ser 195 200 205 Ser Glu Glu Lys Val Phe Glu Ala Val Ile Ser Trp Ile Asn Tyr Glu 210 215 220 Lys Glu Thr Arg Leu Glu His Met Ala Lys Leu Met Glu His Val Arg 225 230 235 240 Leu Pro Leu Leu Pro Arg Asp Tyr Leu Val Gln Thr Val Glu Glu Glu 245 250 255 Ala Leu Ile Lys Asn Asn Asn Thr Cys Lys Asp Phe Leu Ile Glu Ala 260 265 270 Met Lys Tyr His Leu Leu Pro Leu Asp Gln Arg Leu Leu Ile Lys Asn 275 280 285 Pro Arg Thr Lys Pro Arg Thr Pro Val Ser Leu Pro Lys Val Met Ile 290 295 300 Val Val Gly Gly Gln Ala Pro Lys Ala Ile Arg Ser Val Glu Cys Tyr 305 310 315 320 Asp Phe Glu Glu Asp Arg Trp Asp Gln Ile Ala Glu Leu Pro Ser Arg 325 330 335 Arg Cys Arg Ala Gly Val Val Phe Met Ala Gly His Val Tyr Ala Val 340 345 350 Gly Gly Phe Asn Gly 
Ser Leu Arg Val Arg Thr Val Asp Val Tyr Asp 355 360 365 Gly Val Lys Asp Gln Trp Thr Ser Ile Ala Ser Met Gln Glu Arg Arg 370 375 380 Ser Thr Leu Gly Ala Ala Val Leu Asn Asp Leu Leu Tyr Ala Val Gly 385 390 395 400 Gly Phe Asp Gly Ser Thr Gly Leu Ala Ser Val Glu Ala Tyr Ser Tyr 405 410 415 Lys Thr Asn Glu Trp Phe Phe Val Ala Pro Met Asn Thr Arg Arg Ser 420 425 430 Ser Val Gly Val Gly Val Val Glu Gly Lys Leu Tyr Ala Val Gly Gly 435 440 445 Tyr Asp Gly Ala Ser Arg Gln Cys Leu Ser Thr Val Glu Gln Tyr Asn 450 455 460 Pro Ala Thr Asn Glu Trp Ile Tyr Val Ala Asp Met Ser Thr Arg Arg 465 470 475 480 Ser Gly Ala Gly Val Gly Val Leu Ser Gly Gln Leu Tyr Ala Thr Gly 485 490 495 Gly His Asp Gly Pro Leu Val Arg Lys Ser Val Glu Val Tyr Asp Pro 500 505 510 Gly Thr Asn Thr Trp Lys Gln Val Ala Asp Met Asn Met Cys Arg Arg 515 520 525 Asn Ala Gly Val Cys Ala Val Asn Gly Leu Leu Tyr Val Val Gly Gly 530 535 540 Asp Asp Gly Ser Cys Asn Leu Ala Ser Val Glu Tyr Tyr Asn Pro Val 545 550 555 560 Thr Asp Lys Trp Thr Leu Leu Pro Thr Asn Met Ser Thr Gly Arg Ser 565 570 575 Tyr Ala Gly Val Ala Val Ile His Lys Ser Leu 580 585 <210> 50 <211> 6805 <212> DNA <213> Artificial Sequence <220> <223> KLHL3 <400> 50 attctttgca gcctagacag aggaggcagg agccccaggg gcgggctaat cgcctgggct 60 ggggatgcct gggcagatgc agaggaagct ggaaaggtgg cagtgcacct gggtcgctgg 120 agctgccgcc gttcctagga gaccaaggag cagcaagcct gcgggggagg gggagcaagt 180 gggttgctgc ttttagcagc tgaaagggct gcagggagct ctgggtaaga cattttctgt 240 tgctgctgct tttgcggtag aagctgctgc gagtaagtca gaggaaggag gattgagaag 300 ggaggaggct tcacttgcag ccatgctgaa tcactgttgc tgatcagatt tcccactggt 360 ctgctgggag aatcaagaac agaactagga tccccgagtg catacacagt tcagcagcta 420 ccaccctggc tgggcctcac acaatggagg gtgaaagtgt caagctgagc tcccagactc 480 tgatacaggc tggggatgat gagaagaacc agaggacgat cactgtcaac cctgcccaca 540 tggggaaagc attcaaggtt atgaatgaac tgcggagtaa acagctgttg tgtgacgtga 600 tgattgtggc agaagatgtc gagatagaag cccaccgtgt ggtcctggca gcctgcagcc 660 cctacttctg tgcgatgttc acaggtgaca 
tgtctgagag taaagccaaa aagatagaaa 720 tcaaggacgt ggatgggcag acgctgagta agctgattga ctacatctat actgctgaaa 780 tcgaggtgac tgaagagaat gtccaggtgc tgctcccggc agccagcttg ctgcagctca 840 tggatgttcg gcagaactgc tgtgacttcc tgcagtctca gttgcatccc accaattgcc 900 tgggcatccg tgcatttgca gatgtacaca cctgcactga ccttctgcag caggccaatg 960 cctacgcaga gcagcacttt ccagaggtga tgctaggaga agaatttctt agcctgagtc 1020 tggaccaggt gtgcagcttg atatccagcg acaagctgac cgtttcttca gaagagaagg 1080 tgtttgaagc tgtgatctca tggatcaatt atgagaaaga aacccgttta gagcacatgg 1140 caaagctgat ggaacatgtc cgacttcctc tcttacctag ggactaccta gtccaaacgg 1200 ttgaagaaga agctttgata aagaataaca acacctgtaa agacttcctc attgaggcca 1260 tgaaatacca tctcctccct ctggatcaga gactattgat taagaaccca aggaccaagc 1320 ccaggactcc agtcagcctt cccaaggtca tgattgtggt tggcggccag gcacccaagg 1380 caatccgcag tgtggagtgc tatgatttcg aggaggaccg gtgggatcag attgctgagc 1440 ttccttccag aagatgcaga gcaggtgtgg tgttcatggc tggccacgtg tatgccgtgg 1500 gagggtttaa tggctcactg cgggtgcgga cagtggatgt gtatgacggc gtgaaggacc 1560 agtggacgtc cattgccagc atgcaggagc gccggagcac actgggcgca gcggtgctca 1620 atgacttgct ctacgcagtg ggaggctttg atggcagtac tggcctagca tcggtggaag 1680 cctacagcta caagaccaac gagtggttct ttgtggcccc gatgaacacg cggcggagca 1740 gtgtgggtgt gggcgttgtg gaggggaagc tatatgctgt tgggggttat gatggagctt 1800 cccgccagtg tctgagcact gtggagcagt acaacccagc gaccaatgaa tggatatacg 1860 tggcggacat gagcacccgc cgcagtggcg caggggttgg agtgcttagc ggacagctgt 1920 acgccacagg tgggcatgat gggcctttgg tgaggaagag cgttgaggtt tacgatcctg 1980 gaacaaatac ctggaagcaa gtggcagaca tgaacatgtg ccggcgcaac gcaggggtct 2040 gtgcagtaaa tgggctcctg tatgtggttg gaggggatga tggatcctgc aacttggctt 2100 cggtggagta ctacaatcct gtcactgaca aatggacgct gcttccaacg aacatgagca 2160 cggggcggag ctatgcaggt gttgccgtga ttcacaagtc cttgtgaccc aaactcctac 2220 tgccaggagg tggaggaagg agcaggtgct gcctgtgact ctgaacagca ggaccttggt 2280 gactggattc aacttgcttg ggagggtctg tgctgctgtg agaaccgctc tcctctgact 2340 tggcagactg gtgttgttca tcgcagtgtg gacaccatta 
cccacccccg ttcccctgag 2400 gtgctctggc ctatgccctg agcaaggggg gtcttgacat ccccaggcag cacctttggg 2460 ctttgttttg gtgtttctac agggacaata cagaccctgg agtgtgtgtg tgtgtgtgtg 2520 tgtgtgtaga ccatggtgtt tctctatgtt tctctaagtt ggggggtgag cgtgtgtgac 2580 agtctactgg atttctttac tactgatcct ttcgctgtgt taaaaatcaa gtcacagaga 2640 cctctcttct ggatttgtcc catggggacc ctgagactac taaagctgct ttcttctgaa 2700 ggtccagttg gacagtctgg gaatgtccag aaataaccag tgagaggggc agttctctgg 2760 ccacacccac ttatgtactt taactactgt gactttgtct gcagaagagc tggaaaattc 2820 tcgaagctgc accgtgtcct ctgtgtgcta gaataaggga caaatgggtt ccctgtgctt 2880 ctcagctcac tgtttttcct tgagttctcc tacaggaagc agatgagaac tgcccagtct 2940 tcaggtttag gccattggtc tttgatgtca tagattccag gcctgggagg tgttatgtct 3000 cttcagctgg gaaaactagc tcttcagaga agcctcgggt aacactgaaa aacaaaacaa 3060 aacaaaacaa aaacaggaaa aaaacaaaaa accaaagtgg taaggattca gttcctgcct 3120 ataatggtct cagagagggt cctactttta ggttttccca ggacaggaca gtccccattt 3180 atacttatta tcccagttta attattcaca gcaccccatt ttactcagaa gtgttctggt 3240 ctggaggata aataagaggt caccctcctc cagacccaaa gatagatttg tgcctgtgtt 3300 ggatggggtc gtgtgtgatt cagatggaca ttggatggct tcaaaggaat ataccactag 3360 agctggccct tggcactttg tgacagtggt caagtctgtc taatgtcctt gtcttctttt 3420 tcttgtgctt tccccctatt ccagggtgtg caccctctcc ccaaccccca agaaccccac 3480 tactgctttc cctgtgaggt aggagatatc agtgggcctt ggatttgagg cttcctaaga 3540 tgtgcttgca ttttaaaaag ggagcttggt gagagctttg ctaattcaca ggtaaaaatt 3600 attaacaata gaacttcaag catcttgagg agcgggcatt tgagggggca tggagtaatt 3660 tgtatttaaa aaaccttaaa gttgtgctgt tcctaaacta gcaaattgct catgctgaaa 3720 tttctggcat aagcagggga agtcttgtgt ctggagaata gtctcatacc ttgcagtctg 3780 ggacaccctc cctactttga gaatccacct acaggaagcc aaggaacttt ataaatcctg 3840 atgttggact tctgatacga ctgggctact tccaagcagg tgctgcagga gattggcatc 3900 ccccagcccc tgcagttaga aaccccgaag tcttcccagc cagtgagcca ctttgtgtat 3960 ttactgtata tttattgtgc cctaaatgtg caactctcct aaagacaaaa cttctctttc 4020 tgatgttaag cacatgttac ttcaacaaga tgcttggaga acaacaaggt 
acccagaatt 4080 tttagaagcc ttcagaagag gctaaaatat ccagctttgg gggacctgga agaaatgtct 4140 ccaaaggaag caaggcatgt tttagttgag tgctctggtc tcactatgaa gtggggatga 4200 ctgtggcttc ataactctac ctggctgtgg gttggaagct gatggaatga gaaatgtcct 4260 ttctccttct ctgaggaaat tttgagactt gtttcggtgt gtctgtgtga tggggatgag 4320 gctggggttg ggatctgatg tatgccattc acagaagctc tcaatttcag atgataggtg 4380 aattccctgc ccctccccca ccactgagaa gctagacttt catgcgggag aggctacttt 4440 tatgtgtcgt cttccgggga agggtccctc cactgaaagc tagccagtca tgttttctgt 4500 ttttggattt ttgcaattgg tttcacctca tgtctccctc cctacaaagc actgcctcta 4560 ctgggcgtgc tgccaaggcc atgtgcactc catcctcatg tatccttttt cacggggacc 4620 agaacactgg tacgtcatca ccaaagccaa tctgctctag ctgcccacag atgccaccaa 4680 aacctgctat ctcttcatca ccaggtacga ttctctttcc acagtggaca cagcaggcta 4740 ttttctagtt tgtgctggtc acgtggtaga tgaagcctct tactgcccca cttagggtgg 4800 ccacggctgc ttgtgaatgc agctttgcca gtggcatatc tgtcatctga ttgcggtggt 4860 gaaatggaat tgaggcccaa ggttagaagc agccgagacg ccacttggat actgatttga 4920 acaatgtaga agtcagattc tgaattccaa agttatttct cataagtacc caatggcatc 4980 tctccatcta caaagttgca gtattatgca aataaaactg acctcatttt ctgctatgca 5040 ataagaatac ttaattctag ttcccgacaa gccagttgca atatccccta agatgctttt 5100 tgagctgtct tactttgata tctgttgtgt aatgtttgta tatttctgag ccagatcctt 5160 tcaaagattg cctttttata aaattgaagc tatagctttt aggctaaaat tttaacgtag 5220 atatttttat aagatatttt tttcaagagt ttgaatcgct ttttattgtc catgataatg 5280 aaatgttgtg ttctttgcat cattcactct caaacgtagt tcatgcctgt agctctcttc 5340 cttttgtttc tcacccttca gaaacatatt tttcagtagc tccaggtaga tgagcctttt 5400 tttttttttt aaaaatacca tattcaaggg agtctgctga attttaaaac gcagtcactg 5460 gtgtttcttg aattgctagg gactgatgtt atgttcgact cagcacttgc ccgtctgtat 5520 tgattgtgtc tttttttttt tttttggagt ctgctttctg tgggggtgag gccgggctgt 5580 ctcgtggtgg ctcccactga cgggcactga gcctggtacc ctgtggcatg gagaagcctc 5640 agggaaaggc ctgccccccc agcacatact cccatagtgt cctaggtcca gccgaccatt 5700 ccttattctc ttctatctcc ttgttgatct gaagcttcca atagcttgag gcctttgctg 
5760 ctggatgatg ccctttttgg gagcatcttg tctctaacct ttaaaagagg ggtcaatcct 5820 catgatccct gtgtgttaag catatgcttt gcaggtgctc acactacact tacaacttgc 5880 ttcttgagct atgtctctac tccaggctct gttttgtgta tttatctgcc atttgcatca 5940 tggtttttaa aatttattat tattattatt attgttggga caggtgccat ttaaattgcc 6000 tccatgctcc ccatttgcac ctagctggat caagttggga ggctgagcaa actcatattc 6060 cagttagttg gagtttttaa aggctctgtt tgcctggaga agcaaggagg ttagaatgta 6120 atttttttaa gcgtttgcac tatttagagt cctaagcccc tcatgttcag ctgtgctgtg 6180 tttctactga ccaagcagga gagccagcag cacttccagc atttgggaat ggaagagatt 6240 tcttctgtag tggataatta cagcctcata gcccctgtgc agccttcgtc atgggactca 6300 gtgactcatg gatatagcat cagccatggc aggaatgcac aggactgtgg catttgcagc 6360 atcaaatcac cctagtgcca tgtttggtta tgagattgta aattattcgc tcccccatcc 6420 tcccctcccc tcattttcag tggcaataga ggacccttgt tgtacttctt gtttaatttg 6480 catattatgt gtaaaatgct ttcgttgaaa gaaaactgaa gacactgaat gtgtatgtct 6540 gtgtgggtgc tctgtccctg tggttgtcat agccagtcag acttgatcac tgacaccccg 6600 tacaacatat tgcataggta agatcctcga tctggtgttc tctgcgtggc tgttagggac 6660 tgtatatctt gtaaaagaac acttgtcaca tgcttgatca gttacagcaa tagctgaaga 6720 aacatttcct caaatgtatt attttaacag gaatcatgtt ctaatttccc atcctttaat 6780 tttaataaaa gctgaactgt gtgaa 6805 <210> 51 <211> 895 <212> PRT <213> Artificial Sequence <220> <223> CTNNA3 <400> 51 Met Ser Ala Glu Thr Pro Ile Thr Leu Asn Ile Asp Pro Gln Asp Leu 1 5 10 15 Gln Val Gln Thr Phe Thr Val Glu Lys Leu Leu Glu Pro Leu Ile Ile 20 25 30 Gln Val Thr Thr Leu Val Asn Cys Pro Gln Asn Pro Ser Ser Arg Lys 35 40 45 Lys Gly Arg Ser Lys Arg Ala Ser Val Leu Leu Ala Ser Val Glu Glu 50 55 60 Ala Thr Trp Asn Leu Leu Asp Lys Gly Glu Lys Ile Ala Gln Glu Ala 65 70 75 80 Thr Val Leu Lys Asp Glu Leu Thr Ala Ser Leu Glu Glu Val Arg Lys 85 90 95 Glu Ser Glu Ala Leu Lys Val Ser Ala Glu Arg Phe Thr Asp Asp Pro 100 105 110 Cys Phe Leu Pro Lys Arg Glu Ala Val Val Gln Ala Ala Arg Ala Leu 115 120 125 Leu Ala Ala Val Thr Arg Leu Leu Ile Leu Ala Asp Met Ile Asp Val 130 135 140 
Met Cys Leu Leu Gln His Val Ser Ala Phe Gln Arg Thr Phe Glu Ser 145 150 155 160 Leu Lys Asn Val Ala Asn Lys Ser Asp Leu Gln Lys Thr Tyr Gln Lys 165 170 175 Leu Gly Lys Glu Leu Glu Asn Leu Asp Tyr Leu Ala Phe Lys Arg Gln 180 185 190 Gln Asp Leu Lys Ser Pro Asn Gln Arg Asp Glu Ile Ala Gly Ala Arg 195 200 205 Ala Ser Leu Lys Glu Asn Ser Pro Leu Leu His Ser Ile Cys Ser Ala 210 215 220 Cys Leu Glu His Ser Asp Val Ala Ser Leu Lys Ala Ser Lys Asp Thr 225 230 235 240 Val Cys Glu Glu Ile Gln Asn Ala Leu Asn Val Ile Ser Asn Ala Ser 245 250 255 Gln Gly Ile Gln Asn Met Thr Thr Pro Pro Glu Pro Gln Ala Ala Thr 260 265 270 Leu Gly Ser Ala Leu Asp Glu Leu Glu Asn Leu Ile Val Leu Asn Pro 275 280 285 Leu Thr Val Thr Glu Glu Glu Ile Arg Pro Ser Leu Glu Lys Arg Leu 290 295 300 Glu Ala Ile Ile Ser Gly Ala Ala Leu Leu Ala Asp Ser Ser Cys Thr 305 310 315 320 Arg Asp Leu His Arg Glu Arg Ile Ile Ala Glu Cys Asn Ala Ile Arg 325 330 335 Gln Ala Leu Gln Asp Leu Leu Ser Glu Tyr Met Asn Asn Ala Gly Lys 340 345 350 Lys Glu Arg Ser Asn Thr Leu Asn Ile Ala Leu Asp Asn Met Cys Lys 355 360 365 Lys Thr Arg Asp Leu Arg Arg Gln Leu Arg Lys Ala Ile Ile Asp His 370 375 380 Val Ser Asp Ser Phe Leu Asp Thr Thr Val Pro Leu Leu Val Leu Ile 385 390 395 400 Glu Ala Ala Lys Asn Gly Arg Glu Lys Glu Ile Lys Glu Tyr Ala Ala 405 410 415 Ile Phe His Glu His Thr Ser Arg Leu Val Glu Val Ala Asn Leu Ala 420 425 430 Cys Ser Met Ser Thr Asn Glu Asp Gly Ile Lys Ile Val Lys Ile Ala 435 440 445 Ala Asn His Leu Glu Thr Leu Cys Pro Gln Ile Ile Asn Ala Ala Leu 450 455 460 Ala Leu Ala Ala Arg Pro Lys Ser Gln Ala Val Lys Asn Thr Met Glu 465 470 475 480 Met Tyr Lys Arg Thr Trp Glu Asn His Ile His Val Leu Thr Glu Ala 485 490 495 Val Asp Asp Ile Thr Ser Ile Asp Asp Phe Leu Ala Val Ser Glu Ser 500 505 510 His Ile Leu Glu Asp Val Asn Lys Cys Ile Ile Ala Leu Arg Asp Gln 515 520 525 Asp Ala Asp Asn Leu Asp Arg Ala Ala Gly Ala Ile Arg Gly Arg Ala 530 535 540 Ala Arg Val Ala His Ile Val Thr Gly Glu Met Asp Ser Tyr Glu Pro 545 550 555 560 
Gly Ala Tyr Thr Glu Gly Val Met Arg Asn Val Asn Phe Leu Thr Ser 565 570 575 Thr Val Ile Pro Glu Phe Val Thr Gln Val Asn Val Ala Leu Glu Ala 580 585 590 Leu Ser Lys Ser Ser Leu Asn Val Leu Asp Asp Asn Gln Phe Val Asp 595 600 605 Ile Ser Lys Lys Ile Tyr Asp Thr Ile His Asp Ile Arg Cys Ser Val 610 615 620 Met Met Ile Arg Thr Pro Glu Glu Leu Glu Asp Val Ser Asp Leu Glu 625 630 635 640 Glu Glu His Glu Val Arg Ser His Thr Ser Ile Gln Thr Glu Gly Lys 645 650 655 Thr Asp Arg Ala Lys Met Thr Gln Leu Pro Glu Ala Glu Lys Glu Lys 660 665 670 Ile Ala Glu Gln Val Ala Asp Phe Lys Lys Val Lys Ser Lys Leu Asp 675 680 685 Ala Glu Ile Glu Ile Trp Asp Asp Thr Ser Asn Asp Ile Ile Val Leu 690 695 700 Ala Lys Asn Met Cys Met Ile Met Met Glu Met Thr Asp Phe Thr Arg 705 710 715 720 Gly Lys Gly Pro Leu Lys His Thr Thr Asp Val Ile Tyr Ala Ala Lys 725 730 735 Met Ile Ser Glu Ser Gly Ser Arg Met Asp Val Leu Ala Arg Gln Ile 740 745 750 Ala Asn Gln Cys Pro Asp Pro Ser Cys Lys Gln Asp Leu Leu Ala Tyr 755 760 765 Leu Glu Gln Ile Lys Phe Tyr Ser His Gln Leu Lys Ile Cys Ser Gln 770 775 780 Val Lys Ala Glu Ile Gln Asn Leu Gly Gly Glu Leu Ile Met Ser Ala 785 790 795 800 Leu Asp Ser Val Thr Ser Leu Ile Gln Ala Ala Lys Asn Leu Met Asn 805 810 815 Ala Val Val Gln Thr Val Lys Met Ser Tyr Ile Ala Ser Thr Lys Ile 820 825 830 Ile Arg Ile Gln Ser Pro Ala Gly Pro Arg His Pro Val Val Met Trp 835 840 845 Arg Met Lys Ala Pro Ala Lys Lys Pro Leu Ile Lys Arg Glu Lys Pro 850 855 860 Glu Glu Thr Cys Ala Ala Val Arg Arg Gly Ser Ala Lys Lys Lys Ile 865 870 875 880 His Pro Leu Gln Val Met Ser Glu Phe Arg Gly Arg Gln Ile Tyr 885 890 895 <210> 52 <211> 10696 <212> DNA <213> Artificial Sequence <220> <223> CTNNA3 <400> 52 cccctttctt tcttatcctg ggtgaacaac gctcagcgaa attgactgcc ccactgtcat 60 ctgcctctca atttggtact ctgtaactct gtgaccacca agaagccttt ttccgtcccc 120 cacaaagctc tttttggaaa attccctacg ggagctgaat tttaagccca tttactttat 180 aggaagaaac agaaaggcag catgtcagct gaaacaccaa tcacattgaa tatcgatcct 240 caggatctgc aggtccaaac 
attcaccgtg gagaagctac tggagcctct cataatccag 300 gttaccacac ttgtaaactg tccccagaac ccttccagca ggaaaaaagg acgttcgaaa 360 agagccagtg tccttctagc ttctgtggag gaagcaactt ggaatttatt agacaaggga 420 gagaagattg cccaggaagc tacagtttta aaggatgagc ttacggcttc acttgaggaa 480 gttcgcaaag aaagtgaagc tctgaaagta tcagctgaga gatttacaga tgacccctgt 540 tttctcccaa aaagggaggc tgtggttcaa gctgcccgtg ccttgctggc tgcggtgacg 600 agactcctta tccttgcgga catgattgat gtcatgtgcc tcttgcaaca tgtgtcagct 660 tttcaaagga catttgagtc tctcaaaaat gttgccaaca aatctgacct ccagaaaacc 720 taccagaagc ttgggaagga gctggaaaat ttggattatt tagccttcaa acgtcagcag 780 gacttaaaat ctccaaatca gagagatgaa attgcaggag cccgagcttc actgaaggag 840 aactctcccc tcttgcattc aatttgttca gcttgtttgg agcattctga tgttgcttcc 900 ctcaaagcaa gcaaggacac agtttgtgaa gaaattcaga atgctctcaa tgtaatttca 960 aatgcttcac aagggatcca gaatatgaca accccaccag aacctcaggc agcaaccctg 1020 ggaagtgccc ttgatgagct ggagaattta attgtcctga atccactcac agtaactgag 1080 gaggaaatac gaccatcact agagaaacgc cttgaagcca ttatcagtgg ggctgctctg 1140 ctggcggatt cttcatgtac gagggactta caccgagagc ggattatcgc agaatgcaac 1200 gccattcgcc aggctcttca ggatctgctt tcagagtaca tgaacaacgc tggaaaaaaa 1260 gaaaggagta ataccctgaa tattgcttta gacaacatgt gtaagaagac aagagacctt 1320 cgcagacagc tccgcaaggc tattatagat catgtgtcag actctttcct ggatacgaca 1380 gtccctcttt tggttctcat tgaagctgct aagaatggcc gggaaaagga aataaaagaa 1440 tatgctgcga tatttcatga acacaccagc aggcttgtag aggtggcaaa tcttgcttgt 1500 tccatgtcaa caaatgaaga tggaattaaa attgtcaaaa ttgcagccaa tcatttggaa 1560 accttgtgtc cacagattat taatgctgca cttgctttgg ctgcaagacc caaaagtcaa 1620 gcggtcaaaa acaccatgga aatgtacaag cgtacatggg agaatcatat acatgtcctc 1680 actgaagccg tagatgacat tacaagcatt gatgacttcc ttgctgtatc tgaaagccat 1740 atcttggaag atgtcaacaa gtgtatcata gccttaagag accaggatgc tgataattta 1800 gaccgtgctg cgggtgctat cagaggccgg gcagcaagag ttgctcacat cgtcacgggt 1860 gaaatggaca gttacgagcc aggggcttac acggaaggtg taatgagaaa tgttaacttc 1920 cttacaagta ctgtaattcc tgaatttgta acacaagtga 
atgttgcctt ggaagcctta 1980 agcaaaagct cattgaatgt gttggatgat aatcaatttg tggacatctc aaagaagatc 2040 tatgatacaa ttcatgatat cagatgttca gtcatgatga ttcggacccc agaggaactg 2100 gaggatgttt ctgaccttga agaggaacac gaggtccgca gtcacaccag cattcagacc 2160 gaagggaaaa ctgatagggc taagatgact caactgcctg aggcagaaaa agaaaagatt 2220 gctgagcaag ttgctgattt caagaaagta aagagtaagc tggatgctga gattgagata 2280 tgggatgata caagcaacga catcattgtt ctggccaaga acatgtgtat gatcatgatg 2340 gagatgacag acttcactag gggcaaagga ccactaaagc atacaactga tgtgatctat 2400 gcagcgaaaa tgatatcaga atcaggatca aggatggatg tccttgctcg gcagattgct 2460 aatcagtgcc cagatccatc ttgtaaacag gacttgttgg cctacctgga acagattaag 2520 ttctactccc accaactgaa aatctgcagt caagttaaag ctgagatcca gaacctggga 2580 ggagagctca tcatgtcagc tttggacagt gtcacatccc tgatccaagc agccaaaaat 2640 ttaatgaatg ctgtagtgca aacagtgaaa atgtcttaca ttgcctcaac caagatcatc 2700 cgaatccaga gtcctgctgg gccccggcac ccagttgtga tgtggagaat gaaggctcct 2760 gcaaaaaaac ccttgattaa aagagagaag ccagaggaaa cgtgtgcagc tgtcagacga 2820 ggctcagcaa agaaaaaaat ccatccattg caagtcatga gtgaatttag aggaagacaa 2880 atctactgaa accactattc tacatatagt gcctatatga caaaatcctg cctaaccaca 2940 ctgctttatt ttacacttaa gaagttctgt aatttcacta agttttggtg tttaactcac 3000 aaataacata aaatattggg cgctaaatca acaaaagcaa tatatatttg ggatcatatc 3060 actgtcattt ctgtatggtc agcacctaat agttaaggaa tatttgcttg ttgaatgaat 3120 gaaattatca cgtgtcattc agcgtttccc atcatagaga ttatctacta ttcgttacca 3180 aataaacaca ggagaggcca gagagtcctg tttatctgta atacttcatg tacacttatc 3240 atccttatct tgaattaaaa cactaacatg agctcctaac ttggtttttt aatagaaaca 3300 aaagactttt ataaaatatt ttcccattta atctccatgc tttctttatc tgatctaacc 3360 tggcacctaa ccaggcagaa atgtatgatt cctgccatag caaaaaaacc accttttaat 3420 ctctagatag ctgtactcat tgtcaactta ttaggctaat atccattata aactaatcaa 3480 atttgaatag ttaggctact tgctggattt tgaaggtcaa ccttgtttat taataaaatg 3540 ctttcttaac attataaagg ttacaatgag ttctgatgcc acatactcac cttttgggtt 3600 tccaatgtgt tagagagttc tgtacttttg agtgttaggc ctatccgaat 
acatatgtga 3660 aagaagagtt atcacccatt gggaaaatga gacaacaaac tatatcccaa agtgagttat 3720 attaataata acagaagtgt aagttctgta tgaccctatt tttacaaaca taaaaataca 3780 tttttatcag cttgcaattg taattaaaag gaaaatggca gtttgaaaaa tcttttgacg 3840 ttgagtaaaa tataactgca tgaactgtac cattgaacta tgaagcagtt aatggcaatg 3900 aagctggagt gattttaata acgttctttt aaataaaagt cactggggtc attttacaac 3960 tccagtcact gtgttcattc ctagttgagt tcataatgga cttcataata gtcttagagt 4020 ctagtgtacc tctctctgtc attctctctc tctttttctc tcccctgatg acctgcctct 4080 ctctgtcttt caaaaatgtc cttaacagaa actctttgga ggccataagt tttgttttca 4140 ttttttctca ttcatagagt tctgattcac agactttaaa aaacattttc tattctacat 4200 tatcataata gtattatatt gtaccttttt acttctaaaa cattgccatt gatgagaggc 4260 gtctcagcaa cgatgtaacc ttagattgat gagaaaaaaa tttacacata caataccagt 4320 tgatattagg agaaccagtt tgggtaaaag gatgcatgaa acatagaaaa gcataaagat 4380 tgacagtacg atatgtgtgg aatcttaagc ataatggtat gggaaaagaa ggatctattt 4440 aataggtcat tattaattat gcaggaaatg ttaaatttct tgtaatgaat tatttcaatc 4500 ttccaagtag ctggattgcc tgagctcagt gtaaaacttc ctaggaggtg aaggcaataa 4560 actcctccct gttttgcacc atagactggc ctacaataac tattctgaac agaatgcact 4620 gcccagaaga aaaggcagtg aagaagggaa ttagggaatt cagatgactc tgtttttcag 4680 aattatctag atgattgagg caaatagaga ttttctttta ttgaaatgct gccatcattt 4740 tctgctctaa gtgtagatgt tgcacgagtg acatctgtgg tatatccttc ttccagatag 4800 aagtataaat ttataattta gtggggagag gaattttctt ttaatcacca aactaccagc 4860 tgtatagaaa atttttttga aatcaccaaa ctagcagttg tgtatcttag ctcctaaagc 4920 cacattcaca agaatcataa gcatctcatt tagaaaatct aatgtccacc ggacccaaca 4980 tcatctttcc acatgttgga gtttagaaca tgcagtattg gacaccaaaa gaacagaaat 5040 aacttgaaat tatacaatta ccaaataacc tccatctctt tgataaaaat gactttttac 5100 ccaaagtgag atgtgaacaa gttctcctct ttaacccagg cagattttca gcttagctca 5160 aagcaaacta taattaaata aaacaataaa aattttaact caatgaatat cctaggagtt 5220 aatctgctgt ggtaatcgat agtgacataa aaacaagatg catagagctg aaaaaggcat 5280 ctgtccacct gagtgaattc ctgtagtagg cagcatttca agatttagtt tgattcaaac 
5340 ctgctgataa aacaaatgct tatatgcaag aaaaactagc agcaagatat gaataaatct 5400 ccccattatg ttaaaaacta atcagcatga cacatttata tttaggtatt tcttctgcag 5460 aggattgcca gaaatttctt aatgagagtg ataatttggt gatccaccta tgtacctttt 5520 gatgtctcat cacagagatt agtgcttcct ctgataatac atttgcttga tcctgctaat 5580 aagaactcac ctttcttttt aacgaaatca atttgtgtta atttacatag ggaaaaacgt 5640 tccttgagag ggaggagagg ggtgtttacc acccacagga gcagttccat ccctggttac 5700 ccaccagtat ccctgccaga tgttagactg gaattagaat gaaaatttaa aaaaaaacga 5760 aataataatc cactaaatgg ccttctaatt agagaaattt gaagatttct atcaaatata 5820 aaaagtgaga aatagaaatc attcccttta atttatgaca aattaaaatt attgaagtaa 5880 aatgtttcta agtcagtttc tggatagttc ttgaatgagt caagaaaaaa ctcaaacact 5940 aaatttctaa gctaatttgt tcattattcc tttctgttta ttatttatta tatgagtgca 6000 aattgcaaat gtattagaaa agacaatata gccttcacat tcacaagtgt gccttctggc 6060 ctttgaaatc ttttaacata agatcaatat aacatatttg ttccccataa aaatctctta 6120 tgtacaggga ataaggtcct ccatagctac acagagacac ccattcttgc agtctggcat 6180 tcttaacacc tttagcttca aaggcagtcg gtttcaattt caagctaaaa ttgtgaatat 6240 acaattaagt gcgcaacttg ggccttcaga ccaaataatt aatgctcctc aaattaaata 6300 atgagtggct gaaaagccta atgtgtaaca acagagtaag tgacttcaac ggcttcctga 6360 accttcattt tgagggtgtt ctcatctcac ccacctctgg gctaggtatc ctctgttaat 6420 tgctaacaat ttcctaccaa aatagtgcaa tcaggactgc ccatcacagc aaattactcc 6480 aattaatgcc ctttcctctt ctgtgaacca aaaatatctc tctcctgtct ctcatctcta 6540 tcccaggtat aaattgacta ataaatttta cagagtttta gtcaaaatta tgaattttaa 6600 tttccagcaa cattattcct ttagtctttc tattccccaa aacattctta aattttggta 6660 tcaaatgttc atttctgtcc tagagattgt ggttcccatg gctacagatt tttgtatagt 6720 ggtagtatga gtggaataag atctgagact tttcctggtg atatccacca ttctcttgtg 6780 aactcattgc agaataaaca gcctctttca ctttcattgt aagtagttct ttaggttggc 6840 aaagcaggca ccagaatctg aatttgaatg aaatctgtac tttcttttgt tggcctattt 6900 tttgccaaga cagcctgcct attgtatatt taggaatcaa atattcattt tttctgtttg 6960 taaagccaga aatatttcag ttaaaaagaa tataatttta tattatttta ataattcttt 7020 
ttaagtaaag taggaaaatt tctgccacct gagatgttac tgtttttatt tatatgaaaa 7080 ttccattttt ttctttatga ctcttttccc ataccattat gatatgttca gcattagtat 7140 tttccatttt accactaaac aaataagcca ggactaataa caatagcagt ttgagagttt 7200 ttattttggc attttggttt taattaggaa agaaaacatg gatgctatta aaactagttg 7260 tagttaaaaa tgttttgaat gaggctacta acgctatctt agtcatctca aagagaaaga 7320 gaagaagtaa aattttaaat acctttttgg catttttaac caattcatat gaacaaaaac 7380 atatattcta ttaaagcaaa ataaagaata gtttaaaaca ctagtcttct tgcatgtgaa 7440 tattttcccc ctaacagaca agatagaatc catttctgga cactctaaaa gaaagtaatg 7500 ttaattgaga gagctgtctc caaatggaac tcaacctgac cagaatccca tccgcaggca 7560 agagatgggc tgtggggccg accataagtt ttattctgag acgagatgct ttatggctgc 7620 tggctgcatt atggcatcta gcagtcccca ctggtttgtg ggaacaaaac acatggattg 7680 atattcaata tgagtttggt aagttggaga aaaatggaga cgatacggag tagtctaaag 7740 gcacaacaaa taagaaggtg ccagaggtag ggaagggtct gagaggggta gttgcagaag 7800 catttagctt tcatagatga ggggagagtt ttttggtttt atgaagatta ggcagccctc 7860 ctgaggaaag tattaaggat agaatgagac acttgtcttg gtggtttggt gaagtcgtgt 7920 cgaaggcaaa aggatgaact acacatattt tgagcatgat ttagtaatat tgtttgcaag 7980 ttgagaaaca gaaattcagc cattctcctg actgatggta ggtgaaattc ttcccaatat 8040 tagagccaat ctgggttacc atacgttcat atatttaaca gctatttgtt aaacaactgt 8100 gatgttccag gcactgttct aggttcttag gatacaccat ataacaaaac aaatagtgat 8160 ccttgccctc atgtgccagc tctgtgttaa atgctaacta ttatcttatt caatcctcca 8220 aataacccta taggatatat atcattccac agacaaggaa gctgagggtt aaatgggata 8280 agaatcttgc tcaaagtctc actggcttta aaatattaat atatatttta catcattcaa 8340 gacagatata taattagctg aatattctca cctctattaa ctgaatgtag agtgtcaggt 8400 acatgggaag taagttattt acgtgatctg aaataagagg acaacctttg ccaaattact 8460 tttattacca tggagaagaa agcttcaatg aaatgcctgt ttcagcccat cttcagtatt 8520 ggttggttaa taccatttgt taacaccaat cctttcctct tttactgcta acatgtgact 8580 gtgtttaaaa ttatagattg cagcagagtg ttggccaagg attattgtag taggataaag 8640 tttccctgta tttgaaattt acccaaactg tagcaaatac atctttcctt ctttatggag 8700 gtcacacgtg 
tgcatagtat gtgcctgaat acaggtaaac tttgtgtttt aaaattacat 8760 ggcctttttt agagtcgact gaaggctaga ctcttcttgc ccatgttgct ggataggctt 8820 tcaaatctca ggccttggag agttaaatga ctttctgata tttcttacgg tggagccaca 8880 taagaaatac acacattcct agtttgagca atggatgtgt atattgggga tcttcacttt 8940 tatgtatctg gtagtcatga tggtgcctat tcccaactgt aggaaaggaa agccctcaag 9000 gagaaaattc tctttcaaaa aggctcagat ttctaacaat tatgtctaaa tgtttcttca 9060 atttagaaca tgtacaaaag ctacttaata ttagcagagt agttcttgtg ttttcatttc 9120 aaatgaaata ttcccagttc caaagtttat ttccttggca ttttcaagaa ggatggttat 9180 ttttagttct ggttatttgg tgttggaggt ctgtgacact tccttataca tttaaagccc 9240 cctttgaata gttgtatcaa tagttcaaaa gcctgtagat atcttgcttt tctattctga 9300 cagcccagct tctgcagact caggccaacc ttaggtatgt cacatcagtt ccccatctgt 9360 gtaaattacc tgaacctctc tgtggtattc actgagagtt gagcttctac taagggattt 9420 ccttgacctg ggcttttaac accagatttg tagtggagtg tttattagta aagcaaagga 9480 gtaagtcaga gatgctagcc cctgctctca gttaaacaga actatcctcc ttcctgaata 9540 tttaaacata aaattaattt taaactacta taaaaatcat gttccattgc ttgccaaatc 9600 ttataccatt ataatgctgt atttttacat gttagtatac attgtgtcca tagggatata 9660 aatatgagaa tttcagcaac tgtcttgcag ggttaagatg tgctatttgt tgattttgca 9720 aatttttctc atgtggtaca gtccctccaa agaaaatttc cttagaccta aatggtagca 9780 taataatttc tcctccatat ccgtgactaa caagttcctg aaaacagaag tagacatatt 9840 ttatgcccaa tttagtacct aataattgag gcttaaaaca aaattttctt gggcaagttg 9900 ggtgattaaa tatgaattaa aacttctaga gaaagatttc cagatgcccc tgttttccat 9960 tttgattatg gttttagttt catagcaact tgggggaaaa aaatcattta ctttatgtcg 10020 tattacaaca acaactttga cagcttaaat gtagtgacct ctctgaaggt aatatgcaca 10080 ttataattta ggatgaaagc agatttaaac taagactttc tgagatgaaa taaaagtaac 10140 caattcataa tagtcatgtt tattgcagca atatcttggg aacttaagtt tagtggcgtt 10200 tgaggtgttt ctgactatct tgcttgttca tttcaccctg ctctgcagag acagaatatg 10260 attgaatgaa ctgcccatta tattaatgac cagttcaaag cagttttgtg ctttggctat 10320 agataaaatt tggcaaaaat actgtaggcc tttctggtcc atatttatcc ttaaagtatt 10380 tttgccaatc 
aataaaaatg aaatatttat ccagaaatag aaaaagctct ttcctatgac 10440 attgacatag aagcccaaat actccttgtt tgagaccctg catgtctttt gtgtttaact 10500 gaatcaatga ttttttttag ttgtgctctt aattgcatta tttcacatat gtaacactgg 10560 gtttgttttg gttatatagt attattttac tttattctaa tttcaactca tgtcactctg 10620 tagctcatta agaatttgtt caactgaaaa ttaaaatgtg tgttaaagca ataaaaatga 10680 aaagattggc tatgca 10696 <210> 53 <211> 825 <212> PRT <213> Artificial Sequence <220> <223> FSCB <400> 53 Met Val Gly Lys Ser Gln Gln Thr Asp Val Ile Glu Lys Lys Lys His 1 5 10 15 Met Ala Ile Pro Lys Ser Ser Ser Pro Lys Ala Thr His Arg Ile Gly 20 25 30 Asn Thr Ser Gly Ser Lys Gly Ser Tyr Ser Ala Lys Ala Tyr Glu Ser 35 40 45 Ile Arg Val Ser Ser Glu Leu Gln Gln Thr Trp Thr Lys Arg Lys His 50 55 60 Gly Gln Glu Met Thr Ser Lys Ser Leu Gln Thr Asp Thr Ile Val Glu 65 70 75 80 Glu Lys Lys Glu Val Lys Leu Val Glu Glu Thr Val Val Pro Glu Glu 85 90 95 Lys Ser Ala Asp Val Arg Glu Ala Ala Ile Glu Leu Pro Glu Ser Val 100 105 110 Gln Asp Val Glu Ile Pro Pro Asn Ile Pro Ser Val Gln Leu Lys Met 115 120 125 Asp Arg Ser Gln Gln Thr Ser Arg Thr Gly Tyr Trp Thr Met Met Asn 130 135 140 Ile Pro Pro Val Glu Lys Val Asp Lys Glu Gln Gln Thr Tyr Phe Ser 145 150 155 160 Glu Ser Glu Ile Val Val Ile Ser Arg Pro Asp Ser Ser Ser Thr Lys 165 170 175 Ser Lys Glu Asp Ala Leu Lys His Lys Ser Ser Gly Lys Ile Phe Ala 180 185 190 Ser Glu Gln Pro Glu Phe Gln Pro Ala Thr Asn Ser Asn Glu Glu Ile 195 200 205 Gly Gln Lys Asn Ile Ser Arg Thr Ser Phe Thr Gln Glu Thr Lys Lys 210 215 220 Gly Pro Pro Val Leu Leu Glu Asp Glu Leu Arg Glu Glu Val Thr Val 225 230 235 240 Pro Val Val Gln Glu Gly Ser Ala Val Lys Lys Val Ala Ser Ala Glu 245 250 255 Ile Glu Pro Pro Ser Thr Glu Lys Phe Pro Ala Lys Ile Gln Pro Pro 260 265 270 Leu Val Glu Glu Ala Thr Ala Lys Ala Glu Pro Arg Pro Ala Glu Glu 275 280 285 Thr His Val Gln Val Gln Pro Ser Thr Glu Glu Thr Pro Asp Ala Glu 290 295 300 Ala Ala Thr Ala Val Ala Glu Asn Ser Val Lys Val Gln Pro Pro Pro 305 310 315 320 Ala Glu Glu Ala Pro Leu Val 
Glu Phe Pro Ala Glu Ile Gln Pro Pro 325 330 335 Ser Ala Glu Glu Ser Pro Ser Val Glu Leu Leu Ala Glu Ile Leu Pro 340 345 350 Pro Ser Ala Glu Glu Ser Leu Ser Glu Glu Pro Pro Ala Glu Ile Leu 355 360 365 Pro Pro Pro Ala Glu Lys Ser Pro Ser Val Glu Pro Leu Gly Glu Ile 370 375 380 Arg Ser Pro Ser Ala Gln Lys Ala Pro Ile Glu Val Gln Pro Leu Pro 385 390 395 400 Ala Glu Gly Ala Leu Glu Glu Ala Ser Ala Lys Val Glu Pro Pro Thr 405 410 415 Val Glu Glu Thr Leu Ala Asp Val Gln Pro Leu Leu Pro Glu Glu Ala 420 425 430 Pro Arg Glu Glu Ala Arg Glu Leu Gln Leu Ser Thr Ala Met Glu Thr 435 440 445 Pro Ala Glu Glu Ala Pro Thr Glu Phe Gln Ser Pro Leu Pro Lys Glu 450 455 460 Thr Thr Ala Glu Glu Ala Ser Ala Glu Ile Gln Leu Leu Ala Ala Thr 465 470 475 480 Glu Pro Pro Ala Asp Glu Thr Pro Ala Glu Ala Arg Ser Pro Leu Ser 485 490 495 Glu Glu Thr Ser Ala Glu Glu Ala His Ala Glu Val Gln Ser Pro Leu 500 505 510 Ala Glu Glu Thr Thr Ala Glu Glu Ala Ser Ala Glu Ile Gln Leu Leu 515 520 525 Ala Ala Ile Glu Ala Pro Ala Asp Glu Thr Pro Ala Glu Ala Gln Ser 530 535 540 Pro Leu Ser Glu Glu Thr Ser Ala Glu Glu Ala Pro Ala Glu Val Gln 545 550 555 560 Ser Pro Ser Ala Lys Gly Val Ser Ile Glu Glu Ala Pro Leu Glu Leu 565 570 575 Gln Pro Pro Ser Gly Glu Glu Thr Thr Ala Glu Glu Ala Ser Ala Ala 580 585 590 Ile Gln Leu Leu Ala Ala Thr Glu Ala Ser Ala Glu Glu Ala Pro Ala 595 600 605 Glu Val Gln Pro Pro Pro Ala Glu Glu Ala Pro Ala Glu Val Gln Pro 610 615 620 Pro Pro Ala Glu Glu Ala Pro Ala Glu Val Gln Pro Pro Pro Ala Glu 625 630 635 640 Glu Thr Pro Ala Glu Val Gln Pro Pro Pro Ala Glu Glu Ala Pro Ala 645 650 655 Glu Val Gln Pro Pro Pro Ala Glu Glu Ala Pro Ala Glu Val Gln Pro 660 665 670 Pro Pro Ala Glu Glu Ala Pro Ala Glu Val Gln Ser Leu Pro Ala Glu 675 680 685 Glu Thr Pro Ile Glu Glu Thr Leu Ala Ala Val His Ser Pro Pro Ala 690 695 700 Asp Asp Val Pro Ala Glu Glu Ala Ser Val Asp Lys His Ser Pro Pro 705 710 715 720 Ala Asp Leu Leu Leu Thr Glu Glu Phe Pro Ile Gly Glu Ala Ser Ala 725 730 735 Glu Val Ser Pro Pro Pro Ser Glu 
Gln Thr Pro Glu Asp Glu Ala Leu 740 745 750 Val Glu Asn Val Ser Thr Glu Phe Gln Ser Pro Gln Val Ala Gly Ile 755 760 765 Pro Ala Val Lys Leu Gly Ser Val Val Leu Glu Gly Glu Ala Lys Phe 770 775 780 Glu Glu Val Ser Lys Ile Asn Ser Val Leu Lys Asp Leu Ser Asn Thr 785 790 795 800 Asn Asp Gly Gln Ala Pro Thr Leu Glu Ile Glu Ser Val Phe His Ile 805 810 815 Glu Leu Lys Gln Arg Pro Pro Glu Leu 820 825 <210> 54 <211> 3006 <212> DNA <213> Artificial Sequence <220> <223> FSCB <400> 54 cgcttactga ataaggtggg atccaacaag agtgtagtat ggaatgcaat agcttatgaa 60 ttactttttt tctgagggag ctcaacagaa tgacacctaa gaaagggaaa gtctttgaca 120 cttggtacgt ttgtgatttt tggtcattac ttgaaaatta ataagtttga aatcactact 180 cttagaaatg gaagaaagtg atgactctaa tcagcctatc tcagcgtgta ggcaagaaat 240 tcgaaagaga agatgaccca gcaaaccaat ggtaggcaaa tcccagcaaa ctgatgtaat 300 agagaaaaag aaacacatgg ccataccaaa atcatctagc cccaaagcta cccatcgtat 360 tggtaatact tctggaagca aaggcagcta ctctgccaaa gcctatgagt ctattagagt 420 atcttctgag cttcagcaaa cttggacaaa gagaaagcat ggacaggaaa tgactagtaa 480 gtctctccag acagacacca ttgtagaaga gaaaaaagaa gtcaagttag ttgaggaaac 540 cgtggtacct gaagaaaagt cagctgatgt tagagaagct gctattgaat tgccagagag 600 tgttcaggat gtagaaattc caccaaacat accttcagtt caactaaaaa tggacagatc 660 tcagcagacc agccgtacag gatactggac catgatgaac atcccccctg tagaaaaagt 720 ggacaaggaa caacagacat actttagtga atcagaaata gtggttattt ccaggccaga 780 tagttcttct acaaagtcaa aggaagatgc cctgaaacat aaatcgtcgg gaaagatttt 840 tgctagtgaa caacctgaat ttcaaccagc aacaaacagc aatgaagaaa ttgggcagaa 900 aaatatcagc agaacttcat ttactcagga gactaaaaaa ggtcccccgg tacttttaga 960 agatgagctt agggaagaag taactgtacc tgttgtacaa gaaggttctg ctgttaaaaa 1020 agtggcttct gctgaaatag agcctccatc aacagaaaaa ttcccagcta aaatacagcc 1080 tccattagtt gaagaggcca ctgctaaagc ggagcccaga cctgctgaag agacccatgt 1140 ccaagtacag ccatcaactg aagagactcc tgatgctgag gcagccactg cagttgcgga 1200 gaattctgtt aaagttcagc ctccacctgc tgaagaggcc cctttagtgg agtttcctgc 1260 tgaaattcag cctccatcag ctgaagagtc tccttctgta 
gagcttctgg ctgaaattct 1320 gcctccatca gctgaagagt ccctttcaga agagcctcct gctgaaattc tgcctccacc 1380 agctgaaaag tctccttcag tagagcctct tggtgaaatt cggtctccct cagcacaaaa 1440 ggctcccatt gaagtacagc ctttaccagc tgagggcgcc cttgaagagg cctcagctaa 1500 agtagagcct cccactgttg aagagaccct tgctgatgtt cagcctctat tacctgaaga 1560 ggctcctaga gaagaggctc gagaacttca gctttcaaca gctatggaga cccctgcaga 1620 agaggctcct actgaatttc agtctccatt acctaaagag accactgcag aagaggcctc 1680 tgctgaaatt cagcttctag cagctacgga gcctcctgca gatgaaactc ctgccgaagc 1740 tcggtctcca ctatctgagg agacttctgc agaagaggct catgctgaag ttcaatctcc 1800 attagctgaa gagaccactg cagaagaggc ctctgctgaa attcagcttc tagcagctat 1860 agaggctcct gcagatgaaa ctcctgctga agctcagtct ccactatctg aggagacttc 1920 tgcagaagag gctcctgctg aagttcagtc tccatcagct aagggagttt ctatagaaga 1980 ggcccctctt gagcttcagc ctccatcagg tgaagagacc actgcagaag aggcctctgc 2040 tgcaattcag cttctagcag ctacagaggc ttctgcagaa gaggctcctg ctgaagttca 2100 gcctccacca gctgaggagg cccccgctga agttcagcct ccaccagctg aggaggcccc 2160 cgctgaagtt cagcctccac cagctgagga gacccccgct gaagttcagc ctccaccagc 2220 tgaggaggcc cccgctgaag ttcagcctcc accagctgag gaggcccccg ctgaagttca 2280 gcctccacca gctgaggagg cccctgctga agttcagtct ctaccagctg aggagactcc 2340 tatagaagag acccttgctg cagtacactc tcccccagct gatgatgtcc ctgcagaaga 2400 ggcctccgtt gacaaacatt ccccaccagc tgatttgctt ctgactgagg agtttcctat 2460 aggagaggcc tctgctgaag tttcacctcc accatctgaa caaacccctg aagatgaggc 2520 tctggtagag aatgtgtcta cagaatttca gtcaccgcag gtggcaggaa ttccagcagt 2580 aaaattagga tcggttgttt tggaaggtga agcaaaattt gaagaggttt caaaaatcaa 2640 ttctgtcctt aaagatttgt ctaataccaa tgatggacag gctcccactc ttgaaataga 2700 aagtgttttt catatagaat taaaacaacg tcctcctgaa ctgtagtcag gttgtaccta 2760 agctagcaat cagaagctac atggttttgg aagaacatac tttagaaaag ggtgggcagc 2820 gggaagtagc tttgtcaata aggcaaatta aaggggaccc caagacttgg aatacaggtt 2880 ggaaaatgaa caataaaaac tgtagcagca taaaaaaaaa aaaaaaaaaa aaaaaaaaaa 2940 aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa 
aaaaaaaaaa 3000 aaaaaa 3006 <210> 55 <211> 483 <212> PRT <213> Artificial Sequence <220> <223> DUOXA1 <400> 55 Met Ala Thr Leu Gly His Thr Phe Pro Phe Tyr Ala Gly Pro Lys Pro 1 5 10 15 Thr Phe Pro Met Asp Thr Thr Leu Ala Ser Ile Ile Met Ile Phe Leu 20 25 30 Thr Ala Leu Ala Thr Phe Ile Val Ile Leu Pro Gly Ile Arg Gly Lys 35 40 45 Thr Arg Leu Phe Trp Leu Leu Arg Val Val Thr Ser Leu Phe Ile Gly 50 55 60 Ala Ala Ile Leu Ala Val Asn Phe Ser Ser Glu Trp Ser Val Gly Gln 65 70 75 80 Val Ser Thr Asn Thr Ser Tyr Lys Ala Phe Ser Ser Glu Trp Ile Ser 85 90 95 Ala Asp Ile Gly Leu Gln Val Gly Leu Gly Gly Val Asn Ile Thr Leu 100 105 110 Thr Gly Thr Pro Val Gln Gln Leu Asn Glu Thr Ile Asn Tyr Asn Glu 115 120 125 Glu Phe Thr Trp Arg Leu Gly Glu Asn Tyr Ala Glu Glu Tyr Ala Lys 130 135 140 Ala Leu Glu Lys Gly Leu Pro Asp Pro Val Leu Tyr Leu Ala Glu Lys 145 150 155 160 Phe Thr Pro Arg Ser Pro Cys Gly Leu Tyr Arg Gln Tyr Arg Leu Ala 165 170 175 Gly His Tyr Thr Ser Ala Met Leu Trp Val Ala Phe Leu Cys Trp Leu 180 185 190 Leu Ala Asn Val Met Leu Ser Met Pro Val Leu Val Tyr Gly Gly Tyr 195 200 205 Met Leu Leu Ala Thr Gly Ile Phe Gln Leu Leu Ala Leu Leu Phe Phe 210 215 220 Ser Met Ala Thr Ser Leu Thr Ser Pro Cys Pro Leu His Leu Gly Ala 225 230 235 240 Ser Val Leu His Thr His His Gly Pro Ala Phe Trp Ile Thr Leu Thr 245 250 255 Thr Gly Leu Leu Cys Val Leu Leu Gly Leu Ala Met Ala Val Ala His 260 265 270 Arg Met Gln Pro His Arg Leu Lys Ala Phe Phe Asn Gln Ser Val Asp 275 280 285 Glu Asp Pro Met Leu Glu Trp Ser Pro Glu Glu Gly Gly Leu Leu Ser 290 295 300 Pro Arg Tyr Arg Ser Met Ala Asp Ser Pro Lys Ser Gln Asp Ile Pro 305 310 315 320 Leu Ser Glu Ala Ser Ser Thr Lys Ala Tyr Tyr Arg Pro Arg Arg Leu 325 330 335 Ser Leu Val Pro Ala Asp Val Arg Gly Leu Ala Pro Ala Ala Leu Ser 340 345 350 Ala Leu Pro Gly Ala Leu Leu Ala Gln Ala Trp Arg Ala Leu Leu Pro 355 360 365 Gly Leu Arg Cys Pro Lys Ala Gly Lys Glu Ser Arg Leu Gly Pro Pro 370 375 380 His Ser Pro Trp Arg Phe Gly Pro Glu Gly Cys Glu Glu Arg Trp Ala 
385 390 395 400 Glu His Thr Gly Asp Ser Pro Arg Pro Leu Arg Gly Arg Gly Thr Gly 405 410 415 Arg Leu Trp Arg Trp Gly Ser Lys Glu Arg Arg Ala Cys Gly Val Arg 420 425 430 Ala Met Leu Pro Arg Leu Val Ser Asn Ser Gly Leu Lys Arg Pro Ser 435 440 445 Cys Leu Asp Leu Pro Lys Cys Trp Asp Tyr Arg Arg Asp Ala Arg Ala 450 455 460 Phe Phe His Leu Leu Glu Pro Thr Pro Cys Val Thr Ser Arg His Thr 465 470 475 480 Pro Leu Ile <210> 56 <211> 1923 <212> DNA <213> Artificial Sequence <220> <223> DUOXA1 <400> 56 cgcgcgaggt gagacggcga gggctcccgg ggcgcaggta gagatgttcc gtcggtgccg 60 aaggcccggc tagtgcggtt gtgtggacgg cgaaaaaaac caggccaggc gtcaaagcaa 120 gtcacccatc cgacctaaac ccctctaaga ccctggagtc acgtctccct ccggggatcc 180 tcgtgaggtc ttgggggtcc caccgccctg cgagccgcgc ccgcgccccg ccagacccgg 240 aactgcgtcg ctagaacgtc gggacctggt tccctcttta atcacacccc cgggggcgct 300 tcgtgtgagg ttccacgggg gaaggcgaga ggcgcaggct gccacacacg cactgtttgg 360 aagaggggaa gcattgcccc ccctgcacca cctcaccaag atggctactt tgggacacac 420 attccccttc tatgctggcc ccaagccaac cttcccgatg gacaccactt tggccagcat 480 catcatgatc tttctgactg cactggccac gttcatcgtc atcctgcctg gcattcgggg 540 aaagacgagg ctgttctggc tgcttcgggt ggtgaccagc ttattcatcg gggctgcaat 600 cctggctgtg aatttcagtt ctgagtggtc tgtgggccag gtcagcacca acacatcata 660 caaggccttc agttctgagt ggatcagcgc tgatattggg ctgcaggtcg ggctgggtgg 720 agtcaacatc acactcacag ggacccccgt gcagcagctg aatgagacca tcaattacaa 780 cgaggagttc acctggcgcc tgggtgagaa ctatgctgag gagtatgcaa aggctctgga 840 gaaggggctg ccagaccctg tgttgtacct agctgagaag ttcactccaa gaagcccatg 900 tggcctatac cgccagtacc gcctggcggg acactacacc tcagccatgc tatgggtggc 960 attcctctgc tggctgctgg ccaatgtgat gctctccatg cctgtgctgg tatatggtgg 1020 ctacatgcta ttggccacgg gcatcttcca gctgttggct ctgctcttct tctccatggc 1080 cacatcactc acctcaccct gtcccctgca cctgggcgct tctgtgctgc atactcacca 1140 tgggcctgcc ttctggatca cattgaccac aggactgctg tgtgtgctgc tgggcctggc 1200 tatggcggtg gcccacagga tgcagcctca caggctgaag gctttcttca accagagtgt 1260 ggatgaagac cccatgctgg 
agtggagtcc tgaggaaggt ggactcctga gcccccgcta 1320 ccggtccatg gctgacagtc ccaagtccca ggacattccc ctgtcagagg cttcctccac 1380 caaggcatac tatcgcccca ggagactttc cctggtgcct gcggatgtcc gaggcctcgc 1440 gccagcagcg ctcagtgccc ttcctggagc tctcctggcc caggcctggc gggcactgct 1500 tcccggcctg cgatgtccca aggcggggaa ggagtccaga ttgggtcccc ctcacagtcc 1560 ttggcgcttt ggtccagaag ggtgcgaaga gcgctgggcc gaacatactg gagactcacc 1620 acggcccctc cgaggaagag gcacaggacg cctgtggcgg tggggatcga aagaaaggag 1680 ggcatgtgga gtcagggcta tgttgcccag gctggtctcg aactctggcc tcaaacgacc 1740 ttcctgcctc gacctcccaa agtgctggga ttacaggcgt gatgcccggg ccttcttcca 1800 tcttttggag cctacccctt gtgttacctc ccgccacaca cctctaatct gaattacatg 1860 aaacacggca agacaccaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa 1920 aaa 1923 <210> 57 <211> 764 <212> PRT <213> Artificial Sequence <220> <223> DLG4 <400> 57 Met Ser Gln Arg Pro Arg Ala Pro Arg Ser Ala Leu Trp Leu Leu Ala 1 5 10 15 Pro Pro Leu Leu Arg Trp Ala Pro Pro Leu Leu Thr Val Leu His Ser 20 25 30 Asp Leu Phe Gln Ala Leu Leu Asp Ile Leu Asp Tyr Tyr Glu Ala Ser 35 40 45 Leu Ser Glu Ser Gln Lys Tyr Arg Tyr Gln Asp Glu Asp Thr Pro Pro 50 55 60 Leu Glu His Ser Pro Ala His Leu Pro Asn Gln Ala Asn Ser Pro Pro 65 70 75 80 Val Ile Val Asn Thr Asp Thr Leu Glu Ala Pro Gly Tyr Val Asn Gly 85 90 95 Thr Glu Gly Glu Met Glu Tyr Glu Glu Ile Thr Leu Glu Arg Gly Asn 100 105 110 Ser Gly Leu Gly Phe Ser Ile Ala Gly Gly Thr Asp Asn Pro His Ile 115 120 125 Gly Asp Asp Pro Ser Ile Phe Ile Thr Lys Ile Ile Pro Gly Gly Ala 130 135 140 Ala Ala Gln Asp Gly Arg Leu Arg Val Asn Asp Ser Ile Leu Phe Val 145 150 155 160 Asn Glu Val Asp Val Arg Glu Val Thr His Ser Ala Ala Val Glu Ala 165 170 175 Leu Lys Glu Ala Gly Ser Ile Val Arg Leu Tyr Val Met Arg Arg Lys 180 185 190 Pro Pro Ala Glu Lys Val Met Glu Ile Lys Leu Ile Lys Gly Pro Lys 195 200 205 Gly Leu Gly Phe Ser Ile Ala Gly Gly Val Gly Asn Gln His Ile Pro 210 215 220 Gly Asp Asn Ser Ile Tyr Val Thr Lys Ile Ile Glu Gly Gly Ala Ala 225 230 235 240 His Lys Asp Gly 
Arg Leu Gln Ile Gly Asp Lys Ile Leu Ala Val Asn 245 250 255 Ser Val Gly Leu Glu Asp Val Met His Glu Asp Ala Val Ala Ala Leu 260 265 270 Lys Asn Thr Tyr Asp Val Val Tyr Leu Lys Val Ala Lys Pro Ser Asn 275 280 285 Ala Tyr Leu Ser Asp Ser Tyr Ala Pro Pro Asp Ile Thr Thr Ser Tyr 290 295 300 Ser Gln His Leu Asp Asn Glu Ile Ser His Ser Ser Tyr Leu Gly Thr 305 310 315 320 Asp Tyr Pro Thr Ala Met Thr Pro Thr Ser Pro Arg Arg Tyr Ser Pro 325 330 335 Val Ala Lys Asp Leu Leu Gly Glu Glu Asp Ile Pro Arg Glu Pro Arg 340 345 350 Arg Ile Val Ile His Arg Gly Ser Thr Gly Leu Gly Phe Asn Ile Val 355 360 365 Gly Gly Glu Asp Gly Glu Gly Ile Phe Ile Ser Phe Ile Leu Ala Gly 370 375 380 Gly Pro Ala Asp Leu Ser Gly Glu Leu Arg Lys Gly Asp Gln Ile Leu 385 390 395 400 Ser Val Asn Gly Val Asp Leu Arg Asn Ala Ser His Glu Gln Ala Ala 405 410 415 Ile Ala Leu Lys Asn Ala Gly Gln Thr Val Thr Ile Ile Ala Gln Tyr 420 425 430 Lys Pro Glu Glu Tyr Ser Arg Phe Glu Ala Lys Ile His Asp Leu Arg 435 440 445 Glu Gln Leu Met Asn Ser Ser Leu Gly Ser Gly Thr Ala Ser Leu Arg 450 455 460 Ser Asn Pro Lys Arg Gly Phe Tyr Ile Arg Ala Leu Phe Asp Tyr Asp 465 470 475 480 Lys Thr Lys Asp Cys Gly Phe Leu Ser Gln Ala Leu Ser Phe Arg Phe 485 490 495 Gly Asp Val Leu His Val Ile Asp Ala Ser Asp Glu Glu Trp Trp Gln 500 505 510 Ala Arg Arg Val His Ser Asp Ser Glu Thr Asp Asp Ile Gly Phe Ile 515 520 525 Pro Ser Lys Arg Arg Val Glu Arg Arg Glu Trp Ser Arg Leu Lys Ala 530 535 540 Lys Asp Trp Gly Ser Ser Ser Gly Ser Gln Gly Arg Glu Asp Ser Val 545 550 555 560 Leu Ser Tyr Glu Thr Val Thr Gln Met Glu Val His Tyr Ala Arg Pro 565 570 575 Ile Ile Ile Leu Gly Pro Thr Lys Asp Arg Ala Asn Asp Asp Leu Leu 580 585 590 Ser Glu Phe Pro Asp Lys Phe Gly Ser Cys Val Pro His Thr Thr Arg 595 600 605 Pro Lys Arg Glu Tyr Glu Ile Asp Gly Arg Asp Tyr His Phe Val Ser 610 615 620 Ser Arg Glu Lys Met Glu Lys Asp Ile Gln Ala His Lys Phe Ile Glu 625 630 635 640 Ala Gly Gln Tyr Asn Ser His Leu Tyr Gly Thr Ser Val Gln Ser Val 645 650 655 Arg Glu Val Ala Glu 
Gln Gly Lys His Cys Ile Leu Asp Val Ser Ala 660 665 670 Asn Ala Val Arg Arg Leu Gln Ala Ala His Leu His Pro Ile Ala Ile 675 680 685 Phe Ile Arg Pro Arg Ser Leu Glu Asn Val Leu Glu Ile Asn Lys Arg 690 695 700 Ile Thr Glu Glu Gln Ala Arg Lys Ala Phe Asp Arg Ala Thr Lys Leu 705 710 715 720 Glu Gln Glu Phe Thr Glu Cys Phe Ser Ala Ile Val Glu Gly Asp Ser 725 730 735 Phe Glu Glu Ile Tyr His Lys Val Lys Arg Val Ile Glu Asp Leu Ser 740 745 750 Gly Pro Tyr Ile Trp Val Pro Ala Arg Glu Arg Leu 755 760 <210> 58 <211> 3289 <212> DNA <213> Artificial Sequence <220> <223> DLG4 <400> 58 ggcagaggca gagacatgga aagacagact ctagggttcc tgatgaaatc tatctcggcc 60 aacacaaaag ggagggtaca gtggtggggg gcacccaagc tagggtgtga gtaccctaag 120 tgtattcttc tgagatgtag gccattcact aactcttgga acagctacag tttcacagta 180 ggaagacccc cccagattca ctgcccctcc cttagtaaag cctctgagac cttcctgaac 240 attcccttct gtctttgccc tctgttcctt ccagagacca tgtgcccagg cagatggatt 300 cctcccgggc ctgagaggaa ctgcaggaat tctcctgcct cttacccgta aaaccccaac 360 ttctctagcc ctagggcagg aagtcccaaa caatttctac ccctttttct gcaattctca 420 ttggggtgag aggaggccca ggaggagaga gagctgggct cagcttcttt ttgagctgct 480 ggagccctct gtgaggaggc cctctttgct ggcttctcag gagagcgtgg ctaggttctg 540 cctgcctatg ggaagagggg gccagggtgt gtggagcaag atggtggcgg tgctggtgcc 600 ttgggacctg ggggaatggg acagctggtc ggctcagaga cggcctactt tactcacagc 660 tggaatttag tggggagaag cagctcaact ccaatcctgg aggattaggg agattaaagt 720 gagagaagag agagatgtcc cagagaccaa gagctcccag gtcagccctc tggctcctgg 780 cacccccact gctgcggtgg gcacccccac tcctcacagt gctgcatagc gacctcttcc 840 aggccttgct ggacatcctg gactattatg aggcttccct ctcagagagt cagaaatacc 900 gctaccaaga tgaagacacg ccccctctgg agcacagccc ggcccacctc cccaaccagg 960 ccaattctcc cccagtgatt gtcaacacag ataccctaga agccccagga tatgtgaacg 1020 ggaccgaggg ggagatggaa tacgaggaaa tcacattgga aaggggtaac tcaggtctgg 1080 gcttcagcat cgcaggtggc actgacaacc cacacatcgg tgacgaccca tccattttca 1140 tcaccaagat cattcctggt ggggctgcgg cccaggatgg ccgcctcagg gtcaacgaca 1200 gcatcctgtt 
tgtaaatgaa gtggacgtgc gcgaggtgac ccactcagcg gcggtggaag 1260 ccctcaaaga ggcaggctcc atcgttcgcc tctatgtcat gcgccggaag cccccggctg 1320 agaaggtcat ggagatcaag ctcatcaagg ggcctaaagg tcttggcttc agcatcgcag 1380 ggggcgtagg gaaccagcac atcccaggag ataatagcat ctatgtaaca aagatcatcg 1440 aagggggtgc tgcccacaag gatgggaggt tgcagattgg agacaagatc ctggcggtca 1500 acagtgtggg gctagaggac gtcatgcatg aagatgctgt ggcagccctg aagaacacgt 1560 atgatgttgt ctacctaaag gtggccaagc ccagcaatgc ctacctgagt gacagctatg 1620 ctcccccaga catcacaacc tcttattccc agcacctgga caatgagatc agtcacagca 1680 gctacctggg caccgactac cccacagcca tgacccccac ttcccctcgg cgctactctc 1740 cagtggccaa ggacctgctc ggggaggaag acattccccg agaaccgagg cgaattgtga 1800 tccaccgggg ctccacgggc ctgggcttca acatcgtggg tggcgaggac ggtgaaggca 1860 tcttcatctc ctttatcctg gccgggggcc ctgcagacct cagtggggag ctgcggaagg 1920 gggaccagat cctgtcggtc aacggtgtgg acctccgaaa tgccagccat gagcaggctg 1980 ccattgccct gaagaatgcg ggtcagacgg tcacgatcat cgctcagtat aaaccagaag 2040 agtacagccg attcgaggcc aagatccacg accttcggga acagctcatg aacagcagcc 2100 tgggctcagg gactgcgtcc ctgcggagca accccaaaag gggtttctac atcagggccc 2160 tgtttgatta cgacaagacc aaggactgcg gcttcctgag ccaggccctg agcttccgct 2220 ttggggatgt gctgcatgtc atcgatgcta gtgatgagga gtggtggcag gcacggcggg 2280 tccactctga cagtgagacc gacgacattg ggttcatccc cagcaaacgg cgggttgagc 2340 gacgagagtg gtcaaggtta aaggccaagg actggggctc cagctctgga tcgcagggtc 2400 gagaagactc ggttctgagc tacgagacag tgacgcagat ggaagtgcac tatgctcgcc 2460 ccatcatcat ccttgggccc accaaggacc gcgccaacga tgatcttctc tccgagttcc 2520 ccgacaagtt tggatcctgt gttccccata cgacacggcc caagcgggag tatgagatag 2580 atggccggga ttaccacttt gtgtcgtccc gggagaaaat ggagaaggac attcaggcgc 2640 acaagttcat tgaggccggc cagtacaaca gccacctcta tgggaccagc gtccagtccg 2700 tgcgagaggt ggcagagcag gggaagcact gcatcctcga tgtctcggcc aatgccgtgc 2760 ggcggctgca ggcggcccac ctgcacccca tcgccatctt catccgcccc cgctccctgg 2820 agaatgtgct agagattaac aagcggatca cagaggagca agcccgcaaa gccttcgaca 2880 gagccaccaa gctggagcag 
gagttcacag agtgcttctc agccatcgtg gagggtgaca 2940 gctttgagga gatctaccac aaggtgaagc gtgtcatcga ggacctctca ggcccctaca 3000 tctgggttcc agcccgagag agactctgat tcctgccctg gcttggcctg gactcgccct 3060 gcctccatca cctgggccct tggtctggac tgaattgccc aagcccttgg ctccccccgg 3120 cctccctccc accccttctt atttatttcc tttctaactg gatccagcct gttggagggg 3180 ggacactcct ctgcatgtat ccccgcaccc cagaactggg ctcctgaacg ccaggaacct 3240 ggggtctggg ggggagctgg gctccttgtt ccgagccctt gctccttag 3289 <210> 59 <211> 655 <212> PRT <213> Artificial Sequence <220> <223> ACADVL <400> 59 Met Gln Ala Ala Arg Met Ala Ala Ser Leu Gly Arg Gln Leu Leu Arg 1 5 10 15 Leu Gly Gly Gly Ser Ser Arg Leu Thr Ala Leu Leu Gly Gln Pro Arg 20 25 30 Pro Gly Pro Ala Arg Arg Pro Tyr Ala Gly Gly Ala Ala Gln Leu Ala 35 40 45 Leu Asp Lys Ser Asp Ser His Pro Ser Asp Ala Leu Thr Arg Lys Lys 50 55 60 Pro Ala Lys Ala Glu Ser Lys Ser Phe Ala Val Gly Met Phe Lys Gly 65 70 75 80 Gln Leu Thr Thr Asp Gln Val Phe Pro Tyr Pro Ser Val Leu Asn Glu 85 90 95 Glu Gln Thr Gln Phe Leu Lys Glu Leu Val Glu Pro Val Ser Arg Phe 100 105 110 Phe Glu Glu Val Asn Asp Pro Ala Lys Asn Asp Ala Leu Glu Met Val 115 120 125 Glu Glu Thr Thr Trp Gln Gly Leu Lys Glu Leu Gly Ala Phe Gly Leu 130 135 140 Gln Val Pro Ser Glu Leu Gly Gly Val Gly Leu Cys Asn Thr Gln Tyr 145 150 155 160 Ala Arg Leu Val Glu Ile Val Gly Met His Asp Leu Gly Val Gly Ile 165 170 175 Thr Leu Gly Ala His Gln Ser Ile Gly Phe Lys Gly Ile Leu Leu Phe 180 185 190 Gly Thr Lys Ala Gln Lys Glu Lys Tyr Leu Pro Lys Leu Ala Ser Gly 195 200 205 Glu Thr Val Ala Ala Phe Cys Leu Thr Glu Pro Ser Ser Gly Ser Asp 210 215 220 Ala Ala Ser Ile Arg Thr Ser Ala Val Pro Ser Pro Cys Gly Lys Tyr 225 230 235 240 Tyr Thr Leu Asn Gly Ser Lys Leu Trp Ile Ser Asn Gly Gly Leu Ala 245 250 255 Asp Ile Phe Thr Val Phe Ala Lys Thr Pro Val Thr Asp Pro Ala Thr 260 265 270 Gly Ala Val Lys Glu Lys Ile Thr Ala Phe Val Val Glu Arg Gly Phe 275 280 285 Gly Gly Ile Thr His Gly Pro Pro Glu Lys Lys Met Gly Ile Lys Ala 290 295 300 Ser Asn Thr Ala 
Glu Val Phe Phe Asp Gly Val Arg Val Pro Ser Glu 305 310 315 320 Asn Val Leu Gly Glu Val Gly Ser Gly Phe Lys Val Ala Met His Ile 325 330 335 Leu Asn Asn Gly Arg Phe Gly Met Ala Ala Ala Leu Ala Gly Thr Met 340 345 350 Arg Gly Ile Ile Ala Lys Ala Val Asp His Ala Thr Asn Arg Thr Gln 355 360 365 Phe Gly Glu Lys Ile His Asn Phe Gly Leu Ile Gln Glu Lys Leu Ala 370 375 380 Arg Met Val Met Leu Gln Tyr Val Thr Glu Ser Met Ala Tyr Met Val 385 390 395 400 Ser Ala Asn Met Asp Gln Gly Ala Thr Asp Phe Gln Ile Glu Ala Ala 405 410 415 Ile Ser Lys Ile Phe Gly Ser Glu Ala Ala Trp Lys Val Thr Asp Glu 420 425 430 Cys Ile Gln Ile Met Gly Gly Met Gly Phe Met Lys Glu Pro Gly Val 435 440 445 Glu Arg Val Leu Arg Asp Leu Arg Ile Phe Arg Ile Phe Glu Gly Thr 450 455 460 Asn Asp Ile Leu Arg Leu Phe Val Ala Leu Gln Gly Cys Met Asp Lys 465 470 475 480 Gly Lys Glu Leu Ser Gly Leu Gly Ser Ala Leu Lys Asn Pro Phe Gly 485 490 495 Asn Ala Gly Leu Leu Leu Gly Glu Ala Gly Lys Gln Leu Arg Arg Arg 500 505 510 Ala Gly Leu Gly Ser Gly Leu Ser Leu Ser Gly Leu Val His Pro Glu 515 520 525 Leu Ser Arg Ser Gly Glu Leu Ala Val Arg Ala Leu Glu Gln Phe Ala 530 535 540 Thr Val Val Glu Ala Lys Leu Ile Lys His Lys Lys Gly Ile Val Asn 545 550 555 560 Glu Gln Phe Leu Leu Gln Arg Leu Ala Asp Gly Ala Ile Asp Leu Tyr 565 570 575 Ala Met Val Val Val Leu Ser Arg Ala Ser Arg Ser Leu Ser Glu Gly 580 585 590 His Pro Thr Ala Gln His Glu Lys Met Leu Cys Asp Thr Trp Cys Ile 595 600 605 Glu Ala Ala Ala Arg Ile Arg Glu Gly Met Ala Ala Leu Gln Ser Asp 610 615 620 Pro Trp Gln Gln Glu Leu Tyr Arg Asn Phe Lys Ser Ile Ser Lys Ala 625 630 635 640 Leu Val Glu Arg Gly Gly Val Val Thr Ser Asn Pro Leu Gly Phe 645 650 655 <210> 60 <211> 5400 <212> DNA <213> Artificial Sequence <220> <223> ACADVL <400> 60 agtcagggtt aggggcgcca ggacgtggcg tgcaggacgc cagagctggg tcagagctcg 60 agccagcggc gcccggagag attcggagat gcaggcggct cggatggccg cgagcttggg 120 gcggcagctg ctgaggctcg ggggcggaag gtctgtgtgt gacaagaggg acggtgggca 180 gcggccctgg gcaccgggcc ggcactgaac 
ccccactccc cacagctcgc ggctcacggc 240 gctcctgggg cagccccggc ccggccctgc ccggcggccc tatgccgggg gtgccgctca 300 ggtaagtcac cgcagccttg gcaagggggt gtgggagcgg cggtccgctt cggcgcccgc 360 catcggcagg gatctccctc ttggtgccag gtacctgcct actgctcagt cgccgaaagt 420 aggggaaagg gcaagccagg gtcgcctagg gcgaaactag gggaaaggtc acccgttcgc 480 ggcctccccg cgccggtctc gcctgttctc cccttgacac agcggaagtc ccttccctga 540 acttgctaac cgtctctttt cccagctggc tctggacaag tcagattccc acccctctga 600 cgctctgacc aggaaaaaac cggccaaggc ggtaggtagc cccgaggcca ggtggacctt 660 agccagaccc aaccagagcc ctgaaatttg cctctctctg cccaggaatc taagtccttt 720 gctgtgggaa tgttcaaagg ccagctcacc acagatcagg tgttcccata cccgtccggt 780 aagggaaggg ataatcagag ctgggtgggg ccagggtggt ttcccctgcc agcctggcct 840 gaccagcctg tcccccaccc tctgcagtgc tcaacgaaga gcagacacag tttcttaaag 900 agctggtgga gcctgtgtcc cgtttcttcg aggtaaggaa tgactcgggg cttggtccct 960 ggtgaggtgt ttggagatgt taagctcaaa aggagcctgg atgtgggatc ctgtgccttc 1020 cccaggaagt gaacgatccc gccaagaatg acgctctgga gatggtggag gagaccactt 1080 ggcagggcct caaggagctg ggggcctttg gtctgcaagt gcccagtgag ctgggtggtg 1140 tgggcctttg caacacccag gtgagggcgc cctatcgcac atcccagtat gccatacccc 1200 agcttggcag actcagctct tttgccatag acctagagac tagggctaag gtctcttcta 1260 agcacctgcc ctgggtgcct gtgggatgga tgttaactgt ccaaacataa cacaatttac 1320 atgcagtccc ctaggcctga gcatgggcct caccctggtt cccaagtcct tacaaatctc 1380 taagttgggg attgcctaac aatagactga ctagtagcaa gtcaccctcc tacctagacc 1440 taagacagac cagcattctc tgctgtgccc tttgcacacc ccacttcttt tctacacact 1500 ggggatggcc caggtcagca ctgccctagg tcaggaactg ccctgttgcc cacactctcc 1560 tgttaaggtc aggtccccct gcagccagtg acaaccccag attcctgctt cccctccagt 1620 acgcccgttt ggtggagatc gtgggcatgc atgaccttgg cgtgggcatt accctggggg 1680 cccatcagag catcggtttc aaaggcatcc tgctctttgg cacaaaggcc cagaaagaaa 1740 aatacctccc caagctggca tctggtgagg caaccctagg agagccaggg attggggggc 1800 acactgggct tggcacagat taggccagtt ggcacttaga ttatcagatg gctgagcatt 1860 tcagttgggg gaaggttttg gggaggcatc acagtgtgct ggttgggaca 
tgcaaaagaa 1920 ctggatactc ccaggtgtta agggggaact gcctgctgga gggatgggga agtgggccga 1980 ggggactttg aagctcatca gaacttgggg taaagtagct ctctccccaa caggggagac 2040 tgtggccgct ttctgtctaa ccgagccctc aagcgggtca gatgcagcct ccatccgaac 2100 ctctgctgtg cccagcccct gtggaaaata ctataccctc aatggaagca agctttggat 2160 caggcaacct gcctcccatt tctccccttc tcctccgccc aattccaggc cccactgctc 2220 cccgtcctcc acgccctgaa tatcccattc ttccacagta atgggggcct agcagacatc 2280 ttcacggtct ttgccaagac accagttaca gatccagcca caggagccgt gaaggagaag 2340 atcacagctt ttgtggtgga gaggggcttc gggggcatta cccagtgagt gaatttgggt 2400 tgggggagct taggactgag gggcaggact gggctcctgg gcagatggct gttgcaagtc 2460 accctgggga cgtgtgcaaa agccaaagca ggtggactga atgtggcctt tggactaata 2520 tgtatgcaac gagtcaaagt tttggctcca gcaaccaagt ccaacacaaa ataggacata 2580 gccaggcctc cttaacctca gggcctgagg ggaagtggtg ctgtagcctc taatagtcta 2640 gtggtcgtca ttcctccctg gtgcataagg agcgaaggag cagtttttcc cccagtgaca 2700 acctgttgaa cacacctctg ctttcccaca ctgccctgac acagtgggcc ccctgagaag 2760 aagatgggca tcaaggcttc aaacacagca gaggtgttct ttgatggagt acgggtgcca 2820 tcggagaacg tgctgggtga ggttgggagt ggcttcaagg ttgccatgca catcctcaac 2880 aatggaaggt ttggcatggc tgcggccctg gcaggtacca tgagaggcat cattgctaag 2940 gcggtgagta ccctgcccga gtccctaggt aacccaaaca gaagtctcac tgtccccctt 3000 gccatgtgtc cctgatcact tgcaggcact ccctacacta gaaactcctc ccctaccagc 3060 agcccgactt gctagcttag gtctccatcc agcgtagact gaactctggt tgtatgcaaa 3120 acccatccct ctgcgcaagc ccagcccctt cctagggaga ctgcagaacc acactgaacc 3180 acagcgggat gtgtggaccc tcttccaggt agatcatgcc actaatcgta cccagtttgg 3240 ggagaaaatt cacaactttg ggctgatcca ggagaagctg gcacggatgg ttatgctgca 3300 gtatgtaact gaggtgaggg cctcccaagc ccctctccct ggagccctgg gcgtttcttc 3360 ccagtcgggt cagactacaa cccccagcag cacctggggc agtgggtctc cagctttaca 3420 ccaatgccct aggggatgcg gggaggcata gtcagctcag cttctgcgaa gagagacagc 3480 aatgatgttc tgctcaggtg cctgccagca gtaccagaag ttaattctac ctcatccctt 3540 acatccaccc cttttaagaa aacaaatcct ggaagcacct gatgaactga cccagaacaa 
3600 gtatctgcct gacctgacaa gctaggtcag cccttatctt ggagatctgg gtgatgaggc 3660 caagtctgac aaagcccttt gcaattttcc ttcccatgtc ccaactatgc aacctcagtc 3720 catggcttac atggtgagtg ctaacatgga ccagggagcc acggacttcc agatagaggc 3780 cgccatcagc aaaatctttg gctcggtgag gtcccaggca tgctggaggg agtccagttt 3840 gggtgctcag ctcccaaaac cagtctcatc tgttctttgt ccctaggagg cagcctggaa 3900 ggtgacagat gaatgcatcc aaatcatggg gggtatgggc ttcatgaagg tacaggacgg 3960 tcttctgcag agcctcggct gggccagggg tgggatggca catctcagca cgggcatata 4020 atttgtgtgg ccctgtgcta ggaacctgga gtagagcgtg tgctccgaga tcttcgcatc 4080 ttccggatct ttgaggggac aaatgacatt cttcggctgt ttgtggctct gcagggctgt 4140 atggtaagac agagaattgg gtgggggtag aggtggggag gacagtgagt cctgactgct 4200 ggaccctctt cccccatagg acaaaggaaa ggagctctct gggcttggca gtgctctaaa 4260 gaatcccttt gggaatgctg gcctcctgct aggagaggca ggcaaacagc tgaggcggta 4320 ggcttagggc cagagccagg ggagggcagg gtggtgtatg gcaactaacc agtcattctc 4380 cctcttcctc tcaggcgggc agggctgggc agcggcctga gtctcagcgg acttgtccac 4440 ccggagttga gtcggagtgg cgagctggta aggcggccag gggtccagga gagcctgcat 4500 cagggactgc agccgatggc ccctctgagc cccgcactgt ccccatctct taaggcagta 4560 cgggctctgg agcagtttgc cactgtggtg gaggccaagc tgataaaaca caagaagggg 4620 attgtcagta agtgagctct acaccattcc gcccctccct ttcctctcct tgagactaat 4680 gcccccaccc ccacccccac cccacctacc ggacagatga acagtttctg ctgcagcggc 4740 tggcagacgg ggccatcgac ctctatgcca tggtggtggt tctctcgagg tgaggaggca 4800 ggcagggaat gcctgagccg cagggggcct gggcctggat cccagccggc ccagatttat 4860 tttcatctcc tgcttcctgc cagggcctca agatccctga gtgagggcca ccccacggcc 4920 cagcatgaga aaatgctctg tgacacctgg tgtatcgagg tgagactcgg ggctgccaag 4980 ctcaggtgag ggctggaggt gcaggcccaa cccctccttc cctctcccca ggctgcagct 5040 cggatccgag agggcatggc cgccctgcag tctgacccct ggcagcaaga gctctaccgc 5100 aacttcaaaa gcatctccaa ggccttggtg gagcggggtg gtgtggtcac cagcaaccca 5160 cttggcttct gaatactccc ggccagggcc tgtcccagtt atgtgccttc cctcaagcca 5220 aagccgaagc ccctttcctt aaggccctgg tttgtcccga aggggcctag tgttcccagc 5280 
actgtgcctg ctctcaagag cacttactgc ctcgcaaata ataaaaattt ctagccagtc 5340 atgctttgct cctgtgtgac ggttctttnc ccctgctgcc tgcctccctc ccaaagaaag 5400 5400 <210> 61 <211> 752 <212> PRT <213> Artificial Sequence <220> <223> CDRT1 <400> 61 Met Glu Asn Leu Glu Ser Arg Leu Lys Asn Ala Pro Tyr Phe Arg Cys 1 5 10 15 Glu Lys Gly Thr Asp Ser Ile Pro Leu Cys Arg Lys Cys Glu Thr Arg 20 25 30 Val Leu Ala Trp Lys Ile Phe Ser Thr Lys Glu Trp Phe Cys Arg Ile 35 40 45 Asn Asp Ile Ser Gln Arg Arg Phe Leu Val Gly Ile Leu Lys Gln Leu 50 55 60 Asn Ser Leu Tyr Leu Leu His Tyr Phe Gln Asn Ile Leu Gln Thr Thr 65 70 75 80 Gln Gly Lys Asp Phe Ile Tyr Asn Arg Ser Arg Ile Asp Leu Ser Lys 85 90 95 Lys Glu Gly Lys Val Val Lys Ser Ser Leu Asn Gln Met Leu Asp Lys 100 105 110 Thr Val Glu Gln Lys Met Lys Glu Ile Leu Tyr Trp Phe Ala Asn Ser 115 120 125 Thr Gln Trp Thr Lys Ala Asn Tyr Thr Leu Leu Leu Leu Gln Met Cys 130 135 140 Asn Pro Lys Leu Leu Leu Thr Ala Ala Asn Val Ile Arg Val Leu Phe 145 150 155 160 Leu Arg Glu Glu Asn Asn Ile Ser Gly Leu Asn Gln Asp Ile Thr Asp 165 170 175 Val Cys Phe Ser Pro Glu Lys Asp His Ser Ser Lys Ser Ala Thr Ser 180 185 190 Gln Val Tyr Trp Thr Ala Lys Thr Gln His Thr Ser Leu Pro Leu Ser 195 200 205 Lys Ala Pro Glu Asn Glu His Phe Leu Gly Ala Ala Ser Asn Pro Glu 210 215 220 Glu Pro Trp Arg Asn Ser Leu Arg Cys Ile Ser Glu Met Asn Arg Leu 225 230 235 240 Phe Ser Gly Lys Ala Asp Ile Thr Lys Pro Gly Tyr Asp Pro Cys Asn 245 250 255 Leu Leu Val Asp Leu Asp Asp Ile Arg Asp Leu Ser Ser Gly Phe Ser 260 265 270 Lys Tyr Arg Asp Phe Ile Arg Tyr Leu Pro Ile His Leu Ser Lys Tyr 275 280 285 Ile Leu Arg Met Leu Asp Arg His Thr Leu Asn Lys Cys Ala Ser Val 290 295 300 Ser Gln His Trp Ala Ala Met Ala Gln Gln Val Lys Met Asp Leu Ser 305 310 315 320 Ala His Gly Phe Ile Gln Asn Gln Ile Thr Phe Leu Gln Gly Ser Tyr 325 330 335 Thr Arg Gly Ile Asp Pro Asn Tyr Ala Asn Lys Val Ser Ile Pro Val 340 345 350 Pro Lys Met Val Asp Asp Gly Lys Ser Met Arg Val Lys His Pro Lys 355 360 365 Trp Lys Leu Arg Thr Lys 
Asn Glu Tyr Asn Leu Trp Thr Ala Tyr Gln 370 375 380 Asn Glu Glu Thr Gln Gln Val Leu Met Glu Glu Arg Asn Val Phe Cys 385 390 395 400 Gly Thr Tyr Asn Val Arg Ile Leu Ser Asp Thr Trp Asp Gln Asn Arg 405 410 415 Val Ile His Tyr Ser Gly Gly Asp Leu Ile Ala Val Ser Ser Asn Arg 420 425 430 Lys Ile His Leu Leu Asp Ile Ile Gln Val Lys Ala Ile Pro Val Glu 435 440 445 Phe Arg Gly His Ala Gly Ser Val Arg Ala Leu Phe Leu Cys Glu Glu 450 455 460 Glu Asn Phe Leu Leu Ser Gly Ser Tyr Asp Leu Ser Ile Arg Tyr Trp 465 470 475 480 Asp Leu Lys Ser Gly Val Cys Thr Arg Ile Phe Gly Gly His Gln Gly 485 490 495 Thr Ile Thr Cys Met Asp Leu Cys Lys Asn Arg Leu Val Ser Gly Gly 500 505 510 Arg Asp Cys Gln Val Lys Val Trp Asp Val Asp Thr Gly Lys Cys Leu 515 520 525 Lys Thr Phe Arg His Lys Asp Pro Ile Leu Ala Thr Arg Ile Asn Asp 530 535 540 Thr Tyr Ile Val Ser Ser Cys Glu Arg Gly Leu Val Lys Val Trp His 545 550 555 560 Ile Ala Met Ala Gln Leu Val Lys Thr Leu Ser Gly His Glu Gly Ala 565 570 575 Val Lys Cys Leu Phe Phe Asp Gln Trp His Leu Leu Ser Gly Ser Thr 580 585 590 Asp Gly Leu Val Met Ala Trp Ser Met Val Gly Lys Tyr Glu Arg Cys 595 600 605 Leu Met Ala Phe Lys His Pro Lys Glu Val Leu Asp Val Ser Leu Leu 610 615 620 Phe Leu Arg Val Ile Ser Ala Cys Ala Asp Gly Lys Ile Arg Ile Tyr 625 630 635 640 Asn Phe Phe Asn Gly Asn Cys Met Lys Val Ile Lys Ala Asn Gly Arg 645 650 655 Gly Asp Pro Val Leu Ser Phe Phe Ile Gln Gly Asn Arg Ile Ser Val 660 665 670 Cys His Ile Ser Thr Phe Ala Lys Arg Ile Asn Val Gly Trp Asn Gly 675 680 685 Ile Glu Pro Ser Ala Thr Ala Gln Gly Gly Asn Ala Ser Leu Thr Glu 690 695 700 Cys Ala His Val Arg Leu His Ile Ala Gly His Leu Pro Ala Ser Arg 705 710 715 720 Leu Pro Val Ala Ala Val Gln Pro Met Thr Gly Gly Met Ala Pro Thr 725 730 735 Thr Ala Pro Thr His Val Leu Ala Met Leu Ile Leu Phe Ser Gly Val 740 745 750 <210> 62 <211> 2780 <212> DNA <213> Artificial Sequence <220> <223> CDRT1 <400> 62 agacttctgc tggcagttac tgagagagat aggctttcca tccatggcag ccatttactt 60 ttgctctggg agacgtttgt 
aatagaaaag gcacaactgg ggtatttatt catttccccc 120 cgttcctcta gtgtttggtg gcgttgccgt tgcaagtgcg cagggctaaa atgaactggt 180 tatcttagga tcatggaaaa cctggaatca aggctcaaga atgcccccta ttttcgttgt 240 gagaagggaa ccgattccat ccctctatgc cggaagtgtg agacgcgtgt cttagcctgg 300 aagatcttct ctaccaaaga gtggttctgc aggatcaatg acatatcaca gaggaggttt 360 ctagttggca ttctgaagca gttaaatagc ttatatttgt tacactattt ccaaaatatc 420 cttcagacca cacagggaaa ggatttcatc tataacaggt cccggatcga cctcagcaag 480 aaagagggga aagttgtgaa gtcctccttg aaccaaatgt tggataaaac agtagaacag 540 aagatgaaag agatcttgta ttggtttgcg aacagcaccc agtggaccaa ggcgaattat 600 actctcttac tgctgcagat gtgcaacccc aaattactgc tcactgctgc caatgtgatc 660 agagtcctgt ttctgagaga ggagaacaat atctcagggc tcaatcaaga catcacagat 720 gtgtgttttt cccctgagaa agaccacagc tccaagtctg cgacctcaca agtctattgg 780 acagccaaaa ctcagcacac atcccttcct ttgtccaaag ccccagaaaa tgaacacttc 840 cttggggcag catctaaccc tgaggaacca tggaggaatt cactccggtg tatatccgaa 900 atgaataggc tgttttctgg aaaagcagac ataaccaagc cagggtacga tccctgcaat 960 ctattggttg acctggatga catcagagac ctgtcttctg ggttcagcaa ataccgagac 1020 ttcatccgtt acctgcccat ccacctctcc aagtacattc taagaatgct ggatagacac 1080 accctgaaca agtgcgcctc tgtgagccag cactgggccg ccatggctca acaggtcaag 1140 atggacttgt cagcgcacgg cttcattcag aaccagatta ccttcttgca ggggtcctat 1200 acaagaggaa ttgatcctaa ttatgccaat aaggtttcta tcccagttcc taaaatggta 1260 gatgacggga agagcatgcg tgtgaaacat ccgaagtgga agttgagaac gaagaatgag 1320 tacaacctgt ggactgcata ccagaacgag gaaacgcagc aggtcctgat ggaggagaga 1380 aatgttttct gtgggaccta caatgttcgc attctctctg acacgtggga tcaaaaccga 1440 gtcatccact attccggggg agatctgata gctgtgtcat ctaatcgaaa gatccatctt 1500 ctggacatca tacaagtgaa agcgataccc gttgaattcc gaggccatgc tgggagtgtc 1560 cgggccctct tcctgtgtga ggaggaaaac tttctcctaa gcgggagcta tgacctaagt 1620 atcagatact gggatctgaa aagtggggtt tgcacacgaa tcttcggtgg tcaccagggg 1680 actatcactt gcatggactt gtgtaagaac aggctcgtat ctggaggaag agattgccag 1740 gtaaaagtat gggatgtaga cacagggaag tgcctgaaga 
cgtttagaca caaagacccc 1800 atcttggcca ccaggatcaa tgatacctac attgtgagca gctgtgagcg agggctggtg 1860 aaagtgtggc acattgccat ggcccagttg gtaaagactc tcagtggcca cgagggagct 1920 gtgaaatgcc tgttctttga ccagtggcat ctcctctcag gaagcactga tggcctggtc 1980 atggcctgga gcatggtggg gaagtacgag cgctgcctga tggccttcaa gcatcccaaa 2040 gaggtgctcg acgtgtccct tctcttcctc cgggtcatca gcgcctgtgc agatggcaag 2100 atccgaattt acaatttctt caatgggaac tgtatgaagg tgataaaagc caatggcaga 2160 ggcgatcctg tgctgtcctt ctttattcag ggcaacagaa tttcagtctg ccacatcagc 2220 acatttgcta aaagaattaa cgtgggatgg aatggaatcg aaccaagtgc tacagctcag 2280 ggaggaaatg cctccttgac cgagtgtgct catgtgagac tccacatcgc aggacactta 2340 ccagcatcga ggctgcccgt ggccgctgtc cagcccatga caggcgggat ggccccaacc 2400 acagctccga cccatgtgtt ggcaatgctg atccttttca gtggtgtgta gcagcaggta 2460 tacaggaaaa tgttgaagag ccccagggct cctgtgagtg gattcacccc caaggtcaga 2520 atggcaactc ctggaacagc acaacaagtg gcaaaggaca cagctagcaa cgggctggaa 2580 aaaggcagag aacgtggagt gatcatctcc aactcaactc aatcacactg cctaaaatcg 2640 gcactgacaa agtacaaata atccacctat ccagtagagg cagcactgat ctggctgatt 2700 cctggatatg gggaatcctc tttatgaaac aaacttacaa taagaaagaa tatgtttgtg 2760 ggaaaaaaaa aaaaaaaaaa 2780 <210> 63 <211> 269 <212> PRT <213> Artificial Sequence <220> <223> Basigin <400> 63 Met Ala Ala Ala Leu Phe Val Leu Leu Gly Phe Ala Leu Leu Gly Thr 1 5 10 15 His Gly Ala Ser Gly Ala Ala Gly Thr Val Phe Thr Thr Val Glu Asp 20 25 30 Leu Gly Ser Lys Ile Leu Leu Thr Cys Ser Leu Asn Asp Ser Ala Thr 35 40 45 Glu Val Thr Gly His Arg Trp Leu Lys Gly Gly Val Val Leu Lys Glu 50 55 60 Asp Ala Leu Pro Gly Gln Lys Thr Glu Phe Lys Val Asp Ser Asp Asp 65 70 75 80 Gln Trp Gly Glu Tyr Ser Cys Val Phe Leu Pro Glu Pro Met Gly Thr 85 90 95 Ala Asn Ile Gln Leu His Gly Pro Pro Arg Val Lys Ala Val Lys Ser 100 105 110 Ser Glu His Ile Asn Glu Gly Glu Thr Ala Met Leu Val Cys Lys Ser 115 120 125 Glu Ser Val Pro Pro Val Thr Asp Trp Ala Trp Tyr Lys Ile Thr Asp 130 135 140 Ser Glu Asp Lys Ala Leu Met Asn Gly Ser Glu Ser Arg Phe 
Phe Val 145 150 155 160 Ser Ser Ser Gln Gly Arg Ser Glu Leu His Ile Glu Asn Leu Asn Met 165 170 175 Glu Ala Asp Pro Gly Gln Tyr Arg Cys Asn Gly Thr Ser Ser Lys Gly 180 185 190 Ser Asp Gln Ala Ile Ile Thr Leu Arg Val Arg Ser His Leu Ala Ala 195 200 205 Leu Trp Pro Phe Leu Gly Ile Val Ala Glu Val Leu Val Leu Val Thr 210 215 220 Ile Ile Phe Ile Tyr Glu Lys Arg Arg Lys Pro Glu Asp Val Leu Asp 225 230 235 240 Asp Asp Asp Ala Gly Ser Ala Pro Leu Lys Ser Ser Gly Gln His Gln 245 250 255 Asn Asp Lys Gly Lys Asn Val Arg Gln Arg Asn Ser Ser 260 265 <210> 64 <211> 1586 <212> DNA <213> Artificial Sequence <220> <223> Basigin <400> 64 ggttgtagga ccggcgagga ataggaatca tggcggctgc gctgttcgtg ctgctgggat 60 tcgcgctgct gggcacccac ggagcctccg gggctgccgg cacagtcttc actaccgtag 120 aagaccttgg ctccaagata ctcctcacct gctccttgaa tgacagcgcc acagaggtca 180 cagggcaccg ctggctgaag gggggcgtgg tgctgaagga ggacgcgctg cccggccaga 240 aaacggagtt caaggtggac tccgacgacc agtggggaga gtactcctgc gtcttcctcc 300 ccgagcccat gggcacggcc aacatccagc tccacgggcc tcccagagtg aaggccgtga 360 agtcgtcaga acacatcaac gagggggaga cggccatgct ggtctgcaag tcagagtccg 420 tgccacctgt cactgactgg gcctggtaca agatcactga ctctgaggac aaggccctca 480 tgaacggctc cgagagcagg ttcttcgtga gttcctcgca gggccggtca gagctacaca 540 ttgagaacct gaacatggag gccgaccccg gccagtaccg gtgcaacggc accagctcca 600 agggctccga ccaggccatc atcacgctcc gcgtgcgcag ccacctggcc gccctctggc 660 ccttcctggg catcgtggct gaggtgctgg tgctggtcac catcatcttc atctacgaga 720 agcgccggaa gcccgaggac gtcctggatg atgacgacgc cggctctgca cccctgaaga 780 gcagcgggca gcaccagaat gacaaaggca agaacgtccg ccagaggaac tcttcctgag 840 gcaggtggcc cgaggacgct ccctgctccg cgtctgcgcc gccgccggag tccactccca 900 gtgcttgcaa gattccaagt tctcacctct taaagaaaac ccaccccgta gattcccatc 960 atacacttcc ttctttttta aaaaagttgg gttttctcca ttcaggattc tgttccttag 1020 gattttttct tctgaagtgt ttcacgagag cccgggagct gctgccctgc ggccccgtct 1080 gtggctttca gcctctgggt ctgagtcatg gccgggtggg cggcacagcc ttctccactg 1140 gccggagtca gtgccaggtc cttgcccttt 
gtggaaagtc acaggtcaca cgaggggccc 1200 cgtgtcctgc ctgtctgaag ccaatgctgt ctggttgcgc catttttgtg cttttatgtt 1260 taattttatg agggccacgg gtctgtgttc gactcagcct cagggacgac tctgacctct 1320 tggccacaga ggactcactt gcccacaccg agggcgaccc cgtcacagcc tcaagtcact 1380 cccaagcccc ctccttgtct gtgcatccgg gggcagctct ggagggggtt tgctggggaa 1440 ctggcgccat cgccgggact ccagaaccgc agaagcctcc ccagctcacc cctggaggac 1500 ggccggctct ctatagcacc agggctcacg tgggaacccc cctcccaccc accgccacaa 1560 taaagatcgc ccccacctcc agggtc 1586 <210> 65 <211> 20 <212> DNA <213> Artificial Sequence <220> <223> TPH2(chr12, start: 72335380) PCR primer sequence(left) <400> 65 gagtgacacg gcaacttcac 20 <210> 66 <211> 20 <212> DNA <213> Artificial Sequence <220> <223> TPH2(chr12, start: 72335380) PCR primer sequence(right) <400> 66 caactgctgt cttgccactt 20 <210> 67 <211> 20 <212> DNA <213> Artificial Sequence <220> <223> HTR1F(chr3, start: 88040035) PCR primer sequence(left) <400> 67 tggtgtccct cactctgtct 20 <210> 68 <211> 21 <212> DNA <213> Artificial Sequence <220> <223> HTR1F(chr3, start: 88040035) PCR primer sequence(right) <400> 68 gccagtggga tgtagaaagc t 21 <210> 69 <211> 20 <212> DNA <213> Artificial Sequence <220> <223> COL27A1(chr9, start: 117070009) PCR primer sequence(left) <400> 69 tcacaagatg cagggtccat 20 <210> 70 <211> 20 <212> DNA <213> Artificial Sequence <220> <223> COL27A1(chr9, start: 117070009) PCR primer sequence(right) <400> 70 ctggggatag aggcagacag 20 <210> 71 <211> 20 <212> DNA <213> Artificial Sequence <220> <223> TPH2(chr12, start: 72335380) PCR primer sequence(left) <400> 71 gagtgacacg gcaacttcac 20 <210> 72 <211> 20 <212> DNA <213> Artificial Sequence <220> <223> TPH2(chr12, start: 72335380) PCR primer sequence(right) <400> 72 caactgctgt cttgccactt 20 <210> 73 <211> 20 <212> DNA <213> Artificial Sequence <220> <223> COL27A1(chr9) PCR primer sequence(left) <400> 73 cagccaccaa aatccccaaa 20 <210> 74 <211> 20 <212> DNA <213> Artificial Sequence <220> <223> COL27A1(chr9) PCR primer sequence(right) 
<400> 74 aactggacgg gaagtaggtg 20 <210> 75 <211> 17 <212> DNA <213> Artificial Sequence <220> <223> DPP6(chr7) PCR primer sequence(left) <400> 75 agtgggaacc ggagaga 17 <210> 76 <211> 19 <212> DNA <213> Artificial Sequence <220> <223> DPP6(chr7) PCR primer sequence(right) <400> 76 ggaacgtaag gcgaattcc 19 <210> 77 <211> 20 <212> DNA <213> Artificial Sequence <220> <223> BTBD9(chr6) PCR primer sequence(left) <400> 77 cgctgcctcc tttattggtg 20 <210> 78 <211> 20 <212> DNA <213> Artificial Sequence <220> <223> BTBD9(chr6) PCR primer sequence(right) <400> 78 ctttgagtgt ccagagcagc 20 <210> 79 <211> 20 <212> DNA <213> Artificial Sequence <220> <223> IL1RN(chr2) PCR primer sequence(left) <400> 79 aacatcactg acctgagcga 20 <210> 80 <211> 20 <212> DNA <213> Artificial Sequence <220> <223> IL1RN(chr2) PCR primer sequence(right) <400> 80 ggcagtacta ctcgtcctcc 20 <210> 81 <211> 20 <212> DNA <213> Artificial Sequence <220> <223> BTBD9(chr6, start: 38142846) PCR primer sequence(left) <400> 81 cgctgcctcc tttattggtg 20 <210> 82 <211> 20 <212> DNA <213> Artificial Sequence <220> <223> BTBD9(chr6, start: 38142846) PCR primer sequence(right) <400> 82 ctttgagtgt ccagagcagc 20 <210> 83 <211> 19 <212> DNA <213> Artificial Sequence <220> <223> SGCE(chr7) PCR primer sequence(left) <400> 83 gacacaagtg ttttgcctt 19 <210> 84 <211> 18 <212> DNA <213> Artificial Sequence <220> <223> SGCE(chr7) PCR primer sequence(right) <400> 84 ggggtcatag tttacccg 18 <210> 85 <211> 20 <212> DNA <213> Artificial Sequence <220> <223> MECP2(chr10) PCR primer sequence(left) <400> 85 aggcatcttg acaaggagct 20 <210> 86 <211> 20 <212> DNA <213> Artificial Sequence <220> <223> MECP2(chr10) PCR primer sequence(right) <400> 86 ttcacggtaa ctgggagagg 20 <210> 87 <211> 22 <212> DNA <213> Artificial Sequence <220> <223> ITGA1(chr5) PCR primer sequence(left) <400> 87 tcggagtgaa aatgcatctc tg 22 <210> 88 <211> 20 <212> DNA <213> Artificial Sequence <220> <223> ITGA1(chr5) PCR primer sequence(right) <400> 88 tctgtcactt accgagagca 
20 <210> 89 <211> 20 <212> DNA <213> Artificial Sequence <220> <223> DRD3(chr3) PCR primer sequence(left) <400> 89 tggatgaggg acaggatggt 20 <210> 90 <211> 20 <212> DNA <213> Artificial Sequence <220> <223> DRD3(chr3) PCR primer sequence(right) <400> 90 accaagcccc aaagagtctg 20 <210> 91 <211> 17 <212> DNA <213> Artificial Sequence <220> <223> DPP6(chr7) PCR primer sequence(left) <400> 91 agtgggaacc ggagaga 17 <210> 92 <211> 19 <212> DNA <213> Artificial Sequence <220> <223> DPP6(chr7) PCR primer sequence(right) <400> 92 ggaacgtaag gcgaattcc 19 <210> 93 <211> 20 <212> DNA <213> Artificial Sequence <220> <223> SGCE(chr7) PCR primer sequence(left) <400> 93 caggttttgg gtaaggtgga 20 <210> 94 <211> 22 <212> DNA <213> Artificial Sequence <220> <223> SGCE(chr7) PCR primer sequence(right) <400> 94 gacccctctt tataaacagc gt 22 <210> 95 <211> 20 <212> DNA <213> Artificial Sequence <220> <223> COL27A1(chr9) PCR primer sequence(left) <400> 95 cttctgtggc ctagagtccc 20 <210> 96 <211> 20 <212> DNA <213> Artificial Sequence <220> <223> COL27A1(chr9) PCR primer sequence(right) <400> 96 cacagattta ggggaggcca 20 <210> 97 <211> 20 <212> DNA <213> Artificial Sequence <220> <223> COL27A1(chr9) PCR primer sequence(left) <400> 97 aaagtcagcc ctacccactc 20 <210> 98 <211> 20 <212> DNA <213> Artificial Sequence <220> <223> COL27A1(chr9) PCR primer sequence(right) <400> 98 catggctggt tatcttggcc 20 <210> 99 <211> 20 <212> DNA <213> Artificial Sequence <220> <223> USH2A-1(chr1) PCR primer sequence(left) <400> 99 aaagtcagcc ctacccactc 20 <210> 100 <211> 20 <212> DNA <213> Artificial Sequence <220> <223> USH2A-1(chr1) PCR primer sequence(right) <400> 100 catggctggt tatcttggcc 20 <210> 101 <211> 20 <212> DNA <213> Artificial Sequence <220> <223> USH2A-2(chr1) PCR primer sequence(left) <400> 101 gcttgaaagg ctagctgtgc 20 <210> 102 <211> 20 <212> DNA <213> Artificial Sequence <220> <223> USH2A-2(chr1) PCR primer sequence(right) <400> 102 tcatgctgga actgttgggt 20 <210> 103 <211> 19 <212> DNA <213> 
Artificial Sequence <220> <223> CEP290(chr12) PCR primer sequence(left) <400> 103 gcagatccac aatagaaca 19 <210> 104 <211> 18 <212> DNA <213> Artificial Sequence <220> <223> CEP290(chr12) PCR primer sequence(right) <400> 104 cacttaaaac agcagcag 18 <210> 105 <211> 18 <212> DNA <213> Artificial Sequence <220> <223> CYP2D6(chr22) PCR primer sequence(left) <400> 105 catctgggaa acagtgca 18 <210> 106 <211> 18 <212> DNA <213> Artificial Sequence <220> <223> CYP2D6(chr22) PCR primer sequence(right) <400> 106 atgtcacggg atgtcata 18 <210> 107 <211> 20 <212> DNA <213> Artificial Sequence <220> <223> DRD5(chr4) PCR primer sequence(left) <400> 107 ggggcagttc gctctatacc 20 <210> 108 <211> 19 <212> DNA <213> Artificial Sequence <220> <223> DRD5(chr4) PCR primer sequence(right) <400> 108 cggtccacgc tgatgacgc 19 <210> 109 <211> 20 <212> DNA <213> Artificial Sequence <220> <223> SLC6A2(chr16) PCR primer sequence(left) <400> 109 ttctctccct tctctgccca 20 <210> 110 <211> 20 <212> DNA <213> Artificial Sequence <220> <223> SLC6A2(chr16) PCR primer sequence(right) <400> 110 gacatcacag tgagctgggt 20 <210> 111 <211> 19 <212> DNA <213> Artificial Sequence <220> <223> PRODH(chr22) PCR primer sequence(left) <400> 111 catgacataa aagctgagg 19 <210> 112 <211> 17 <212> DNA <213> Artificial Sequence <220> <223> PRODH(chr22) PCR primer sequence(right) <400> 112 ccacaggatg cctatga 17 <210> 113 <211> 20 <212> DNA <213> Artificial Sequence <220> <223> ASIC3-1(chr7) PCR primer sequence(left) <400> 113 catcatcgat cagctgggct 20 <210> 114 <211> 20 <212> DNA <213> Artificial Sequence <220> <223> ASIC3-1(chr7) PCR primer sequence(right) <400> 114 gggtgggcac agttcttgta 20 <210> 115 <211> 20 <212> DNA <213> Artificial Sequence <220> <223> ASIC3-2(chr7) PCR primer sequence(left) <400> 115 tagccccctg actgactctc 20 <210> 116 <211> 20 <212> DNA <213> Artificial Sequence <220> <223> ASIC3-2(chr7) PCR primer sequence(right) <400> 116 agtccagcag catgtcatcc 20 <210> 117 <211> 20 <212> DNA <213> Artificial Sequence 
<220> <223> TRAPPC9-1(chr8) PCR primer sequence(left) <400> 117 agcttcactg tgacggcttt 20 <210> 118 <211> 20 <212> DNA <213> Artificial Sequence <220> <223> TRAPPC9-1(chr8) PCR primer sequence(right) <400> 118 aaaacaaaac cagcctgggc 20 <210> 119 <211> 20 <212> DNA <213> Artificial Sequence <220> <223> TRAPPC9-2(chr8) PCR primer sequence(left) <400> 119 gaaggaggcc cagttctgtc 20 <210> 120 <211> 20 <212> DNA <213> Artificial Sequence <220> <223> TRAPPC9-2(chr8) PCR primer sequence(right) <400> 120 agtctgtaag cctcccccat 20 <210> 121 <211> 20 <212> DNA <213> Artificial Sequence <220> <223> HTR3A(chr11) PCR primer sequence(left) <400> 121 accatgttca ggtcaccacc 20 <210> 122 <211> 20 <212> DNA <213> Artificial Sequence <220> <223> HTR3A(chr11) PCR primer sequence(right) <400> 122 agggttcaga ccttggcttg 20 <210> 123 <211> 17 <212> DNA <213> Artificial Sequence <220> <223> DPP6(chr7, start: 153750096) PCR primer sequence(left) <400> 123 agtgggaacc ggagaga 17 <210> 124 <211> 19 <212> DNA <213> Artificial Sequence <220> <223> DPP6(chr7, start: 153750096) PCR primer sequence(right) <400> 124 ggaacgtaag gcgaattcc 19 <210> 125 <211> 20 <212> DNA <213> Artificial Sequence <220> <223> ARHGAP32(chr11) PCR primer sequence(left) <400> 125 ctgaccagga ggaactgagc 20 <210> 126 <211> 19 <212> DNA <213> Artificial Sequence <220> <223> ARHGAP32(chr11) PCR primer sequence(right) <400> 126 ggcgcaaatg tcacaaact 19 <210> 127 <211> 20 <212> DNA <213> Artificial Sequence <220> <223> ARHGAP32(chr11, start: 60704125) PCR primer sequence(left) <400> 127 ctgaccagga ggaactgagc 20 <210> 128 <211> 19 <212> DNA <213> Artificial Sequence <220> <223> ARHGAP32(chr11, start: 60704125) PCR primer sequence(right) <400> 128 ggcgcaaatg tcacaaact 19 <210> 129 <211> 20 <212> DNA <213> Artificial Sequence <220> <223> DNAH10(chr12, start: 124323060) PCR primer sequence(left) <400> 129 gaacagtgtc tccgctctcc 20 <210> 130 <211> 20 <212> DNA <213> Artificial Sequence <220> <223> DNAH10(chr12, start: 124323060) PCR primer 
sequence(right) <400> 130 ttgaggcttt tctggcattt 20 <210> 131 <211> 20 <212> DNA <213> Artificial Sequence <220> <223> DNAH10(chr12, start: 124315194) PCR primer sequence(left) <400> 131 aaatgaccga aacgttcacc 20 <210> 132 <211> 20 <212> DNA <213> Artificial Sequence <220> <223> DNAH10(chr12, start: 124315194) PCR primer sequence(right) <400> 132 cataccacca cgctcagcta 20 <210> 133 <211> 20 <212> DNA <213> Artificial Sequence <220> <223> FN1(chr2, start: 216239973) PCR primer sequence(left) <400> 133 agcatggaag cagcaatacc 20 <210> 134 <211> 20 <212> DNA <213> Artificial Sequence <220> <223> FN1(chr2, start: 216239973) PCR primer sequence(right) <400> 134 attgatgcac catccaacct 20 <210> 135 <211> 20 <212> DNA <213> Artificial Sequence <220> <223> COL27A1(chr9, start: 116930998) PCR primer sequence(left) <400> 135 cgctcaacca tcacagaaga 20 <210> 136 <211> 20 <212> DNA <213> Artificial Sequence <220> <223> COL27A1(chr9, start: 116930998) PCR primer sequence(right) <400> 136 gagactctgg caggaactgg 20 <210> 137 <211> 23 <212> DNA <213> Artificial Sequence <220> <223> TBCD(chr17, start: 80755631) PCR primer sequence(left) <400> 137 ttttcagatg aatttttggg aga 23 <210> 138 <211> 20 <212> DNA <213> Artificial Sequence <220> <223> TBCD(chr17, start: 80755631) PCR primer sequence(right) <400> 138 gggcaaacag tcttcacgtt 20 <210> 139 <211> 19 <212> DNA <213> Artificial Sequence <220> <223> PRX(chr19, start: 40900339) PCR primer sequence(left) <400> 139 ttccccagtg accatctca 19 <210> 140 <211> 20 <212> DNA <213> Artificial Sequence <220> <223> PRX(chr19, start: 40900339) PCR primer sequence(right) <400> 140 gcgtaccttc tgcctctcac 20 <210> 141 <211> 20 <212> DNA <213> Artificial Sequence <220> <223> PRX(chr19, start: 40901579) PCR primer sequence(left) <400> 141 gaacttggaa gagggcttga 20 <210> 142 <211> 20 <212> DNA <213> Artificial Sequence <220> <223> PRX(chr19, start: 40901579) PCR primer sequence(right) <400> 142 tagacctgcc aggagcactt 20 <210> 143 <211> 20 <212> DNA <213> Artificial 
Sequence <220> <223> DRD3(chr3, start: 113890728) PCR primer sequence(left) <400> 143 tataccaccc agggcatcac 20 <210> 144 <211> 20 <212> DNA <213> Artificial Sequence <220> <223> DRD3(chr3, start: 113890728) PCR primer sequence(right) <400> 144 actacacctg tggggcagag 20 <210> 145 <211> 20 <212> DNA <213> Artificial Sequence <220> <223> FBN3(chr19, start: 8161788) PCR primer sequence(left) <400> 145 gacctggaca gagccatacc 20 <210> 146 <211> 20 <212> DNA <213> Artificial Sequence <220> <223> FBN3(chr19, start: 8161788) PCR primer sequence(right) <400> 146 cccagatgtc gatgagtgtg 20 <210> 147 <211> 20 <212> DNA <213> Artificial Sequence <220> <223> FBN3(chr19, start: 8151993) PCR primer sequence(left) <400> 147 agtttcctgc acccatgaag 20 <210> 148 <211> 20 <212> DNA <213> Artificial Sequence <220> <223> FBN3(chr19, start: 8151993) PCR primer sequence(right) <400> 148 agtgtgcaga tggtcagcag 20 <210> 149 <211> 20 <212> DNA <213> Artificial Sequence <220> <223> COL27A1(chr9, start: 116931124) PCR primer sequence(left) <400> 149 cagttcctgc cagagtctcc 20 <210> 150 <211> 20 <212> DNA <213> Artificial Sequence <220> <223> COL27A1(chr9, start: 116931124) PCR primer sequence(right) <400> 150 ctggcatggc tggttatctt 20 <210> 151 <211> 20 <212> DNA <213> Artificial Sequence <220> <223> COL27A1(chr9, start: 116994117) PCR primer sequence(left) <400> 151 catttgcccc cttttacaga 20 <210> 152 <211> 20 <212> DNA <213> Artificial Sequence <220> <223> COL27A1(chr9, start: 116994117) PCR primer sequence(right) <400> 152 gcagagaaac cacagtgcaa 20 <210> 153 <211> 20 <212> DNA <213> Artificial Sequence <220> <223> CACNA1B(chr9, start: 141014669) PCR primer sequence(left) <400> 153 tgactgtgag accaggatgg 20 <210> 154 <211> 20 <212> DNA <213> Artificial Sequence <220> <223> CACNA1B(chr9, start: 141014669) PCR primer sequence(right) <400> 154 tggtgctgca aagatgagtc 20 <210> 155 <211> 18 <212> DNA <213> Artificial Sequence <220> <223> CACNA1B(chr9, start: 140772393) PCR primer sequence(left) <400> 155 
acgtgaccgg ccccttat 18 <210> 156 <211> 21 <212> DNA <213> Artificial Sequence <220> <223> CACNA1B(chr9, start: 140772393) PCR primer sequence(right) <400> 156 cgatcgattg cttgtagagg a 21 <210> 157 <211> 20 <212> DNA <213> Artificial Sequence <220> <223> KAT6B(chr10, start: 76781852) PCR primer sequence(left) <400> 157 cagtaggcaa tcacctgcaa 20 <210> 158 <211> 20 <212> DNA <213> Artificial Sequence <220> <223> KAT6B(chr10, start: 76781852) PCR primer sequence(right) <400> 158 ttgggggaga gctttgaata 20 <210> 159 <211> 20 <212> DNA <213> Artificial Sequence <220> <223> COL6A1(chr21, start: 47422538) PCR primer sequence(left) <400> 159 cttgtcccca gaaagacgag 20 <210> 160 <211> 19 <212> DNA <213> Artificial Sequence <220> <223> COL6A1(chr21, start: 47422538) PCR primer sequence(right) <400> 160 gcggtgacat tcttcagga 19 <210> 161 <211> 20 <212> DNA <213> Artificial Sequence <220> <223> ROBO3(chr11, start: 124744033) PCR primer sequence(left) <400> 161 ggagtaggca ggttgggagt 20 <210> 162 <211> 20 <212> DNA <213> Artificial Sequence <220> <223> ROBO3(chr11, start: 124744033) PCR primer sequence(right) <400> 162 cactgctcga accagaaaca 20 <210> 163 <211> 20 <212> DNA <213> Artificial Sequence <220> <223> PRODH(chr22, start: 18905859) PCR primer sequence(left) <400> 163 ctgccctgag aagacagagg 20 <210> 164 <211> 20 <212> DNA <213> Artificial Sequence <220> <223> PRODH(chr22, start: 18905859) PCR primer sequence(right) <400> 164 ccacaggatg cctatgacaa 20 <210> 165 <211> 20 <212> DNA <213> Artificial Sequence <220> <223> CTNNA3(chr10, start: 68040262) PCR primer sequence(left) <400> 165 aggcattcca gatggtgaag 20 <210> 166 <211> 20 <212> DNA <213> Artificial Sequence <220> <223> CTNNA3(chr10, start: 68040262) PCR primer sequence(right) <400> 166 caagtgaatg ttgccttgga 20 <210> 167 <211> 18 <212> DNA <213> Artificial Sequence <220> <223> DGKQ(chr4, start: 955317) PCR primer sequence(left) <400> 167 gctcaccatg tgcacgac 18 <210> 168 <211> 20 <212> DNA <213> Artificial Sequence <220> <223> 
DGKQ(chr4, start: 955317) PCR primer sequence(right) <400> 168 cttcatcaac atccccaggt 20 <210> 169 <211> 19 <212> DNA <213> Artificial Sequence <220> <223> DGKQ(chr4, start: 961785) PCR primer sequence(left) <400> 169 aagctctgcg tcttgctga 19 <210> 170 <211> 19 <212> DNA <213> Artificial Sequence <220> <223> DGKQ(chr4, start: 961785) PCR primer sequence(right) <400> 170 gtggggtctt tccctggac 19 <210> 171 <211> 20 <212> DNA <213> Artificial Sequence <220> <223> ROBO4(chr11, start: 124764205) PCR primer sequence(left) <400> 171 gcactgccct cacctaaaag 20 <210> 172 <211> 20 <212> DNA <213> Artificial Sequence <220> <223> ROBO4(chr11, start: 124764205) PCR primer sequence(right) <400> 172 gctgttcacc tctgcttgtg 20 <210> 173 <211> 20 <212> DNA <213> Artificial Sequence <220> <223> MDN1(chr6, start: 90428873) PCR primer sequence(left) <400> 173 ccacgggaaa ggactgagta 20 <210> 174 <211> 20 <212> DNA <213> Artificial Sequence <220> <223> MDN1(chr6, start: 90428873) PCR primer sequence(right) <400> 174 acccatacat gggaaccaga 20 <210> 175 <211> 21 <212> DNA <213> Artificial Sequence <220> <223> MDN1(chr6, start: 90382295) PCR primer sequence(left) <400> 175 tgcctgattt cagacatacc a 21 <210> 176 <211> 20 <212> DNA <213> Artificial Sequence <220> <223> MDN1(chr6, start: 90382295) PCR primer sequence(right) <400> 176 gttggacgaa ggatttgtgg 20 <210> 177 <211> 20 <212> DNA <213> Artificial Sequence <220> <223> PCNT(chr21, start: 47832788) PCR primer sequence(left) <400> 177 gtactggttc ccagctccag 20 <210> 178 <211> 20 <212> DNA <213> Artificial Sequence <220> <223> PCNT(chr21, start: 47832788) PCR primer sequence(right) <400> 178 aggcgcattt catttttcac 20 <210> 179 <211> 20 <212> DNA <213> Artificial Sequence <220> <223> PCNT(chr21, start: 47847674) PCR primer sequence(left) <400> 179 ttctgcaggt tgtgcaagag 20 <210> 180 <211> 20 <212> DNA <213> Artificial Sequence <220> <223> PCNT(chr21, start: 47847674) PCR primer sequence(right) <400> 180 gcagagctga cactcacctg 20 <210> 181 <211> 20 <212> 
DNA <213> Artificial Sequence <220> <223> NTN4(chr12, start: 96076512) PCR primer sequence(left) <400> 181 tcccctcata ggatccaaaa 20 <210> 182 <211> 20 <212> DNA <213> Artificial Sequence <220> <223> NTN4(chr12, start: 96076512) PCR primer sequence(right) <400> 182 tgcacaataa gagcgaacca 20 <210> 183 <211> 20 <212> DNA <213> Artificial Sequence <220> <223> RCAN1(chr21, start: 35897642) PCR primer sequence(left) <400> 183 cgttaaggag cagtcggaac 20 <210> 184 <211> 20 <212> DNA <213> Artificial Sequence <220> <223> RCAN1(chr21, start: 35897642) PCR primer sequence(right) <400> 184 tcaagagagg tggggaaaaa 20 <210> 185 <211> 20 <212> DNA <213> Artificial Sequence <220> <223> KAT6B(chr10, start: 76788690) PCR primer sequence(left) <400> 185 acatgtgccc ctgtaagtcc 20 <210> 186 <211> 20 <212> DNA <213> Artificial Sequence <220> <223> KAT6B(chr10, start: 76788690) PCR primer sequence(right) <400> 186 ttttccgtgg agatttctgg 20 <210> 187 <211> 20 <212> DNA <213> Artificial Sequence <220> <223> COL8A1(chr3, start: 99509813) PCR primer sequence(left) <400> 187 agatgcccca cttgcagtat 20 <210> 188 <211> 20 <212> DNA <213> Artificial Sequence <220> <223> COL8A1(chr3, start: 99509813) PCR primer sequence(right) <400> 188 tccccctctg atcccataat 20 <210> 189 <211> 20 <212> DNA <213> Artificial Sequence <220> <223> FN1(chr2, start: 216296589) PCR primer sequence(left) <400> 189 ctaagcatcc cagctcttgc 20 <210> 190 <211> 20 <212> DNA <213> Artificial Sequence <220> <223> FN1(chr2, start: 216296589) PCR primer sequence(right) <400> 190 catgaagggg gtcagtccta 20 <210> 191 <211> 20 <212> DNA <213> Artificial Sequence <220> <223> ROBO4(chr11, start: 124763789) PCR primer sequence(left) <400> 191 cctggtcaga gatccaaagc 20 <210> 192 <211> 20 <212> DNA <213> Artificial Sequence <220> <223> ROBO4(chr11, start: 124763789) PCR primer sequence(right) <400> 192 cagctgaggg ctaccttgaa 20 <210> 193 <211> 20 <212> DNA <213> Artificial Sequence <220> <223> ROBO4(chr11, start: 124765478) PCR primer sequence(left) 
<400> 193 gccagaggat ggtctcactt 20 <210> 194 <211> 20 <212> DNA <213> Artificial Sequence <220> <223> ROBO4(chr11, start: 124765478) PCR primer sequence(right) <400> 194 cgttcctgag ctctctgacc 20 <210> 195 <211> 20 <212> DNA <213> Artificial Sequence <220> <223> ROBO3(chr11, start: 124742934) PCR primer sequence(left) <400> 195 gagtgactgg gaaccctcaa 20 <210> 196 <211> 20 <212> DNA <213> Artificial Sequence <220> <223> ROBO3(chr11, start: 124742934) PCR primer sequence(right) <400> 196 ggctacaggc ccagtgagta 20 <210> 197 <211> 20 <212> DNA <213> Artificial Sequence <220> <223> SLC6A2(chr16, start: 55690691) PCR primer sequence(left) <400> 197 gaccggtaaa gttcctctcg 20 <210> 198 <211> 20 <212> DNA <213> Artificial Sequence <220> <223> SLC6A2(chr16, start: 55690691) PCR primer sequence(right) <400> 198 atcttcttgc cccaggtctc 20 <210> 199 <211> 20 <212> DNA <213> Artificial Sequence <220> <223> ROBO4(chr11, start: 124761429) PCR primer sequence(left) <400> 199 gaggctgtct gagctggaac 20 <210> 200 <211> 20 <212> DNA <213> Artificial Sequence <220> <223> ROBO4(chr11, start: 124761429) PCR primer sequence(right) <400> 200 gatctcaggg atggaaagca 20 <210> 201 <211> 20 <212> DNA <213> Artificial Sequence <220> <223> TH(chr11, start: 2186957) PCR primer sequence(left) <400> 201 gaggactggg cagagacaag 20 <210> 202 <211> 20 <212> DNA <213> Artificial Sequence <220> <223> TH(chr11, start: 2186957) PCR primer sequence(right) <400> 202 actggttcac ggtggagttc 20 <210> 203 <211> 20 <212> DNA <213> Artificial Sequence <220> <223> SLC6A20(chr3, start: 45814094) PCR primer sequence(left) <400> 203 gcccctgatg aggtagatga 20 <210> 204 <211> 20 <212> DNA <213> Artificial Sequence <220> <223> SLC6A20(chr3, start: 45814094) PCR primer sequence(right) <400> 204 gaatctccat gccttttcca 20 <210> 205 <211> 20 <212> DNA <213> Artificial Sequence <220> <223> FN1(chr2, start: 216244028) PCR primer sequence(left) <400> 205 ttcattggtc cggtcttctc 20 <210> 206 <211> 20 <212> DNA <213> Artificial Sequence <220> 
<223> FN1(chr2, start: 216244028) PCR primer sequence(right) <400> 206 ttttcctttt cccccatttc 20 <210> 207 <211> 20 <212> DNA <213> Artificial Sequence <220> <223> ROBO3(chr11, start: 124739454) PCR primer sequence(left) <400> 207 ccagtcctcc gtgatgattt 20 <210> 208 <211> 20 <212> DNA <213> Artificial Sequence <220> <223> ROBO3(chr11, start: 124739454) PCR primer sequence(right) <400> 208 cctatgtccc ctcccttgtt 20
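The primer entries above use the flattened WIPO ST.25 field layout: <210> sequence number, <211> declared length, <212> molecule type, <223> description, <400> residues. As a minimal sketch, assuming that layout holds for each DNA entry, the following Python snippet parses such text and checks that each declared length matches the number of residues. The function name `parse_entries` and the dictionary keys are illustrative and are not part of the patent; protein (PRT) entries are not handled.

```python
import re

def parse_entries(text):
    """Parse flattened ST.25-style DNA entries into dicts (illustrative helper)."""
    entries = []
    # Every entry starts with a <210> tag followed by its sequence number.
    for chunk in text.split("<210>")[1:]:
        seq_id = int(chunk.split()[0])
        declared = int(re.search(r"<211>\s*(\d+)", chunk).group(1))
        desc = re.search(r"<223>\s*(.*?)\s*<400>", chunk, re.S)
        body = re.search(r"<400>\s*\d+\s*(.*)", chunk, re.S).group(1)
        # Residues are lowercase letters; position counters (digits) are skipped.
        residues = "".join(re.findall(r"[a-z]+", body))
        entries.append({
            "seq_id": seq_id,
            "declared_length": declared,
            "description": desc.group(1) if desc else "",
            "residues": residues,
        })
    return entries

# Two entries copied from the listing above (SEQ ID NO: 65 and 68).
SAMPLE = (
    "<210> 65 <211> 20 <212> DNA <213> Artificial Sequence <220> "
    "<223> TPH2(chr12, start: 72335380) PCR primer sequence(left) "
    "<400> 65 gagtgacacg gcaacttcac 20 "
    "<210> 68 <211> 21 <212> DNA <213> Artificial Sequence <220> "
    "<223> HTR1F(chr3, start: 88040035) PCR primer sequence(right) "
    "<400> 68 gccagtggga tgtagaaagc t 21"
)

for entry in parse_entries(SAMPLE):
    ok = entry["declared_length"] == len(entry["residues"])
    print(entry["seq_id"], entry["description"], "length ok" if ok else "length mismatch")
```

A check of this kind is useful mainly for catching extraction damage, since a truncated or merged entry shows up as a length mismatch.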

Claims (22)

  1. A biomarker composition for Tourette syndrome, comprising at least one SNV-related protein selected from the group consisting of COL27A1, BTBD9, SGCE, MECP2, USH2A, CEP290, DRD5, ASIC3-1, ASIC3-2, and TRAPPC9.
  2. The biomarker composition of claim 1,
    wherein the COL27A1 is a polypeptide having the amino acid sequence of SEQ ID NO: 1; the BTBD9, of SEQ ID NO: 3; the SGCE, of SEQ ID NO: 5; the MECP2, of SEQ ID NO: 7; the USH2A, of SEQ ID NO: 9; the CEP290, of SEQ ID NO: 11; the DRD5, of SEQ ID NO: 13; the ASIC3-1, of SEQ ID NO: 15; the ASIC3-2, of SEQ ID NO: 17; or the TRAPPC9, of SEQ ID NO: 19.
  3. A biomarker composition for Tourette syndrome, comprising at least one CNV-related protein selected from the group consisting of MST1L, GBP3, CFHR3, CFHR1, OR2T2, OR2T3, AQP12A, MUC4, USP17L17, USP17L18, TMPRSS11E, UGT2B17, PDZD2, GOLPH3, KLHL3, CTNNA3, FSCB, DUOXA1, DLG4, ACADVL, CDRT1, and BSG.
  4. The biomarker composition of claim 3,
    wherein the MST1L is a polypeptide having the amino acid sequence of SEQ ID NO: 21; the GBP3, of SEQ ID NO: 23; the CFHR3, of SEQ ID NO: 25; the CFHR1, of SEQ ID NO: 27; the OR2T2, of SEQ ID NO: 29; the OR2T3, of SEQ ID NO: 31; the AQP12A, of SEQ ID NO: 33; the MUC4, of SEQ ID NO: 35; the USP17L17, of SEQ ID NO: 37; the USP17L18, of SEQ ID NO: 39; the TMPRSS11E, of SEQ ID NO: 41; the UGT2B17, of SEQ ID NO: 43; the PDZD2, of SEQ ID NO: 45; the GOLPH3, of SEQ ID NO: 47; the KLHL3, of SEQ ID NO: 49; the CTNNA3, of SEQ ID NO: 51; the FSCB, of SEQ ID NO: 53; the DUOXA1, of SEQ ID NO: 55; the DLG4, of SEQ ID NO: 57; the ACADVL, of SEQ ID NO: 59; the CDRT1, of SEQ ID NO: 61; or the BSG, of SEQ ID NO: 63.
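  5. A kit for diagnosing Tourette syndrome, comprising an agent for detecting at least one SNV-related protein selected from the group consisting of TPH2, HTR1F, COL27A1, BTBD9, IL1RN, SGCE, MECP2, ITGA1, DRD3, USH2A, CEP290, DRD5, SLC6A2, ASIC3-1, ASIC3-2, TRAPPC9, and HTR3A, or a gene encoding the same.
  6. The kit of claim 5,
    wherein the detection agent is a primer or probe capable of binding complementarily to the gene encoding the SNV-related protein.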
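Claim 6 relies on complementary binding between a primer or probe and the target gene. As an illustration only, this Python sketch computes the reverse complement of a primer, that is, the strand it would pair with under Watson-Crick base pairing. The helper name `reverse_complement` is hypothetical; the example primer is SEQ ID NO: 65 from the sequence listing.

```python
# Translation table for Watson-Crick complements (both letter cases).
_COMPLEMENT = str.maketrans("acgtACGT", "tgcaTGCA")

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence; spaces are ignored."""
    seq = seq.replace(" ", "")
    if set(seq.lower()) - set("acgt"):
        raise ValueError("sequence contains characters other than a, c, g, t")
    # Complement every base, then reverse the strand direction.
    return seq.translate(_COMPLEMENT)[::-1]

primer = "gagtgacacg gcaacttcac"  # SEQ ID NO: 65 (TPH2, left primer)
target_site = reverse_complement(primer)
print(target_site)
```

Applying the function twice returns the original primer, which is a quick sanity check when handling primer pairs.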
  7. A kit for diagnosing Tourette syndrome, comprising, as an active ingredient, at least one CNV-related protein selected from the group consisting of MST1L, GBP3, CFHR3, CFHR1, OR2T2, OR2T3, AQP12A, MUC4, USP17L17, USP17L18, TMPRSS11E, UGT2B17, PDZD2, GOLPH3, KLHL3, CTNNA3, FSCB, DUOXA1, DLG4, ACADVL, CDRT1, and BSG.
  8. The kit of claim 7,
    wherein the detection agent is a primer or probe capable of binding complementarily to the gene encoding the CNV-related protein.
  9. A DNA microarray chip for diagnosing Tourette syndrome, on which at least one SNV-related gene selected from the group consisting of TPH2, HTR1F, COL27A1, BTBD9, IL1RN, SGCE, MECP2, ITGA1, DRD3, USH2A, CEP290, DRD5, SLC6A2, ASIC3-1, ASIC3-2, TRAPPC9, and HTR3A, or a nucleic acid complementary thereto, is immobilized.
  10. A DNA microarray chip for diagnosing Tourette syndrome, on which at least one CNV-related gene selected from the group consisting of MST1L, GBP3, CFHR3, CFHR1, OR2T2, OR2T3, AQP12A, MUC4, USP17L17, USP17L18, TMPRSS11E, UGT2B17, PDZD2, GOLPH3, KLHL3, CTNNA3, FSCB, DUOXA1, DLG4, ACADVL, CDRT1, and BSG, or a nucleic acid complementary thereto, is immobilized.
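  11. A method for screening causative genes of Tourette syndrome and Tourette syndrome-related CNVs, comprising: 1) obtaining SNV (single nucleotide variation) and CNV (copy number variation) data from a Tourette syndrome patient or from the patient's parents or siblings;
    2) mapping the SNV and CNV data; and
    3) identifying the mutated position of a Tourette syndrome causative gene or identifying a CNV variation.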
  12. The method of claim 11,
    further comprising determining a CNV to be Tourette syndrome-related when its copy number is 2 or more.
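The decision rule of claim 12 can be sketched as a simple filter. The record layout (`gene`, `copy_number`), the function name, and the sample values are assumptions made for illustration; the claim itself specifies only the threshold of 2 or more copies.

```python
from typing import NamedTuple

class CnvCall(NamedTuple):
    """One CNV observation; the field names are hypothetical."""
    gene: str
    copy_number: int

def tourette_related_cnvs(calls, min_copies=2):
    """Keep the calls whose copy number meets the claim-12 threshold."""
    return [call for call in calls if call.copy_number >= min_copies]

# Made-up copy numbers, used only to exercise the filter.
calls = [CnvCall("MUC4", 3), CnvCall("DLG4", 1), CnvCall("BSG", 2)]
flagged = tourette_related_cnvs(calls)
print([call.gene for call in flagged])
```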
  13. The method of claim 11,
    wherein the SNV may occur in any one or more of the following genes:
    ABCA13, ADCY2, ADORA1, ADORA2, AGPAT5, AMBRA1, ANK3, ARHGAP26, ARHGAP30, ARID1A, ARL8A, ASIC3-1, ASIC3-2, ATF6, ATP1A1, BARD1, BCAS3, BCAT1, BDNF, BSN, BTBD9, C8A, CACNA1D, CAMSAP1, CAPRIN2, CARD8, CBFA2T1, CCAR1, CDK12, CELSR3, CEP290, CHD2, CHD5, CHRNA7, CIT, CLCN1, CNTNAP2, COL27A1, CPA4, CREBBP, CSDE1, CSNK1G3, CTCF, CX3CL1, CYP2B6, CYP2C18, DBH, DCLK2, DENND5A, DHX15, DLG5, DLGAP3, DNAH2, DNAJC13, DOCK7, DPP6, DRD1, DRD2, DRD3, DSCAM, DSCAML1, EVPL, FAM120A, FAM71A, FBXO15, FMNL2, FN1, FRY, GAPVD1, GBX2, GCH1, GDNF, GET4, GIGYF1, GNB2L1, GOPC, HDAC5, HDC, HEATR5B, HECTD3, HEPACAM2, HERC1, HERC2, HIST1H1T, HLA-E, HNRNPA0, HTR1F, HTR2C, HTR3A, IL16, IL1RN, ITGA1, IMMP2L, ITOA1, ITPR2, KBTBD8, KDM5B, KIAA0368, KIAA1429, KLHL32, KLHL9, KNDC1, KRTAP10-4, LAX1, LILRA2, LLGL1, LMNA, LRP8, LZTR1, MAB21L2, MARK2, MCM7, ME2, MECP2, MGAM, MPL, MRPL3, MUC5B, MYH10, MYH4, MYO5A, NCBP1, NID1, NIPBL, NLGN4X, NLRP11, NPC1, NUP85, OFCC1, OLFM1, OPA1, OR9I1, PAG1, PDP1, PKD1L1, PREX2, PROM1, PYROXD2, RELN, RFWD3, RNF213, RYR1, RYR2, RYR3, SCN11A, SCNN1B, SEL1L3, Serotonin 1B, SGCE, SH3TC1, SKP2, SLC1A3, SLC38A8, SLC6A1, SLC6A2, SLITRK1, SLO6A2, SNRNP200, SOCE, SPEN, SPRY2, SPTBN1, SRGAP3, SSBP2, ST18, STAB2, TDRD9, TGM1, THBS3, TLN2, TMEM147, TNPO1, TOX, TP53BP2, TPH2, TPX2, TRAPPC9, TTN, TULP4, UBASH3A, UBR4, UNC13C, USH2A, USPL1, WDFY3, WDR72, WNK4, WNT7B, WWC1, YLPM1, ZMIZ1, ZNF385A, ZNF799, or DRD5.
  14. The method of claim 11,
    wherein the CNV may occur in any one or more of the following genes:
    A2BP1, AADAC, ACADVL, ADSL, ALDH18A1, AQP12A, ASTN2, AUTS2, BSG, CACNA1C, CBR2, CDH10, CDH13, CDH18, CDRT1, CFHR1, CFHR3, CNTN4, CNTNAP2, Col8A1, CTNNA3, CTNND2, DISC1, DLG4, DOPEY2, DPP6, DUOXA1, FHIT, FSCB, GABRA4, GABRB1, GABRG1, GALNT13, GBP3, GOLPH3, GPR89A, GRM8, KCNE1, KCNMA1, KLHL3, MACROD2, MST1L, MUC4, NF1, NRXN1, NSD1, OR2R2, OR2T2, OR2T3, OXTR, P2RX2, PAK7, PARK2, PDE9A, PDZD2, POLE, RB1CC1, SEMA5A, TBX1, TMEM195, TMPRSS11E, UGT2B17, USP17L17, USP17L18, or WDR4.
  15. The method of claim 11,
    wherein step 1) is performed by NGS (next generation sequencing) using primers or probes that bind specifically to a nucleotide sequence encoding one or more genes selected from the SNV candidate group and the CNV candidate group.
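  16. A system for diagnosing Tourette syndrome, comprising: 1) a data acquisition unit into which Tourette syndrome SNV data, CNV data, and family information are input;
    2) a data computation unit that computes priority scores from the input SNV data, CNV data, and family information using a preset formula and window;
    3) a mapping unit that maps the SNVs and CNVs selected according to the priority scores computed by the computation unit to the Tourette syndrome SNV and CNV data; and
    4) an identification unit that outputs whether there is a risk of Tourette syndrome using the mapped SNVs and CNVs.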
  17. The system of claim 16,
    wherein the SNV data include at least one gene selected from the group consisting of COL27A1, BTBD9, SGCE, MECP2, USH2A, CEP290, DRD5, ASIC3-1, ASIC3-2, and TRAPPC9.
  18. The system of claim 16,
    wherein the CNV data include at least one gene selected from the group consisting of MST1L, GBP3, CFHR3, CFHR1, OR2T2, OR2T3, AQP12A, MUC4, USP17L17, USP17L18, TMPRSS11E, UGT2B17, PDZD2, GOLPH3, KLHL3, CTNNA3, FSCB, DUOXA1, DLG4, ACADVL, CDRT1, and BSG.
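  19. The system of claim 16,
    wherein the data computation unit of 2) performs the following steps:
    (i) quantifying the SNV and CNV data from the input SNV data, CNV data, and family information using Equation 1 or Equation 2 below;
    [Equation 1]
    Figure pat00028

    [Equation 2]
    Figure pat00029

    (ii) setting the window size n to the number of family members for whom the quantified SNV and CNV data can be analyzed;
    (iii) computing the proportion of the analyzed families using the quantified SNV and CNV data within the set window;
    (iv) computing a significance probability (p-value) at the SNV and CNV positions within the set window using a one-proportion test;
    (v) computing a weight that corrects for the physical positions of both ends of the set window;
    (vi) calculating a score from the computed p-value and weight;
    (vii) checking, at each single-nucleotide position whose score from (vi) is at least -log(0.05) = 2.996 (natural logarithm), whether the patterns of the Tourette syndrome patients and of the unaffected individuals each agree;
    (viii) checking whether each single-nucleotide position satisfying (vii) lies in a coding region;
    (ix) converting the single-nucleotide positions satisfying (viii) into gene symbols; and
    (x) ranking the results by score and then reviewing the list of candidate causative genes.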
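Steps (iv), (vi), and (vii) can be illustrated with a short Python sketch. The equations and the weight formula are given only as figures in this document, so several choices below are assumptions rather than the patented method: a one-sided one-proportion z-test with the normal approximation, a null proportion `p0` of 0.5, a weight passed in by the caller, and a score defined as the negative natural log of the p-value multiplied by the weight. All function names are hypothetical.

```python
import math
import sys

# Cutoff from step (vii): -ln(0.05), about 2.996.
SCORE_THRESHOLD = -math.log(0.05)

def one_proportion_p_value(hits, n, p0=0.5):
    """One-sided p-value for H1: proportion > p0 (normal approximation, assumed)."""
    if n <= 0 or not 0 < p0 < 1:
        raise ValueError("n must be positive and p0 must lie strictly between 0 and 1")
    z = (hits / n - p0) / math.sqrt(p0 * (1 - p0) / n)
    # Upper-tail probability of the standard normal distribution.
    return 0.5 * math.erfc(z / math.sqrt(2))

def priority_score(hits, n, weight=1.0, p0=0.5):
    """Assumed score: -ln(p) scaled by an externally supplied positional weight."""
    p = max(one_proportion_p_value(hits, n, p0), sys.float_info.min)  # avoid log(0)
    return -math.log(p) * weight

def passes_cutoff(hits, n, weight=1.0, p0=0.5):
    """True when the score reaches the -ln(0.05) threshold of step (vii)."""
    return priority_score(hits, n, weight, p0) >= SCORE_THRESHOLD

# Illustrative counts: families showing the pattern out of families in the window.
print(passes_cutoff(9, 10), passes_cutoff(5, 10))
```

With a weight of 1, the cutoff is equivalent to requiring a p-value of at most 0.05, which is why the threshold is written as -log(0.05).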
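  20. The system of claim 19,
    wherein, when the sequence data of both parents of the Tourette syndrome patient are available, the quantification of the SNV and CNV data in (i) uses Equation 1 below, where in Equation 1 SNVjv(0) and SNVjv(1) denote an unaffected individual and a Tourette syndrome patient, respectively; v=1, v=2, and v=3,...,V denote the sequence data of the father, the mother, and the (v-2)-th child, respectively; and LFj, the pattern frequency of the whole family, is calculated using Equation 3 below (C is the number of children):
    [Equation 1]
    Figure pat00030

    [Equation 3]
    Figure pat00031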
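  21. The system of claim 19,
    wherein, when the sequence data of only one parent of the Tourette syndrome patient are available or the sequence data of neither parent are available, the quantification of the SNV and CNV data in (i) uses Equation 2 below, where in Equation 2 MSNVj(S=1) denotes the most frequent pattern among the SNVs carried only by the Tourette syndrome patients among the analyzed family members.
    [Equation 2]
    Figure pat00032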
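Claims 20 and 21 together determine which quantification formula applies, depending on which parental sequence data are available. A minimal sketch of that selection logic follows. The equations themselves appear only as figures in this document, so the function returns a label instead of evaluating them, and the function and parameter names are hypothetical.

```python
def select_quantification_equation(father_available: bool, mother_available: bool) -> str:
    """Pick the equation label according to parental data availability."""
    if father_available and mother_available:
        # Claim 20: sequence data of both parents are usable.
        return "Equation 1"
    # Claim 21: only one parent's data, or neither parent's data, is usable.
    return "Equation 2"

for father, mother in [(True, True), (True, False), (False, False)]:
    print(father, mother, select_quantification_equation(father, mother))
```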
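  22. A method for providing information for the diagnosis of Tourette syndrome, comprising: 1) identifying a variation in an SNV-related gene or a CNV-related gene in a sample isolated from an individual suspected of having Tourette syndrome; and
    2) determining that the individual has Tourette syndrome when the SNV-related gene or the CNV-related gene carries a variation.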
KR1020190070779A 2019-06-14 2019-06-14 Method for identifying causative genes of Tourette syndrome KR102250063B1 (ko)

Priority Applications (1)

Application Number Priority Date Filing Date Title
KR1020190070779A KR102250063B1 (ko) 2019-06-14 2019-06-14 Method for identifying causative genes of Tourette syndrome

Publications (2)

Publication Number Publication Date
KR20200143026A (ko) 2020-12-23
KR102250063B1 (ko) 2021-05-12

Family

ID=74089200

Family Applications (1)

Application Number Title Priority Date Filing Date
KR1020190070779A KR102250063B1 (ko) 2019-06-14 2019-06-14 뚜렛증후군의 원인 유전자를 동정하는 방법

Country Status (1)

Country Link
KR (1) KR102250063B1 (ko)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2012511895A (ja) * 2008-11-14 2012-05-31 The Children's Hospital of Philadelphia Genetic variants underlying human cognition and methods of using them as diagnostic and therapeutic targets
JP2018526398A (ja) * 2015-09-08 2018-09-13 The Children's Hospital of Philadelphia Methods of diagnosing and treating Tourette syndrome

Non-Patent Citations (8)

* Cited by examiner, † Cited by third party
Title
Biological Psychiatry (2012) 71:392-402* *
BMC Medical Genetics (2012) 13:123 *
Cell Reports (2018) 24(13):3441-3454 *
Current Behavioral Neuroscience Reports (2016) 3:218-231* *
Movement Disorder (2004) 19(10):1237-1238 *
Neuron (2017) 94:486-499 *
Psychiatric Genetics (2008) 18(2):98* *
Soon Jae Kwon, Jin Young Lee, and Jae Pil Shin, Bilateral retinal detachment in a 15-year-old child with Tourette syndrome. J Korean Ophthalmol Soc 2012;53(11):1704-1707

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
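CN114457154A (zh) * 2022-04-13 2022-05-10 Shandong Provincial Hospital Affiliated to Shandong First Medical University (Shandong Provincial Hospital) Use of a KIBRA rs17070145 detection reagent in preparing an olfactory function evaluation kit
CN114457154B (zh) * 2022-04-13 2022-06-28 Shandong Provincial Hospital Affiliated to Shandong First Medical University (Shandong Provincial Hospital) Use of a KIBRA rs17070145 detection reagent in preparing an olfactory function evaluation kit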

Also Published As

Publication number Publication date
KR102250063B1 (ko) 2021-05-12

Legal Events

Date Code Title Description
AMND Amendment
E601 Decision to refuse application
AMND Amendment
X701 Decision to grant (after re-examination)
GRNT Written decision to grant