CN112501205B - Construction method and application of CEACAM1 gene humanized non-human animal - Google Patents


Info

Publication number
CN112501205B
Authority
CN
China
Prior art keywords
ceacam1
gene
human
seq
humanized
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110173466.1A
Other languages
Chinese (zh)
Other versions
CN112501205A (en)
Inventor
赵磊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Baccetus Beijing Pharmaceutical Technology Co ltd
Original Assignee
Baccetus Beijing Pharmaceutical Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Baccetus Beijing Pharmaceutical Technology Co ltd
Priority to CN202110173466.1A
Publication of CN112501205A
Application granted
Publication of CN112501205B
Legal status: Active
Anticipated expiration

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00 Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09 Recombinant DNA-technology
    • C12N15/63 Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
    • C12N15/79 Vectors or expression systems specially adapted for eukaryotic hosts
    • C12N15/85 Vectors or expression systems specially adapted for eukaryotic hosts for animal cells
    • C12N15/8509 Vectors or expression systems specially adapted for eukaryotic hosts for animal cells for producing genetically modified animals, e.g. transgenic
    • A HUMAN NECESSITIES
    • A01 AGRICULTURE; FORESTRY; ANIMAL HUSBANDRY; HUNTING; TRAPPING; FISHING
    • A01K ANIMAL HUSBANDRY; AVICULTURE; APICULTURE; PISCICULTURE; FISHING; REARING OR BREEDING ANIMALS, NOT OTHERWISE PROVIDED FOR; NEW BREEDS OF ANIMALS
    • A01K67/00 Rearing or breeding animals, not otherwise provided for; New or modified breeds of animals
    • A01K67/027 New or modified breeds of vertebrates
    • A01K67/0271 Chimeric vertebrates, e.g. comprising exogenous cells
    • A HUMAN NECESSITIES
    • A01 AGRICULTURE; FORESTRY; ANIMAL HUSBANDRY; HUNTING; TRAPPING; FISHING
    • A01K ANIMAL HUSBANDRY; AVICULTURE; APICULTURE; PISCICULTURE; FISHING; REARING OR BREEDING ANIMALS, NOT OTHERWISE PROVIDED FOR; NEW BREEDS OF ANIMALS
    • A01K67/00 Rearing or breeding animals, not otherwise provided for; New or modified breeds of animals
    • A01K67/027 New or modified breeds of vertebrates
    • A01K67/0275 Genetically modified vertebrates, e.g. transgenic
    • A01K67/0278 Knock-in vertebrates, e.g. humanised vertebrates
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61K PREPARATIONS FOR MEDICAL, DENTAL OR TOILETRY PURPOSES
    • A61K49/00 Preparations for testing in vivo
    • A61K49/0004 Screening or testing of compounds for diagnosis of disorders, assessment of conditions, e.g. renal clearance, gastric emptying, testing for diabetes, allergy, rheuma, pancreas functions
    • A61K49/0008 Screening agents using (non-human) animal models or transgenic animal models or chimeric hosts, e.g. Alzheimer disease animal model, transgenic model for heart failure
    • C CHEMISTRY; METALLURGY
    • C07 ORGANIC CHEMISTRY
    • C07K PEPTIDES
    • C07K14/00 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
    • C07K14/435 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from animals; from humans
    • C07K14/705 Receptors; Cell surface antigens; Cell surface determinants
    • C07K14/70503 Immunoglobulin superfamily
    • A HUMAN NECESSITIES
    • A01 AGRICULTURE; FORESTRY; ANIMAL HUSBANDRY; HUNTING; TRAPPING; FISHING
    • A01K ANIMAL HUSBANDRY; AVICULTURE; APICULTURE; PISCICULTURE; FISHING; REARING OR BREEDING ANIMALS, NOT OTHERWISE PROVIDED FOR; NEW BREEDS OF ANIMALS
    • A01K2227/00 Animals characterised by species
    • A01K2227/10 Mammal
    • A01K2227/105 Murine
    • A HUMAN NECESSITIES
    • A01 AGRICULTURE; FORESTRY; ANIMAL HUSBANDRY; HUNTING; TRAPPING; FISHING
    • A01K ANIMAL HUSBANDRY; AVICULTURE; APICULTURE; PISCICULTURE; FISHING; REARING OR BREEDING ANIMALS, NOT OTHERWISE PROVIDED FOR; NEW BREEDS OF ANIMALS
    • A01K2267/00 Animals characterised by purpose
    • A01K2267/03 Animal model, e.g. for test or diseases
    • A01K2267/0331 Animal model for proliferative diseases
    • A HUMAN NECESSITIES
    • A01 AGRICULTURE; FORESTRY; ANIMAL HUSBANDRY; HUNTING; TRAPPING; FISHING
    • A01K ANIMAL HUSBANDRY; AVICULTURE; APICULTURE; PISCICULTURE; FISHING; REARING OR BREEDING ANIMALS, NOT OTHERWISE PROVIDED FOR; NEW BREEDS OF ANIMALS
    • A01K2267/00 Animals characterised by purpose
    • A01K2267/03 Animal model, e.g. for test or diseases
    • A01K2267/035 Animal model for multifactorial diseases
    • A01K2267/0387 Animal model for diseases of the immune system
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N2800/00 Nucleic acids vectors
    • C12N2800/10 Plasmid DNA
    • C12N2800/106 Plasmid DNA for vertebrates
    • C12N2800/107 Plasmid DNA for vertebrates for mammalian

Landscapes

  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Zoology (AREA)
  • Chemical & Material Sciences (AREA)
  • Engineering & Computer Science (AREA)
  • Genetics & Genomics (AREA)
  • Environmental Sciences (AREA)
  • Organic Chemistry (AREA)
  • General Health & Medical Sciences (AREA)
  • Animal Behavior & Ethology (AREA)
  • Biotechnology (AREA)
  • Biomedical Technology (AREA)
  • Veterinary Medicine (AREA)
  • Toxicology (AREA)
  • Biodiversity & Conservation Biology (AREA)
  • Molecular Biology (AREA)
  • Cell Biology (AREA)
  • Gastroenterology & Hepatology (AREA)
  • Immunology (AREA)
  • Animal Husbandry (AREA)
  • Wood Science & Technology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • General Engineering & Computer Science (AREA)
  • Biophysics (AREA)
  • Biochemistry (AREA)
  • Diabetes (AREA)
  • Public Health (AREA)
  • Physics & Mathematics (AREA)
  • Epidemiology (AREA)
  • Urology & Nephrology (AREA)
  • Plant Pathology (AREA)
  • Rheumatology (AREA)
  • Microbiology (AREA)
  • Pathology (AREA)
  • Endocrinology (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Medicinal Chemistry (AREA)
  • Micro-Organisms Or Cultivation Processes Thereof (AREA)
  • Peptides Or Proteins (AREA)

Abstract

The invention provides a method for constructing a CEACAM1 gene humanized non-human animal, in which a nucleotide sequence encoding human CEACAM1 protein is introduced into the genome of the non-human animal by homologous recombination. The resulting animal expresses humanized CEACAM1 protein normally and can serve as an animal model for studying the signaling mechanism of human CEACAM1 and for screening drugs against tumors and immune diseases, giving it important application value in the research and development of new drugs directed at immune targets. The invention also provides a humanized CEACAM1 protein, a humanized CEACAM1 gene, a targeting vector for the CEACAM1 gene, a non-human animal obtained by the construction method, and applications thereof in the field of biomedicine.

Description

Construction method and application of CEACAM1 gene humanized non-human animal
Technical Field
The invention belongs to the field of animal genetic engineering and genetic modification, and particularly relates to a construction method of a CEACAM1 gene humanized non-human animal and application thereof in the field of biomedicine.
Background
Carcinoembryonic antigen-related cell adhesion molecule 1 (hereinafter CEACAM1), also known as CD66a or C-CAM, is a transmembrane glycoprotein belonging to the carcinoembryonic antigen (CEA) family. CEACAM1 is widely expressed on epithelial cells and vascular endothelial cells, and is also expressed on granulocytes, monocytes/macrophages, platelets, B cells, IL-2-activated T cells, and dendritic cells. CEACAM1 has many biological functions: it can inhibit tumor growth and epithelial cell proliferation, induce epithelial cell apoptosis, inhibit T lymphocyte activation, stimulate B lymphocyte proliferation, inhibit the cytotoxic effects of T cells and NK cells, suppress the activity of tumor-infiltrating lymphocytes, delay granulocyte and monocyte apoptosis, promote tumor cell invasion, enhance endothelial cell activity, promote angiogenesis, and regulate blood vessel remodeling.
In particular, CEACAM1 is thought to be an immune checkpoint molecule similar to PD-1 and CTLA-4, playing a key role in regulating T cell activation. Immune checkpoint pathways protect tissues from immune-mediated damage under non-inflammatory physiological conditions. When CEACAM1 on T lymphocytes is activated, primarily through CEACAM1-CEACAM1 homophilic binding in trans, it inhibits TCR-mediated inflammatory signaling by recruiting phosphatases to the ITIM motifs in its own cytoplasmic domain. Inhibition of immune checkpoint pathways in the cancer setting has therefore become a promising anti-cancer therapeutic strategy.
Experimental animal disease models are indispensable research tools for studying the etiology and pathogenesis of human diseases and for developing prevention and treatment technologies and medicines. However, because the physiological structures and metabolic systems of animals differ from those of humans, traditional animal models cannot reflect the real conditions of the human body well, and establishing disease models in animals that are closer to human physiological characteristics is an urgent need of the biomedical industry.
With the continuous development and maturation of genetic engineering technology, the replacement of animal homologous genes with human genes has become feasible, and developing humanized experimental animal models in this way is the future direction of animal models. A gene humanized animal model is one in which a homologous gene in the animal genome is replaced with the normal or mutant human gene, yielding an animal model whose physiological or disease characteristics are closer to those of humans. Gene humanized animals have important application value in their own right; for example, humanized animal models for cell or tissue transplantation can be improved through gene humanization. More importantly, because a human gene segment has been inserted, human protein can be fully or partially expressed in the animal and can serve as the target of drugs that recognize only the human protein sequence, which makes it possible to screen anti-human antibodies and other drugs at the animal level. However, owing to physiological and pathological differences between animals and humans, together with the complexity of genes (for example, human and mouse CEACAM1 proteins are only 58% identical), constructing an "effective" humanized animal model for new drug development remains the greatest challenge.
In view of the complex mechanism of action of CEACAM1 and its great application value in tumor therapy, there is an urgent need in the art for a non-human animal model of the CEACAM1-related signaling pathway, in order to further explore its biological properties, improve the effectiveness of preclinical efficacy testing, raise the success rate of research and development, and minimize development failures. In addition, the non-human animal obtained by the method can be mated with other gene humanized non-human animals to obtain a multi-gene humanized animal model for screening human drugs and evaluating the efficacy of drugs and drug combinations directed at this signaling pathway. The invention has broad application prospects in academic and clinical research.
Disclosure of Invention
In a first aspect of the present invention, there is provided a CEACAM1 gene humanized non-human animal or a construction method thereof, wherein the genome of the non-human animal comprises exons 2 to 6 of human CEACAM1 gene.
Preferably, the genome of said non-human animal comprises part of exon 2, all of exons 3 to 5, and part of exon 6 of the human CEACAM1 nucleotide sequence; further preferably, it also comprises intron 2-3 and/or intron 5-6; more preferably, it comprises any intron between exons 2 and 6. The part of exon 2 of the human CEACAM1 nucleotide sequence comprises at least the nucleotide sequence in exon 2 that encodes the extracellular region of human CEACAM1 protein; preferably, the part of exon 2 comprises at least a nucleotide sequence 313 bp, 314 bp, 315 bp, 316 bp, 317 bp, 318 bp, 319 bp, 320 bp, 321 bp or 322 bp in length, counted from the 3' end of exon 2 in the 3' to 5' direction. The part of exon 6 comprises at least the nucleotide sequence in exon 6 that encodes the extracellular region of human CEACAM1 protein with 1-5 (e.g., 1, 2, 3, 4, 5) amino acids removed from the C-terminus; preferably, the part of exon 6 comprises at least a nucleotide sequence 14 bp, 15 bp, 16 bp, 17 bp, 18 bp, 19 bp, 20 bp, 21 bp, 22 bp or 23 bp in length, counted from the 5' end of exon 6 in the 5' to 3' direction.
Preferably, said construction method comprises insertion or substitution at the non-human animal CEACAM1 locus with a nucleotide sequence comprising all or part of exons 2 to 6 of the human CEACAM1 nucleotide sequence; further preferably, with a nucleotide sequence comprising part of exon 2, all of exons 3 to 5, and part of exon 6 of the human CEACAM1 nucleotide sequence; more preferably, also comprising intron 2-3 and/or intron 5-6; still more preferably, comprising any intron between exons 2 and 6. The part of exon 2 of the human CEACAM1 gene comprises at least the nucleotide sequence in exon 2 that encodes the extracellular region of human CEACAM1 protein; preferably, the part of exon 2 comprises at least a nucleotide sequence 313 bp, 314 bp, 315 bp, 316 bp, 317 bp, 318 bp, 319 bp, 320 bp, 321 bp or 322 bp in length, counted from the 3' end of exon 2 in the 3' to 5' direction. The part of exon 6 comprises at least the nucleotide sequence in exon 6 that encodes the extracellular region of human CEACAM1 protein with 1-5 (e.g., 1, 2, 3, 4, 5) amino acids removed from the C-terminus; preferably, the part of exon 6 comprises at least a nucleotide sequence 14 bp, 15 bp, 16 bp, 17 bp, 18 bp, 19 bp, 20 bp, 21 bp, 22 bp or 23 bp in length, counted from the 5' end of exon 6 in the 5' to 3' direction.
Preferably, the insertion or substitution at the CEACAM1 locus inserts into or replaces a part of the nucleotide sequence of the endogenous CEACAM1 gene of the non-human animal, said part comprising part of exon 2, all of exons 3 to 5, and part of exon 6, preferably also intron 2-3 and/or intron 5-6, and further preferably any intron between exons 2 and 6. The part of exon 2 of the non-human animal CEACAM1 gene comprises at least the nucleotide sequence in exon 2 that encodes the extracellular region of CEACAM1 protein; preferably, the part of exon 2 comprises at least a nucleotide sequence 313 bp, 314 bp, 315 bp, 316 bp, 317 bp, 318 bp, 319 bp, 320 bp, 321 bp or 322 bp in length, counted from the 3' end of exon 2 in the 3' to 5' direction. The part of exon 6 comprises at least the nucleotide sequence in exon 6 that encodes the extracellular region of CEACAM1 protein with 1-5 (e.g., 1, 2, 3, 4, 5) amino acids removed from the C-terminus; preferably, the part of exon 6 comprises at least a nucleotide sequence 14 bp, 15 bp, 16 bp, 17 bp, 18 bp, 19 bp, 20 bp, 21 bp, 22 bp or 23 bp in length, counted from the 5' end of exon 6 in the 5' to 3' direction.
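The partial-exon limits described above can be illustrated with a short sketch. It assumes that exon sequences are held as plain strings written 5' to 3'; the function names and the toy sequences are hypothetical placeholders, not the real exon 2 and exon 6 sequences, which are not reproduced in this text.

```python
def exon2_tail(exon2: str, length: int) -> str:
    """Return the last `length` bases of exon 2, i.e. counted from its 3' end."""
    if not 313 <= length <= 322:
        raise ValueError("length outside the 313-322 bp range named in the text")
    if length > len(exon2):
        raise ValueError("exon 2 is shorter than the requested fragment")
    return exon2[-length:]


def exon6_head(exon6: str, length: int) -> str:
    """Return the first `length` bases of exon 6, i.e. counted from its 5' end."""
    if not 14 <= length <= 23:
        raise ValueError("length outside the 14-23 bp range named in the text")
    if length > len(exon6):
        raise ValueError("exon 6 is shorter than the requested fragment")
    return exon6[:length]


# Toy placeholder sequences (hypothetical, for illustration only).
toy_exon2 = "ACGT" * 90      # 360 bp
toy_exon6 = "GATTACA" * 5    # 35 bp

fragment2 = exon2_tail(toy_exon2, 319)
fragment6 = exon6_head(toy_exon6, 20)
print(len(fragment2), len(fragment6))
```

Slicing from the end of the exon 2 string and from the start of the exon 6 string mirrors the 3' to 5' and 5' to 3' counting directions used in the text.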
Preferably, the non-human animal body expresses human or humanized CEACAM1 protein.
Preferably, the human or humanized CEACAM1 protein comprises an extracellular region of human CEACAM1 protein. Further preferably, the protein further comprises a signal peptide, a transmembrane region and/or a cytoplasmic region.
Preferably, the humanized CEACAM1 protein comprises part of the extracellular region of human CEACAM1 protein; further preferably, this part is the extracellular region of human CEACAM1 protein with 0-5 (e.g., 0, 1, 2, 3, 4, 5) amino acids removed from the C-terminus.
Preferably, the construction method comprises insertion or substitution at the non-human animal CEACAM1 locus with a nucleotide sequence encoding the extracellular region of human CEACAM1 protein. Further preferably, it comprises insertion or substitution at the non-human animal CEACAM1 locus, preferably substitution at the corresponding position, with a nucleotide sequence encoding the extracellular region of human CEACAM1 protein with 0-5 (e.g., 0, 1, 2, 3, 4, 5) amino acids removed from the C-terminus.
More preferably, the construction method comprises insertion or substitution at the non-human animal CEACAM1 locus with a nucleotide sequence encoding positions 35 to 423 of SEQ ID NO: 2.
Preferably, the construction method comprises insertion or substitution at the CEACAM1 locus of the non-human animal with a nucleotide sequence comprising SEQ ID NO: 5.
Preferably, the construction method comprises insertion, inversion, knockout or substitution.
More preferably, the construction method is a substitution, in which the nucleotide sequence of the non-human animal CEACAM1 gene encoding positions 35 to 419 of SEQ ID NO: 1 is replaced.
Most preferably, the substitution replaces positions 25165846 to 25176090 of NCBI accession No. NC_000073.7 in the nucleotide sequence of the non-human animal CEACAM1 gene.
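As a quick check on the size of the replaced endogenous region, the interval above can be represented as a small data structure. This is a minimal sketch that assumes the coordinates are 1-based and inclusive at both ends (the usual convention for NCBI sequence positions); the class name `GenomicInterval` is hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GenomicInterval:
    accession: str
    start: int  # 1-based, inclusive (assumed convention)
    end: int    # 1-based, inclusive (assumed convention)

    def __post_init__(self) -> None:
        if self.start < 1 or self.end < self.start:
            raise ValueError("expected 1 <= start <= end")

    @property
    def length(self) -> int:
        """Number of bases covered when both ends are inclusive."""
        return self.end - self.start + 1


replaced = GenomicInterval("NC_000073.7", 25165846, 25176090)
print(replaced.length)  # 10245
```

Under the inclusive-coordinate assumption, the replaced region spans 10245 bp.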
The non-human animal of the invention is a rodent; preferably, the rodent is a rat or a mouse.
In one embodiment of the invention, the construction method comprises replacing the corresponding region of the non-human animal CEACAM1 gene with a nucleotide sequence encoding amino acids 35 to 423 of SEQ ID NO: 2 or with a nucleotide sequence comprising SEQ ID NO: 5.
Preferably, the non-human animal body expresses the human or humanized CEACAM1 protein with reduced or absent expression of endogenous CEACAM1 protein.
Preferably, the humanized CEACAM1 protein comprises all or part of the extracellular region of human CEACAM1 protein; further preferably, part of the extracellular region; more preferably, the extracellular region of human CEACAM1 protein with 0-5 (e.g., 0, 1, 2, 3, 4, 5) amino acids removed from the C-terminus; and still more preferably, an amino acid sequence having at least 70%, 80%, 85%, 90%, 95% or at least 99% identity to positions 35 to 423 of SEQ ID NO: 2 or to SEQ ID NO: 10, or the amino acid sequence shown at positions 35 to 423 of SEQ ID NO: 2 or in SEQ ID NO: 10.
Preferably, the humanized CEACAM1 protein further comprises a portion of a non-human animal CEACAM1 protein, preferably a signal peptide, extracellular region, transmembrane region and/or cytoplasmic region of the non-human animal CEACAM1 protein.
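The composition of such a chimeric protein can be sketched as simple string splicing. This is an illustration only: it assumes that the human segment is positions 35 to 423 of the human protein, that it replaces positions 35 to 419 of the non-human protein (the position range given for the substitution), and that the flanking non-human regions are kept unchanged. The function name and the toy sequences are hypothetical; the real SEQ ID sequences are not reproduced here.

```python
def build_chimera(non_human: str, human: str,
                  human_range=(35, 423), non_human_range=(35, 419)) -> str:
    """Splice a human segment into a non-human protein using 1-based inclusive positions."""
    h_start, h_end = human_range
    n_start, n_end = non_human_range
    if len(human) < h_end or len(non_human) < n_end:
        raise ValueError("a sequence is shorter than the requested range")
    n_terminal_flank = non_human[:n_start - 1]   # residues before the replaced segment
    human_segment = human[h_start - 1:h_end]     # human positions h_start..h_end
    c_terminal_flank = non_human[n_end:]         # residues after the replaced segment
    return n_terminal_flank + human_segment + c_terminal_flank


# Toy placeholders: 'm' marks non-human residues, 'H' marks human residues.
toy_non_human = "m" * 450   # hypothetical length
toy_human = "H" * 460       # hypothetical length

chimera = build_chimera(toy_non_human, toy_human)
print(len(chimera))  # 34 + 389 + 31 = 454
```

With these toy inputs the result keeps 34 non-human residues at the N-terminus, 389 human residues in the middle, and 31 non-human residues at the C-terminus.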
In one embodiment of the present invention, the humanized CEACAM1 protein comprises one of the following groups:
a) part or all of the amino acid sequence shown in SEQ ID NO: 10 or at positions 35 to 423 of SEQ ID NO: 2;
b) an amino acid sequence having at least about 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or at least 99% identity to SEQ ID NO: 10 or to positions 35 to 423 of SEQ ID NO: 2;
c) an amino acid sequence differing from SEQ ID NO: 10 or from positions 35 to 423 of SEQ ID NO: 2 by no more than 10, 9, 8, 7, 6, 5, 4, 3, 2 or no more than 1 amino acid;
d) an amino acid sequence shown in SEQ ID NO: 10 or at positions 35 to 423 of SEQ ID NO: 2 in which one or more amino acid residues are substituted, deleted and/or inserted.
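The identity and difference thresholds in items b) and c) can be checked with a few lines of code. The sketch below assumes the two sequences have already been aligned to equal length (with '-' marking gaps) and simply counts mismatching columns; the text does not specify an alignment method or identity formula, so this is one common convention rather than the method of the invention. The function names and the toy sequences are hypothetical.

```python
def count_differences(a: str, b: str) -> int:
    """Count aligned columns where the two sequences differ (gaps count as differences)."""
    if len(a) != len(b):
        raise ValueError("sequences must be pre-aligned to equal length")
    return sum(1 for x, y in zip(a, b) if x != y)


def percent_identity(a: str, b: str) -> float:
    """Percentage of aligned columns that are identical."""
    if not a:
        raise ValueError("empty sequences")
    return 100.0 * (len(a) - count_differences(a, b)) / len(a)


# Toy 20-residue sequences differing at the last position (illustration only).
reference = "ACDEFGHIKLMNPQRSTVWY"
candidate = "ACDEFGHIKLMNPQRSTVWF"

print(count_differences(reference, candidate))   # 1
print(percent_identity(reference, candidate))    # 95.0
```

A candidate like this one would satisfy both the "at least 95% identity" and the "no more than 1 amino acid difference" criteria against the toy reference.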
Preferably, the genome of the non-human animal comprises a humanized CEACAM1 gene, and the humanized CEACAM1 gene encodes a humanized CEACAM1 protein.
Preferably, the humanized CEACAM1 gene comprises the nucleotide sequence shown in SEQ ID NO: 5, and further preferably, the mRNA transcribed from the CEACAM1 gene contained in the non-human animal comprises the nucleotide sequence shown in SEQ ID NO: 9.
In one embodiment of the present invention, the humanized CEACAM1 gene comprises one of the following groups:
a) the mRNA sequence of the humanized CEACAM1 gene is part or all of the sequence shown in SEQ ID NO: 9;
b) the mRNA sequence of the humanized CEACAM1 gene has at least about 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or at least 99% identity to SEQ ID NO: 9;
c) the mRNA sequence of the humanized CEACAM1 gene differs from SEQ ID NO: 9 by no more than 10, 9, 8, 7, 6, 5, 4, 3, 2, or no more than 1 nucleotide;
d) the mRNA sequence of the humanized CEACAM1 gene is the sequence shown in SEQ ID NO: 9 in which one or more nucleotides are substituted, deleted and/or inserted.
Preferably, the construction method comprises inserting or replacing a nucleotide sequence comprising the humanized CEACAM1 gene at the non-human animal CEACAM1 locus.
Preferably, the construction method comprises inserting or replacing a nucleotide sequence encoding the humanized CEACAM1 protein into the non-human animal CEACAM1 locus.
Preferably, the insertion or substitution site is after an endogenous regulatory element of the CEACAM1 gene.
Preferably, the insertion is performed by first disrupting the coding frame of the endogenous CEACAM1 gene of the non-human animal and then inserting the human sequence; alternatively, the insertion step itself can both introduce a frameshift mutation into the endogenous CEACAM1 gene and insert the human sequence.
Preferably, the humanized CEACAM1 gene is homozygous or heterozygous in the non-human animal.
Preferably, the genome of the non-human animal comprises a humanized CEACAM1 gene on at least one chromosome.
Preferably, at least one cell in the non-human animal expresses a human or humanized CEACAM1 protein.
Preferably, the CEACAM1 gene humanized non-human animal is constructed using gene editing techniques including gene targeting using embryonic stem cells, CRISPR/Cas9, zinc finger nuclease, transcription activator-like effector nuclease, homing endonuclease or other molecular biology techniques.
Preferably, the CEACAM1 gene humanized non-human animal is constructed using a targeting vector, wherein the targeting vector comprises all or part of the nucleotide sequence of exons 2 to 6 of human CEACAM1; further preferably, it comprises part of exon 2, all of exons 3 to 5, and part of exon 6; more preferably, it also comprises intron 2-3 and/or intron 5-6; still more preferably, it comprises any intron between exons 2 and 6. The part of exon 2 of the human CEACAM1 nucleotide sequence comprises at least the nucleotide sequence in exon 2 that encodes the extracellular region of human CEACAM1 protein; preferably, the part of exon 2 comprises at least a nucleotide sequence 313 bp, 314 bp, 315 bp, 316 bp, 317 bp, 318 bp, 319 bp, 320 bp, 321 bp or 322 bp in length, counted from the 3' end of exon 2 in the 3' to 5' direction. The part of exon 6 comprises at least the nucleotide sequence in exon 6 that encodes the extracellular region of human CEACAM1 protein with 1-5 (e.g., 1, 2, 3, 4, 5) amino acids removed from the C-terminus; preferably, the part of exon 6 comprises at least a nucleotide sequence 14 bp, 15 bp, 16 bp, 17 bp, 18 bp, 19 bp, 20 bp, 21 bp, 22 bp or 23 bp in length, counted from the 5' end of exon 6 in the 5' to 3' direction.
Preferably, the targeting vector comprises a nucleotide sequence encoding amino acids 35 to 423 of SEQ ID NO: 2, or the nucleotide sequence shown in SEQ ID NO: 5.
Preferably, the targeting vector further comprises a DNA fragment homologous to the 5' end of the transition region to be altered, i.e., the 5' arm, which is selected from 100-10000 nucleotides of the genomic DNA of the non-human animal CEACAM1 gene; preferably, the 5' arm has at least 90% homology to NCBI accession No. NC_000073.7; further preferably, the 5' arm sequence is identical to, or as shown in, SEQ ID NO: 3.
Preferably, the targeting vector further comprises a DNA fragment homologous to the 3' end of the transition region to be altered, i.e., the 3' arm, which is selected from 100-10000 nucleotides of the genomic DNA of the non-human animal CEACAM1 gene; preferably, the 3' arm has at least 90% homology to NCBI accession No. NC_000073.7; further preferably, the 3' arm sequence is identical to, or as shown in, SEQ ID NO: 4.
Preferably, said transition region to be altered is located at the CEACAM1 locus of a non-human animal. Further preferably, it is located from exon 2 to exon 6 of the CEACAM1 gene of non-human animal.
In one embodiment of the present invention, the construction method comprises introducing the targeting vector into a cell of a non-human animal, culturing the cell (preferably an embryonic stem cell), transplanting the cultured cell into an oviduct of a female non-human animal, allowing the female non-human animal to develop, and identifying and screening to obtain the non-human animal.
In a second aspect of the present invention, there is provided a CEACAM1 gene humanized non-human animal obtained by the above construction method.
In a third aspect of the invention, there is provided a targeting vector for the CEACAM1 gene, said targeting vector comprising part of the nucleotide sequence of human CEACAM1.
Preferably, said part of the human CEACAM1 nucleotide sequence comprises all or part of the nucleotide sequence of exons 2 to 6 of human CEACAM1; further preferably, it comprises part of exon 2, all of exons 3 to 5, and part of exon 6; more preferably, it also comprises intron 2-3 and/or intron 5-6; still more preferably, it comprises any intron between exons 2 and 6. The part of exon 2 of the human CEACAM1 nucleotide sequence comprises at least the nucleotide sequence in exon 2 that encodes the extracellular region of human CEACAM1 protein; preferably, the part of exon 2 comprises at least a nucleotide sequence 313 bp, 314 bp, 315 bp, 316 bp, 317 bp, 318 bp, 319 bp, 320 bp, 321 bp or 322 bp in length, counted from the 3' end of exon 2 in the 3' to 5' direction. The part of exon 6 comprises at least the nucleotide sequence in exon 6 that encodes the extracellular region of human CEACAM1 protein with 1-5 (e.g., 1, 2, 3, 4, 5) amino acids removed from the C-terminus; preferably, the part of exon 6 comprises at least a nucleotide sequence 14 bp, 15 bp, 16 bp, 17 bp, 18 bp, 19 bp, 20 bp, 21 bp, 22 bp or 23 bp in length, counted from the 5' end of exon 6 in the 5' to 3' direction.
Preferably, the targeting vector comprises a nucleotide sequence encoding amino acids 35 to 423 of SEQ ID NO: 2, or the nucleotide sequence shown in SEQ ID NO: 5.
Preferably, the targeting vector further comprises a DNA fragment homologous to the 5' end of the transition region to be altered, i.e., the 5' arm, which is selected from 100-10000 nucleotides of the genomic DNA of the non-human animal CEACAM1 gene; preferably, the 5' arm has at least 90% homology to NCBI accession No. NC_000073.7; further preferably, the 5' arm sequence is identical to, or as shown in, SEQ ID NO: 3.
Preferably, the targeting vector further comprises a DNA fragment homologous to the 3' end of the transition region to be altered, i.e., the 3' arm, which is selected from 100-10000 nucleotides of the genomic DNA of the non-human animal CEACAM1 gene; preferably, the 3' arm has at least 90% homology to NCBI accession No. NC_000073.7; further preferably, the 3' arm sequence is identical to, or as shown in, SEQ ID NO: 4.
Preferably, said transition region to be altered is located at the CEACAM1 locus, and more preferably, said transition region to be altered is located on exons 2 to 6 of CEACAM1 gene.
The non-human animal of the invention is a rodent; preferably, the rodent is a rat or a mouse.
Preferably, the targeting vector further comprises a marker gene; more preferably, the marker gene encodes a negative selection marker; even more preferably, the negative selection marker gene encodes the diphtheria toxin A subunit (DTA).
In a specific embodiment of the present invention, the targeting vector further comprises a resistance gene for positive clone selection; further preferably, this resistance gene is the neomycin phosphotransferase coding sequence Neo.
In a specific embodiment of the present invention, the targeting vector further comprises a site-specific recombination system; further preferably, the site-specific recombination system uses Frt recombination sites (a conventional LoxP recombination system may also be used), with two Frt recombination sites flanking the resistance gene, one on each side.
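The role of the two Frt sites flanking the resistance gene can be illustrated with a toy model of recombinase-mediated excision. The sketch assumes that the two Frt sites are in the same orientation, so that recombination between them removes the intervening segment and leaves a single Frt site behind. The element order in the example list is a hypothetical layout chosen for illustration and is not taken from the text; the function name is likewise hypothetical.

```python
from typing import List


def excise_between_sites(elements: List[str], site: str = "Frt") -> List[str]:
    """Remove everything between the first two `site` entries, leaving one site behind."""
    positions = [i for i, name in enumerate(elements) if name == site]
    if len(positions) < 2:
        return list(elements)  # nothing to excise without a pair of sites
    first, second = positions[0], positions[1]
    # Keep everything up to and including the first site, then resume after the second site.
    return elements[:first + 1] + elements[second + 1:]


# Hypothetical layout of a targeted allele after homologous recombination.
targeted_allele = ["5' arm", "human CEACAM1 fragment", "Frt", "Neo", "Frt", "3' arm"]

print(excise_between_sites(targeted_allele))
# ["5' arm", 'human CEACAM1 fragment', 'Frt', "3' arm"]
```

In this model the Neo cassette is used for positive selection and then removed, so that only the human fragment and one residual Frt site remain between the homology arms.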
In a fourth aspect of the invention, there is provided a cell comprising the targeting vector described above.
In a fifth aspect of the invention, there is provided the use of a targeting vector as described above, or a cell as described above, in CEACAM1 gene modification, preferably, said use includes but is not limited to inversion, knock-out, insertion or substitution.
In a sixth aspect, the invention relates to a CEACAM1 gene humanized cell, wherein the genome of the cell comprises exons 2 to 6 of the human CEACAM1 gene. Preferably, the genome of the cell comprises a nucleotide sequence encoding amino acids 35 to 423 of SEQ ID NO: 2, or a nucleotide sequence comprising SEQ ID NO: 5, regulated by endogenous CEACAM1 regulatory elements. The CEACAM1 gene humanized cell can express human or humanized CEACAM1 protein, while expression of endogenous CEACAM1 protein is reduced or absent.
The seventh aspect of the invention relates to a CEACAM1 gene-deleted cell, wherein the CEACAM1 gene-deleted cell deletes exons 2 to 6 of endogenous CEACAM1 gene.
In an eighth aspect, the present invention relates to a method for preparing a tumor-bearing animal model, which comprises the step of preparing a tumor-bearing animal model from the above-mentioned CEACAM1 gene-humanized non-human animal.
Preferably, the method for preparing the tumor-bearing animal model further comprises the step of implanting tumor cells into the non-human animal or the offspring thereof, which is humanized by the above gene.
The ninth aspect of the invention provides a tumor-bearing animal model obtained by the preparation method.
In a tenth aspect the invention relates to a cell or cell line or primary cell culture derived from a non-human animal as described above or a tumor-bearing animal model as described above.
In an eleventh aspect, the present invention relates to a tissue or organ or culture thereof derived from the above-mentioned non-human animal or the above-mentioned tumor-bearing animal model.
Preferably, the tissue or organ or culture thereof is spleen, tumor or culture thereof.
In a twelfth aspect of the present invention, there is provided a humanized CEACAM1 protein, wherein the humanized CEACAM1 protein comprises all or part of human CEACAM1 protein, further preferably, the humanized CEACAM1 protein comprises all or part of extracellular region of human CEACAM1 protein, further preferably, comprises part of extracellular region, more preferably, the part of extracellular region comprises extracellular region of human CEACAM1 protein with 0-5 (e.g., 0, 1, 2, 3, 4, 5) amino acids removed from C-terminus.
Preferably, the humanized CEACAM1 protein comprises a sequence identical to SEQ ID NO: 2 from 35 to 423 or SEQ ID NO: 10 or an amino acid sequence having at least 70%, 80%, 85%, 90%, 95% or at least 99% identity to SEQ ID NO: 2 from 35 to 423 or SEQ ID NO: 10, or a pharmaceutically acceptable salt thereof.
Preferably, the humanized CEACAM1 protein further comprises a portion of non-human animal CEACAM1 protein, preferably a signal peptide, extracellular region, transmembrane region, cytoplasmic region of non-human animal CEACAM1 protein.
Preferably, the humanized CEACAM1 protein comprises an amino acid sequence encoded by exon 2 to exon 6 of human CEACAM1 gene, and an amino acid sequence of non-human animal CEACAM1 protein.
In one embodiment of the present invention, the humanized CEACAM1 protein comprises one of the following groups:
a) part or all of the amino acid sequence shown in SEQ ID NO: 10 or at positions 35 to 423 of SEQ ID NO: 2;
b) an amino acid sequence having at least about 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or at least 99% identity to SEQ ID NO: 10 or to positions 35 to 423 of SEQ ID NO: 2;
c) an amino acid sequence differing from SEQ ID NO: 10 or from positions 35 to 423 of SEQ ID NO: 2 by no more than 10, 9, 8, 7, 6, 5, 4, 3, 2 or no more than 1 amino acid;
d) an amino acid sequence as shown in SEQ ID NO: 10 or at positions 35 to 423 of SEQ ID NO: 2 that comprises a substitution, deletion and/or insertion of one or more amino acid residues.
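The identity and difference thresholds in the groups above can be illustrated with a minimal sketch. The specification does not state which alignment or identity algorithm is intended, so the following assumes two sequences that have already been aligned to equal length (with "-" as the gap character) and counts identity over all alignment columns. The function names and the toy query sequence are hypothetical; the reference is only the first ten residues of SEQ ID NO: 1 in one-letter code.

```python
def percent_identity(ref: str, query: str) -> float:
    """Percent identity over two pre-aligned, equal-length sequences ('-' marks a gap)."""
    if len(ref) != len(query):
        raise ValueError("sequences must be aligned to equal length")
    # A column counts as a match only if both residues agree and neither is a gap.
    matches = sum(1 for a, b in zip(ref, query) if a == b and a != "-")
    return 100.0 * matches / len(ref)


def count_differences(ref: str, query: str) -> int:
    """Number of alignment columns that differ (substitution, deletion or insertion)."""
    if len(ref) != len(query):
        raise ValueError("sequences must be aligned to equal length")
    return sum(1 for a, b in zip(ref, query) if a != b)


# Toy data: first 10 residues of SEQ ID NO: 1 and a hypothetical variant (K10R).
reference = "MELASAHLHK"
variant = "MELASAHLHR"

print(percent_identity(reference, variant))   # 90.0
print(count_differences(reference, variant))  # 1
```

Under these assumptions the toy variant would meet an "at least 90% identity" threshold and a "no more than 1 amino acid difference" threshold; a real comparison would first require a pairwise alignment of the full sequences.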
In a thirteenth aspect of the present invention, there is provided a humanized CEACAM1 gene encoding the above-mentioned humanized CEACAM1 protein, said humanized CEACAM1 gene comprising exons 2 to 6 of human CEACAM1 gene, and a nucleotide sequence of non-human animal CEACAM1 gene.
Preferably, the humanized CEACAM1 gene comprises SEQ ID NO: 5.
Preferably, the mRNA sequence transcribed from the humanized CEACAM1 gene comprises the nucleotide sequence shown in SEQ ID NO: 9.
In a specific embodiment of the present invention, said humanized CEACAM1 gene comprises a human CEACAM1 nucleotide sequence portion selected from one of the following groups:
(A) comprises all or part of the nucleotide sequence shown in SEQ ID NO: 5;
(B) comprises a nucleotide sequence that is at least 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or at least 99% identical to the nucleotide sequence shown in SEQ ID NO: 5;
(C) comprises a nucleotide sequence that differs from the nucleotide sequence shown in SEQ ID NO: 5 by no more than 10, 9, 8, 7, 6, 5, 4, 3, 2, or by no more than 1 nucleotide; or
(D) comprises a nucleotide sequence as shown in SEQ ID NO: 5 in which one or more nucleotides are substituted, deleted and/or inserted.
In a specific embodiment of the present invention, the mRNA transcribed from the nucleotide sequence of the humanized CEACAM1 gene is selected from one of the following groups:
(a) comprises part or all of the nucleotide sequence shown in SEQ ID NO: 9;
(b) comprises a nucleotide sequence that is at least 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or at least 99% identical to the nucleotide sequence shown in SEQ ID NO: 9;
(c) comprises a nucleotide sequence that differs from the nucleotide sequence shown in SEQ ID NO: 9 by no more than 10, 9, 8, 7, 6, 5, 4, 3, 2, or by no more than 1 nucleotide; or
(d) comprises a nucleotide sequence as shown in SEQ ID NO: 9 in which one or more nucleotides are substituted, deleted and/or inserted.
In a fourteenth aspect, the present invention relates to a construct expressing the humanized CEACAM1 protein described above.
In a fifteenth aspect, the invention relates to a cell comprising the above construct.
In a sixteenth aspect, the invention relates to a tissue comprising the above-described cells.
Preferably, none of the above cells or cell lines or primary cell cultures, tissues or organs or cultures thereof is capable of developing into an individual animal.
In a seventeenth aspect of the present invention, there is provided a method of constructing a polygene-modified non-human animal, the method comprising:
(a) preparing and obtaining the non-human animal by applying the construction method;
(b) mating the non-human animal obtained in the step (a) with other genetically modified animals except CEACAM1, performing in vitro fertilization or directly performing gene editing, and screening to obtain the polygenic humanized modified non-human animal.
Preferably, the multi-gene humanized modified non-human animal is a two-gene humanized non-human animal, a three-gene humanized non-human animal, a four-gene humanized non-human animal, a five-gene humanized non-human animal, a six-gene humanized non-human animal, a seven-gene humanized non-human animal, an eight-gene humanized non-human animal or a nine-gene humanized non-human animal.
Preferably, the animals modified by other genes except CEACAM1 are selected from one or more than two of the animals modified by genes PD-1, PD-L1, TIGIT or CD 226.
The eighteenth aspect of the present invention relates to the use of the above non-human animal, the above tumor-bearing animal model, the above cell or cell line or primary cell culture, the above tissue or organ or culture thereof, the above humanized CEACAM1 protein or the above humanized CEACAM1 gene in the preparation of a medicament for treating or preventing tumors.
In a nineteenth aspect, the present invention relates to a non-human animal as described above, a tumor-bearing animal model as described above, a cell or cell line or primary cell culture as described above, a tissue or organ as described above or a culture thereof, a humanized CEACAM1 protein as described above or an application of a humanized CEACAM1 gene as described above in studies related to CEACAM1 gene or protein, wherein the application comprises:
A) product development involving the immunological process of human cells, use in the manufacture or screening of human antibodies;
B) as model systems for pharmacological, immunological, microbiological and medical research;
C) production and use of animal experimental disease models involving the immune processes of human cells, for etiological research, for developing diagnostic strategies or for developing therapeutic strategies;
D) in vivo studies of human CEACAM1 signaling pathway modulators, including screening, pharmacodynamic testing, efficacy evaluation, validation or assessment; or
E) studies of CEACAM1 gene function, human CEACAM1 antibodies, drugs directed against the human CEACAM1 target site and their efficacy, drugs for immune-related diseases, and anti-tumor or anti-inflammatory drugs.
Preferably, the use comprises use in the preparation of a pharmaceutical composition or a test kit.
Preferably, the use is not a method of diagnosis or treatment of disease.
"tumors" as referred to herein include, but are not limited to, lymphomas, B cell tumors, T cell tumors, myeloid/monocytic tumors, non-small cell lung cancer, leukemias, ovarian cancer, nasopharyngeal cancer, breast cancer, endometrial cancer, colon cancer, rectal cancer, stomach cancer, bladder cancer, lung cancer, bronchial cancer, bone cancer, prostate cancer, pancreatic cancer, liver and bile duct cancer, esophageal cancer, kidney cancer, thyroid cancer, head and neck cancer, testicular cancer, glioblastoma, astrocytoma, melanoma, myelodysplastic syndrome, and sarcomas. Wherein the leukemia is selected from acute lymphocytic (lymphoblastic) leukemia, acute myelogenous leukemia, chronic lymphocytic leukemia, multiple myeloma, plasma cell leukemia, and chronic myelogenous leukemia; said lymphoma is selected from Hodgkin's lymphoma and non-Hodgkin's lymphoma, including B-cell lymphoma, diffuse large B-cell lymphoma, follicular lymphoma, mantle cell lymphoma, marginal zone B-cell lymphoma, T-cell lymphoma, and Waldenstrom's macroglobulinemia; the sarcoma is selected from osteosarcoma, Ewing's sarcoma, leiomyosarcoma, synovial sarcoma, soft tissue sarcoma, angiosarcoma, liposarcoma, fibrosarcoma, rhabdomyosarcoma, and chondrosarcoma. In one embodiment of the invention, the tumor is selected from the group consisting of a B cell tumor, a T cell tumor, a bone marrow/monocyte tumor. Preferably B-or T-cell Acute Lymphoblastic Leukemia (ALL), Acute Myeloid Leukemia (AML), non-Hodgkin's lymphoma (NHL) and Multiple Myeloma (MM), nasopharyngeal carcinoma, lung carcinoma.
The "immune-related diseases" described in the present invention include, but are not limited to, allergy, asthma, myocarditis, nephritis, hepatitis, systemic lupus erythematosus, rheumatoid arthritis, scleroderma, hyperthyroidism, idiopathic thrombocytopenic purpura, autoimmune hemolytic anemia, ulcerative colitis, autoimmune liver disease, diabetes, pain, or neurological disorder, etc. In one embodiment of the invention, the immune-related disease is rheumatoid arthritis.
The term "inflammation" as used herein includes acute inflammation as well as chronic inflammation. Specifically, it includes, but is not limited to, degenerative inflammation, exudative inflammation (serous inflammation, fibrinous inflammation, suppurative inflammation, hemorrhagic inflammation, necrotizing inflammation, catarrhal inflammation), proliferative inflammation, and specific inflammation (tuberculosis, syphilis, leprosy, lymphogranuloma, etc.).
The CEACAM1 gene humanized non-human animal body can normally express human or humanized CEACAM1 protein. Can be used for drug screening, drug effect evaluation, immunity-related diseases and tumor treatment aiming at the target site of human CEACAM1, can accelerate the development process of new drugs, and can save time and cost. Provides effective guarantee for researching CEACAM1 protein function and screening related disease drugs.
In the present invention, "all or part" is to be understood as follows: "all" is the whole, and "part" is a portion of the whole or an individual component of the whole.
The humanized CEACAM1 protein comprises a part derived from human CEACAM1 protein and a part derived from non-human CEACAM1 protein. "All of the human CEACAM1 protein" means that the amino acid sequence is identical to the full-length amino acid sequence of the human CEACAM1 protein. "Part of the human CEACAM1 protein" means a contiguous or spaced sequence of 5-526 (preferably 10-389) amino acids that is identical to the amino acid sequence of the human CEACAM1 protein or has more than 70% homology with it.
"All of the extracellular region of the human CEACAM1 protein" means that the amino acid sequence is identical to the full-length amino acid sequence of the extracellular region of the human CEACAM1 protein.
"Part of the extracellular region of the human CEACAM1 protein" means a contiguous or spaced sequence of 5-394 (preferably 5-389) amino acids that is identical to the amino acid sequence of the extracellular region of the human CEACAM1 protein or has more than 70% homology with it.
The humanized CEACAM1 gene comprises a part derived from the human CEACAM1 nucleotide sequence and a part derived from the non-human CEACAM1 gene. "All of the human CEACAM1 nucleotide sequence" means that the nucleotide sequence is identical to the full-length human CEACAM1 nucleotide sequence. "Part of the human CEACAM1 nucleotide sequence" means a contiguous or spaced nucleotide sequence of 20-21177 bp (preferably 20-14906 bp or 20-1167 bp) that is identical to the human CEACAM1 nucleotide sequence or has more than 70% homology with it.
The "exon" from xx to xxx or all of the "exons from xx to xxx" in the present invention include nucleotide sequences of exons and introns therebetween, for example, the "exons 2 to 6" include all nucleotide sequences of exon 2, intron 2-3, exon 3, intron 3-4, exon 4, intron 4-5, exon 5, intron 5-6 and exon 6.
The "x-xx intron" described herein represents an intron between the x exon and the xx exon. For example, "intron 2-3" means an intron between exon 2 and exon 3.
"part of an exon" as referred to herein means that the nucleotide sequence is identical to all exon nucleotide sequences in a sequence of several, several tens or several hundreds of nucleotides in succession or at intervals. For example, the portion of exon 2 of the nucleotide sequence of human CEACAM1, comprises contiguous or spaced nucleotide sequences of 5-360bp, preferably 10-322bp, identical to the exon 2 nucleotide sequence of human CEACAM 1. In a specific embodiment of the present invention, the "portion of exon 2" contained in said "humanized CEACAM1 gene" comprises at least the nucleotide sequence of exon 2 encoding the extracellular domain of human CEACAM1 protein.
The "locus" of the present invention refers to the position of a gene on a chromosome in a broad sense and refers to a DNA fragment of a certain gene in a narrow sense, and the gene may be a single gene or a part of a single gene. For example, the "CEACAM 1 locus" refers to a DNA fragment of any one of exons 1 to 7 of CEACAM1 gene. Preferably any one or a combination of two or more of exon 1, exon 2, exon 3, exon 4, exon 5, exon 6, exon 7, exon 8, exon 9, or introns therebetween, or all or part of one or two or more thereof, more preferably exons 2 to 6 of the CEACAM1 gene.
The "nucleotide sequence" of the present invention includes a natural or modified ribonucleotide sequence and a deoxyribonucleotide sequence. Preferably DNA, cDNA, pre-mRNA, rRNA, hnRNA, miRNAs, scRNA, snRNA, siRNA, sgRNA, tRNA.
The term "treating" (or "treatment") as used herein means slowing, interrupting, arresting, controlling, stopping, alleviating, or reversing the progression or severity of one sign, symptom, disorder, condition, or disease, but does not necessarily refer to the complete elimination of all disease-related signs, symptoms, conditions, or disorders. The term "treatment" or the like refers to a therapeutic intervention that ameliorates the signs, symptoms, etc. of a disease or pathological state after the disease has begun to develop.
"homology" as used herein means that, in the context of using a protein sequence or a nucleotide sequence, one skilled in the art can adjust the sequence as needed to obtain a sequence having (including but not limited to) 1%, 2%, 3%, 4%, 5%, 6%, 7%, 8%, 9%, 10%, 11%, 12%, 13%, 14%, 15%, 16%, 17%, 18%, 19%, 20%, 21%, 22%, 23%, 24%, 25%, 26%, 27%, 28%, 29%, 30%, 31%, 32%, 33%, 34%, 35%, 36%, 37%, 38%, 39%, 40%, 41%, 42%, 43%, 44%, 45%, 46%, 47%, 48%, 49%, 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%, 60%, 70%, 80%, 81%, 82%, 83%, 84%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, 99.1%, 99.2%, 99.3%, 99.4%, 99.5%, 99.6%, 99.7%, 99.8%, 99.9% identity.
One skilled in the art can determine and compare sequence elements or degrees of identity to distinguish between additional mouse and human sequences.
In one aspect, the non-human animal is a mammal. In one aspect, the non-human animal is a small mammal, such as a muridae or superfamily murinus. In one embodiment, the genetically modified animal is a rodent. In one embodiment, the rodent is selected from a mouse, a rat, and a hamster. In one embodiment, the rodent is selected from the murine family. In one embodiment, the genetically modified animal is from a family selected from the family of the family. In a particular embodiment, the genetically modified rodent is selected from a true mouse or rat (superfamily murinus), a gerbil, a spiny mouse, and a crowned rat. In one embodiment, the genetically modified mouse is from a member of the murine family. In one embodiment, the animal is a rodent. In a particular embodiment, the rodent is selected from a mouse and a rat. In one embodiment, the non-human animal is a mouse.
In a particular embodiment, the non-human animal is a rodent, a strain of C57BL, C58, a/Br, CBA/Ca, CBA/J, CBA/CBA/mouse selected from BALB/C, a/He, a/J, A/WySN, AKR/A, AKR/J, AKR/N, TA1, TA2, RF, SWR, C3H, C57BR, SJL, C57L, DBA/2, KM, NIH, ICR, CFW, FACA, C57BL/A, C57BL/An, C57BL/GrFa, C57BL/KaLwN, C57BL/6, C57BL/6J, C57BL/6ByJ, C57BL/6NJ, C57BL/10, C57BL/10 sn, C57BL/10Cr and C57 BL/Ola.
The practice of the present invention will employ, unless otherwise indicated, conventional techniques of cell biology, cell culture, molecular biology, transgenic biology, microbiology, recombinant DNA, and immunology. These techniques are explained in detail in the following documents. For example: Molecular Cloning: A Laboratory Manual, 2nd Ed., ed. by Sambrook, Fritsch and Maniatis (Cold Spring Harbor Laboratory Press: 1989); DNA Cloning, Volumes I and II (D. N. Glover ed., 1985); Oligonucleotide Synthesis (M. J. Gait ed., 1984); Mullis et al., U.S. Pat. No. 4,683,195; Nucleic Acid Hybridization (B. D. Hames & S. J. Higgins eds., 1984); Transcription And Translation (B. D. Hames & S. J. Higgins eds., 1984); Culture Of Animal Cells (R. I. Freshney, Alan R. Liss, Inc., 1987); Immobilized Cells And Enzymes (IRL Press, 1986); B. Perbal, A Practical Guide To Molecular Cloning (1984); the series Methods In Enzymology (J. Abelson and M. Simon, eds.-in-chief, Academic Press, Inc., New York), specifically Vols. 154 and 155 (Wu et al. eds.) and Vol. 185, "Gene Expression Technology" (D. Goeddel, ed.); Gene Transfer Vectors For Mammalian Cells (J. H. Miller and M. P. Calos eds., 1987, Cold Spring Harbor Laboratory); Immunochemical Methods In Cell And Molecular Biology (Mayer and Walker, eds., Academic Press, London, 1987); Handbook Of Experimental Immunology, Volumes I-IV (D. M. Weir and C. C. Blackwell, eds., 1986); and Manipulating the Mouse Embryo (Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., 1986).
The foregoing is merely a summary of aspects of the invention and is not, and should not be taken as, limiting the invention in any way.
All patents and publications mentioned in this specification are herein incorporated in their entirety by reference into the specification, to the same extent as if each individual publication was specifically and individually indicated to be incorporated herein by reference. Those skilled in the art will recognize that certain changes may be made to the invention without departing from the spirit or scope of the invention.
The following examples further illustrate the invention in detail and are not to be construed as limiting the scope of the invention or the particular methods described herein.
Drawings
Embodiments of the invention are described in detail below with reference to the attached drawing figures, wherein:
FIG. 1: schematic comparison of mouse CEACAM1 locus to human CEACAM1 locus (not to scale);
FIG. 2: schematic representation (not to scale) of humanization transformation of the CEACAM1 gene in mice;
FIG. 3: CEACAM1 gene targeting strategy and targeting vector design schematic (not to scale);
FIG. 4: PCR assay of CEACAM1 recombinant cells, in which WT was the wild-type control, H2O is water control, PC is positive control, and M is Marker;
FIG. 5: CEACAM1 post-recombination cellular Southern blot results, with WT being wild type control;
FIG. 6: schematic representation (not to scale) of FRT recombination process of CEACAM1 gene humanized mouse;
FIG. 7: CEACAM1 gene humanized mouse F1 mouse tail PCR identification result, wherein WT is wild type, H2O is water control, and PC is positive control;
FIG. 8: flow cytometry results for CEACAM1 protein on spleen B cells of C57BL/6 wild-type mice (WT) and CEACAM1 gene humanized homozygote mice (Homozygote), wherein mCEACAM1 denotes murine CEACAM1 protein and hCEACAM1 denotes humanized CEACAM1 protein.
Detailed Description
The invention will be further described with reference to specific embodiments, and the advantages and features of the invention will become apparent as the description proceeds. These examples are illustrative only and do not limit the scope of the present invention in any way. It will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the spirit and scope of the invention, and that such changes and modifications fall within the scope of the invention.
In each of the following examples, the equipment and materials were obtained from several companies as indicated below:
KpnI, MfeI and AseI enzymes were purchased from NEB, catalog numbers R3142L, R3589S and R0526S, respectively;
C57BL/6 mice and Flp tool mice were purchased from the National Rodent Laboratory Animal Seed Center of the National Institutes for Food and Drug Control (China);
Brilliant Violet 510 anti-mouse CD45 was purchased from Biolegend, cat # 103138;
PerCP/Cy5.5 anti-mouse TCR β chain was purchased from Biolegend, cat # 109228;
FITC anti-mouse CD19 was purchased from Biolegend, cat # 1B 256058;
APC anti-mouse CD66a (CEACAM1a) Antibody was purchased from Biolegend, cat # 134509;
PE anti-human CD66a/c/e Antibody was purchased from Biolegend, cat # 342303.
Example 1 CEACAM1 Gene humanized mouse
A schematic comparison of the mouse CEACAM1 gene (NCBI Gene ID: 26365, Primary source: MGI:1347245, UniProt: Q925P3, from position 25161127 to 25177072 on chromosome 7 NC_000073.7, based on transcript NM_001039185.1 and its encoded protein NP_001034274.1 (SEQ ID NO: 1)) and the human CEACAM1 gene (NCBI Gene ID: 634, Primary source: HGNC:1814, UniProt ID: P13688-1, from position 42507306 to 42528482 on chromosome 19 NC_000019.10, based on transcript NM_001712.5 and its encoded protein NP_001703.2 (SEQ ID NO: 2)) is shown in FIG. 1.
To achieve the object of the present invention, a nucleotide sequence encoding human CEACAM1 protein may be introduced at the endogenous CEACAM1 locus of a mouse, so that the mouse expresses human or humanized CEACAM1 protein. Specifically, using gene editing technology and under the control of the mouse CEACAM1 gene regulatory elements, the sequence running from part of mouse exon 2 to part of mouse exon 6 is replaced with the corresponding sequence running from part of exon 2 to part of exon 6 of the human CEACAM1 gene, thereby humanizing the mouse CEACAM1 gene. A schematic diagram of the resulting humanized CEACAM1 locus is shown in FIG. 2.
The targeting strategy is shown in FIG. 3. The targeting vector contains homology arm sequences upstream and downstream of the mouse CEACAM1 gene, as well as an A fragment comprising the human CEACAM1 sequence. The upstream homology arm sequence (5' homology arm, SEQ ID NO: 3) is identical to the nucleotide sequence from position 25176091 to 25182711 of NCBI accession No. NC_000073.7, and the downstream homology arm sequence (3' homology arm, SEQ ID NO: 4) is identical to the nucleotide sequence from position 25163099 to 25165433 of NCBI accession No. NC_000073.7. The nucleotide sequence of human CEACAM1 on fragment A (SEQ ID NO: 5) is identical to the nucleotide sequence from position 42512457 to 42527362 of NCBI accession No. NC_000019.10. The junction between the downstream end of the human CEACAM1 sequence and the mouse sequence is designed as 5'-acagataatgctctaccacaagaaaatGGCCTCTCAGATGGCGCCAT-3' (SEQ ID NO: 6), wherein the last "t" of the sequence "aaat" is the last nucleotide of the human sequence, and the first "G" of the sequence "GGCC" is the first nucleotide of the mouse sequence.
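As a quick consistency check on the coordinates and the junction described above, the following minimal sketch computes the length of each region from its inclusive start and end positions and splits SEQ ID NO: 6 at the change from lower-case (human) to upper-case (mouse) letters. The function names are hypothetical, and the case convention is read off the junction as written in the text rather than stated as a rule by the specification.

```python
# Inclusive genomic coordinates quoted in the text: (accession, start, end).
REGIONS = {
    "5' homology arm (SEQ ID NO: 3)": ("NC_000073.7", 25176091, 25182711),
    "3' homology arm (SEQ ID NO: 4)": ("NC_000073.7", 25163099, 25165433),
    "human CEACAM1 fragment (SEQ ID NO: 5)": ("NC_000019.10", 42512457, 42527362),
}

# Junction between the human sequence (lower case) and the mouse sequence (upper case).
SEQ_ID_NO_6 = "acagataatgctctaccacaagaaaatGGCCTCTCAGATGGCGCCAT"


def span_length(start: int, end: int) -> int:
    """Length in bp of an inclusive, 1-based coordinate range."""
    return end - start + 1


def split_at_case_change(junction: str) -> tuple:
    """Split a junction written as lower-case part followed by UPPER-case part."""
    for i, base in enumerate(junction):
        if base.isupper():
            return junction[:i], junction[i:]
    return junction, ""


lengths = {name: span_length(start, end) for name, (_, start, end) in REGIONS.items()}
human_part, mouse_part = split_at_case_change(SEQ_ID_NO_6)

for name, length in lengths.items():
    print(f"{name}: {length} bp")
print("human side ends with:", human_part[-4:])     # aaat
print("mouse side starts with:", mouse_part[:4])    # GGCC
```

The computed lengths are 6621 bp for the 5' homology arm, 2335 bp for the 3' homology arm and 14906 bp for the human fragment; the last value agrees with the 14906 bp upper bound given in the definition of "part of the human CEACAM1 nucleotide sequence".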
The targeting vector also comprises a resistance gene used for positive clone screening, namely the neomycin phosphotransferase coding sequence Neo, flanked on both sides by two Frt recombination sites of a site-specific recombination system arranged in the same orientation, forming a Neo cassette. The junction between the 5' end of the Neo cassette and the mouse gene is designed as 5'-gtccaggaagagagagaagggagggactccaagaagcagcaagactatgcGGTACCGAATTCCGAAGTTCCTATTCTCTAGAAAGTATAGGAACTT-3' (SEQ ID NO: 7), wherein the last "c" of the sequence "ctatgc" is the last nucleotide of the mouse sequence, and the first "G" of the sequence "GGTA" is the first nucleotide of the Neo cassette. The junction between the 3' end of the Neo cassette and the mouse gene is designed as 5'-GAAGTTCCTATTCTCTAGAAAGTATAGGAACTTCATCAGTCAGGTACATAATGGTGGATCCCAATTGAGGTAGCATGTCCTGCTGACTGAAGCAGC-3' (SEQ ID NO: 8), wherein the last "G" of the sequence "AATTG" is the last nucleotide of the Neo cassette, and the first "A" of the sequence "AGGTA" is the first nucleotide of the mouse sequence. In addition, a gene encoding a negative selection marker (the diphtheria toxin A subunit (DTA) coding gene) was constructed downstream of the 3' homology arm of the targeting vector. The mRNA sequence of the humanized mouse CEACAM1 after modification is shown in SEQ ID NO: 9, and the expressed protein sequence is shown in SEQ ID NO: 10.
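The two Neo cassette junctions can be inspected with a short sketch that looks for the Frt site and for some common restriction enzyme recognition sequences. The 34-bp sequence used here is the commonly cited minimal FRT site, and the enzyme list (KpnI, EcoRI, BamHI, MfeI) is an illustrative choice; the specification itself does not annotate these sites within SEQ ID NO: 7 or SEQ ID NO: 8, so the annotations below are observations under those assumptions.

```python
# Commonly cited 34-bp minimal FRT site (assumption: this is the Frt site meant in the text).
FRT = "GAAGTTCCTATTCTCTAGAAAGTATAGGAACTTC"

# Recognition sequences of a few common restriction enzymes (illustrative selection).
SITES = {"KpnI": "GGTACC", "EcoRI": "GAATTC", "BamHI": "GGATCC", "MfeI": "CAATTG"}

# Junction sequences as given in the text.
SEQ_ID_NO_7 = ("gtccaggaagagagagaagggagggactccaagaagcagcaagactatgc"
               "GGTACCGAATTCCGAAGTTCCTATTCTCTAGAAAGTATAGGAACTT")
SEQ_ID_NO_8 = ("GAAGTTCCTATTCTCTAGAAAGTATAGGAACTTCATCAGTCAGGTACATAATGG"
               "TGGATCCCAATTGAGGTAGCATGTCCTGCTGACTGAAGCAGC")


def find_sites(seq: str) -> dict:
    """Return the 0-based index of each recognition site in seq, or -1 if absent."""
    upper = seq.upper()
    return {name: upper.find(site) for name, site in SITES.items()}


def cassette_part(seq: str) -> str:
    """Upper-case portion of a junction written as lower-case mouse + UPPER-case cassette."""
    return "".join(base for base in seq if base.isupper())


print("SEQ ID NO: 7 sites:", find_sites(SEQ_ID_NO_7))
print("SEQ ID NO: 8 sites:", find_sites(SEQ_ID_NO_8))
# SEQ ID NO: 7 ends one base short of the full 34-bp site, so compare with FRT[:-1].
print("SEQ ID NO: 7 ends with FRT minus last base:",
      cassette_part(SEQ_ID_NO_7).endswith(FRT[:-1]))
print("SEQ ID NO: 8 starts with full FRT:", SEQ_ID_NO_8.startswith(FRT))
```

Read this way, the 5' junction shows KpnI and EcoRI recognition sequences immediately before the first Frt site, and the 3' junction shows the second Frt site followed by BamHI and MfeI recognition sequences before the mouse sequence resumes.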
Given that human CEACAM1 has multiple subtypes or transcripts, the methods described herein can be applied to other subtypes or transcripts.
The targeting vector can be constructed by conventional methods, such as restriction digestion and ligation. The constructed targeting vector was preliminarily verified by restriction digestion and then sent to a sequencing company for sequence verification. The targeting vector verified as correct by sequencing was transfected by electroporation into embryonic stem cells of C57BL/6 mice, the resulting cells were screened using the positive clone selection marker gene, and integration of the exogenous gene was detected and confirmed by PCR and Southern blot in order to select correct positive clones. Clones identified as positive by PCR (FIG. 4) were further examined by Southern blot (cell DNA was digested with KpnI, MfeI or AseI, respectively, and hybridized with 3 probes; the probes and target fragment lengths are shown in Table 1). The results are shown in FIG. 5: of the 10 clones identified as positive by PCR, the 8 clones other than 1-A11 and 2-F03 were verified by sequencing as positive clones with no random insertion, namely 1-A05, 1-C01, 1-C05, 1-F06, 1-H06, 1-E06, 2-B12 and 2-C11.
Table 1: specific probes and target fragment lengths
(Table 1 is provided as an image in the original publication.)
Wherein the PCR assay comprises the following primers:
F1:5’-GCTCGACTAGAGCTTGCGGA-3’(SEQ ID NO:11),
R1:5’-GGAGTCAATAGAGTGAATGCATGAGTGT-3’(SEQ ID NO:12);
The Southern Blot detection comprises the following probe primers:
5' Probe:
5’Probe-F:5’-TATCACAAGAGGGAATAAACCACAGGGT-3’(SEQ ID NO:13),
5’Probe-R:5’-ATTGCACCATGAGGTTGAACAGCAT-3’(SEQ ID NO:14);
3' Probe:
3’Probe-F:5’-ACTCCTACACACAGAGCACTAACAG-3’(SEQ ID NO:15),
3’Probe-R:5’-CAGGCCAGAGGAAATGTAACAAAGG-3’(SEQ ID NO:16);
Neo Probe:
Neo Probe-F:5’-CATAAGGTGGGATCTCTCAGACAGG-3’(SEQ ID NO:17),
Neo Probe-R:5’-GCTCTGAAGTCCAGTAGGATCATGT-3’(SEQ ID NO:18).
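For readers who want to sanity-check the primers listed above, the following minimal sketch reports the length, GC content and reverse complement of each sequence. The helper names are hypothetical, and the script is only a descriptive summary; it makes no claim about how the primers were designed or validated.

```python
# Primer sequences as listed in the text (SEQ ID NO: 11 to 18).
PRIMERS = {
    "F1": "GCTCGACTAGAGCTTGCGGA",
    "R1": "GGAGTCAATAGAGTGAATGCATGAGTGT",
    "5'Probe-F": "TATCACAAGAGGGAATAAACCACAGGGT",
    "5'Probe-R": "ATTGCACCATGAGGTTGAACAGCAT",
    "3'Probe-F": "ACTCCTACACACAGAGCACTAACAG",
    "3'Probe-R": "CAGGCCAGAGGAAATGTAACAAAGG",
    "Neo Probe-F": "CATAAGGTGGGATCTCTCAGACAGG",
    "Neo Probe-R": "GCTCTGAAGTCCAGTAGGATCATGT",
}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence containing only A, C, G, T."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def gc_percent(seq: str) -> float:
    """Percentage of G and C bases in the sequence."""
    seq = seq.upper()
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


for name, seq in PRIMERS.items():
    print(f"{name}: {len(seq)} nt, GC {gc_percent(seq):.1f}%, "
          f"reverse complement {reverse_complement(seq)}")
```

For example, F1 is 20 nt long with 60.0% GC content, and R1 is 28 nt long.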
The correctly identified positive clone cells (from black mice) were introduced into isolated blastocysts (from white mice) according to techniques known in the art; the resulting chimeric blastocysts were transferred to culture medium for short-term culture and then transplanted into the oviduct of a recipient female mouse (white) to produce F0 generation chimeric mice (black and white). F1 generation mice were obtained by backcrossing F0 generation chimeric mice with wild-type mice, and F1 generation heterozygous mice were intercrossed to obtain F2 generation homozygous mice. Positive mice can also be mated with Flp tool mice to remove the positive clone selection marker gene (the process is shown schematically in FIG. 6), and the offspring are then intercrossed to obtain CEACAM1 gene humanized homozygous mice. The somatic genotype of the progeny mice can be identified by PCR (primers are shown in Table 2). Identification results for exemplary F1 generation mice (from which the Neo marker gene has been removed) are shown in FIG. 7, in which the 12 mice numbered F1-01, F1-02, F1-03, F1-04, F1-05, F1-06, F1-07, F1-08, F1-09, F1-10, F1-11 and F1-12 are all positive heterozygous mice. This shows that the method can be used to construct CEACAM1 gene humanized mice that can be stably passaged and have no random insertion.
Table 2: primer name and specific sequence
(Table 2 is provided as an image in the original publication.)
The expression of the humanized CEACAM1 protein in positive mice can be confirmed by conventional detection methods, such as flow cytometry. Specifically, one 8-week-old female C57BL/6 wild-type mouse and one 8-week-old female CEACAM1 gene humanized homozygote mouse were taken, and spleen tissue was harvested after euthanasia by cervical dislocation. The cells were stained with the anti-mouse CD45 antibody Brilliant Violet 510 anti-mouse CD45, the mouse T cell-specific antibody PerCP/Cy5.5 anti-mouse TCR β chain, the B cell-specific antibody FITC anti-mouse CD19, the anti-mouse CEACAM1 antibody APC anti-mouse CD66a (CEACAM1a) Antibody, and the anti-human CEACAM1 antibody PE anti-human CD66a/c/e Antibody, and then analyzed by flow cytometry. The results are shown in FIG. 8.
As can be seen from FIG. 8, murine CEACAM1 protein was detected on spleen cells of the C57BL/6 wild-type mouse (FIG. 8A), while humanized CEACAM1 protein was not detected (FIG. 8C); humanized CEACAM1 protein was detected on spleen cells of the CEACAM1 gene humanized homozygote mouse (FIG. 8D), while murine CEACAM1 protein was not detected (FIG. 8B).
Example 2 Preparation of double-humanized or multi-humanized mice
A double-humanized or multi-humanized mouse model can be prepared using the CEACAM1 mouse prepared by the present method. For example, in Example 1 above, the embryonic stem cells used for blastocyst microinjection can be taken from mice carrying other gene modifications such as PD-1, PD-L1, TIGIT or CD226; alternatively, embryonic stem cells can be isolated from humanized CEACAM1 mice and subjected to gene recombination targeting, to obtain a two-gene or multi-gene modified mouse model carrying CEACAM1 and other gene modifications. Homozygous or heterozygous CEACAM1 mice obtained by the present method can also be mated with homozygous or heterozygous mice carrying other gene modifications, and the offspring screened; according to Mendelian inheritance, heterozygous mice carrying humanized CEACAM1 together with the other gene modifications are obtained with a certain probability, and these heterozygotes are then intercrossed to obtain two-gene or multi-gene modified homozygotes. Such two-gene or multi-gene modified mice can be used, for example, for in vivo efficacy validation of modulators targeting human CEACAM1 and the other genes.
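The phrase "with a certain probability according to Mendelian inheritance" can be made concrete with a minimal sketch. It assumes simple Mendelian segregation at each locus and independent assortment between loci (that is, the modified genes are unlinked); the allele symbols "H" (humanized) and "h" (wild type) and the function names are hypothetical.

```python
from fractions import Fraction
from itertools import product


def cross(parent1: str, parent2: str) -> dict:
    """Offspring genotype distribution at one locus, e.g. cross('Hh', 'Hh')."""
    dist = {}
    # Each parent passes on one of its two alleles with probability 1/2.
    for a, b in product(parent1, parent2):
        genotype = "".join(sorted(a + b))  # 'H' sorts before 'h', giving HH, Hh or hh
        dist[genotype] = dist.get(genotype, Fraction(0)) + Fraction(1, 4)
    return dist


def all_loci_homozygous(n_loci: int) -> Fraction:
    """Probability that an offspring of two multi-locus heterozygotes is HH at every locus."""
    return cross("Hh", "Hh")["HH"] ** n_loci


single = cross("Hh", "Hh")
print({g: str(p) for g, p in single.items()})  # {'HH': '1/4', 'Hh': '1/2', 'hh': '1/4'}
print(all_loci_homozygous(2))                  # 1/16
print(all_loci_homozygous(3))                  # 1/64
```

Under these assumptions, intercrossing double heterozygotes would be expected to give double homozygotes in about 1 of 16 offspring, which is why the offspring are screened by genotyping at each generation.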
The preferred embodiments of the present invention have been described in detail above; however, the present invention is not limited to the specific details of these embodiments. Within the technical concept of the present invention, various simple modifications may be made to its technical solutions, and all such simple modifications fall within the scope of protection of the present invention.
It should be noted that the technical features described in the above embodiments may be combined in any suitable manner, provided there is no contradiction; to avoid unnecessary repetition, the possible combinations are not described separately.
In addition, the various embodiments of the present invention may also be combined in any manner, and such combinations should likewise be regarded as part of the disclosure of the present invention, provided they do not depart from its spirit.
Sequence listing
<110> Baiosai Diagram (Beijing) pharmaceutical science and technology Co., Ltd
<120> Construction method and application of CEACAM1 gene humanized non-human animal
<130> 1
<160> 25
<170> SIPOSequenceListing 1.0
<210> 1
<211> 521
<212> PRT
<213> Mouse (Mouse)
<400> 1
Met Glu Leu Ala Ser Ala His Leu His Lys Gly Gln Val Pro Trp Gly
1 5 10 15
Gly Leu Leu Leu Thr Ala Ser Leu Leu Ala Ser Trp Ser Pro Ala Thr
20 25 30
Thr Ala Glu Val Thr Ile Glu Ala Val Pro Pro Gln Val Ala Glu Asp
35 40 45
Asn Asn Val Leu Leu Leu Val His Asn Leu Pro Leu Ala Leu Gly Ala
50 55 60
Phe Ala Trp Tyr Lys Gly Asn Thr Thr Ala Ile Asp Lys Glu Ile Ala
65 70 75 80
Arg Phe Val Pro Asn Ser Asn Met Asn Phe Thr Gly Gln Ala Tyr Ser
85 90 95
Gly Arg Glu Ile Ile Tyr Ser Asn Gly Ser Leu Leu Phe Gln Met Ile
100 105 110
Thr Met Lys Asp Met Gly Val Tyr Thr Leu Asp Met Thr Asp Glu Asn
115 120 125
Tyr Arg Arg Thr Gln Ala Thr Val Arg Phe His Val His Pro Ile Leu
130 135 140
Leu Lys Pro Asn Ile Thr Ser Asn Asn Ser Asn Pro Val Glu Gly Asp
145 150 155 160
Asp Ser Val Ser Leu Thr Cys Asp Ser Tyr Thr Asp Pro Asp Asn Ile
165 170 175
Asn Tyr Leu Trp Ser Arg Asn Gly Glu Ser Leu Ser Glu Gly Asp Arg
180 185 190
Leu Lys Leu Ser Glu Gly Asn Arg Thr Leu Thr Leu Leu Asn Val Thr
195 200 205
Arg Asn Asp Thr Gly Pro Tyr Val Cys Glu Thr Arg Asn Pro Val Ser
210 215 220
Val Asn Arg Ser Asp Pro Phe Ser Leu Asn Ile Ile Tyr Gly Pro Asp
225 230 235 240
Thr Pro Ile Ile Ser Pro Ser Asp Ile Tyr Leu His Pro Gly Ser Asn
245 250 255
Leu Asn Leu Ser Cys His Ala Ala Ser Asn Pro Pro Ala Gln Tyr Phe
260 265 270
Trp Leu Ile Asn Glu Lys Pro His Ala Ser Ser Gln Glu Leu Phe Ile
275 280 285
Pro Asn Ile Thr Thr Asn Asn Ser Gly Thr Tyr Thr Cys Phe Val Asn
290 295 300
Asn Ser Val Thr Gly Leu Ser Arg Thr Thr Val Lys Asn Ile Thr Val
305 310 315 320
Leu Glu Pro Val Thr Gln Pro Phe Leu Gln Val Thr Asn Thr Thr Val
325 330 335
Lys Glu Leu Asp Ser Val Thr Leu Thr Cys Leu Ser Asn Asp Ile Gly
340 345 350
Ala Asn Ile Gln Trp Leu Phe Asn Ser Gln Ser Leu Gln Leu Thr Glu
355 360 365
Arg Met Thr Leu Ser Gln Asn Asn Ser Ile Leu Arg Ile Asp Pro Ile
370 375 380
Lys Arg Glu Asp Ala Gly Glu Tyr Gln Cys Glu Ile Ser Asn Pro Val
385 390 395 400
Ser Val Arg Arg Ser Asn Ser Ile Lys Leu Asp Ile Ile Phe Asp Pro
405 410 415
Thr Gln Gly Gly Leu Ser Asp Gly Ala Ile Ala Gly Ile Val Ile Gly
420 425 430
Val Val Ala Gly Val Ala Leu Ile Ala Gly Leu Ala Tyr Phe Leu Tyr
435 440 445
Ser Arg Lys Ser Gly Gly Gly Ser Asp Gln Arg Asp Leu Thr Glu His
450 455 460
Lys Pro Ser Ala Ser Asn His Asn Leu Ala Pro Ser Asp Asn Ser Pro
465 470 475 480
Asn Lys Val Asp Asp Val Ala Tyr Thr Val Leu Asn Phe Asn Ser Gln
485 490 495
Gln Pro Asn Arg Pro Thr Ser Ala Pro Ser Ser Pro Arg Ala Thr Glu
500 505 510
Thr Val Tyr Ser Glu Val Lys Lys Lys
515 520
<210> 2
<211> 526
<212> PRT
<213> human (human)
<400> 2
Met Gly His Leu Ser Ala Pro Leu His Arg Val Arg Val Pro Trp Gln
1 5 10 15
Gly Leu Leu Leu Thr Ala Ser Leu Leu Thr Phe Trp Asn Pro Pro Thr
20 25 30
Thr Ala Gln Leu Thr Thr Glu Ser Met Pro Phe Asn Val Ala Glu Gly
35 40 45
Lys Glu Val Leu Leu Leu Val His Asn Leu Pro Gln Gln Leu Phe Gly
50 55 60
Tyr Ser Trp Tyr Lys Gly Glu Arg Val Asp Gly Asn Arg Gln Ile Val
65 70 75 80
Gly Tyr Ala Ile Gly Thr Gln Gln Ala Thr Pro Gly Pro Ala Asn Ser
85 90 95
Gly Arg Glu Thr Ile Tyr Pro Asn Ala Ser Leu Leu Ile Gln Asn Val
100 105 110
Thr Gln Asn Asp Thr Gly Phe Tyr Thr Leu Gln Val Ile Lys Ser Asp
115 120 125
Leu Val Asn Glu Glu Ala Thr Gly Gln Phe His Val Tyr Pro Glu Leu
130 135 140
Pro Lys Pro Ser Ile Ser Ser Asn Asn Ser Asn Pro Val Glu Asp Lys
145 150 155 160
Asp Ala Val Ala Phe Thr Cys Glu Pro Glu Thr Gln Asp Thr Thr Tyr
165 170 175
Leu Trp Trp Ile Asn Asn Gln Ser Leu Pro Val Ser Pro Arg Leu Gln
180 185 190
Leu Ser Asn Gly Asn Arg Thr Leu Thr Leu Leu Ser Val Thr Arg Asn
195 200 205
Asp Thr Gly Pro Tyr Glu Cys Glu Ile Gln Asn Pro Val Ser Ala Asn
210 215 220
Arg Ser Asp Pro Val Thr Leu Asn Val Thr Tyr Gly Pro Asp Thr Pro
225 230 235 240
Thr Ile Ser Pro Ser Asp Thr Tyr Tyr Arg Pro Gly Ala Asn Leu Ser
245 250 255
Leu Ser Cys Tyr Ala Ala Ser Asn Pro Pro Ala Gln Tyr Ser Trp Leu
260 265 270
Ile Asn Gly Thr Phe Gln Gln Ser Thr Gln Glu Leu Phe Ile Pro Asn
275 280 285
Ile Thr Val Asn Asn Ser Gly Ser Tyr Thr Cys His Ala Asn Asn Ser
290 295 300
Val Thr Gly Cys Asn Arg Thr Thr Val Lys Thr Ile Ile Val Thr Glu
305 310 315 320
Leu Ser Pro Val Val Ala Lys Pro Gln Ile Lys Ala Ser Lys Thr Thr
325 330 335
Val Thr Gly Asp Lys Asp Ser Val Asn Leu Thr Cys Ser Thr Asn Asp
340 345 350
Thr Gly Ile Ser Ile Arg Trp Phe Phe Lys Asn Gln Ser Leu Pro Ser
355 360 365
Ser Glu Arg Met Lys Leu Ser Gln Gly Asn Thr Thr Leu Ser Ile Asn
370 375 380
Pro Val Lys Arg Glu Asp Ala Gly Thr Tyr Trp Cys Glu Val Phe Asn
385 390 395 400
Pro Ile Ser Lys Asn Gln Ser Asp Pro Ile Met Leu Asn Val Asn Tyr
405 410 415
Asn Ala Leu Pro Gln Glu Asn Gly Leu Ser Pro Gly Ala Ile Ala Gly
420 425 430
Ile Val Ile Gly Val Val Ala Leu Val Ala Leu Ile Ala Val Ala Leu
435 440 445
Ala Cys Phe Leu His Phe Gly Lys Thr Gly Arg Ala Ser Asp Gln Arg
450 455 460
Asp Leu Thr Glu His Lys Pro Ser Val Ser Asn His Thr Gln Asp His
465 470 475 480
Ser Asn Asp Pro Pro Asn Lys Met Asn Glu Val Thr Tyr Ser Thr Leu
485 490 495
Asn Phe Glu Ala Gln Gln Pro Thr Gln Pro Thr Ser Ala Ser Pro Ser
500 505 510
Leu Thr Ala Thr Glu Ile Ile Tyr Ser Glu Val Lys Lys Gln
515 520 525
<210> 3
<211> 6621
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 3
ggtgggaagc agtgactaag gaatggggtt gtgtatgacc aaagtatatt ttcacgtggg 60
tgtggagttg tcaaataatt aaagtacttt caaaaaagta agtggctcca atcctcttca 120
ctatttgaag aaaacagaaa aaaaaaaaaa aaacttagaa cacaaaagtg cctccaccga 180
ccctcattcg aaacctgggg gggtgggggg gggaggggag gggaggacaa gagcaagcaa 240
gctggtctac ctcccaaagg cttaatgttc tggaaactat taaacacttt caaatgtgaa 300
aaattaaatt gcttttaagt tttagaagaa tttgcattaa aaaacaatag ttggttttgt 360
gaatgaatac tggccctcat gctttagtca agtttactta agtcttctgg ataactttat 420
gtcaacttga cacaggctat agtcatgtca gagaagggag ctccaattga gaaaatgcat 480
tcttaagata cagctgtagg caaccctgta gggaactttt taaattattg atccatgagg 540
gagggcccag accattgtgg ttgctgctgt ccctgggctg gtggtcctgg gttctatagg 600
aaagcagggt aagcaagcca gtaagcagcc cccctccatg gcctctgcat cagctcctgc 660
ctccaggctc cttccctatt taagatcctg tcctgccttc ctttgatgat gagcagtaag 720
taatagggag gtgtaaacca aataaaccct tcggacccaa agttccctct gatcatgatg 780
tttcatcaca gcgacagtaa ccctcagaca gcaagtttgt gaattaaatc tacaaacgga 840
aatgagaata tgcagagcag gaaaacattt ataaactata ctcacaacaa ataactggta 900
caactcagaa aaagcccaat acctagttat catggaaata caaattaaac agctgtgaga 960
taccagccca ccccagttag aatagctata atcaaagtaa ctgataacaa atgttgggag 1020
aggtgttggt ggctttttaa taagtgtttg gttttctctt tacccgggta gcagctcccg 1080
aataatagac tggagtatta tatttattta aaggtttaag ggcacaactg agcagtatga 1140
acctatttta atcctctaag ctaatctggt ggcctcccag cccaaatccc caagatcctt 1200
gcattttatg gctttctctg ctccagctgt ttctccatca tgtctcatgg actctctccc 1260
cttggctaat ctctcttcct cctctctgtt tctctccctt ccttcctccc tctttccctc 1320
cctcccctgg atgggaggaa gtccagtcct attctataaa ccctgctcag tcattagctg 1380
atcagctttt attgacaaat cagagaataa atggggacca atgcttacac agccctgaag 1440
caggagacac agtatagata gttacctggc aaaggggcac agaaatcagc attatacaag 1500
gttaatattt aaacaatatg caataacatt atgcctacag agagagtgtg ggggggaaga 1560
gaaacacaca ctcattgctg gtgggtgtgc aaattaatag ccattataca aagctgtgtg 1620
gaagtctctt aaaaaatagc aataccatat aacgcagcta tttcactgga catataccca 1680
gagaacctgt accctaccat tgagatatct gcactcccat gattacagct gctccatccc 1740
atatagcaag gaagtagaac caacctagat ctccatcaat atatgaatgc acaatacaag 1800
tctaccacac actcacacat acacacacac acacacacac acacactcac acatacacac 1860
acacacacac acacacacac acactgtgaa gtattgttcg tctctacaga aagatgaatc 1920
acaaagttta cagggatgtg gatagactta gaatatatca tgttaaatga gatcatttga 1980
tctcaggaag aaaaccccat atgtcctttc ttgtttgtgg atcctagctc taatgtctgc 2040
atgtacacat gtaattacat atataacatt aggaaggaga acatgaaggg gtttggtctg 2100
gtttggtttg gtttggtttg gtttggtttt caagacaggg tttctctgtg tagccctggc 2160
tgtcctggaa ctcactctgt agaccaggct ggcctgaaac tcacagagat cctccttcct 2220
ctgcctcccg agtgttggag gtgtgcgcca ccgcctggca atggtgtggt agttctaagg 2280
agacaaagtg ctccctcatg ggctcacaca tatgatgtac ctggtgtggg ccctttggat 2340
gctctctcgc atgtaaccgg ttttgacgta aaacaggccc ttagcaagca gtgtccacgt 2400
gggactccat gcggggctgc tgccttactg ggaggcctgg atctgaagtt ctgtctttca 2460
tcctttgaat attcctgtac agtgtagttc ttgtaagtac caacaattag ggcccatagt 2520
tctttaaaca agaatttcac ctataaaaac tatagagagt gaagacactc ataagtatgc 2580
ttaaataatc ataattatga atttataaaa caaattccaa ggaatgctga ctcactgcta 2640
ctgcatctca aatactaaca caccaggaat tacacacact aagaagcttc ctttttgtgg 2700
tttttaatgt tacaattctt tttttaatat atgtatataa tttttattag gtattttcct 2760
catttacatt tccaatgcta tcccaaaagt cccccatacc atccccccga ctcccacttt 2820
ttggccctgg cgttcccctg tactggagca tataaagttt gcaagtccaa tgggcctctc 2880
tttccagtga tggccgacta ggccatcttt tgatacatat gcagctagag tcaagagctc 2940
cggggtactg gttagttcat aatgttgttc cacctatagg gttgcagatc cctttagctc 3000
cttgggtact ttctctagct cctccattgg gggccctgtg atccatccaa tagctgacta 3060
tgagcatcca cttctgtgtt ttctaggccc cggcacagtc tcacaagaga cagctatatc 3120
agggtccttt cagcaaaatc ttggtagtgt atgcaatggt gtcagtgttt ggaggctgat 3180
tatgggatgg ctccctggat atggcaatct ctagatggtc catccttttg tctcagctcc 3240
aaactttgtc tctgtaatgt tacaattcta aaatatgcta ttgttggtga atgaaagtgc 3300
tttgctttaa aatttagaag tcagaataaa gcaattatag agaaagcaca gttctaggat 3360
attgctgccc aggaacaaga tgtacaacta agttactaga cacatgggaa tgggagatca 3420
acctcattag tcaccgtgga actgcagatc aaatggcagt cagtctcatt tcacagatat 3480
aatagtctct agctaaaaaa aaaaaaaaaa ggaatggatc ctgaaaagat ctatacactt 3540
gggatctttc ctggaaacct tagtgctcaa ctcctccaca gttttgaaag ccttgttacc 3600
atccttaggg caacccagat gtgagctgca ggtctcctcc tgaacaccag ggggcgaaat 3660
tggaccatat aatcaacaga aaaaactcct gagccatgac caccccttgt atattttctt 3720
tatttacatt tcaaatgttt tcccctttcc aggtctcccc ttcagaaccc ccctatcctt 3780
acctccctcc ccctacctct atgaggatgc tcccacatcc acccattctc attctcctgc 3840
cctggcattc ccctacactg gagtattgaa cagcctcagg cctgagggcc tctcctccca 3900
ctgatatcca ataaggccat cttctgccac atgtttggcc agggccatgt gtcactaatg 3960
gatacagaaa atgtggtaca gccgggcggt agtggcgcac acctttaatc ccagcacttg 4020
ggaggcagag acaggcggaa ttctgagttc gaggccagcc tggtctacaa agtgagttcc 4080
aggacagcct agactataca gagaaaccct gtctcgaaaa acaaaaaagg aagaaagaaa 4140
gaaagaaaga aagaaagaaa gaaagaaaga aagaaagaaa gaaagaaaga aagaaagaaa 4200
gaaagaaaat gtggtacatt tacacaacag actactaatc agccattaaa aacaatgact 4260
tcatgaaatt cataggcaaa tggatggaac ttgagaatat catcctgagt gaggtaactc 4320
agtcacaaag gaacaaacaa acaagatatg tactcactga taagtggata ttagtccaaa 4380
agttcagaat acccaagatt caattcacag actatatgaa gcctaagaag aagaaagacc 4440
aaagtgtgga tgcttcactg cttcttagaa gggtgaacaa aatcttcaca ggaggaaata 4500
cggagacaaa gtgtggagca aagactgaag gaaaggccat ccagagactg ccccacctgg 4560
ggatccattc aacatatagc caccaaacct gtacacgatt gtggatgtca ggaagtgcta 4620
gctgatggaa gccggatctg tctgtctcct aagaagcttt accagagcct gacaagtaca 4680
gaggcaaggc gaaagctcac agtcaaccat tggactgagc atggggtccc agatggagga 4740
gttggagaag ggactgaagg agccgagggg ggtttgcagc ccagtggagg gagcaacagt 4800
gccaagaggc cagaccccca cagagctccc gggatctgga tgaccaacga aagaatacgc 4860
aagaacctat cttgcaacct ggattgcagt ttccatccca gttgacacac ctggagccag 4920
ctccctcacc taaccatact ctgtcctcct ggctctttcc tgtatctctg agaactccat 4980
tctgcaagga cagccgatgt ggtcccttcc atcctccatc atccatggga gagatgggaa 5040
atcccagggc attcaccaag gagaacagag ccatcctggg acggttacgg agggcacggg 5100
cagacctgaa tcacatttgg ccagcacagg aagctgggga ggtctccctg gcaccctcca 5160
taggaaggtg gagcacacag tcctctttcc aggacacaca ggtcacctcc tcctgcacac 5220
ccaggatatg aagcccctga gacaacttgt atcctaggat cagacacatg actggtgtca 5280
tcagtgacga tggatcaggt cctacccagt catcactcag ctaggccttt ccttaaccct 5340
ccagataact ctgccacttc ctgcctggag taaaccccac ctctgtgagc attgagagca 5400
gggcacagag ggctcccatg ggggtttgtg tcactctagg ctacaggaaa tgctggaact 5460
cctgctgcag ttgacagccc caaggccagg gcacagggca ctcctcagcc ttgctgctcg 5520
gagtatgttc tagaacactg aactggaaag aggaatagaa ggacgggagg cccacactga 5580
caggagttca gcattgtcag actcacaggc tccaccccca gcccacgtgg atctgggagg 5640
tgccctcccc tgggaggaga caaagctcct ttaagaaaag cagggcagat atcagggcag 5700
cctggcttag cagtagtgtt ggagaagaag ctagcaggca ggcagcagag acatggagct 5760
ggcctcagca catctccaca aagggcaggt tccctgggga ggactactgc tcacaggtaa 5820
ggagatattc cttcccagta gagagcaggg gagctcagag actggctggg ctcttctggg 5880
agaggaaagg aacctgagaa gggacatctg gcttctgctt gaagcttgac ggcaacagga 5940
agctctcagt gagtgtgaat tggctagtgg tgaggagtaa ctcagttctt tgctattgtt 6000
aagtttatcg cactggctag gatatcctga agactgaagt ccacaactct gtcagggtga 6060
ctctccacgt aaaaaccaaa ctggggcaag taaacagaat tattgcatac tacactgaga 6120
aatttgcaaa tacactaggc tctgggacac tgatttgcaa acaggtatat gagcatgttc 6180
caggcagcag ggattaagcc aaaacccacc tagccctgca cacaaagccc tagttctatg 6240
tgtaaaacgg cagacctgcg agggcatcta ggtggggtca ggttggcaag cacttcagaa 6300
aaaatcaaag cagagaaagg ccagaaaatg aaggtgcagg gtctttagag gagggggtca 6360
gagaagatag gcctacactc agcaaagact ctgagtgtga gctggggtct gagggcagtg 6420
agaggtgagc tgtgtgaatg ccggagaaga gcgtttccta cagtggaaag atgaggaatg 6480
gaggtgatct gctggccacc acccaatagg acacaggcac agcaaggctg agaggttttg 6540
caaaggtcct aagattgata ggtctttctc tcttccctct tagcctcact tttagcctcc 6600
tggagccctg ccaccactgc t 6621
<210> 4
<211> 2335
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 4
aggtagcatg tcctgctgac tgaagcagca gcaggaagga cagggatagg ttgagaagaa 60
accctaggat gaagaaaaag cattaaaatc ctcttggtca agaacatgag tagaaacaga 120
agagccacct gcgttaagac agacagggca gggtcaagta aggcaggtgc aggagcacct 180
gagaagctac agcctgggga cacagctcac tcacagtaaa gagccagccc tgggtgtggt 240
catggcacca ggagcaaagg agagcaaact ccccgtccag aaaagaggga agaacaaata 300
ctagcaggca tcatgattag attaagcaca cagacaccag gaggaaaggc agtgccaagt 360
tgatgcgatg gggcatcact gacaccaagg aactctcagg aagaagctgc ccacagagtc 420
caccccaact ccaggagcaa aggccccgtg tgtgggaata gtagtcacag tgtctcatat 480
ttatttaggg gaagtgacca gcgagatctc acagagcaca aaccctcagc ctccaaccac 540
agtaagtaaa gccaatcaca tgatgagaat tggtgttata ccatgtttcc tgttctgcca 600
ggggatcctg ggattgtaga ccatgccttc ctctcagaat tttcatagaa gagagctcta 660
cttcccagtg ctaaggatcc tacaaatcat acccttccta gctacagaat ggcatgagat 720
agccttggag aagggctcat tccacctagg ctagacaggc cataaagaag tcatgtttcc 780
aggagccatc ctagtgtcat cttccctgca tctattacca atgtgggagt tctaccttgg 840
ggaaactttt ttgttttgtt ggatatgggt gccatggcct ctatgttcct ccttgctatt 900
ggggcatgct gctgagaatc aactctgggc atcttgtgaa ctgggtaaac gtggatggct 960
ctatcccact cctcctatga cactaatgca atgggtccag accagactgg ggagggggag 1020
ccaattcatg aactgatgat gggtcccctc ccccccttct tccccctcca cctgctagaa 1080
atcacaaagc cttattcatg tgcactgata tctttcttcc tcccctagat ctggctcctt 1140
ctgacaactc tcctaacaag gtgagcactg ccacttttgc tggctgtttg tgctacaaaa 1200
tgtctctgag gaaacttggg atatctgtat tgttttgatt tctttgtttg ttgagacagg 1260
atctcaccat gcagccttgg caacctttga actatgtaac caaagttggc cttgaactgt 1320
ggccatacac ttgccttagc ctttcatgtg ctgggatgac aagtgtgtgc taccacacca 1380
agctgagaaa agtattcttg aagagacaca actgtgaaat ccagtatggg tctctactcc 1440
tcaacactgc acagaaagac agactggtca atgggtccca tgagtctact acaagagtgt 1500
gtgttggaat tatctctgcc ctgtggttaa tttctggcta tgactcctag aattccatgg 1560
ctcttgtcaa ggaactaact ctgtgatcct caagtttggt cacctagaca atgggcattc 1620
tgcctacctc aaaaagagga caaaagaata cagcaagttg gcttgtgcac agcccggccc 1680
acgcagggca gcacatcata ctctctcctc cagtctagac ctgctctgac ctcaaagcag 1740
ggcttccatt ctaatgggtc tgagatcctc ttccttcatt ttataacaat tttaccaaga 1800
tttcatcatt taagcctttc aacagagtac acacatagac caaagagtgc tttaagtcat 1860
aaggacactt tcacagtggg aactttcaca aacctgtcct cctccccaga cacacattgc 1920
tgtcacattt cctacagagg gcaaaggcag tctgacagtg cacatcctcg cacacatctg 1980
gctttcattc tgctctgtag cccatctgga tgtctgactg tgagtgacag gagccctctc 2040
ccaccgctgc agcaaggtga ggctcaggcc tgtggtgctg agtcactgat caactctcat 2100
cctttcaggt ggatgacgtc gcatacactg tcctgaactt caattcccag caacccaacc 2160
ggccaacttc agccccttct tctccaagag ccacagaaac agtttattca gaagtaaaaa 2220
agaagtgagc ataatctgtc cgtctgtcct gctggctgca ccagtgatgc attcccggat 2280
tctgttcctc actggagggt ctcagcacac acacacacgt acacatgcgc gcgcg 2335
<210> 5
<211> 14906
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 5
cagctcacta ctgaatccat gccattcaat gttgcagagg ggaaggaggt tcttctcctt 60
gtccacaatc tgccccagca actttttggc tacagctggt acaaagggga aagagtggat 120
ggcaaccgtc aaattgtagg atatgcaata ggaactcaac aagctacccc agggcccgca 180
aacagcggtc gagagacaat ataccccaat gcatccctgc tgatccagaa cgtcacccag 240
aatgacacag gattctacac cctacaagtc ataaagtcag atcttgtgaa tgaagaagca 300
actggacagt tccatgtata ccgtgagtat ttccccatga cctctgggtg ttggggggtt 360
agttctactt cccacacatg ggattgtcag gcctgggctg tgcctgtgtc ctcctctgca 420
ttatacccca tgttaaggtt tgagcatcta gtgcaggaca cactatggga cagacatcaa 480
aataccgaat gtcttactct ggagtcctga tcctgcagac atttgcttca gaggaaggac 540
aatctgatgt gggtaactta tcgaggggag accagtctca atctagcccc ccgggaccct 600
cctggtgaac atgtccctaa gaaagaccca gtaggactca gtcagggtct ggcctgaagg 660
gccttctggg atcctcacag acaagctcag ccctgggaat cccctgctcc aaaacactat 720
cccaaggttc tcaactcctg gtgaccctga ggagcctggg ccagggctag attgtggcct 780
cctgggcagg gctgactagg aacaagaatt tgagctatcc gagggctgtg gctcctggag 840
ctggtaacca gccagggttc aggcactaga gcctcatttg ggcaaggatg gagcctcatc 900
cttcacctta gtctcagcct ggagaggaca gatggacaag ctccccaggc catcagccaa 960
ctgccctggg ggcttaggag atgccatagg aagtttacca tccccaggag gaggaacaga 1020
ggagacagga tgctcctgaa agctccttgt ccaccaggga tcaggctgag aggcactctc 1080
agggaatcca acaacaatag atgctgcggc cgggagcggt ggctcacgcc tgtaatccca 1140
gtactttggg aggccgaggt ggatggacta tgaggtcagg agttcaagac cagcctgacc 1200
aacatggtga aatgccgtct ctactaaaaa tacaaaaaaa ttagccgggc atggtggtgc 1260
gtgcctgtaa tcccagctac tcaggaggct gaggcaggag aattgcttga acctgggagg 1320
cagaggttgc agtgagccaa gatcacacca ctgcacccta gcctgggtga cagagcgaga 1380
ctccatctca aaaaaataaa ataaaataag attgtttatt tgggcagctc ccctatgcag 1440
agctgaggtc aaataattat aaatatttta aggttaattc acagaccaca agataagcca 1500
aaaccattaa ccatttatac tatttcactg aggggaaact gagccacatg gcaataaata 1560
cagtcactat atcaaggtca cacagaccgt aactggaaga tcagttacag gagatttgtc 1620
ttcagccaca gtcttaacct ctcctctact ccaggggcct tatttggcat ctgtggaccc 1680
caataccaga gttggttcag ttcctttctt cttaggcatc tgcagcccag aggggagatc 1740
ctggtctggg gagtgagatc aaatggagaa ttaaccaagt tttattctgg aacttattca 1800
acaattagag tcagtgtcag aaatgttcag gcctgggtca ggtgtggtgg ctcacacctg 1860
taatcccagc actttgggat gccaaggtgg gcggatcact tgaggtcagg agttcaagac 1920
cagcctggcc aacatggtga aaccccatct ctactaaaaa tacaaaaaaa attagtcggg 1980
catggtggtg cgtgcctcta atcccagcta cttgggatgc tgaggtggta gaatcgcttg 2040
aacccaggag gcagaggttg cagtgagcca agatcatgtc actgtgctcc agcctgggag 2100
acagagtgag actctatctc aaaaaaaaaa aaaaaaaatg ttcaggccta tcccctcaga 2160
cccacacctc tgctccctgg acttgaaatc ctgtgtttcc tgatgtgttg gtgtcactcc 2220
cacaggagga taaaggaaag gactttgctt tctctcctca ctcacaccct gcaccagcca 2280
gggcccaata tgaaacacac actcagtagt tctctcacga atgaatgaat gattgaatga 2340
tccataacct ctttagagac tggatctggt tgcagaatcc tgggaggttc ttgccatacc 2400
tgctccccat ccctccagag actgagaccc atgttccatc ccctgaccat ctacccctaa 2460
agccacccca agtaccatat ttcatgtgac tctggggctg ctcggctgtg ggaaggtttt 2520
cagggctccc tggtcttggt tctgggacac cagaggctcc tggttgctgg ggtctctaag 2580
gtcacttagc ccccactgct cactgtcatg ggcatctctg actctctctg ctcctccatg 2640
tcctcatttt cctctccttt tatcccagct gaactgcaca gtttcctcca cccttaggcc 2700
tttcccagac actccgtcta acaaggttga ctgtcctgtt cccttcccgc tcacactgtg 2760
gccaggccca cctcccaggc aataggaaag gcacagaaat gagcccagag cagccacccc 2820
tgccaggtcc atcataagcc actgtcccca cgtccctcag tgaggctgac atgtggaaca 2880
ggccagggga cagggacgag cgactcctcc atacactctc tatactgact caccagggga 2940
tcagaggcag aaggacaggt ctgcagtccc caaagcccat gggctcattt cacccatttg 3000
actcctaact ccatccctgt tctgctgtgg gctcacatcc tctagtggtt cctgggacct 3060
tccccaggta gagctagcca ggcaggtgct gtctgatggg tttgctgccc attccaccta 3120
cacctgtgtc ctcatgatga caccattgtc ataaggtggg atctctcaga cagggagatg 3180
cattagccac tggtgtctca cttagactct gcccaatttg gatgaatttg gtcaaactct 3240
agttgtgttt cctgaggttg agagaaaagg aagaaagcat ttcacatgac tgttttgggt 3300
tcctcttttt acctgaattt ccacaggaat ttgtaacata gaactctttc taaatttact 3360
aacataacat acatcacttc tcctcatatg gcatcgtttc tctcttaatg caattgagaa 3420
ctctcagata tcctttctga caagttctgt cttcctaaca ggacccatga gtcccttcta 3480
cccagggagt cactgatttc aaactgactt cagcatcttt ctttgatcat aacatgatcc 3540
tactggactt cagagcttgg tttaagagtt tatcacgctc tctagagaat attcctctta 3600
ttaattggtg ttagatccag agtattaata acaattttca caaatggaat aatttaactt 3660
ctcaaattta tatcccagat ctaccaaaca cacacggtcc atgagggtca ggctttcagc 3720
aagttcatgc tccttccgtt ccactagctg gttattctgc atttgcaaaa aacccacata 3780
tttcagaaaa gatccacagt gtcatcatct gctcttttca gggggaaaag ggacattaaa 3840
gaccaaagac aaggaacgta gatatgctgt taaatccagg cagcccctgc ctgtcaccct 3900
cactcttttc tagatcattc cttggactct gctctatctt tagggggtca ctggctcaag 3960
tcagtcatca tcaaacacct gggaaaaact gccccacctt gtggttctgc tgcctgacga 4020
ctgagctacc ttcaggcttg cccctggtgt cccctgttat ttctgctgaa acatacagtc 4080
ccaggccagg ctgctcagta tcctcagggt ttaaggacaa taggaagtcc catcatcacc 4140
catctctagg atgtcctcag acagggaagc tgcagagaaa acacacctag tggggcaaag 4200
taggactgtg aagctggaag ggacccagca cctgtatgtt ccaggtgagg acccacaggt 4260
gggtcaggca ggcatcagcc agtcagggaa ggaccagaag tgcctggggc tgtgactccc 4320
agtcctcggt ctgtccacga cccaacactg ctgctcagtt cacacttgag aaagtctgtg 4380
cttctctcac acagagcagg cggcctcacg gtctctgagc cctcagatca ttgcacatct 4440
gtcttgtgaa acacacactt gccatgggct tttagggact tgggttggct gagaggtggg 4500
gagatgccaa ctctgattga aaaatgcccg gacggaatcc cagcactttg ggaggccgag 4560
gcgggcagat cacgaggtca ggaggtcgag accatcctgg ctaacagtga ccatcctggc 4620
taaaatacaa aaaactagcc gggcatggtg gcatgcacct gtagtcccag ctactcagga 4680
ggctgaggca ggagaatcgc ttgaacccgg gaggaagagg ttgcagtgag ccaagatcgt 4740
gccactgcac cccagcctca gcaacaaagc gagactctgt ctcaaaaaaa aaaagaaaga 4800
aagaaaaatg cccagccagg cacggtggct caggcctgta atcccagcac tttgagaggc 4860
cgaatcgggc ggattgccag agctcaggag tttgagacca gcctgggcaa cacggtgaaa 4920
ccccgtcagg catggtggca tgtgcctgta atcccagcta ctcaggaagc tgaggcagga 4980
gaattgcttg aatccaggag gtggaggttg cagtgagccg agatcgtgcc attgcactcc 5040
agcctgggtg acagagcgag gctccatctc caaaaaaaat aaataaataa gaaaagaaaa 5100
atgcctgtgg aggaatcaaa ggtgccacac agggcaatct tctctctgtt ttctgcatag 5160
cggagctgcc caagccctcc atctccagca acaactccaa ccctgtggag gacaaggatg 5220
ctgtggcctt cacctgtgaa cctgagactc aggacacaac ctacctgtgg tggataaaca 5280
atcagagcct cccggtcagt cccaggctgc agctgtccaa tggcaacagg accctcactc 5340
tactcagtgt cacaaggaat gacacaggac cctatgagtg tgaaatacag aacccagtga 5400
gtgcgaaccg cagtgaccca gtcaccttga atgtcacctg tgagtatctt ctgttcctct 5460
gtggcccagg ctgccagccc aaatccacgc agccagaggc caggcctctc agtccctctc 5520
aggtctaagg acgcagaccc ttaaccctgg acacccaggc tggccatgac ttcctttccc 5580
caggcaaacc tgggcagccc cagcctgaac caagaatagg aggggagagg ctgctcctgt 5640
cctgggaggc tcagggtcca cagcctgtga tgggagaaac aggtgaatgt ctcagaccca 5700
gactcagtgg acacaatgga ggtttggtta ggacttcagg gttgtgactt agtagagagg 5760
aacactgtgg cccttctcca gaccagcagc ttccccttcc ctctgatgac atcacctgtg 5820
gctttattct ctttgctcca gatggcccgg acacccccac catttcccct tcagacacct 5880
attaccgtcc aggggcaaac ctcagcctct cctgctatgc agcctctaac ccacctgcac 5940
agtactcctg gcttatcaat ggaacattcc agcaaagcac acaagagctc tttatcccta 6000
acatcactgt gaataatagt ggatcctata cctgccacgc caataactca gtcactggct 6060
gcaacaggac cacagtcaag acgatcatag tcactggtaa gtaattcctg gagcatcaac 6120
actaagatct ggggtacaag ctttctggtt ttcaaatagg agcagagaag aaattttctt 6180
ttgcagcctg tatccaacag gcacaaacaa gtccaaattc tcccctgaac cctctcaatt 6240
catctgtgca gactctcttc cctttgtttt tctgatttct cacagctgac cttaggtcca 6300
gcctggaatg tggggagggg gttctctcag ccccagaaag ccccgtgtag caggaggggc 6360
ttcacagagg gggaagcaga aagggtcctc aaggtcaatt tgcttctgtc actaacatgt 6420
ccctttctgt aacttcttgg ccttctttta cctattccat gagatataag gaatatgtga 6480
ggttttaaaa cagactcaca atagttttcc ctaaatgaga gaaggaaatg cccttcatca 6540
gggatgagca gctcagactc tgctccctgc tctactcccg gcttgcccgg tgattggctc 6600
tgccctgacc ccatgtgggg taggacgcag gtgtgtgcag aaggtgtcca ggtggcctgt 6660
catgaatcca gctaaatcaa gatggcagtc aatggctggg cgctgtggtt catgcctgtg 6720
atcccagtac tttggaaggc cgaggtgaga ggatcacctg aggtcaggag ttcgagacca 6780
gcctgaccaa catggcaaaa ctccatctct actaaaaata caaaaaaaaa aatttagcca 6840
ggcatggttg cacatgccac taggcatgcc actagggagg ctgaggcaca agaatcactt 6900
gaacctggga ggcagaggtt gcaatgagcc gagatggcac cactgcactc cagcctggac 6960
aacacagaga gactctgtct caaaaactaa ataaataaat aaataaaggc agtcaacacc 7020
tgagcctccc ctgggtcagg ctgcctccct gagcttgtcc tggctctgaa gtcaccagct 7080
gtatgaggct gtgggcacag cacatgggat agcacagagc acagcgagtg acccacactt 7140
ggagaaatcg ggagattcag ccacaggggc tctgcattgg agagaatggg caatgccaaa 7200
cagcgtgtat ttgtagagaa ggtaagaata tcagcctttt gttaacactg tgcctactct 7260
aggaatctcc ttcaccgtga tattctatcc acagaccagg aagtaaaact cctctttaca 7320
gtgggaaatc cttcggattg gaactccaga tagtaaggtc atgaagactg gatggggcat 7380
catcattccc taaaaaatta tttaatgaaa aaaaacacta ccttcccttt tgtatgtaaa 7440
gtgacagtca caggaaggat gcctgatcac agctccagga aagggtcagt gggaggccag 7500
gcacagtggc tcacgcctgt aatcccagca ctttgggagg ctgaggcggg tggatcatga 7560
ggtcaggaga tcaagaccat cctggctaac atggtgaaac cccgtctcta ctaaaaatac 7620
aaaaaattag ctgggcatgg tggcacatgc ctttagtccc agctactcgg gaggctgagg 7680
caggagaatg gcttgaaccc gggaggcaga gcttgcagtg agctgagatc atgccactgc 7740
attccggcct gggtgacaga gcgagactcc gtctgaaaaa aaaaaaaaaa aagtcagtgg 7800
gaaaaacatt ctacctgatg atgaggttgc tcggtctgtg cgctgagaag aagattccaa 7860
gtggagatat agagatatcc agagggtcac tctgagacga tctggggtca ggagggaggt 7920
gcagccctct ccttacaatt catcacctga acaaagacac tcgaccttct gcagagggtc 7980
agggctatcc cctggttggt gacctttgca cagctcactg tgggacctga gagctggcta 8040
aaatctcagg gaaaggagca tagccctagg ccccaggccc caaccctatt ctcagtaggt 8100
tatctcagat actctgcttg tccacagagc taagtccagt agtagcaaag ccccaaatca 8160
aagccagcaa gaccacagtc acaggagata aggactctgt gaacctgacc tgctccacaa 8220
atgacactgg aatctccatc cgttggttct tcaaaaacca gagtctcccg tcctcggaga 8280
ggatgaagct gtcccagggc aacaccaccc tcagcataaa ccctgtcaag agggaggatg 8340
ctgggacgta ttggtgtgag gtcttcaacc caatcagtaa gaaccaaagc gaccccatca 8400
tgctgaacgt aaactgtaag tgactcctca ccccttccta tatgtccctc taggattact 8460
ctgtcaatgg tgtgcaaaat ggataaaact cacaggaggc agaatatcaa tgaagagacc 8520
attatagcaa acagaattgc aaagtggtta agagctcagc tcaggccggg cacagtggct 8580
cacgcctgta atcccagcag tttgggaggc caaggcgggc ggatcacgag ggcaggagat 8640
cgagaccatc ctggctaata tggtgaaacc ccgtgtctac taaaaataca aaaaaaaatt 8700
agccgggcat ggtggcgggc gcctgtagtc ccagctactc gggaggctga ggcgggagaa 8760
tggcgtgaac ctgggaggcg gagctttcag tgagccgaga tggtgccact gcactccagt 8820
ctaggcaaca gagcaagact ctgtctcaaa aaaaaaaaaa aaaaaagagc tcaggctctg 8880
aatcaaatat acatatactt agttggtttt ttttggttgg ttgggttttt ttgtttgttt 8940
tgttgttttg agacagggtc tcactctgtc acccaggctg gagtgctgtg gtgtgatcaa 9000
agctcactcc cgcctcaatc ccctgggttc aagcaatcct gccacctcag cctccagagt 9060
agctgggact acaggttgca ccaccatgcc tggctaagtt tttaagtttt tttgtagagt 9120
tggggtttca ctgtgttgcc caggctggtc tcaatctcct ggtctcagcc tcggcctccc 9180
aaagtgctgg gattacagga atgagccact ctgcccaccc cgtatttaac tattctaagt 9240
acctctcata cagatggaag catgcaatat ttgtcctttt gtgtctggct tacttcattt 9300
agcacaatgt cttcaagctc catctatgtt gtagaatgta tcagaatttc attccttttg 9360
gagactgaat aatattatgc tgtgtagata gatacatcac attttgctta tccactcatc 9420
catctatgga cagttaggtt gcttccacct tttggctatt atgaataatg ctattacaaa 9480
catgggtata caaatatctg ttcaaatccc tgctttcagt tcttttaaat agatactcag 9540
aagtggaatt gctggatcaa atggtaatcc tgtttaattt tgaagaacca tcataccatt 9600
ttccacagtg gctataccat ttcacattcc caccagcaat gcactagagt tccaatttct 9660
ctacatcttc aaaaacattt gttgctttct ggttttgttt tgttttttat aatggccatc 9720
ctaatggtta taaggtggta tatcattgga gttttgattt gcacttccct aatgattagc 9780
aatatttagc atcttttcat gtgcttattt gccatttatc ttctttggag aaatgtttat 9840
tcaagtcctt tgcccatgtt ttaattaggt tgtttgggga tttttggttg agttgcagta 9900
gttctttata tattttggat attaatccct tatcagatat atgattctca aatattttct 9960
cccattctat aagaagtctt ttcacttttg tgataatgtg ctttgataca caaaagcttt 10020
taattttcat taagtccaat ttctctactt cttctttcgt tgcctatgct tttagtgtca 10080
tagccaagaa atcattgcca aattcaatgt tccaaagttt tcactctatc ttccaagagc 10140
tttatagttt tagctcttac atttaggtct tttatgcatt ttgaattaat ttttatatat 10200
ggtgttacat aaaggttcaa tttcattctt ttgcatggat atccagtttc ttcaatgcca 10260
tttgttgaaa agactatcct ttccccactg aatgatcttg gcacccttgt caaaaaacat 10320
ttggctatgt atgcaaacat ttctttctgg gctctatatt ttattccact ggtttctatt 10380
tctttttgcc agtaccatac tgttttgatt actgtagctt ttggattttg tttgtttgtt 10440
ttattgttgt tgtttgggtt tttttgtttt gttttgtttt tttgcttttc tttgtagaga 10500
tggcgtttca ccatgttgcc aaggctggtc tcaaactcct gagctcaagc aatccacccg 10560
cctccacctc ccaaagtgct aagattacag gtgtgatgat taccatagct ttgtaaaaaa 10620
ttttgaaacc aggaagtgtg agcccttcaa ctttgttctt tttcaagatt gctttggcta 10680
tccatggtcc cttcagagtc tatataaatt ttagaatgaa tttttctatt tctgcaaaaa 10740
atattactgg aattttgata gagattgcac tgaatctgta gatcactttg ggtagtactg 10800
tcatcttaac aatattaagt cttctaatcc atgaaaatgg ggtgtctttt caatttatgt 10860
cttatttaat ttcttttggc aatgttttgt attttcaggg tacaaatctt tcacctcttt 10920
ggttaagttt atttctaagt atttttaaag ctcttataaa tagaattttt ttcttaattt 10980
tcctttgaat tgttattagt atacaaaaat acaactgatt tttgcatgtg gattttgtat 11040
cctgccactt tgctaaattt attattctaa cagttttttt gtggaatctc tagggttttc 11100
tatatataag ttagtgtatt ctgcaaacag gtataatttt acttctttcc aatctagatg 11160
cttttttttt cttgcctaat tgttctgtct aggtcttcca atactacatt gaatagaaat 11220
ggcaaaagca ggcatccttg tcttgttctt gatcttaaag gaaaagtttt caatctttca 11280
ccattgacta tgatggtagc tgggggtttt cacatgtagc atttattatg ttgagaattt 11340
ccttctattc ctagtttcag tgttttttag catgaaagaa tgttgaattt tgtcaaatgc 11400
ttttatcgac tcattttcat tactggttat aggtctattc agattttcta tttattcatg 11460
attctatcat ggcaggtttt gtgtttctag gaatttgttc atttcatcca ggttatccaa 11520
tttgttggca ttcaattact catagtactc ttataatcct tattatttct gcagaattag 11580
tagtaatgtt ttactttcat ttctgacttt agtaatttga atcttctttc tttcttagtc 11640
aatctaatta acagttgtca attatagtga tcttttttga agaacaactt ttttttttcg 11700
gtttgagaca ggttctcact ctgtcaccga ggctgatcat ggctcaccac agcctcaact 11760
tcccgggttc aagcaatcct cctgcttcag cctcctgggt agctgaaact acagacaagc 11820
actaccacct ccggctaatt tttgtaattt tttgtagaga cagggtttca ccatcttgcc 11880
cagctggtct caaattcctg agctcaagtg atacacctgc ctcagcctcc caaattgctg 11940
ggattacagt catacaccac tgtacctggc ctacagttat aaatttcttt cttgcacaag 12000
attcttaact actctgagcc tcggattcct caaccgaaaa ttgcactgtg aatgcctgct 12060
ccatagtatt gcacgggttt ggggtttttg ttttgttttt gagacagggt gtcactctgt 12120
cacccagact ggagtgcagt ggtgcaaaca cagctcactg cagcctcaac ctcctgggct 12180
caagcaattc cctcacctta gcctcctaag tagtacatac taccacatct ggctaattta 12240
tttttatttt tgttttcaga gagacagaat ctcaccatgt tacccaggct ggactcgaac 12300
tcctgggctc aagcaatcct cccatctgtt tcccaaagtg ctgagattac aggtgtgagc 12360
caccacgcct ggccccatag tgttatttta aagatttaat gtaataataa accttcagca 12420
aaacaccaca cacagaggaa atgtttcata aatgttagct gctattacta ctactattat 12480
cattagcctt gaaatcaggt agtcctaggg tcaaatctca gatccacctc tcactagcca 12540
tctgacttta ggtaagcctt ttaccactct aagcttccat tttttcatgt ttaaaatgga 12600
aataatgtct acctgacagc actattttat ggatcaaata agatacatgt aaagcattta 12660
gcagcacagg gcctggcaca caggaagtac tccacaaaag tagctaacat agcattagtc 12720
accagcctga gttgactggt gagggttaag ccccaaatag ttgcaacaga tataaacaag 12780
aaataggcta gacacagtgg ctcacacctg taatcccaac atttgggagg ccgaggctgg 12840
aggatctctt gagcccagga atccaagacc agcctaggca atatagtgga accctatctc 12900
tacaaaaatt attttttttt aattagccag gtgggtgggc gtggtggctc acgcctgtaa 12960
tcccaacact ttgggaggtt gaggcaggcg ggtcacctga ggtcgggagt tcaagaccag 13020
cctgaccagg atggagaaac cccgtctcta ctaaaaatac aaaattagcc aggcgtggtg 13080
gcgcatgcct gtaatcccag caattcagga ggctgaggta ggagaatcgc ttgaacctgg 13140
gaggcagagg ttgcagtgag ccgagatcac gccattgcac tctagcctag gcaacaagag 13200
caaaactcgg tctcaaaaaa aaaaaaaaga aagaaaaaaa ttagccaggt gtggtggcat 13260
gtgcttatag tctcagctac tgaggagact gaggtgggag gatcacttga tcccaagagg 13320
ctacaatgag ccatgattgt gccactgcac tccagcctgg gtgatagagt gagaacctgc 13380
ctcaaaaaaa aaaaaaaaaa aaaagaagaa gaagaaatag atgcaaaagg tattatttat 13440
atattatata tatatatata tatatatatg gagggagaag cattatacaa gaaacccact 13500
gggacatggc tatgatcaaa tatgggaaag ggggaaaaaa ggaggtaaag caaagtctca 13560
agcctggtat gttagtttcc atctactgag atacagtgaa gatgggatta aacatacgag 13620
ataatttatt ggggaaaatg cctgtgaggg aaagtaaggc gagagtgaga ggaacctcag 13680
accatgatgc agatctgatt cctgtggaag agaaagagag gaaggaagtt ttagattgaa 13740
gtgcagtttt tgtttgtttg ttttttgaga cagagtctca ctctgttgcc caggcttgag 13800
tgcagtagtg tgatctcggc tcactgaaac ctctgccccc cgggttcaag cgattctcct 13860
gcctcagcct ctcaagtagc tgggattata ggcacctgcc accgcaccca gctaaatttt 13920
gtatttttag tagagatagg gtttcaccat cttggccagg ctggtcttga actcctgacc 13980
tcgtgatcca cccgcctcag cctcccaaag ttctgggatt acaggcgtca gccaccgcgc 14040
ccggcctgca gtgcagttct aagagcattt ctgcaaggct gacagggagt cctccagcca 14100
ttcacacttc agaataaaac agtcacacaa aactgggcta gctttcatac ccctgctggg 14160
agcctgtggg aagccagttc tctatgcaaa agaggtggtg aattcagaat gcaccaactg 14220
ccacaactga gacactgaga aaaagatgca accacgaaaa aggtggaaag ttctaatcac 14280
atacaaaata gcaatcagcc tttctcatat ttcaaagcct taaaaatggc tgagcgcaga 14340
aaagccaggg tggaattggc agaagagaga tcatcaacct agaaacatgg tgactggggt 14400
tgggcgcagt ggctcacgcc tataatccaa gcactttggg aggccgaggc aggcggatca 14460
tgaggtcagg agttcaagac cagtctgacc aatatggtga aaccccgtct ctactaaaaa 14520
aatacagaaa ttagccaggt gtggtggcac gtgcctgtag tccagcctga ggcaggagaa 14580
tcgcttgaac ctgggaggcg gaggttgcag tgagccaaga tcatgctact gcactccagc 14640
ctgagagaca gagcaagact ctgtctcaaa aaaaaaaaag aaaagaaaaa aagaaacatg 14700
gtgattgaaa aaaaaaaatt gcaaggatat agttagctaa tcaccttcca gcaaccttcc 14760
cacaacgaaa ctgtattcct tgaaggaaca attagaaact acttcattct gagagttgtt 14820
tcccagcccc cattgtaaaa taatttcact ttcatttctt ctcctctttt ctctccatga 14880
cagataatgc tctaccacaa gaaaat 14906
<210> 6
<211> 47
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 6
acagataatg ctctaccaca agaaaatggc ctctcagatg gcgccat 47
<210> 7
<211> 96
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 7
gtccaggaag agagagaagg gagggactcc aagaagcagc aagactatgc ggtaccgaat 60
tccgaagttc ctattctcta gaaagtatag gaactt 96
<210> 8
<211> 96
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 8
gaagttccta ttctctagaa agtataggaa cttcatcagt caggtacata atggtggatc 60
ccaattgagg tagcatgtcc tgctgactga agcagc 96
<210> 9
<211> 3748
<212> DNA/RNA
<213> Artificial Sequence (Artificial Sequence)
<400> 9
aaagctcctt taagaaaagc agggcagata tcagggcagc ctggcttagc agtagtgttg 60
gagaagaagc tagcaggcag gcagcagaga catggagctg gcctcagcac atctccacaa 120
agggcaggtt ccctggggag gactactgct cacagcctca cttttagcct cctggagccc 180
tgccaccact gctcagctca ctactgaatc catgccattc aatgttgcag aggggaagga 240
ggttcttctc cttgtccaca atctgcccca gcaacttttt ggctacagct ggtacaaagg 300
ggaaagagtg gatggcaacc gtcaaattgt aggatatgca ataggaactc aacaagctac 360
cccagggccc gcaaacagcg gtcgagagac aatatacccc aatgcatccc tgctgatcca 420
gaacgtcacc cagaatgaca caggattcta caccctacaa gtcataaagt cagatcttgt 480
gaatgaagaa gcaactggac agttccatgt atacccggag ctgcccaagc cctccatctc 540
cagcaacaac tccaaccctg tggaggacaa ggatgctgtg gccttcacct gtgaacctga 600
gactcaggac acaacctacc tgtggtggat aaacaatcag agcctcccgg tcagtcccag 660
gctgcagctg tccaatggca acaggaccct cactctactc agtgtcacaa ggaatgacac 720
aggaccctat gagtgtgaaa tacagaaccc agtgagtgcg aaccgcagtg acccagtcac 780
cttgaatgtc acctatggcc cggacacccc caccatttcc ccttcagaca cctattaccg 840
tccaggggca aacctcagcc tctcctgcta tgcagcctct aacccacctg cacagtactc 900
ctggcttatc aatggaacat tccagcaaag cacacaagag ctctttatcc ctaacatcac 960
tgtgaataat agtggatcct atacctgcca cgccaataac tcagtcactg gctgcaacag 1020
gaccacagtc aagacgatca tagtcactga gctaagtcca gtagtagcaa agccccaaat 1080
caaagccagc aagaccacag tcacaggaga taaggactct gtgaacctga cctgctccac 1140
aaatgacact ggaatctcca tccgttggtt cttcaaaaac cagagtctcc cgtcctcgga 1200
gaggatgaag ctgtcccagg gcaacaccac cctcagcata aaccctgtca agagggagga 1260
tgctgggacg tattggtgtg aggtcttcaa cccaatcagt aagaaccaaa gcgaccccat 1320
catgctgaac gtaaactata atgctctacc acaagaaaat ggcctctcag atggcgccat 1380
tgctggcatc gtgattggag ttgtggctgg ggtggctcta atagcagggc tggcatattt 1440
cctctattcc aggaagtctg gcgggggaag tgaccagcga gatctcacag agcacaaacc 1500
ctcagcctcc aaccacaatc tggctccttc tgacaactct cctaacaagg tggatgacgt 1560
cgcatacact gtcctgaact tcaattccca gcaacccaac cggccaactt cagccccttc 1620
ttctccaaga gccacagaaa cagtttattc agaagtaaaa aagaagtgag cataatctgt 1680
ccgtctgtcc tgctggctgc accagtgatg cattcccgga ttctgttcct cactggaggg 1740
tctcagcaca cacacacacg tacacatgcg cgcgcgcaca cacacacaca cacacacaca 1800
cacacactta cacacacact catgcattca ctctattgac tccttcagtg tctatagaag 1860
aaaaggtgga tcctggagcc tacagaaaac tcaacccttc taggctttca aatttggctg 1920
agagtgaggt atcaaaattt ctcacccttt cactttcctg acccagattg ttgaaaattg 1980
acctattcag agcaccttca ttcccctccc aactccaagt cctgccctat cagagtctga 2040
cttgaatttc cataaacctt ggaggtcacc taagtgctta cgccaaacaa aacaaaacaa 2100
aacaaaacaa aacaaaacaa aacaaaacaa aacaaaccag aagcaggaaa tggccagtcc 2160
catatcttta aaggctgatt ggaagccacc atacatgaga agatcaaacc tccatgggca 2220
atctacacac ccgacaactg tcatgcttac ccatctggga cattcgagtc tctgaacctt 2280
gtgccctcac gcctgagccc ttctctgagc ctttctccag aaaatccact cacagcaact 2340
agagaggctc tttgtcagca actccaagca aactgctagg caggattcag aagaaaagac 2400
agcatctcta acatccacca ggaaggtgcc cagaaaagca gagctggtga ctttggactg 2460
acagacatct ggagtgtgaa aaagcagcac agagctaacc ttcggagagt gttgaaatta 2520
tttgaaaaga agccatattt ggaggtattg gagttttcct ctttctgaga caatccacta 2580
tttgaaaatt gtagctactg aattgcctct cagtatgcga gctgatcact ttgccttagg 2640
gccactagat ttctgtctcc cttagcccct caagcccttt tgatcatgag ttccaaacca 2700
aaaataaata aatgaacagt gaggcagtcc cttgcagtac cactgtcatg ggtcaggcta 2760
agcctcctgc ttttctgaat tagtcaagaa aagccttggt ttcccttttt ccatctcttt 2820
atcttgtctt tcagatactg gccagagcct ggacactctt cctctgagat ctccagcttc 2880
tctgccttct tgtgtttctt ttaaactcta acaaaaactg ttctcacctt caaaaaataa 2940
aataataaca agctttccac atccccacca aagagggacc cagctaggtt tctggaaacc 3000
cagcaccagc ctccagctgc ccttctgcag tgtttctgcc tctgtttccc tttcgttttg 3060
acttttttcc ttcttttgag acagagttcc agcatggagc ctgtgcaggt ttcaatccca 3120
cagtaacacc ttctgcagca ccccacctgc tcagactgca gccctggcca ccaggcctgg 3180
ctacctggac attctgtctg ccctgcactc tcaggaaacc ttggcctctg ctactgtctg 3240
tttggctcat tcaaagtgtg tccttaaagg aatgcagtca cccatgccag aggcagtgtt 3300
tacagcctgg aatgctctgc acttccagtg gaccagtgct ccaccggaag tgggctgtta 3360
gcagggtcct ctcacctggc cctggccttt ctgtagcctt gaatcctgcc ttccccacca 3420
gggcaccagg gatgagtgca gcagcaggag gagaggcaaa cagtcacctc aggaaccttc 3480
tgagctaagg cacaccctct gtgcctgtca agcaaaggtt gtattggata tcaagtgttt 3540
ggtctcacgc caagccaaca ggctttggag agaattaatt agttctccta ctcagggatt 3600
tctttcagtc ctaacacagc ctgtgtatat tttgcttcac ccacgcaatg ctggattatt 3660
taattttgcc cggcttaaga caaatctgag ttacttgtaa atttgctcta tgttcataat 3720
aaaaatgtat tatatatcac tgatagca 3748
<210> 10
<211> 525
<212> PRT
<213> Artificial Sequence (Artificial Sequence)
<400> 10
Met Glu Leu Ala Ser Ala His Leu His Lys Gly Gln Val Pro Trp Gly
1 5 10 15
Gly Leu Leu Leu Thr Ala Ser Leu Leu Ala Ser Trp Ser Pro Ala Thr
20 25 30
Thr Ala Gln Leu Thr Thr Glu Ser Met Pro Phe Asn Val Ala Glu Gly
35 40 45
Lys Glu Val Leu Leu Leu Val His Asn Leu Pro Gln Gln Leu Phe Gly
50 55 60
Tyr Ser Trp Tyr Lys Gly Glu Arg Val Asp Gly Asn Arg Gln Ile Val
65 70 75 80
Gly Tyr Ala Ile Gly Thr Gln Gln Ala Thr Pro Gly Pro Ala Asn Ser
85 90 95
Gly Arg Glu Thr Ile Tyr Pro Asn Ala Ser Leu Leu Ile Gln Asn Val
100 105 110
Thr Gln Asn Asp Thr Gly Phe Tyr Thr Leu Gln Val Ile Lys Ser Asp
115 120 125
Leu Val Asn Glu Glu Ala Thr Gly Gln Phe His Val Tyr Pro Glu Leu
130 135 140
Pro Lys Pro Ser Ile Ser Ser Asn Asn Ser Asn Pro Val Glu Asp Lys
145 150 155 160
Asp Ala Val Ala Phe Thr Cys Glu Pro Glu Thr Gln Asp Thr Thr Tyr
165 170 175
Leu Trp Trp Ile Asn Asn Gln Ser Leu Pro Val Ser Pro Arg Leu Gln
180 185 190
Leu Ser Asn Gly Asn Arg Thr Leu Thr Leu Leu Ser Val Thr Arg Asn
195 200 205
Asp Thr Gly Pro Tyr Glu Cys Glu Ile Gln Asn Pro Val Ser Ala Asn
210 215 220
Arg Ser Asp Pro Val Thr Leu Asn Val Thr Tyr Gly Pro Asp Thr Pro
225 230 235 240
Thr Ile Ser Pro Ser Asp Thr Tyr Tyr Arg Pro Gly Ala Asn Leu Ser
245 250 255
Leu Ser Cys Tyr Ala Ala Ser Asn Pro Pro Ala Gln Tyr Ser Trp Leu
260 265 270
Ile Asn Gly Thr Phe Gln Gln Ser Thr Gln Glu Leu Phe Ile Pro Asn
275 280 285
Ile Thr Val Asn Asn Ser Gly Ser Tyr Thr Cys His Ala Asn Asn Ser
290 295 300
Val Thr Gly Cys Asn Arg Thr Thr Val Lys Thr Ile Ile Val Thr Glu
305 310 315 320
Leu Ser Pro Val Val Ala Lys Pro Gln Ile Lys Ala Ser Lys Thr Thr
325 330 335
Val Thr Gly Asp Lys Asp Ser Val Asn Leu Thr Cys Ser Thr Asn Asp
340 345 350
Thr Gly Ile Ser Ile Arg Trp Phe Phe Lys Asn Gln Ser Leu Pro Ser
355 360 365
Ser Glu Arg Met Lys Leu Ser Gln Gly Asn Thr Thr Leu Ser Ile Asn
370 375 380
Pro Val Lys Arg Glu Asp Ala Gly Thr Tyr Trp Cys Glu Val Phe Asn
385 390 395 400
Pro Ile Ser Lys Asn Gln Ser Asp Pro Ile Met Leu Asn Val Asn Tyr
405 410 415
Asn Ala Leu Pro Gln Glu Asn Gly Leu Ser Asp Gly Ala Ile Ala Gly
420 425 430
Ile Val Ile Gly Val Val Ala Gly Val Ala Leu Ile Ala Gly Leu Ala
435 440 445
Tyr Phe Leu Tyr Ser Arg Lys Ser Gly Gly Gly Ser Asp Gln Arg Asp
450 455 460
Leu Thr Glu His Lys Pro Ser Ala Ser Asn His Asn Leu Ala Pro Ser
465 470 475 480
Asp Asn Ser Pro Asn Lys Val Asp Asp Val Ala Tyr Thr Val Leu Asn
485 490 495
Phe Asn Ser Gln Gln Pro Asn Arg Pro Thr Ser Ala Pro Ser Ser Pro
500 505 510
Arg Ala Thr Glu Thr Val Tyr Ser Glu Val Lys Lys Lys
515 520 525
<210> 11
<211> 20
<212> DNA/RNA
<213> Artificial Sequence (Artificial Sequence)
<400> 11
gctcgactag agcttgcgga 20
<210> 12
<211> 28
<212> DNA/RNA
<213> Artificial Sequence (Artificial Sequence)
<400> 12
ggagtcaata gagtgaatgc atgagtgt 28
<210> 13
<211> 28
<212> DNA/RNA
<213> Artificial Sequence (Artificial Sequence)
<400> 13
tatcacaaga gggaataaac cacagggt 28
<210> 14
<211> 25
<212> DNA/RNA
<213> Artificial Sequence (Artificial Sequence)
<400> 14
attgcaccat gaggttgaac agcat 25
<210> 15
<211> 25
<212> DNA/RNA
<213> Artificial Sequence (Artificial Sequence)
<400> 15
actcctacac acagagcact aacag 25
<210> 16
<211> 25
<212> DNA/RNA
<213> Artificial Sequence (Artificial Sequence)
<400> 16
caggccagag gaaatgtaac aaagg 25
<210> 17
<211> 25
<212> DNA/RNA
<213> Artificial Sequence (Artificial Sequence)
<400> 17
cataaggtgg gatctctcag acagg 25
<210> 18
<211> 25
<212> DNA/RNA
<213> Artificial Sequence (Artificial Sequence)
<400> 18
gctctgaagt ccagtaggat catgt 25
<210> 19
<211> 25
<212> DNA/RNA
<213> Artificial Sequence (Artificial Sequence)
<400> 19
cttactcatg gctgccacac tgaga 25
<210> 20
<211> 22
<212> DNA/RNA
<213> Artificial Sequence (Artificial Sequence)
<400> 20
gccacaactc caatcacgat gc 22
<210> 21
<211> 24
<212> DNA/RNA
<213> Artificial Sequence (Artificial Sequence)
<400> 21
gctactgcac tccagcctga gaga 24
<210> 22
<211> 25
<212> DNA/RNA
<213> Artificial Sequence (Artificial Sequence)
<400> 22
attgctggca tcgtgattgg agttg 25
<210> 23
<211> 25
<212> DNA/RNA
<213> Artificial Sequence (Artificial Sequence)
<400> 23
caggtggctc ttctgtttct actca 25
<210> 24
<211> 25
<212> DNA/RNA
<213> Artificial Sequence (Artificial Sequence)
<400> 24
gacaagcgtt agtaggcaca tatac 25
<210> 25
<211> 24
<212> DNA/RNA
<213> Artificial Sequence (Artificial Sequence)
<400> 25
gctccaattt cccacaacat tagt 24
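As a cross-check on the listing above, the following minimal Python sketch strips the spacing and running position counts from a <400> block and compares the resulting length with the declared <211> value. The helper names (parse_listing_sequence, check_lengths) are hypothetical and are not part of the patent. The three entries are copied from SEQ ID NO: 6 to 8, and the parser assumes lower-case nucleotide entries only, so it would not handle the three-letter protein format of SEQ ID NO: 10.

```python
import re


def parse_listing_sequence(block: str) -> str:
    """Return the bare sequence from a <400> block.

    Removes everything that is not a lower-case letter: the spaces
    between 10-mers, line breaks, and the running position counts.
    """
    return re.sub(r"[^a-z]", "", block)


# SEQ ID NO -> (declared <211> length, <400> text copied from the listing)
ENTRIES = {
    6: (47, "acagataatg ctctaccaca agaaaatggc ctctcagatg gcgccat 47"),
    7: (96, "gtccaggaag agagagaagg gagggactcc aagaagcagc aagactatgc ggtaccgaat 60\n"
            "tccgaagttc ctattctcta gaaagtatag gaactt 96"),
    8: (96, "gaagttccta ttctctagaa agtataggaa cttcatcagt caggtacata atggtggatc 60\n"
            "ccaattgagg tagcatgtcc tgctgactga agcagc 96"),
}


def check_lengths(entries):
    """Map each SEQ ID NO to True when the parsed length equals <211>."""
    return {
        seq_id: len(parse_listing_sequence(text)) == declared
        for seq_id, (declared, text) in entries.items()
    }


if __name__ == "__main__":
    for seq_id, ok in check_lengths(ENTRIES).items():
        print(f"SEQ ID NO: {seq_id}: length matches <211>: {ok}")
```

The same check could be applied to the longer entries by pasting their <400> text into ENTRIES.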

Claims (8)

1. A method for constructing a CEACAM1 gene humanized non-human animal, characterized in that the method comprises replacing, at the CEACAM1 locus of the non-human animal, a nucleotide sequence of the non-human animal CEACAM1 gene with the nucleotide sequence shown in SEQ ID NO: 5, wherein the replaced nucleotide sequence encodes amino acids 35 to 419 of SEQ ID NO: 1, and the corresponding sequence of the human CEACAM1 protein is shown as amino acids 35 to 423 of SEQ ID NO: 2; the method comprises constructing the non-human animal using a targeting vector, wherein the targeting vector comprises the nucleotide sequence shown in SEQ ID NO: 5 and further comprises a 5' arm and a 3' arm, the nucleotide sequence of the 5' arm being shown as SEQ ID NO: 3 and the nucleotide sequence of the 3' arm being shown as SEQ ID NO: 4; and the non-human animal is a mouse.
2. The construction method according to claim 1, wherein the non-human animal expresses a humanized CEACAM1 protein and has reduced or absent expression of the endogenous CEACAM1 protein, and the amino acid sequence of the humanized CEACAM1 protein is shown in SEQ ID NO: 10.
3. The construction method according to claim 1 or 2, wherein the genome of the non-human animal comprises a humanized CEACAM1 gene, and the nucleotide sequence of the mRNA transcribed from the humanized CEACAM1 gene is shown in SEQ ID NO: 9.
4. A targeting vector for the CEACAM1 gene, which comprises the nucleotide sequence shown in SEQ ID NO: 5 and further comprises a 5' arm and a 3' arm, wherein the nucleotide sequence of the 5' arm is shown as SEQ ID NO: 3 and the nucleotide sequence of the 3' arm is shown as SEQ ID NO: 4.
5. Use of a CEACAM1 gene humanized non-human animal obtained by the construction method according to any one of claims 1-3 in studies related to the CEACAM1 gene or protein, wherein the use is not a method for the diagnosis or treatment of disease, said use comprising:
A) use in the development of products involving CEACAM1-related immune processes of human cells;
B) use as a model system in pharmacological, immunological, microbiological and medical research associated with CEACAM1;
C) use in studying, in a non-human animal, the screening, efficacy testing and therapeutic evaluation of modulators of the human CEACAM1 signaling pathway; or
D) use in studying CEACAM1 gene function.
6. The use according to claim 5, wherein the use further comprises the study of human CEACAM1 antibodies.
7. The use according to claim 5, wherein the use further comprises studying drugs directed against human CEACAM1 target sites and their efficacy.
8. The use according to claim 5, wherein the use further comprises studying drugs for CEACAM1-related immune diseases or antitumor drugs.
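
The residue ranges recited in claim 1 can be cross-checked against the length of SEQ ID NO: 10 with simple arithmetic. The sketch below is illustrative only: the function names are hypothetical, and it assumes the humanized protein is the mouse protein with residues 35 to 419 exchanged for human residues 35 to 423 and no other change. Under that assumption, the 525-residue length of SEQ ID NO: 10 implies a length for SEQ ID NO: 1, which is not reproduced in this section.

```python
def segment_length(start: int, end: int) -> int:
    """Number of residues in the inclusive range start..end."""
    return end - start + 1


def implied_mouse_length(humanized_len, mouse_range, human_range):
    """Length of the mouse protein implied by the swap.

    Assumption: a clean segment-for-segment replacement, so the
    humanized length equals the mouse length minus the replaced mouse
    segment plus the inserted human segment.
    """
    return humanized_len - segment_length(*human_range) + segment_length(*mouse_range)


MOUSE_RANGE = (35, 419)   # replaced residues of SEQ ID NO: 1 (claim 1)
HUMAN_RANGE = (35, 423)   # inserted residues of SEQ ID NO: 2 (claim 1)
HUMANIZED_LEN = 525       # <211> value of SEQ ID NO: 10

if __name__ == "__main__":
    print("mouse segment length:", segment_length(*MOUSE_RANGE))
    print("human segment length:", segment_length(*HUMAN_RANGE))
    print("implied length of SEQ ID NO: 1:",
          implied_mouse_length(HUMANIZED_LEN, MOUSE_RANGE, HUMAN_RANGE))
```

If the actual <211> value of SEQ ID NO: 1 differs from the implied figure, the clean-swap assumption does not hold and the junctions would need to be inspected directly.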
CN202110173466.1A 2021-02-09 2021-02-09 Construction method and application of CEACAM1 gene humanized non-human animal Active CN112501205B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110173466.1A CN112501205B (en) 2021-02-09 2021-02-09 Construction method and application of CEACAM1 gene humanized non-human animal

Publications (2)

Publication Number Publication Date
CN112501205A CN112501205A (en) 2021-03-16
CN112501205B true CN112501205B (en) 2021-05-25

Family

ID=74952793

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110173466.1A Active CN112501205B (en) 2021-02-09 2021-02-09 Construction method and application of CEACAM1 gene humanized non-human animal

Country Status (1)

Country Link
CN (1) CN112501205B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115010799A (en) * 2021-06-01 2022-09-06 Biocytogen Pharmaceuticals (Beijing) Co., Ltd. Construction method and application of BCMA gene humanized non-human animal

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2002081715A1 (en) * 2001-04-04 2002-10-17 Mcgill University A transgenic non-human animal having a disruption of at least one allele to the ceacam1 gene and method of making same
CN102482354A (en) * 2009-04-30 2012-05-30 Tel HaShomer Medical Research Infrastructure and Services Ltd. Anti ceacam1 antibodies and methods of using same
CN103060361A (en) * 2013-01-25 2013-04-24 First Affiliated Hospital of Kunming Medical University Preparation method of carcinoembryonic antigen-related cell adhesion molecule 1 (CEACAM1) magnetic protein grains and application thereof
CN107815466A (en) * 2016-08-31 2018-03-20 Beijing Biocytogen Co., Ltd. The preparation method and application of humanization genetic modification animal model
CN108588126A (en) * 2017-03-31 2018-09-28 Beijing Biocytogen Co., Ltd. The preparation method and application of the humanization modified animal model of CD47 genes
CN110157704A (en) * 2019-04-04 2019-08-23 Sun Yat-sen University A kind of mouse and preparation method thereof of anti-mouse hepatitis virus

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
CEACAM1 (carcinoembryonic antigen-related cell adhesion molecule 1 (biliary glycoprotein)); Yasunobu Matsuda; Atlas Genet Cytogenet Oncol Haematol.; 20101231; Vol. 14, No. 4; pp. 361-364 *
Generation of Human CEACAM1 Transgenic Mice and Binding of Neisseria Opa Protein to Their Neutrophils; Angel Gu et al.; PLOS ONE; 20100430; Vol. 5, No. 4; Abstract, "Materials and Methods" and "Discussion" sections *

Also Published As

Publication number Publication date
CN112501205A (en) 2021-03-16

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant