CN114958758A - Construction method and application of breast cancer model pig

Publication number
CN114958758A
Authority
CN
China
Prior art keywords
seq
pig
safe harbor
harbor site
homology arm
Prior art date
Legal status
Granted
Application number
CN202110187956.7A
Other languages
Chinese (zh)
Other versions
CN114958758B (en)
Inventor
牛冬
汪滔
陶裴裴
曾为俊
王磊
程锐
黄彩云
赵泽英
马翔
段星
刘璐
Current Assignee
Nanjing Qizhen Genetic Engineering Co Ltd
Original Assignee
Nanjing Qizhen Genetic Engineering Co Ltd
Priority date
Filing date
Publication date
Application filed by Nanjing Qizhen Genetic Engineering Co Ltd
Priority to CN202110187956.7A
Publication of CN114958758A
Application granted
Publication of CN114958758B
Legal status: Active
Anticipated expiration

Classifications

    • C12N5/0656 Adult fibroblasts
    • A01K67/0275 Genetically modified vertebrates, e.g. transgenic
    • A61K49/0008 Screening agents using (non-human) animal models or transgenic animal models or chimeric hosts, e.g. Alzheimer disease animal model, transgenic model for heart failure
    • C07K14/005 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from viruses
    • C12N15/8509 Vectors or expression systems specially adapted for eukaryotic hosts for animal cells for producing genetically modified animals, e.g. transgenic
    • C12N15/907 Stable introduction of foreign DNA into chromosome using homologous recombination in mammalian cells
    • C12N5/0631 Mammary cells
    • C12N9/22 Ribonucleases RNAses, DNAses
    • G01N33/5011 Chemical analysis of biological material, e.g. blood, urine; Testing involving biospecific ligand binding methods; Immunological testing involving human or animal cells for testing or evaluating the effect of chemical or biological compounds, e.g. drugs, cosmetics for testing antineoplastic activity
    • A01K2217/072 Animals genetically altered by homologous recombination maintaining or altering function, i.e. knock in
    • A01K2227/108 Swine
    • A01K2267/03 Animal model, e.g. for test or diseases
    • C12N2503/02 Drug screening
    • C12N2510/00 Genetically modified cells
    • C12N2710/22022 New viral proteins or individual genes, new structural or functional aspects of known viral proteins or genes (Polyomaviridae, e.g. polyoma, SV40, JC)
    • C12N2800/107 Plasmid DNA for vertebrates for mammalian
    • C12N2830/008 Vector systems having a special element relevant for transcription cell type or tissue specific enhancer/promoter combination
    • G01N2500/10 Screening for compounds of potential therapeutic value involving cells

Abstract

The invention provides a pig cell expressing PyMT, a breast cancer model pig obtained from the pig cell by somatic cell cloning, a construction method, and applications in the field of biomedicine. The method comprises inserting a nucleotide sequence encoding PyMT into a pig safe harbor site to obtain a pig cell expressing the PyMT shown in SEQ ID NO: 14 and a breast cancer model pig, wherein the pig safe harbor site is selected from the group consisting of the pig ROSA26, AAVS1, H11 and COL1A1 safe harbor sites. The invention uses a research subject with good applicability and achieves high expression of the target gene in pig cells and high gene editing efficiency.

Description

Construction method and application of breast cancer model pig
Technical Field
The invention relates to the technical field of gene editing, and in particular to a recombinant pig cell constructed with the CRISPR/Cas9 system and homologous recombination, in which a PyMT oncogene driven by the mammary gland specific promoter MMTV-LTR is integrated at a specific position in the genome.
Background
According to the latest global cancer burden data for 2020 released by the International Agency for Research on Cancer (IARC) of the World Health Organization, 60% of new cancer cases in 2020 came from the 10 most common tumors. Female breast cancer accounted for 11.7% of new cases, surpassing lung cancer (11.4%) for the first time and becoming the most commonly diagnosed cancer worldwide. As the most commonly diagnosed cancer, breast cancer is the leading threat to women's health and one of the most urgent health problems today. Despite the advances of modern medicine in cancer diagnosis, treatment and survival, mortality has not improved substantially. The main reason for this limitation is an incomplete understanding of the natural history of the disease: it is still unclear at the molecular level which changes in breast tumors lead to invasion and metastasis.
Mouse mammary tumor virus (MMTV) is an important cause of mammary tumors in mice. Its tissue-specific promoter, the MMTV long terminal repeat (MMTV-LTR), can be used to drive high expression of oncogenes in the mouse mammary gland and thereby induce mammary cancer. Polyomavirus middle T antigen (PyMT; amino acid sequence shown in SEQ ID NO: 14) is a membrane-associated protein encoded by polyomavirus, a small DNA virus. PyMT has been found to induce tumors that readily metastasize and whose biomarker expression is consistent with that associated with poor prognosis in humans. PyMT is a potent oncogene whose product binds to and activates several signal transduction pathways, including the Src family, Ras and PI3K pathways, all of which are altered in human breast cancer.
Research on the mechanisms of tumor development and on treatment requires suitable animal models. The most commonly used model at present is the mouse. For example, patent TW201536324A discloses a method for increasing the effectiveness of medical therapy, whose examples briefly describe obtaining a breast cancer model by injecting MMTV-PyMT cells into p16-3MR transgenic mice and then carrying out treatment. However, mice differ greatly from humans in body size, organ size, physiology and pathology, and cannot faithfully reproduce normal human physiological and pathological states. Furthermore, a mouse breast cancer model obtained by injecting MMTV-PyMT cells is not stably inherited. Patent CN103173496B discloses a tree shrew breast cancer model established by injecting lentivirus through the nipple: a viral vector is constructed, the virus is packaged and titrated, and the lentivirus is then injected into the tree shrew nipple. However, tree shrews are not a common disease model animal, their body size and physiological functions are not very similar to those of humans, and they likewise cannot faithfully reproduce normal human physiological and pathological states. There therefore remains a need for more suitable animal models of breast cancer, including advances in both the choice of animal and the method of making the model. The pig is a large animal that has long been a main source of meat for humans. It is similar to humans in body size and physiological function, is easy to breed and raise on a large scale, raises fewer ethical and animal protection concerns, and is an ideal model animal for human disease.
The invention therefore uses gene editing and the mammary gland specific promoter MMTV-LTR to construct a recombinant pig cell that expresses the oncogene PyMT specifically in mammary tissue, and then uses the recombinant cell as a nuclear transfer donor to clone and produce a breast cancer model pig.
Disclosure of Invention
The invention provides a method for preparing a pig cell expressing PyMT protein by site-specific insertion of a nucleotide sequence encoding PyMT into a pig safe harbor site through gene editing. Through somatic cell cloning, the pig cell can further be used to produce a breast cancer model pig that expresses PyMT protein in mammary gland tissue, providing a powerful experimental tool for studying disease pathogenesis and developing therapeutic drugs.
In a first aspect of the invention, a PyMT-expressing pig cell is provided, wherein a nucleotide sequence encoding PyMT is inserted into a pig safe harbor site to obtain a pig cell expressing the PyMT shown in SEQ ID NO: 14.
Preferably, the inserted nucleotide sequence encoding PyMT may be the CDS sequence or cDNA sequence of PyMT.
Preferably, the amino acid sequence of PyMT is shown in SEQ ID NO: 14.
In one embodiment of the invention, the inserted nucleotide sequence encoding PyMT is shown in SEQ ID NO: 39.
Preferably, the pig safe harbor site is selected from the group consisting of the pig ROSA26, AAVS1, H11 and COL1A1 safe harbor sites.
In one embodiment of the invention, the nucleotide sequence of the ROSA26 safe harbor site region plus 500 bp upstream and 500 bp downstream is shown in SEQ ID NO: 40; that of the AAVS1 safe harbor site region plus 500 bp upstream and 500 bp downstream is shown in SEQ ID NO: 41; that of the H11 safe harbor site region plus 500 bp upstream and 500 bp downstream is shown in SEQ ID NO: 42; and that of the COL1A1 safe harbor site region plus 500 bp upstream and 500 bp downstream is shown in SEQ ID NO: 43.
Further preferably, the pig safe harbor site is the COL1A1 site.
Preferably, the nucleotide sequence encoding PyMT is regulated in the porcine cells by an exogenous promoter, which is MMTV-LTR.
Mouse Mammary Tumor Virus (MMTV) Long Terminal Repeat (LTR) driven transgenes allow for targeted expression of various oncogenes and growth factors in breast tumor transformation.
In one embodiment of the invention, expression of the PyMT-encoding nucleotide sequence in porcine cells is driven by the MMTV-LTR, whose nucleotide sequence is shown in SEQ ID NO: 15.
Preferably, the porcine cells are somatic cells of pigs. Further preferred are any porcine somatic cells that can be used in somatic cell nuclear transfer technology.
Preferably, the porcine cells can be mammary cells, embryonic stem cells, adult stem cells, hematopoietic stem cells, bone marrow mesenchymal stem cells, neural stem cells, hepatic stem cells, muscle satellite cells, skin epidermal stem cells, intestinal epithelial stem cells, retinal stem cells, pancreatic stem cells, somatic cells, fibroblasts, muscle cells, glial cells, adipose cells, germ cells, and the like.
In one embodiment of the present invention, the porcine cell is a porcine fibroblast or a mammary gland cell (preferably a mammary gland epithelial cell).
In a second aspect of the present invention, a method for constructing the above-mentioned pig cell is provided, wherein a nucleotide sequence encoding PyMT is inserted into a pig safe harbor site to obtain a pig cell expressing the PyMT shown in SEQ ID NO: 14.
Specifically, homologous recombination-based gene editing, nuclease-based ZFN, TALEN, CRISPR/Cas9 and other editing technologies can be adopted.
Preferably, the construction method comprises inserting a nucleotide sequence encoding PyMT into a pig safe harbor site using a safe harbor site vector, wherein the safe harbor site vector comprises the nucleotide sequence encoding PyMT and a safe harbor site vector backbone, the backbone comprises a 5' homology arm and a 3' homology arm of the safe harbor insertion site, the nucleotide sequence encoding PyMT is located between the 5' homology arm and the 3' homology arm, and the backbone is selected from any one of the following:
A) a ROSA26 safe harbor site vector backbone, whose 5' homology arm is shown in SEQ ID NO: 5 and whose 3' homology arm is shown in SEQ ID NO: 6. Preferably, the nucleotide sequence of the ROSA26 safe harbor site vector backbone is shown in SEQ ID NO: 4.
B) an AAVS1 safe harbor site vector backbone, whose 5' homology arm is shown in SEQ ID NO: 7 and whose 3' homology arm is shown in SEQ ID NO: 8. Preferably, the nucleotide sequence of the AAVS1 safe harbor site vector backbone is the sequence obtained by replacing the 5' and 3' homology arms of ROSA26 in SEQ ID NO: 4 with the 5' and 3' homology arms of AAVS1.
C) an H11 safe harbor site vector backbone, whose 5' homology arm is shown in SEQ ID NO: 9 and whose 3' homology arm is shown in SEQ ID NO: 10. Preferably, the nucleotide sequence of the H11 safe harbor site vector backbone is the sequence obtained by replacing the 5' and 3' homology arms of ROSA26 in SEQ ID NO: 4 with the 5' and 3' homology arms of H11.
or D) a COL1A1 safe harbor site vector backbone, whose 5' homology arm is shown in SEQ ID NO: 11 and whose 3' homology arm is shown in SEQ ID NO: 12. Preferably, the nucleotide sequence of the COL1A1 safe harbor site vector backbone is the sequence obtained by replacing the 5' and 3' homology arms of ROSA26 in SEQ ID NO: 4 with the 5' and 3' homology arms of COL1A1.
Further preferably, the safe harbor site vector backbone is the COL1A1 safe harbor site vector backbone.
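The placement of the payload between the two homology arms can be illustrated at the string level. The following is only a toy sketch: the function name `knock_in` and the short sequences in the usage note are hypothetical, and real homology-directed repair is a cellular process that this code does not model. It assumes the two arms are adjacent in the target genome and that the target site is unique.

```python
def knock_in(genome: str, arm5: str, arm3: str, payload: str) -> str:
    """Insert `payload` at the junction of two adjacent homology arms.

    Toy model only: `genome`, `arm5`, `arm3` and `payload` are plain strings,
    and the arms are assumed to sit directly next to each other in `genome`.
    """
    site = arm5 + arm3
    pos = genome.find(site)
    if pos == -1:
        raise ValueError("homology arms not found as adjacent sequences")
    if genome.find(site, pos + 1) != -1:
        raise ValueError("target site is not unique")
    cut = pos + len(arm5)  # junction between the 5' and 3' arms
    return genome[:cut] + payload + genome[cut:]
```

For example, `knock_in("TTTTACGTACGGATCCAAAA", "ACGTAC", "GGATCC", "NNN")` places the placeholder payload `NNN` between the two arms and leaves the flanking sequence unchanged.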
Preferably, the safe harbor site vector further comprises a promoter, a poly(A) signal, and nucleotide sequences encoding the EGFP protein, the mCherry protein and the puromycin (puro) resistance protein. The promoter is an EF-1α promoter, a PGK promoter and/or a pCAG promoter. The poly(A) signal is an EF-1α poly(A) signal, a bGH poly(A) signal and/or a β-globin poly(A) signal. Further preferably, the vector further comprises an insulator region.
In a specific embodiment of the invention, the safe harbor site vector backbone comprises, in order from 5' to 3': a 5' homology arm, an insulator region, an EF-1α poly(A) signal, a nucleotide sequence encoding EGFP, an EF-1α promoter, an insulator region, a PGK promoter, a nucleotide sequence encoding mCherry, a bGH poly(A) signal, a loxP-puro-loxP expression cassette region, an insulator region, a β-globin poly(A) signal, a pCAG promoter, an insulator region, and a 3' homology arm.
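The element order of this embodiment can be written down as data and split at the insulator regions to see which elements are grouped together. This is a minimal sketch: the list `BACKBONE` restates the order given in the text, while the helper `split_blocks` and the grouping it produces are illustrative assumptions, not something the patent states.

```python
# Element order of the backbone, 5' to 3', as listed in the embodiment.
BACKBONE = [
    "5' homology arm",
    "insulator region",
    "EF-1α poly(A) signal",
    "EGFP coding sequence",
    "EF-1α promoter",
    "insulator region",
    "PGK promoter",
    "mCherry coding sequence",
    "bGH poly(A) signal",
    "loxP-puro-loxP expression cassette",
    "insulator region",
    "β-globin poly(A) signal",
    "pCAG promoter",
    "insulator region",
    "3' homology arm",
]


def split_blocks(elements, delimiter="insulator region"):
    """Group consecutive elements into blocks separated by `delimiter`."""
    blocks, current = [], []
    for element in elements:
        if element == delimiter:
            blocks.append(current)
            current = []
        else:
            current.append(element)
    blocks.append(current)
    return blocks


if __name__ == "__main__":
    for block in split_blocks(BACKBONE):
        print(" | ".join(block))
```

Read this way, the four insulator regions separate the two homology arms from three internal blocks: one containing EGFP, one containing mCherry and the loxP-puro-loxP cassette, and one containing the β-globin poly(A) signal and the pCAG promoter.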
In a specific embodiment of the invention, the nucleotide sequence of the COL1A1 safe harbor site vector is shown in SEQ ID NO: 13.
Preferably, the pig cells are constructed using an sgRNA vector comprising an sgRNA targeting the ROSA26, AAVS1, H11 or COL1A1 safe harbor site, wherein:
the nucleotide sequence of the sgRNA targeting ROSA26 is shown in SEQ ID NO: 21, that of the sgRNA targeting AAVS1 is shown in SEQ ID NO: 22, that of the sgRNA targeting H11 is shown in SEQ ID NO: 23, and that of the sgRNA targeting COL1A1 is shown in SEQ ID NO: 24.
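Because each safe harbor site is tied to several sequence identifiers (site region, homology arms and sgRNA), a small lookup table can help keep them straight. The sketch below only collects the SEQ ID NO values stated in the text; the class name `SafeHarborSite`, its field names and the dictionary `SITES` are hypothetical conveniences.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SafeHarborSite:
    name: str
    region_seq_id: int  # site region plus 500 bp upstream and downstream
    arm5_seq_id: int    # 5' homology arm
    arm3_seq_id: int    # 3' homology arm
    sgrna_seq_id: int   # sgRNA targeting the site


# SEQ ID NO values as stated in the description.
SITES = {
    site.name: site
    for site in (
        SafeHarborSite("ROSA26", 40, 5, 6, 21),
        SafeHarborSite("AAVS1", 41, 7, 8, 22),
        SafeHarborSite("H11", 42, 9, 10, 23),
        SafeHarborSite("COL1A1", 43, 11, 12, 24),
    )
}

if __name__ == "__main__":
    preferred = SITES["COL1A1"]  # the site the description prefers
    print(preferred)
```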
Preferably, the sgRNA vector further comprises a backbone vector whose nucleotide sequence is shown in SEQ ID NO: 3.
Preferably, a Cas vector is used for constructing the porcine cells. The Cas vector comprises nucleotide sequences encoding a Cas protein, EGFP and a Puro resistance protein, and further comprises an EF1a promoter, a WPRE element and a 3' LTR sequence element. Preferably, the Cas vector comprises, in order from 5' to 3': a CMV enhancer, an EF1a promoter, a nuclear localization signal, a nucleotide sequence encoding a Cas protein, a nuclear localization signal, a nucleotide sequence encoding the self-cleaving polypeptide P2A, a nucleotide sequence encoding EGFP, a nucleotide sequence encoding the self-cleaving polypeptide T2A, a nucleotide sequence encoding a Puro resistance protein, a WPRE sequence element, a 3' LTR sequence element and a poly(A) signal sequence element. The Cas protein is selected from Cas1, Cas1B, Cas2, Cas3, Cas4, [...], Csa5, C2c1, C2c2, C2c3, Cpf1, CARF, DinG, homologs thereof, or modified forms thereof, and is preferably Cas9.
In one embodiment of the present invention, the nucleotide sequence of the Cas vector is shown in SEQ ID NO: 1 or SEQ ID NO: 2.
In a specific embodiment of the invention, the construction method comprises co-transfecting a safe harbor site vector, a sgRNA vector and a Cas vector into a pig cell.
To increase the gene editing capacity of the Cas9 vector, the invention obtains pU6gRNA-eEF1a-mNLS-hSpCas9-EGFP-PURO (plasmid pKG-GE3 for short) by modifying the vector pX330-U6-Chimeric_BB-CBh-hSpCas9 (PX330 for short), purchased from Addgene (Plasmid #42230, from the Zhang Feng lab). The map of PX330 is shown in FIG. 1. The modifications are as follows:
1) removing redundant, non-functional sequences from the gRNA scaffold of the original vector;
2) replacing the promoter: the original promoter (chicken beta-actin promoter) is replaced with the EF1a promoter, which has higher expression activity, to increase expression of the Cas9 protein;
3) adding nuclear localization signals: nuclear localization signal (NLS) coding sequences are added at the N-terminus and C-terminus of Cas9 to increase its nuclear localization;
4) adding dual selection markers: the original vector has no selection marker, which hinders screening and enrichment of positively transfected cells; P2A-EGFP-T2A-PURO is inserted at the C-terminus of Cas9 to give the vector fluorescence-based and resistance-based selection capabilities;
5) inserting WPRE, 3' LTR and other sequences that regulate gene expression: inserting WPRE, 3' LTR and other sequences in the reading frame of the gene can enhance translation of the Cas9 gene.
The modified vector pU6gRNA-eEF1a-mNLS-hSpCas9-EGFP-PURO (pKG-GE3 for short) and its modified sites are shown in FIG. 2, and the full sequence of the plasmid is shown in SEQ ID NO: 2. The main elements of pKG-GE3 are:
1) gRNA expression element: U6 gRNA scaffold;
2) promoter: the EF1a promoter and CMV enhancer;
3) Cas9 gene with multiple NLSs: a Cas9 gene carrying multiple nuclear localization signals (NLS) at the N-terminus and C-terminus;
4) selection marker genes: the dual fluorescence and resistance selection marker element P2A-EGFP-T2A-PURO;
5) translation-enhancing elements: WPRE and 3' LTR, which enhance the translation efficiency of Cas9 and the selection marker genes;
6) transcription termination signal: the bGH poly(A) signal;
7) vector backbone: including the Amp resistance element, the ori replicon, and the like.
The plasmid pKG-GE3 carries a specific fusion gene, which encodes a specific fusion protein;
the specific fusion protein comprises, in order from N-terminus to C-terminus: two nuclear localization signals (NLS), Cas9 protein, two nuclear localization signals, self-cleaving polypeptide P2A, fluorescent reporter protein, self-cleaving polypeptide T2A, and resistance selection marker protein;
in plasmid pKG-GE3, expression of the specific fusion gene is driven by the EF1a promoter;
in plasmid pKG-GE3, the specific fusion gene is followed downstream by a WPRE sequence element, a 3' LTR sequence element and a bGH poly(A) signal sequence element.
The plasmid pKG-GE3 has the following elements in the following order: CMV enhancer, EF1a promoter, the specific fusion gene, WPRE sequence element, 3' LTR sequence element, bGH poly (A) signal sequence element.
In the specific fusion protein, two nuclear localization signals at the upstream of the Cas9 protein are SV40 nuclear localization signals, and two nuclear localization signals at the downstream of the Cas9 protein are nucleoplasmin nuclear localization signals.
In the specific fusion protein, the fluorescent reporter protein can be EGFP protein.
In the specific fusion protein, the resistance screening marker protein can be Puromycin resistance protein.
The amino acid sequence of the self-cleaving polypeptide P2A is "ATNFSLLKQAGDVEENPGP" (self-cleavage occurs between the first and second amino acid residues from the C-terminus).
The amino acid sequence of the self-cleaving polypeptide T2A is "EGRGSLLTCGDVEENPGP" (self-cleavage occurs between the first and second amino acid residues from the C-terminus).
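The effect of the two 2A peptides on the fusion protein can be sketched as a string operation: the chain is split between the last two residues (G and P) of each 2A sequence, so the upstream product keeps the 2A peptide minus its final proline and the downstream product starts with that proline. The function `split_at_2a` is a hypothetical helper, and the short protein strings in the usage note are placeholders rather than real Cas9, EGFP or puromycin resistance sequences.

```python
from typing import List, Sequence

P2A = "ATNFSLLKQAGDVEENPGP"
T2A = "EGRGSLLTCGDVEENPGP"


def split_at_2a(polyprotein: str, peptides: Sequence[str] = (P2A, T2A)) -> List[str]:
    """Split a polyprotein between the last two residues of each 2A peptide."""
    cut_points = []
    for peptide in peptides:
        start = polyprotein.find(peptide)
        while start != -1:
            # Index of the final proline, which begins the downstream product.
            cut_points.append(start + len(peptide) - 1)
            start = polyprotein.find(peptide, start + 1)
    products, previous = [], 0
    for cut in sorted(cut_points):
        products.append(polyprotein[previous:cut])
        previous = cut
    products.append(polyprotein[previous:])
    return products
```

Calling `split_at_2a("MCAS" + P2A + "MGFP" + T2A + "MPUR")` with these placeholder segments yields three products, mirroring how the single fusion gene gives rise to separate Cas9, fluorescent reporter and resistance proteins.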
The specific fusion gene is shown as nucleotides 911-6706 of SEQ ID NO: 2.
The CMV enhancer is shown as nucleotides 395-680 of SEQ ID NO: 2.
The EF1a promoter is shown as nucleotides 682-890 of SEQ ID NO: 2.
The WPRE sequence element is shown as nucleotides 6722-7310 of SEQ ID NO: 2.
The 3' LTR sequence element is shown as nucleotides 7382-7615 of SEQ ID NO: 2.
The bGH poly(A) signal sequence element is shown as nucleotides 7647-7871 of SEQ ID NO: 2.
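The coordinates above can be checked for internal consistency with a few lines of code. This is a minimal sketch that assumes the ranges are 1-based and inclusive, which is the usual convention for sequence listings but is not stated explicitly in the text; the names `ELEMENTS`, `length` and `is_ordered_without_overlap` are hypothetical.

```python
# (name, start, end) on SEQ ID NO: 2, assumed 1-based and inclusive.
ELEMENTS = [
    ("CMV enhancer", 395, 680),
    ("EF1a promoter", 682, 890),
    ("specific fusion gene", 911, 6706),
    ("WPRE", 6722, 7310),
    ("3' LTR", 7382, 7615),
    ("bGH poly(A) signal", 7647, 7871),
]


def length(start: int, end: int) -> int:
    """Length in nucleotides of a 1-based inclusive range."""
    return end - start + 1


def is_ordered_without_overlap(elements) -> bool:
    """True if each element ends before the next one starts."""
    return all(
        previous[2] < following[1]
        for previous, following in zip(elements, elements[1:])
    )


if __name__ == "__main__":
    for name, start, end in ELEMENTS:
        print(f"{name}: {length(start, end)} nt")
    print("ordered 5' to 3' without overlap:", is_ordered_without_overlap(ELEMENTS))
```

Under that assumption the fusion gene spans 5796 nucleotides, a multiple of three as expected for a coding sequence, and the elements appear in the same 5' to 3' order as listed for plasmid pKG-GE3.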
Preferably, the safe harbor site vector, sgRNA vector or Cas vector are all circular plasmids.
In a third aspect of the invention, there is provided a tissue or organ comprising a porcine cell as described above.
Preferably, the tissue is a mammary gland tissue, and more preferably a mammary gland epithelial tissue. Preferably, the organ is a breast.
In a fourth aspect of the invention, a method for constructing a model pig expressing PyMT is provided, wherein a nucleotide sequence encoding PyMT is inserted into a pig safe harbor site to obtain a model pig expressing the PyMT shown in SEQ ID NO: 14. Preferably, the pig safe harbor site is selected from the group consisting of the pig ROSA26, AAVS1, H11 and COL1A1 safe harbor sites. Further preferably, the pig safe harbor site is the COL1A1 site.
Preferably, the construction method further comprises the step of preparing the porcine cells.
Preferably, the construction method comprises transferring the pig cell into an enucleated pig oocyte to obtain a model pig. In one embodiment of the invention, the cell is transferred into the perivitelline space of the enucleated pig oocyte.
In a specific embodiment of the invention, the construction method comprises providing the above pig cell or obtaining the pig cell by the above pig cell construction method, and then performing somatic cell nuclear transfer animal cloning on the pig cell to obtain a model pig expressing PyMT protein.
In the fifth aspect of the invention, a method for constructing a breast cancer model pig is provided, wherein a nucleotide sequence encoding PyMT is inserted into a pig safe harbor site to obtain a model pig expressing the PyMT shown in SEQ ID NO: 14. Preferably, the pig safe harbor site is selected from the group consisting of the pig ROSA26, AAVS1, H11 and COL1A1 safe harbor sites. Further preferably, the pig safe harbor site is the COL1A1 site.
In a specific embodiment of the invention, the construction method comprises providing the above pig cell, or obtaining the pig cell by the above pig cell construction method, and then performing somatic cell nuclear transfer animal cloning on the pig cell to obtain a breast cancer model pig with homozygous or heterozygous knock-in of the PyMT gene.
The sixth aspect of the present invention provides a safe harbor site vector, wherein the safe harbor site vector comprises a nucleotide sequence encoding PyMT and a safe harbor site vector backbone, the safe harbor site vector backbone comprises a 5' homology arm and a 3' homology arm of a safe harbor insertion site, the nucleotide sequence encoding PyMT is located between the 5' homology arm and the 3' homology arm, and the safe harbor site vector backbone is selected from any one of the following:
A) a ROSA26 safe harbor site vector backbone, the 5' homology arm of which is shown in SEQ ID NO: 5 and the 3' homology arm of which is shown in SEQ ID NO: 6. Preferably, the nucleotide sequence of the ROSA26 safe harbor site vector backbone is shown in SEQ ID NO: 4.
B) an AAVS1 safe harbor site vector backbone, the 5' homology arm of which is shown in SEQ ID NO: 7 and the 3' homology arm of which is shown in SEQ ID NO: 8. Preferably, the nucleotide sequence of the AAVS1 safe harbor site vector backbone is the sequence obtained by replacing the 5' and 3' homology arms of ROSA26 in SEQ ID NO: 4 with the 5' and 3' homology arms of AAVS1.
C) an H11 safe harbor site vector backbone, the 5' homology arm of which is shown in SEQ ID NO: 9 and the 3' homology arm of which is shown in SEQ ID NO: 10. Preferably, the nucleotide sequence of the H11 safe harbor site vector backbone is the sequence obtained by replacing the 5' and 3' homology arms of ROSA26 in SEQ ID NO: 4 with the 5' and 3' homology arms of H11.
Or D) a COL1A1 safe harbor site vector backbone, the 5' homology arm of which is shown in SEQ ID NO: 11 and the 3' homology arm of which is shown in SEQ ID NO: 12. Preferably, the nucleotide sequence of the COL1A1 safe harbor site vector backbone is the sequence obtained by replacing the 5' and 3' homology arms of ROSA26 in SEQ ID NO: 4 with the 5' and 3' homology arms of COL1A1.
Further preferably, the optimal pig safe harbor site vector skeleton is a COL1A1 safe harbor site vector skeleton.
Preferably, the safe harbor site vector further comprises a promoter, a signal molecule and nucleotide sequences encoding the EGFP protein, the mCherry protein and the puro resistance protein. The promoter is an EF-1α promoter, a PGK promoter and/or a pCAG promoter. The signal molecule is an EF-1α poly(A) signal, a bGH poly(A) signal and/or a β-globin poly(A) signal. Further preferably, the safe harbor site vector further comprises an insulator region.
In a specific embodiment of the invention, the safe harbor site vector backbone comprises, in order from 5' to 3', a 5' homology arm, an insulator region, an EF-1α poly(A) signal, a nucleotide sequence encoding EGFP, an EF-1α promoter, an insulator region, a PGK promoter, a nucleotide sequence encoding mCherry, a bGH poly(A) signal, a loxP-puro-loxP expression cassette region, an insulator region, a β-globin poly(A) signal, a pCAG promoter, an insulator region, and a 3' homology arm.
The seventh aspect of the invention provides use of the safe harbor site vector, the sgRNA vector or the sgRNA in the preparation of pig cells, model pigs expressing PyMT protein or breast cancer model pigs.
The eighth aspect of the present invention provides an application of the pig cell, the pig cell obtained by the construction method, and the model pig obtained by the construction method of the model pig expressing PyMT in preparation of an animal model with breast cancer, or in screening of drugs for treating breast cancer and evaluation of drug effects, or in gene and cell therapy, or in research of pathogenesis of breast cancer.
The ninth aspect of the invention provides an application of the tissue or organ or the model pig obtained by the construction method in screening drugs for treating breast cancer and evaluating drug effects, or in gene and cell therapy, or in researching pathogenesis of breast cancer.
The term "vector" is a polynucleotide capable of replicating within a cell under its own control, or a genetic element, such as a plasmid, chromosome, virus, transposon, which replicates and/or is expressed by insertion into the chromosome of a host cell. Suitable vectors include, but are not limited to, plasmids, transposons, bacteriophages and cosmids.
The "gRNA" of the present invention, also referred to as guide RNA, is an RNA transcribed from a sgRNA vector in a cell, has specificity for a target sequence in the cell, and can form a complex with a Cas protein.
Compared with the prior art, the invention at least has the following beneficial effects:
(1) the subject of the invention (pig) has better applicability than other animals (rats, mice, primates).
Rodents such as rats and mice differ greatly from humans in body size, organ size, physiology, pathology and the like, and cannot faithfully simulate normal human physiological and pathological states. Studies have shown that over 95% of drugs validated as effective in rats and mice are not effective in human clinical trials. Among large animals, primates are the closest relatives of humans, but they are small in size, reach sexual maturity late (mating begins at 6-7 years of age), bear a single offspring per birth, propagate extremely slowly as a population and are costly to raise. In addition, primate cloning is inefficient, difficult and expensive.
Pigs, which apart from primates are the animals most closely related to humans, do not have the above disadvantages. Their body size, body weight and organ size are similar to those of humans, and they closely resemble humans in anatomy, physiology, immunology, nutritional metabolism, disease pathogenesis and the like. Pigs also reach sexual maturity early (4-6 months), have high reproductive capacity with multiple piglets per litter, and can form a large population within 2-3 years. In addition, pig cloning technology is very mature, and cloning and feeding costs are much lower than for primates. Pigs are therefore very suitable as animal models of human diseases.
(2) The vector pU6gRNA-eEF1a-mNLS-hSpCas9-EGFP-PURO (pKG-GE3 for short), modified in the invention and verified experimentally, differs from the unmodified pX330 vector in that a stronger promoter is substituted and an element that enhances protein translation is added, which raises Cas9 expression, and in that the number of nuclear localization signals is increased, which improves the nuclear localization of the Cas9 protein; gene editing efficiency is higher as a result. The invention also adds a fluorescent marker and a resistance marker to the vector, which makes it easier to screen and enrich cells positively transformed with the vector. When the modified high-efficiency Cas9 expression vector is used for gene editing, the editing efficiency is improved by more than 100% compared with the original vector.
(3) The invention examines, for the pig genome, the expression of a gene knocked in at each of 4 safe harbor sites, and screens out the optimal pig genome safe harbor site for inserting an exogenous gene, thereby effectively improving expression of the target gene after knock-in.
(4) The invention uses a mammary gland-specific promoter, the mouse mammary tumor virus long terminal repeat (MMTV-LTR) promoter, to drive expression of the exogenous oncogene specifically in mammary gland tissue, so that the oncogene acts specifically in mammary gland tissue while the effects on the organism of high-level, widespread expression of the exogenous oncogene are avoided.
(5) The single-cell clones with homozygous knock-in of the MMTV-PyMT expression cassette obtained by the invention are used for somatic cell nuclear transfer animal cloning, so that cloned pigs with homozygous knock-in of the MMTV-PyMT expression cassette can be obtained directly, and the homozygously inserted gene can be stably inherited. The pigs can then be used in biomedical fields such as drug screening and drug efficacy evaluation, gene and cell therapy, and research on the pathogenesis of breast cancer.
In mouse model production, gene editing material is usually microinjected into fertilized eggs, followed by embryo transfer. Because the probability of directly obtaining gene knock-in offspring in this way is very low (less than 1%), and because offspring must additionally be crossed and bred to screen for homozygous knock-in individuals, this approach is not suitable for producing models in large animals with longer gestation periods (such as pigs). The invention therefore edits primary cells in vitro and screens for positively edited single-cell clones, which is technically difficult and challenging, and then obtains the corresponding model pig directly by somatic cell nuclear transfer animal cloning, greatly shortening the production cycle of the model pig and saving manpower, material and financial resources.
The PyMT model pig highly similar to the development process of the human breast cancer is obtained by gene editing and somatic cell cloning technology, is helpful for researching and disclosing the pathogenesis of the breast cancer, can be used for researching drug screening, drug effect detection, gene and cell treatment and the like, can provide effective experimental data for further clinical application, and further provides a powerful experimental means for preventing and treating the human breast cancer. The invention has great application value for the pathogenesis research of human breast cancer, the research and development of therapeutic drugs and preclinical tests.
Drawings
Embodiments of the invention are described in detail below with reference to the attached drawing figures, wherein:
FIG. 1 is a schematic diagram of the structure of plasmid pX330.
FIG. 2 is a schematic diagram of the structure of plasmid pKG-GE3.
FIG. 3 is a schematic diagram of the structure of a pU6gRNA vector.
FIG. 4 is a schematic diagram showing the insertion of a DNA molecule of about 20 bp (for transcription to form a gRNA capable of binding to a target sequence) into plasmid pKG-U6gRNA.
FIG. 5 is a schematic diagram of the structure of a fluorescent donor plasmid containing a ROSA26 insertion site.
FIG. 6 is a schematic diagram of the structure of a fluorescent donor plasmid containing an AAVS1 insertion site.
FIG. 7 is a schematic diagram of the structure of a fluorescent donor plasmid containing an H11 insertion site.
FIG. 8 is a schematic diagram of the structure of a fluorescent donor plasmid containing a COL1A1 insertion site.
FIG. 9 is a schematic diagram of the structure of the pKG-MMTV-PyMT Donor plasmid containing a COL1A1 insertion site.
FIG. 10 shows the sequencing results of the plasmid proportion optimization test.
FIG. 11 shows the sequencing results of the editing effect of plasmid pX330 and plasmid pKG-GE3.
FIG. 12 shows the green fluorescence of GFP expressed from the different safe harbor sites.
FIG. 13 shows the fluorescent quantitative PCR results of the GFP transcription level at the different safe harbor sites.
FIG. 14 shows the FACS results of GFP protein expression at the different safe harbor sites.
FIG. 15 is an electrophoretogram for identifying whether the recombination of the MMTV-PyMT expression cassette at the 5 'end of the porcine COL1A1 safe harbor insertion site was successful, wherein WT is a wild type control, Blank is a Blank control, sh4 represents the safe harbor site COL1A1, Lr represents the 5' homology arm, JDF represents the identifying primer F, JDR represents the identifying primer R, 1414 or 5965 represents the detection site information.
FIG. 16 is an electrophoretogram for identifying whether the recombination of the MMTV-PyMT expression cassette at the 3 'end of the porcine COL1A1 safe harbor insertion site was successful, wherein WT is a wild type control, Blank is a Blank control, sh4 represents the safe harbor site COL1A1, Rr represents the 3' homology arm, and 282 or 4723 represents the detection site information.
FIG. 17 is an electrophoretogram showing the identification of whether MMTV-PyMT expression cassette is homozygous inserted into the porcine COL1A1 safe harbor site, wherein WT is wild type control, Blank is Blank control, sh4 represents the safe harbor site COL1A1, JDF represents the identification primer F, JDR represents the identification primer R, 1085 or 1560 represents the detection site information.
FIG. 18 shows the fluorescent quantitative PCR results of PyMT transcript level regulated by porcine COL1A1 safe harbor locus.
FIG. 19 shows the FACS results of PyMT protein expression at the porcine COL1A1 safe harbor site, wherein WT represents mammary epithelial cells of an unmodified cloned pig, and PyMT represents mammary epithelial cells of a PyMT-expressing breast cancer model pig prepared by nuclear transfer.
Detailed Description
The present invention is described in further detail below with reference to specific embodiments, which are given for the purpose of illustration only and are not intended to limit the scope of the invention. The examples provided below serve as a guide for further modifications by a person skilled in the art and do not constitute a limitation of the invention in any way.
The experimental procedures in the following examples, unless otherwise indicated, are conventional and are carried out according to the techniques or conditions described in the literature in the field or according to the product instructions. Materials, reagents and the like used in the following examples are commercially available unless otherwise specified. The recombinant plasmids constructed in the examples were all verified by sequencing. Complete culture medium (% by volume): 15% fetal bovine serum (Gibco) + 83% DMEM medium (Gibco) + 1% Penicillin-Streptomycin (Gibco) + 1% HEPES (Solarbio). Cell culture conditions: constant-temperature incubator at 37 ℃, 5% CO2 and 5% O2.
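As a worked illustration of the volume percentages stated for the complete culture medium, the following minimal sketch computes component volumes for a chosen batch size. The function name `component_volumes` and the 500 mL batch size are assumptions for illustration only; the percentages are those given above.

```python
# Minimal sketch: component volumes for the complete culture medium.
# Percentages (by volume) are taken from the text; the batch size is hypothetical.
COMPLETE_MEDIUM_PCT = {
    "fetal bovine serum": 15,
    "DMEM": 83,
    "Penicillin-Streptomycin": 1,
    "HEPES": 1,
}

def component_volumes(total_ml):
    """Return the volume in mL of each component for a batch of total_ml."""
    if sum(COMPLETE_MEDIUM_PCT.values()) != 100:
        raise ValueError("component percentages must sum to 100")
    return {name: total_ml * pct / 100 for name, pct in COMPLETE_MEDIUM_PCT.items()}

for name, volume in component_volumes(500).items():
    print(f"{name}: {volume} mL")
```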
The method for preparing porcine primary fibroblasts comprises the following steps: ① take 0.5 g of pig ear tissue, remove hair and bone tissue, soak in 75% alcohol for 30-40 s, wash 5 times with PBS containing 5% (by volume) Penicillin-Streptomycin (Gibco), and wash once with PBS; ② cut the tissue into pieces with scissors, digest with 5 mL of 0.1% collagenase solution (Sigma) at 37 ℃ for 1 h, centrifuge at 500 g for 5 min, and discard the supernatant; ③ resuspend the pellet in 1 mL of complete culture medium, plate it into a 10 cm diameter cell culture dish that contains 10 mL of complete culture medium and has been coated with 0.2% gelatin (VWR), and culture until the cells cover about 60% of the dish bottom; ④ after step ③, digest and collect the cells with trypsin, then resuspend them in complete culture medium for the subsequent electroporation experiments.
Example 1 construction of vectors
First, construction of the Cas9 high-efficiency expression vector (pKG-GE3 for short)
The starting commercial plasmid was pX330-U6-Chimeric_BB-CBh-hSpCas9, plasmid pX330 for short, shown in SEQ ID NO: 1.
Based on plasmid pX330, the plasmid pU6gRNA-eEF1a-mNLS-hSpCas9-EGFP-PURO was constructed, plasmid pKG-GE3 for short, shown in SEQ ID NO: 2.
Both plasmid pX330 and plasmid pKG-GE3 are circular plasmids.
The structure of plasmid pX330 is shown schematically in FIG. 1. In SEQ ID NO: 1, nucleotides 440-725 constitute the CMV enhancer, nucleotides 727-1208 constitute the chicken β-actin promoter, nucleotides 1304-1324 encode the SV40 nuclear localization signal (NLS), nucleotides 1325-5449 encode the Cas9 protein, and nucleotides 5450-5497 encode the nucleoplasmin nuclear localization signal (NLS).
The structure of plasmid pKG-GE3 is shown schematically in FIG. 2. In SEQ ID NO: 2, nucleotides 395-680 constitute the CMV enhancer, nucleotides 682-890 constitute the EF1a promoter, nucleotides 986-1006 encode a nuclear localization signal (NLS), nucleotides 1016-1036 encode a nuclear localization signal (NLS), nucleotides 1037-5161 encode the Cas9 protein, nucleotides 5162-5209 encode a nuclear localization signal (NLS), nucleotides 5219-5266 encode a nuclear localization signal (NLS), nucleotides 5276-5332 encode the self-cleaving polypeptide P2A (amino acid sequence "ATNFSLLKQAGDVEENPGP"; self-cleavage occurs between the first and second amino acid residues counted from the C-terminus), nucleotides 5333-6046 encode the EGFP protein, nucleotides 6056-6109 encode the self-cleaving polypeptide T2A (amino acid sequence "EGRGSLLTCGDVEENPGP"; self-cleavage occurs between the first and second amino acid residues counted from the C-terminus), nucleotides 6110-6703 encode the Puromycin resistance protein (Puro^R protein for short), nucleotides 6722-7310 constitute the WPRE sequence element, nucleotides 7382-7615 constitute the 3' LTR sequence element, and nucleotides 7647-7871 constitute the bGH poly(A) signal sequence element. In SEQ ID NO: 2, nucleotides 911-6706 form a fusion gene that expresses the fusion protein. Owing to the self-cleaving polypeptides P2A and T2A, the fusion protein spontaneously cleaves into three separate proteins: the Cas9 protein, the EGFP protein and the Puro^R protein.
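The coordinates listed for the fusion gene of plasmid pKG-GE3 can be cross-checked arithmetically. The sketch below uses only the 1-based, inclusive nucleotide ranges stated above; the helper names `codon_count` and `in_frame` are hypothetical, and assigning the SV40 and nucleoplasmin labels to the upstream and downstream NLS pairs follows the description of the fusion protein given earlier in the specification. It checks that every coding segment spans a whole number of codons, starts in frame with nucleotide 911, and that the P2A and T2A segments match the lengths of their stated peptides.

```python
# Minimal sketch: sanity-check the pKG-GE3 fusion-gene coordinates (SEQ ID NO: 2).
# Ranges are 1-based and inclusive, as written in the text.
FUSION_GENE = (911, 6706)
CODING_SEGMENTS = {
    "SV40 NLS 1": (986, 1006),
    "SV40 NLS 2": (1016, 1036),
    "Cas9": (1037, 5161),
    "nucleoplasmin NLS 1": (5162, 5209),
    "nucleoplasmin NLS 2": (5219, 5266),
    "P2A": (5276, 5332),
    "EGFP": (5333, 6046),
    "T2A": (6056, 6109),
    "Puro resistance": (6110, 6703),
}
PEPTIDES = {"P2A": "ATNFSLLKQAGDVEENPGP", "T2A": "EGRGSLLTCGDVEENPGP"}

def codon_count(start, end):
    """Number of codons in an inclusive range; raises if not a multiple of 3."""
    length = end - start + 1
    if length % 3 != 0:
        raise ValueError(f"range {start}-{end} is not a whole number of codons")
    return length // 3

def in_frame(start, frame_start=FUSION_GENE[0]):
    """True if a segment start lies in the reading frame that begins at frame_start."""
    return (start - frame_start) % 3 == 0

for name, (start, end) in CODING_SEGMENTS.items():
    print(f"{name}: {codon_count(start, end)} codons, in frame: {in_frame(start)}")
```

A check of this kind only confirms that the stated coordinates are internally consistent; it does not replace verification against the actual sequence of SEQ ID NO: 2.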
Compared with plasmid pX330, plasmid pKG-GE3 was modified mainly as follows: ① the residual gRNA backbone sequence (GTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTTT) was removed to reduce interference; ② the original chicken β-actin promoter was replaced with the EF1a promoter, which has higher expression activity, to increase protein expression from the Cas9 gene; ③ nuclear localization signal (NLS) coding sequences were added upstream and downstream of the Cas9 gene to increase the nuclear localization of the Cas9 protein; ④ because the original plasmid carries no eukaryotic selection marker, which hampers screening and enrichment of positively transformed cells, a P2A-EGFP-T2A-PURO coding sequence was inserted downstream of the Cas9 gene to give eukaryotic cells dual fluorescence and puromycin-resistance selection markers; ⑤ a WPRE element and a 3' LTR sequence element were inserted to enhance protein translation from the Cas9 gene.
Second, construction of pKG-U6gRNA expression vector
The pKG-U6gRNA vector was constructed with pUC57 as the starting plasmid; its structure is shown schematically in FIG. 3 and its sequence in SEQ ID NO: 3. In SEQ ID NO: 3, nucleotides 2280-2539 constitute the hU6 promoter, and nucleotides 2558-2637 are transcribed to form the gRNA backbone. In use, a DNA molecule of about 20 bp (transcribed to form the target sequence binding region of the gRNA) is inserted into plasmid pKG-U6gRNA to form a recombinant plasmid, which is transcribed in the cell to give the gRNA; a schematic diagram is shown in FIG. 4.
Thirdly, constructing Donor vectors containing GFP genes at different safe harbor sites
Plasmids PB-1G 2R 3-puro-ROSA26, PB-1G 2R 3-puro-AAVS1, PB-1G 2R 3-puro-H11 and PB-1G 2R3-puro-COL1A1 were constructed.
The plasmid PB-1G 2R 3-puro-ROSA26 is shown in FIG. 5. In SEQ ID NO: 4, nucleotides 1-345 constitute the pig genomic region at the 5' end of the ROSA26 safe harbor insertion site (SH1 left arm, shown as SEQ ID NO: 5), nucleotides 9184-10195 constitute the pig genomic region at the 3' end of the ROSA26 safe harbor insertion site (SH1 right arm, shown as SEQ ID NO: 6), the 346 th and the 3132 th nucleotides 3531, 6506 and 8975 th and the 9175 th nucleotides respectively form 4 different insulator regions, nucleotides 1954-3131 constitute the EF-1α promoter, nucleotides 1216-1935 encode the EGFP protein, nucleotides 637-1209 constitute the EF-1α poly(A) signal, nucleotides 3543-4042 constitute the PGK promoter, nucleotides 4059-4769 encode the mCherry protein, the 5014791 th and the 50191 th nucleotides form a bGH 5 signal, the 4059 and the 659 th nucleotides encode mCHRx protein, nucleotides 7259-8974 constitute the pCAG promoter, and the nucleotide 69669-7233 constitutes the β-globin poly(A) signal.
The plasmid PB-1G 2R 3-puro-AAVS1 is shown in FIG. 6. It differs from SEQ ID NO: 4 only in that the SH1 left arm is replaced with the pig genomic region at the 5' end of the AAVS1 safe harbor insertion site (SH2 left arm, see SEQ ID NO: 7) and the SH1 right arm is replaced with the pig genomic region at the 3' end of the AAVS1 safe harbor insertion site (SH2 right arm, see SEQ ID NO: 8). All other sequences are identical to SEQ ID NO: 4.
The plasmid PB-1G 2R 3-puro-H11 is shown schematically in FIG. 7. It differs from SEQ ID NO: 4 only in that the SH1 left arm is replaced with the pig genomic region at the 5' end of the H11 safe harbor insertion site (SH3 left arm, see SEQ ID NO: 9) and the SH1 right arm is replaced with the pig genomic region at the 3' end of the H11 safe harbor insertion site (SH3 right arm, see SEQ ID NO: 10). All other sequences are identical to SEQ ID NO: 4.
The plasmid PB-1G 2R3-puro-COL1A1 is shown schematically in FIG. 8. It differs from SEQ ID NO: 4 only in that the SH1 left arm is replaced with the pig genomic region at the 5' end of the COL1A1 safe harbor insertion site (SH4 left arm, see SEQ ID NO: 11) and the SH1 right arm is replaced with the pig genomic region at the 3' end of the COL1A1 safe harbor insertion site (SH4 right arm, see SEQ ID NO: 12). All other sequences are identical to SEQ ID NO: 4.
Fourthly, construction of pKG-MMTV-PyMT Donor vector
The plasmid pKG-MMTV-PyMT was constructed; its structure is shown schematically in FIG. 9. In SEQ ID NO: 13, nucleotides 1-852 are the homologous sequence at the 5' end of the COL1A1 safe harbor insertion site of the pig genome, nucleotides 879-1079 are the Insulator 1 (Insulator1) sequence, nucleotides 1080-2394 are the MMTV-LTR promoter sequence (derived from the pGL4.36[luc2P/MMTV/Hygro] plasmid, sequence shown in SEQ ID NO: 15, purchased from newcastle disease biotechnology ltd, shanghai), nucleotides 2407-3672 are the coding sequence of PyMT (obtained by whole-gene synthesis; the encoded amino acid sequence is shown in SEQ ID NO: 14), nucleotides 3721-3945 are the bGH poly(A) sequence, nucleotides 4107-4436 are the SV40 promoter sequence, nucleotides 4485-5081 are the coding sequence of the Puromycin resistance protein (Puro^R protein for short), nucleotides 5261-5382 are the SV40 poly(A) sequence, nucleotides 4031-4064 and 5427-5460 are two LoxP sequences in the same orientation, nucleotides 5469-5669 are the Insulator 2 (Insulator2) sequence, and nucleotides 5690-6396 are the homologous sequence at the 3' end of the COL1A1 safe harbor insertion site of the pig genome.
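The element layout described for SEQ ID NO: 13 can be checked for internal consistency with a minimal sketch. The table below transcribes the 1-based, inclusive ranges stated above; the function names `find_overlaps` and `length` are hypothetical. The sketch verifies that no two annotated elements overlap, that both LoxP sites have the same length, and that the two LoxP sites flank the SV40 promoter, Puro^R coding sequence and SV40 poly(A).

```python
# Minimal sketch: consistency check of the pKG-MMTV-PyMT donor map (SEQ ID NO: 13).
# Ranges are 1-based and inclusive, as written in the text.
DONOR_FEATURES = [
    ("5' homology arm", 1, 852),
    ("Insulator1", 879, 1079),
    ("MMTV-LTR promoter", 1080, 2394),
    ("PyMT CDS", 2407, 3672),
    ("bGH poly(A)", 3721, 3945),
    ("LoxP upstream", 4031, 4064),
    ("SV40 promoter", 4107, 4436),
    ("Puro resistance CDS", 4485, 5081),
    ("SV40 poly(A)", 5261, 5382),
    ("LoxP downstream", 5427, 5460),
    ("Insulator2", 5469, 5669),
    ("3' homology arm", 5690, 6396),
]

def length(feature):
    """Length in nucleotides of a (name, start, end) feature."""
    _, start, end = feature
    return end - start + 1

def find_overlaps(features):
    """Return pairs of feature names whose ranges overlap."""
    ordered = sorted(features, key=lambda f: f[1])
    return [
        (prev[0], cur[0])
        for prev, cur in zip(ordered, ordered[1:])
        if cur[1] <= prev[2]
    ]

by_name = {f[0]: f for f in DONOR_FEATURES}
print("overlaps:", find_overlaps(DONOR_FEATURES))
print("LoxP lengths:", length(by_name["LoxP upstream"]), length(by_name["LoxP downstream"]))
print("PyMT CDS length:", length(by_name["PyMT CDS"]))
```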
Example 2 comparison of the Effect of plasmid pX330 and plasmid pKG-GE3
Selecting a high efficiency gRNA target located in the RAG1 gene:
RAG1-gRNA4 target: 5'-AGTTATGGCAGAACTCAGTG-3' (SEQ ID NO: 16).
The primers used to amplify the fragment containing the target were as follows:
RAG1-nF126:5’-CCCCATCCAAAGTTTTTAAAGGA-3’(SEQ ID NO:17);
RAG1-nR525:5’-TGTGGCAGATGTCACAGTTTAGG-3’(SEQ ID NO:18)。
porcine primary fibroblasts were prepared from ear tissue of newborn Jiangxiang pigs (female, blood group AO).
First, construction of the gRNA recombinant plasmid for the RAG1 gene
The plasmid pKG-U6gRNA was digested with the restriction enzyme BbsI, and the vector backbone (a linear large fragment of approximately 3 kb) was recovered. RAG1-4S and RAG1-4A were synthesized separately, then mixed and annealed to give a double-stranded DNA molecule with sticky ends. The double-stranded DNA molecule with sticky ends was ligated to the vector backbone to give the plasmid pKG-U6gRNA (RAG1-gRNA4).
RAG1-4S:5’-caccgAGTTATGGCAGAACTCAGTG-3’(SEQ ID NO:19);
RAG1-4A:5’-aaacCACTGAGTTCTGCCATAACTc-3’(SEQ ID NO:20)。
RAG1-4S and RAG1-4A are both single stranded DNA molecules.
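The relationship between the RAG1-gRNA4 target and the two oligonucleotides can be sketched as follows. The lowercase flanking bases ("caccg", "aaac" and the trailing "c") are copied from SEQ ID NO: 19 and SEQ ID NO: 20; treating them as a general design rule for other targets is an assumption, and the function names are hypothetical.

```python
# Minimal sketch: derive the sense and antisense oligos from a 20-nt target.
# The lowercase flanks are copied from RAG1-4S / RAG1-4A in the text; applying
# the same pattern to other targets is an assumption.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Reverse complement of an uppercase DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]

def design_oligos(target):
    """Return (sense, antisense) oligos that anneal with 4-nt overhangs."""
    sense = "caccg" + target
    antisense = "aaac" + reverse_complement(target) + "c"
    return sense, antisense

sense, antisense = design_oligos("AGTTATGGCAGAACTCAGTG")
print("RAG1-4S:", sense)
print("RAG1-4A:", antisense)
```

When the two oligos anneal, the "cacc" and "aaac" ends remain single-stranded, which is consistent with the sticky-ended double-stranded molecule described for ligation into the BbsI-digested backbone.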
Second, plasmid proportion optimization
1. Plasmid co-transfected porcine primary fibroblast
First group: plasmid pKG-U6gRNA (RAG1-gRNA4) and plasmid pKG-GE3 were co-transfected into porcine primary fibroblasts. Proportioning: about 200,000 porcine primary fibroblasts : 0.44 μg plasmid pKG-U6gRNA (RAG1-gRNA4) : 1.56 μg plasmid pKG-GE3, i.e., the molar ratio of plasmid pKG-U6gRNA (RAG1-gRNA4) to plasmid pKG-GE3 is 1:1.
Second group: plasmid pKG-U6gRNA (RAG1-gRNA4) and plasmid pKG-GE3 were co-transfected into porcine primary fibroblasts. Proportioning: about 200,000 porcine primary fibroblasts : 0.72 μg plasmid pKG-U6gRNA (RAG1-gRNA4) : 1.28 μg plasmid pKG-GE3, i.e., the molar ratio of plasmid pKG-U6gRNA (RAG1-gRNA4) to plasmid pKG-GE3 is 2:1.
Third group: plasmid pKG-U6gRNA (RAG1-gRNA4) and plasmid pKG-GE3 were co-transfected into porcine primary fibroblasts. Proportioning: about 200,000 porcine primary fibroblasts : 0.92 μg plasmid pKG-U6gRNA (RAG1-gRNA4) : 1.08 μg plasmid pKG-GE3, i.e., the molar ratio of plasmid pKG-U6gRNA (RAG1-gRNA4) to plasmid pKG-GE3 is 3:1.
Fourth group: plasmid pKG-U6gRNA (RAG1-gRNA4) alone was transfected into porcine primary fibroblasts. Proportioning: about 200,000 porcine primary fibroblasts : μg of plasmid pKG-U6gRNA (RAG1-gRNA4).
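The relationship between the stated masses and molar ratios in the first three groups can be reproduced with a short calculation. This is a sketch under two assumptions that the text does not state explicitly: the total plasmid mass is 2 μg per group (the sum of the two masses in each group), and the mass per mole of the gRNA plasmid relative to the Cas9 plasmid is inferred from the 1:1 group as 0.44/1.56 rather than taken from actual plasmid lengths. The function name `mass_split` is hypothetical.

```python
# Minimal sketch: split a fixed total DNA mass between two plasmids for a
# target molar ratio. SIZE_RATIO is inferred from the 1:1 group in the text
# (0.44 ug : 1.56 ug), not from actual plasmid lengths.
TOTAL_UG = 2.0
SIZE_RATIO = 0.44 / 1.56  # gRNA plasmid mass per mole relative to Cas9 plasmid

def mass_split(molar_ratio, total_ug=TOTAL_UG, size_ratio=SIZE_RATIO):
    """Return (gRNA plasmid ug, Cas9 plasmid ug) for a gRNA:Cas9 molar ratio of molar_ratio:1."""
    grna_share = molar_ratio * size_ratio
    grna_ug = total_ug * grna_share / (grna_share + 1)
    return round(grna_ug, 2), round(total_ug - grna_ug, 2)

for ratio in (1, 2, 3):
    print(f"{ratio}:1 ->", mass_split(ratio))
```

With these assumptions, the calculation gives the same masses as listed for the 2:1 and 3:1 groups, which suggests the three proportionings were derived from one fixed size relationship between the two plasmids.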
Co-transfection was performed by electroporation using a mammalian nuclear transfection kit (Neon kit, Thermo Fisher) and a Neon™ transfection system electroporator (parameters: 1450 V, 10 ms, 3 pulses).
2. After the completion of step 1, the culture is carried out for 16 to 18 hours by using the complete culture solution, and then the culture is carried out by replacing with a new complete culture solution. The total time of incubation was 48 hours.
3. After completion of step 2, cells were trypsinized and collected, genomic DNA was extracted, PCR amplified using a primer pair consisting of RAG1-nF126 and RAG1-nR525, and then subjected to electrophoresis.
After electrophoresis, the band of interest was recovered and sequenced, and the sequencing results are shown in FIG. 10.
The sequencing chromatograms were analyzed with the Synthego ICE tool to obtain the editing efficiency of each group. The gene editing efficiencies of the first to third groups were 9%, 53% and 66%, respectively. No gene editing occurred in the fourth group. The results show that the third group had the highest editing efficiency; the optimal molar ratio of a single gRNA plasmid to the Cas9 plasmid was therefore determined to be 3:1, corresponding to actual plasmid amounts of 0.92 μg : 1.08 μg.
Thirdly, the effect comparison of plasmid pX330 and plasmid pKG-GE3
1. Cotransfection
RAG1-B group: plasmid pKG-U6gRNA (RAG1-gRNA4) alone was transfected into porcine primary fibroblasts. Proportioning: about 200,000 porcine primary fibroblasts : 0.92 μg plasmid pKG-U6gRNA (RAG1-gRNA4).
RAG1-330 group: plasmid pKG-U6gRNA (RAG1-gRNA4) and plasmid pX330 were co-transfected into porcine primary fibroblasts. Proportioning: about 200,000 porcine primary fibroblasts : 0.92 μg plasmid pKG-U6gRNA (RAG1-gRNA4) : 1.08 μg plasmid pX330, i.e., the molar ratio of the two DNAs is 3:1.
RAG1-KG group: plasmid pKG-U6gRNA (RAG1-gRNA4) and plasmid pKG-GE3 were co-transfected into porcine primary fibroblasts. Proportioning: about 200,000 porcine primary fibroblasts : 0.92 μg plasmid pKG-U6gRNA (RAG1-gRNA4) : μg of plasmid pKG-GE3, i.e., the molar ratio of the two DNAs is 3:1.
Co-transfection was performed by electroporation using a mammalian nuclear transfection kit (Neon kit, Thermo Fisher) and a Neon™ transfection system electroporator (parameters: 1450 V, 10 ms, 3 pulses).
2. After step 1, the culture is carried out for 16 to 18 hours by using the complete culture solution, and then the culture is carried out by replacing the complete culture solution with a new one. The total time of incubation was 48 hours.
3. After completion of step 2, cells were trypsinized and harvested, genomic DNA was extracted, PCR amplified using a primer pair consisting of RAG1-nF126 and RAG1-nR525, and the products were sequenced.
The sequencing chromatograms were analyzed with the Synthego ICE tool to obtain the editing efficiency of each group. No gene editing occurred in the RAG1-B group. The editing efficiencies of the RAG1-330 group and the RAG1-KG group were 28% and 68%, respectively. Exemplary chromatograms of the sequencing results are shown in FIG. 11. The results show that plasmid pKG-GE3 gave a markedly higher gene editing efficiency than plasmid pX330.
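The size of the improvement can be expressed as a relative increase over the pX330 result. The sketch below uses only the two efficiencies reported for the RAG1-330 and RAG1-KG groups; the function name `relative_improvement` is hypothetical.

```python
# Minimal sketch: relative improvement in editing efficiency, in percent.
def relative_improvement(baseline_pct, new_pct):
    """Percentage increase of new_pct over baseline_pct."""
    return (new_pct - baseline_pct) / baseline_pct * 100

improvement = relative_improvement(28, 68)  # RAG1-330 group vs RAG1-KG group
print(f"relative improvement: {improvement:.0f}%")
```

The result, roughly 143%, is consistent with the statement among the beneficial effects that editing efficiency is improved by more than 100% compared with the original vector.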
Example 3 selection of optimal safe harbor site for site-directed insertion of foreign Gene into pig genome
Construction of pig genome ROSA26, AAVS1, H11 and COL1A1 safety harbor site gRNA recombinant vector and high-efficiency cutting target screening
Through early screening, the efficient cutting targets of ROSA26, H11, AAVS1 and COL1A1 safety harbor sites are sgRNA respectively ROSA26-g3 (cleavage efficiency 38%), sgRNA AAVS1-g4 (cleavage efficiency 30%), sgRNA H11-g1 (cleavage efficiency 60%), sgRNA COL1A1-g3 (cleavage efficiency 56%) the target sequences were as follows:
sgRNA ROSA26-g3 target: 5'-GAAGGAGCAAACTGACATGG-3' (SEQ ID NO: 21);
sgRNA AAVS1-g4 target: 5'-TGCAGTGGGTCTTTGGGGAC-3' (SEQ ID NO: 22);
sgRNA H11-g1 target: 5'-TTCCAGGAACATAAGAAAGT-3' (SEQ ID NO: 23);
sgRNA COL1A1-g3 target: 5'-GCAGTCTCAGCAACCACTGA-3' (SEQ ID NO: 24).
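The four target sequences listed above can be sanity-checked before ordering oligos. The following is a minimal sketch, not part of the patented method: it confirms that each target is a 20-nt DNA string and reports its GC fraction. The function names and the 20-nt expectation are illustrative assumptions; only the sequences themselves come from SEQ ID NO: 21 to 24.

```python
# Target sequences copied from SEQ ID NO: 21-24 in the text above.
TARGETS = {
    "ROSA26-g3": "GAAGGAGCAAACTGACATGG",
    "AAVS1-g4": "TGCAGTGGGTCTTTGGGGAC",
    "H11-g1": "TTCCAGGAACATAAGAAAGT",
    "COL1A1-g3": "GCAGTCTCAGCAACCACTGA",
}


def is_valid_target(seq: str, expected_len: int = 20) -> bool:
    """Return True if seq has the expected length and contains only A/C/G/T.

    The 20-nt expectation is an assumption based on the listed targets.
    """
    seq = seq.upper()
    return len(seq) == expected_len and set(seq) <= set("ACGT")


def gc_fraction(seq: str) -> float:
    """Return the fraction of G and C bases in seq (0.0 to 1.0)."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


if __name__ == "__main__":
    for name, seq in TARGETS.items():
        print(f"{name}: valid={is_valid_target(seq)}, GC={gc_fraction(seq):.2f}")
```

GC fraction is only a rough design indicator; the cleavage efficiencies reported above were measured experimentally and cannot be derived from this check.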
The gRNA plasmids corresponding to the four gRNA targets are pKG-U6gRNA (ROSA26-g3), pKG-U6gRNA (AAVS1-g4), pKG-U6gRNA (H11-g1) and pKG-U6gRNA (COL1A1-g3). All use pKG-U6gRNA (SEQ ID NO: 3) as the backbone vector, and the plasmid construction method is the same as in Example 2.
Second, co-electroporation of porcine primary fibroblasts with a fluorescent donor vector (a vector carrying the foreign gene GFP flanked by the homology arms of the insertion site of a given safe harbor), the sgRNA vector and the Cas9 vector (pKG-GE3 prepared in Example 1)
For each safe harbor insertion site, the corresponding PB-1G 2R 3-puro fluorescent donor vector, the corresponding high-efficiency sgRNA vector and the high-efficiency Cas9 vector were co-transfected into porcine primary fibroblasts. Electroporation was carried out using a mammalian nuclear transfection kit (Neon kit, Thermo Fisher) and the Neon™ transfection system electroporator (parameters: 1450 V, 10 ms, 3 pulses).
Co-transfection plasmid combinations and ratios:
First group: plasmid PB-1G 2R 3-puro-ROSA26, plasmid pKG-U6gRNA (ROSA26-g3) and plasmid pKG-GE3 were co-transfected into porcine primary fibroblasts. Ratio: about 20 million porcine primary fibroblasts : 1.26 μg plasmid PB-1G 2R 3-puro-ROSA26 : 0.82 μg plasmid pKG-U6gRNA (ROSA26-g3) : 0.92 μg plasmid pKG-GE3, i.e., the molar ratio of the three DNAs was 1:3:1.
Second group: plasmid PB-1G 2R 3-puro-AAVS1, plasmid pKG-U6gRNA (AAVS1-g4) and plasmid pKG-GE3 were co-transfected into porcine primary fibroblasts. Ratio: about 20 million porcine primary fibroblasts : 1.26 μg plasmid PB-1G 2R 3-puro-AAVS1 : 0.82 μg plasmid pKG-U6gRNA (AAVS1-g4) : 0.92 μg plasmid pKG-GE3, i.e., the molar ratio of the three DNAs was 1:3:1.
Third group: plasmid PB-1G 2R 3-puro-H11, plasmid pKG-U6gRNA (H11-g1) and plasmid pKG-GE3 were co-transfected into porcine primary fibroblasts. Ratio: about 20 million porcine primary fibroblasts : 1.26 μg plasmid PB-1G 2R 3-puro-H11 : 0.82 μg plasmid pKG-U6gRNA (H11-g1) : 0.92 μg plasmid pKG-GE3, i.e., the molar ratio of the three DNAs was 1:3:1.
Fourth group: plasmid PB-1G 2R3-puro-COL1A1, plasmid pKG-U6gRNA (COL1A1-g3) and plasmid pKG-GE3 were co-transfected into porcine primary fibroblasts. Ratio: about 20 million porcine primary fibroblasts : 1.26 μg plasmid PB-1G 2R3-puro-COL1A1 : 0.82 μg plasmid pKG-U6gRNA (COL1A1-g3) : 0.92 μg plasmid pKG-GE3, i.e., the molar ratio of the three DNAs was 1:3:1.
Fifth group: porcine primary fibroblasts were electroporated without any plasmid, using the same electroporation parameters.
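The groups above state plasmid masses together with a molar ratio. The relationship between the two can be sketched as follows, assuming the common approximation of about 650 g/mol per base pair of double-stranded DNA. The plasmid lengths in the example are hypothetical placeholders chosen only for illustration; they are not given in this section and should be replaced with the actual vector sizes.

```python
# Approximate average molar mass of one base pair of dsDNA (g/mol).
# This is a widely used rule of thumb, not a value from the patent.
AVG_BP_MASS = 650.0


def plasmid_pmol(mass_ug: float, length_bp: int) -> float:
    """Convert a plasmid mass in micrograms to picomoles."""
    # ug -> g is 1e-6, mol -> pmol is 1e12, so the net factor is 1e6.
    return mass_ug * 1e6 / (length_bp * AVG_BP_MASS)


def molar_ratio(components: dict) -> dict:
    """Return molar amounts normalized to the least abundant component.

    components maps a name to a (mass_ug, length_bp) tuple.
    """
    pmol = {name: plasmid_pmol(m, n) for name, (m, n) in components.items()}
    smallest = min(pmol.values())
    return {name: value / smallest for name, value in pmol.items()}


if __name__ == "__main__":
    # Masses are from the first group; lengths are HYPOTHETICAL examples.
    example = {
        "donor": (1.26, 14300),
        "gRNA": (0.82, 3100),
        "Cas9": (0.92, 10450),
    }
    for name, ratio in molar_ratio(example).items():
        print(f"{name}: {ratio:.2f}")
```

Because the ratio depends only on mass divided by length, the constant per-base-pair mass cancels out; it matters only when absolute picomole amounts are needed.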
The specific procedure is as follows:
Cells: before electroporation, porcine primary fibroblasts were grown to 60% confluence, digested with 0.25% trypsin and counted after trypan blue staining; equal numbers of cells were used for each of the five electroporation groups.
Electroporation of porcine primary cells:
(1) digest the cells with trypsin, wash the resulting cell suspension once with PBS (Solarbio), centrifuge at 600 g for 6 min, discard the supernatant, and resuspend the cells in 58 μL of electroporation base solution R buffer (11 μL per sample), avoiding bubbles during resuspension;
(2) take 10 μL of the cell suspension, mix it with the plasmid electroporation mixture, and take care to avoid bubbles during mixing;
(3) place the electroporation cuvette from the kit in the cuvette holder of the Neon™ transfection system electroporator and add 3 mL of Buffer E;
(4) draw up 10 μL of the mixture from step (2) with the electroporation pipette, insert it into the cuvette and select the electroporation program (1450 V, 10 ms, 3 pulses); immediately after electroporation, transfer the mixture from the electroporation pipette into a 6-well plate in which each well contains 3 mL of complete culture medium (15% fetal bovine serum (Gibco) + 83% DMEM medium (Gibco) + 1% P/S (Gibco Penicillin-Streptomycin) + 1% HEPES (Solarbio));
(5) mix well and culture in a constant-temperature incubator at 37 °C, 5% CO2 and 5% O2;
(6) change the medium 12-24 h after electroporation; 48 h after electroporation, apply puromycin selection to screen for positive cells.
Third, puromycin selection and detection of cellular GFP fluorescence intensity
48 hours after plasmid electroporation, puromycin was added at 1.5 μg/mL for selection. The medium was replaced every two days with medium containing the same puromycin concentration, and GFP green fluorescence was photographed at the same time. Selection was carried out for two weeks and then continued for one further week so that the plasmids in the cells were completely degraded. The efficiency with which each safe harbor site expresses the foreign gene was judged from the strength of GFP fluorescence.
After one week of puromycin selection, the fluorescence of the ROSA26 and COL1A1 safe harbor site groups was clearly stronger than that of the AAVS1 and H11 groups. After two weeks, the fluorescence intensity ranked from strong to weak as COL1A1, ROSA26, H11 and AAVS1: the H11 group was not very uniform, the ROSA26 group was relatively uniform and relatively strong, the AAVS1 group was the weakest, and the COL1A1 group had the most fluorescent cells and the strongest fluorescence. After three weeks of continuous selection, the ranking from strong to weak was COL1A1 > ROSA26 > H11 > AAVS1; the results are shown in FIG. 12.
Fourth, detection of GFP gene transcript level
mRNA transcript levels were compared after integration of the GFP gene at the four safe harbor sites in order to determine whether the integration site is involved in regulating GFP expression and affects its expression level. A primer pair was designed within the exon of the GFP gene. Cells were collected after three weeks of puromycin selection, total RNA was extracted and reverse transcribed into cDNA, and the transcript level was measured in primary cells carrying the GFP gene at each of the four safe harbor sites, with wild-type primary cells as the control. GAPDH was used as the reference gene, and the calculation followed the 2^(-ΔCt) method.
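The 2^(-ΔCt) calculation named above can be sketched as follows, where ΔCt is the Ct of the target gene (here GFP) minus the Ct of the reference gene (here GAPDH) in the same sample. This is a minimal illustration only: the function names and the Ct values in the example are hypothetical and are not data from this document.

```python
from statistics import mean, stdev


def relative_expression(ct_target: float, ct_reference: float) -> float:
    """Return 2^(-dCt), where dCt = Ct(target) - Ct(reference)."""
    delta_ct = ct_target - ct_reference
    return 2.0 ** (-delta_ct)


def summarize(replicates: list) -> tuple:
    """Return (mean, sample standard deviation) of 2^(-dCt) over replicates.

    replicates is a list of (ct_target, ct_reference) pairs; at least two
    pairs are needed for the standard deviation.
    """
    values = [relative_expression(t, r) for t, r in replicates]
    return mean(values), stdev(values)


if __name__ == "__main__":
    # HYPOTHETICAL Ct values for one group (GFP Ct, GAPDH Ct).
    example = [(22.0, 18.0), (22.5, 18.3), (21.8, 18.1)]
    avg, sd = summarize(example)
    print(f"2^(-dCt) = {avg:.4f} +/- {sd:.4f}")
```

A lower target Ct relative to the reference gives a larger 2^(-ΔCt), so groups can be compared directly on this scale, as in Table 6.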
(1) Primer information (Table 1)
Table 1: fluorescent quantitative PCR primer information
Figure BDA0002943886490000121
(2) Total RNA extraction from cells
Total RNA was extracted from the cells according to the instructions of the Simply P Total RNA Extraction Kit (BioFlux).
(3) First-strand cDNA synthesis
First-strand cDNA was synthesized with the Vazyme reverse transcription kit HiScript II 1st Strand cDNA Synthesis Kit (R211-01/02). The specific steps and procedures are as follows:
1) Preparation of the first-strand cDNA synthesis reaction mixture
The mixture in Table 2 was prepared in an RNase-free centrifuge tube.
TABLE 2
Figure BDA0002943886490000122
Mix gently by pipetting.
2) The first-strand cDNA synthesis reaction was carried out under the conditions shown in Table 3.
TABLE 3
Figure BDA0002943886490000131
The product was used immediately for the qPCR reaction or stored at -80 °C, avoiding repeated freeze-thaw cycles.
(4) Fluorescent quantitative PCR
Real-time fluorescent quantitative PCR was used to detect the expression level of GFP inserted at the four safe harbor sites (ROSA26, AAVS1, H11, COL1A1) in porcine primary fibroblasts, with GAPDH as the internal reference gene. The operation steps and procedures are as follows:
1) The reaction system was prepared as shown in Table 4.
TABLE 4
Figure BDA0002943886490000132
2) qPCR reaction procedure is given in Table 5 below
TABLE 5
Figure BDA0002943886490000133
3) Statistics and analysis
Data were analyzed with SPSS statistical software and expressed as mean ± standard deviation; statistical analysis was performed by two-way analysis of variance. The 2^(-ΔCt) results showed that, after three weeks of puromycin selection, GFP expression was low in the AAVS1 and H11 groups and high in the ROSA26 and COL1A1 groups, and that the differences in GFP transcript level between the COL1A1 and ROSA26 groups and the AAVS1 and H11 groups were highly significant (P<0.01). The 2^(-ΔCt) values are shown in Table 6, and the results of the significance analysis are shown in FIG. 13.
Table 6: 2^(-ΔCt) values
Figure BDA0002943886490000134
Figure BDA0002943886490000141
From the real-time fluorescent quantitative PCR results for the GFP gene and the fluorescence signal intensity after three weeks of cell culture, it was concluded that, among the four genomic safe harbor sites ROSA26, AAVS1, H11 and COL1A1, the COL1A1 site gave the best expression of an inserted foreign gene.
Fifth, FACS detection of GFP protein expression level
To compare GFP protein expression after integration of the GFP gene at the four safe harbor sites, cells were digested with trypsin and centrifuged at 400 g for 4 min, and the supernatant was discarded. The cells were resuspended in 1 mL of medium, and each cell suspension was transferred to a flow tube. The GFP signal was detected in the FITC channel of a BD FACSMelody flow cytometer, with wild-type cells as the negative control; 5×10^4 cells were collected per sample for analysis, and the results are shown in FIG. 14. The GFP fluorescence signal ranked COL1A1 > ROSA26 > H11 > AAVS1.
Taken together, these results show that, among the four safe harbor sites ROSA26, AAVS1, H11 and COL1A1, the COL1A1 site is the safe harbor site that expresses a foreign gene most efficiently in porcine primary cells.
Example 4 Preparation of single-cell clones with the MMTV-PyMT expression cassette inserted site-specifically at the porcine COL1A1 safe harbor site
First, co-transfection
Plasmid pKG-U6gRNA (COL1A1-g3), plasmid pKG-GE3 and plasmid pKG-MMTV-PyMT (shown in SEQ ID NO: 13) were co-transfected into porcine primary fibroblasts. Ratio: about 20 million porcine primary fibroblasts : 0.89 μg plasmid pKG-U6gRNA (COL1A1-g3) : 0.99 μg plasmid pKG-GE3 : 1.12 μg plasmid pKG-MMTV-PyMT, i.e., the molar ratio of the three DNAs was 3:1:1.
Co-transfection was performed by electroporation using a mammalian nuclear transfection kit (Neon kit, Thermo Fisher) and the Neon™ transfection system electroporator (parameters: 1450 V, 10 ms, 3 pulses).
Second, puromycin selection
1. Puromycin selection of cells positive for insertion of the MMTV-PyMT expression cassette
48 hours after plasmid electroporation, 1.5 μg/mL puromycin was added for selection, and the medium was replaced every day with medium containing the same puromycin concentration. After one week of continuous selection, all cells in the wild-type control wells had died, and the cells electroporated with the pKG-MMTV-PyMT plasmid had also died in large numbers because of the low electroporation efficiency. During a second week of puromycin selection, only sporadic cell death occurred, some positive clones began to divide and proliferate, and the cell number increased steadily. Selection was continued for a third week so that the intracellular plasmids were completely degraded and false-positive cell clones were eliminated. After three weeks of selection, puromycin was withdrawn; once the cells had recovered, they were cultured for two passages and then subjected to monoclonal sorting.
2. Monoclonal sorting and expansion culture
(1) Cells that had undergone three weeks of puromycin selection were taken for monoclonal sorting: they were digested with trypsin, neutralized with complete medium and centrifuged at 500 g for 5 min; the supernatant was discarded, and the pellet was resuspended in 1 mL of complete medium and diluted appropriately. Single cells were picked with a mouth pipette and transferred into a 96-well plate containing 100 μL of complete medium per well, one cell per well, filling one 96-well plate. The plate was cultured at 37 °C, 5% CO2 and 5% O2, and the cell culture medium (1.5% puromycin) was changed every 2-3 days; during this period the growth of cells in each well was observed under a microscope, and wells without cells or with non-single-cell clones were excluded;
(2) after the cells in a well of the 96-well plate had grown to cover the bottom of the well (about 2 weeks), they were digested with trypsin and collected; 2/3 of the cells were seeded into a 6-well plate containing complete medium, and the remaining 1/3 were collected in a 1.5 mL centrifuge tube for subsequent genotyping;
(3) when the cells in the 6-well plate had grown to 50% confluence, they were digested with 0.25% trypsin (Gibco), collected, and frozen in cell cryopreservation medium (90% complete medium + 10% DMSO, by volume).
Third, genome-level identification of single-cell clones with the MMTV-PyMT expression cassette inserted site-specifically at the porcine COL1A1 safe harbor site
To test whether the MMTV-PyMT expression cassette had been inserted site-specifically at the porcine COL1A1 safe harbor site, single-cell clones obtained after puromycin selection were taken, genomic DNA was extracted, PCR amplification was performed (with the primer pair sh4-Lr-JDF1414 and sh4-Lr-JDR5965, the primer pair sh4-Rr-JDF282 and sh4-Rr-JDR4723, and the primer pair sh4-wt-JDF1085 and sh4-wt-JDR1560, respectively), and the products were analyzed by electrophoresis. Porcine primary adipose stem cells were used as the wild-type control. The primer pair sh4-Lr-JDF1414 and sh4-Lr-JDR5965 identifies whether recombination succeeded at the 5' end of the MMTV-PyMT expression cassette at the porcine COL1A1 safe harbor insertion site; the primer pair sh4-Rr-JDF282 and sh4-Rr-JDR4723 identifies whether recombination succeeded at the 3' end; and the primer pair sh4-wt-JDF1085 and sh4-wt-JDR1560 identifies whether the insertion of the MMTV-PyMT expression cassette at the porcine COL1A1 safe harbor site is homozygous or heterozygous.
sh4-Lr-JDF1414:CCTGCTGTAAGTGCCGTAGT(SEQ ID NO:29)
sh4-Lr-JDR5965:CTAGGGGCACAGCACGTC(SEQ ID NO:30)
sh4-Rr-JDF282:AAGTTATTAGGTCTGAAGAGGAGTTT(SEQ ID NO:31)
sh4-Rr-JDR4723:CCCATCATTCCGTCCCAGAG(SEQ ID NO:32)
sh4-wt-JDF1085:TGCTGAGTTCTGGCTTCCTG(SEQ ID NO:33)
sh4-wt-JDR1560:TCTACCAAGAGAGTGACCAGCAG(SEQ ID NO:34)
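The way the three primer pairs above are combined into a genotype call can be sketched as a small decision function. This is an illustrative reading, not the patent's stated rule: it assumes that the wild-type primer pair yields a band only when an unmodified allele is still present, so that a clone with both junction bands and no wild-type band is called homozygous. The function name and the result labels are hypothetical.

```python
def classify_clone(left_junction: bool, right_junction: bool,
                   wild_type_band: bool) -> str:
    """Call a genotype from the presence or absence of three PCR bands.

    left_junction:  band from sh4-Lr-JDF1414 / sh4-Lr-JDR5965 (5' end)
    right_junction: band from sh4-Rr-JDF282 / sh4-Rr-JDR4723 (3' end)
    wild_type_band: band from sh4-wt-JDF1085 / sh4-wt-JDR1560
    ASSUMPTION: the wild-type band appears only if an unmodified allele remains.
    """
    if left_junction and right_junction:
        # Both junctions recombined; the wild-type band decides zygosity.
        return "heterozygous insertion" if wild_type_band else "homozygous insertion"
    if left_junction or right_junction:
        # Only one junction amplified: recombination is not confirmed.
        return "incomplete recombination"
    if wild_type_band:
        return "wild type"
    # No band at all usually points to a failed PCR or poor template.
    return "inconclusive"


if __name__ == "__main__":
    print(classify_clone(True, True, False))
    print(classify_clone(True, True, True))
    print(classify_clone(False, False, True))
```

Calls made this way are preliminary, as in the text; junction PCR products would normally be confirmed by sequencing.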
The electropherograms are shown in FIG. 15, FIG. 16 and FIG. 17, respectively. From the electrophoresis results, the single-cell clones numbered 1 to 20 were preliminarily identified as clones with successful site-directed insertion of MMTV-PyMT at the porcine COL1A1 safe harbor site; clones 6 and 10 carry a homozygous site-directed insertion, and clones 1, 2, 3, 4, 5, 7, 8, 9, 11, 12, 13, 14, 15, 16, 17, 18, 19 and 20 carry a heterozygous site-directed insertion (Table 7).
Table 7: Genotypes of single-cell clones with the MMTV-PyMT expression cassette inserted site-specifically at the porcine COL1A1 safe harbor site
Figure BDA0002943886490000151
Figure BDA0002943886490000161
Fourth, production of cloned breast cancer model pigs by somatic cell nuclear transfer
1. In vitro maturation of oocytes
Pig ovaries were first collected from a slaughterhouse, and oocytes were harvested and cultured for in vitro maturation (IVM). Cumulus-oocyte complexes (COCs) were collected from follicles 3-6 mm in diameter. COCs with at least three layers of compact cumulus cells were selected, and approximately 300-400 COCs were cultured in four-well plates, each well containing 200 μL of IVM medium and approximately 50 COCs. The plates containing the COCs were cultured for 42-44 hours in an incubator at 38.5 °C, 5% CO2 and saturated humidity.
2. Somatic cell nuclear transfer (SCNT) and embryo transfer
The SCNT procedure was as follows. The cultured COCs were treated with 0.1% (w/v) hyaluronidase to remove the cumulus cells. The first polar body in the perivitelline space and the adjacent cytoplasm (containing the oocyte nucleus) were removed by gentle aspiration with a beveled glass needle in TLH-PVA solution. A positive fibroblast with homozygous insertion of MMTV-PyMT was injected into the perivitelline space of the enucleated oocyte. The reconstructed embryos were fused in fusion medium (0.25 M D-sorbitol, 0.05 mM Mg(C2H3O2)2, 20 mg/mL BSA and 0.5 mM HEPES [acid-free]) with a single DC pulse of 200 V/mm for 20 μs using an electrofusion apparatus (LF201, NEPA Gene Co., Ltd., Japan). The embryos were then cultured in PZM-3 for 0.5-1 h and activated in 0.25 M D-sorbitol, 0.01 mM Ca(C2H3O2)2, 0.05 mM Mg(C2H3O2)2 and 0.1 mg/mL BSA with a single pulse of 150 V/mm for 100 ms. The embryos were equilibrated for 2 h in PZM-3 solution containing 5 μg/mL cytochalasin B at 38.5 °C, 5% CO2, 5% O2 and 90% N2, and then cultured in PZM-3 medium under the same conditions until embryo transfer.
The SCNT embryos were transferred into the oviducts of recipient sows. About 23 days after embryo transfer, pregnancy was confirmed with an ultrasonic scanner (HS-101V, Kyooka Honda electronic Co., Ltd., Japan), and cloned piglets were delivered at days 116-117. Four successfully pregnant sows produced 7 breast cancer model pigs.
Fifth, detection of PyMT gene transcript level in the breast cancer model pigs
To test whether model pigs with the MMTV-PyMT expression cassette inserted site-specifically at the COL1A1 safe harbor site express PyMT mRNA, a primer pair was designed within the MMTV-PyMT expression cassette. Mammary tissue was isolated from PyMT model cloned pigs and from unmodified control cloned pigs (same cell source), total RNA was extracted and reverse transcribed into cDNA, and the cDNA was used to measure the mRNA expression level of the PyMT gene in porcine mammary cells, with mammary cells of the unmodified cloned pigs (referred to as WT cells) as the control. β-actin was used as the reference gene, and the calculation followed the 2^(-ΔCt) method. The detailed procedure was as described in Example 3 (Fourth, detection of GFP gene transcript level).
Primer information is shown in table 8:
TABLE 8 fluorescent quantitative PCR primer information
Figure BDA0002943886490000162
Data were analyzed with SPSS statistical software and expressed as mean ± standard deviation; statistical analysis was performed by one-way analysis of variance. The 2^(-ΔCt) results show that the PyMT gene expression level in mammary cells of the modified pigs was significantly higher than that in mammary cells of the unmodified cloned pigs (FIG. 18).
In conclusion, according to the real-time fluorescent quantitative PCR results, the PyMT gene was significantly expressed in mammary cells of the modified breast cancer model pigs.
Sixth, FACS detection of PyMT protein expression level in the breast cancer model pigs
To compare PyMT expression in modified and unmodified porcine mammary cells, mammary tissue was isolated from the model pigs and the control pigs and placed in digestion solution containing 250 U/mL collagenase I, 150 U/mL hyaluronidase, 1% penicillin and 1% streptomycin. After digestion with shaking at 37 °C for 1 h, the digest was neutralized with an equal volume of complete medium (DMEM/F12 cell culture medium containing 10% FBS) and passed successively through a 100-μm and a 40-μm cell strainer to obtain single mammary epithelial cells. The cells were washed with PBS and centrifuged, and the supernatant was discarded. The cells were fully resuspended in 90% methanol pre-cooled to -20 °C and fixed for 20 min. After fixation, the cells were centrifuged and the fixative was discarded. The cells were blocked with 3% BSA for 1 h, then centrifuged, and the blocking solution was discarded. After washing with complete medium, the cells were resuspended in a dilution of the specific PyMT antibody (Abcam, ab15085; final dilution 1:200) and incubated at room temperature for 2 h. After the antibody incubation and thorough washing with complete medium, goat anti-rat secondary antibody (Abcam, ab150157) diluted in the medium to a final dilution of 1:1000 was added and incubated at room temperature for 1 h. The cells were then washed with complete medium and resuspended in 500 μL of complete medium, and the cell suspension was transferred to a flow tube. The PyMT antibody fluorescence signal was detected in the FITC channel of a BD FACSMelody flow cytometer; 5×10^4 cells were collected for analysis, and the results are shown in FIG. 19.
The results show that the PyMT antibody fluorescence signal was clearly detected in mammary epithelial cells with MMTV-PyMT inserted site-specifically at the porcine COL1A1 safe harbor site (PyMT), whereas no PyMT antibody fluorescence signal was detected in wild-type mammary epithelial cells (WT) of the control pigs. This indicates that the inserted PyMT is highly expressed in mammary epithelial cells of the breast cancer model pigs and further indicates that the breast cancer model pigs were successfully constructed.
Furthermore, the breast cancer model pigs prepared by this method can be used in biomedical fields such as subsequent drug screening and drug efficacy evaluation, gene and cell therapy, and research on the pathogenesis of breast cancer.
The preferred embodiments of the present invention have been described in detail, however, the present invention is not limited to the specific details of the above embodiments, and various simple modifications may be made to the technical solution of the present invention within the technical idea of the present invention, and these simple modifications are within the protective scope of the present invention.
It should be noted that, in the above embodiments, the various features described in the above embodiments may be combined in any suitable manner, and in order to avoid unnecessary repetition, the present invention does not separately describe various possible combinations.
Sequence listing
<110> Nanjing Qizhen Genetic Engineering Co., Ltd.
<120> construction method and application of breast cancer model pig
<130> 1
<160> 43
<170> SIPOSequenceListing 1.0
<210> 1
<211> 8484
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 1
gagggcctat ttcccatgat tccttcatat ttgcatatac gatacaaggc tgttagagag 60
ataattggaa ttaatttgac tgtaaacaca aagatattag tacaaaatac gtgacgtaga 120
aagtaataat ttcttgggta gtttgcagtt ttaaaattat gttttaaaat ggactatcat 180
atgcttaccg taacttgaaa gtatttcgat ttcttggctt tatatatctt gtggaaagga 240
cgaaacaccg ggtcttcgag aagacctgtt ttagagctag aaatagcaag ttaaaataag 300
gctagtccgt tatcaacttg aaaaagtggc accgagtcgg tgcttttttg ttttagagct 360
agaaatagca agttaaaata aggctagtcc gtttttagcg cgtgcgccaa ttctgcagac 420
aaatggctct agaggtaccc gttacataac ttacggtaaa tggcccgcct ggctgaccgc 480
ccaacgaccc ccgcccattg acgtcaatag taacgccaat agggactttc cattgacgtc 540
aatgggtgga gtatttacgg taaactgccc acttggcagt acatcaagtg tatcatatgc 600
caagtacgcc ccctattgac gtcaatgacg gtaaatggcc cgcctggcat tgtgcccagt 660
acatgacctt atgggacttt cctacttggc agtacatcta cgtattagtc atcgctatta 720
ccatggtcga ggtgagcccc acgttctgct tcactctccc catctccccc ccctccccac 780
ccccaatttt gtatttattt attttttaat tattttgtgc agcgatgggg gcgggggggg 840
ggggggggcg gggcgagggg cggggcgggg cgaggcggag aggtgcggcg gcagccaatc 900
agagcggcgc gctccgaaag tttcctttta tggcgaggcg gcggcggcgg cggccctata 960
aaaagcgaag cgcgcggcgg gcgggagtcg ctgcgcgctg ccttcgcccc gtgccccgct 1020
ccgccgccgc ctcgcgccgc ccgccccggc tctgactgac cgcgttactc ccacaggtga 1080
gcgggcggga cggcccttct cctccgggct gtaattagct gagcaagagg taagggttta 1140
agggatggtt ggttggtggg gtattaatgt ttaattacct ggagcacctg cctgaaatca 1200
ctttttttca ggttggaccg gtgccaccat ggactataag gaccacgacg gagactacaa 1260
ggatcatgat attgattaca aagacgatga cgataagatg gccccaaaga agaagcggaa 1320
ggtcggtatc cacggagtcc cagcagccga caagaagtac agcatcggcc tggacatcgg 1380
caccaactct gtgggctggg ccgtgatcac cgacgagtac aaggtgccca gcaagaaatt 1440
caaggtgctg ggcaacaccg accggcacag catcaagaag aacctgatcg gagccctgct 1500
gttcgacagc ggcgaaacag ccgaggccac ccggctgaag agaaccgcca gaagaagata 1560
caccagacgg aagaaccgga tctgctatct gcaagagatc ttcagcaacg agatggccaa 1620
ggtggacgac agcttcttcc acagactgga agagtccttc ctggtggaag aggataagaa 1680
gcacgagcgg caccccatct tcggcaacat cgtggacgag gtggcctacc acgagaagta 1740
ccccaccatc taccacctga gaaagaaact ggtggacagc accgacaagg ccgacctgcg 1800
gctgatctat ctggccctgg cccacatgat caagttccgg ggccacttcc tgatcgaggg 1860
cgacctgaac cccgacaaca gcgacgtgga caagctgttc atccagctgg tgcagaccta 1920
caaccagctg ttcgaggaaa accccatcaa cgccagcggc gtggacgcca aggccatcct 1980
gtctgccaga ctgagcaaga gcagacggct ggaaaatctg atcgcccagc tgcccggcga 2040
gaagaagaat ggcctgttcg gaaacctgat tgccctgagc ctgggcctga cccccaactt 2100
caagagcaac ttcgacctgg ccgaggatgc caaactgcag ctgagcaagg acacctacga 2160
cgacgacctg gacaacctgc tggcccagat cggcgaccag tacgccgacc tgtttctggc 2220
cgccaagaac ctgtccgacg ccatcctgct gagcgacatc ctgagagtga acaccgagat 2280
caccaaggcc cccctgagcg cctctatgat caagagatac gacgagcacc accaggacct 2340
gaccctgctg aaagctctcg tgcggcagca gctgcctgag aagtacaaag agattttctt 2400
cgaccagagc aagaacggct acgccggcta cattgacggc ggagccagcc aggaagagtt 2460
ctacaagttc atcaagccca tcctggaaaa gatggacggc accgaggaac tgctcgtgaa 2520
gctgaacaga gaggacctgc tgcggaagca gcggaccttc gacaacggca gcatccccca 2580
ccagatccac ctgggagagc tgcacgccat tctgcggcgg caggaagatt tttacccatt 2640
cctgaaggac aaccgggaaa agatcgagaa gatcctgacc ttccgcatcc cctactacgt 2700
gggccctctg gccaggggaa acagcagatt cgcctggatg accagaaaga gcgaggaaac 2760
catcaccccc tggaacttcg aggaagtggt ggacaagggc gcttccgccc agagcttcat 2820
cgagcggatg accaacttcg ataagaacct gcccaacgag aaggtgctgc ccaagcacag 2880
cctgctgtac gagtacttca ccgtgtataa cgagctgacc aaagtgaaat acgtgaccga 2940
gggaatgaga aagcccgcct tcctgagcgg cgagcagaaa aaggccatcg tggacctgct 3000
gttcaagacc aaccggaaag tgaccgtgaa gcagctgaaa gaggactact tcaagaaaat 3060
cgagtgcttc gactccgtgg aaatctccgg cgtggaagat cggttcaacg cctccctggg 3120
cacataccac gatctgctga aaattatcaa ggacaaggac ttcctggaca atgaggaaaa 3180
cgaggacatt ctggaagata tcgtgctgac cctgacactg tttgaggaca gagagatgat 3240
cgaggaacgg ctgaaaacct atgcccacct gttcgacgac aaagtgatga agcagctgaa 3300
gcggcggaga tacaccggct ggggcaggct gagccggaag ctgatcaacg gcatccggga 3360
caagcagtcc ggcaagacaa tcctggattt cctgaagtcc gacggcttcg ccaacagaaa 3420
cttcatgcag ctgatccacg acgacagcct gacctttaaa gaggacatcc agaaagccca 3480
ggtgtccggc cagggcgata gcctgcacga gcacattgcc aatctggccg gcagccccgc 3540
cattaagaag ggcatcctgc agacagtgaa ggtggtggac gagctcgtga aagtgatggg 3600
ccggcacaag cccgagaaca tcgtgatcga aatggccaga gagaaccaga ccacccagaa 3660
gggacagaag aacagccgcg agagaatgaa gcggatcgaa gagggcatca aagagctggg 3720
cagccagatc ctgaaagaac accccgtgga aaacacccag ctgcagaacg agaagctgta 3780
cctgtactac ctgcagaatg ggcgggatat gtacgtggac caggaactgg acatcaaccg 3840
gctgtccgac tacgatgtgg accatatcgt gcctcagagc tttctgaagg acgactccat 3900
cgacaacaag gtgctgacca gaagcgacaa gaaccggggc aagagcgaca acgtgccctc 3960
cgaagaggtc gtgaagaaga tgaagaacta ctggcggcag ctgctgaacg ccaagctgat 4020
tacccagaga aagttcgaca atctgaccaa ggccgagaga ggcggcctga gcgaactgga 4080
taaggccggc ttcatcaaga gacagctggt ggaaacccgg cagatcacaa agcacgtggc 4140
acagatcctg gactcccgga tgaacactaa gtacgacgag aatgacaagc tgatccggga 4200
agtgaaagtg atcaccctga agtccaagct ggtgtccgat ttccggaagg atttccagtt 4260
ttacaaagtg cgcgagatca acaactacca ccacgcccac gacgcctacc tgaacgccgt 4320
cgtgggaacc gccctgatca aaaagtaccc taagctggaa agcgagttcg tgtacggcga 4380
ctacaaggtg tacgacgtgc ggaagatgat cgccaagagc gagcaggaaa tcggcaaggc 4440
taccgccaag tacttcttct acagcaacat catgaacttt ttcaagaccg agattaccct 4500
ggccaacggc gagatccgga agcggcctct gatcgagaca aacggcgaaa ccggggagat 4560
cgtgtgggat aagggccggg attttgccac cgtgcggaaa gtgctgagca tgccccaagt 4620
gaatatcgtg aaaaagaccg aggtgcagac aggcggcttc agcaaagagt ctatcctgcc 4680
caagaggaac agcgataagc tgatcgccag aaagaaggac tgggacccta agaagtacgg 4740
cggcttcgac agccccaccg tggcctattc tgtgctggtg gtggccaaag tggaaaaggg 4800
caagtccaag aaactgaaga gtgtgaaaga gctgctgggg atcaccatca tggaaagaag 4860
cagcttcgag aagaatccca tcgactttct ggaagccaag ggctacaaag aagtgaaaaa 4920
ggacctgatc atcaagctgc ctaagtactc cctgttcgag ctggaaaacg gccggaagag 4980
aatgctggcc tctgccggcg aactgcagaa gggaaacgaa ctggccctgc cctccaaata 5040
tgtgaacttc ctgtacctgg ccagccacta tgagaagctg aagggctccc ccgaggataa 5100
tgagcagaaa cagctgtttg tggaacagca caagcactac ctggacgaga tcatcgagca 5160
gatcagcgag ttctccaaga gagtgatcct ggccgacgct aatctggaca aagtgctgtc 5220
cgcctacaac aagcaccggg ataagcccat cagagagcag gccgagaata tcatccacct 5280
gtttaccctg accaatctgg gagcccctgc cgccttcaag tactttgaca ccaccatcga 5340
ccggaagagg tacaccagca ccaaagaggt gctggacgcc accctgatcc accagagcat 5400
caccggcctg tacgagacac ggatcgacct gtctcagctg ggaggcgaca aaaggccggc 5460
ggccacgaaa aaggccggcc aggcaaaaaa gaaaaagtaa gaattcctag agctcgctga 5520
tcagcctcga ctgtgccttc tagttgccag ccatctgttg tttgcccctc ccccgtgcct 5580
tccttgaccc tggaaggtgc cactcccact gtcctttcct aataaaatga ggaaattgca 5640
tcgcattgtc tgagtaggtg tcattctatt ctggggggtg gggtggggca ggacagcaag 5700
ggggaggatt gggaagagaa tagcaggcat gctggggagc ggccgcagga acccctagtg 5760
atggagttgg ccactccctc tctgcgcgct cgctcgctca ctgaggccgg gcgaccaaag 5820
gtcgcccgac gcccgggctt tgcccgggcg gcctcagtga gcgagcgagc gcgcagctgc 5880
ctgcaggggc gcctgatgcg gtattttctc cttacgcatc tgtgcggtat ttcacaccgc 5940
atacgtcaaa gcaaccatag tacgcgccct gtagcggcgc attaagcgcg gcgggtgtgg 6000
tggttacgcg cagcgtgacc gctacacttg ccagcgcctt agcgcccgct cctttcgctt 6060
tcttcccttc ctttctcgcc acgttcgccg gctttccccg tcaagctcta aatcgggggc 6120
tccctttagg gttccgattt agtgctttac ggcacctcga ccccaaaaaa cttgatttgg 6180
gtgatggttc acgtagtggg ccatcgccct gatagacggt ttttcgccct ttgacgttgg 6240
agtccacgtt ctttaatagt ggactcttgt tccaaactgg aacaacactc aactctatct 6300
cgggctattc ttttgattta taagggattt tgccgatttc ggtctattgg ttaaaaaatg 6360
agctgattta acaaaaattt aacgcgaatt ttaacaaaat attaacgttt acaattttat 6420
ggtgcactct cagtacaatc tgctctgatg ccgcatagtt aagccagccc cgacacccgc 6480
caacacccgc tgacgcgccc tgacgggctt gtctgctccc ggcatccgct tacagacaag 6540
ctgtgaccgt ctccgggagc tgcatgtgtc agaggttttc accgtcatca ccgaaacgcg 6600
cgagacgaaa gggcctcgtg atacgcctat ttttataggt taatgtcatg ataataatgg 6660
tttcttagac gtcaggtggc acttttcggg gaaatgtgcg cggaacccct atttgtttat 6720
ttttctaaat acattcaaat atgtatccgc tcatgagaca ataaccctga taaatgcttc 6780
aataatattg aaaaaggaag agtatgagta ttcaacattt ccgtgtcgcc cttattccct 6840
tttttgcggc attttgcctt cctgtttttg ctcacccaga aacgctggtg aaagtaaaag 6900
atgctgaaga tcagttgggt gcacgagtgg gttacatcga actggatctc aacagcggta 6960
agatccttga gagttttcgc cccgaagaac gttttccaat gatgagcact tttaaagttc 7020
tgctatgtgg cgcggtatta tcccgtattg acgccgggca agagcaactc ggtcgccgca 7080
tacactattc tcagaatgac ttggttgagt actcaccagt cacagaaaag catcttacgg 7140
atggcatgac agtaagagaa ttatgcagtg ctgccataac catgagtgat aacactgcgg 7200
ccaacttact tctgacaacg atcggaggac cgaaggagct aaccgctttt ttgcacaaca 7260
tgggggatca tgtaactcgc cttgatcgtt gggaaccgga gctgaatgaa gccataccaa 7320
acgacgagcg tgacaccacg atgcctgtag caatggcaac aacgttgcgc aaactattaa 7380
ctggcgaact acttactcta gcttcccggc aacaattaat agactggatg gaggcggata 7440
aagttgcagg accacttctg cgctcggccc ttccggctgg ctggtttatt gctgataaat 7500
ctggagccgg tgagcgtgga agccgcggta tcattgcagc actggggcca gatggtaagc 7560
cctcccgtat cgtagttatc tacacgacgg ggagtcaggc aactatggat gaacgaaata 7620
gacagatcgc tgagataggt gcctcactga ttaagcattg gtaactgtca gaccaagttt 7680
actcatatat actttagatt gatttaaaac ttcattttta atttaaaagg atctaggtga 7740
agatcctttt tgataatctc atgaccaaaa tcccttaacg tgagttttcg ttccactgag 7800
cgtcagaccc cgtagaaaag atcaaaggat cttcttgaga tccttttttt ctgcgcgtaa 7860
tctgctgctt gcaaacaaaa aaaccaccgc taccagcggt ggtttgtttg ccggatcaag 7920
agctaccaac tctttttccg aaggtaactg gcttcagcag agcgcagata ccaaatactg 7980
ttcttctagt gtagccgtag ttaggccacc acttcaagaa ctctgtagca ccgcctacat 8040
acctcgctct gctaatcctg ttaccagtgg ctgctgccag tggcgataag tcgtgtctta 8100
ccgggttgga ctcaagacga tagttaccgg ataaggcgca gcggtcgggc tgaacggggg 8160
gttcgtgcac acagcccagc ttggagcgaa cgacctacac cgaactgaga tacctacagc 8220
gtgagctatg agaaagcgcc acgcttcccg aagggagaaa ggcggacagg tatccggtaa 8280
gcggcagggt cggaacagga gagcgcacga gggagcttcc agggggaaac gcctggtatc 8340
tttatagtcc tgtcgggttt cgccacctct gacttgagcg tcgatttttg tgatgctcgt 8400
caggggggcg gagcctatgg aaaaacgcca gcaacgcggc ctttttacgg ttcctggcct 8460
tttgctggcc ttttgctcac atgt 8484
<210> 2
<211> 10476
<212> DNA
<213> Artificial Sequence
<400> 2
gagggcctat ttcccatgat tccttcatat ttgcatatac gatacaaggc tgttagagag 60
ataattggaa ttaatttgac tgtaaacaca aagatattag tacaaaatac gtgacgtaga 120
aagtaataat ttcttgggta gtttgcagtt ttaaaattat gttttaaaat ggactatcat 180
atgcttaccg taacttgaaa gtatttcgat ttcttggctt tatatatctt gtggaaagga 240
cgaaacaccg ggtcttcgag aagacctgtt ttagagctag aaatagcaag ttaaaataag 300
gctagtccgt tatcaacttg aaaaagtggc accgagtcgg tgcttttttc tagcgcgtgc 360
gccaattctg cagacaaatg gctctagagg tacccgttac ataacttacg gtaaatggcc 420
cgcctggctg accgcccaac gacccccgcc cattgacgtc aatagtaacg ccaataggga 480
ctttccattg acgtcaatgg gtggagtatt tacggtaaac tgcccacttg gcagtacatc 540
aagtgtatca tatgccaagt acgcccccta ttgacgtcaa tgacggtaaa tggcccgcct 600
ggcattgtgc ccagtacatg accttatggg actttcctac ttggcagtac atctacgtat 660
tagtcatcgc tattaccatg ggggcagagc gcacatcgcc cacagtcccc gagaagttgg 720
ggggaggggt cggcaattga tccggtgcct agagaaggtg gcgcggggta aactgggaaa 780
gtgatgtcgt gtactggctc cgcctttttc ccgagggtgg gggagaaccg tatataagtg 840
cagtagtcgc cgtgaacgtt ctttttcgca acgggtttgc cgccagaaca caggttggac 900
cggtgccacc atggactata aggaccacga cggagactac aaggatcatg atattgatta 960
caaagacgat gacgataaga tggcccccaa aaagaaacga aaggtgggtg ggtccccaaa 1020
gaagaagcgg aaggtcggta tccacggagt cccagcagcc gacaagaagt acagcatcgg 1080
cctggacatc ggcaccaact ctgtgggctg ggccgtgatc accgacgagt acaaggtgcc 1140
cagcaagaaa ttcaaggtgc tgggcaacac cgaccggcac agcatcaaga agaacctgat 1200
cggagccctg ctgttcgaca gcggcgaaac agccgaggcc acccggctga agagaaccgc 1260
cagaagaaga tacaccagac ggaagaaccg gatctgctat ctgcaagaga tcttcagcaa 1320
cgagatggcc aaggtggacg acagcttctt ccacagactg gaagagtcct tcctggtgga 1380
agaggataag aagcacgagc ggcaccccat cttcggcaac atcgtggacg aggtggccta 1440
ccacgagaag taccccacca tctaccacct gagaaagaaa ctggtggaca gcaccgacaa 1500
ggccgacctg cggctgatct atctggccct ggcccacatg atcaagttcc ggggccactt 1560
cctgatcgag ggcgacctga accccgacaa cagcgacgtg gacaagctgt tcatccagct 1620
ggtgcagacc tacaaccagc tgttcgagga aaaccccatc aacgccagcg gcgtggacgc 1680
caaggccatc ctgtctgcca gactgagcaa gagcagacgg ctggaaaatc tgatcgccca 1740
gctgcccggc gagaagaaga atggcctgtt cggaaacctg attgccctga gcctgggcct 1800
gacccccaac ttcaagagca acttcgacct ggccgaggat gccaaactgc agctgagcaa 1860
ggacacctac gacgacgacc tggacaacct gctggcccag atcggcgacc agtacgccga 1920
cctgtttctg gccgccaaga acctgtccga cgccatcctg ctgagcgaca tcctgagagt 1980
gaacaccgag atcaccaagg cccccctgag cgcctctatg atcaagagat acgacgagca 2040
ccaccaggac ctgaccctgc tgaaagctct cgtgcggcag cagctgcctg agaagtacaa 2100
agagattttc ttcgaccaga gcaagaacgg ctacgccggc tacattgacg gcggagccag 2160
ccaggaagag ttctacaagt tcatcaagcc catcctggaa aagatggacg gcaccgagga 2220
actgctcgtg aagctgaaca gagaggacct gctgcggaag cagcggacct tcgacaacgg 2280
cagcatcccc caccagatcc acctgggaga gctgcacgcc attctgcggc ggcaggaaga 2340
tttttaccca ttcctgaagg acaaccggga aaagatcgag aagatcctga ccttccgcat 2400
cccctactac gtgggccctc tggccagggg aaacagcaga ttcgcctgga tgaccagaaa 2460
gagcgaggaa accatcaccc cctggaactt cgaggaagtg gtggacaagg gcgcttccgc 2520
ccagagcttc atcgagcgga tgaccaactt cgataagaac ctgcccaacg agaaggtgct 2580
gcccaagcac agcctgctgt acgagtactt caccgtgtat aacgagctga ccaaagtgaa 2640
atacgtgacc gagggaatga gaaagcccgc cttcctgagc ggcgagcaga aaaaggccat 2700
cgtggacctg ctgttcaaga ccaaccggaa agtgaccgtg aagcagctga aagaggacta 2760
cttcaagaaa atcgagtgct tcgactccgt ggaaatctcc ggcgtggaag atcggttcaa 2820
cgcctccctg ggcacatacc acgatctgct gaaaattatc aaggacaagg acttcctgga 2880
caatgaggaa aacgaggaca ttctggaaga tatcgtgctg accctgacac tgtttgagga 2940
cagagagatg atcgaggaac ggctgaaaac ctatgcccac ctgttcgacg acaaagtgat 3000
gaagcagctg aagcggcgga gatacaccgg ctggggcagg ctgagccgga agctgatcaa 3060
cggcatccgg gacaagcagt ccggcaagac aatcctggat ttcctgaagt ccgacggctt 3120
cgccaacaga aacttcatgc agctgatcca cgacgacagc ctgaccttta aagaggacat 3180
ccagaaagcc caggtgtccg gccagggcga tagcctgcac gagcacattg ccaatctggc 3240
cggcagcccc gccattaaga agggcatcct gcagacagtg aaggtggtgg acgagctcgt 3300
gaaagtgatg ggccggcaca agcccgagaa catcgtgatc gaaatggcca gagagaacca 3360
gaccacccag aagggacaga agaacagccg cgagagaatg aagcggatcg aagagggcat 3420
caaagagctg ggcagccaga tcctgaaaga acaccccgtg gaaaacaccc agctgcagaa 3480
cgagaagctg tacctgtact acctgcagaa tgggcgggat atgtacgtgg accaggaact 3540
ggacatcaac cggctgtccg actacgatgt ggaccatatc gtgcctcaga gctttctgaa 3600
ggacgactcc atcgacaaca aggtgctgac cagaagcgac aagaaccggg gcaagagcga 3660
caacgtgccc tccgaagagg tcgtgaagaa gatgaagaac tactggcggc agctgctgaa 3720
cgccaagctg attacccaga gaaagttcga caatctgacc aaggccgaga gaggcggcct 3780
gagcgaactg gataaggccg gcttcatcaa gagacagctg gtggaaaccc ggcagatcac 3840
aaagcacgtg gcacagatcc tggactcccg gatgaacact aagtacgacg agaatgacaa 3900
gctgatccgg gaagtgaaag tgatcaccct gaagtccaag ctggtgtccg atttccggaa 3960
ggatttccag ttttacaaag tgcgcgagat caacaactac caccacgccc acgacgccta 4020
cctgaacgcc gtcgtgggaa ccgccctgat caaaaagtac cctaagctgg aaagcgagtt 4080
cgtgtacggc gactacaagg tgtacgacgt gcggaagatg atcgccaaga gcgagcagga 4140
aatcggcaag gctaccgcca agtacttctt ctacagcaac atcatgaact ttttcaagac 4200
cgagattacc ctggccaacg gcgagatccg gaagcggcct ctgatcgaga caaacggcga 4260
aaccggggag atcgtgtggg ataagggccg ggattttgcc accgtgcgga aagtgctgag 4320
catgccccaa gtgaatatcg tgaaaaagac cgaggtgcag acaggcggct tcagcaaaga 4380
gtctatcctg cccaagagga acagcgataa gctgatcgcc agaaagaagg actgggaccc 4440
taagaagtac ggcggcttcg acagccccac cgtggcctat tctgtgctgg tggtggccaa 4500
agtggaaaag ggcaagtcca agaaactgaa gagtgtgaaa gagctgctgg ggatcaccat 4560
catggaaaga agcagcttcg agaagaatcc catcgacttt ctggaagcca agggctacaa 4620
agaagtgaaa aaggacctga tcatcaagct gcctaagtac tccctgttcg agctggaaaa 4680
cggccggaag agaatgctgg cctctgccgg cgaactgcag aagggaaacg aactggccct 4740
gccctccaaa tatgtgaact tcctgtacct ggccagccac tatgagaagc tgaagggctc 4800
ccccgaggat aatgagcaga aacagctgtt tgtggaacag cacaagcact acctggacga 4860
gatcatcgag cagatcagcg agttctccaa gagagtgatc ctggccgacg ctaatctgga 4920
caaagtgctg tccgcctaca acaagcaccg ggataagccc atcagagagc aggccgagaa 4980
tatcatccac ctgtttaccc tgaccaatct gggagcccct gccgccttca agtactttga 5040
caccaccatc gaccggaaga ggtacaccag caccaaagag gtgctggacg ccaccctgat 5100
ccaccagagc atcaccggcc tgtacgagac acggatcgac ctgtctcagc tgggaggcga 5160
caaaaggccg gcggccacga aaaaggccgg ccaggcaaaa aagaaaaagg gcggctccaa 5220
gcggcctgcc gcgacgaaga aagcgggaca ggccaagaaa aagaaaggat ccggcgcaac 5280
aaacttctct ctgctgaaac aagccggaga tgtcgaagag aatcctggac cggtgagcaa 5340
gggcgaggag ctgttcaccg gggtggtgcc catcctggtc gagctggacg gcgacgtaaa 5400
cggccacaag ttcagcgtgt ccggcgaggg cgagggcgat gccacctacg gcaagctgac 5460
cctgaagttc atctgcacca ccggcaagct gcccgtgccc tggcccaccc tcgtgaccac 5520
cctgacctac ggcgtgcagt gcttcagccg ctaccccgac cacatgaagc agcacgactt 5580
cttcaagtcc gccatgcccg aaggctacgt ccaggagcgc accatcttct tcaaggacga 5640
cggcaactac aagacccgcg ccgaggtgaa gttcgagggc gacaccctgg tgaaccgcat 5700
cgagctgaag ggcatcgact tcaaggagga cggcaacatc ctggggcaca agctggagta 5760
caactacaac agccacaacg tctatatcat ggccgacaag cagaagaacg gcatcaaggt 5820
gaacttcaag atccgccaca acatcgagga cggcagcgtg cagctcgccg accactacca 5880
gcagaacacc cccatcggcg acggccccgt gctgctgccc gacaaccact acctgagcac 5940
ccagtccgcc ctgagcaaag accccaacga gaagcgcgat cacatggtcc tgctggagtt 6000
cgtgaccgcc gccgggatca ctctcggcat ggacgagctg tacaagggct ccggcgaggg 6060
caggggaagt cttctaacat gcggggacgt ggaggaaaat cccggcccaa ccgagtacaa 6120
gcccacggtg cgcctcgcca cccgcgacga cgtccccagg gccgtacgca ccctcgccgc 6180
cgcgttcgcc gactaccccg ccacgcgcca caccgtcgat ccggaccgcc acatcgagcg 6240
ggtcaccgag ctgcaagaac tcttcctcac gcgcgtcggg ctcgacatcg gcaaggtgtg 6300
ggtcgcggac gacggcgccg cggtggcggt ctggaccacg ccggagagcg tcgaagcggg 6360
ggcggtgttc gccgagatcg gcccgcgcat ggccgagttg agcggttccc ggctggccgc 6420
gcagcaacag atggaaggcc tcctggcgcc gcaccggccc aaggagcccg cgtggttcct 6480
ggccaccgtc ggagtctcgc ccgaccacca gggcaagggt ctgggcagcg ccgtcgtgct 6540
ccccggagtg gaggcggccg agcgcgccgg ggtgcccgcc ttcctggaga cctccgcgcc 6600
ccgcaacctc cccttctacg agcggctcgg cttcaccgtc accgccgacg tcgaggtgcc 6660
cgaaggaccg cgcacctggt gcatgacccg caagcccggt gcctgaacgc gttaagtcga 6720
caatcaacct ctggattaca aaatttgtga aagattgact ggtattctta actatgttgc 6780
tccttttacg ctatgtggat acgctgcttt aatgcctttg tatcatgcta ttgcttcccg 6840
tatggctttc attttctcct ccttgtataa atcctggttg ctgtctcttt atgaggagtt 6900
gtggcccgtt gtcaggcaac gtggcgtggt gtgcactgtg tttgctgacg caacccccac 6960
tggttggggc attgccacca cctgtcagct cctttccggg actttcgctt tccccctccc 7020
tattgccacg gcggaactca tcgccgcctg ccttgcccgc tgctggacag gggctcggct 7080
gttgggcact gacaattccg tggtgttgtc ggggaaatca tcgtcctttc cttggctgct 7140
cgcctgtgtt gccacctgga ttctgcgcgg gacgtccttc tgctacgtcc cttcggccct 7200
caatccagcg gaccttcctt cccgcggcct gctgccggct ctgcggcctc ttccgcgtct 7260
tcgccttcgc cctcagacga gtcggatctc cctttgggcc gcctccccgc gtcgacttta 7320
agaccaatga cttacaaggc agctgtagat cttagccact ttttaaaaga aaagggggga 7380
ctggaagggc taattcactc ccaacgaaga caagatctgc tttttgcttg tactgggtct 7440
ctctggttag accagatctg agcctgggag ctctctggct aactagggaa cccactgctt 7500
aagcctcaat aaagcttgcc ttgagtgctt caagtagtgt gtgcccgtct gttgtgtgac 7560
tctggtaact agagatccct cagacccttt tagtcagtgt ggaaaatctc tagcagggcc 7620
cgtttaaacc cgctgatcag cctcgactgt gccttctagt tgccagccat ctgttgtttg 7680
cccctccccc gtgccttcct tgaccctgga aggtgccact cccactgtcc tttcctaata 7740
aaatgaggaa attgcatcgc attgtctgag taggtgtcat tctattctgg ggggtggggt 7800
ggggcaggac agcaaggggg aggattggga agacaatagc aggcatgctg gggatgcggt 7860
gggctctatg gcctgcaggg gcgcctgatg cggtattttc tccttacgca tctgtgcggt 7920
atttcacacc gcatacgtca aagcaaccat agtacgcgcc ctgtagcggc gcattaagcg 7980
cggcgggtgt ggtggttacg cgcagcgtga ccgctacact tgccagcgcc ttagcgcccg 8040
ctcctttcgc tttcttccct tcctttctcg ccacgttcgc cggctttccc cgtcaagctc 8100
taaatcgggg gctcccttta gggttccgat ttagtgcttt acggcacctc gaccccaaaa 8160
aacttgattt gggtgatggt tcacgtagtg ggccatcgcc ctgatagacg gtttttcgcc 8220
ctttgacgtt ggagtccacg ttctttaata gtggactctt gttccaaact ggaacaacac 8280
tcaactctat ctcgggctat tcttttgatt tataagggat tttgccgatt tcggtctatt 8340
ggttaaaaaa tgagctgatt taacaaaaat ttaacgcgaa ttttaacaaa atattaacgt 8400
ttacaatttt atggtgcact ctcagtacaa tctgctctga tgccgcatag ttaagccagc 8460
cccgacaccc gccaacaccc gctgacgcgc cctgacgggc ttgtctgctc ccggcatccg 8520
cttacagaca agctgtgacc gtctccggga gctgcatgtg tcagaggttt tcaccgtcat 8580
caccgaaacg cgcgagacga aagggcctcg tgatacgcct atttttatag gttaatgtca 8640
tgataataat ggtttcttag acgtcaggtg gcacttttcg gggaaatgtg cgcggaaccc 8700
ctatttgttt atttttctaa atacattcaa atatgtatcc gctcatgaga caataaccct 8760
gataaatgct tcaataatat tgaaaaagga agagtatgag tattcaacat ttccgtgtcg 8820
cccttattcc cttttttgcg gcattttgcc ttcctgtttt tgctcaccca gaaacgctgg 8880
tgaaagtaaa agatgctgaa gatcagttgg gtgcacgagt gggttacatc gaactggatc 8940
tcaacagcgg taagatcctt gagagttttc gccccgaaga acgttttcca atgatgagca 9000
cttttaaagt tctgctatgt ggcgcggtat tatcccgtat tgacgccggg caagagcaac 9060
tcggtcgccg catacactat tctcagaatg acttggttga gtactcacca gtcacagaaa 9120
agcatcttac ggatggcatg acagtaagag aattatgcag tgctgccata accatgagtg 9180
ataacactgc ggccaactta cttctgacaa cgatcggagg accgaaggag ctaaccgctt 9240
ttttgcacaa catgggggat catgtaactc gccttgatcg ttgggaaccg gagctgaatg 9300
aagccatacc aaacgacgag cgtgacacca cgatgcctgt agcaatggca acaacgttgc 9360
gcaaactatt aactggcgaa ctacttactc tagcttcccg gcaacaatta atagactgga 9420
tggaggcgga taaagttgca ggaccacttc tgcgctcggc ccttccggct ggctggttta 9480
ttgctgataa atctggagcc ggtgagcgtg gaagccgcgg tatcattgca gcactggggc 9540
cagatggtaa gccctcccgt atcgtagtta tctacacgac ggggagtcag gcaactatgg 9600
atgaacgaaa tagacagatc gctgagatag gtgcctcact gattaagcat tggtaactgt 9660
cagaccaagt ttactcatat atactttaga ttgatttaaa acttcatttt taatttaaaa 9720
ggatctaggt gaagatcctt tttgataatc tcatgaccaa aatcccttaa cgtgagtttt 9780
cgttccactg agcgtcagac cccgtagaaa agatcaaagg atcttcttga gatccttttt 9840
ttctgcgcgt aatctgctgc ttgcaaacaa aaaaaccacc gctaccagcg gtggtttgtt 9900
tgccggatca agagctacca actctttttc cgaaggtaac tggcttcagc agagcgcaga 9960
taccaaatac tgttcttcta gtgtagccgt agttaggcca ccacttcaag aactctgtag 10020
caccgcctac atacctcgct ctgctaatcc tgttaccagt ggctgctgcc agtggcgata 10080
agtcgtgtct taccgggttg gactcaagac gatagttacc ggataaggcg cagcggtcgg 10140
gctgaacggg gggttcgtgc acacagccca gcttggagcg aacgacctac accgaactga 10200
gatacctaca gcgtgagcta tgagaaagcg ccacgcttcc cgaagggaga aaggcggaca 10260
ggtatccggt aagcggcagg gtcggaacag gagagcgcac gagggagctt ccagggggaa 10320
acgcctggta tctttatagt cctgtcgggt ttcgccacct ctgacttgag cgtcgatttt 10380
tgtgatgctc gtcagggggg cggagcctat ggaaaaacgc cagcaacgcg gcctttttac 10440
ggttcctggc cttttgctgg ccttttgctc acatgt 10476
<210> 3
<211> 3120
<212> DNA
<213> Artificial Sequence
<400> 3
gacgaaaggg cctcgtgata cgcctatttt tataggttaa tgtcatgata ataatggttt 60
cttagacgtc aggtggcact tttcggggaa atgtgcgcgg aacccctatt tgtttatttt 120
tctaaataca ttcaaatatg tatccgctca tgagacaata accctgataa atgcttcaat 180
aatattgaaa aaggaagagt atgagtattc aacatttccg tgtcgccctt attccctttt 240
ttgcggcatt ttgccttcct gtttttgctc acccagaaac gctggtgaaa gtaaaagatg 300
ctgaagatca gttgggtgca cgagtgggtt acatcgaact ggatctcaac agcggtaaga 360
tccttgagag ttttcgcccc gaagaacgtt ttccaatgat gagcactttt aaagttctgc 420
tatgtggcgc ggtattatcc cgtattgacg ccgggcaaga gcaactcggt cgccgcatac 480
actattctca gaatgacttg gttgagtact caccagtcac agaaaagcat cttacggatg 540
gcatgacagt aagagaatta tgcagtgctg ccataaccat gagtgataac actgcggcca 600
acttacttct gacaacgatc ggaggaccga aggagctaac cgcttttttg cacaacatgg 660
gggatcatgt aactcgcctt gatcgttggg aaccggagct gaatgaagcc ataccaaacg 720
acgagcgtga caccacgatg cctgtagcaa tggcaacaac gttgcgcaaa ctattaactg 780
gcgaactact tactctagct tcccggcaac aattaataga ctggatggag gcggataaag 840
ttgcaggacc acttctgcgc tcggcccttc cggctggctg gtttattgct gataaatctg 900
gagccggtga gcgtgggtct cgcggtatca ttgcagcact ggggccagat ggtaagccct 960
cccgtatcgt agttatctac acgacgggga gtcaggcaac tatggatgaa cgaaatagac 1020
agatcgctga gataggtgcc tcactgatta agcattggta actgtcagac caagtttact 1080
catatatact ttagattgat ttaaaacttc atttttaatt taaaaggatc taggtgaaga 1140
tcctttttga taatctcatg accaaaatcc cttaacgtga gttttcgttc cactgagcgt 1200
cagaccccgt agaaaagatc aaaggatctt cttgagatcc tttttttctg cgcgtaatct 1260
gctgcttgca aacaaaaaaa ccaccgctac cagcggtggt ttgtttgccg gatcaagagc 1320
taccaactct ttttccgaag gtaactggct tcagcagagc gcagatacca aatactgttc 1380
ttctagtgta gccgtagtta ggccaccact tcaagaactc tgtagcaccg cctacatacc 1440
tcgctctgct aatcctgtta ccagtggctg ctgccagtgg cgataagtcg tgtcttaccg 1500
ggttggactc aagacgatag ttaccggata aggcgcagcg gtcgggctga acggggggtt 1560
cgtgcacaca gcccagcttg gagcgaacga cctacaccga actgagatac ctacagcgtg 1620
agctatgaga aagcgccacg cttcccgaag ggagaaaggc ggacaggtat ccggtaagcg 1680
gcagggtcgg aacaggagag cgcacgaggg agcttccagg gggaaacgcc tggtatcttt 1740
atagtcctgt cgggtttcgc cacctctgac ttgagcgtcg atttttgtga tgctcgtcag 1800
gggggcggag cctatggaaa aacgccagca acgcggcctt tttacggttc ctggcctttt 1860
gctggccttt tgctcacatg ttctttcctg cgttatcccc tgattctgtg gataaccgta 1920
ttaccgcctt tgagtgagct gataccgctc gccgcagccg aacgaccgag cgcagcgagt 1980
cagtgagcga ggaagcggaa gagcgcccaa tacgcaaacc gcctctcccc gcgcgttggc 2040
cgattcatta atgcagctgg cacgacaggt ttcccgactg gaaagcgggc agtgagcgca 2100
acgcaattaa tgtgagttag ctcactcatt aggcacccca ggctttacac tttatgcttc 2160
cggctcgtat gttgtgtgga attgtgagcg gataacaatt tcacacagga aacagctatg 2220
accatgatta cgccaagctt gcatgcaggc ctctgcagtc gacgggcccg ggatccgatg 2280
ataaacatgt gagggcctat ttcccatgat tccttcatat ttgcatatac gatacaaggc 2340
tgttagagag ataattggaa ttaatttgac tgtaaacaca aagatattag tacaaaatac 2400
gtgacgtaga aagtaataat ttcttgggta gtttgcagtt ttaaaattat gttttaaaat 2460
ggactatcat atgcttaccg taacttgaaa gtatttcgat ttcttggctt tatatatctt 2520
gtggaaagga cgaaacaccg ggtcttcgag aagacctgtt ttagagctag aaatagcaag 2580
ttaaaataag gctagtccgt tatcaacttg aaaaagtggc accgagtcgg tgcttttttc 2640
tagcgcgtgc gccaattctg cagacaaatg gctctagagg tacccataga tctagatgca 2700
ttcgcgaggt accgagctcg aattcactgg ccgtcgtttt acaacgtcgt gactgggaaa 2760
accctggcgt tacccaactt aatcgccttg cagcacatcc ccctttcgcc agctggcgta 2820
atagcgaaga ggcccgcacc gatcgccctt cccaacagtt gcgcagcctg aatggcgaat 2880
ggcgcctgat gcggtatttt ctccttacgc atctgtgcgg tatttcacac cgcatatggt 2940
gcactctcag tacaatctgc tctgatgccg catagttaag ccagccccga cacccgccaa 3000
cacccgctga cgcgccctga cgggcttgtc tgctcccggc atccgcttac agacaagctg 3060
tgaccgtctc cgggagctgc atgtgtcaga ggttttcacc gtcatcaccg aaacgcgcga 3120
<210> 4
<211> 14138
<212> DNA
<213> Artificial Sequence
<400> 4
ggcgcgccct ctacctgctc tcggacccgt gggggtgggg ggtggaggaa ggagtggggg 60
gtcggtcctg ctggcttgtg ggtgggaggc gcatgttctc caaaaacccg cgcgagctgc 120
aatcctgagg gagctgcagt ggaggaggcg gagagaaggc cgcacccttc tccgcagggg 180
gaggggagtg ccgcaatacc tttatgggag ttctctgctg cctccttttc ctaaggaccg 240
ccctgggcct agaaaaatcc ctccctcccc cgcgatctcg tcatcgcctc catgtcagtt 300
tgctccttct cgattatggg cgggattctt ttgccctggc gcgccccaga cccgggcctg 360
gggggcaagt cggggggcgg ggggaggtcg ggcagggtcc cctgggagga tggggacgtg 420
ctgtgcccct agcggccacc agagggcacc aggacaccac tgcggtcggc tcagcggctc 480
ctgccctggt cagggggcgc caggtcctgc ccctcctggg gagggcgggg ggcgagaagg 540
gcgattttaa ttaacccacg tttcaacatg cacatcccag taatttggaa acattttgtt 600
tccaaagatt cacttaacat tggtttagca acatgaagct ttctatgcaa cccaaggact 660
cagtttttgg cctgttttag tgacaggcaa tcagcaacat gctgcatttc tctccagtgt 720
tgtaatcaaa gaaaccctcc catagcttta aatgatattc cttccccttc caattatgtg 780
gggggaaaac aaccctattc tccacccaga agtgttaact caagaattac attttcaaga 840
agtttccaga ttcgtaaaac cagaattaga tgtctttcac ctaaatgtct cggtgttgac 900
caaaggaaca cacaggtttc tcatttaact tttttaatgg gtctcaaaat tctgtgacaa 960
atttttggtc aagttgtttc cattaaaaag tactgatttt aaaaactaat aacttaaaac 1020
tgccacacgc aaaaaagaaa accaaagtgg tccacaaaac attctccttt ccttctgaag 1080
gttttacgat gcattgttat cattaaccag tcttttacta ctaaacttaa atggccaatt 1140
gaaacaaaca gttctgagac cgttcttcca ccactgatta agagtggggt ggcaggtatt 1200
agggataatg ctagcttact tgtacagctc gtccatgccg agagtgatcc cggcggcggt 1260
cacgaactcc agcaggacca tgtgatcgcg cttctcgttg gggtctttgc tcagggcgga 1320
ctgggtgctc aggtagtggt tgtcgggcag cagcacgggg ccgtcgccga tgggggtgtt 1380
ctgctggtag tggtcggcga gctgcacgct gccgtcctcg atgttgtggc ggatcttgaa 1440
gttcaccttg atgccgttct tctgcttgtc ggccatgata tagacgttgt ggctgttgta 1500
gttgtactcc agcttgtgcc ccaggatgtt gccgtcctcc ttgaagtcga tgcccttcag 1560
ctcgatgcgg ttcaccaggg tgtcgccctc gaacttcacc tcggcgcggg tcttgtagtt 1620
gccgtcgtcc ttgaagaaga tggtgcgctc ctggacgtag ccttcgggca tggcggactt 1680
gaagaagtcg tgctgcttca tgtggtcggg gtagcggctg aagcactgca cgccgtaggt 1740
cagggtggtc acgagggtgg gccagggcac gggcagcttg ccggtggtgc agatgaactt 1800
cagggtcagc ttgccgtagg tggcatcgcc ctcgccctcg ccggacacgc tgaacttgtg 1860
gccgtttacg tcgccgtcca gctcgaccag gatgggcacc accccggtga acagctcctc 1920
gcccttgctc accatggtgg cgtcgaccgt acgtcacgac acctgaaatg gaagaaaaaa 1980
actttgaacc actgtctgag gcttgagaat gaaccaagat ccaaactcaa aaagggcaaa 2040
ttccaaggag aattacatca agtgccaagc tggcctaact tcagtctcca cccactcagt 2100
gtggggaaac tccatcgcat aaaacccctc cccccaacct aaagacgacg tactccaaaa 2160
gctcgagaac taatcgaggt gcctggacgg cgcccggtac tccgtggagt cacatgaagc 2220
gacggctgag gacggaaagg cccttttcct ttgtgtgggt gactcacccg cccgctctcc 2280
cgagcgccgc gtcctccatt ttgagctccc tgcagcaggg ccgggaagcg gccatctttc 2340
cgctcacgca actggtgccg accgggccag ccttgccgcc cagggcgggg cgatacacgg 2400
cggcgcgagg ccaggcacca gagcaggccg gccagcttga gactaccccc gtccgattct 2460
cggtggccgc gctcgcaggc cccgcctcgc cgaacatgtg cgctgggacg cacgggcccc 2520
gtcgccgccc gcggccccaa aaaccgaaat accagtgtgc agatcttggc ccgcatttac 2580
aagactatct tgccagaaaa aaagcgtcgc agcaggtcat caaaaatttt aaatggctag 2640
agacttatcg aaagcagcga gacaggcgcg aaggtgccac cagattcgca cgcggcggcc 2700
ccagcgccca ggccaggcct caactcaagc acgaggcgaa ggggctcctt aagcgcaagg 2760
cctcgaactc tcccacccac ttccaacccg aagctcggga tcaagaatca cgtactgcag 2820
ccagtggaag taattcaagg cacgcaaggg ccataacccg taaagaggcc aggcccgcgg 2880
gaaccacaca cggcacttac ctgtgttctg gcggcaaacc cgttgcgaaa aagaacgttc 2940
acggcgacta ctgcacttat atacggttct cccccaccct cgggaaaaag gcggagccag 3000
tacacgacat cactttccca gtttaccccg cgccaccttc tctaggcacc ggttcaattg 3060
ccgacccctc cccccaactt ctcggggact gtgggcgatg tgcgctctgc ccactgacgg 3120
gcaccggagc cctagattcg attccctttg gggcaaaact caccgcctaa tcccctataa 3180
ctctaccggg gagcccggtg gagagcagac gggctgacgc tgccacctgc cggccatccc 3240
aggataggac cgccgtattc aagtcgccct caggaaggac cctcggggca ccagaggcct 3300
tcgaagcccc aatgagtgag gcaactgagg gtcgcgggtg ccattacaag gcccagccaa 3360
ggcctagagc caaggcttga accgtggggg acccccaagc cccacctgcc caggaacagc 3420
agacactggg acactttgtt tcaggtcctg cccaggcccc tcccactgtg aggctgggat 3480
ttgtcgccca gggtgcagat gagaagagtg gggaaagcag tcctgagcca ggaaattcta 3540
ccgggtaggg gaggcgcttt tcccaaggca gtctggagca tgcgctttag cagccccgct 3600
gggcacttgg cgctacacaa gtggcctctg gcctcgcaca cattccacat ccaccggtag 3660
gcgccaaccg gctccgttct ttggtggccc cttcgcgcca ccttctactc ctcccctagt 3720
caggaagttc ccccccgccc cgcagctcgc gtcgtgcagg acgtgacaaa tggaagtagc 3780
acgtctcact agtctcgtgc agatggacag caccgctgag caatggaagc gggtaggcct 3840
ttggggcagc ggccaatagc agctttgctc cttcgctttc tgggctcaga ggctgggaag 3900
gggtgggtcc gggggcgggc tcaggggcgg gctcaggggc ggggcgggcg cccgaaggtc 3960
ctccggaggc ccggcattct gcacgcttca aaagcgcacg tctgccgcgc tgttctcctc 4020
ttcctcatct ccgggccttt cgacctccta gggccaccat ggtgagcaag ggcgaggacg 4080
acaacatggc catcatcaag gagttcatgc gcttcaaggt gcacatggag ggctccgtga 4140
acggccacga gttcgagatc gagggcgagg gcgagggccg cccctacgag ggcacccaga 4200
ccgccaagct gaaggtgacc aagggcggcc ccctgccctt cgcctgggac atcctgtccc 4260
ctcagttcat gtacggctcc aaggcctacg tgaagcaccc cgccgacatc cccgactact 4320
tgaagctgtc cttccccgag ggcttcaagt gggagcgcgt gatgaacttc gaggacggcg 4380
gcgtggtgac cgtgacccag gactcctccc tgcaggacgg cgagttcatc tacaaggtga 4440
agctgcgcgg caccaacttc ccctccgacg gccccgtaat gcagaagaag accatgggct 4500
gggaggcctc ctccgagcgg atgtaccccg aggacggcgc cctgaagggc gagatcaagc 4560
agaggctgaa gctgaaggac ggcggccact acgacgccga ggtcaagacc acctacaagg 4620
ccaagaagcc cgtgcagctg cccggcgcct acaacgtcaa catcaagctg gacatcacct 4680
cccacaacga ggactacacc atcgtggaac agtacgagcg cgccgagggc cgccactcca 4740
ccggcggcat ggacgagctg tacaagtgag gatccgctga tcagcctcga ctgtgccttc 4800
tagttgccag ccatctgttg tttgcccctc ccccgtgcct tccttgaccc tggaaggtgc 4860
cactcccact gtcctttcct aataaaatga ggaaattgca tcgcattgtc tgagtaggtg 4920
tcattctatt ctggggggtg gggtggggca ggacagcaag ggggaggatt gggaagacaa 4980
tagcaggcat gctggggatg cggtgggctc tatggcttct gaggcggaaa gaacccttct 5040
gaggcggaaa gaaccagctg ccttaatata acttcgtata atgtatgcta tacgaagtta 5100
ttaggtctga agaggagttt acgtccagcc aattctgtgg aatgtgtgtc agttagggtg 5160
tggaaagtcc ccaggctccc cagcaggcag aagtatgcaa agcatgcatc tcaattagtc 5220
agcaaccagg tgtggaaagt ccccaggctc cccagcaggc agaagtatgc aaagcatgca 5280
tctcaattag tcagcaacca tagtcccgcc cctaactccg cccatcccgc ccctaactcc 5340
gcccagttcc gcccattctc cgccccatgg ctgactaatt ttttttattt atgcagaggc 5400
cgaggccgcc tctgcctctg agctattcca gaagtagtga ggaggctttt ttggaggcct 5460
aggcttttgc aaaaagctcc cgggagcttg tatatccatt ttcggcggcc gcgccaccat 5520
gaccgagtac aagcccacgg tgcgcctcgc cacccgcgac gacgtcccca gggccgtacg 5580
caccctcgcc gccgcgttcg ccgactaccc cgccacgcgc cacaccgtcg atccggaccg 5640
ccacatcgag cgggtcaccg agctgcaaga actcttcctc acgcgcgtcg ggctcgacat 5700
cggcaaggtg tgggtcgcgg acgacggcgc cgcggtggcg gtctggacca cgccggagag 5760
cgtcgaagcg ggggcggtgt tcgccgagat cggcccgcgc atggccgagt tgagcggttc 5820
ccggctggcc gcgcagcaac agatggaagg cctcctggcg ccgcaccggc ccaaggagcc 5880
cgcgtggttc ctggccaccg tcggagtctc gcccgaccac cagggcaagg gtctgggcag 5940
cgccgtcgtg ctccccggag tggaggcggc cgagcgcgcc ggggtgcccg ccttcctgga 6000
gacctccgcg ccccgcaacc tccccttcta cgagcggctc ggcttcaccg tcaccgccga 6060
cgtcgaggtg cccgaaggac cgcgcacctg gtgcatgacc cgcaagcccg gtgcctgaga 6120
attcgcggga ctctggggtt cgaaatgacc gaccaagcga cgcccaacct gccatcacga 6180
gatttcgatt ccaccgccgc cttctatgaa aggttgggct tcggaatcgt tttccgggac 6240
gccggctgga tgatcctcca gcgcggggat ctcatgctgg agttcttcgc ccaccccaac 6300
ttgtttattg cagcttataa tggttacaaa taaagcaata gcatcacaaa tttcacaaat 6360
aaagcatttt tttcactgca ttctagttgt ggtttgtcca aactcatcaa tgtatcttat 6420
catgtctgta taccgctcga ctagagcttg cggaaccctt aatataactt cgtataatgt 6480
atgctatacg aagttattag gtccgctggc catctacgag ccaaagactt tcaaatcttt 6540
ggctgccttg gccagtagga ggcgacacga aggatttgct gctgccttgg gggatgggaa 6600
ggaacctgaa ggcatttttt ccagagtggt gcagtaccac tgaggactgt tgctgtattg 6660
attaggaaaa gagacagagt aatttgcagt ttgtttgatt tatactgggc tgcaggtcga 6720
gggatcttca taagagaaga gggacagcta tgactgggag tagtcaggag aggaggaaaa 6780
atctggctag taaaacatgt aaggaaaatt ttagggatgt taaagaaaaa aataacacaa 6840
aacaaaatat aaaaaaaatc taacctcaag tcaaggcttt tctatggaat aaggaatgga 6900
cagcaggggg ctgtttcata tactgatgac ctctttatag ccacctttgt tcatggcagc 6960
cagcatatgg catatgttgc caaactctaa accaaatact cattctgatg ttttaaatga 7020
tttgccctcc catatgtcct tccgagtgag agacacaaaa aattccaaca cactattgca 7080
atgaaaataa atttccttta ttagccagaa gtcagatgct caaggggctt catgatgtcc 7140
ccataatttt tggcagaggg aaaaagatct cagtggtatt tgtgagccag ggcattggcc 7200
acaccagcca ccaccttctg ataggcagcc tgcggtacct tacatggtgg cgaattcgtt 7260
tgccaaaatg atgagacagc acaataacca gcacgttgcc caggagctgt aggaaaaaga 7320
agaaggcatg aacatggtta gcagaggctc tagagccgcc ggtcacacgc cagaagccga 7380
accccgccct gccccgtccc ccccgaaggc agccgtcccc ctgcggcagc cccgaggctg 7440
gagatggaga aggggacggc ggcgcggcga cgcacgaagg ccctccccgc ccatttcctt 7500
cctgccggcg ccgcaccgct tcgcccgcgc ccgctagagg gggtgcggcg gcgcctccca 7560
gatttcggct ccgccagatt tgggacaaag gaagtccctg cgccctctcg cacgattacc 7620
ataaaaggca atggctgcgg ctcgccgcgc ctcgacagcc gccggcgctc cggggccgcc 7680
gcgcccctcc cccgagccct ccccggcccg aggcggcccc gccccgcccg gcacccccac 7740
ctgccgccac cccccgcccg gcacggcgag ccccgcgcca cgccccgcac ggagccccgc 7800
acccgaagcc gggccgtgct cagcaactcg gggagggggg tgcagggggg ggttacagcc 7860
cgaccgccgc gcccacaccc cctgctcacc cccccacgca cacaccccgc acgcagcctt 7920
tgttcccctc gcagcccccc cgcaccgcgg ggcaccgccc ccggccgcgc tcccctcgcg 7980
cacacgcgga gcgcacaaag ccccgcgccg cgcccgcagc gctcacagcc gccgggcagc 8040
gcgggccgca cgcggcgctc cccacgcaca cacacacgca cgcacccccc gagccgctcc 8100
cccccgcaca aagggccctc ccggagccct ttaaggcttt cacgcagcca cagaaaagaa 8160
acgagccgtc attaaaccaa gcgctaatta cagcccggag gagaagggcc gtcccgcccg 8220
ctcacctgtg ggagtaacgc ggtcagtcag agccggggcg ggcggcgcga ggcggcgcgg 8280
agcggggcac ggggcgaagg caacgcagcg actcccgccc gccgcgcgct tcgcttttta 8340
tagggccgcc gccgccgccg cctcgccata aaaggaaact ttcggagcgc gccgctctga 8400
ttggctgccg ccgcacctct ccgcctcgcc ccgccccgcc cctcgccccg ccccgccccg 8460
cctggcgcgc gccccccccc cccccgcccc catcgctgca caaaataatt aaaaaataaa 8520
taaatacaaa attgggggtg gggagggggg ggagatgggg agagtgaagc agaacgtggg 8580
gctcacctcg acccatggta atagcgatga ctaatacgta gatgtactgc caagtaggaa 8640
agtcccataa ggtcatgtac tgggcataat gccaggcggg ccatttaccg tcattgacgt 8700
caataggggg cgtacttggc atatgataca cttgatgtac tgccaagtgg gcagtttacc 8760
gtaaatagtc cacccattga cgtcaatgga aagtccctat tggcgttact atgggaacat 8820
acgtcattat tgacgtcaat gggcgggggt cgttgggcgg tcagccaggc gggccattta 8880
ccgtaagtta tgtaacgcgg aactccatat atgggctatg aactaatgac cccgtaattg 8940
attactatta ataactagtc aataatcaat gtcgtaaatg tcgtaaatgt ctcagctagt 9000
caggtagtaa aaggtgtcaa ctaggcagtg gcagagcagg attcaaattc agggctgttg 9060
tgatgcctcc gcagactctg agcgccacct ggtggtaatt tgtctgtgcc tcttctgacg 9120
tggaagaaca gcaactaaca cactaacacg gcatttacta tgggccagcc attgtacgcg 9180
ttgcttaacc tgattcttgg gcgttgtcct gcaggggatt gagcaggtgt acgaggacga 9240
gcccaatttc tctatattcc cacagtcttg agtttgtgtc acaaaataat tatagtgggg 9300
tggagatggg aaatgagtcc aggcaacacc taagcctgat tttatgcatt gagactgcgt 9360
gttattacta aagatctttg tgtcgcaatt tcctgatgaa gggagatagg ttaaaaagca 9420
cggatctact gagttttaca gtcatcccat ttgtagactt ttgctacacc accaaagtat 9480
agcatctgag attaaatatt aatctccaaa ccttaggccc cctcacttgc atccttacgg 9540
tcagataact ctcactcata ctttaagccc attttgtttg ttgtacttgc tcatccagtc 9600
ccagacatag cattggcttt ctcctcacct gttttaggta gccagcaagt catgaaatca 9660
gataagttcc accaccaatt aacactaccc atcttgagca taggcccaac agtgcattta 9720
ttcctcattt actgatgttc gtgaatattt accttgattt tcattttttt ctttttctta 9780
agctgggatt ttactcctga ccctattcac agtcagatga tcttgactac cactgcgatt 9840
ggacctgagg ttcagcaata ctccccttta tgtcttttga atacttttca ataaatctgt 9900
ttgtattttc attagttagt aactgagctc agttgccgta atgctaatag cttccaaact 9960
agtgtctctg tctccagtat ctgataaatc ttaggtgttg ctgggacagt tgtcctaaaa 10020
ttaagataaa gcatgaaaat aactgacaca actccattac tggctcctaa ctacttaaac 10080
aatgcattct atcatcacaa atgtgaaaaa ggagttccct cagtggacta accttatctt 10140
ttctcaacac ctttttcttt gcacaatttt ccacacatgc ctacaaaaag tacttatgcg 10200
gccgccataa aagttttgtt actttataga agaaattttg agtttttgtt ttttttaata 10260
aataaataaa cataaataaa ttgtttgttg aatttattat tagtatgtaa gtgtaaatat 10320
aataaaactt aatatctatt caaattaata aataaacctc gatatacaga ccgataaaac 10380
acatgcgtca attttacaca tgattatctt taacgtacgt cacaatatga ttatctttct 10440
agggttaatc tagctgcgtg ttctgcagcg tgtcgagcat cttcatctgc tccatcacgc 10500
tgtaaaacac atttgcaccg cgagtctgcc cgtcctccac gggttcaaaa acgtgaatga 10560
acgaggcgcg ctcactggcc gtcgttttac aacgtcgtga ctgggaaaac cctggcgtta 10620
cccaacttaa tcgccttgca gcacatcccc ctttcgccag ctggcgtaat agcgaagagg 10680
cccgcaccga tcgcccttcc caacagttgc gcagcctgaa tggcgaatgg gacgcgccct 10740
gtagcggcgc attaagcgcg gcgggtgtgg tggttacgcg cagcgtgacc gctacacttg 10800
ccagcgccct agcgcccgct cctttcgctt tcttcccttc ctttctcgcc acgttcgccg 10860
gctttccccg tcaagctcta aatcgggggc tccctttagg gttccgattt agtgctttac 10920
ggcacctcga ccccaaaaaa cttgattagg gtgatggttc acgtagtggg ccatcgccct 10980
gatagacggt ttttcgccct ttgacgttgg agtccacgtt ctttaatagt ggactcttgt 11040
tccaaactgg aacaacactc aaccctatct cggtctattc ttttgattta taagggattt 11100
tgccgatttc ggcctattgg ttaaaaaatg agctgattta acaaaaattt aacgcgaatt 11160
ttaacaaaat attaacgctt acaatttagg tggcactttt cggggaaatg tgcgcggaac 11220
ccctatttgt ttatttttct aaatacattc aaatatgtat ccgctcatga gacaataacc 11280
ctgataaatg cttcaataat attgaaaaag gaagagtatg agtattcaac atttccgtgt 11340
cgcccttatt cccttttttg cggcattttg ccttcctgtt tttgctcacc cagaaacgct 11400
ggtgaaagta aaagatgctg aagatcagtt gggtgcacga gtgggttaca tcgaactgga 11460
tctcaacagc ggtaagatcc ttgagagttt tcgccccgaa gaacgttttc caatgatgag 11520
cacttttaaa gttctgctat gtggcgcggt attatcccgt attgacgccg ggcaagagca 11580
actcggtcgc cgcatacact attctcagaa tgacttggtt gagtactcac cagtcacaga 11640
aaagcatctt acggatggca tgacagtaag agaattatgc agtgctgcca taaccatgag 11700
tgataacact gcggccaact tacttctgac aacgatcgga ggaccgaagg agctaaccgc 11760
ttttttgcac aacatggggg atcatgtaac tcgccttgat cgttgggaac cggagctgaa 11820
tgaagccata ccaaacgacg agcgtgacac cacgatgcct gtagcaatgg caacaacgtt 11880
gcgcaaacta ttaactggcg aactacttac tctagcttcc cggcaacaat taatagactg 11940
gatggaggcg gataaagttg caggaccact tctgcgctcg gcccttccgg ctggctggtt 12000
tattgctgat aaatctggag ccggtgagcg tggttcacgc ggtatcattg cagcactggg 12060
gccagatggt aagccctccc gtatcgtagt tatctacacg acggggagtc aggcaactat 12120
ggatgaacga aatagacaga tcgctgagat aggtgcctca ctgattaagc attggtaact 12180
gtcagaccaa gtttactcat atatacttta gattgattta aaacttcatt tttaatttaa 12240
aaggatctag gtgaagatcc tttttgataa tctcatgacc aaaatccctt aacgtgagtt 12300
ttcgttccac tgagcgtcag accccgtaga aaagatcaaa ggatcttctt gagatccttt 12360
ttttctgcgc gtaatctgct gcttgcaaac aaaaaaacca ccgctaccag cggtggtttg 12420
tttgccggat caagagctac caactctttt tccgaaggta actggcttca gcagagcgca 12480
gataccaaat actgtccttc tagtgtagcc gtagttaggc caccacttca agaactctgt 12540
agcaccgcct acatacctcg ctctgctaat cctgttacca gtggctgctg ccagtggcga 12600
taagtcgtgt cttaccgggt tggactcaag acgatagtta ccggataagg cgcagcggtc 12660
gggctgaacg gggggttcgt gcacacagcc cagcttggag cgaacgacct acaccgaact 12720
gagataccta cagcgtgagc tatgagaaag cgccacgctt cccgaaggga gaaaggcgga 12780
caggtatccg gtaagcggca gggtcggaac aggagagcgc acgagggagc ttccaggggg 12840
aaacgcctgg tatctttata gtcctgtcgg gtttcgccac ctctgacttg agcgtcgatt 12900
tttgtgatgc tcgtcagggg ggcggagcct atggaaaaac gccagcaacg cggccttttt 12960
acggttcctg gccttttgct ggccttttgc tcacatgttc tttcctgcgt tatcccctga 13020
ttctgtggat aaccgtatta ccgcctttga gtgagctgat accgctcgcc gcagccgaac 13080
gaccgagcgc agcgagtcag tgagcgagga agcggaagag cgcccaatac gcaaaccgcc 13140
tctccccgcg cgttggccga ttcattaatg cagctggcac gacaggtttc ccgactggaa 13200
agcgggcagt gagcgcaacg caattaatgt gagttagctc actcattagg caccccaggc 13260
tttacacttt atgcttccgg ctcgtatgtt gtgtggaatt gtgagcggat aacaatttca 13320
cacaggaaac agctatgacc atgattacgc caagcgcgcc cgccgggtaa ctcacggggt 13380
atccatgtcc atttctgcgg catccagcca ggatacccgt cctcgctgac gtaatatccc 13440
agcgccgcac cgctgtcatt aatctgcaca ccggcacggc agttccggct gtcgccggta 13500
ttgttcgggt tgctgatgcg cttcgggctg accatccgga actgtgtccg gaaaagccgc 13560
gacgaactgg tatcccaggt ggcctgaacg aacagttcac cgttaaaggc gtgcatggcc 13620
acaccttccc gaatcatcat ggtaaacgtg cgttttcgct caacgtcaat gcagcagcag 13680
tcatcctcgg caaactcttt ccatgccgct tcaacctcgc gggaaaaggc acgggcttct 13740
tcctccccga tgcccagata gcgccagctt gggcgatgac tgagccggaa aaaagacccg 13800
acgatatgat cctgatgcag ctagattaac cctagaaaga tagtctgcgt aaaattgacg 13860
catgcattct tgaaatattg ctctctcttt ctaaatagcg cgaatccgtc gctgtgcatt 13920
taggacatct cagtcgccgc ttggagctcc cgtgaggcgt gcttgtcaat gcggtaagtg 13980
tcactgattt tgaactataa cgaccgcgtg agtcaaaatg acgcatgatt atcttttacg 14040
tgacttttaa gatttaactc atacgataat tatattgtta tttcatgttc tacttacgtg 14100
ataacttatt atatatatat tttcttgtta tagatatc 14138
<210> 5
<211> 345
<212> DNA
<213> Artificial Sequence
<400> 5
ggcgcgccct ctacctgctc tcggacccgt gggggtgggg ggtggaggaa ggagtggggg 60
gtcggtcctg ctggcttgtg ggtgggaggc gcatgttctc caaaaacccg cgcgagctgc 120
aatcctgagg gagctgcagt ggaggaggcg gagagaaggc cgcacccttc tccgcagggg 180
gaggggagtg ccgcaatacc tttatgggag ttctctgctg cctccttttc ctaaggaccg 240
ccctgggcct agaaaaatcc ctccctcccc cgcgatctcg tcatcgcctc catgtcagtt 300
tgctccttct cgattatggg cgggattctt ttgccctggc gcgcc 345
<210> 6
<211> 1012
<212> DNA
<213> Artificial Sequence
<400> 6
cttaacctga ttcttgggcg ttgtcctgca ggggattgag caggtgtacg aggacgagcc 60
caatttctct atattcccac agtcttgagt ttgtgtcaca aaataattat agtggggtgg 120
agatgggaaa tgagtccagg caacacctaa gcctgatttt atgcattgag actgcgtgtt 180
attactaaag atctttgtgt cgcaatttcc tgatgaaggg agataggtta aaaagcacgg 240
atctactgag ttttacagtc atcccatttg tagacttttg ctacaccacc aaagtatagc 300
atctgagatt aaatattaat ctccaaacct taggccccct cacttgcatc cttacggtca 360
gataactctc actcatactt taagcccatt ttgtttgttg tacttgctca tccagtccca 420
gacatagcat tggctttctc ctcacctgtt ttaggtagcc agcaagtcat gaaatcagat 480
aagttccacc accaattaac actacccatc ttgagcatag gcccaacagt gcatttattc 540
ctcatttact gatgttcgtg aatatttacc ttgattttca tttttttctt tttcttaagc 600
tgggatttta ctcctgaccc tattcacagt cagatgatct tgactaccac tgcgattgga 660
cctgaggttc agcaatactc ccctttatgt cttttgaata cttttcaata aatctgtttg 720
tattttcatt agttagtaac tgagctcagt tgccgtaatg ctaatagctt ccaaactagt 780
gtctctgtct ccagtatctg ataaatctta ggtgttgctg ggacagttgt cctaaaatta 840
agataaagca tgaaaataac tgacacaact ccattactgg ctcctaacta cttaaacaat 900
gcattctatc atcacaaatg tgaaaaagga gttccctcag tggactaacc ttatcttttc 960
tcaacacctt tttctttgca caattttcca cacatgccta caaaaagtac tt 1012
<210> 7
<211> 1073
<212> DNA
<213> Artificial Sequence
<400> 7
gtgctgagtc cttttcccat cccacccacc tggagctccc ctcttccagt cctgagccac 60
ttgaactggc ctggtttttg ccatcctgcg ctgccctctc tccggactcg agccactgct 120
gagggcctca ggccagtcca tcctcgtctt gtctctttcg ccctgctctt tccccacctt 180
gagcgctctt aaccagcctg gcccgtgcca cctctactct gccatcgaat gctgccccac 240
tttctcgagt ccgccacttc tcccagcttc accggtaccc actgtttccc ctagtccagg 300
caggtaccac tttccctgag cgtcctcctc ctctctcctg ggcctgtgct gcttcttttc 360
ccgctctctg gcctgggccg tttcttcggc cagcccccga gccttccatg ccctttcctt 420
caggtttctg ctcttcatcc ttggtctctg ccatctgttg ccatgtaagg gtgctctttc 480
ctgagccatc gccctcaagg cgctctgctc ctcaagtgga tgcttccctc gcctggctca 540
cctcctgctc tctctcctgc ccccttcacc tgcgtgccct cctcattctc cctctgtgcc 600
acctctggcc ttgcactgta ggctctctct tggggatgtt tctccttctc cacacacttc 660
tctttcactc tgtcctcttg ctttgtgtgg gcctgcagcg ttaccctttt ttctgggcac 720
actcagagca ccctcctctt tctggttctg ggccacctgt ctgtcctcgg gtcatcttgc 780
tctctctgcc tggatgccct cctgtggctt tgggcagctt ctccctcctt cagagtgcac 840
cgccagttct cctaggcccg gtcacttccc cttcccaggg gacctagagc cctgctaggt 900
cctctctctc cacaacctgg gcccccaaac ctttccaaaa caccttgctt tctgcctcca 960
ttggtcttgt gttccagagc cagagtcact atatgtccca gaaccaggat tccctctggt 1020
tctgagggct tttatcgcat cccctgcctg gctgcagtgg gtctttgggc gcc 1073
<210> 8
<211> 260
<212> DNA
<213> Artificial Sequence
<400> 8
gacaggccac agaagagcct ctactcctcc ctctgtcccc gaggctgtct ccctcccagt 60
cttcccagct caggccagtc cccaggcctc tcttccctgc cagagcccgt caggttcggt 120
tactttgggg cccagagagg accctgtgaa ggaagcgtgg gtaggggcac gggaatgggg 180
aggatgcctg aagaggcccc cttagccaga agaggagcag aagaggagca ggtacccaga 240
agaggagcag ttcagggaaa 260
<210> 9
<211> 546
<212> DNA
<213> Artificial Sequence
<400> 9
aaatacccac gtttattggg acaaaagttg ttagggaaaa tggggcctca gagttatgat 60
tcaagtcata attctttcca tttataattt cactcgagac tctgttaact gattccttgt 120
gtgttgtatc ttactcctca gctcacaatt acttttagtt attcacctta actgtatgaa 180
taacagtgga gaaaaggatt ctaccagaat actctaatta tggttttgag tcccctttcc 240
agactgaaga tttttcagtc tttttgatct gaggtgattt ttcagtcttt tcgatctgag 300
gtgacagtct caagctcctc aattcaccca gtctcttgat acttgtccat ttagggccac 360
caaagctact ttgacttcat actagagagt caattaatga ggccattctc tgatggacag 420
gtgaagcagg caaggtgact atattttgac taaacggtag aaaacagcct gagtgttaac 480
agtgtagcct ataaaaccca gagctgccca ccctgatcta aacttccagg aacataagaa 540
cgcgcc 546
<210> 10
<211> 1009
<212> DNA
<213> Artificial Sequence
<400> 10
agtaggtcac atttcagtaa aacctggctt tgtggattga gcatggtctg tctcttcctg 60
gtacttcatt agtcccctaa gtgggatttg ctgagcaaga ctcctcaatt acagaaatac 120
tccagtttag aattctcgca aaggcttttt gtttccacaa gtagaatcta gaaagcaatc 180
tcaagtaaca acagcagaga cctgaatccc aatccatctt tcctgtgtgt cctcttttac 240
ctccttccct ttcatgttga accaacagtc ctttttcagt ctagaagcta gtacgaaaga 300
aatgtacaga tgtaggtacc aagcaaagcc attagccaat aactggtgag atggagctaa 360
gaggaaataa aagtgttcct aagaatagca cagcagaagc tagatccaca gatcttaaaa 420
caattttggt tgagtaagag tagaggcaaa agaggaagct aataatgcag tttttaggag 480
ctaagagcca gataaagggt aagggcagga ggaagtgcta tctcagctaa cgagatacat 540
gaaacaacgg tggaagtcca gcaggcacaa gatgagttga gaagcaatca gggccagaag 600
gatgtgcaag gcctcaaaat aaaaaagcac agggccacag ggaaccttat ggaaattaaa 660
aggaagagga tgcagtcagg agaggaaaaa atagtgctcc ctcccccatg cccaaggaag 720
cagctgagca gccagtactt gggaagttag tagtaataag ttggtaagag ggagttctgt 780
tcgtggctca atggttaaca aatcagacta gaaaccgtga ggttgcgggt ttgatccctg 840
gccttgctca gtgggttaag gatccggcat tgccgtgacc tgtggtgtag gtcacagacg 900
tggctcagtt cccgcattcc tgtggctctg gtgtaggctg gtggctacag ctctgattag 960
acccctaggc tgggaacctc catatgccct ggaagtggcc gtagaaaag 1009
<210> 11
<211> 878
<212> DNA
<213> Artificial Sequence
<400> 11
ggatggggac tcatgtgaat tttctaaagg tgctatttaa acggggggca cgagtgccgg 60
ctttggacag ggccgctcgc tctccaccct ttcttcttcc ccctcggccg cctctcaccc 120
cctgaggcct ctctcccccc acgacctcct ctctctcctc tgaaaccctc tcctcctcag 180
ctgcatccca ccctcgtggc ctctctctct ctctgtctgt cctgtgtcct ctctcactgg 240
gtttcagagc acagatgccc aaagcacaaa agcagttttc ccctggggtg ggaggaagca 300
agagactttg tacctatttt gtatgtgtat aataatttga gatgttttta attattttga 360
ttgctggaat aaagcatgtg gaaatgaccc aaaccaatct tgcactggcc tcctgatttc 420
cttccttgga gacggaggga gggggagacc tgggggaggg cgcttggggg ggggtgggct 480
ctcttctttc tgcgctcccc ccccccacct ccaacacctt gacgacccct cctgcttccg 540
cttgcctttc tcaggcttta acactttctc ctcgccctct cagcatgcgc atgcgcgtgc 600
ctctacctcc cccgcacatc ctggcctgcc caccctgaat ggcctggccc agcgatgcca 660
ccaactctct cgctccgtcc acggctgggg aggggggcac tctgcagggt tggggggcac 720
tgggaggctg ggttgggtga gggaggggtg cctgggcccc caccccccag caagttctct 780
ccctaggcga actggagggt cgtctggcct cttgagcctt gttgctggct ctgagctcta 840
ccaagagagt gaccagcagg accgcaccat cacgcgcc 878
<210> 12
<211> 727
<212> DNA
<213> Artificial Sequence
<400> 12
gtggttgctg agactgcgtg ggggcccaag gagacctgga gaaaggaatg cttcctgctc 60
cttcttctgg ggccccagga gagccttccc agggccttgg agaggtgctg tccagggact 120
aaccctgtgc tctaggaagg ctgcaggccc tgaccagctg ggcaggtcct gggtccctcc 180
tggccttcta agttccccaa acatgagacc tctgggtgtg gggtggcctg gggaggtcat 240
tttgcccagg ccctacctcc tgcccattcc taaccctttt taaaaatctg tgcgtcctct 300
tcttccttct tctccctccc ttcccttttc gctcaccctc tgctgctggc ctgagagccg 360
gaggccccca gggggaaggc gactggtctc ctccccagtc tcagggaagg gagacagaga 420
atccaggaag ccagaactca gcagacgaag cacccaggga cctagagatg ggttgaaaag 480
ttgacagctg tcccacctgc ctcccaaggt ctcagggcct aaacctccaa ggcaggaaag 540
gcccctgtcc ctccctgggg tccatagaaa gagggacaag tctgcacgga ccatttgctg 600
taatattaac accttggctg tcattaggta gtcttggctg ttaattatgt cctgtgataa 660
tgtattatta gcacgccgac cacatagggt agggaactgc agctagtaaa caaaagtttg 720
ttcctat 727
<210> 13
<211> 10339
<212> DNA
<213> Artificial Sequence
<400> 13
ggatggggac tcatgtgaat tttctaaagg tgctatttaa acggggggca cgagtgccgg 60
ctttggacag ggccgctcgc tctccaccct ttcttcttcc ccctcggccg cctctcaccc 120
cctgaggcct ctctcccccc acgacctcct ctctctcctc tgaaaccctc tcctcctcag 180
ctgcatccca ccctcgtggc ctctctctct ctctgtctgt cctgtgtcct ctctcactgg 240
gtttcagagc acagatgccc aaagcacaaa agcagttttc ccctggggtg ggaggaagca 300
agagactttg tacctatttt gtatgtgtat aataatttga gatgttttta attattttga 360
ttgctggaat aaagcatgtg gaaatgaccc aaaccaatct tgcactggcc tcctgatttc 420
cttccttgga gacggaggga gggggagacc tgggggaggg cgcttggggg ggggtgggct 480
ctcttctttc tgcgctcccc ccccccacct ccaacacctt gacgacccct cctgcttccg 540
cttgcctttc tcaggcttta acactttctc ctcgccctct cagcatgcgc atgcgcgtgc 600
ctctacctcc cccgcacatc ctggcctgcc caccctgaat ggcctggccc agcgatgcca 660
ccaactctct cgctccgtcc acggctgggg aggggggcac tctgcagggt tggggggcac 720
tgggaggctg ggttgggtga gggaggggtg cctgggcccc caccccccag caagttctct 780
ccctaggcga actggagggt cgtctggcct cttgagcctt gttgctggct ctgagctcta 840
ccaagagagt gaccagcagg accgcaccat cacgcgcccc agacccgggc ctggggggca 900
agtcgggggg cggggggagg tcgggcaggg tcccctggga ggatggggac gtgctgtgcc 960
cctagcggcc accagagggc accaggacac cactgcggtc ggctcagcgg ctcctgccct 1020
ggtcaggggg cgccaggtcc tgcccctcct ggggagggcg gggggcgaga agggcgattg 1080
cagaaatggt tgaactcccg agagtgtcct acacctaggg gagaagcagc caaggggttg 1140
tttcccacca aggacgaccc gtctgcgcac aaacggatga gcccatcaga caaagacata 1200
ttcattctct gctgcaaact tggcatagct ctgctttgcc tggggctatt gggggaagtt 1260
gcggttcgtg ctcgcagggc tctcaccctt gactctttca ataataactc ttctgtgcaa 1320
gattacaatc taaacaattc ggagaactcg accttcctcc tgaggcaagg accacagcca 1380
acttcctctt acaagccgca tcgattttgt ccttcagaaa tagaaataag aatgcttgct 1440
aaaaattata tttttaccaa taagaccaat ccaataggta gattattagt tactatgtta 1500
agaaatgaat cattatcttt tagtactatt tttactcaaa ttcagaagtt agaaatggga 1560
atagaaaata gaaagagacg ctcaacctca attgaagaac aggtgcaagg actattgacc 1620
acaggcctag aagtaaaaaa gggaaaaaag agtgtttttg tcaaaatagg agacaggtgg 1680
tggcaaccag ggacttatag gggaccttac atctacagac caacagatgc ccccttacca 1740
tatacaggaa gatatgactt aaattgggat aggtgggtta cagtcaatgg ctataaagtg 1800
ttatatagat ccctcccctt tcgtgaaaga ctcgccagag ctagacctcc ttggtgtatg 1860
ttgtctcaag aaaagaaaga cgacatgaaa caacaggtac atgattatat ttatctagga 1920
acaggaatgc acttttgggg aaagattttc cataccaagg aggggacagt ggctggacta 1980
atagaacatt attctgcaaa aacttatggc atgagttatt atgattagcc ttgatttgcc 2040
caaccttgcg gttcccaagg cttaagtaag tttttggtta caaactgttc ttaaaacaag 2100
gatgtgagac aagtggtttc ctgacttggt ttggtatcaa aggttctgat ctgagctctg 2160
agtgttctat tttcctatgt tcttttggaa tttatccaaa tcttatgtaa atgcttatgt 2220
aaaccaagat ataaaagagt gctgattttt tgagtaaact tgcaacagtc ctaacattca 2280
cctcttgtgt gtttgtgtct gttcgccatc ccgtctccgc tcgtcactta tccttcactt 2340
tccagagggt ccccccgcag accccggcga ccctcaggtc ggccgactgc ggcatctaga 2400
gccaccatgg atagagttct gagcagagct gacaaagaaa ggctgctaga acttctaaaa 2460
cttcccagac aactatgggg ggattttgga agaatgcagc aggcatataa gcagcagtca 2520
ctgctactgc acccagacaa aggtggaagc catgccttaa tgcaggaatt gaacagtctc 2580
tggggaacat ttaaaactga agtatacaat ctgagaatga atctaggagg aaccggcttc 2640
caggtaagaa ggctacatgc ggatgggtgg aatctaagta ccaaagacac ctttggtgat 2700
agatactacc agcggttctg cagaatgcct cttacctgcc tagtaaatgt taaatacagc 2760
tcatgtagtt gtatattatg cctgcttaga aagcaacata gagagctcaa agacaaatgt 2820
gatgccaggt gcctagtact tggagaatgt ttttgtcttg aatgttacat gcaatggttt 2880
ggaacaccaa cccgagatgt gctgaacctg tatgcagact tcattgcaag catgcctata 2940
gactggctgg acctggatgt gcacagcgtg tataatccaa aacggcggag cgaggaactg 3000
aggagagcgg ccacagtcca ctacacgatg actactggtc attcagctat ggaagcaagt 3060
acttcacaag ggaatggaat gatttcttca gaaagtggga ccccagctac cagtcgccgc 3120
ctaagactgc cgagtcttct gagcaacccg acctattctg ttatgaggag ccactcctat 3180
cccccaaccc gagttctcca acagatacac ccgcacatac tgctggaaga agacgaaatc 3240
cttgtgttgc tgagcccgat gacagcatat ccccggaccc ccccagaact cctgtatcca 3300
gaaagcgacc aagaccagct ggagccactg gaggaggagg aggaggagta catgccaatg 3360
gaggatctgt atttggacat cctaccgggg gaacaagtac cccagctcat ccccccccct 3420
atcattccca gggcgggtct gagtccatgg gagggtctga ttcttcggga tttgcagagg 3480
gctcatttcg atccgatcct agatgcgagt cagagaatga gagctactca cagagctgct 3540
ctcagagctc attcaatgca acgccaccta agaaggctag ggaggaccct gctcctagtg 3600
actttcctag cagccttact gggtatttgt ctcatgctat ttattctaat aaaacgttcc 3660
cggcatttct agggccgcga ctctagagtc ggggcggccg gccgcttcga gcagacatga 3720
ctgtgccttc tagttgccag ccatctgttg tttgcccctc ccccgtgcct tccttgaccc 3780
tggaaggtgc cactcccact gtcctttcct aataaaatga ggaaattgca tcgcattgtc 3840
tgagtaggtg tcattctatt ctggggggtg gggtggggca ggacagcaag ggggaggatt 3900
gggaagacaa tagcaggcat gctggggatg cggtgggctc tatggaacaa caacaattgc 3960
attcatttta tgtttcaggt tcagggggag gtgtgggagg tctgaggcgg aaagaaccag 4020
ctgccttaat ataacttcgt ataatgtatg ctatacgaag ttattaggtc tgaagaggag 4080
tttacgtcca gccaattctg tggaatgtgt gtcagttagg gtgtggaaag tccccaggct 4140
ccccagcagg cagaagtatg caaagcatgc atctcaatta gtcagcaacc aggtgtggaa 4200
agtccccagg ctccccagca ggcagaagta tgcaaagcat gcatctcaat tagtcagcaa 4260
ccatagtccc gcccctaact ccgcccatcc cgcccctaac tccgcccagt tccgcccatt 4320
ctccgcccca tggctgacta atttttttta tttatgcaga ggccgaggcc gcctctgcct 4380
ctgagctatt ccagaagtag tgaggaggct tttttggagg cctaggcttt tgcaaaaagc 4440
tcccgggagc ttgtatatcc attttcggcg gccgcgccac catgaccgag tacaagccca 4500
cggtgcgcct cgccacccgc gacgacgtcc ccagggccgt acgcaccctc gccgccgcgt 4560
tcgccgacta ccccgccacg cgccacaccg tcgatccgga ccgccacatc gagcgggtca 4620
ccgagctgca agaactcttc ctcacgcgcg tcgggctcga catcggcaag gtgtgggtcg 4680
cggacgacgg cgccgcggtg gcggtctgga ccacgccgga gagcgtcgaa gcgggggcgg 4740
tgttcgccga gatcggcccg cgcatggccg agttgagcgg ttcccggctg gccgcgcagc 4800
aacagatgga aggcctcctg gcgccgcacc ggcccaagga gcccgcgtgg ttcctggcca 4860
ccgtcggagt ctcgcccgac caccagggca agggtctggg cagcgccgtc gtgctccccg 4920
gagtggaggc ggccgagcgc gccggggtgc ccgccttcct ggagacctcc gcgccccgca 4980
acctcccctt ctacgagcgg ctcggcttca ccgtcaccgc cgacgtcgag gtgcccgaag 5040
gaccgcgcac ctggtgcatg acccgcaagc ccggtgcctg agaattcgcg ggactctggg 5100
gttcgaaatg accgaccaag cgacgcccaa cctgccatca cgagatttcg attccaccgc 5160
cgccttctat gaaaggttgg gcttcggaat cgttttccgg gacgccggct ggatgatcct 5220
ccagcgcggg gatctcatgc tggagttctt cgcccacccc aacttgttta ttgcagctta 5280
taatggttac aaataaagca atagcatcac aaatttcaca aataaagcat ttttttcact 5340
gcattctagt tgtggtttgt ccaaactcat caatgtatct tatcatgtct gtataccgct 5400
cgactagagc ttgcggaacc cttaatataa cttcgtataa tgtatgctat acgaagttat 5460
taggtccgct ggccatctac gagccaaaga ctttcaaatc tttggctgcc ttggccagta 5520
ggaggcgaca cgaaggattt gctgctgcct tgggggatgg gaaggaacct gaaggcattt 5580
tttccagagt ggtgcagtac cactgaggac tgttgctgta ttgattagga aaagagacag 5640
agtaatttgc agtttgtttg atttatactg tggttgctga gactgcgtgg gggcccaagg 5700
agacctggag aaaggaatgc ttcctgctcc ttcttctggg gccccaggag agccttccca 5760
gggccttgga gaggtgctgt ccagggacta accctgtgct ctaggaaggc tgcaggccct 5820
gaccagctgg gcaggtcctg ggtccctcct ggccttctaa gttccccaaa catgagacct 5880
ctgggtgtgg ggtggcctgg ggaggtcatt ttgcccaggc cctacctcct gcccattcct 5940
aacccttttt aaaaatctgt gcgtcctctt cttccttctt ctccctccct tcccttttcg 6000
ctcaccctct gctgctggcc tgagagccgg aggcccccag ggggaaggcg actggtctcc 6060
tccccagtct cagggaaggg agacagagaa tccaggaagc cagaactcag cagacgaagc 6120
acccagggac ctagagatgg gttgaaaagt tgacagctgt cccacctgcc tcccaaggtc 6180
tcagggccta aacctccaag gcaggaaagg cccctgtccc tccctggggt ccatagaaag 6240
agggacaagt ctgcacggac catttgctgt aatattaaca ccttggctgt cattaggtag 6300
tcttggctgt taattatgtc ctgtgataat gtattattag cacgccgacc acatagggta 6360
gggaactgca gctagtaaac aaaagtttgt tcctatatgc ggccgccata aaagttttgt 6420
tactttatag aagaaatttt gagtttttgt tttttttaat aaataaataa acataaataa 6480
attgtttgtt gaatttatta ttagtatgta agtgtaaata taataaaact taatatctat 6540
tcaaattaat aaataaacct cgatatacag accgataaaa cacatgcgtc aattttacac 6600
atgattatct ttaacgtacg tcacaatatg attatctttc tagggttaat ctagctgcgt 6660
gttctgcagc gtgtcgagca tcttcatctg ctccatcacg ctgtaaaaca catttgcacc 6720
gcgagtctgc ccgtcctcca cgggttcaaa aacgtgaatg aacgaggcgc gctcactggc 6780
cgtcgtttta caacgtcgtg actgggaaaa ccctggcgtt acccaactta atcgccttgc 6840
agcacatccc cctttcgcca gctggcgtaa tagcgaagag gcccgcaccg atcgcccttc 6900
ccaacagttg cgcagcctga atggcgaatg ggacgcgccc tgtagcggcg cattaagcgc 6960
ggcgggtgtg gtggttacgc gcagcgtgac cgctacactt gccagcgccc tagcgcccgc 7020
tcctttcgct ttcttccctt cctttctcgc cacgttcgcc ggctttcccc gtcaagctct 7080
aaatcggggg ctccctttag ggttccgatt tagtgcttta cggcacctcg accccaaaaa 7140
acttgattag ggtgatggtt cacgtagtgg gccatcgccc tgatagacgg tttttcgccc 7200
tttgacgttg gagtccacgt tctttaatag tggactcttg ttccaaactg gaacaacact 7260
caaccctatc tcggtctatt cttttgattt ataagggatt ttgccgattt cggcctattg 7320
gttaaaaaat gagctgattt aacaaaaatt taacgcgaat tttaacaaaa tattaacgct 7380
tacaatttag gtggcacttt tcggggaaat gtgcgcggaa cccctatttg tttatttttc 7440
taaatacatt caaatatgta tccgctcatg agacaataac cctgataaat gcttcaataa 7500
tattgaaaaa ggaagagtat gagtattcaa catttccgtg tcgcccttat tccctttttt 7560
gcggcatttt gccttcctgt ttttgctcac ccagaaacgc tggtgaaagt aaaagatgct 7620
gaagatcagt tgggtgcacg agtgggttac atcgaactgg atctcaacag cggtaagatc 7680
cttgagagtt ttcgccccga agaacgtttt ccaatgatga gcacttttaa agttctgcta 7740
tgtggcgcgg tattatcccg tattgacgcc gggcaagagc aactcggtcg ccgcatacac 7800
tattctcaga atgacttggt tgagtactca ccagtcacag aaaagcatct tacggatggc 7860
atgacagtaa gagaattatg cagtgctgcc ataaccatga gtgataacac tgcggccaac 7920
ttacttctga caacgatcgg aggaccgaag gagctaaccg cttttttgca caacatgggg 7980
gatcatgtaa ctcgccttga tcgttgggaa ccggagctga atgaagccat accaaacgac 8040
gagcgtgaca ccacgatgcc tgtagcaatg gcaacaacgt tgcgcaaact attaactggc 8100
gaactactta ctctagcttc ccggcaacaa ttaatagact ggatggaggc ggataaagtt 8160
gcaggaccac ttctgcgctc ggcccttccg gctggctggt ttattgctga taaatctgga 8220
gccggtgagc gtggttcacg cggtatcatt gcagcactgg ggccagatgg taagccctcc 8280
cgtatcgtag ttatctacac gacggggagt caggcaacta tggatgaacg aaatagacag 8340
atcgctgaga taggtgcctc actgattaag cattggtaac tgtcagacca agtttactca 8400
tatatacttt agattgattt aaaacttcat ttttaattta aaaggatcta ggtgaagatc 8460
ctttttgata atctcatgac caaaatccct taacgtgagt tttcgttcca ctgagcgtca 8520
gaccccgtag aaaagatcaa aggatcttct tgagatcctt tttttctgcg cgtaatctgc 8580
tgcttgcaaa caaaaaaacc accgctacca gcggtggttt gtttgccgga tcaagagcta 8640
ccaactcttt ttccgaaggt aactggcttc agcagagcgc agataccaaa tactgtcctt 8700
ctagtgtagc cgtagttagg ccaccacttc aagaactctg tagcaccgcc tacatacctc 8760
gctctgctaa tcctgttacc agtggctgct gccagtggcg ataagtcgtg tcttaccggg 8820
ttggactcaa gacgatagtt accggataag gcgcagcggt cgggctgaac ggggggttcg 8880
tgcacacagc ccagcttgga gcgaacgacc tacaccgaac tgagatacct acagcgtgag 8940
ctatgagaaa gcgccacgct tcccgaaggg agaaaggcgg acaggtatcc ggtaagcggc 9000
agggtcggaa caggagagcg cacgagggag cttccagggg gaaacgcctg gtatctttat 9060
agtcctgtcg ggtttcgcca cctctgactt gagcgtcgat ttttgtgatg ctcgtcaggg 9120
gggcggagcc tatggaaaaa cgccagcaac gcggcctttt tacggttcct ggccttttgc 9180
tggccttttg ctcacatgtt ctttcctgcg ttatcccctg attctgtgga taaccgtatt 9240
accgcctttg agtgagctga taccgctcgc cgcagccgaa cgaccgagcg cagcgagtca 9300
gtgagcgagg aagcggaaga gcgcccaata cgcaaaccgc ctctccccgc gcgttggccg 9360
attcattaat gcagctggca cgacaggttt cccgactgga aagcgggcag tgagcgcaac 9420
gcaattaatg tgagttagct cactcattag gcaccccagg ctttacactt tatgcttccg 9480
gctcgtatgt tgtgtggaat tgtgagcgga taacaatttc acacaggaaa cagctatgac 9540
catgattacg ccaagcgcgc ccgccgggta actcacgggg tatccatgtc catttctgcg 9600
gcatccagcc aggatacccg tcctcgctga cgtaatatcc cagcgccgca ccgctgtcat 9660
taatctgcac accggcacgg cagttccggc tgtcgccggt attgttcggg ttgctgatgc 9720
gcttcgggct gaccatccgg aactgtgtcc ggaaaagccg cgacgaactg gtatcccagg 9780
tggcctgaac gaacagttca ccgttaaagg cgtgcatggc cacaccttcc cgaatcatca 9840
tggtaaacgt gcgttttcgc tcaacgtcaa tgcagcagca gtcatcctcg gcaaactctt 9900
tccatgccgc ttcaacctcg cgggaaaagg cacgggcttc ttcctccccg atgcccagat 9960
agcgccagct tgggcgatga ctgagccgga aaaaagaccc gacgatatga tcctgatgca 10020
gctagattaa ccctagaaag atagtctgcg taaaattgac gcatgcattc ttgaaatatt 10080
gctctctctt tctaaatagc gcgaatccgt cgctgtgcat ttaggacatc tcagtcgccg 10140
cttggagctc ccgtgaggcg tgcttgtcaa tgcggtaagt gtcactgatt ttgaactata 10200
acgaccgcgt gagtcaaaat gacgcatgat tatcttttac gtgactttta agatttaact 10260
catacgataa ttatattgtt atttcatgtt ctacttacgt gataacttat tatatatata 10320
ttttcttgtt atagatatc 10339
<210> 14
<211> 421
<212> PRT
<213> Artificial Sequence
<400> 14
Met Asp Arg Val Leu Ser Arg Ala Asp Lys Glu Arg Leu Leu Glu Leu
1 5 10 15
Leu Lys Leu Pro Arg Gln Leu Trp Gly Asp Phe Gly Arg Met Gln Gln
20 25 30
Ala Tyr Lys Gln Gln Ser Leu Leu Leu His Pro Asp Lys Gly Gly Ser
35 40 45
His Ala Leu Met Gln Glu Leu Asn Ser Leu Trp Gly Thr Phe Lys Thr
50 55 60
Glu Val Tyr Asn Leu Arg Met Asn Leu Gly Gly Thr Gly Phe Gln Val
65 70 75 80
Arg Arg Leu His Ala Asp Gly Trp Asn Leu Ser Thr Lys Asp Thr Phe
85 90 95
Gly Asp Arg Tyr Tyr Gln Arg Phe Cys Arg Met Pro Leu Thr Cys Leu
100 105 110
Val Asn Val Lys Tyr Ser Ser Cys Ser Cys Ile Leu Cys Leu Leu Arg
115 120 125
Lys Gln His Arg Glu Leu Lys Asp Lys Cys Asp Ala Arg Cys Leu Val
130 135 140
Leu Gly Glu Cys Phe Cys Leu Glu Cys Tyr Met Gln Trp Phe Gly Thr
145 150 155 160
Pro Thr Arg Asp Val Leu Asn Leu Tyr Ala Asp Phe Ile Ala Ser Met
165 170 175
Pro Ile Asp Trp Leu Asp Leu Asp Val His Ser Val Tyr Asn Pro Lys
180 185 190
Arg Arg Ser Glu Glu Leu Arg Arg Ala Ala Thr Val His Tyr Thr Met
195 200 205
Thr Thr Gly His Ser Ala Met Glu Ala Ser Thr Ser Gln Gly Asn Gly
210 215 220
Met Ile Ser Ser Glu Ser Gly Thr Pro Ala Thr Ser Arg Arg Leu Arg
225 230 235 240
Leu Pro Ser Leu Leu Ser Asn Pro Thr Tyr Ser Val Met Arg Ser His
245 250 255
Ser Tyr Pro Pro Thr Arg Val Leu Gln Gln Ile His Pro His Ile Leu
260 265 270
Leu Glu Glu Asp Glu Ile Leu Val Leu Leu Ser Pro Met Thr Ala Tyr
275 280 285
Pro Arg Thr Pro Pro Glu Leu Leu Tyr Pro Glu Ser Asp Gln Asp Gln
290 295 300
Leu Glu Pro Leu Glu Glu Glu Glu Glu Glu Tyr Met Pro Met Glu Asp
305 310 315 320
Leu Tyr Leu Asp Ile Leu Pro Gly Glu Gln Val Pro Gln Leu Ile Pro
325 330 335
Pro Pro Ile Ile Pro Arg Ala Gly Leu Ser Pro Trp Glu Gly Leu Ile
340 345 350
Leu Arg Asp Leu Gln Arg Ala His Phe Asp Pro Ile Leu Asp Ala Ser
355 360 365
Gln Arg Met Arg Ala Thr His Arg Ala Ala Leu Arg Ala His Ser Met
370 375 380
Gln Arg His Leu Arg Arg Leu Gly Arg Thr Leu Leu Leu Val Thr Phe
385 390 395 400
Leu Ala Ala Leu Leu Gly Ile Cys Leu Met Leu Phe Ile Leu Ile Lys
405 410 415
Arg Ser Arg His Phe
420
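Note: the first residues of SEQ ID NO: 14 appear to match a translation of the open reading frame that begins at the atg following the gccacc motif in SEQ ID NO: 13 (roughly positions 2407 to 3669 by the listing's own numbering, followed by a tag stop codon). The sketch below is only an illustrative check under that assumption. It translates the first 54 nucleotides and the last few codons of that frame with the standard genetic code. The names `translate`, `orf_start`, and `orf_end` are hypothetical and do not come from the patent.

```python
from itertools import product

# Standard genetic code, built in the conventional T, C, A, G codon order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna):
    """Translate a DNA string in frame 0, stopping at the first stop codon."""
    dna = dna.upper()
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":  # stop codon ends translation
            break
        protein.append(aa)
    return "".join(protein)


# First 54 nt of the presumed ORF in SEQ ID NO: 13 (copied from the listing).
orf_start = "atggatagagttctgagcagagctgacaaagaaaggctgctagaacttctaaaa"
# Last six codons plus the stop codon of the same presumed ORF.
orf_end = "aaacgttcccggcatttctag"

print(translate(orf_start))  # one-letter codes for residues 1 to 18
print(translate(orf_end))    # one-letter codes for residues 416 to 421
```

Comparing the printed one-letter codes with the three-letter residues at the start and end of SEQ ID NO: 14 gives a quick consistency check between the two records. It does not replace a full-length alignment.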
<210> 15
<211> 7246
<212> DNA
<213> Artificial Sequence
<400> 15
ggcctaactg gccgcagaaa tggttgaact cccgagagtg tcctacacct aggggagaag 60
cagccaaggg gttgtttccc accaaggacg acccgtctgc gcacaaacgg atgagcccat 120
cagacaaaga catattcatt ctctgctgca aacttggcat agctctgctt tgcctggggc 180
tattggggga agttgcggtt cgtgctcgca gggctctcac ccttgactct ttcaataata 240
actcttctgt gcaagattac aatctaaaca attcggagaa ctcgaccttc ctcctgaggc 300
aaggaccaca gccaacttcc tcttacaagc cgcatcgatt ttgtccttca gaaatagaaa 360
taagaatgct tgctaaaaat tatattttta ccaataagac caatccaata ggtagattat 420
tagttactat gttaagaaat gaatcattat cttttagtac tatttttact caaattcaga 480
agttagaaat gggaatagaa aatagaaaga gacgctcaac ctcaattgaa gaacaggtgc 540
aaggactatt gaccacaggc ctagaagtaa aaaagggaaa aaagagtgtt tttgtcaaaa 600
taggagacag gtggtggcaa ccagggactt ataggggacc ttacatctac agaccaacag 660
atgccccctt accatataca ggaagatatg acttaaattg ggataggtgg gttacagtca 720
atggctataa agtgttatat agatccctcc cctttcgtga aagactcgcc agagctagac 780
ctccttggtg tatgttgtct caagaaaaga aagacgacat gaaacaacag gtacatgatt 840
atatttatct aggaacagga atgcactttt ggggaaagat tttccatacc aaggagggga 900
cagtggctgg actaatagaa cattattctg caaaaactta tggcatgagt tattatgatt 960
agccttgatt tgcccaacct tgcggttccc aaggcttaag taagtttttg gttacaaact 1020
gttcttaaaa caaggatgtg agacaagtgg tttcctgact tggtttggta tcaaaggttc 1080
tgatctgagc tctgagtgtt ctattttcct atgttctttt ggaatttatc caaatcttat 1140
gtaaatgctt atgtaaacca agatataaaa gagtgctgat tttttgagta aacttgcaac 1200
agtcctaaca ttcacctctt gtgtgtttgt gtctgttcgc catcccgtct ccgctcgtca 1260
cttatccttc actttccaga gggtcccccc gcagaccccg gcgaccctca ggtcggccga 1320
ctgcggcagg cctcggcggc caagcttggc aatccggtac tgttggtaaa gccaccatgg 1380
aagatgccaa aaacattaag aagggcccag cgccattcta cccactcgaa gacgggaccg 1440
ccggcgagca gctgcacaaa gccatgaagc gctacgccct ggtgcccggc accatcgcct 1500
ttaccgacgc acatatcgag gtggacatta cctacgccga gtacttcgag atgagcgttc 1560
ggctggcaga agctatgaag cgctatgggc tgaatacaaa ccatcggatc gtggtgtgca 1620
gcgagaatag cttgcagttc ttcatgcccg tgttgggtgc cctgttcatc ggtgtggctg 1680
tggccccagc taacgacatc tacaacgagc gcgagctgct gaacagcatg ggcatcagcc 1740
agcccaccgt cgtattcgtg agcaagaaag ggctgcaaaa gatcctcaac gtgcaaaaga 1800
agctaccgat catacaaaag atcatcatca tggatagcaa gaccgactac cagggcttcc 1860
aaagcatgta caccttcgtg acttcccatt tgccacccgg cttcaacgag tacgacttcg 1920
tgcccgagag cttcgaccgg gacaaaacca tcgccctgat catgaacagt agtggcagta 1980
ccggattgcc caagggcgta gccctaccgc accgcaccgc ttgtgtccga ttcagtcatg 2040
cccgcgaccc catcttcggc aaccagatca tccccgacac cgctatcctc agcgtggtgc 2100
catttcacca cggcttcggc atgttcacca cgctgggcta cttgatctgc ggctttcggg 2160
tcgtgctcat gtaccgcttc gaggaggagc tattcttgcg cagcttgcaa gactataaga 2220
ttcaatctgc cctgctggtg cccacactat ttagcttctt cgctaagagc actctcatcg 2280
acaagtacga cctaagcaac ttgcacgaga tcgccagcgg cggggcgccg ctcagcaagg 2340
aggtaggtga ggccgtggcc aaacgcttcc acctaccagg catccgccag ggctacggcc 2400
tgacagaaac aaccagcgcc attctgatca cccccgaagg ggacgacaag cctggcgcag 2460
taggcaaggt ggtgcccttc ttcgaggcta aggtggtgga cttggacacc ggtaagacac 2520
tgggtgtgaa ccagcgcggc gagctgtgcg tccgtggccc catgatcatg agcggctacg 2580
ttaacaaccc cgaggctaca aacgctctca tcgacaagga cggctggctg cacagcggcg 2640
acatcgccta ctgggacgag gacgagcact tcttcatcgt ggaccggctg aagagcctga 2700
tcaaatacaa gggctaccag gtagccccag ccgaactgga gagcatcctg ctgcaacacc 2760
ccaacatctt cgacgccggg gtcgccggcc tgcccgacga cgatgccggc gagctgcccg 2820
ccgcagtcgt cgtgctggaa cacggtaaaa ccatgaccga gaaggagatc gtggactatg 2880
tggccagcca ggttacaacc gccaagaagc tgcgcggtgg tgttgtgttc gtggacgagg 2940
tgcctaaagg actgaccggc aagttggacg cccgcaagat ccgcgagatt ctcattaagg 3000
ccaagaaggg cggcaagatc gccgtgaatt ctcacggctt ccctcccgag gtggaggagc 3060
aggccgccgg caccctgccc atgagctgcg cccaggagag cggcatggat agacaccctg 3120
ctgcttgcgc cagcgccagg atcaacgtct aaggccgcga ctctagagtc ggggcggccg 3180
gccgcttcga gcagacatga taagatacat tgatgagttt ggacaaacca caactagaat 3240
gcagtgaaaa aaatgcttta tttgtgaaat ttgtgatgct attgctttat ttgtaaccat 3300
tataagctgc aataaacaag ttaacaacaa caattgcatt cattttatgt ttcaggttca 3360
gggggaggtg tgggaggttt tttaaagcaa gtaaaacctc tacaaatgtg gtaaaatcga 3420
taaggatccg tttgcgtatt gggcgctctt ccgctgatct gcgcagcacc atggcctgaa 3480
ataacctctg aaagaggaac ttggttagct accttctgag gcggaaagaa ccagctgtgg 3540
aatgtgtgtc agttagggtg tggaaagtcc ccaggctccc cagcaggcag aagtatgcaa 3600
agcatgcatc tcaattagtc agcaaccagg tgtggaaagt ccccaggctc cccagcaggc 3660
agaagtatgc aaagcatgca tctcaattag tcagcaacca tagtcccgcc cctaactccg 3720
cccatcccgc ccctaactcc gcccagttcc gcccattctc cgccccatgg ctgactaatt 3780
ttttttattt atgcagaggc cgaggccgcc tctgcctctg agctattcca gaagtagtga 3840
ggaggctttt ttggaggcct aggcttttgc aaaaagctcg attcttctga cactagcgcc 3900
accatgaaga agcccgaact caccgctacc agcgttgaaa aatttctcat cgagaagttc 3960
gacagtgtga gcgacctgat gcagttgtcg gagggcgaag agagccgagc cttcagcttc 4020
gatgtcggcg gacgcggcta tgtactgcgg gtgaatagct gcgctgatgg cttctacaaa 4080
gaccgctacg tgtaccgcca cttcgccagc gctgcactac ccatccccga agtgttggac 4140
atcggcgagt tcagcgagag cctgacatac tgcatcagta gacgcgccca aggcgttact 4200
ctccaagacc tccccgaaac agagctgcct gctgtgttac agcctgtcgc cgaagctatg 4260
gatgctattg ccgccgccga cctcagtcaa accagcggct tcggcccatt cgggccccaa 4320
ggcatcggcc agtacacaac ctggcgggat ttcatttgcg ccattgctga tccccatgtc 4380
taccactggc agaccgtgat ggacgacacc gtgtccgcca gcgtagctca agccctggac 4440
gaactgatgc tgtgggccga agactgtccc gaggtgcgcc acctcgtcca tgccgacttc 4500
ggcagcaaca acgtcctgac cgacaacggc cgcatcaccg ccgtaatcga ctggtccgaa 4560
gctatgttcg gggacagtca gtacgaggtg gccaacatct tcttctggcg gccctggctg 4620
gcttgcatgg agcagcagac tcgctacttc gagcgccggc atcccgagct ggccggcagc 4680
cctcgtctgc gagcctacat gctgcgcatc ggcctggatc agctctacca gagcctcgtg 4740
gacggcaact tcgacgatgc tgcctgggct caaggccgct gcgatgccat cgtccgcagc 4800
ggggccggca ccgtcggtcg cacacaaatc gctcgccgga gcgcagccgt atggaccgac 4860
ggctgcgtcg aggtgctggc cgacagcggc aaccgccggc ccagtacacg accgcgcgct 4920
aaggaggtag gtcgagttta aactctagaa ccggtcatgg ccgcaataaa atatctttat 4980
tttcattaca tctgtgtgtt ggttttttgt gtgttcgaac tagatgctgt cgaccgatgc 5040
ccttgagagc cttcaaccca gtcagctcct tccggtgggc gcggggcatg actatcgtcg 5100
ccgcacttat gactgtcttc tttatcatgc aactcgtagg acaggtgccg gcagcgctct 5160
tccgcttcct cgctcactga ctcgctgcgc tcggtcgttc ggctgcggcg agcggtatca 5220
gctcactcaa aggcggtaat acggttatcc acagaatcag gggataacgc aggaaagaac 5280
atgtgagcaa aaggccagca aaaggccagg aaccgtaaaa aggccgcgtt gctggcgttt 5340
ttccataggc tccgcccccc tgacgagcat cacaaaaatc gacgctcaag tcagaggtgg 5400
cgaaacccga caggactata aagataccag gcgtttcccc ctggaagctc cctcgtgcgc 5460
tctcctgttc cgaccctgcc gcttaccgga tacctgtccg cctttctccc ttcgggaagc 5520
gtggcgcttt ctcatagctc acgctgtagg tatctcagtt cggtgtaggt cgttcgctcc 5580
aagctgggct gtgtgcacga accccccgtt cagcccgacc gctgcgcctt atccggtaac 5640
tatcgtcttg agtccaaccc ggtaagacac gacttatcgc cactggcagc agccactggt 5700
aacaggatta gcagagcgag gtatgtaggc ggtgctacag agttcttgaa gtggtggcct 5760
aactacggct acactagaag aacagtattt ggtatctgcg ctctgctgaa gccagttacc 5820
ttcggaaaaa gagttggtag ctcttgatcc ggcaaacaaa ccaccgctgg tagcggtggt 5880
ttttttgttt gcaagcagca gattacgcgc agaaaaaaag gatctcaaga agatcctttg 5940
atcttttcta cggggtctga cgctcagtgg aacgaaaact cacgttaagg gattttggtc 6000
atgagattat caaaaaggat cttcacctag atccttttaa attaaaaatg aagttttaaa 6060
tcaatctaaa gtatatatga gtaaacttgg tctgacagcg gccgcaaatg ctaaaccact 6120
gcagtggtta ccagtgcttg atcagtgagg caccgatctc agcgatctgc ctatttcgtt 6180
cgtccatagt ggcctgactc cccgtcgtgt agatcactac gattcgtgag ggcttaccat 6240
caggccccag cgcagcaatg atgccgcgag agccgcgttc accggccccc gatttgtcag 6300
caatgaacca gccagcaggg agggccgagc gaagaagtgg tcctgctact ttgtccgcct 6360
ccatccagtc tatgagctgc tgtcgtgatg ctagagtaag aagttcgcca gtgagtagtt 6420
tccgaagagt tgtggccatt gctactggca tcgtggtatc acgctcgtcg ttcggtatgg 6480
cttcgttcaa ctctggttcc cagcggtcaa gccgggtcac atgatcaccc atattatgaa 6540
gaaatgcagt cagctcctta gggcctccga tcgttgtcag aagtaagttg gccgcggtgt 6600
tgtcgctcat ggtaatggca gcactacaca attctcttac cgtcatgcca tccgtaagat 6660
gcttttccgt gaccggcgag tactcaacca agtcgttttg tgagtagtgt atacggcgac 6720
caagctgctc ttgcccggcg tctatacggg acaacaccgc gccacatagc agtactttga 6780
aagtgctcat catcgggaat cgttcttcgg ggcggaaaga ctcaaggatc ttgccgctat 6840
tgagatccag ttcgatatag cccactcttg cacccagttg atcttcagca tcttttactt 6900
tcaccagcgt ttcggggtgt gcaaaaacag gcaagcaaaa tgccgcaaag aagggaatga 6960
gtgcgacacg aaaatgttgg atgctcatac tcgtcctttt tcaatattat tgaagcattt 7020
atcagggtta ctagtacgtc tctcaaggat aagtaagtaa tattaaggta cgggaggtat 7080
tggacaggcc gcaataaaat atctttattt tcattacatc tgtgtgttgg ttttttgtgt 7140
gaatcgatag tactaacata cgctctccat caaaacaaaa cgaaacaaaa caaactagca 7200
aaataggctg tccccagtgc aagtgcaggt gccagaacat ttctct 7246
<210> 16
<211> 20
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 16
agttatggca gaactcagtg 20
<210> 17
<211> 23
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 17
ccccatccaa agtttttaaa gga 23
<210> 18
<211> 23
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 18
tgtggcagat gtcacagttt agg 23
<210> 19
<211> 25
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 19
caccgagtta tggcagaact cagtg 25
<210> 20
<211> 25
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 20
aaaccactga gttctgccat aactc 25
<210> 21
<211> 20
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 21
gaaggagcaa actgacatgg 20
<210> 22
<211> 20
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 22
tgcagtgggt ctttggggac 20
<210> 23
<211> 20
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 23
ttccaggaac ataagaaagt 20
<210> 24
<211> 20
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 24
gcagtctcag caaccactga 20
<210> 25
<211> 20
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 25
ggtcggagtg aacggatttg 20
<210> 26
<211> 20
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 26
ccatttgatg ttggcgggat 20
<210> 27
<211> 20
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 27
agatccgcca caacatcgag 20
<210> 28
<211> 20
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 28
gtccatgccg agagtgatcc 20
<210> 29
<211> 20
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 29
cctgctgtaa gtgccgtagt 20
<210> 30
<211> 18
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 30
ctaggggcac agcacgtc 18
<210> 31
<211> 26
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 31
aagttattag gtctgaagag gagttt 26
<210> 32
<211> 20
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 32
cccatcattc cgtcccagag 20
<210> 33
<211> 20
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 33
tgctgagttc tggcttcctg 20
<210> 34
<211> 23
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 34
tctaccaaga gagtgaccag cag 23
<210> 35
<211> 20
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 35
cacgccatcc tgcgtctgga 20
<210> 36
<211> 20
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 36
agcaccgtgt tggcgtagag 20
<210> 37
<211> 22
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 37
gtctccgctc gtcacttatc ct 22
<210> 38
<211> 22
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 38
ctagcagcct ttctttgtca gc 22
<210> 39
<211> 1266
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 39
atggatagag ttctgagcag agctgacaaa gaaaggctgc tagaacttct aaaacttccc 60
agacaactat ggggggattt tggaagaatg cagcaggcat ataagcagca gtcactgcta 120
ctgcacccag acaaaggtgg aagccatgcc ttaatgcagg aattgaacag tctctgggga 180
acatttaaaa ctgaagtata caatctgaga atgaatctag gaggaaccgg cttccaggta 240
agaaggctac atgcggatgg gtggaatcta agtaccaaag acacctttgg tgatagatac 300
taccagcggt tctgcagaat gcctcttacc tgcctagtaa atgttaaata cagctcatgt 360
agttgtatat tatgcctgct tagaaagcaa catagagagc tcaaagacaa atgtgatgcc 420
aggtgcctag tacttggaga atgtttttgt cttgaatgtt acatgcaatg gtttggaaca 480
ccaacccgag atgtgctgaa cctgtatgca gacttcattg caagcatgcc tatagactgg 540
ctggacctgg atgtgcacag cgtgtataat ccaaaacggc ggagcgagga actgaggaga 600
gcggccacag tccactacac gatgactact ggtcattcag ctatggaagc aagtacttca 660
caagggaatg gaatgatttc ttcagaaagt gggaccccag ctaccagtcg ccgcctaaga 720
ctgccgagtc ttctgagcaa cccgacctat tctgttatga ggagccactc ctatccccca 780
acccgagttc tccaacagat acacccgcac atactgctgg aagaagacga aatccttgtg 840
ttgctgagcc cgatgacagc atatccccgg acccccccag aactcctgta tccagaaagc 900
gaccaagacc agctggagcc actggaggag gaggaggagg agtacatgcc aatggaggat 960
ctgtatttgg acatcctacc gggggaacaa gtaccccagc tcatcccccc ccctatcatt 1020
cccagggcgg gtctgagtcc atgggagggt ctgattcttc gggatttgca gagggctcat 1080
ttcgatccga tcctagatgc gagtcagaga atgagagcta ctcacagagc tgctctcaga 1140
gctcattcaa tgcaacgcca cctaagaagg ctagggagga ccctgctcct agtgactttc 1200
ctagcagcct tactgggtat ttgtctcatg ctatttattc taataaaacg ttcccggcat 1260
ttctag 1266
<210> 40
<211> 1104
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 40
aataaatgca ctgttgggcc tatgctcaag atgggtagtg ttaattggtg gtggaactta 60
tctgatttca tgacttgctg gctacctaaa acaggtgagg agaaagccaa tgctatgtct 120
gggactggat gagcaagtac aacaaacaaa atgggcttaa agtatgagtg agagttatct 180
gaccgtaagg atgcaagtga gggggcctaa ggtttggaga ttaatattta atctcagatg 240
ctatactttg gtggtgtagc aaaagtctac aaatgggatg actgtaaaac tcagtagatc 300
cgtgcttttt aacctatctc ccttcatcag gaaattgcga cacaaagatc tttagtaata 360
acacgcagtc tcaatgcata aaatcaggct taggtgttgc ctggactcat ttcccatctc 420
caccccacta taattatttt gtgacacaaa ctcaagactg tgggaatata gagaaattgg 480
gctcgtcctc gtacacctgc tcaatcccct gcaggacaac gcccaagaat caggttaagc 540
cagggcaaaa gaatcccgcc cataatcgag aaggagcaaa ctgacatgga ggcgatgacg 600
agatcgcggg ggagggaggg atttttctag gcccagggcg gtccttagga aaaggaggca 660
gcagagaact cccataaagg tattgcggca ctcccctccc cctgcggaga agggtgcggc 720
cttctctccg cctcctccac tgcagctccc tcaggattgc agctcgcgcg ggtttttgga 780
gaacatgcgc ctcccaccca caagccagca ggaccgaccc cccactcctt cctccacccc 840
ccacccccac gggtccgaga gcaggtagag ggctagtctc gtccttcagg cggcggacgc 900
ccagggcgga gccgcagtca ccaccaccca gaagcctcgg cccggcagcc cgcccccgcc 960
tcctgcgcgc gcttcctgcc acgttgcgca ggggcgaggg gccagacact gcggcgctgg 1020
cctcggggag ggccgtacca aagaccgcct ccctgccgac tcgcgtagtg gtttcgctca 1080
tttgggaccc aagccaataa caag 1104
<210> 41
<211> 1056
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 41
tgctctctct cctgccccct tcacctgcgt gccctcctca ttctccctct gtgccacctc 60
tggccttgca ctgtaggctc tctcttgggg atgtttctcc ttctccacac acttctcttt 120
cactctgtcc tcttgctttg tgtgggcctg cagcgttacc cttttttctg ggcacactca 180
gagcaccctc ctctttctgg ttctgggcca cctgtctgtc ctcgggtcat cttgctctct 240
ctgcctggat gccctcctgt ggctttgggc agcttctccc tccttcagag tgcaccgcca 300
gttctcctag gcccggtcac ttccccttcc caggggacct agagccctgc taggtcctct 360
ctctccacaa cctgggcccc caaacctttc caaaacacct tgctttctgc ctccattggt 420
cttgtgttcc agagccagag tcactatatg tcccagaacc aggattccct ctggttctga 480
gggcttttat cgcatcccct gcctggctgc agtgggtctt tggggacagg ccacagaaga 540
gcctctactc ctccctctgt ccccgaggct gtctccctcc cagtcttccc agctcaggcc 600
agtccccagg cctctcttcc ctgccagagc ccgtcaggtt cggttacttt ggggcccaga 660
gaggaccctg tgaaggaagc gtgggtaggg gcacgggaat ggggaggatg cctgaagagg 720
cccccttagc cagaagagga gcagaagagg agcaggtacc cagaagagga gcagttcagg 780
gaaatagaag agtcccgagc tctttttttt tttttttttt atttcttttc ttttcttttc 840
tttttatggc agcatccgtg gtatatggag gttcccagcc taggggtcag atcatacctg 900
caactgccag cctacaccac agccacagca ctcaggatcc gagctgcatc tgcggcttac 960
gccacaggtc acagcaacgc tggatcctta acccactgaa tgaggccagg gattgaacct 1020
gcaacctcat gcacactatg ctggggtctt aatcgg 1056
<210> 42
<211> 1108
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 42
acttcctcct gcccttaccc tttatctggc tcttagctcc taaaaactgc attattagct 60
tcctcttttg cctctactct tactcaacca aaattgtttt aagatctgtg gatctagctt 120
ctgctgtgct attcttagga acacttttat ttcctcttag ctccatctca ccagttattg 180
gctaatggct ttgcttggta cctacatctg tacatttctt tcgtactagc ttctagactg 240
aaaaaggact gttggttcaa catgaaaggg aaggaggtaa aagaggacac acaggaaaga 300
tggattggga ttcaggtctc tgctgttgtt acttgagatt gctttctaga ttctacttgt 360
ggaaacaaaa agcctttgcg agaattctaa actggagtat ttctgtaatt gaggagtctt 420
gctcagcaaa tcccacttag gggactaatg aagtaccagg aagagacaga ccatgctcaa 480
tccacaaagc caggttttac tgaaatgtga cctactttct tatgttcctg gaagtttaga 540
tcagggtggg cagctctggg ttttataggc tacactgtta acactcaggc tgttttctac 600
cgtttagtca aaatatagtc accttgcctg cttcacctgt ccatcagaga atggcctcat 660
taattgactc tctagtatga agtcaaagta gctttggtgg ccctaaatgg acaagtatca 720
agagactggg tgaattgagg agcttgagac tgtcacctca gatcgaaaag actgaaaaat 780
cacctcagat caaaaagact gaaaaatctt cagtctggaa aggggactca aaaccataat 840
tagagtattc tggtagaatc cttttctcca ctgttattca tacagttaag gtgaataact 900
aaaagtaatt gtgagctgag gagtaagata caacacacaa ggaatcagtt aacagagtct 960
cgagtgaaat tataaatgga aagaattatg acttgaatca taactctgag gccccatttt 1020
ccctaacaac ttttgtccca ataaacgtgg gtatttgttt gggagaaact atcatataca 1080
tgattaccca gtaaacagac tgtttact 1108
<210> 43
<211> 1089
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 43
actttgtacc tattttgtat gtgtataata atttgagatg tttttaatta ttttgattgc 60
tggaataaag catgtggaaa tgacccaaac caatcttgca ctggcctcct gatttccttc 120
cttggagacg gagggagggg gagacctggg ggagggcgct tggggggggg tgggctctct 180
tctttctgcg ctcccccccc ccacctccaa caccttgacg acccctcctg cttccgcttg 240
cctttctcag gctttaacac tttctcctcg ccctctcagc atgcgcatgc gcgtgcctct 300
acctcccccg cacatcctgg cctgcccacc ctgaatgtcc tggcccagcg atgccaccaa 360
ctctctcgct ccgtccacgg ctggggaggg gggcactctg cagggttggg gggcactggg 420
aggctgggtt gggtgaggga ggggtgcctg ggcccccacc ccccagcaag ttctctccct 480
aggcgaactg gagggtcgtc tggcctcttg agccttgttg ctggctctga gctctaccaa 540
gagagtgacc agcaggaccg caccatcagt ggttgctgag actgcgtggg ggcccaagga 600
gacctggaga aaggaatgct tcctgctcct tcttctgggg ccccaggaga gccttcccag 660
ggccttggag aggtgctgtc cagggactaa ccctgtgctc taggaaggct gcaggccctg 720
accagctggg caggtcctgg gtccctcctg gccttctaag ttccccaaac atgagacctc 780
tgggtgtggg gtggcctggg gaggtcattt tgcccaggcc ctacctcctg cccattccta 840
acccttttta aaaatctgtg cgtcctcttc ttccttcttc tccctccctt cccttttcgc 900
tcaccctctg ctgctggcct gagagccgga ggcccccagg gggaaggcga ctggtctcct 960
ccccagtctc agggaaggga gacagagaat ccaggaagcc agaactcagc agacgaagca 1020
cccagggacc tagagatggg ttgaaaagtt gacagctgtc ccacctgcct cccaaggtct 1080
cagggccta 1089
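
In the sequence listing above, SEQ ID NO: 19 equals SEQ ID NO: 16 prefixed with caccg, and SEQ ID NO: 20 equals aaac followed by the reverse complement of SEQ ID NO: 16 and a trailing c. This pattern is typical of an annealed oligo pair used to clone a guide sequence into an sgRNA vector, although that reading is an assumption here rather than something this section states. The following minimal sketch reproduces the relationship; the function names are hypothetical and chosen only for illustration.

```python
# Illustrative helpers (hypothetical names, not taken from the patent).
COMPLEMENT = str.maketrans("acgt", "tgca")


def reverse_complement(seq):
    """Return the reverse complement of a lowercase DNA string."""
    return seq.lower().translate(COMPLEMENT)[::-1]


def cloning_oligos(protospacer):
    """Build a sense/antisense oligo pair from a 20-nt guide sequence.

    Assumption: the sense oligo carries a 5' 'caccg' addition and the
    antisense oligo carries a 5' 'aaac' addition plus a 3' 'c', which is
    the pattern observed between SEQ ID NO: 16, 19 and 20.
    """
    protospacer = protospacer.lower()
    sense = "caccg" + protospacer
    antisense = "aaac" + reverse_complement(protospacer) + "c"
    return sense, antisense


seq_id_16 = "agttatggcagaactcagtg"
sense, antisense = cloning_oligos(seq_id_16)
print(sense)      # compare with SEQ ID NO: 19
print(antisense)  # compare with SEQ ID NO: 20
```

The two printed strings can be compared character by character with SEQ ID NO: 19 and SEQ ID NO: 20 as listed.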

Claims (15)

1. A PyMT-expressing pig cell, wherein a PyMT-encoding nucleotide sequence is inserted into a pig safe harbor site to obtain a nucleic acid sequence expressing SEQ ID NO: 14, wherein the pig safe harbor site is selected from the group consisting of the porcine ROSA26, AAVS1, H11 and COL1A1 safe harbor sites.
2. The porcine cell of claim 1, wherein the inserted PyMT-encoding nucleotide sequence is set forth in SEQ ID NO: 39.
3. The porcine cell according to claim 1 or 2, wherein the nucleotide sequence of the ROSA26 safe harbor site region together with 500 bp upstream and 500 bp downstream thereof is set forth in SEQ ID NO: 40; the nucleotide sequence of the AAVS1 safe harbor site region together with 500 bp upstream and 500 bp downstream thereof is set forth in SEQ ID NO: 41; the nucleotide sequence of the H11 safe harbor site region together with 500 bp upstream and 500 bp downstream thereof is set forth in SEQ ID NO: 42; and the nucleotide sequence of the COL1A1 safe harbor site region together with 500 bp upstream and 500 bp downstream thereof is set forth in SEQ ID NO: 43.
4. The porcine cell according to any one of claims 1 to 3, wherein the nucleotide sequence encoding PyMT is regulated in the porcine cell by an exogenous promoter, which is an MMTV-LTR, preferably wherein the MMTV-LTR has the nucleotide sequence set forth in SEQ ID NO: 15.
5. The porcine cell according to any of claims 1-4, wherein the porcine cell is a porcine fibroblast or mammary cell.
6. A method of constructing a porcine cell according to any one of claims 1 to 5, wherein a PyMT-encoding nucleotide sequence is inserted into the porcine safe harbor site using a safe harbor site vector comprising the PyMT-encoding nucleotide sequence and a safe harbor site vector backbone, wherein the safe harbor site vector backbone comprises a 5' homology arm and a 3' homology arm of the safe harbor insertion site, wherein the PyMT-encoding nucleotide sequence is located between the 5' homology arm and the 3' homology arm, and wherein the safe harbor site vector backbone is selected from any one of the following:
A) a ROSA26 safe harbor site vector backbone, the 5' homology arm of which is set forth in SEQ ID NO: 5 and the 3' homology arm of which is set forth in SEQ ID NO: 6;
B) an AAVS1 safe harbor site vector backbone, the 5' homology arm of which is set forth in SEQ ID NO: 7 and the 3' homology arm of which is set forth in SEQ ID NO: 8;
C) an H11 safe harbor site vector backbone, the 5' homology arm of which is set forth in SEQ ID NO: 9 and the 3' homology arm of which is set forth in SEQ ID NO: 10;
or D) a COL1A1 safe harbor site vector backbone, the 5' homology arm of which is set forth in SEQ ID NO: 11 and the 3' homology arm of which is set forth in SEQ ID NO: 12.
7. The method of claim 6, wherein an sgRNA vector comprising an sgRNA that targets the ROSA26, AAVS1, H11, or COL1A1 safe harbor site is used for the construction of the pig cell, wherein:
the nucleotide sequence of the sgRNA targeting ROSA26 is set forth in SEQ ID NO: 21, the nucleotide sequence of the sgRNA targeting AAVS1 is set forth in SEQ ID NO: 22, the nucleotide sequence of the sgRNA targeting H11 is set forth in SEQ ID NO: 23, and the nucleotide sequence of the sgRNA targeting COL1A1 is set forth in SEQ ID NO: 24.
8. The method of claim 6 or 7, wherein the construction of the pig cell is carried out using a Cas vector, the Cas vector comprises nucleotide sequences encoding a Cas protein, an EGFP and a Puro resistance protein, wherein the Cas protein is selected from Casl, CaslB, Cas5, Cas, CaslB, CaslO, Cyl, Csy, Csel, Cse5, Ccl, Csc, Csa, Csl, Csn, Cml, Csm, Crl, Cmr, Cbl, Csb, Csx, CsxO, Csx, Csxl, CsCsF, CsO, Csf, Cdl, Csd, Csl, Csfl, Csh, Csal, Csh, Csa, C2, Csc, Csfl, CsC2, CsF, CsC2, CsID, CsC2, CsID, CsC 5, Csq 5, Cml, Csq 5, Csml, Csq, Csml, CsL, Csml, CsL, Cml, CsL, Cml, CsL, Csl, CsL, Csl, Csml, CsL, Cml, Csl, CsL: 1 or 2.
9. The construction method according to any one of claims 6 to 8, which comprises co-transfecting a safe harbor site vector, a sgRNA vector and a Cas vector into a pig cell.
10. A tissue or organ comprising the porcine cells of any of claims 1-5.
11. A method for constructing a breast cancer model pig, characterized in that a nucleotide sequence encoding PyMT is inserted into a pig safe harbor site to obtain a nucleic acid sequence expressing the nucleotide sequence shown in SEQ ID NO: 14, wherein the pig safe harbor site is selected from the group consisting of the porcine ROSA26, AAVS1, H11 and COL1A1 safe harbor sites.
12. The method of construction according to claim 11, which comprises transferring the porcine cell according to any one of claims 1 to 5 into an enucleated porcine oocyte to obtain a model pig.
13. A safe harbor site vector, characterized in that the safe harbor site vector comprises a nucleotide sequence encoding PyMT and a safe harbor site vector backbone, the safe harbor site vector backbone comprises a 5' homology arm and a 3' homology arm of a safe harbor insertion site, the nucleotide sequence encoding PyMT is located between the 5' homology arm and the 3' homology arm, and the safe harbor site vector backbone is selected from any one of the following:
A) a ROSA26 safe harbor site vector backbone, the 5' homology arm of which is set forth in SEQ ID NO: 5 and the 3' homology arm of which is set forth in SEQ ID NO: 6;
B) an AAVS1 safe harbor site vector backbone, the 5' homology arm of which is set forth in SEQ ID NO: 7 and the 3' homology arm of which is set forth in SEQ ID NO: 8;
C) an H11 safe harbor site vector backbone, the 5' homology arm of which is set forth in SEQ ID NO: 9 and the 3' homology arm of which is set forth in SEQ ID NO: 10;
or D) a COL1A1 safe harbor site vector backbone, the 5' homology arm of which is set forth in SEQ ID NO: 11 and the 3' homology arm of which is set forth in SEQ ID NO: 12.
14. Use of the pig cell of any one of claims 1 to 5, or of a pig cell obtained by the construction method of any one of claims 6 to 9, in preparing an animal model of breast cancer, in screening drugs for treating breast cancer and evaluating drug effects, in gene and cell therapy, or in studying the pathogenesis of breast cancer.
15. Use of the tissue or organ of claim 10 or the model pig obtained by the construction method of any one of claims 11-12 for screening drugs for treating breast cancer and evaluating drug effects, or for gene and cell therapy, or for studying pathogenesis of breast cancer.
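
Claims 3 and 7 pair each safe harbor region (SEQ ID NO: 40 to 43) with an sgRNA sequence (SEQ ID NO: 21 to 24). The sketch below shows one way to check where a guide sits inside a region and which three bases follow it on the targeted strand. It assumes an SpCas9-style layout in which the PAM lies immediately 3' of the guide, which the claims do not state explicitly. The two region strings are short excerpts copied from SEQ ID NO: 40 and SEQ ID NO: 42, not the full sequences, and the function name `find_guide` is hypothetical.

```python
# Illustrative sketch (hypothetical names); excerpts only, not full regions.
COMPLEMENT = str.maketrans("acgt", "tgca")


def reverse_complement(seq):
    """Return the reverse complement of a lowercase DNA string."""
    return seq.lower().translate(COMPLEMENT)[::-1]


def find_guide(region, guide, pam_len=3):
    """Locate a guide in a region on either strand.

    Returns (strand, start, pam), where start is the 0-based position of the
    match on the given (plus) strand and pam is the pam_len bases lying 3' of
    the guide on the strand that carries it. Returns None if not found.
    Assumption: the PAM sits directly 3' of the guide (SpCas9-style).
    """
    region, guide = region.lower(), guide.lower()
    idx = region.find(guide)
    if idx != -1:
        end = idx + len(guide)
        return "+", idx, region[end:end + pam_len]
    minus = reverse_complement(region)
    idx = minus.find(guide)
    if idx != -1:
        end = idx + len(guide)
        plus_start = len(region) - end  # map back to plus-strand coordinates
        return "-", plus_start, minus[end:end + pam_len]
    return None


# 40-nt excerpt of SEQ ID NO: 40 (ROSA26 region) and guide SEQ ID NO: 21.
rosa26_excerpt = "cataatcgagaaggagcaaactgacatggaggcgatgacg"
# 40-nt excerpt of SEQ ID NO: 42 (H11 region) and guide SEQ ID NO: 23.
h11_excerpt = "tgaaatgtgacctactttcttatgttcctggaagtttaga"

print(find_guide(rosa26_excerpt, "gaaggagcaaactgacatgg"))
print(find_guide(h11_excerpt, "ttccaggaacataagaaagt"))
```

Run on the full listed sequences instead of the excerpts, the same function would report positions relative to the start of each SEQ ID NO entry.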
CN202110187956.7A 2021-02-18 2021-02-18 Construction method and application of breast cancer model pig Active CN114958758B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110187956.7A CN114958758B (en) 2021-02-18 2021-02-18 Construction method and application of breast cancer model pig

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110187956.7A CN114958758B (en) 2021-02-18 2021-02-18 Construction method and application of breast cancer model pig

Publications (2)

Publication Number Publication Date
CN114958758A true CN114958758A (en) 2022-08-30
CN114958758B CN114958758B (en) 2024-04-23

Family

ID=82954269

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110187956.7A Active CN114958758B (en) 2021-02-18 2021-02-18 Construction method and application of breast cancer model pig

Country Status (1)

Country Link
CN (1) CN114958758B (en)

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1981196A (en) * 2004-05-19 2007-06-13 哥本哈根大学 ADAM12, a novel marker for abnormal cell function
CN102137939A (en) * 2008-08-29 2011-07-27 霍夫曼-拉罗奇有限公司 Diagnostics and treatments for VEGF-independent tumors
CN104087615A (en) * 2014-07-03 2014-10-08 上海交通大学医学院附属第九人民医院 Strain construction method of hemangioma animal model
CN105112449A (en) * 2015-09-02 2015-12-02 中国农业大学 CD28 gene overexpression vector and application thereof
US20170042129A1 (en) * 2012-08-23 2017-02-16 Buck Institute For Research On Aging Animal models for cancer and uses thereof
CN110283847A (en) * 2019-06-04 2019-09-27 西北农林科技大学 A kind of while site-directed integration FAD3 and FABP4 gene carrier and recombinant cell
CN110358792A (en) * 2019-07-19 2019-10-22 华中农业大学 Fixed point integration of foreign gene is to the targeting vector construction method of ACTB downstream of gene and its application
CN110651046A (en) * 2017-02-22 2020-01-03 艾欧生物科学公司 Nucleic acid constructs comprising gene editing multiple sites and uses thereof
CN110951784A (en) * 2019-12-29 2020-04-03 华中农业大学 Unmarked pig β-defensin 2 gene site-directed knock-in plasmid vector and application thereof
KR20200110557A (en) * 2019-03-15 2020-09-24 국립암센터 Method of manufacturing breast cancer animal model and uses thereof

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
CHANTALE T. GUY ET AL.: "Induction of Mammary Tumors by Expression of Polyomavirus Middle T Oncogene: A Transgenic Mouse Model for Metastatic Disease", 《MOLECULAR AND CELLULAR BIOLOGY》, vol. 12, no. 3, pages 954 - 961 *
SHERIF ATTALLA ET AL.: "Insights from transgenic mouse models of PyMT-induced breast cancer: recapitulating human breast cancer progression in vivo", 《ONCOGENE》, pages 475 - 491 *
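马林媛 (Ma Linyuan): "Screening and application of transgene-friendly integration sites in pigs", 《China Doctoral Dissertations Full-text Database, Agricultural Science and Technology》, no. 5, pages 6 - 24 *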

Also Published As

Publication number Publication date
CN114958758B (en) 2024-04-23

Similar Documents

Publication Publication Date Title
CN112779292B (en) Method for constructing high-quality pig nuclear transplantation donor cells with high lean meat percentage and rapid growth and capable of resisting blue ear diseases and serial diarrhea diseases and application of donor cells
CN112779291B (en) Method for constructing high-quality pig nuclear transplantation donor cells with high lean meat percentage, fast growth, high reproductive capacity and resistance to series epidemic diseases and application thereof
CN112877362A (en) Gene editing system for constructing high-quality porcine nuclear transplantation donor cells with high fertility and capability of resisting porcine reproductive and respiratory syndrome and serial diarrhea diseases and application of gene editing system
CN114990157B (en) Gene editing system for constructing LMNA gene mutation dilated cardiomyopathy model pig nuclear transplantation donor cells and application thereof
CN114958762B (en) Method for constructing nerve tissue specific overexpression humanized SNCA parkinsonism model pig and application
CN112522313B (en) CRISPR/Cas9 system for constructing depression cloned pig nuclear donor cells with TPH2 gene mutation
CN112522264B (en) CRISPR/Cas9 system causing congenital deafness and application thereof in preparation of model pig nuclear donor cells
CN113046388B (en) CRISPR system for constructing atherosclerosis pig nuclear transfer donor cells with double genes in combined knockout mode and application of CRISPR system
CN114958758B (en) Construction method and application of breast cancer model pig
CN114958759A (en) Construction method and application of amyotrophic lateral sclerosis model pig
CN114958761B (en) Construction method and application of stomach cancer model pig
CN112608941B (en) CRISPR system for constructing obese pig nuclear transplantation donor cells with MC4R gene mutation and application of CRISPR system
CN112575033B (en) CRISPR system and application thereof in construction of SCN1A gene mutated epileptic encephalopathy clone pig nuclear donor cell
CN112813101B (en) Gene editing system for constructing high-quality pig nuclear transplantation donor cells with high lean meat percentage and rapid growth and application thereof
CN112680453B (en) CRISPR system and application thereof in construction of STXBP1 mutant epileptic encephalopathy clone pig nuclear donor cell
CN112522311B (en) CRISPR system for ADCY3 gene editing and application thereof in construction of obese pig nuclear transfer donor cells
CN112899306B (en) CRISPR system and application thereof in construction of GABRG2 gene mutation cloned pig nuclear donor cells
CN112522256B (en) CRISPR/Cas9 system and application thereof in construction of dystrophin gene-deficient porcine recombinant cells
CN112522255B (en) CRISPR/Cas9 system and application thereof in construction of porcine recombinant cell with insulin receptor substrate gene defect
CN112795566B (en) OPG gene editing system for constructing osteoporosis clone pig nuclear donor cell line and application thereof
CN112680444B (en) CRISPR system for OCA2 gene mutation and application thereof in construction of albino clone pig nuclear donor cells
CN113584078B (en) CRISPR system for double-target gene editing and application thereof in construction of depressive pig nuclear transfer donor cells
CN112522202B (en) Method for preparing ADDI four-gene combined knockout severe immunodeficiency swine-derived recombinant cell and special kit thereof
CN114958760B (en) Gene editing technology for constructing Alzheimer disease model pig and application thereof
CN112877363A (en) Gene editing system for constructing high-quality pig nuclear transplantation donor cells with high lean meat percentage, fast growth and high reproductive capacity and application thereof

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant