CN115247153A

CN115247153A - Gene editing system for constructing diabetes model pig nuclear transplantation donor cells with HNF1A gene mutation and application thereof

Info

Publication number: CN115247153A
Application number: CN202110676866.4A
Authority: CN
Inventors: 牛冬; 汪滔; 段星; 刘璐; 马翔; 陶裴裴; 曾为俊; 王磊; 程锐; 赵泽英; 黄彩云
Original assignee: Nanjing Qizhen Genetic Engineering Co Ltd
Current assignee: Nanjing Qizhen Genetic Engineering Co Ltd
Priority date: 2021-06-18
Filing date: 2021-06-18
Publication date: 2022-10-28

Abstract

The invention discloses a gene editing system for constructing nuclear transplantation donor cells for an HNF1A gene mutant MODY model pig, and an application thereof. The invention provides a method for preparing recombinant cells, which comprises the following step: replacing the DNA molecule shown in SEQ ID NO:19 with the nucleotide sequence shown in SEQ ID NO:18 to obtain the recombinant cell. In a specific embodiment, porcine cells are co-transfected with HNF1A-gU2 (an sgRNA whose target sequence binding region is shown in SEQ ID NO:16), HNF1A-gD1 (an sgRNA whose target sequence binding region is shown in SEQ ID NO:17), HNF1A-mutant-ss163 (shown in SEQ ID NO:18) and NCN protein. Somatic cell cloning with the recombinant cell as the nuclear transplantation donor cell yields a cloned pig, namely a MODY3 diabetes model pig. The invention has great application value for the research and development of MODY3 diabetes drugs and for revealing the pathogenesis of the disease.

Description

Gene editing system for constructing diabetes model pig nuclear transplantation donor cells with HNF1A gene mutation and application thereof

Technical Field

The invention belongs to the technical field of biology, in particular to the technical field of gene editing, and more particularly relates to the application of the CRISPR/Cas9 system and ssODN homologous recombination technology in constructing nuclear transplantation donor cells for an HNF1A gene mutation diabetes model pig.

Background

Monogenic diabetes is a special type of diabetes caused by a mutation in a single gene that plays a key role in islet beta cell development, islet beta cell function, or the insulin signaling pathway; it accounts for approximately 1%-5% of all diabetes. According to the age of onset of clinical symptoms, monogenic diabetes can be divided into two main categories: neonatal diabetes mellitus (NDM) and maturity-onset diabetes of the young (MODY). The clinical presentation of monogenic diabetes is varied: most MODY resembles type 2 diabetes, whereas NDM caused by mutations in genes such as the ATP-sensitive potassium channel gene (KCNJ11) or the insulin gene (INS) resembles type 1 diabetes. Studies have shown that about 90% of patients with monogenic diabetes are misdiagnosed as having type 1 or type 2 diabetes. Because the clinical prognosis, outcome and choice of treatment for monogenic diabetes differ greatly from those of type 1 and type 2 diabetes, nearly 70% of patients currently do not receive a correct and effective treatment regimen. The clinical characteristics of monogenic diabetes vary with the mutated gene. Mutations in at least 14 genes have been found to cause MODY, of which MODY3, MODY2 and MODY1 are the most common, accounting for more than 90% of all monogenic diabetes patients.

MODY3 diabetes, which is caused by mutation of the gene encoding hepatocyte nuclear factor 1-alpha (HNF-1α), namely the HNF1A gene, is the most common monogenic diabetes, accounting for about 50% of all monogenic diabetes. The P291fsinsC mutation is located in exon 4 of the HNF1A gene: insertion of a base C at proline codon 291 shifts the reading frame, so that a stop codon appears prematurely and a truncated HNF-1α protein is formed. This mutation is a hot spot among the HNF1A mutations that cause MODY3 diabetes. HNF-1α is widely expressed in islet beta cells, liver, intestine and other organs, and is an important transcription factor in islet development and beta cell differentiation. HNF1A mutation causes a progressive decline in islet function with high penetrance, and most carriers develop the disease before the age of 25. Clinically, postprandial blood glucose is markedly elevated, and patients may present with the classic "three more, one less" symptoms (polydipsia, polyphagia, polyuria and weight loss), but ketosis is rare. In addition, because HNF-1α is expressed in the kidney, an HNF1A gene defect can reduce renal glucose reabsorption by altering expression of the sodium-glucose cotransporter of the renal distal convoluted tubule, thereby lowering the renal glucose threshold; this is also one of the clinical characteristics of MODY3 diabetes.
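The frameshift mechanism described above can be illustrated with a minimal Python sketch. The sequence `TOY_CDS` is an invented toy coding sequence containing a run of C bases; it is not the pig or human HNF1A sequence, and the insertion position is chosen only for illustration. The sketch shows how a single inserted C shifts the reading frame and brings a stop codon forward.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(cds: str) -> str:
    """Translate a coding sequence up to (not including) the first stop codon."""
    protein = []
    for i in range(0, len(cds) - 2, 3):
        aa = CODON_TABLE[cds[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# Hypothetical toy sequence (NOT HNF1A): ATG GCC CCC CCT GAA GCT TAA
TOY_CDS = "ATGGCCCCCCCTGAAGCTTAA"

# Insert one C inside the C run, mimicking an insC frameshift.
mutant = TOY_CDS[:9] + "C" + TOY_CDS[9:]

print(translate(TOY_CDS))  # full-length toy peptide
print(translate(mutant))   # truncated toy peptide (premature TGA)
```

In the toy example the shifted frame reads the downstream bases as TGA, so translation ends early, which is the same kind of truncation the text attributes to P291fsinsC.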

At present, animal models for diabetes research are mainly mouse models, which fall into two types: experimentally induced diabetic mouse models and spontaneous diabetic mouse models. However, mice differ greatly from humans in body size, organ size, physiology, pathology and other respects, and cannot truly simulate normal human physiological and pathological states. The pig is a large animal that has long been a main source of meat for humans. Its body size and physiological functions are similar to those of humans, it is easy to breed and raise on a large scale, and it raises fewer ethical and animal-protection concerns, which makes it an ideal model animal for human disease.

Gene editing is a biotechnology that has been greatly developed in recent years, and includes editing technologies from homologous recombination-based gene editing to nuclease-based ZFNs, TALENs, CRISPR/Cas9, and the like, wherein CRISPR/Cas9 technology is currently the most advanced gene editing technology. Currently, gene editing techniques are increasingly applied to the production of animal models.

Homology-directed repair (HDR) exchanges DNA sequence information on the basis of sequence homology: the repair template contains the desired insert, flanked at both ends by recombination arms homologous to the sequence near the insertion site. In the past, double-stranded DNA (dsDNA) was commonly used as the repair template, but recent studies have revealed the advantages of single-stranded oligodeoxynucleotides (ssODNs) as HDR donor templates. First, an ssODN donor template is more site-specific than a dsDNA template, which is prone to random insertion. Second, an ssODN requires shorter homology arms than a dsDNA template: arms of 30-60 bases on each side can achieve efficient and stable HDR, with higher insertion efficiency than comparable dsDNA templates. Third, dsDNA is easily incorporated by the NHEJ repair pathway, resulting in duplication of the homology arms or partial integration of the dsDNA template, which rarely occurs with an ssODN. In addition, dsDNA is harmful to cultured cells: transfection of linear or plasmid dsDNA is inefficient and causes adverse reactions in the cells, and ssODN templates are more advantageous in these respects.
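The donor layout described above (desired insert flanked by homology arms of 30-60 bases per side) can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name `design_ssodn`, the toy reference sequence and the chosen insertion index are hypothetical, and the actual HNF1A-mutant-ss163 donor (SEQ ID NO:18) is not reproduced here.

```python
def design_ssodn(reference: str, cut_index: int, insert: str, arm_len: int = 40) -> str:
    """Build a single-stranded donor: left arm + insert + right arm.

    reference : genomic sequence around the edit site (5'->3')
    cut_index : 0-based index at which `insert` is placed
    arm_len   : homology arm length per side (30-60, per the text above)
    """
    if not 30 <= arm_len <= 60:
        raise ValueError("arm_len outside the 30-60 base range discussed in the text")
    if cut_index < arm_len or cut_index + arm_len > len(reference):
        raise ValueError("reference too short for the requested arms")
    left_arm = reference[cut_index - arm_len:cut_index]
    right_arm = reference[cut_index:cut_index + arm_len]
    return left_arm + insert + right_arm


# Hypothetical 120-nt toy reference, not a real genomic sequence.
toy_reference = "ACGT" * 30
donor = design_ssodn(toy_reference, cut_index=60, insert="C", arm_len=40)
print(len(donor))  # 40 + 1 + 40 bases
```

A real design would also consider strand choice and whether the donor destroys the sgRNA target or PAM; those aspects are outside this sketch.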

Disclosure of Invention

The invention aims to provide a gene editing system for constructing HNF1A gene mutation diabetes model pig nuclear transplantation donor cells and application thereof.

The invention provides a method for preparing recombinant cells, which comprises the following step: replacing the DNA molecule shown in SEQ ID NO:19 with the nucleotide sequence shown in SEQ ID NO:18 to obtain the recombinant cell.

The replacement of the DNA molecule shown in SEQ ID NO:19 with the nucleotide sequence shown in SEQ ID NO:18 is implemented as follows: HNF1A-gU2, HNF1A-gD1, HNF1A-mutant-ss163 and NCN protein are co-transfected into a porcine cell. HNF1A-gU2 is an sgRNA whose target sequence binding region is shown as nucleotides 3-22 of SEQ ID NO:16; HNF1A-gD1 is an sgRNA whose target sequence binding region is shown as nucleotides 3-22 of SEQ ID NO:17; HNF1A-mutant-ss163 is the single-stranded DNA molecule shown in SEQ ID NO:18; the NCN protein is a Cas9 protein or a fusion protein containing a Cas9 protein.
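Sequence-listing coordinates such as "nucleotides 3-22" are 1-based and inclusive, which is easy to get wrong when slicing strings in code. The following Python sketch shows the conversion; the sgRNA string used here is a made-up placeholder, not SEQ ID NO:16 or SEQ ID NO:17.

```python
def subsequence_1based(seq: str, start: int, end: int) -> str:
    """Return nucleotides start..end of seq, using 1-based inclusive coordinates."""
    if not 1 <= start <= end <= len(seq):
        raise ValueError("coordinates out of range")
    return seq[start - 1:end]


# Hypothetical placeholder: 2 leading bases + 20-nt binding region + trailing bases.
placeholder_sgrna = "GG" + "ACGUACGUACGUACGUACGU" + "GUUUUAGAGC"

binding_region = subsequence_1based(placeholder_sgrna, 3, 22)
print(binding_region, len(binding_region))  # the 20-nt region
```

Nucleotides 3-22 span 20 positions, which matches the usual length of a Cas9 target-binding region.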

Specifically, the NCN protein is shown in SEQ ID NO:3.

Specifically, the HNF1A-gU2 is shown in SEQ ID NO:16.

Specifically, the HNF1A-gD1 is shown in SEQ ID NO:17.

Specifically, the HNF1A-gU2 is shown in SEQ ID NO:11.

Specifically, the HNF1A-gD1 is shown in SEQ ID NO:12.

The porcine cells are porcine fibroblasts.

The porcine cells are porcine primary fibroblasts.

The proportions of the porcine cells, the HNF1A-gU2, the HNF1A-gD1, the HNF1A-mutant-ss163 and the NCN protein are as follows in sequence: 100,000 porcine cells: 0.8-1.2 μg HNF1A-gU2: 0.8-1.2 μg HNF1A-gD1: 1.8-2.2 μg HNF1A-mutant-ss163: 3-5 μg NCN protein.

The proportions of the porcine cells, HNF1A-gU2, HNF1A-gD1, HNF1A-mutant-ss163 and NCN protein are as follows in sequence: 100,000 porcine cells: 1 μg HNF1A-gU2: 1 μg HNF1A-gD1: 2 μg HNF1A-mutant-ss163: μg NCN protein.
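The stated proportions are given per 100,000 cells, so reagent amounts for other cell numbers can be worked out arithmetically. The Python sketch below uses only the ranges stated above and assumes, purely for illustration, that amounts scale linearly with cell number; the text does not state that such scaling was validated.

```python
# Reagent ranges (μg) per 100,000 porcine cells, as stated in the text.
PER_100K_CELLS_UG = {
    "HNF1A-gU2": (0.8, 1.2),
    "HNF1A-gD1": (0.8, 1.2),
    "HNF1A-mutant-ss163": (1.8, 2.2),
    "NCN protein": (3.0, 5.0),
}


def scale_reagents(n_cells: int) -> dict:
    """Scale the per-100,000-cell ranges linearly (an assumption) to n_cells."""
    factor = n_cells / 100_000
    return {
        name: (round(low * factor, 3), round(high * factor, 3))
        for name, (low, high) in PER_100K_CELLS_UG.items()
    }


for name, (low, high) in scale_reagents(200_000).items():
    print(f"{name}: {low}-{high} μg")
```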

The co-transfection is specifically electroporation transfection.

The parameters for electroporation transfection may specifically be: 1450 V, 10 ms, 3 pulses.

The co-transfection may specifically be carried out using a mammalian nuclear transfection kit (Neon kit, Thermo Fisher) and a Neon™ Transfection System electroporation apparatus.

The preparation method of the NCN protein comprises the following steps:

(1) Introducing the plasmid pKG-GE4 into escherichia coli BL21 (DE 3) to obtain a recombinant strain;

(2) Culturing the recombinant strain by adopting a liquid culture medium at 30 ℃, then adding IPTG (isopropyl-beta-thiogalactoside) and carrying out induced culture at 25 ℃, and then collecting thalli;

(3) Crushing the collected thalli, and collecting a crude protein solution;

(4) Purifying the His₆-tagged fusion protein from the crude protein solution by affinity chromatography;

(5) Cleaving the His₆-tagged fusion protein with His₆-tagged enterokinase, and then removing the His₆-tagged proteins with Ni-NTA resin to obtain purified NCN protein;

Plasmid pKG-GE4 contains the sequence shown as nucleotides 5209-9852 of SEQ ID NO:1.

The preparation method of the NCN protein specifically comprises the following steps:

(1) The plasmid pKG-GE4 was introduced into E.coli BL21 (DE 3) to obtain a recombinant strain.

(2) Inoculating the recombinant bacteria obtained in the step (1) to a liquid LB culture medium containing ampicillin, and carrying out shake culture;

(3) Inoculating the bacterial liquid obtained in step (2) into liquid LB medium and shake-culturing at 30 ℃ and 230 rpm until OD600nm = 1.0, then adding IPTG to a concentration of 0.5 mM in the system, followed by shake-culturing at 25 ℃ and 230 rpm for 12 hours, and then collecting the cells by centrifugation;

(4) Taking the thalli obtained in the step (3), and washing the thalli with a PBS (phosphate buffer solution);

(5) Adding the crude extraction buffer solution into the thalli obtained in the step (4), suspending the thalli, then crushing the thalli, then centrifugally collecting supernate, filtering by adopting a filter membrane with the aperture of 0.22 mu m, and collecting filtrate;

(6) Purifying the His₆-tagged fusion protein (the fusion protein shown in SEQ ID NO:2) from the filtrate obtained in step (5) by affinity chromatography;

(7) Taking the solution after column chromatography collected in the step (6), concentrating by using an ultrafiltration tube, and then diluting with 25mM Tris-HCl (pH8.0);

(8) Adding His₆-tagged recombinant bovine enterokinase to the solution obtained in step (7) and carrying out enzyme digestion;

(9) Mixing the solution obtained in the step (8) with Ni-NTA resin, incubating, centrifuging and collecting supernatant;

(10) Taking the supernatant obtained in step (9), concentrating it with an ultrafiltration tube, and then adding it to the enzyme stock solution to obtain the NCN protein solution.
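Step (3) above brings IPTG to 0.5 mM in the culture. The volume of stock to add follows from the dilution relation C_stock × v = C_final × (V + v). The Python sketch below works this out; the 1 M stock concentration and the 1000 ml culture volume are assumptions for illustration, since the text specifies neither.

```python
def iptg_stock_volume_ml(culture_ml: float, final_mM: float = 0.5,
                         stock_mM: float = 1000.0) -> float:
    """Volume of IPTG stock (ml) to add so the final concentration is final_mM.

    Derived from stock_mM * v = final_mM * (culture_ml + v).
    stock_mM = 1000 (1 M) is an assumed stock concentration.
    """
    if stock_mM <= final_mM:
        raise ValueError("stock must be more concentrated than the target")
    return final_mM * culture_ml / (stock_mM - final_mM)


# Hypothetical 1000 ml culture induced at 0.5 mM from a 1 M stock.
print(round(iptg_stock_volume_ml(1000.0), 4), "ml of stock")
```

For a concentrated stock the correction for the added volume is negligible, so the simpler estimate C_final × V / C_stock gives nearly the same answer.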

The specific method for purifying the His₆-tagged fusion protein from the filtrate obtained in step (5) by affinity chromatography is as follows:

First, a Ni-NTA agarose column is equilibrated with 5 column volumes of equilibration solution (flow rate 1 ml/min); then 50 ml of the filtrate obtained in step (5) is loaded (flow rate 0.5-1 ml/min); the column is then washed with 5 column volumes of equilibration solution (flow rate 1 ml/min); the column is then washed with 5 column volumes of buffer (flow rate 1 ml/min) to remove contaminating proteins; finally, the column is eluted with 10 column volumes of eluent (flow rate 0.5-1 ml/min), and the solution that passes through the column (90-100 ml) is collected.
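The column-volume steps above translate into buffer volumes and run times once a column volume is fixed. The Python sketch below tabulates them. The 10 ml column volume is an assumption, inferred only from the statement that 10 column volumes of eluate amount to 90-100 ml, and the upper flow rate of 1 ml/min is used for the elution step.

```python
# (step name, column volumes, flow rate in ml/min) taken from the text;
# for elution the text gives 0.5-1 ml/min and the upper bound is used here.
CV_STEPS = [
    ("equilibrate", 5, 1.0),
    ("wash with equilibration solution", 5, 1.0),
    ("wash with buffer", 5, 1.0),
    ("elute", 10, 1.0),
]


def schedule(column_volume_ml: float) -> list:
    """Return (step, volume in ml, duration in min) for each column-volume step."""
    rows = []
    for name, n_cv, flow in CV_STEPS:
        volume = n_cv * column_volume_ml
        rows.append((name, volume, volume / flow))
    return rows


# Assumed column volume of 10 ml (not stated in the text).
for name, volume, minutes in schedule(10.0):
    print(f"{name}: {volume:.0f} ml, about {minutes:.0f} min")
```

The 50 ml sample load is a fixed volume rather than a column-volume step, so it is left out of the table.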

The invention also protects a kit which comprises any one of the HNF1A-gU2, any one of the HNF1A-gD1, the HNF1A-mutant-ss163 and any one of the NCN proteins.

The invention also protects a kit comprising any one of the HNF1A-gU2, any one of the HNF1A-gD1, the HNF1A-mutant-ss163 and PRONCN protein.

The invention also provides a kit comprising any one of the HNF1A-gU2, any one of the HNF1A-gD1, the HNF1A-mutant-ss163 and a specific plasmid. The kit also comprises escherichia coli BL21 (DE 3).

Any of the above kits may further comprise porcine cells.

The porcine cells are porcine fibroblasts.

The porcine cells are porcine primary fibroblasts.

The invention also protects the application of any one of the HNF1A-gU2, any one of the HNF1A-gD1, the HNF1A-mutant-ss163 and any one of the NCN proteins in the preparation of a kit.

The invention also protects the application of any one of the HNF1A-gU2, any one of the HNF1A-gD1, the HNF1A-mutant-ss163 and the PRONCN protein in the preparation of the kit.

The invention also protects the application of any one of the HNF1A-gU2, any one of the HNF1A-gD1, the HNF1A-mutant-ss163 and the specific plasmid in the preparation of the kit.

The use of any one of the above kits is (a) or (b) or (c): (a) preparing a recombinant cell; (b) preparing a diabetes model pig; (c) Preparing a diabetes cell model or a diabetes tissue model or a diabetes organ model.

Any one of the PRONCN proteins sequentially comprises the following elements from upstream to downstream: signal peptide, molecular chaperone protein, protein tag, protease cleavage site, nuclear localization signal, cas9 protein, nuclear localization signal.

The signal peptide functions to promote secretory expression of the protein. The signal peptide may be selected from the Escherichia coli alkaline phosphatase (phoA) signal peptide, the Staphylococcus aureus protein A signal peptide, the Escherichia coli outer membrane protein A (OmpA) signal peptide or the signal peptide of any other prokaryotic gene, and is preferably the alkaline phosphatase signal peptide (phoA signal peptide). The alkaline phosphatase signal peptide directs secretion of the target protein into the bacterial periplasmic space, separating it from the bacterial intracellular proteins; the target protein secreted into the periplasmic space is expressed in soluble form, and the signal peptide can be cleaved off by the signal peptidase in the periplasmic space.

The chaperone protein functions to increase the solubility of the protein. The chaperone may be any protein that helps to form disulfide bonds, and is preferably thioredoxin (TrxA protein). Thioredoxin can act as a molecular chaperone to help a co-expressed target protein (such as the Cas9 protein) form disulfide bonds, which improves the stability and correct folding of the protein and increases the solubility and activity of the target protein.

The protein tag functions in protein purification. The tag may be a His tag (His-Tag, His₆ protein tag), GST tag, Flag tag, HA tag, c-Myc tag or any other protein tag, and is more preferably a His tag. The His tag binds to a Ni column, so the target protein can be purified by one-step Ni column affinity chromatography, which greatly simplifies the purification process.

The protease cleavage site functions to allow non-functional segments to be cleaved off after purification, releasing the native form of the Cas9 protein. The protease may be selected from enterokinase, factor Xa, thrombin, TEV protease, HRV 3C protease, WELQut protease or any other endoprotease, and is further preferably enterokinase. EK is an enterokinase cleavage site, which makes it convenient to cut off the fused TrxA-His segment with enterokinase to obtain the Cas9 protein in native form. After the fusion protein is digested with commercial His-tagged enterokinase, the TrxA-His segment and the His-tagged enterokinase can be removed by a single round of affinity chromatography to obtain the Cas9 protein in native form, which avoids the damage to and loss of target protein caused by repeated purification and dialysis.

The nuclear localization signal may be any nuclear localization signal, preferably an SV40 nuclear localization signal and/or a nucleoplasmin nuclear localization signal. NLS denotes a nuclear localization signal; one NLS site is placed at each of the N-terminus and the C-terminus of Cas9, so that Cas9 can enter the cell nucleus more effectively for gene editing.

The Cas9 protein may be saCas9 or spCas9, preferably is a spCas9 protein.

The PRONCN protein is specifically shown in SEQ ID NO:2.

Any one of the specific plasmids comprises the following elements from upstream to downstream in sequence: promoter, operator, ribosome binding site, PRONCN protein coding gene and terminator.

The promoter may specifically be a T7 promoter. The T7 promoter is a prokaryotic expression strong promoter and can efficiently drive the expression of exogenous genes.

The operator may specifically be a Lac operator. The Lac operator is a regulatory element for lactose-inducible expression: after the bacteria have grown to a certain density, IPTG can be used to induce expression of the target protein at low temperature, which avoids the effect of premature expression of the target protein on the growth of the host bacteria, and low-temperature induction also markedly improves the solubility of the expressed target protein.

The ribosome binding site is a ribosome binding site for protein translation, and is essential for protein translation.

The terminator may specifically be a T7 terminator. The T7 terminator can effectively terminate gene transcription at the end of the target gene, and prevent other downstream sequences except the target gene from being transcribed and translated.

The codons of the spCas9 protein coding sequence are optimized to fully match the codon preference of the high-efficiency Escherichia coli expression strain E. coli BL21 (DE3) selected in the present application, thereby improving the expression level of the Cas9 protein.

The T7 promoter is shown as nucleotides 5121-5139 of SEQ ID NO:1.

The Lac operator is shown as nucleotides 5140-5164 of SEQ ID NO:1.

The ribosome binding site is shown as nucleotides 5178-5201 of SEQ ID NO:1.

The coding sequence of the alkaline phosphatase signal peptide is shown as nucleotides 5209-5271 of SEQ ID NO:1.

The coding sequence of the TrxA protein is shown as nucleotides 5272-5598 of SEQ ID NO:1.

The coding sequence of the His-Tag is shown as nucleotides 5620-5637 of SEQ ID NO:1.

The coding sequence of the enterokinase cleavage site is shown as nucleotides 5638-5652 of SEQ ID NO:1.

The coding sequence of the nuclear localization signal is shown as nucleotides 5656-5670 of SEQ ID NO:1.

The coding sequence of the spCas9 protein is shown as nucleotides 5701-9801 of SEQ ID NO:1.

The coding sequence of the nuclear localization signal is shown as nucleotides 9802-9849 of SEQ ID NO:1.

The T7 terminator is shown as nucleotides 9902-9949 of SEQ ID NO:1.
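The coordinates listed above can be sanity-checked arithmetically. The Python sketch below takes the coding-element coordinates as stated (1-based, inclusive, on SEQ ID NO:1), computes each element's length in nucleotides and codons, and checks that every element starts in the reading frame that begins at nucleotide 5209. The labels "N-terminal" and "C-terminal" for the two nuclear localization signals are inferred from their positions relative to spCas9; the gaps between elements are presumably linker sequence, which the text does not describe.

```python
ORF_START = 5209  # first nucleotide of the phoA signal peptide coding sequence

# (element, start, end) on SEQ ID NO:1, 1-based inclusive, as listed in the text.
ELEMENTS = [
    ("phoA signal peptide", 5209, 5271),
    ("TrxA", 5272, 5598),
    ("His-Tag", 5620, 5637),
    ("enterokinase site", 5638, 5652),
    ("NLS (N-terminal)", 5656, 5670),
    ("spCas9", 5701, 9801),
    ("NLS (C-terminal)", 9802, 9849),
]


def summarize(elements, orf_start=ORF_START):
    """Return (name, length_nt, codons, in_frame) per element; reject overlaps."""
    rows = []
    prev_end = orf_start - 1
    for name, start, end in elements:
        if start <= prev_end:
            raise ValueError(f"{name} overlaps the previous element")
        length = end - start + 1
        in_frame = (start - orf_start) % 3 == 0 and length % 3 == 0
        rows.append((name, length, length // 3, in_frame))
        prev_end = end
    return rows


for row in summarize(ELEMENTS):
    print(row)
```

The His-Tag span works out to 6 codons and the enterokinase site to 5 codons, consistent with a His₆ tag and a five-residue cleavage site.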

Specifically, the specific plasmid is plasmid pKG-GE4.

Plasmid pKG-GE4 contains the sequence shown as nucleotides 5121-9949 of SEQ ID NO:1.

Specifically, plasmid pKG-GE4 is shown in SEQ ID NO:1.

The proportions of HNF1A-gU2, HNF1A-gD1, HNF1A-mutant-ss163 and NCN protein are as follows in sequence: 0.8-1.2 μg HNF1A-gU2: 0.8-1.2 μg HNF1A-gD1: 1.8-2.2 μg HNF1A-mutant-ss163: 3-5 μg NCN protein.

The proportions of HNF1A-gU2, HNF1A-gD1, HNF1A-mutant-ss163 and NCN protein are as follows in sequence: 1 μg HNF1A-gU2: 1 μg HNF1A-gD1: 2 μg HNF1A-mutant-ss163: μg NCN protein.

The proportions of the porcine cells, HNF1A-gU2, HNF1A-gD1, HNF1A-mutant-ss163 and NCN protein are as follows in sequence: 100,000 porcine cells: 0.8-1.2 μg HNF1A-gU2: 0.8-1.2 μg HNF1A-gD1: 1.8-2.2 μg HNF1A-mutant-ss163: 3-5 μg NCN protein.

The invention also protects the recombinant cell prepared by any one of the methods.

The invention also protects the application of the recombinant cell in preparing diabetes model pigs.

The recombinant cell is taken as a nuclear transplantation donor cell to carry out somatic cell cloning, so that a cloned pig, namely a diabetes model pig can be obtained.

The invention also protects a pig tissue of a model pig prepared by using the recombinant cell, namely a diabetes tissue model.

The invention also protects a pig organ of a model pig prepared by the recombinant cell, namely a diabetes organ model.

The invention also protects pig cells (such as islet cells or liver cells) of a model pig prepared by using the recombinant cells, namely a diabetes cell model.

The invention also protects the application of the recombinant cell, the diabetic tissue model, the diabetic organ model, the diabetic cell model or the diabetic model pig, which is (d 1) or (d 2) or (d 3) or (d 4) as follows:

(d1) Screening a medicament for treating diabetes;

(d2) Evaluating the drug effect of the diabetes drug;

(d3) Evaluating the curative effect of gene therapy and/or cell therapy of diabetes;

(d4) The pathogenesis of diabetes is studied.

Any one of the pigs may be a Yuanjiang fragrant pig.

Any of the above diabetes may be maturity-onset diabetes of the young (MODY).

Any of the above diabetes may be MODY3 diabetes.

MODY3 diabetes is caused by the following mutations in the HNF1A gene: a P291fsinsC mutation located at exon 4 of the HNF1A gene.

Pig HNF1A gene information: encodes hepatocyte nuclear factor 1-alpha; located on chromosome 14; GeneID 574067, Sus scrofa. The amino acid sequence of the protein encoded by the pig HNF1A gene is shown in SEQ ID NO:8. The pig HNF1A gene contains the DNA fragment shown in SEQ ID NO:9.

Compared with the prior art, the invention has at least the following beneficial effects:

(1) The subject of the invention (the pig) is better suited as a model than other animals (rats, mice, primates).

Rodents such as rats and mice differ greatly from humans in body size, organ size, physiology, pathology and other respects, and cannot truly simulate normal human physiological and pathological states. Studies have shown that over 95% of drugs validated as effective in rats and mice are not effective in human clinical trials. Among large animals, primates are the animals most closely related to humans, but they are small in size, mature late (mating begins at 6-7 years of age) and bear a single offspring per pregnancy, so their populations expand extremely slowly and they are costly to raise. In addition, primate cloning is inefficient, difficult and costly.

Pigs, by contrast, are the animals most closely related to humans apart from primates. Their body size, body weight and organ size are similar to those of humans, and they closely resemble humans in anatomy, physiology, immunology, nutritional metabolism, disease pathogenesis and other respects. Pigs also reach sexual maturity early (4-6 months), have high reproductive capacity and large litters, and can form a large population within 2-3 years. In addition, pig cloning technology is very mature, and the cost of cloning and raising pigs is much lower than for primates. Pigs are therefore very suitable as model animals for human disease.

(2) The vector constructed by the invention uses a strong promoter T7-lac which can efficiently express the target protein to express the target protein, and uses a signal peptide of bacterial periplasmic protein alkaline phosphatase (phoA) to guide the secretion and expression of the target protein to a bacterial periplasm cavity, so that the target protein is separated from the bacterial intracellular protein and is expressed in a soluble way. Meanwhile, the thioredoxin TrxA and the Cas9 protein are fused and expressed, the TrxA can help the coexpressed target protein to form a disulfide bond, the stability and the folding correctness of the protein are improved, and the solubility and the activity of the target protein are increased. In order to facilitate the purification of the target protein, the His tag is designed, and the target protein can be purified through one-step Ni column affinity chromatography, so that the purification process of the target protein is greatly simplified. Meanwhile, an enterokinase enzyme cutting site is designed behind the His tag, so that the fused TrxA-His polypeptide fragment can be conveniently cut off, and the Cas9 protein in a natural form can be obtained. After the fusion protein is digested by using the enterokinase with the His tag, the TrxA-His polypeptide fragment and the enterokinase with the His tag can be removed through one-time affinity chromatography to obtain the Cas9 protein in a natural form, so that the damage and loss of the target protein caused by multiple times of purification dialysis are avoided. Meanwhile, the invention also designs an NLS site at the N end and the C end of the Cas9 respectively, so that the Cas9 can enter the cell nucleus more effectively for gene editing. 
In addition, the E. coli BL21 (DE3) strain is selected as the expression strain for the target protein; this strain can efficiently express exogenous genes cloned into expression vectors containing a bacteriophage T7 promoter (such as pET-32a). The codons of the Cas9 protein coding sequence are also optimized in the invention to fully match the codon preference of the expression strain, thereby improving the expression level of the target protein. Furthermore, IPTG is used to induce expression of the target protein at low temperature after the bacteria have grown to a certain density, which avoids the effect of premature expression of the target protein on the growth of the host bacteria, and low-temperature induction also markedly improves the solubility of the expressed target protein. Through the above optimized design and experimental implementation, the activity of the obtained Cas9 protein is significantly improved compared with that of commercial Cas9 protein.

(3) Gene editing is carried out by combining the highly active Cas9 protein constructed and expressed in the invention with in vitro transcribed gRNA, the dosage ratio of Cas9 to gRNA is optimized, and a synthetic ssODN is used as the donor DNA, so that the proportion of single-cell clones carrying the target point mutation reaches 20%, far higher than the conventional point mutation efficiency (less than 5%).

(4) Using the obtained single-cell clones carrying the target gene point mutation for somatic cell nuclear transfer animal cloning directly yields cloned pigs carrying the point mutation, and the mutation can be stably inherited.

The method used in making mouse models, in which gene editing materials are injected into fertilized eggs and the embryos are then transplanted, is not suitable for making models in large animals with a long gestation period (such as pigs), because the probability of directly obtaining offspring with the point mutation is very low (less than 1%) and the offspring must then be cross-bred. The invention therefore adopts in vitro editing of primary cells, which is technically difficult and highly challenging, together with ssODN homologous recombination and screening of positively edited single-cell clones, and then directly obtains the corresponding disease model pig by somatic cell nuclear transfer animal cloning, which greatly shortens the time needed to produce the model pig and saves manpower, material and financial resources.

The invention uses CRISPR/Cas9 technology combined with ssODN homologous recombination technology to carry out site-directed modification of the HNF1A gene, simulating the natural genetic basis of MODY3 diabetes, and obtains single-cell clones with precise site-directed modification of the HNF1A gene, laying a foundation for later breeding of MODY3 diabetes model pigs by somatic cell nuclear transfer animal cloning. The model pig provides a powerful experimental tool for studying the pathogenesis of MODY3 diabetes and for drug research and development.

The invention lays a solid foundation for obtaining MODY3 type diabetes model pigs with HNF1A gene mutation by means of gene editing, is helpful for researching and disclosing pathogenesis of MODY3 type diabetes caused by HNF1A gene mutation, can also be used for research such as drug screening, drug effect detection, gene therapy, cell therapy and the like, can provide effective experimental data for further clinical application, and further provides a powerful experimental means for successfully treating human MODY3 type diabetes. The invention has great application value for researching and developing MODY3 diabetes drugs and disclosing the pathogenesis of the disease.

Drawings

FIG. 1 is a schematic diagram of the structure of plasmid pET-32a.

FIG. 2 is a schematic diagram of the structure of plasmid pKG-GE4.

FIG. 3 is an electrophoretogram of optimized dosage ratio of gRNA and NCN protein in example 3.

Fig. 4 is an electrophoretogram comparing gene editing efficiency of NCN protein and a commercial Cas9 protein in example 3.

FIG. 5 is an electrophoretogram of PCR amplification using different primer pairs using ear tissue-extracted genomes of swine designated as 1 as templates in example 4.

FIG. 6 is an electrophoretogram of PCR amplification using primer pairs consisting of HNF1A-E4-F174 and HNF1A-E4-R724, using genomic DNA of 18 pigs as templates, respectively, in example 4.

FIG. 7 is an electrophoretogram comparing the editing efficiency of different targets in example 4.

FIG. 8 is an electrophoretogram in example 5.

FIG. 9 is an alignment of the forward and reverse sequencing results of the single-cell clone numbered 6 with the target site-directed modified sequence.

FIG. 10 is an alignment of the forward and reverse sequencing results of the single-cell clone numbered 1 with the target site-directed modified sequence.

FIG. 11 is an alignment of the forward and reverse sequencing results of the single-cell clone numbered 15 with the target site-directed modified sequence.

FIG. 12 is an alignment of the forward and reverse sequencing results of the single-cell clone numbered 7 with the target site-directed modified sequence.

FIG. 13 is an alignment of the forward and reverse sequencing results of the single-cell clone numbered 17 with the target site-directed modified sequence.

FIG. 14 is an alignment of the forward and reverse sequencing results of the single-cell clone numbered 4 with the target site-directed modified sequence.

Detailed Description

The present invention is described in further detail below with reference to specific embodiments, and the examples are given only for illustrating the present invention and not for limiting the scope of the present invention. The examples provided below serve as a guide for further modifications by a person skilled in the art and do not constitute a limitation of the invention in any way.

Unless otherwise specified, the experimental procedures in the following examples were carried out in a conventional manner, according to the techniques or conditions described in the literature in this field or according to the product instructions. Unless otherwise specified, the materials and reagents used in the following examples are commercially available. The recombinant plasmids constructed in the examples were all verified by sequencing. The commercial Cas9-A protein and the commercial Cas9-B protein are each a commercially available Cas9 protein with good performance. Complete culture solution (% by volume): 15% fetal bovine serum (Gibco) + 83% DMEM medium (Gibco) + 1% Penicillin-Streptomycin (Gibco) + 1% HEPES (Solarbio). Cell culture conditions: constant-temperature incubator at 37 °C, 5% CO₂, 5% O₂.

The porcine primary fibroblasts used in the examples were all prepared from freshly harvested ear tissue of Jiangxiang pigs. The porcine primary fibroblasts were prepared as follows: (1) take 0.5 g of pig ear tissue, remove the hair and bone tissue, soak in 75% alcohol for 30-40 s, wash 5 times with PBS buffer containing 5% (by volume) Penicillin-Streptomycin (Gibco), and wash once with PBS buffer; (2) mince the tissue with scissors, digest with 5 mL of 0.1% collagenase solution (Sigma) at 37 °C for 1 h, centrifuge at 500 g for 5 min, and discard the supernatant; (3) resuspend the pellet in 1 mL of complete culture solution, plate it in a 10 cm diameter cell culture dish that has been coated with 0.2% gelatin (VWR) and contains 10 mL of complete culture solution, and culture until the cells cover 60% of the dish bottom; (4) after step (3), digest the cells with trypsin, collect them, and resuspend them in complete culture solution for use in the subsequent electroporation experiments.

Plasmid pKG-GE3 is a circular plasmid, shown as SEQ ID NO: 2 of patent application 202010084343.6. In that SEQ ID NO: 2, nucleotides 395 to 680 constitute the CMV enhancer, nucleotides 682 to 890 constitute the EF1a promoter, nucleotides 986 to 1006 encode a nuclear localization signal (NLS), nucleotides 1016 to 1036 encode an NLS, nucleotides 1037 to 5161 encode the Cas9 protein, nucleotides 5162 to 5209 encode an NLS, nucleotides 5219 to 5266 encode an NLS, nucleotides 5276 to 5332 encode the self-cleaving polypeptide P2A (the amino acid sequence of P2A is "ATNFSLSLLKKQAKGDAKGDVEENPGP"; self-cleavage occurs between the first and second amino acid residues from the C-terminus of the sequence), nucleotides 5333 to 6046 encode the EGFP protein, the nucleotide at positions 6056 to 539 encodes self-cleaving polypeptide T2A (the amino acid sequence of self-cleaving polypeptide T2A is "EGSLRGSLRGPLGVEGDVEGFENP", the nucleotide at positions 73610739 and the nucleotide at positions 7373769 to 677647), the nucleotide at positions WPBYb 6747 encodes the nucleotide sequence (the nucleotide sequence of the nucleotide at positions WPSbSLRGBW 679) and the nucleotide sequence of the nucleotide at positions 677610 to 677647), and the nucleotide sequence of WPSbRGSLRG 677610 to 677647, the sequence (WPSbRG 679). In that SEQ ID NO: 2, nucleotides 911 to 6706 constitute a fusion gene that expresses a fusion protein. Owing to the self-cleaving polypeptides P2A and T2A, the fusion protein spontaneously forms three proteins: a protein containing the Cas9 protein, a protein containing the EGFP protein, and a protein containing the Puro protein.

The pKG-U6gRNA vector, i.e. plasmid pKG-U6gRNA, is a circular plasmid, shown as SEQ ID NO: 3 of patent application 202010084343.6. In that SEQ ID NO: 3, nucleotides 2280 to 2539 constitute the hU6 promoter, and nucleotides 2558 to 2637 are transcribed to form the gRNA scaffold. In use, a DNA molecule of about 20 bp (which is transcribed to form the target sequence binding region of the gRNA) is inserted into plasmid pKG-U6gRNA to form a recombinant plasmid, and the recombinant plasmid is transcribed in the cell to give the gRNA.

Example 1 construction of prokaryotic Cas9 high-efficiency expression vector

The structure of plasmid pET-32a is schematically shown in FIG. 1.

Plasmid pKG-GE4 was obtained by modifying plasmid pET-32a as the starting plasmid. Plasmid pET32a-T7lac-phoA SP-TrxA-His-EK-NLS-spCas9-NLS-T7ter (plasmid pKG-GE4 for short) is a circular plasmid, shown in SEQ ID NO: 1; its structure is shown schematically in FIG. 2.

In SEQ ID NO: 1, nucleotides 5121-5139 constitute the T7 promoter, nucleotides 5140-5164 constitute the lac operator, nucleotides 5178-5201 constitute the ribosome binding site (RBS), nucleotides 5209-5271 encode the alkaline phosphatase signal peptide (phoA signal peptide), nucleotides 5272-5598 encode the TrxA protein, nucleotides 5620-5637 encode the His-Tag, nucleotides 5638-5652 encode the enterokinase cleavage site (EK cleavage site), nucleotides 5656-5670 encode a nuclear localization signal, nucleotides 5701-9801 encode the spCas9 protein, nucleotides 9802-9849 encode a nuclear localization signal, and nucleotides 9902-9949 constitute the T7 terminator. The nucleotides encoding the spCas9 protein have been codon-optimized for the E. coli BL21(DE3) strain.

The main modifications in plasmid pKG-GE4 are as follows: (1) the coding region of the TrxA protein is retained, since the TrxA protein helps the expressed target protein to form disulfide bonds and increases its solubility and activity; a coding sequence for the alkaline phosphatase signal peptide is added upstream of the TrxA coding region, since this signal peptide directs secretion of the expressed target protein into the bacterial periplasmic space and can be cleaved by the prokaryotic periplasmic signal peptidase; (2) a His-Tag coding sequence is added after the TrxA coding sequence, so that the expressed target protein can be enriched via the His-Tag; (3) the coding sequence of the enterokinase cleavage site DDDDK (Asp-Asp-Asp-Asp-Lys) is added downstream of the His-Tag coding sequence, so that the His-Tag and the upstream fused TrxA protein can be removed from the purified protein by enterokinase; (4) a Cas9 gene codon-optimized for expression in the E. coli BL21(DE3) strain is inserted, and nuclear localization signal coding sequences are added both upstream and downstream of the gene, which improves the nuclear localization capability of the subsequently purified Cas9 protein.

The fusion gene in plasmid pKG-GE4 consists of nucleotides 5209 to 9852 of SEQ ID NO: 1 and encodes the fusion protein shown in SEQ ID NO: 2 (fusion protein TrxA-His-EK-NLS-spCas9-NLS, abbreviated as PRONCN protein). Owing to the alkaline phosphatase signal peptide and the enterokinase cleavage site, the fusion protein, after cleavage by enterokinase, yields the protein shown in SEQ ID NO: 3, which is named the NCN protein.
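The coordinates listed for SEQ ID NO: 1 can be sanity-checked with a short script. The following is only an illustrative sketch: it uses the unambiguous coordinates given above, the names `FEATURES`, `feature_length` and `codon_count` are hypothetical, and it checks nothing beyond segment lengths and whether a coding segment is a whole number of nucleotide triplets.

```python
# Feature coordinates (1-based, inclusive) as listed in the text for SEQ ID NO: 1.
# Only the unambiguous entries are included; the dictionary itself is illustrative.
FEATURES = {
    "T7 promoter": (5121, 5139),
    "lac operator": (5140, 5164),
    "RBS": (5178, 5201),
    "phoA signal peptide": (5209, 5271),
    "TrxA": (5272, 5598),
    "spCas9": (5701, 9801),
    "C-terminal NLS": (9802, 9849),
    "T7 terminator": (9902, 9949),
    "fusion gene": (5209, 9852),
}


def feature_length(start: int, end: int) -> int:
    """Length in nucleotides of a 1-based inclusive interval."""
    return end - start + 1


def codon_count(start: int, end: int) -> int:
    """Number of nucleotide triplets; raises if the interval is not a multiple of 3."""
    length = feature_length(start, end)
    if length % 3 != 0:
        raise ValueError(f"{length} nt is not a whole number of triplets")
    return length // 3


if __name__ == "__main__":
    for name in ("phoA signal peptide", "TrxA", "spCas9", "C-terminal NLS", "fusion gene"):
        start, end = FEATURES[name]
        print(name, feature_length(start, end), "nt,", codon_count(start, end), "triplets")
```

A check of this kind catches transcription errors in coordinate lists, because every coding segment of an in-frame fusion should span a multiple of three nucleotides.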

Example 2 preparation and purification of NCN protein

1. Inducible expression

1. The plasmid pKG-GE4 was introduced into E. coli BL21(DE3) to obtain a recombinant strain.

2. The recombinant strain obtained in step 1 was inoculated into liquid LB medium containing 100 μg/ml ampicillin and cultured overnight at 37 °C with shaking at 200 rpm.

3. The bacterial culture obtained in step 2 was inoculated into liquid LB medium and cultured at 30 °C with shaking at 230 rpm until the OD600nm value reached 1.0; isopropyl thiogalactoside (IPTG) was then added to a concentration of 0.5 mM in the system, the culture was shaken at 230 rpm at 25 °C for 12 hours, and the cells were collected by centrifugation at 10000 g for 15 minutes at 4 °C.

4. The cells obtained in step 3 were washed with PBS buffer.

2. Purification of fusion protein TrxA-His-EK-NLS-spCas9-NLS

1. Crude extraction buffer was added to the cells obtained in part 1 (inducible expression) to resuspend them; the cells were then disrupted with a homogenizer (1000 bar, three cycles) and centrifuged at 15000 g for 30 min at 4 °C; the supernatant was collected and filtered through a filter membrane with a pore size of 0.22 μm, and the filtrate was collected. In this step, 10 ml of crude extraction buffer was used per gram of wet cells.

Crude extraction buffer: 20 mM Tris-HCl (pH 8.0), 0.5 M NaCl, 5 mM imidazole, 1 mM PMSF, balance ddH₂O.

2. The fusion protein was purified by affinity chromatography.

First, the Ni-NTA agarose column was equilibrated with 5 column volumes of equilibration solution (flow rate 1 ml/min); then 50 ml of the filtrate obtained in step 1 was loaded (flow rate 0.5-1 ml/min); the column was then washed with 5 column volumes of equilibration solution (flow rate 1 ml/min); the column was then washed with 5 column volumes of buffer (flow rate 1 ml/min) to remove contaminating proteins; finally, the column was eluted with 10 column volumes of eluent (flow rate 0.5-1 ml/min), and the post-column solution (90-100 ml) was collected.

Ni-NTA agarose column: GenScript, L00250/L00250-C, 10 ml of filler.

Equilibration solution: 20 mM Tris-HCl (pH 8.0), 0.5 M NaCl, 5 mM imidazole, balance ddH₂O.

Buffer: 20 mM Tris-HCl (pH 8.0), 0.5 M NaCl, 50 mM imidazole, balance ddH₂O.

Eluent: 20 mM Tris-HCl (pH 8.0), 0.5 M NaCl, 500 mM imidazole, balance ddH₂O.
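As a rough planning aid, the volumes implied by the chromatography steps and the three imidazole-containing solutions can be computed as below. This is a minimal sketch, not part of the described method: the stock concentrations in `STOCKS_MM` (1 M Tris-HCl, 5 M NaCl, 2 M imidazole) are assumptions chosen for illustration, and all function and variable names are hypothetical.

```python
COLUMN_VOLUME_ML = 10.0  # 10 ml of Ni-NTA filler, as stated in the text

# Column volumes (CV) per step, taken from the protocol text.
STEP_CV = {"equilibrate": 5, "wash_equilibration": 5, "wash_buffer": 5, "elute": 10}

# Final concentrations in mM, taken from the recipes in the text.
BUFFERS_MM = {
    "equilibration": {"Tris-HCl": 20, "NaCl": 500, "imidazole": 5},
    "buffer": {"Tris-HCl": 20, "NaCl": 500, "imidazole": 50},
    "eluent": {"Tris-HCl": 20, "NaCl": 500, "imidazole": 500},
}

# ASSUMPTION: hypothetical stock concentrations in mM, not given in the source.
STOCKS_MM = {"Tris-HCl": 1000, "NaCl": 5000, "imidazole": 2000}


def step_volume_ml(step: str) -> float:
    """Liquid volume needed for one chromatography step."""
    return STEP_CV[step] * COLUMN_VOLUME_ML


def stock_volumes_ml(buffer_name: str, total_ml: float) -> dict:
    """Volumes of each stock (C1*V1 = C2*V2) and of water for a given final volume."""
    recipe = BUFFERS_MM[buffer_name]
    volumes = {name: final * total_ml / STOCKS_MM[name] for name, final in recipe.items()}
    volumes["ddH2O"] = total_ml - sum(volumes.values())
    return volumes


if __name__ == "__main__":
    print(step_volume_ml("elute"))                 # eluent needed for 10 CV
    print(stock_volumes_ml("equilibration", 1000)) # stock volumes for 1 litre
```

With a 10 ml column, 10 column volumes of eluent correspond to 100 ml, which is consistent with the 90-100 ml of post-column solution collected.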

3. Enzyme digestion of fusion protein TrxA-His-EK-NLS-spCas9-NLS and purification of NCN protein

1. 15 ml of the post-column solution collected in part 2 was concentrated to 200 μl using an Amicon ultrafiltration tube (Sigma, UFC9100, capacity 15 ml) and then diluted to 1 ml with 25 mM Tris-HCl (pH 8.0). Six ultrafiltration tubes were used, giving a total of 6 ml.

2. Commercial His-tagged recombinant bovine enterokinase (Bio, C620031, Recombinant Bovine Enterokinase Light Chain, His) was added to the solution (about 6 ml) obtained in step 1, and digestion was carried out at 25 °C for 16 hours. Enterokinase was added at 2 units per 50 μg of protein.

3. The solution from step 2 (about 6 ml) was mixed with 480 μl of Ni-NTA resin (GenScript, L00250/L00250-C), mixed by rotation at room temperature for 15 min, and centrifuged at 7000 g for 3 min; the supernatant (4-5.5 ml) was collected.

4. The supernatant obtained in step 3 was concentrated to 200 μl using an Amicon ultrafiltration tube (Sigma, UFC9100, capacity 15 ml) and added to enzyme stock solution, and the protein concentration was adjusted to 5 mg/ml to obtain the NCN protein solution.

The protein in the NCN protein solution was sequenced; its 15 N-terminal amino acid residues are as shown in positions 1 to 15 of SEQ ID NO: 3, i.e., the protein is the NCN protein.

The NCN proteins used in the subsequent examples were all provided by NCN protein solutions.

Enzyme stock solution (pH 7.4): 10 mM Tris, 300 mM NaCl, 0.1 mM EDTA, 1 mM DTT, 50% (by volume) glycerol, balance ddH₂O.
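The two dosing rules in this part (2 units of enterokinase per 50 μg of protein, and a final NCN protein concentration of 5 mg/ml) reduce to simple proportions. The sketch below only illustrates that arithmetic; the function names and the example protein amounts are hypothetical and do not come from the source.

```python
def enterokinase_units(protein_ug: float, units_per_50_ug: float = 2.0) -> float:
    """Units of enterokinase for a given protein mass (2 units per 50 ug, per the text)."""
    return protein_ug / 50.0 * units_per_50_ug


def final_volume_ul(protein_mg: float, target_mg_per_ml: float = 5.0) -> float:
    """Total volume (ul) at which a given protein mass reaches the target concentration."""
    return protein_mg / target_mg_per_ml * 1000.0


if __name__ == "__main__":
    # Hypothetical amounts, for illustration only.
    print(enterokinase_units(1000.0))  # units for 1 mg of fusion protein
    print(final_volume_ul(2.0))        # volume that brings 2 mg of protein to 5 mg/ml
```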

Example 3 Performance of NCN protein

The 2 gRNA targets targeting the TTN gene were selected as follows:

TTN-gRNA1：AGAGCACAGTCAGCCTGGCG；

TTN-gRNA2：CTTCCAGAATTGGATCTCCG。

Primers used to identify the target fragment containing the gRNA targets in the TTN gene were as follows:

TTN-F55：TACGGAATTGGGGAGCCAGCGGA；

TTN-R560：CAAAGTTAACTCTCTGTGTCT。

1. preparation of gRNA

1. Preparation of TTN-T7-gRNA1 transcription template and TTN-T7-gRNA2 transcription template

The TTN-T7-gRNA1 transcription template is a double-stranded DNA molecule, shown in SEQ ID NO: 4.

The TTN-T7-gRNA2 transcription template is a double-stranded DNA molecule, shown in SEQ ID NO: 5.

2. In vitro transcription to obtain gRNA

The TTN-T7-gRNA1 transcription template was transcribed in vitro using the TranscriptAid T7 High Yield Transcription Kit (Fermentas, K0441), and the product was recovered and purified using the MEGAclear™ Transcription Clean-Up Kit (Thermo, AM1908) to obtain TTN-gRNA1. TTN-gRNA1 is a single-stranded RNA, shown in SEQ ID NO: 6.

The TTN-T7-gRNA2 transcription template was transcribed in vitro using the TranscriptAid T7 High Yield Transcription Kit (Fermentas, K0441), and the product was recovered and purified using the MEGAclear™ Transcription Clean-Up Kit (Thermo, AM1908) to obtain TTN-gRNA2. TTN-gRNA2 is a single-stranded RNA, shown in SEQ ID NO: 7.

2. gRNA and NCN protein dosage proportion optimization

1. Co-transfected primary porcine fibroblasts

First group: TTN-gRNA1, TTN-gRNA2 and NCN protein were co-transfected into porcine primary fibroblasts. Ratio: about 100,000 porcine primary fibroblasts : 0.5 μg TTN-gRNA1 : 0.5 μg TTN-gRNA2 : 4 μg NCN protein.

Second group: TTN-gRNA1, TTN-gRNA2 and NCN protein were co-transfected into porcine primary fibroblasts. Ratio: about 100,000 porcine primary fibroblasts : 0.75 μg TTN-gRNA1 : 0.75 μg TTN-gRNA2 : 4 μg NCN protein.

Third group: TTN-gRNA1, TTN-gRNA2 and NCN protein were co-transfected into porcine primary fibroblasts. Ratio: about 100,000 porcine primary fibroblasts : 1 μg TTN-gRNA1 : 1 μg TTN-gRNA2 : 4 μg NCN protein.

Fourth group: TTN-gRNA1, TTN-gRNA2 and NCN protein were co-transfected into porcine primary fibroblasts. Ratio: about 100,000 porcine primary fibroblasts : 1.25 μg TTN-gRNA1 : 1.25 μg TTN-gRNA2 : 4 μg NCN protein.

Fifth group: TTN-gRNA1 and TTN-gRNA2 were co-transfected into porcine primary fibroblasts. Ratio: about 100,000 porcine primary fibroblasts : 1 μg TTN-gRNA1 : 1 μg TTN-gRNA2.

Co-transfection was performed by electroporation using a mammalian nuclear transfection kit (Neon kit, Thermo Fisher) and a Neon™ transfection system electroporator (parameters: 1450 V, 10 ms, 3 pulses).

2. After step 1, the cells were cultured in complete culture solution for 12 to 18 hours, and the medium was then replaced with fresh complete culture solution for further culture. The total culture time after electroporation was 48 hours.

3. After completion of step 2, cells were digested with trypsin and collected, genomic DNA was extracted, PCR amplified using a primer pair consisting of TTN-F55 and TTN-R560, and then subjected to 1% agarose gel electrophoresis.

The electrophoretogram is shown in FIG. 3. The 505 bp band is the wild-type band (WT), and the band of about 254 bp (the 505 bp wild-type band with a theoretical deletion of 251 bp) is the deletion mutant band (MT).

Gene deletion mutation efficiency = (MT grayscale / MT band bp) / (WT grayscale / WT band bp + MT grayscale / MT band bp) × 100%. The gene deletion mutation efficiency was 19.9% for the first group, 39.9% for the second group, 79.9% for the third group, and 44.3% for the fourth group. No mutation occurred in the fifth group.

The results show that gene editing efficiency is highest when the mass ratio of the two gRNAs to the NLS-spCas9-NLS (NCN) protein is 1:1:4, with actual amounts of 1 μg : 1 μg : 4 μg. The optimal amounts of the two gRNAs and the NCN protein were therefore determined to be 1 μg : 1 μg : 4 μg.
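The efficiency formula normalizes each band's grayscale by its length in bp, so that a shorter fragment is not under-counted, and then takes the mutant share of the normalized signal. A minimal sketch of that calculation follows; the function name `deletion_efficiency` and the grayscale values in the usage example are hypothetical, while the default band sizes (254 bp and 505 bp) are those stated in the text.

```python
def deletion_efficiency(mt_gray: float, wt_gray: float,
                        mt_bp: int = 254, wt_bp: int = 505) -> float:
    """Gene deletion mutation efficiency in percent.

    Implements: (MT gray / MT bp) / (WT gray / WT bp + MT gray / MT bp) * 100.
    """
    mt_norm = mt_gray / mt_bp
    wt_norm = wt_gray / wt_bp
    if mt_norm + wt_norm == 0:
        raise ValueError("both bands have zero grayscale")
    return mt_norm / (wt_norm + mt_norm) * 100.0


if __name__ == "__main__":
    # Hypothetical grayscale readings, for illustration only.
    print(deletion_efficiency(mt_gray=254.0, wt_gray=505.0))
```

Equal grayscale per bp in the two bands gives 50%, which is a convenient check that the length normalization is applied to both terms.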

3. Comparison of Gene editing efficiency of NCN protein with that of the commercial Cas9 protein

1. Co-transfected porcine primary fibroblasts

Cas9-A group: TTN-gRNA1, TTN-gRNA2 and the commercial Cas9-A protein were co-transfected into porcine primary fibroblasts. Ratio: about 100,000 porcine primary fibroblasts : 1 μg TTN-gRNA1 : 1 μg TTN-gRNA2 : 4 μg Cas9-A protein.

pKG-GE4 group: TTN-gRNA1, TTN-gRNA2 and NCN protein were co-transfected into porcine primary fibroblasts. Ratio: about 100,000 porcine primary fibroblasts : 1 μg TTN-gRNA1 : 1 μg TTN-gRNA2 : 4 μg NCN protein.

Cas9-B group: TTN-gRNA1, TTN-gRNA2 and the commercial Cas9-B protein were co-transfected into porcine primary fibroblasts. Ratio: about 100,000 porcine primary fibroblasts : 1 μg TTN-gRNA1 : 1 μg TTN-gRNA2 : 4 μg Cas9-B protein.

Control group: TTN-gRNA1 and TTN-gRNA2 were co-transfected into porcine primary fibroblasts. Ratio: about 100,000 porcine primary fibroblasts : 1 μg TTN-gRNA1 : 1 μg TTN-gRNA2.

Co-transfection was performed by electroporation using a mammalian nuclear transfection kit (Neon kit, Thermo Fisher) and a Neon™ transfection system electroporator (parameters: 1450 V, 10 ms, 3 pulses).

The electrophoretogram is shown in FIG. 4. The gene deletion mutation efficiency was 28.5% with the commercial Cas9-A protein, 85.6% with the NCN protein, and 16.6% with the commercial Cas9-B protein.

The results show that, compared with the commercial Cas9 proteins, the NCN protein prepared by the invention significantly improves gene editing efficiency.

Example 4 screening of efficient gRNA target of HNF1A Gene

Pig HNF1A gene information: encodes hepatocyte nuclear factor 1-alpha; located on chromosome 14; Gene ID 574067, Sus scrofa. The amino acid sequence of the protein encoded by the pig HNF1A gene is shown in SEQ ID NO: 8. In genomic DNA, the pig HNF1A gene has 10 exons. A partial sequence of the pig HNF1A gene (containing exon 4 and 400 bp upstream and downstream of it) is shown in SEQ ID NO: 9.

1. Conservation analysis of preset point mutation sites and adjacent genome sequences of HNF1A gene

18 newborn Jiangxiang pigs were used: 10 females (designated 1, 2, 3, 4, 5, 6, 7, 8, 9 and 10) and 8 males (designated A, B, C, D, E, F, G and H).

Genomic DNA extracted from the ear tissue of the pig designated 1 was used as the template for PCR amplification with different primer pairs, followed by 1% agarose gel electrophoresis. The electrophoretogram is shown in FIG. 5. In FIG. 5, group 1 used the primer pair consisting of HNF1A-E4-F174 and HNF1A-E4-R724, and group 2 used the primer pair consisting of HNF1A-E4-F228 and HNF1A-E4-R716. The results show that the primer pair consisting of HNF1A-E4-F174 and HNF1A-E4-R724 is preferable for amplifying the target fragment.

Genomic DNA from each of the 18 pigs was used as the template for PCR amplification with the primer pair consisting of HNF1A-E4-F174 and HNF1A-E4-R724, followed by 1% agarose gel electrophoresis. The electrophoretogram is shown in FIG. 6. The PCR amplification products were recovered and sequenced, and the sequencing results were compared with the HNF1A gene sequence in the public database. A region conserved across all 18 pigs was selected for designing the gRNA targets.

HNF1A-E4-F174：AGAGAGGCTAAGTCACTTGCTCA；

HNF1A-E4-R724：AGAGCTGATGATCAATGGAGTGG；

HNF1A-E4-F228：GTCTGCCAACCTCAAACACTCAG；

HNF1A-E4-R716：TGATCAATGGAGTGGAGAAAGCC。

2. Screening target spots

A number of candidate targets were first identified by searching for NGG PAM sequences (avoiding possible mutation sites), and 4 targets were then selected from these candidates through a preliminary experiment.

The 4 targets were as follows:

HNF1A-E4-gU1：AGAAGCATTTCGGCACAAGT；

HNF1A-E4-gU2：ATTTCGGCACAAGTTGGCCA；

HNF1A-E4-gD1：GGGCAGACCAGGAGAGCTGT；

HNF1A-E4-gD2：GGGGCAGACCAGGAGAGCTG。
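The NGG-based primary screen described above can be sketched as a scan for 20-nt stretches immediately followed by an NGG PAM on either strand. This is an illustrative sketch only: the function names are hypothetical, the scan ignores the conservation and mutation-site filters mentioned in the text, and the demo sequence is a toy string built by overlapping HNF1A-E4-gD2 and HNF1A-E4-gD1 (assuming the two targets lie on the same strand, offset by one base), not SEQ ID NO: 9.

```python
def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence (A/C/G/T only)."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_ngg_targets(seq: str, length: int = 20):
    """Return (strand, offset, protospacer, PAM) for every site followed by NGG.

    Offsets are 0-based positions on the scanned strand ('+' is the input,
    '-' is its reverse complement).
    """
    hits = []
    for strand, s in (("+", seq.upper()), ("-", reverse_complement(seq))):
        for i in range(len(s) - length - 2):
            pam = s[i + length:i + length + 3]
            if pam[1:] == "GG":
                hits.append((strand, i, s[i:i + length], pam))
    return hits


if __name__ == "__main__":
    # Toy sequence (assumption): gD2 followed by the bases implied by gD1 and an NGG PAM.
    demo = "GGGGCAGACCAGGAGAGCTGTGGG"
    for hit in find_ngg_targets(demo):
        print(hit)
```

In practice the candidate list from such a scan would still be filtered against the conserved region identified in the 18 pigs and tested experimentally, as the text describes.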

3. preparation of recombinant plasmid

The plasmid pKG-U6gRNA was digested with the restriction enzyme BbsI, and the vector backbone (approximately 3kb linear large fragment) was recovered.

HNF1A-E4-gU1-S and HNF1A-E4-gU1-A were synthesized separately, then mixed and annealed to obtain a double-stranded DNA molecule with sticky ends. This double-stranded DNA molecule was ligated to the vector backbone to obtain plasmid pKG-U6gRNA(HNF1A-E4-gU1). Plasmid pKG-U6gRNA(HNF1A-E4-gU1) expresses sgRNA_HNF1A-E4-gU1, shown in SEQ ID NO: 10. sgRNA_HNF1A-E4-gU1 (SEQ ID NO: 10):

AGAAGCAUUUCGGCACAAGUguuuuagagcuagaaauagcaaguuaaaauaaggcuaguccguuaucaacuugaaaaaguggcaccgagucggugcuuuu。

HNF1A-E4-gU2-S and HNF1A-E4-gU2-A were synthesized separately, then mixed and annealed to obtain a double-stranded DNA molecule with sticky ends. This double-stranded DNA molecule was ligated to the vector backbone to obtain plasmid pKG-U6gRNA(HNF1A-E4-gU2). Plasmid pKG-U6gRNA(HNF1A-E4-gU2) expresses sgRNA_HNF1A-E4-gU2, shown in SEQ ID NO: 11. sgRNA_HNF1A-E4-gU2 (SEQ ID NO: 11):

AUUUCGGCACAAGUUGGCCAguuuuagagcuagaaauagcaaguuaaaauaaggcuaguccguuaucaacuugaaaaaguggcaccgagucggugcuuuu。

HNF1A-E4-gD1-S and HNF1A-E4-gD1-A were synthesized separately, then mixed and annealed to obtain a double-stranded DNA molecule with sticky ends. This double-stranded DNA molecule was ligated to the vector backbone to obtain plasmid pKG-U6gRNA(HNF1A-E4-gD1). Plasmid pKG-U6gRNA(HNF1A-E4-gD1) expresses sgRNA_HNF1A-E4-gD1, shown in SEQ ID NO: 12. sgRNA_HNF1A-E4-gD1 (SEQ ID NO: 12):

GGGCAGACCAGGAGAGCUGUguuuuagagcuagaaauagcaaguuaaaauaaggcuaguccguuaucaacuugaaaaaguggcaccgagucggugcuuuu。

HNF1A-E4-gD2-S and HNF1A-E4-gD2-A were synthesized separately, then mixed and annealed to obtain a double-stranded DNA molecule with sticky ends. This double-stranded DNA molecule was ligated to the vector backbone to obtain plasmid pKG-U6gRNA(HNF1A-E4-gD2). Plasmid pKG-U6gRNA(HNF1A-E4-gD2) expresses sgRNA_HNF1A-E4-gD2, shown in SEQ ID NO: 13. sgRNA_HNF1A-E4-gD2 (SEQ ID NO: 13):

GGGGCAGACCAGGAGAGCUGguuuuagagcuagaaauagcaaguuaaaauaaggcuaguccguuaucaacuugaaaaaguggcaccgagucggugcuuuu。

HNF1A-E4-gU1-S：caccgAGAAGCATTTCGGCACAAGT；

HNF1A-E4-gU1-A：aaacACTTGTGCCGAAATGCTTCTc；

HNF1A-E4-gU2-S：caccgATTTCGGCACAAGTTGGCCA；

HNF1A-E4-gU2-A：aaacTGGCCAACTTGTGCCGAAATc；

HNF1A-E4-gD1-S：caccGGGCAGACCAGGAGAGCTGT；

HNF1A-E4-gD1-A：aaacACAGCTCTCCTGGTCTGCCC；

HNF1A-E4-gD2-S：caccGGGGCAGACCAGGAGAGCTG；

HNF1A-E4-gD2-A：aaacCAGCTCTCCTGGTCTGCCCC。

HNF1A-E4-gU1-S, HNF1A-E4-gU1-A, HNF1A-E4-gU2-S, HNF1A-E4-gU2-A, HNF1A-E4-gD1-S, HNF1A-E4-gD1-A, HNF1A-E4-gD2-S, HNF1A-E4-gD2-A are single-stranded DNA molecules.
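The eight oligonucleotides listed above follow a regular pattern: the sense strand is `cacc` plus the target, the antisense strand is `aaac` plus the reverse complement of the target, and an extra `g` (sense) with a matching `c` (antisense) is added when the target does not begin with G. The sketch below reproduces that pattern as inferred from the listed sequences; the function names are hypothetical, and the rule for the extra g/c is an inference from the examples rather than a statement in the text.

```python
def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence (A/C/G/T only)."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def design_oligos(target: str):
    """Return (sense, antisense) oligos for cloning a 20-nt target.

    Pattern inferred from the listed oligos: 'cacc'/'aaac' overhangs in lower
    case, plus an extra g/c pair when the target does not start with G.
    """
    target = target.upper()
    needs_g = not target.startswith("G")
    sense = "cacc" + ("g" if needs_g else "") + target
    antisense = "aaac" + reverse_complement(target) + ("c" if needs_g else "")
    return sense, antisense


if __name__ == "__main__":
    for name, target in [
        ("HNF1A-E4-gU1", "AGAAGCATTTCGGCACAAGT"),
        ("HNF1A-E4-gU2", "ATTTCGGCACAAGTTGGCCA"),
        ("HNF1A-E4-gD1", "GGGCAGACCAGGAGAGCTGT"),
        ("HNF1A-E4-gD2", "GGGGCAGACCAGGAGAGCTG"),
    ]:
        sense, antisense = design_oligos(target)
        print(f"{name}-S: {sense}")
        print(f"{name}-A: {antisense}")
```

Running the sketch on the four targets regenerates the S and A oligos exactly as listed, which is a quick way to confirm that an oligo order matches its target.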

4. Comparison of editing efficiency of different targets

1. Cotransfection

First group: plasmid pKG-U6gRNA(HNF1A-E4-gU1) and plasmid pKG-GE3 were co-transfected into porcine primary fibroblasts. Ratio: about 200,000 porcine primary fibroblasts : 0.92 μg plasmid pKG-U6gRNA(HNF1A-E4-gU1) : 1.08 μg plasmid pKG-GE3.

Second group: plasmid pKG-U6gRNA(HNF1A-E4-gU2) and plasmid pKG-GE3 were co-transfected into porcine primary fibroblasts. Ratio: about 200,000 porcine primary fibroblasts : 0.92 μg plasmid pKG-U6gRNA(HNF1A-E4-gU2) : 1.08 μg plasmid pKG-GE3.

Third group: plasmid pKG-U6gRNA(HNF1A-E4-gD1) and plasmid pKG-GE3 were co-transfected into porcine primary fibroblasts. Ratio: about 200,000 porcine primary fibroblasts : 0.92 μg plasmid pKG-U6gRNA(HNF1A-E4-gD1) : 1.08 μg plasmid pKG-GE3.

Fourth group: plasmid pKG-U6gRNA(HNF1A-E4-gD2) and plasmid pKG-GE3 were co-transfected into porcine primary fibroblasts. Ratio: about 200,000 porcine primary fibroblasts : 0.92 μg plasmid pKG-U6gRNA(HNF1A-E4-gD2) : 1.08 μg plasmid pKG-GE3.

Fifth group: porcine primary fibroblasts were electroporated with the same parameters but without any plasmid.

2. After step 1, the cells were cultured in complete culture solution for 12 to 18 hours, and the medium was then replaced with fresh complete culture solution for further culture. The total culture time after electroporation was 48 hours.

3. After step 2 was completed, cells were digested with trypsin and collected, lysed, genomic DNA was extracted, PCR amplified using a primer pair consisting of HNF1A-E4-F174 and HNF1A-E4-R724, and then subjected to 1% agarose gel electrophoresis. The mutation of the target gene of the cell is detected, and the electrophoretogram is shown in FIG. 7.

The target product was excised from the gel, recovered, and sent to a sequencing company for sequencing; the sequencing chromatograms were analyzed with the web version of the Synthego ICE tool to obtain the gene editing efficiency of each target. The gene editing efficiencies of the first to fourth groups were 21%, 67%, 73% and 40%, respectively. No gene editing occurred in the fifth group. The results show that HNF1A-E4-gU2 and HNF1A-E4-gD1 have the higher editing efficiencies.

Example 5 preparation of HNF1A Gene-pinpointed modified monoclonal cells by somatic cloning

The two high-efficiency gRNA targets screened in example 4 (HNF1A-E4-gU2 and HNF1A-E4-gD1) were selected.

1. Preparation of gRNA

1. Preparing HNF1A-T7-gU2 transcription template and HNF1A-T7-gD1 transcription template

The HNF1A-T7-gU2 transcription template is a double-stranded DNA molecule, shown in SEQ ID NO: 14.

The HNF1A-T7-gD1 transcription template is a double-stranded DNA molecule, shown in SEQ ID NO: 15.

2. In vitro transcription to obtain gRNA

The HNF1A-T7-gU2 transcription template was transcribed in vitro using the TranscriptAid T7 High Yield Transcription Kit (Fermentas, K0441), and the product was recovered and purified using the MEGAclear™ Transcription Clean-Up Kit (Thermo, AM1908) to obtain HNF1A-gU2. HNF1A-gU2 is a single-stranded RNA, shown in SEQ ID NO: 16.

The HNF1A-T7-gD1 transcription template was transcribed in vitro using the TranscriptAid T7 High Yield Transcription Kit (Fermentas, K0441), and the product was recovered and purified using the MEGAclear™ Transcription Clean-Up Kit (Thermo, AM1908) to obtain HNF1A-gD1. HNF1A-gD1 is a single-stranded RNA, shown in SEQ ID NO: 17.

HNF1A-gU2(SEQ ID NO.16)：

GGAUUUCGGCACAAGUUGGCCAGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCUUUU。

HNF1A-gD1(SEQ ID NO.17)：

GGGGGCAGACCAGGAGAGCUGUGUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCUUUU。
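Comparing SEQ ID NO: 16 and SEQ ID NO: 17 with the target sequences shows a common layout: two leading G residues, the 20-nt target written as RNA, and an 80-nt scaffold. The sketch below assembles a transcript on that pattern; the function name `t7_grna` and the constant names are hypothetical, and the interpretation of the leading `GG` as originating from T7 transcription initiation is an assumption not stated in the text.

```python
# 80-nt gRNA scaffold, copied from the 3' portion of SEQ ID NO: 16 / 17.
SCAFFOLD = (
    "GUUUUAGAGCUAGAAAUAGCAAGUUAAAAUAAGGCUAGUCCGUUAUCAACUUGAAAAAGUGGCACCGAGUCGGUGCUUUU"
)

# ASSUMPTION: the two leading G residues are treated as a fixed 5' leader.
LEADER = "GG"


def t7_grna(target_dna: str, leader: str = LEADER) -> str:
    """Assemble a gRNA as leader + target (T replaced by U) + scaffold."""
    return leader + target_dna.upper().replace("T", "U") + SCAFFOLD


if __name__ == "__main__":
    print(t7_grna("ATTTCGGCACAAGTTGGCCA"))  # target of HNF1A-E4-gU2
    print(t7_grna("GGGCAGACCAGGAGAGCTGT"))  # target of HNF1A-E4-gD1
```

Passing `leader=""` and lower-casing the scaffold gives the layout of the plasmid-expressed sgRNAs of SEQ ID NO: 10 to 13, which carry no extra 5' residues.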

2. Synthesis of Single-stranded Donor DNA having Single base inserted at target site of HNF1A Gene

A single-stranded DNA carrying a single-base insertion at the target site of the HNF1A gene was synthesized as the Donor DNA. In addition to the target site-directed modification, this single-stranded DNA contains synonymous mutations in the PAM sequences of the HNF1A-E4-gU2 and HNF1A-E4-gD1 targets. The single-stranded Donor DNA was named HNF1A-mutant-ss163.

HNF1A-mutant-ss163 is shown in SEQ ID NO: 18.

3. Transfection of porcine primary fibroblasts

1. HNF1A-gU2, HNF1A-gD1, HNF1A-mutant-ss163 and NCN protein were co-transfected into porcine primary fibroblasts. Ratio: about 100,000 porcine primary fibroblasts : 1 μg HNF1A-gU2 : 1 μg HNF1A-gD1 : 2 μg HNF1A-mutant-ss163 : 4 μg NCN protein. Co-transfection was performed by electroporation using a mammalian nuclear transfection kit (Neon kit, Thermo Fisher) and a Neon™ transfection system electroporator (parameters: 1450 V, 10 ms, 3 pulses).

2. After step 1, the cells were cultured in complete culture solution for 16 to 18 hours, and the medium was then replaced with fresh complete culture solution for further culture. The total culture time after electroporation was 48 hours.

3. After step 2, the cells were digested with trypsin and collected, washed with complete culture solution, and resuspended in complete culture solution; single cells were then picked and transferred to a 96-well plate (1 cell per well, 100 μl of complete culture solution per well) and cultured for 2 weeks (with fresh complete culture solution every 2-3 days).

4. After completion of step 3, cells were trypsinized and collected (cells obtained per well, approximately 2/3 of which were seeded into 6-well plates containing complete medium, the remaining 1/3 of which were collected in 1.5mL centrifuge tubes).

5. The 6-well plate of step 4 was taken, cultured until the cells grew to 80% confluency, trypsinized and collected, and the cells were cryopreserved using a cell cryopreservation solution (90% complete medium +10% DMSO, vol.).

6. The cells in the centrifuge tubes from step 4 were lysed and genomic DNA was extracted; PCR amplification was performed with the primer pair consisting of HNF1A-E4-F174 and HNF1A-E4-R724, followed by electrophoresis. Porcine primary fibroblasts were used as the wild-type control (WT). The electrophoretogram is shown in FIG. 8. The lane numbers in FIG. 8 correspond to the cell numbers in Table 1.

7. After completion of step 6, the PCR amplification product was recovered and sequenced.

The porcine primary fibroblasts give only one sequencing result, and their genotype is homozygous wild type. If a monoclonal cell gives two sequencing results, one identical to that of the porcine primary fibroblasts and the other mutated relative to it (a mutation being a deletion, insertion or substitution of one or more nucleotides), the genotype of the monoclonal cell is heterozygous. If a monoclonal cell gives two sequencing results, both mutated relative to that of the porcine primary fibroblasts, the genotype of the monoclonal cell is a biallelic different mutant. If a monoclonal cell gives one sequencing result that is mutated relative to that of the porcine primary fibroblasts, the genotype of the monoclonal cell is a biallelic identical mutant. If a monoclonal cell gives one sequencing result that is identical to that of the porcine primary fibroblasts, the genotype of the monoclonal cell is homozygous wild type.
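The genotype rules above amount to a small decision table on the number of distinct sequencing results and how many of them differ from wild type. A minimal sketch follows; the function name `classify_genotype` and the toy allele strings in the usage example are hypothetical, and real sequencing reads would first need to be resolved into allele sequences.

```python
def classify_genotype(alleles, wild_type: str) -> str:
    """Classify a single-cell clone from its distinct allele sequences.

    alleles: iterable of one or two allele sequences found by sequencing.
    wild_type: the sequence obtained from the porcine primary fibroblasts.
    """
    distinct = set(alleles)
    if not 1 <= len(distinct) <= 2:
        raise ValueError("expected one or two distinct sequencing results")
    mutated = {a for a in distinct if a != wild_type}
    if len(distinct) == 1:
        return "biallelic identical mutant" if mutated else "homozygous wild type"
    return "heterozygous" if len(mutated) == 1 else "biallelic different mutant"


if __name__ == "__main__":
    wt = "ACGTACGT"  # toy wild-type sequence, for illustration only
    print(classify_genotype([wt], wt))
    print(classify_genotype([wt, "ACGTTACGT"], wt))
    print(classify_genotype(["ACGACGT", "ACGTTACGT"], wt))
    print(classify_genotype(["ACGTTACGT"], wt))
```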

The results are shown in Table 1. The genotypes of the single cell clones numbered 6, 14, 18, 22, 36, 37, 44, 51 were homozygous wild-type. The genotypes of the single cell clones numbered 1, 3, 12, 17, 19, 23, 25, 26, 28, 29, 33, 35, 39, 42, 45, 47, 49, 50, 52, 54, 55 were heterozygous. The genotypes of the single cell clones numbered 2, 8, 13, 15, 24, 27, 30, 38, 43, 53 are biallelically distinct mutants. The genotypes of the single cell clones numbered 4, 5, 7, 9, 10, 11, 16, 20, 21, 31, 32, 34, 40, 41, 46, 48 are biallelic identical mutants. Among them, single cell clones numbered 8, 13, 17, 24, 27, 43, 50 and 53 were heterozygous for the target site mutation (i.e., one of the two homologous chromosomes completed the replacement of single-stranded Donor DNA), and single cell clones numbered 4, 11 and 32 were biallelic mutant for the target site mutation (i.e., both homologous chromosomes completed the replacement of single-stranded Donor DNA). The percentage of single-cell clones with HNF1A gene editing was 85.5%, and the percentage of single-cell clones with targeted site-directed modification (i.e., single-cell clones numbered 4, 11, 32, 8, 13, 17, 24, 27, 43, 50, 53) was 20%.
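The two percentages reported above can be recomputed directly from the clone numbers listed for each genotype. The sketch below does only that bookkeeping; the names `GENOTYPES`, `TARGETED` and `summarize` are hypothetical, and the clone lists are copied from the text.

```python
# Clone numbers per genotype, copied from the text.
GENOTYPES = {
    "homozygous wild type": [6, 14, 18, 22, 36, 37, 44, 51],
    "heterozygous": [1, 3, 12, 17, 19, 23, 25, 26, 28, 29, 33, 35, 39, 42,
                     45, 47, 49, 50, 52, 54, 55],
    "biallelic different mutant": [2, 8, 13, 15, 24, 27, 30, 38, 43, 53],
    "biallelic identical mutant": [4, 5, 7, 9, 10, 11, 16, 20, 21, 31, 32,
                                   34, 40, 41, 46, 48],
}

# Clones carrying the target site-directed modification, copied from the text.
TARGETED = [4, 11, 32, 8, 13, 17, 24, 27, 43, 50, 53]


def summarize() -> dict:
    """Total clone count and the two percentages reported in the text."""
    total = sum(len(clones) for clones in GENOTYPES.values())
    edited = total - len(GENOTYPES["homozygous wild type"])
    return {
        "total": total,
        "edited_pct": 100 * edited / total,
        "targeted_pct": 100 * len(TARGETED) / total,
    }


if __name__ == "__main__":
    print(summarize())
```

The four lists together cover clones 1 to 55 without overlap, so 47 of 55 clones are edited and 11 of 55 carry the targeted modification.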

Exemplary sequencing alignments are shown in FIGS. 9 to 14. In each figure, the forward and reverse sequencing results of a single-cell clone are aligned with the target site-directed modified sequence. FIG. 9 shows the single-cell clone numbered 6, which is homozygous wild type. FIG. 10 shows the single-cell clone numbered 1, which is heterozygous. FIG. 11 shows the single-cell clone numbered 15, which is a biallelic different mutant. FIG. 12 shows the single-cell clone numbered 7, which is a biallelic identical mutant. FIG. 13 shows the single-cell clone numbered 17, which is heterozygous for the target site-directed modification. FIG. 14 shows the single-cell clone numbered 4, which is a biallelic mutant for the target site-directed modification.

Table 1. Genotype determination of single-cell clones for the HNF1A gene

Note: target site-directed modification means that the single-stranded Donor DNA replacement has been completed, i.e., the DNA molecule shown in SEQ ID NO: 19 has been replaced with the DNA molecule shown in SEQ ID NO: 18.

The single-cell clones numbered 8, 13, 17, 24, 27, 43, 50 and 53 are heterozygous for the target site-directed mutation (i.e., one of the two homologous chromosomes completed the single-stranded Donor DNA replacement), and the single-cell clones numbered 4, 11 and 32 are biallelic mutants for the target site-directed mutation (i.e., both homologous chromosomes completed the single-stranded Donor DNA replacement).

Recombinant cells carrying the target site-directed modification, whether heterozygous or homozygous, can be used for the subsequent production of cloned pigs. Using these cells as nuclear transplantation donor cells for somatic cell cloning yields cloned pigs, namely MODY3 type diabetes model pigs.

The present invention has been described in detail above. It will be apparent to those skilled in the art that the invention can be practiced over a wide range of equivalent parameters, concentrations and conditions without departing from the spirit and scope of the invention and without undue experimentation. While the invention has been described with reference to specific embodiments, it will be appreciated that it can be further modified. In general, this application is intended to cover any variations, uses or adaptations of the invention that follow its principles, including departures from the present disclosure that come within known or customary practice in the art to which the invention pertains. Some of the essential features may be applied within the scope of the appended claims.

Sequence listing

<110> Nanjing Qizhen Genetic Engineering Co., Ltd.

<120> Gene editing system for constructing diabetes model pig nuclear transplantation donor cells with HNF1A gene mutation and application thereof

<130> GNCYX211868

<160> 19

<170> SIPOSequenceListing 1.0

<210> 1

<211> 9974

<212> DNA

<213> Artificial Sequence (Artificial Sequence)

<400> 1

tggcgaatgg gacgcgccct gtagcggcgc attaagcgcg gcgggtgtgg tggttacgcg 60

cagcgtgacc gctacacttg ccagcgccct agcgcccgct cctttcgctt tcttcccttc 120

ctttctcgcc acgttcgccg gctttccccg tcaagctcta aatcgggggc tccctttagg 180

gttccgattt agtgctttac ggcacctcga ccccaaaaaa cttgattagg gtgatggttc 240

acgtagtggg ccatcgccct gatagacggt ttttcgccct ttgacgttgg agtccacgtt 300

ctttaatagt ggactcttgt tccaaactgg aacaacactc aaccctatct cggtctattc 360

ttttgattta taagggattt tgccgatttc ggcctattgg ttaaaaaatg agctgattta 420

acaaaaattt aacgcgaatt ttaacaaaat attaacgttt acaatttcag gtggcacttt 480

tcggggaaat gtgcgcggaa cccctatttg tttatttttc taaatacatt caaatatgta 540

tccgctcatg agacaataac cctgataaat gcttcaataa tattgaaaaa ggaagagtat 600

gagtattcaa catttccgtg tcgcccttat tccctttttt gcggcatttt gccttcctgt 660

ttttgctcac ccagaaacgc tggtgaaagt aaaagatgct gaagatcagt tgggtgcacg 720

agtgggttac atcgaactgg atctcaacag cggtaagatc cttgagagtt ttcgccccga 780

agaacgtttt ccaatgatga gcacttttaa agttctgcta tgtggcgcgg tattatcccg 840

tattgacgcc gggcaagagc aactcggtcg ccgcatacac tattctcaga atgacttggt 900

tgagtactca ccagtcacag aaaagcatct tacggatggc atgacagtaa gagaattatg 960

cagtgctgcc ataaccatga gtgataacac tgcggccaac ttacttctga caacgatcgg 1020

aggaccgaag gagctaaccg cttttttgca caacatgggg gatcatgtaa ctcgccttga 1080

tcgttgggaa ccggagctga atgaagccat accaaacgac gagcgtgaca ccacgatgcc 1140

tgcagcaatg gcaacaacgt tgcgcaaact attaactggc gaactactta ctctagcttc 1200

ccggcaacaa ttaatagact ggatggaggc ggataaagtt gcaggaccac ttctgcgctc 1260

ggcccttccg gctggctggt ttattgctga taaatctgga gccggtgagc gtgggtctcg 1320

cggtatcatt gcagcactgg ggccagatgg taagccctcc cgtatcgtag ttatctacac 1380

gacggggagt caggcaacta tggatgaacg aaatagacag atcgctgaga taggtgcctc 1440

actgattaag cattggtaac tgtcagacca agtttactca tatatacttt agattgattt 1500

aaaacttcat ttttaattta aaaggatcta ggtgaagatc ctttttgata atctcatgac 1560

caaaatccct taacgtgagt tttcgttcca ctgagcgtca gaccccgtag aaaagatcaa 1620

aggatcttct tgagatcctt tttttctgcg cgtaatctgc tgcttgcaaa caaaaaaacc 1680

accgctacca gcggtggttt gtttgccgga tcaagagcta ccaactcttt ttccgaaggt 1740

aactggcttc agcagagcgc agataccaaa tactgtcctt ctagtgtagc cgtagttagg 1800

ccaccacttc aagaactctg tagcaccgcc tacatacctc gctctgctaa tcctgttacc 1860

agtggctgct gccagtggcg ataagtcgtg tcttaccggg ttggactcaa gacgatagtt 1920

accggataag gcgcagcggt cgggctgaac ggggggttcg tgcacacagc ccagcttgga 1980

gcgaacgacc tacaccgaac tgagatacct acagcgtgag ctatgagaaa gcgccacgct 2040

tcccgaaggg agaaaggcgg acaggtatcc ggtaagcggc agggtcggaa caggagagcg 2100

cacgagggag cttccagggg gaaacgcctg gtatctttat agtcctgtcg ggtttcgcca 2160

cctctgactt gagcgtcgat ttttgtgatg ctcgtcaggg gggcggagcc tatggaaaaa 2220

cgccagcaac gcggcctttt tacggttcct ggccttttgc tggccttttg ctcacatgtt 2280

ctttcctgcg ttatcccctg attctgtgga taaccgtatt accgcctttg agtgagctga 2340

taccgctcgc cgcagccgaa cgaccgagcg cagcgagtca gtgagcgagg aagcggaaga 2400

gcgcctgatg cggtattttc tccttacgca tctgtgcggt atttcacacc gcatatatgg 2460

tgcactctca gtacaatctg ctctgatgcc gcatagttaa gccagtatac actccgctat 2520

cgctacgtga ctgggtcatg gctgcgcccc gacacccgcc aacacccgct gacgcgccct 2580

gacgggcttg tctgctcccg gcatccgctt acagacaagc tgtgaccgtc tccgggagct 2640

gcatgtgtca gaggttttca ccgtcatcac cgaaacgcgc gaggcagctg cggtaaagct 2700

catcagcgtg gtcgtgaagc gattcacaga tgtctgcctg ttcatccgcg tccagctcgt 2760

tgagtttctc cagaagcgtt aatgtctggc ttctgataaa gcgggccatg ttaagggcgg 2820

ttttttcctg tttggtcact gatgcctccg tgtaaggggg atttctgttc atgggggtaa 2880

tgataccgat gaaacgagag aggatgctca cgatacgggt tactgatgat gaacatgccc 2940

ggttactgga acgttgtgag ggtaaacaac tggcggtatg gatgcggcgg gaccagagaa 3000

aaatcactca gggtcaatgc cagcgcttcg ttaatacaga tgtaggtgtt ccacagggta 3060

gccagcagca tcctgcgatg cagatccgga acataatggt gcagggcgct gacttccgcg 3120

tttccagact ttacgaaaca cggaaaccga agaccattca tgttgttgct caggtcgcag 3180

acgttttgca gcagcagtcg cttcacgttc gctcgcgtat cggtgattca ttctgctaac 3240

cagtaaggca accccgccag cctagccggg tcctcaacga caggagcacg atcatgcgca 3300

cccgtggggc cgccatgccg gcgataatgg cctgcttctc gccgaaacgt ttggtggcgg 3360

gaccagtgac gaaggcttga gcgagggcgt gcaagattcc gaataccgca agcgacaggc 3420

cgatcatcgt cgcgctccag cgaaagcggt cctcgccgaa aatgacccag agcgctgccg 3480

gcacctgtcc tacgagttgc atgataaaga agacagtcat aagtgcggcg acgatagtca 3540

tgccccgcgc ccaccggaag gagctgactg ggttgaaggc tctcaagggc atcggtcgag 3600

atcccggtgc ctaatgagtg agctaactta cattaattgc gttgcgctca ctgcccgctt 3660

tccagtcggg aaacctgtcg tgccagctgc attaatgaat cggccaacgc gcggggagag 3720

gcggtttgcg tattgggcgc cagggtggtt tttcttttca ccagtgagac gggcaacagc 3780

tgattgccct tcaccgcctg gccctgagag agttgcagca agcggtccac gctggtttgc 3840

cccagcaggc gaaaatcctg tttgatggtg gttaacggcg ggatataaca tgagctgtct 3900

tcggtatcgt cgtatcccac taccgagatg tccgcaccaa cgcgcagccc ggactcggta 3960

atggcgcgca ttgcgcccag cgccatctga tcgttggcaa ccagcatcgc agtgggaacg 4020

atgccctcat tcagcatttg catggtttgt tgaaaaccgg acatggcact ccagtcgcct 4080

tcccgttccg ctatcggctg aatttgattg cgagtgagat atttatgcca gccagccaga 4140

cgcagacgcg ccgagacaga acttaatggg cccgctaaca gcgcgatttg ctggtgaccc 4200

aatgcgacca gatgctccac gcccagtcgc gtaccgtctt catgggagaa aataatactg 4260

ttgatgggtg tctggtcaga gacatcaaga aataacgccg gaacattagt gcaggcagct 4320

tccacagcaa tggcatcctg gtcatccagc ggatagttaa tgatcagccc actgacgcgt 4380

tgcgcgagaa gattgtgcac cgccgcttta caggcttcga cgccgcttcg ttctaccatc 4440

gacaccacca cgctggcacc cagttgatcg gcgcgagatt taatcgccgc gacaatttgc 4500

gacggcgcgt gcagggccag actggaggtg gcaacgccaa tcagcaacga ctgtttgccc 4560

gccagttgtt gtgccacgcg gttgggaatg taattcagct ccgccatcgc cgcttccact 4620

ttttcccgcg ttttcgcaga aacgtggctg gcctggttca ccacgcggga aacggtctga 4680

taagagacac cggcatactc tgcgacatcg tataacgtta ctggtttcac attcaccacc 4740

ctgaattgac tctcttccgg gcgctatcat gccataccgc gaaaggtttt gcgccattcg 4800

atggtgtccg ggatctcgac gctctccctt atgcgactcc tgcattagga agcagcccag 4860

tagtaggttg aggccgttga gcaccgccgc cgcaaggaat ggtgcatgca aggagatggc 4920

gcccaacagt cccccggcca cggggcctgc caccataccc acgccgaaac aagcgctcat 4980

gagcccgaag tggcgagccc gatcttcccc atcggtgatg tcggcgatat aggcgccagc 5040

aaccgcacct gtggcgccgg tgatgccggc cacgatgcgt ccggcgtaga ggatcgagat 5100

cgatctcgat cccgcgaaat taatacgact cactataggg gaattgtgag cggataacaa 5160

ttcccctcta gaaataattt tgtttaactt taagaaggag atatacatat gaaacaaagc 5220

actattgcac tggcactctt accgttactg tttacccctg tgacaaaagc catgagcgat 5280

aaaattattc acctgactga cgacagtttt gacacggatg tactcaaagc ggacggggcg 5340

atcctcgtcg atttctgggc agagtggtgc ggtccgtgca aaatgatcgc cccgattctg 5400

gatgaaatcg ctgacgaata tcagggcaaa ctgaccgttg caaaactgaa catcgatcaa 5460

aaccctggca ctgcgccgaa atatggcatc cgtggtatcc cgactctgct gctgttcaaa 5520

aacggtgaag tggcggcaac caaagtgggt gcactgtcta aaggtcagtt gaaagagttc 5580

ctcgacgcta acctggccgg ttctggttct ggccatatgc accatcatca tcatcatgac 5640

gatgacgata agatgcccaa aaagaaacga aaggtgggta tccacggagt cccagcagcc 5700

gacaaaaaat atagcatcgg cctggacatc ggtaccaaca gcgttggctg ggcagtgatc 5760

actgatgaat acaaagttcc atccaaaaaa tttaaagtac tgggcaacac cgaccgtcac 5820

tctatcaaaa aaaacctgat tggtgctctg ctgtttgaca gcggcgaaac tgctgaggct 5880

acccgtctga aacgtacggc tcgccgtcgc tacactcgtc gtaaaaaccg catctgttat 5940

ctgcaggaaa ttttctctaa cgaaatggca aaagttgatg atagcttctt tcatcgtctg 6000

gaagagagct tcctggtgga agaagataaa aaacacgaac gtcacccgat tttcggtaac 6060

attgtggatg aggttgccta ccacgagaaa tatccgacca tctaccatct gcgtaaaaaa 6120

ctggttgata gcactgacaa agcggatctg cgtctgatct acctggctct ggcacacatg 6180

atcaaattcc gtggtcactt cctgatcgaa ggtgatctga accctgataa ctccgacgtg 6240

gacaaactgt tcattcagct ggttcagacc tataaccagc tgttcgaaga aaacccgatc 6300

aacgcgtccg gtgtagacgc taaggcaatt ctgtctgcgc gtctgtctaa gtctcgtcgt 6360

ctggaaaacc tgattgcgca actgccaggt gaaaagaaaa acggcctgtt cggcaatctg 6420

atcgccctgt ccctgggtct gactccgaac tttaaatcca actttgacct ggcggaagat 6480

gccaagctgc agctgagcaa agatacctat gacgatgacc tggataacct gctggcacag 6540

atcggtgatc agtatgccga tctgttcctg gccgcgaaaa acctgtctga tgcgattctg 6600

ctgtctgata tcctgcgcgt taacactgaa attactaaag cgccgctgag cgcatccatg 6660

attaaacgtt acgatgaaca ccaccaggat ctgaccctgc tgaaagcgct ggtgcgtcag 6720

cagctgccgg aaaaatacaa ggagatcttc ttcgaccaga gcaaaaacgg ttacgcgggc 6780

tacattgatg gtggtgcatc tcaggaggaa ttctacaaat tcattaaacc gatcctggaa 6840

aaaatggatg gtactgaaga gctgctggtt aaactgaatc gtgaagatct gctgcgcaaa 6900

cagcgtacct tcgataacgg ttccatcccg catcagattc atctgggcga actgcacgct 6960

atcctgcgcc gtcaggaaga cttttatccg ttcctgaaag acaaccgtga gaaaattgaa 7020

aaaatcctga ccttccgtat tccgtactat gtaggtccgc tggcgcgtgg taactcccgt 7080

ttcgcttgga tgacccgcaa aagcgaagaa accatcaccc cgtggaattt cgaagaagtc 7140

gttgacaaag gcgcgtccgc gcagtctttc atcgaacgca tgacgaactt cgacaaaaac 7200

ctgccgaacg agaaagtgct gccgaaacac tctctgctgt acgagtactt cactgtgtac 7260

aacgaactga ccaaagtgaa atacgtcacc gaaggtatgc gtaaaccggc attcctgtcc 7320

ggtgagcaaa aaaaagcaat cgtggatctg ctgttcaaaa ccaaccgtaa agtaaccgtg 7380

aaacagctga aggaagacta tttcaagaaa atcgaatgtt ttgattctgt tgaaatctcc 7440

ggcgtggaag atcgcttcaa tgcgtccctg ggtacgtatc acgacctgct gaaaattatc 7500

aaagacaaag attttctgga caacgaggaa aacgaagaca tcctggagga tattgtactg 7560

accctgaccc tgttcgaaga ccgtgagatg atcgaagaac gcctgaaaac ctacgcccac 7620

ctgttcgatg acaaggtaat gaagcagctg aaacgtcgtc gttataccgg ctggggtcgt 7680

ctgtcccgta aactgatcaa tggcatccgt gataaacagt ctggcaaaac catcctggac 7740

ttcctgaaat ccgacggttt cgcgaatcgt aacttcatgc aactgattca tgacgattct 7800

ctgactttca aagaagacat ccagaaagca caggtttccg gccagggtga ctctctgcac 7860

gagcacattg ccaatctggc tggttctccg gctattaaaa agggtattct gcagactgtg 7920

aaagtagttg atgagctggt caaagtaatg ggccgtcaca agccggaaaa cattgtgatc 7980

gaaatggcac gtgaaaacca gacgacccag aaaggtcaga aaaactctcg tgaacgcatg 8040

aaacgtatcg aagaaggcat caaagaactg ggctctcaga tcctgaagga acaccctgta 8100

gaaaataccc agctgcagaa cgaaaagctg tatctgtatt acctgcagaa cggccgcgat 8160

atgtatgtgg accaggaact ggatatcaac cgcctgtccg attacgatgt agatcacatc 8220

gtgccgcaaa gcttcctgaa agacgacagc attgacaaca aagtactgac ccgttctgat 8280

aagaaccgtg gcaaatccga taacgtcccg tctgaagaag ttgttaaaaa aatgaaaaac 8340

tattggcgtc agctgctgaa cgcgaaactg atcacccagc gtaagttcga caatctgact 8400

aaagctgagc gcggtggtct gtccgaactg gataaagcgg gttttatcaa acgccagctg 8460

gttgaaaccc gtcagatcac gaagcacgtt gcgcagattc tggactctcg tatgaacacc 8520

aaatacgacg aaaacgacaa actgatccgc gaggttaagg ttatcaccct gaaaagcaaa 8580

ctggtatccg attttcgtaa agactttcag ttctacaaag tgcgcgaaat taacaactat 8640

caccacgctc acgatgcata tctgaatgca gttgttggca cggcgctgat caaaaagtat 8700

ccgaaactgg aatctgaatt cgtatacggc gattacaaag tgtatgacgt tcgtaagatg 8760

atcgcaaaat ccgagcagga aattggtaag gcgacggcga aatacttctt ttattccaat 8820

attatgaact ttttcaaaac cgaaatcacc ctggcgaatg gtgaaattcg taaacgcccg 8880

ctgatcgaaa ccaacggtga aactggtgaa atcgtttggg acaaaggccg cgacttcgcg 8940

accgtgcgta aagttctgtc tatgccgcaa gtgaacatcg tcaagaagac cgaagtacaa 9000

accggcggtt ttagcaaaga gagcattctg ccaaaacgta actccgacaa actgatcgcg 9060

cgcaagaaag actgggatcc gaaaaaatac ggtggtttcg attctccaac cgttgcttat 9120

tccgttctgg tggtagccaa agttgagaaa ggtaaaagca aaaaactgaa atccgtaaag 9180

gaactgctgg gtattactat catggagcgt agctccttcg aaaaaaaccc gatcgatttt 9240

ctggaagcga aaggctataa agaagtcaaa aaggacctga tcatcaaact gccaaaatac 9300

agcctgttcg agctggaaaa cggccgtaaa cgtatgctgg catctgcggg cgaactgcag 9360

aaaggcaacg agctggctct gccgtccaaa tacgtgaact ttctgtacct ggcctctcac 9420

tacgaaaaac tgaaaggttc cccggaagac aacgaacaga aacagctgtt cgtagagcag 9480

cacaaacact acctggacga gatcatcgaa cagatttctg aattttctaa acgtgtgatt 9540

ctggctgatg cgaatctgga taaagttctg tctgcctata acaagcatcg tgacaaaccg 9600

atccgcgaac aggctgagaa catcatccac ctgttcactc tgactaacct gggcgcgcca 9660

gcggctttca agtactttga taccaccatt gaccgcaagc gttacacctc cactaaagaa 9720

gtgctggacg cgactctgat ccaccagtcc atcaccggtc tgtacgagac ccgtatcgat 9780

ctgagccagc tgggcggtga caaaaggccg gcggccacga aaaaggccgg ccaggcaaaa 9840

aagaaaaagt gacaaagccc gaaaggaagc tgagttggct gctgccaccg ctgagcaata 9900

actagcataa ccccttgggg cctctaaacg ggtcttgagg ggttttttgc tgaaaggagg 9960

aactatatcc ggat 9974

<210> 2

<211> 1547

<212> PRT

<213> Artificial Sequence (Artificial Sequence)

<400> 2

Met Lys Gln Ser Thr Ile Ala Leu Ala Leu Leu Pro Leu Leu Phe Thr

1 5 10 15

Pro Val Thr Lys Ala Met Ser Asp Lys Ile Ile His Leu Thr Asp Asp

20 25 30

Ser Phe Asp Thr Asp Val Leu Lys Ala Asp Gly Ala Ile Leu Val Asp

35 40 45

Phe Trp Ala Glu Trp Cys Gly Pro Cys Lys Met Ile Ala Pro Ile Leu

50 55 60

Asp Glu Ile Ala Asp Glu Tyr Gln Gly Lys Leu Thr Val Ala Lys Leu

65 70 75 80

Asn Ile Asp Gln Asn Pro Gly Thr Ala Pro Lys Tyr Gly Ile Arg Gly

85 90 95

Ile Pro Thr Leu Leu Leu Phe Lys Asn Gly Glu Val Ala Ala Thr Lys

100 105 110

Val Gly Ala Leu Ser Lys Gly Gln Leu Lys Glu Phe Leu Asp Ala Asn

115 120 125

Leu Ala Gly Ser Gly Ser Gly His Met His His His His His His Asp

130 135 140

Asp Asp Asp Lys Met Pro Lys Lys Lys Arg Lys Val Gly Ile His Gly

145 150 155 160

Val Pro Ala Ala Asp Lys Lys Tyr Ser Ile Gly Leu Asp Ile Gly Thr

165 170 175

Asn Ser Val Gly Trp Ala Val Ile Thr Asp Glu Tyr Lys Val Pro Ser

180 185 190

Lys Lys Phe Lys Val Leu Gly Asn Thr Asp Arg His Ser Ile Lys Lys

195 200 205

Asn Leu Ile Gly Ala Leu Leu Phe Asp Ser Gly Glu Thr Ala Glu Ala

210 215 220

Thr Arg Leu Lys Arg Thr Ala Arg Arg Arg Tyr Thr Arg Arg Lys Asn

225 230 235 240

Arg Ile Cys Tyr Leu Gln Glu Ile Phe Ser Asn Glu Met Ala Lys Val

245 250 255

Asp Asp Ser Phe Phe His Arg Leu Glu Glu Ser Phe Leu Val Glu Glu

260 265 270

Asp Lys Lys His Glu Arg His Pro Ile Phe Gly Asn Ile Val Asp Glu

275 280 285

Val Ala Tyr His Glu Lys Tyr Pro Thr Ile Tyr His Leu Arg Lys Lys

290 295 300

Leu Val Asp Ser Thr Asp Lys Ala Asp Leu Arg Leu Ile Tyr Leu Ala

305 310 315 320

Leu Ala His Met Ile Lys Phe Arg Gly His Phe Leu Ile Glu Gly Asp

325 330 335

Leu Asn Pro Asp Asn Ser Asp Val Asp Lys Leu Phe Ile Gln Leu Val

340 345 350

Gln Thr Tyr Asn Gln Leu Phe Glu Glu Asn Pro Ile Asn Ala Ser Gly

355 360 365

Val Asp Ala Lys Ala Ile Leu Ser Ala Arg Leu Ser Lys Ser Arg Arg

370 375 380

Leu Glu Asn Leu Ile Ala Gln Leu Pro Gly Glu Lys Lys Asn Gly Leu

385 390 395 400

Phe Gly Asn Leu Ile Ala Leu Ser Leu Gly Leu Thr Pro Asn Phe Lys

405 410 415

Ser Asn Phe Asp Leu Ala Glu Asp Ala Lys Leu Gln Leu Ser Lys Asp

420 425 430

Thr Tyr Asp Asp Asp Leu Asp Asn Leu Leu Ala Gln Ile Gly Asp Gln

435 440 445

Tyr Ala Asp Leu Phe Leu Ala Ala Lys Asn Leu Ser Asp Ala Ile Leu

450 455 460

Leu Ser Asp Ile Leu Arg Val Asn Thr Glu Ile Thr Lys Ala Pro Leu

465 470 475 480

Ser Ala Ser Met Ile Lys Arg Tyr Asp Glu His His Gln Asp Leu Thr

485 490 495

Leu Leu Lys Ala Leu Val Arg Gln Gln Leu Pro Glu Lys Tyr Lys Glu

500 505 510

Ile Phe Phe Asp Gln Ser Lys Asn Gly Tyr Ala Gly Tyr Ile Asp Gly

515 520 525

Gly Ala Ser Gln Glu Glu Phe Tyr Lys Phe Ile Lys Pro Ile Leu Glu

530 535 540

Lys Met Asp Gly Thr Glu Glu Leu Leu Val Lys Leu Asn Arg Glu Asp

545 550 555 560

Leu Leu Arg Lys Gln Arg Thr Phe Asp Asn Gly Ser Ile Pro His Gln

565 570 575

Ile His Leu Gly Glu Leu His Ala Ile Leu Arg Arg Gln Glu Asp Phe

580 585 590

Tyr Pro Phe Leu Lys Asp Asn Arg Glu Lys Ile Glu Lys Ile Leu Thr

595 600 605

Phe Arg Ile Pro Tyr Tyr Val Gly Pro Leu Ala Arg Gly Asn Ser Arg

610 615 620

Phe Ala Trp Met Thr Arg Lys Ser Glu Glu Thr Ile Thr Pro Trp Asn

625 630 635 640

Phe Glu Glu Val Val Asp Lys Gly Ala Ser Ala Gln Ser Phe Ile Glu

645 650 655

Arg Met Thr Asn Phe Asp Lys Asn Leu Pro Asn Glu Lys Val Leu Pro

660 665 670

Lys His Ser Leu Leu Tyr Glu Tyr Phe Thr Val Tyr Asn Glu Leu Thr

675 680 685

Lys Val Lys Tyr Val Thr Glu Gly Met Arg Lys Pro Ala Phe Leu Ser

690 695 700

Gly Glu Gln Lys Lys Ala Ile Val Asp Leu Leu Phe Lys Thr Asn Arg

705 710 715 720

Lys Val Thr Val Lys Gln Leu Lys Glu Asp Tyr Phe Lys Lys Ile Glu

725 730 735

Cys Phe Asp Ser Val Glu Ile Ser Gly Val Glu Asp Arg Phe Asn Ala

740 745 750

Ser Leu Gly Thr Tyr His Asp Leu Leu Lys Ile Ile Lys Asp Lys Asp

755 760 765

Phe Leu Asp Asn Glu Glu Asn Glu Asp Ile Leu Glu Asp Ile Val Leu

770 775 780

Thr Leu Thr Leu Phe Glu Asp Arg Glu Met Ile Glu Glu Arg Leu Lys

785 790 795 800

Thr Tyr Ala His Leu Phe Asp Asp Lys Val Met Lys Gln Leu Lys Arg

805 810 815

Arg Arg Tyr Thr Gly Trp Gly Arg Leu Ser Arg Lys Leu Ile Asn Gly

820 825 830

Ile Arg Asp Lys Gln Ser Gly Lys Thr Ile Leu Asp Phe Leu Lys Ser

835 840 845

Asp Gly Phe Ala Asn Arg Asn Phe Met Gln Leu Ile His Asp Asp Ser

850 855 860

Leu Thr Phe Lys Glu Asp Ile Gln Lys Ala Gln Val Ser Gly Gln Gly

865 870 875 880

Asp Ser Leu His Glu His Ile Ala Asn Leu Ala Gly Ser Pro Ala Ile

885 890 895

Lys Lys Gly Ile Leu Gln Thr Val Lys Val Val Asp Glu Leu Val Lys

900 905 910

Val Met Gly Arg His Lys Pro Glu Asn Ile Val Ile Glu Met Ala Arg

915 920 925

Glu Asn Gln Thr Thr Gln Lys Gly Gln Lys Asn Ser Arg Glu Arg Met

930 935 940

Lys Arg Ile Glu Glu Gly Ile Lys Glu Leu Gly Ser Gln Ile Leu Lys

945 950 955 960

Glu His Pro Val Glu Asn Thr Gln Leu Gln Asn Glu Lys Leu Tyr Leu

965 970 975

Tyr Tyr Leu Gln Asn Gly Arg Asp Met Tyr Val Asp Gln Glu Leu Asp

980 985 990

Ile Asn Arg Leu Ser Asp Tyr Asp Val Asp His Ile Val Pro Gln Ser

995 1000 1005

Phe Leu Lys Asp Asp Ser Ile Asp Asn Lys Val Leu Thr Arg Ser Asp

1010 1015 1020

Lys Asn Arg Gly Lys Ser Asp Asn Val Pro Ser Glu Glu Val Val Lys

1025 1030 1035 1040

Lys Met Lys Asn Tyr Trp Arg Gln Leu Leu Asn Ala Lys Leu Ile Thr

1045 1050 1055

Gln Arg Lys Phe Asp Asn Leu Thr Lys Ala Glu Arg Gly Gly Leu Ser

1060 1065 1070

Glu Leu Asp Lys Ala Gly Phe Ile Lys Arg Gln Leu Val Glu Thr Arg

1075 1080 1085

Gln Ile Thr Lys His Val Ala Gln Ile Leu Asp Ser Arg Met Asn Thr

1090 1095 1100

Lys Tyr Asp Glu Asn Asp Lys Leu Ile Arg Glu Val Lys Val Ile Thr

1105 1110 1115 1120

Leu Lys Ser Lys Leu Val Ser Asp Phe Arg Lys Asp Phe Gln Phe Tyr

1125 1130 1135

Lys Val Arg Glu Ile Asn Asn Tyr His His Ala His Asp Ala Tyr Leu

1140 1145 1150

Asn Ala Val Val Gly Thr Ala Leu Ile Lys Lys Tyr Pro Lys Leu Glu

1155 1160 1165

Ser Glu Phe Val Tyr Gly Asp Tyr Lys Val Tyr Asp Val Arg Lys Met

1170 1175 1180

Ile Ala Lys Ser Glu Gln Glu Ile Gly Lys Ala Thr Ala Lys Tyr Phe

1185 1190 1195 1200

Phe Tyr Ser Asn Ile Met Asn Phe Phe Lys Thr Glu Ile Thr Leu Ala

1205 1210 1215

Asn Gly Glu Ile Arg Lys Arg Pro Leu Ile Glu Thr Asn Gly Glu Thr

1220 1225 1230

Gly Glu Ile Val Trp Asp Lys Gly Arg Asp Phe Ala Thr Val Arg Lys

1235 1240 1245

Val Leu Ser Met Pro Gln Val Asn Ile Val Lys Lys Thr Glu Val Gln

1250 1255 1260

Thr Gly Gly Phe Ser Lys Glu Ser Ile Leu Pro Lys Arg Asn Ser Asp

1265 1270 1275 1280

Lys Leu Ile Ala Arg Lys Lys Asp Trp Asp Pro Lys Lys Tyr Gly Gly

1285 1290 1295

Phe Asp Ser Pro Thr Val Ala Tyr Ser Val Leu Val Val Ala Lys Val

1300 1305 1310

Glu Lys Gly Lys Ser Lys Lys Leu Lys Ser Val Lys Glu Leu Leu Gly

1315 1320 1325

Ile Thr Ile Met Glu Arg Ser Ser Phe Glu Lys Asn Pro Ile Asp Phe

1330 1335 1340

Leu Glu Ala Lys Gly Tyr Lys Glu Val Lys Lys Asp Leu Ile Ile Lys

1345 1350 1355 1360

Leu Pro Lys Tyr Ser Leu Phe Glu Leu Glu Asn Gly Arg Lys Arg Met

1365 1370 1375

Leu Ala Ser Ala Gly Glu Leu Gln Lys Gly Asn Glu Leu Ala Leu Pro

1380 1385 1390

Ser Lys Tyr Val Asn Phe Leu Tyr Leu Ala Ser His Tyr Glu Lys Leu

1395 1400 1405

Lys Gly Ser Pro Glu Asp Asn Glu Gln Lys Gln Leu Phe Val Glu Gln

1410 1415 1420

His Lys His Tyr Leu Asp Glu Ile Ile Glu Gln Ile Ser Glu Phe Ser

1425 1430 1435 1440

Lys Arg Val Ile Leu Ala Asp Ala Asn Leu Asp Lys Val Leu Ser Ala

1445 1450 1455

Tyr Asn Lys His Arg Asp Lys Pro Ile Arg Glu Gln Ala Glu Asn Ile

1460 1465 1470

Ile His Leu Phe Thr Leu Thr Asn Leu Gly Ala Pro Ala Ala Phe Lys

1475 1480 1485

Tyr Phe Asp Thr Thr Ile Asp Arg Lys Arg Tyr Thr Ser Thr Lys Glu

1490 1495 1500

Val Leu Asp Ala Thr Leu Ile His Gln Ser Ile Thr Gly Leu Tyr Glu

1505 1510 1515 1520

Thr Arg Ile Asp Leu Ser Gln Leu Gly Gly Asp Lys Arg Pro Ala Ala

1525 1530 1535

Thr Lys Lys Ala Gly Gln Ala Lys Lys Lys Lys

1540 1545

<210> 3

<211> 1399

<212> PRT

<213> Artificial Sequence (Artificial Sequence)

<400> 3

Met Pro Lys Lys Lys Arg Lys Val Gly Ile His Gly Val Pro Ala Ala

1 5 10 15

Asp Lys Lys Tyr Ser Ile Gly Leu Asp Ile Gly Thr Asn Ser Val Gly

20 25 30

Trp Ala Val Ile Thr Asp Glu Tyr Lys Val Pro Ser Lys Lys Phe Lys

35 40 45

Val Leu Gly Asn Thr Asp Arg His Ser Ile Lys Lys Asn Leu Ile Gly

50 55 60

Ala Leu Leu Phe Asp Ser Gly Glu Thr Ala Glu Ala Thr Arg Leu Lys

65 70 75 80

Arg Thr Ala Arg Arg Arg Tyr Thr Arg Arg Lys Asn Arg Ile Cys Tyr

85 90 95

Leu Gln Glu Ile Phe Ser Asn Glu Met Ala Lys Val Asp Asp Ser Phe

100 105 110

Phe His Arg Leu Glu Glu Ser Phe Leu Val Glu Glu Asp Lys Lys His

115 120 125

Glu Arg His Pro Ile Phe Gly Asn Ile Val Asp Glu Val Ala Tyr His

130 135 140

Glu Lys Tyr Pro Thr Ile Tyr His Leu Arg Lys Lys Leu Val Asp Ser

145 150 155 160

Thr Asp Lys Ala Asp Leu Arg Leu Ile Tyr Leu Ala Leu Ala His Met

165 170 175

Ile Lys Phe Arg Gly His Phe Leu Ile Glu Gly Asp Leu Asn Pro Asp

180 185 190

Asn Ser Asp Val Asp Lys Leu Phe Ile Gln Leu Val Gln Thr Tyr Asn

195 200 205

Gln Leu Phe Glu Glu Asn Pro Ile Asn Ala Ser Gly Val Asp Ala Lys

210 215 220

Ala Ile Leu Ser Ala Arg Leu Ser Lys Ser Arg Arg Leu Glu Asn Leu

225 230 235 240

Ile Ala Gln Leu Pro Gly Glu Lys Lys Asn Gly Leu Phe Gly Asn Leu

245 250 255

Ile Ala Leu Ser Leu Gly Leu Thr Pro Asn Phe Lys Ser Asn Phe Asp

260 265 270

Leu Ala Glu Asp Ala Lys Leu Gln Leu Ser Lys Asp Thr Tyr Asp Asp

275 280 285

Asp Leu Asp Asn Leu Leu Ala Gln Ile Gly Asp Gln Tyr Ala Asp Leu

290 295 300

Phe Leu Ala Ala Lys Asn Leu Ser Asp Ala Ile Leu Leu Ser Asp Ile

305 310 315 320

Leu Arg Val Asn Thr Glu Ile Thr Lys Ala Pro Leu Ser Ala Ser Met

325 330 335

Ile Lys Arg Tyr Asp Glu His His Gln Asp Leu Thr Leu Leu Lys Ala

340 345 350

Leu Val Arg Gln Gln Leu Pro Glu Lys Tyr Lys Glu Ile Phe Phe Asp

355 360 365

Gln Ser Lys Asn Gly Tyr Ala Gly Tyr Ile Asp Gly Gly Ala Ser Gln

370 375 380

Glu Glu Phe Tyr Lys Phe Ile Lys Pro Ile Leu Glu Lys Met Asp Gly

385 390 395 400

Thr Glu Glu Leu Leu Val Lys Leu Asn Arg Glu Asp Leu Leu Arg Lys

405 410 415

Gln Arg Thr Phe Asp Asn Gly Ser Ile Pro His Gln Ile His Leu Gly

420 425 430

Glu Leu His Ala Ile Leu Arg Arg Gln Glu Asp Phe Tyr Pro Phe Leu

435 440 445

Lys Asp Asn Arg Glu Lys Ile Glu Lys Ile Leu Thr Phe Arg Ile Pro

450 455 460

Tyr Tyr Val Gly Pro Leu Ala Arg Gly Asn Ser Arg Phe Ala Trp Met

465 470 475 480

Thr Arg Lys Ser Glu Glu Thr Ile Thr Pro Trp Asn Phe Glu Glu Val

485 490 495

Val Asp Lys Gly Ala Ser Ala Gln Ser Phe Ile Glu Arg Met Thr Asn

500 505 510

Phe Asp Lys Asn Leu Pro Asn Glu Lys Val Leu Pro Lys His Ser Leu

515 520 525

Leu Tyr Glu Tyr Phe Thr Val Tyr Asn Glu Leu Thr Lys Val Lys Tyr

530 535 540

Val Thr Glu Gly Met Arg Lys Pro Ala Phe Leu Ser Gly Glu Gln Lys

545 550 555 560

Lys Ala Ile Val Asp Leu Leu Phe Lys Thr Asn Arg Lys Val Thr Val

565 570 575

Lys Gln Leu Lys Glu Asp Tyr Phe Lys Lys Ile Glu Cys Phe Asp Ser

580 585 590

Val Glu Ile Ser Gly Val Glu Asp Arg Phe Asn Ala Ser Leu Gly Thr

595 600 605

Tyr His Asp Leu Leu Lys Ile Ile Lys Asp Lys Asp Phe Leu Asp Asn

610 615 620

Glu Glu Asn Glu Asp Ile Leu Glu Asp Ile Val Leu Thr Leu Thr Leu

625 630 635 640

Phe Glu Asp Arg Glu Met Ile Glu Glu Arg Leu Lys Thr Tyr Ala His

645 650 655

Leu Phe Asp Asp Lys Val Met Lys Gln Leu Lys Arg Arg Arg Tyr Thr

660 665 670

Gly Trp Gly Arg Leu Ser Arg Lys Leu Ile Asn Gly Ile Arg Asp Lys

675 680 685

Gln Ser Gly Lys Thr Ile Leu Asp Phe Leu Lys Ser Asp Gly Phe Ala

690 695 700

Asn Arg Asn Phe Met Gln Leu Ile His Asp Asp Ser Leu Thr Phe Lys

705 710 715 720

Glu Asp Ile Gln Lys Ala Gln Val Ser Gly Gln Gly Asp Ser Leu His

725 730 735

Glu His Ile Ala Asn Leu Ala Gly Ser Pro Ala Ile Lys Lys Gly Ile

740 745 750

Leu Gln Thr Val Lys Val Val Asp Glu Leu Val Lys Val Met Gly Arg

755 760 765

His Lys Pro Glu Asn Ile Val Ile Glu Met Ala Arg Glu Asn Gln Thr

770 775 780

Thr Gln Lys Gly Gln Lys Asn Ser Arg Glu Arg Met Lys Arg Ile Glu

785 790 795 800

Glu Gly Ile Lys Glu Leu Gly Ser Gln Ile Leu Lys Glu His Pro Val

805 810 815

Glu Asn Thr Gln Leu Gln Asn Glu Lys Leu Tyr Leu Tyr Tyr Leu Gln

820 825 830

Asn Gly Arg Asp Met Tyr Val Asp Gln Glu Leu Asp Ile Asn Arg Leu

835 840 845

Ser Asp Tyr Asp Val Asp His Ile Val Pro Gln Ser Phe Leu Lys Asp

850 855 860

Asp Ser Ile Asp Asn Lys Val Leu Thr Arg Ser Asp Lys Asn Arg Gly

865 870 875 880

Lys Ser Asp Asn Val Pro Ser Glu Glu Val Val Lys Lys Met Lys Asn

885 890 895

Tyr Trp Arg Gln Leu Leu Asn Ala Lys Leu Ile Thr Gln Arg Lys Phe

900 905 910

Asp Asn Leu Thr Lys Ala Glu Arg Gly Gly Leu Ser Glu Leu Asp Lys

915 920 925

Ala Gly Phe Ile Lys Arg Gln Leu Val Glu Thr Arg Gln Ile Thr Lys

930 935 940

His Val Ala Gln Ile Leu Asp Ser Arg Met Asn Thr Lys Tyr Asp Glu

945 950 955 960

Asn Asp Lys Leu Ile Arg Glu Val Lys Val Ile Thr Leu Lys Ser Lys

965 970 975

Leu Val Ser Asp Phe Arg Lys Asp Phe Gln Phe Tyr Lys Val Arg Glu

980 985 990

Ile Asn Asn Tyr His His Ala His Asp Ala Tyr Leu Asn Ala Val Val

995 1000 1005

Gly Thr Ala Leu Ile Lys Lys Tyr Pro Lys Leu Glu Ser Glu Phe Val

1010 1015 1020

Tyr Gly Asp Tyr Lys Val Tyr Asp Val Arg Lys Met Ile Ala Lys Ser

1025 1030 1035 1040

Glu Gln Glu Ile Gly Lys Ala Thr Ala Lys Tyr Phe Phe Tyr Ser Asn

1045 1050 1055

Ile Met Asn Phe Phe Lys Thr Glu Ile Thr Leu Ala Asn Gly Glu Ile

1060 1065 1070

Arg Lys Arg Pro Leu Ile Glu Thr Asn Gly Glu Thr Gly Glu Ile Val

1075 1080 1085

Trp Asp Lys Gly Arg Asp Phe Ala Thr Val Arg Lys Val Leu Ser Met

1090 1095 1100

Pro Gln Val Asn Ile Val Lys Lys Thr Glu Val Gln Thr Gly Gly Phe

1105 1110 1115 1120

Ser Lys Glu Ser Ile Leu Pro Lys Arg Asn Ser Asp Lys Leu Ile Ala

1125 1130 1135

Arg Lys Lys Asp Trp Asp Pro Lys Lys Tyr Gly Gly Phe Asp Ser Pro

1140 1145 1150

Thr Val Ala Tyr Ser Val Leu Val Val Ala Lys Val Glu Lys Gly Lys

1155 1160 1165

Ser Lys Lys Leu Lys Ser Val Lys Glu Leu Leu Gly Ile Thr Ile Met

1170 1175 1180

Glu Arg Ser Ser Phe Glu Lys Asn Pro Ile Asp Phe Leu Glu Ala Lys

1185 1190 1195 1200

Gly Tyr Lys Glu Val Lys Lys Asp Leu Ile Ile Lys Leu Pro Lys Tyr

1205 1210 1215

Ser Leu Phe Glu Leu Glu Asn Gly Arg Lys Arg Met Leu Ala Ser Ala

1220 1225 1230

Gly Glu Leu Gln Lys Gly Asn Glu Leu Ala Leu Pro Ser Lys Tyr Val

1235 1240 1245

Asn Phe Leu Tyr Leu Ala Ser His Tyr Glu Lys Leu Lys Gly Ser Pro

1250 1255 1260

Glu Asp Asn Glu Gln Lys Gln Leu Phe Val Glu Gln His Lys His Tyr

1265 1270 1275 1280

Leu Asp Glu Ile Ile Glu Gln Ile Ser Glu Phe Ser Lys Arg Val Ile

1285 1290 1295

Leu Ala Asp Ala Asn Leu Asp Lys Val Leu Ser Ala Tyr Asn Lys His

1300 1305 1310

Arg Asp Lys Pro Ile Arg Glu Gln Ala Glu Asn Ile Ile His Leu Phe

1315 1320 1325

Thr Leu Thr Asn Leu Gly Ala Pro Ala Ala Phe Lys Tyr Phe Asp Thr

1330 1335 1340

Thr Ile Asp Arg Lys Arg Tyr Thr Ser Thr Lys Glu Val Leu Asp Ala

1345 1350 1355 1360

Thr Leu Ile His Gln Ser Ile Thr Gly Leu Tyr Glu Thr Arg Ile Asp

1365 1370 1375

Leu Ser Gln Leu Gly Gly Asp Lys Arg Pro Ala Ala Thr Lys Lys Ala

1380 1385 1390

Gly Gln Ala Lys Lys Lys Lys

1395

<210> 4

<211> 225

<212> DNA

<213> Artificial Sequence (Artificial Sequence)

<400> 4

ggcttgtcgg actcttcgct attacgccag ctggcgaagg gggatgtgct gcaaggcgat 60

taagttgggt aacgccaggg ttttcccagt cacgacgtta ggaaattaat acgactcact 120

ataggagagc acagtcagcc tggcggtttt agagctagaa atagcaagtt aaaataaggc 180

tagtccgtta tcaacttgaa aaagtggcac cgagtcggtg ctttt 225

<210> 5

<211> 225

<212> DNA

<213> Artificial Sequence (Artificial Sequence)

<400> 5

ggcttgtcgg actcttcgct attacgccag ctggcgaagg gggatgtgct gcaaggcgat 60

taagttgggt aacgccaggg ttttcccagt cacgacgtta ggaaattaat acgactcact 120

ataggcttcc agaattggat ctccggtttt agagctagaa atagcaagtt aaaataaggc 180

tagtccgtta tcaacttgaa aaagtggcac cgagtcggtg ctttt 225

<210> 6

<211> 102

<212> RNA

<213> Artificial Sequence (Artificial Sequence)

<400> 6

ggagagcaca gucagccugg cgguuuuaga gcuagaaaua gcaaguuaaa auaaggcuag 60

uccguuauca acuugaaaaa guggcaccga gucggugcuu uu 102

<210> 7

<211> 102

<212> RNA

<213> Artificial Sequence (Artificial Sequence)

<400> 7

ggcuuccaga auuggaucuc cgguuuuaga gcuagaaaua gcaaguuaaa auaaggcuag 60

uccguuauca acuugaaaaa guggcaccga gucggugcuu uu 102

<210> 8

<211> 631

<212> PRT

<213> Sus scrofa

<400> 8

Met Val Ser Lys Leu Ser Gln Leu Gln Thr Glu Leu Leu Ala Ala Leu

1 5 10 15

Leu Glu Ser Gly Leu Ser Lys Glu Ala Leu Ile Gln Ala Leu Gly Glu

20 25 30

Pro Gly Pro Tyr Leu Leu Ala Gly Asp Gly Ala Leu Asp Lys Gly Glu

35 40 45

Ser Cys Gly Gly Ala Arg Gly Glu Leu Ala Glu Leu Pro Asn Gly Leu

50 55 60

Gly Glu Thr Arg Gly Ser Glu Asp Glu Thr Asp Asp Asp Gly Glu Asp

65 70 75 80

Phe Thr Pro Pro Ile Leu Lys Glu Leu Glu Asn Leu Ser Pro Glu Glu

85 90 95

Ala Ala His Gln Lys Ala Val Val Glu Thr Leu Leu Gln Glu Asp Pro

100 105 110

Trp Arg Val Ala Lys Met Val Lys Ser Tyr Leu Gln Gln His Asn Ile

115 120 125

Pro Gln Arg Glu Val Val Asp Thr Thr Gly Leu Asn Gln Ser His Leu

130 135 140

Ser Gln His Leu Asn Lys Gly Thr Pro Met Lys Thr Gln Lys Arg Ala

145 150 155 160

Ala Leu Tyr Thr Trp Tyr Val Arg Lys Gln Arg Glu Val Ala Gln Gln

165 170 175

Phe Thr His Ala Gly Gln Gly Gly Leu Ile Glu Glu Pro Thr Gly Asp

180 185 190

Glu Leu Pro Thr Lys Lys Gly Arg Arg Asn Arg Phe Lys Trp Gly Pro

195 200 205

Ala Ser Gln Gln Ile Leu Phe Gln Ala Tyr Glu Arg Gln Lys Asn Pro

210 215 220

Ser Lys Glu Glu Arg Glu Ala Leu Val Glu Glu Cys Asn Arg Ala Glu

225 230 235 240

Cys Ile Gln Arg Gly Val Ser Pro Ser Gln Ala Gln Gly Leu Gly Ser

245 250 255

Asn Leu Val Thr Glu Val Arg Val Tyr Asn Trp Phe Ala Asn Arg Arg

260 265 270

Lys Glu Glu Ala Phe Arg His Lys Leu Ala Met Asp Thr Tyr Ser Gly

275 280 285

Pro Pro Pro Gly Pro Gly Pro Gly Pro Ala Leu Pro Ala His Ser Ser

290 295 300

Pro Gly Leu Pro Pro Thr Ala Leu Ser Pro Ser Lys Val His Gly Val

305 310 315 320

Arg Tyr Gly Gln Ser Ala Thr Ser Glu Gly Ala Glu Val Pro Ser Ser

325 330 335

Ser Gly Gly Pro Leu Val Thr Val Ser Ala Pro Leu His Gln Val Ser

340 345 350

Pro Thr Gly Leu Glu Pro Ser His Ser Leu Leu Ser Thr Glu Ala Lys

355 360 365

Leu Val Ser Ala Thr Gly Gly Pro Leu Pro Pro Val Ser Thr Leu Thr

370 375 380

Ala Leu His Ser Leu Glu Gln Thr Ser Pro Gly Leu Asn Gln Gln Pro

385 390 395 400

Gln Asn Leu Ile Met Ala Ser Leu Pro Gly Val Met Ala Ile Gly Pro

405 410 415

Ser Glu Pro Ala Ser Leu Gly Pro Thr Phe Thr Asn Thr Gly Ala Ser

420 425 430

Thr Leu Val Ile Gly Leu Ala Ser Thr Gln Ala Gln Ser Val Pro Val

435 440 445

Ile Asn Ser Met Gly Ser Ser Leu Thr Thr Leu Gln Pro Val Gln Phe

450 455 460

Ser Gln Pro Leu His Pro Ser Tyr Gln Gln Pro Leu Met Pro Ser Val

465 470 475 480

Gln Ser His Val Ala Gln Ser Pro Phe Met Ala Thr Met Ala Gln Leu

485 490 495

Gln Ser Pro His Ala Leu Tyr Ser His Lys Pro Glu Val Ala Gln Tyr

500 505 510

Thr His Thr Gly Leu Leu Pro Gln Thr Met Leu Ile Thr Asp Thr Thr

515 520 525

Asn Leu Ser Ala Leu Ala Ser Leu Thr Pro Thr Lys Gln Val Phe Thr

530 535 540

Ser Asp Thr Glu Ala Ser Ser Glu Ser Gly Leu His Thr Pro Ala Ser

545 550 555 560

Gln Ala Thr Thr Ile His Ile Pro Ser Gln Asp Pro Ala Gly Ile Gln

565 570 575

His Leu Gln Pro Ala His Arg Leu Ser Ala Ser Pro Thr Val Ser Ser

580 585 590

Ser Ser Leu Val Leu Tyr Gln Ser Ser Asp Ser Thr Asn Gly His Ser

595 600 605

His Leu Leu Pro Ser Asn His Ser Val Ile Glu Thr Phe Ile Ser Thr

610 615 620

Gln Met Ala Ser Ser Ser Gln

625 630

<210> 9

<211> 1042

<212> DNA

<213> Sus scrofa

<400> 9

aaggctgggg aaggggagag gggctttggg tgctgaggga ggctccccag gttttgaaag 60

ctcctgctgt tggcccagga gttctcagct cctgggctga gtgtctgaaa cccagctcca 120

tttctggtgc ccccccaccc cactgaccca aacaaccttt gagtggctgc tcgactccct 180

catcctcact acaaccctat gtttattgtg cccactccct gaagagacta agagaggcta 240

agtcacttgc tcaaggtcac acagcagact gagattgaaa ctgagtctgc caacctcaaa 300

cactcaggta gatctctcat tctcagaacc ctccccccac ctccaaggag agggttcttc 360

tgtgcctggc ctggaggctc acaagtggcc attcctgcag ggcggagtgc atccagaggg 420

gggtgtcacc atcacaggca caggggctgg gctccaacct cgtcacggag gtgcgcgtct 480

acaactggtt tgccaatcgg cgcaaggaag aagcatttcg gcacaagttg gccatggaca 540

cgtacagtgg gccaccaccg gggccaggtc cgggccctgc actgcctgcc cacagctctc 600

ctggtctgcc cccaaccgcc ctctccccca gtaaggtcca cggtgagtgc catgtgggca 660

gggggactgg acagtggtta gagggactct gagggtaggt gggagagttg gggagcacca 720

cctcattggc agcagccacc cacgcctcct ggctttctcc actccattga tcatcagctc 780

tacccattcc atattcactc caactctttt tttttttttt ttttggtctt tttaggacca 840

cacatgcagc atgtggaagt tcccaggcta ggggtctaat gggagctgta gccgccagcc 900

tatgccacag ccacaacaac accagatcag agcctcatct gtgacctaca tcacagccca 960

cagcaatgct ggattcttaa cccactgaga gaggccaggg atcaaacctg cgtcctcatg 1020

gatactaata agatttgtta tc 1042

<210> 10

<211> 100

<212> RNA

<213> Artificial Sequence (Artificial Sequence)

<400> 10

agaagcauuu cggcacaagu guuuuagagc uagaaauagc aaguuaaaau aaggcuaguc 60

cguuaucaac uugaaaaagu ggcaccgagu cggugcuuuu 100

<210> 11

<211> 100

<212> RNA

<213> Artificial Sequence (Artificial Sequence)

<400> 11

auuucggcac aaguuggcca guuuuagagc uagaaauagc aaguuaaaau aaggcuaguc 60

cguuaucaac uugaaaaagu ggcaccgagu cggugcuuuu 100

<210> 12

<211> 100

<212> RNA

<213> Artificial Sequence (Artificial Sequence)

<400> 12

gggcagacca ggagagcugu guuuuagagc uagaaauagc aaguuaaaau aaggcuaguc 60

cguuaucaac uugaaaaagu ggcaccgagu cggugcuuuu 100

<210> 13

<211> 100

<212> RNA

<213> Artificial Sequence (Artificial Sequence)

<400> 13

ggggcagacc aggagagcug guuuuagagc uagaaauagc aaguuaaaau aaggcuaguc 60

cguuaucaac uugaaaaagu ggcaccgagu cggugcuuuu 100

<210> 14

<211> 225

<212> DNA

<213> Artificial Sequence (Artificial Sequence)

<400> 14

ggcttgtcgg actcttcgct attacgccag ctggcgaagg gggatgtgct gcaaggcgat 60

taagttgggt aacgccaggg ttttcccagt cacgacgtta ggaaattaat acgactcact 120

ataggatttc ggcacaagtt ggccagtttt agagctagaa atagcaagtt aaaataaggc 180

tagtccgtta tcaacttgaa aaagtggcac cgagtcggtg ctttt 225

<210> 15

<211> 225

<212> DNA

<213> Artificial Sequence (Artificial Sequence)

<400> 15

ggcttgtcgg actcttcgct attacgccag ctggcgaagg gggatgtgct gcaaggcgat 60

taagttgggt aacgccaggg ttttcccagt cacgacgtta ggaaattaat acgactcact 120

atagggggca gaccaggaga gctgtgtttt agagctagaa atagcaagtt aaaataaggc 180

tagtccgtta tcaacttgaa aaagtggcac cgagtcggtg ctttt 225

<210> 16

<211> 102

<212> RNA

<213> Artificial Sequence (Artificial Sequence)

<400> 16

ggauuucggc acaaguuggc caguuuuaga gcuagaaaua gcaaguuaaa auaaggcuag 60

uccguuauca acuugaaaaa guggcaccga gucggugcuu uu 102

<210> 17

<211> 102

<212> RNA

<213> Artificial Sequence (Artificial Sequence)

<400> 17

gggggcagac caggagagcu guguuuuaga gcuagaaaua gcaaguuaaa auaaggcuag 60

uccguuauca acuugaaaaa guggcaccga gucggugcuu uu 102

<210> 18

<211> 163

<212> DNA

<213> Artificial Sequence (Artificial Sequence)

<400> 18

tctacaactg gtttgccaat cggcgcaagg aagaagcatt tcggcacaag ctagcaatgg 60

acacgtacag tgggccacca cccggggcca ggtccgggcc ctgcactgcc tgtccacagc 120

tctcctggtc tgcccccaac cgccctctcc cccagtaagg tcc 163

<210> 19

<211> 162

<212> DNA

<213> Sus scrofa

<400> 19

tctacaactg gtttgccaat cggcgcaagg aagaagcatt tcggcacaag ttggccatgg 60

acacgtacag tgggccacca ccggggccag gtccgggccc tgcactgcct gcccacagct 120

ctcctggtct gcccccaacc gccctctccc ccagtaaggt cc 162
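SEQ ID NO:18 (163 nt, the donor) is one nucleotide longer than SEQ ID NO:19 (162 nt, the pig genomic segment). The sketch below lists where the two differ, using Python's standard `difflib`. It is an illustrative comparison only: the helper name is hypothetical, and the way `difflib` groups neighbouring changes is a property of that library, not of the patent.

```python
from difflib import SequenceMatcher

SEQ_ID_19 = (  # pig genomic segment, 162 nt
    "tctacaactg gtttgccaat cggcgcaagg aagaagcatt tcggcacaag ttggccatgg "
    "acacgtacag tgggccacca ccggggccag gtccgggccc tgcactgcct gcccacagct "
    "ctcctggtct gcccccaacc gccctctccc ccagtaaggt cc"
).replace(" ", "")

SEQ_ID_18 = (  # single-stranded donor, 163 nt
    "tctacaactg gtttgccaat cggcgcaagg aagaagcatt tcggcacaag ctagcaatgg "
    "acacgtacag tgggccacca cccggggcca ggtccgggcc ctgcactgcc tgtccacagc "
    "tctcctggtc tgcccccaac cgccctctcc cccagtaagg tcc"
).replace(" ", "")


def list_edits(ref: str, alt: str):
    """Return (operation, 1-based position in ref, ref bases, alt bases) per difference."""
    matcher = SequenceMatcher(None, ref, alt, autojunk=False)
    return [
        (tag, i1 + 1, ref[i1:i2], alt[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


for edit in list_edits(SEQ_ID_19, SEQ_ID_18):
    print(edit)
```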

Claims

1. A method of preparing a recombinant cell, comprising the step of: replacing the DNA molecule shown in SEQ ID NO:19 with the nucleotide sequence shown in SEQ ID NO:18 to obtain the recombinant cell.
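The effect of the substitution in claim 1 on the encoded protein can be illustrated by translating both sequences. The sketch below assumes a reading frame that starts at the third nucleotide of each sequence. That frame is an inference, chosen because it reproduces the "Tyr Asn Trp Phe Ala Asn Arg Arg" stretch of SEQ ID NO:8, and it is not stated in the claims. The codon table is the standard genetic code, and the function name is hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in t, c, a, g order.
BASES = "tcag"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

SEQ_ID_19 = (  # pig genomic segment
    "tctacaactg gtttgccaat cggcgcaagg aagaagcatt tcggcacaag ttggccatgg "
    "acacgtacag tgggccacca ccggggccag gtccgggccc tgcactgcct gcccacagct "
    "ctcctggtct gcccccaacc gccctctccc ccagtaaggt cc"
).replace(" ", "")

SEQ_ID_18 = (  # donor sequence
    "tctacaactg gtttgccaat cggcgcaagg aagaagcatt tcggcacaag ctagcaatgg "
    "acacgtacag tgggccacca cccggggcca ggtccgggcc ctgcactgcc tgtccacagc "
    "tctcctggtc tgcccccaac cgccctctcc cccagtaagg tcc"
).replace(" ", "")

FRAME_OFFSET = 2  # assumption: the coding frame starts at the third nucleotide


def translate(dna: str, offset: int = FRAME_OFFSET) -> str:
    """Translate complete codons from `offset` onward; '*' marks a stop codon."""
    return "".join(
        CODON_TABLE[dna[i:i + 3]] for i in range(offset, len(dna) - 2, 3)
    )


wild_type = translate(SEQ_ID_19)
mutant = translate(SEQ_ID_18)
print(wild_type)
print(mutant)
```

Under this assumed frame, the two translations share their N-terminal residues and then diverge, and only the donor-derived translation contains a stop codon, which is consistent with a frameshift.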

2. The method of claim 1, wherein: the replacing of the DNA molecule shown in SEQ ID NO:19 with the nucleotide sequence shown in SEQ ID NO:18 is carried out as follows: co-transfecting a pig cell with HNF1A-gU2, HNF1A-gD1, HNF1A-mutant-ss163 and NCN protein; the HNF1A-gU2 is an sgRNA whose target sequence binding region is shown in nucleotides 3 to 22 of SEQ ID NO:16; the HNF1A-gD1 is an sgRNA whose target sequence binding region is shown in nucleotides 3 to 22 of SEQ ID NO:17; the HNF1A-mutant-ss163 is a single-stranded DNA molecule shown in SEQ ID NO:18; the NCN protein is a Cas9 protein or a fusion protein comprising a Cas9 protein.
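Claim 2 defines each target sequence binding region as nucleotides 3 to 22 of the corresponding sgRNA. The sketch below extracts those 20 nucleotides from the first 30 nt of SEQ ID NO:16 and SEQ ID NO:17 and looks for them on either strand of SEQ ID NO:19. It also reports the three bases 3' of each site, on the assumption that the nuclease reads an NGG-type PAM there. Function names and the return format are hypothetical.

```python
SEQ_ID_19 = (
    "tctacaactg gtttgccaat cggcgcaagg aagaagcatt tcggcacaag ttggccatgg "
    "acacgtacag tgggccacca ccggggccag gtccgggccc tgcactgcct gcccacagct "
    "ctcctggtct gcccccaacc gccctctccc ccagtaaggt cc"
).replace(" ", "")

# First 30 nt of each sgRNA as listed; the rest is scaffold and is not needed here.
SGRNA_5PRIME = {
    "HNF1A-gU2": "ggauuucggcacaaguuggccaguuuuaga",  # from SEQ ID NO:16
    "HNF1A-gD1": "gggggcagaccaggagagcuguguuuuaga",  # from SEQ ID NO:17
}


def revcomp(dna: str) -> str:
    """Reverse complement of a lowercase DNA string."""
    return dna.translate(str.maketrans("acgt", "tgca"))[::-1]


def locate(sgrna: str, genome: str):
    """Return (strand, 1-based start on the given strand of genome, 3' flank) or None."""
    spacer = sgrna[2:22].replace("u", "t")  # nucleotides 3 to 22, as DNA
    pos = genome.find(spacer)
    if pos != -1:
        return "+", pos + 1, genome[pos + 20:pos + 23]
    pos = genome.find(revcomp(spacer))
    if pos != -1:
        # On the minus strand the 3' flank lies upstream on the given strand.
        return "-", pos + 1, revcomp(genome[max(pos - 3, 0):pos])
    return None


for name, rna in SGRNA_5PRIME.items():
    print(name, locate(rna, SEQ_ID_19))
```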

3. The method of claim 2, wherein: the NCN protein is shown in SEQ ID NO:3.

4. The method according to claim 2 or 3, characterized in that: the ratio of the pig cells, the HNF1A-gU2, the HNF1A-gD1, the HNF1A-mutant-ss163 and the NCN protein is, in order: 100,000 pig cells : 0.8-1.2 μg HNF1A-gU2 : 0.8-1.2 μg HNF1A-gD1 : 1.8-2.2 μg HNF1A-mutant-ss163 : 3-5 μg NCN protein.
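The ratio in claim 4 is stated per 100,000 cells. As a minimal sketch, the amounts for another cell count can be computed by scaling each range proportionally. Linear scaling is an assumption made here for illustration, since the claim gives only the ratio, and the names below are hypothetical.

```python
# Ranges in micrograms per 100,000 pig cells, taken from claim 4.
PER_100K_CELLS_UG = {
    "HNF1A-gU2": (0.8, 1.2),
    "HNF1A-gD1": (0.8, 1.2),
    "HNF1A-mutant-ss163": (1.8, 2.2),
    "NCN protein": (3.0, 5.0),
}


def scale_amounts(cell_count: int) -> dict:
    """Scale each (min, max) range linearly with cell count (an assumption)."""
    factor = cell_count / 100_000
    return {
        name: (round(low * factor, 3), round(high * factor, 3))
        for name, (low, high) in PER_100K_CELLS_UG.items()
    }


for name, (low, high) in scale_amounts(200_000).items():
    print(f"{name}: {low}-{high} ug")
```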

5. The method of claim 3, wherein:

the preparation method of the NCN protein comprises the following steps:

(1) Introducing the plasmid pKG-GE4 into Escherichia coli BL21(DE3) to obtain a recombinant strain;

(2) Culturing the recombinant strain in a liquid culture medium at 30 °C, then adding IPTG (isopropyl-β-D-thiogalactoside) and carrying out induced culture at 25 °C, and then collecting the cells;

(3) Lysing the collected cells and collecting the crude protein solution;

(4) Purifying the His₆-tagged fusion protein from the crude protein solution by affinity chromatography;

(5) Cleaving the His₆-tagged fusion protein with His₆-tagged enterokinase, and then removing the His₆-tagged proteins with Ni-NTA resin to obtain purified NCN protein;

the plasmid pKG-GE4 has the sequence shown in nucleotides 5209 to 9852 of SEQ ID NO:1.

6. A kit comprising HNF1A-gU2, HNF1A-gD1, HNF1A-mutant-ss163 and NCN protein;

HNF1A-gU2 is HNF1A-gU2 as described in claim 2; HNF1A-gD1 is HNF1A-gD1 as described in claim 2; HNF1A-mutant-ss163 is the HNF1A-mutant-ss163 as set forth in claim 2; the NCN protein is the NCN protein described in claim 2 or 3 or 5;

the kit is used for (a), (b) or (c): (a) preparing a recombinant cell; (b) preparing a diabetes model pig; (c) preparing a diabetes cell model, a diabetes tissue model or a diabetes organ model.

7. A kit comprising HNF1A-gU2, HNF1A-gD1, HNF1A-mutant-ss163 and PRONCN protein;

HNF1A-gU2 is HNF1A-gU2 described in claim 2; HNF1A-gD1 is HNF1A-gD1 as described in claim 2; HNF1A-mutant-ss163 is the HNF1A-mutant-ss163 as set forth in claim 2;

the PRONCN protein comprises, in order from upstream to downstream, the following elements: a signal peptide, a molecular chaperone protein, a protein tag, a protease cleavage site, a nuclear localization signal, a Cas9 protein and a nuclear localization signal;

the kit is used for (a), (b) or (c): (a) preparing a recombinant cell; (b) preparing a diabetes model pig; (c) preparing a diabetes cell model, a diabetes tissue model or a diabetes organ model.

8. A kit comprising HNF1A-gU2, HNF1A-gD1, HNF1A-mutant-ss163 and a specific plasmid;

HNF1A-gU2 is HNF1A-gU2 as described in claim 2; HNF1A-gD1 is HNF1A-gD1 as described in claim 2; HNF1A-mutant-ss163 is the HNF1A-mutant-ss163 as set forth in claim 2;

the specific plasmid comprises, in order from upstream to downstream, the following elements: a promoter, an operator, a ribosome binding site, a PRONCN protein coding gene and a terminator; the PRONCN protein comprises, in order from upstream to downstream, the following elements: a signal peptide, a molecular chaperone protein, a protein tag, a protease cleavage site, a nuclear localization signal, a Cas9 protein and a nuclear localization signal;

9. Use of HNF1A-gU2, HNF1A-gD1, HNF1A-mutant-ss163 and NCN protein in the preparation of a kit;

HNF1A-gU2 is HNF1A-gU2 as described in claim 2; HNF1A-gD1 is HNF1A-gD1 as described in claim 2; HNF1A-mutant-ss163 is the HNF1A-mutant-ss163 as set forth in claim 2; the NCN protein is the NCN protein described in claim 2 or 3 or 5;

10. Use of HNF1A-gU2, HNF1A-gD1, HNF1A-mutant-ss163 and PRONCN protein in the preparation of a kit;

11. Use of HNF1A-gU2, HNF1A-gD1, HNF1A-mutant-ss163 and a specific plasmid in the preparation of a kit;

the specific plasmid comprises, in order from upstream to downstream, the following elements: a promoter, an operator, a ribosome binding site, a PRONCN protein coding gene and a terminator; the PRONCN protein comprises, in order from upstream to downstream, the following elements: a signal peptide, a molecular chaperone protein, a protein tag, a protease cleavage site, a nuclear localization signal, a Cas9 protein and a nuclear localization signal;

12. A recombinant cell produced by the method of any one of claims 1 to 5.

13. Use of the recombinant cell of claim 12 in the preparation of a diabetes model pig.

14. Pig tissue, a pig organ or a pig cell of a diabetes model pig prepared from the recombinant cell of claim 12.

15. Use of the recombinant cell of claim 12, the pig tissue of claim 14, the pig organ of claim 14, the pig cell of claim 14, or a diabetes model pig prepared from the recombinant cell of claim 12, for (d1), (d2), (d3) or (d4):

(d1) screening drugs for treating diabetes;

(d2) evaluating the efficacy of diabetes drugs;

(d4) studying the pathogenesis of diabetes.