CN1193040C - New human protein with the function of inhibiting tumor cell growth and its encoding sequence - Google Patents

New human protein with the function of inhibiting tumor cell growth and its encoding sequence

Info

Publication number
CN1193040C
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CNB001157442A
Other languages
Chinese (zh)
Other versions
CN1324819A (en)
Inventor
顾健人
杨胜利
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Cancer Institute
Original Assignee
Shanghai Cancer Institute
Application filed by Shanghai Cancer Institute
Priority to CNB001157442A
Publication of CN1324819A
Application granted
Publication of CN1193040C
Anticipated expiration
Expired - Fee Related

Landscapes

  • Peptides Or Proteins (AREA)
  • Medicines That Contain Protein Lipid Enzymes And Other Medicines (AREA)

Abstract

The present invention discloses a novel human protein with a cancer-suppressing function, a polynucleotide encoding the polypeptide, and a method for preparing the polypeptide by recombinant technology. The invention also discloses a method of using the polypeptide to treat various diseases, such as cancers; an antagonist of the polypeptide and its therapeutic effect; and uses of the polynucleotide encoding the human protein with a cancer-suppressing function.

Description

Human protein with the function of inhibiting cancer cell growth, and its coding sequence
Technical field
The invention belongs to the field of biotechnology. Specifically, it relates to novel polynucleotides encoding human proteins with a cancer-suppressing function, and to the polypeptides encoded by these polynucleotides. The invention also relates to the uses and preparation of these polynucleotides and polypeptides.
Background art
Research on the human genome is currently an international focus. Apart from large-scale sequencing of human chromosomal DNA and expressed sequence tag (EST) sequencing, there is still no high-throughput method that screens for functional genes starting from function.
Cancer is one of the major diseases endangering human health. To treat and prevent tumors effectively, increasing attention is now being paid to gene therapy of tumors. There is therefore an urgent need in the art to develop and study human proteins with a cancer-suppressing function, together with their agonists and inhibitors.
Summary of the invention
An object of the present invention is to provide a class of novel human protein polypeptides with a cancer-suppressing function, as well as fragments, analogs and derivatives thereof.
Another object of the present invention is to provide polynucleotides encoding these polypeptides.
Another object of the present invention is to provide methods for producing these polypeptides, and uses of the polypeptides and their coding sequences.
In a first aspect of the present invention, a novel isolated protein polypeptide with a cancer-suppressing function is provided. It comprises a polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:2, SEQ ID NO:5, SEQ ID NO:8, SEQ ID NO:11, SEQ ID NO:14, SEQ ID NO:17 and SEQ ID NO:20; or a conservative variant polypeptide, active fragment or active derivative thereof.
Preferably, the polypeptide has an amino acid sequence selected from the group consisting of SEQ ID NO:2, SEQ ID NO:5, SEQ ID NO:8, SEQ ID NO:11, SEQ ID NO:14, SEQ ID NO:17 and SEQ ID NO:20.
In a second aspect of the present invention, an isolated polynucleotide is provided. It comprises a nucleotide sequence having at least 85% identity to a nucleotide sequence selected from the group consisting of: (a) a polynucleotide encoding the above protein polypeptide with a cancer-suppressing function; and (b) a polynucleotide complementary to polynucleotide (a). Preferably, the polypeptide encoded by the polynucleotide has an amino acid sequence selected from the group consisting of SEQ ID NO:2, SEQ ID NO:5, SEQ ID NO:8, SEQ ID NO:11, SEQ ID NO:14, SEQ ID NO:17 and SEQ ID NO:20. More preferably, the sequence of the polynucleotide is selected from the group consisting of the coding region sequence or the full-length sequence of SEQ ID NO:3, SEQ ID NO:6, SEQ ID NO:9, SEQ ID NO:12, SEQ ID NO:15, SEQ ID NO:18 and SEQ ID NO:21.
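The 85% identity criterion can be illustrated with a minimal sketch. It assumes the two sequences have already been aligned to the same length (the text does not specify an alignment method), counts a gap character as a mismatch, and uses the hypothetical function names `percent_identity` and `meets_identity_threshold`.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percentage of identical positions in two pre-aligned sequences.

    Assumption: both sequences are already aligned and of equal length;
    a gap character ('-') never counts as a match.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must be non-empty")
    matches = sum(
        1
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if a == b and a != "-"
    )
    return 100.0 * matches / len(seq_a)


def meets_identity_threshold(seq_a: str, seq_b: str, threshold: float = 85.0) -> bool:
    """True when the aligned sequences share at least `threshold` percent identity."""
    return percent_identity(seq_a, seq_b) >= threshold


if __name__ == "__main__":
    # Toy sequences for illustration only; they are not taken from the sequence listing.
    print(percent_identity("ATGGCCAAAT", "ATGGCCAAAA"))
    print(meets_identity_threshold("ATGGCCAAAT", "ATGAAAAAAT"))
```

In practice the alignment step would be done with a dedicated alignment tool, and the reported identity depends on how gaps and alignment length are treated.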
In a third aspect of the present invention, a vector containing the above polynucleotide is provided, together with a host cell transformed or transduced by the vector, or a host cell directly transformed or transduced by the above polynucleotide.
In a fourth aspect of the present invention, a method is provided for preparing a polypeptide having the activity of the protein with a cancer-suppressing function. The method comprises: (a) culturing the above transformed or transduced host cell under conditions suitable for expressing the protein with a cancer-suppressing function; and (b) isolating from the culture the polypeptide having the activity of the protein with a cancer-suppressing function.
In a fifth aspect of the present invention, an antibody that specifically binds the above protein polypeptide with a cancer-suppressing function is provided. A nucleic acid molecule that can be used for detection is also provided; it contains 10 to 800 contiguous nucleotides of the above polynucleotide.
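The requirement that a detection molecule contain 10 to 800 contiguous nucleotides of the polynucleotide can be sketched as a simple check. This is an illustration under simplifying assumptions: the candidate is treated as being exactly the contiguous stretch, matching is exact on one strand only, and the name `is_detection_fragment` is hypothetical.

```python
def is_detection_fragment(
    fragment: str,
    polynucleotide: str,
    min_len: int = 10,
    max_len: int = 800,
) -> bool:
    """Check that `fragment` is a contiguous stretch of `polynucleotide`
    whose length lies in the 10-800 nucleotide range stated in the text.

    Assumption: exact, same-strand matching; real probes may also target
    the complementary strand or tolerate mismatches.
    """
    fragment = fragment.upper()
    polynucleotide = polynucleotide.upper()
    if not (min_len <= len(fragment) <= max_len):
        return False
    return fragment in polynucleotide


if __name__ == "__main__":
    # Toy sequence for illustration only.
    target = "ATGGCTCTGAAAGCTCTGGCTAAATAA"
    print(is_detection_fragment("GCTCTGAAAGCT", target))
    print(is_detection_fragment("GCTCTG", target))
```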
In a sixth aspect of the present invention, a pharmaceutical composition is provided. It contains a safe and effective amount of the protein polypeptide with a cancer-suppressing function of the present invention and a pharmaceutically acceptable carrier. These pharmaceutical compositions can treat conditions such as cancer and abnormal cell proliferation.
Other aspects of the present invention will be apparent to those skilled in the art from the disclosure herein.
Detailed description of embodiments
In the present invention, cancer cells were transfected with cDNA clones on a large scale. Clones found to have a cancer-suppressing effect were sequenced and confirmed to be novel genes, and full-length cDNA clones were then obtained. DNA transfection experiments show that the proteins with a cancer-suppressing function of the present invention inhibit colony formation by cancer cells (liver cancer cells), with an inhibition rate of 50% or more.
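The text does not state how the inhibition rate is calculated. A commonly used definition, assumed here, compares colony counts after transfection with the test clone against an empty-vector control:

$$\text{inhibition rate} = \frac{N_\text{control} - N_\text{test}}{N_\text{control}} \times 100\%$$

A minimal sketch of that calculation follows; the function name and the example counts are hypothetical and are not data from this document.

```python
def colony_inhibition_rate(control_colonies: int, test_colonies: int) -> float:
    """Percent reduction in colony number relative to the control.

    Assumption: inhibition rate = (control - test) / control * 100,
    which the source text does not spell out.
    """
    if control_colonies <= 0:
        raise ValueError("control colony count must be positive")
    if test_colonies < 0:
        raise ValueError("test colony count cannot be negative")
    return 100.0 * (control_colonies - test_colonies) / control_colonies


if __name__ == "__main__":
    # Hypothetical counts for illustration only.
    rate = colony_inhibition_rate(control_colonies=200, test_colonies=80)
    print(rate, rate >= 50.0)
```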
As used herein, "isolated" means that a material has been separated from its original environment (for a natural substance, the original environment is the natural environment). For example, a polynucleotide or polypeptide in its native state in a living cell is not isolated and purified, but the same polynucleotide or polypeptide is isolated and purified once it has been separated from the other substances that accompany it in the native state.
As used herein, "isolated protein or polypeptide with a cancer-suppressing function" means that the protein polypeptide with a cancer-suppressing function is substantially free of the other proteins, lipids, carbohydrates or other substances with which it is naturally associated. Those skilled in the art can purify the protein with a cancer-suppressing function using standard protein purification techniques. A substantially pure polypeptide yields a single major band on a non-reducing polyacrylamide gel. The purity of the protein polypeptide with a cancer-suppressing function can be analyzed by amino acid sequence analysis.
The polypeptide of the present invention may be a recombinant polypeptide, a natural polypeptide or a synthetic polypeptide, preferably a recombinant polypeptide. It may be a naturally purified product, a chemically synthesized product, or a product produced by recombinant technology from a prokaryotic or eukaryotic host (for example, bacteria, yeast, higher plants, insect cells and mammalian cells). Depending on the host used in the recombinant production scheme, the polypeptide of the present invention may be glycosylated or non-glycosylated. The polypeptide of the present invention may or may not include an initial methionine residue.
The present invention also includes fragments, derivatives and analogs of the human protein with a cancer-suppressing function. As used herein, the terms "fragment", "derivative" and "analog" refer to a polypeptide that retains substantially the same biological function or activity as the native human protein with a cancer-suppressing function of the present invention. A polypeptide fragment, derivative or analog of the present invention may be (i) a polypeptide in which one or more amino acid residues are substituted by conservative or non-conservative amino acid residues (preferably conservative amino acid residues), where the substituted amino acid residue may or may not be one encoded by the genetic code; (ii) a polypeptide having a substituent group in one or more amino acid residues; (iii) a polypeptide formed by fusing the mature polypeptide with another compound (such as a compound that extends the half-life of the polypeptide, for example polyethylene glycol); or (iv) a polypeptide formed by fusing an additional amino acid sequence to this polypeptide sequence (such as a leader or secretory sequence, a sequence used to purify the polypeptide, or a proprotein sequence). In light of the teachings herein, these fragments, derivatives and analogs are within the knowledge of those skilled in the art.
The polynucleotide of the present invention may be in the form of DNA or RNA. DNA forms include cDNA, genomic DNA and synthetic DNA. The DNA may be single-stranded or double-stranded, and may be the coding strand or the non-coding strand. Taking the PP3895 protein as an example (in this application, proteins are named by their clone numbers), the coding region sequence encoding the mature polypeptide may be identical to the coding region sequence shown in SEQ ID NO:3 or may be a degenerate variant of it. As used herein, a "degenerate variant" means a nucleic acid sequence that encodes the protein of SEQ ID NO:2 but differs from the coding region sequence shown in SEQ ID NO:3. Likewise, taking the PP3993 protein as an example, the coding region sequence encoding the mature polypeptide may be identical to the coding region sequence shown in SEQ ID NO:6 or may be a degenerate variant of it; here a "degenerate variant" means a nucleic acid sequence that encodes the protein of SEQ ID NO:5 but differs from the coding region sequence shown in SEQ ID NO:6. The same applies by analogy to the other proteins with a cancer-suppressing function.
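The notion of a degenerate variant can be illustrated with a short sketch: two coding sequences that differ at the nucleotide level yet translate to the same amino acid sequence. The sketch uses the standard genetic code; the function names and the toy sequences are hypothetical and are not taken from the sequence listing.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order ('*' marks a stop codon).
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(_BASES, repeat=3), _AMINO)}


def translate(cds: str) -> str:
    """Translate a DNA coding sequence, stopping at the first stop codon."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    protein = []
    for i in range(0, len(cds), 3):
        amino_acid = CODON_TABLE[cds[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


def is_degenerate_variant(cds_a: str, cds_b: str) -> bool:
    """True when the sequences differ but encode the same protein."""
    return cds_a.upper() != cds_b.upper() and translate(cds_a) == translate(cds_b)


if __name__ == "__main__":
    reference = "ATGGCTCTGAAATAA"  # toy sequence, hypothetical
    variant = "ATGGCCTTAAAGTGA"    # synonymous codons throughout
    print(translate(reference), translate(variant))
    print(is_degenerate_variant(reference, variant))
```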
Polynucleotides encoding the mature polypeptide include: the coding sequence of the mature polypeptide alone; the coding sequence of the mature polypeptide plus various additional coding sequences; and the coding sequence of the mature polypeptide (with optional additional coding sequences) plus non-coding sequences.
The term "polynucleotide encoding a polypeptide" covers both a polynucleotide that includes only the sequence encoding the polypeptide and a polynucleotide that also includes additional coding and/or non-coding sequences.
The invention also relates to variants of the above polynucleotides, which encode polypeptides having the same amino acid sequence as the polypeptides of the invention, or fragments, analogs and derivatives of those polypeptides. Such polynucleotide variants may be naturally occurring allelic variants or non-naturally occurring variants. These nucleotide variants include substitution variants, deletion variants and insertion variants. As is known in the art, an allelic variant is an alternative form of a polynucleotide; it may involve the substitution, deletion or insertion of one or more nucleotides, but does not substantially alter the function of the polypeptide it encodes.
The invention also relates to polynucleotides that hybridize to the above sequences and have at least 50%, preferably at least 70%, more preferably at least 80% identity between the two sequences. In particular, the invention relates to polynucleotides that hybridize to the polynucleotides of the invention under stringent conditions. In the present invention, "stringent conditions" means: (1) hybridization and washing at low ionic strength and high temperature, such as 0.2×SSC, 0.1% SDS, 60°C; or (2) hybridization in the presence of a denaturing agent, such as 50% (v/v) formamide, 0.1% calf serum/0.1% Ficoll, 42°C; or (3) hybridization that occurs only when the identity between the two sequences is at least 95%, more preferably at least 97%. Moreover, the polypeptides encoded by the hybridizable polynucleotides have the same biological function and activity as the mature polypeptide shown in SEQ ID NO:2.
The invention also relates to nucleic acid fragments that hybridize to the above sequences. As used herein, a "nucleic acid fragment" is at least 15 nucleotides long, preferably at least 30 nucleotides, more preferably at least 50 nucleotides, and most preferably at least 100 nucleotides. Nucleic acid fragments can be used in nucleic acid amplification techniques (such as PCR) to identify and/or isolate polynucleotides encoding proteins with a cancer-suppressing function.
The polypeptides and polynucleotides of the present invention are preferably provided in isolated form, and more preferably are purified to homogeneity.
The DNA sequences of the present invention can be obtained by several methods. For example, DNA can be isolated by hybridization techniques well known in the art. These techniques include, but are not limited to: 1) hybridizing probes to genomic or cDNA libraries to detect homologous nucleotide sequences; and 2) antibody screening of expression libraries to detect cloned DNA fragments that share common structural features.
Specific DNA fragment sequences encoding a protein with a cancer-suppressing function can also be obtained by the following methods: 1) isolating double-stranded DNA sequences from genomic DNA; 2) chemically synthesizing the DNA sequence to obtain double-stranded DNA encoding the desired polypeptide.
Of the methods mentioned above, isolation of genomic DNA is the least commonly used. When the entire amino acid sequence of the desired polypeptide product is known, direct chemical synthesis of the DNA sequence is the method most often chosen. When the entire amino acid sequence is not known, direct chemical synthesis of the DNA sequence is not possible, and the method of choice is isolation of the cDNA sequence. The standard method for isolating a cDNA of interest is to isolate mRNA from donor cells that express the gene at a high level and reverse-transcribe it to form a plasmid or phage cDNA library. There are many well-established techniques for extracting mRNA, and kits are commercially available (Qiagen). Constructing cDNA libraries is also routine (Sambrook, et al., Molecular Cloning, A Laboratory Manual, Cold Spring Harbor Laboratory, New York, 1989). Commercially supplied cDNA libraries are also available, such as the various cDNA libraries from Clontech. When the polymerase chain reaction is used in combination, even very rare expression products can be cloned.
The genes of the present invention can be screened from these cDNA libraries by conventional methods. These methods include, but are not limited to: (1) DNA-DNA or DNA-RNA hybridization; (2) the appearance or loss of marker gene function; (3) measuring the level of transcripts of the protein with a cancer-suppressing function; (4) detecting the protein product of gene expression by immunological techniques or by assaying biological activity. These methods may be used alone or in combination.
In method (1), the probe used for hybridization is homologous to any part of the polynucleotide of the invention and is at least 15 nucleotides long, preferably at least 30 nucleotides, more preferably at least 50 nucleotides, and most preferably at least 100 nucleotides. In addition, the probe is usually no longer than 2 kb, preferably no longer than 1 kb. The probe used here is usually a DNA sequence chemically synthesized on the basis of the gene DNA sequence information of the invention. The gene of the invention itself, or a fragment of it, can of course also be used as a probe. DNA probes may be labeled with a radioisotope, fluorescein or an enzyme (such as alkaline phosphatase).
In method (4), the protein product of expression of the gene for the protein with a cancer-suppressing function can be detected using immunological techniques such as Western blotting, radioimmunoprecipitation and enzyme-linked immunosorbent assay (ELISA).
Amplifying DNA/RNA by PCR (Saiki, et al., Science 1985; 230:1350-1354) is preferred for obtaining the gene of the invention. In particular, when it is difficult to obtain a full-length cDNA from a library, the RACE method (rapid amplification of cDNA ends) is preferably used. The primers used for PCR can be selected appropriately from the sequence information of the invention disclosed herein and synthesized by conventional methods. The amplified DNA/RNA fragments can be separated and purified by conventional methods such as gel electrophoresis.
The nucleotide sequence of the gene of the invention obtained as described above, or of the various DNA fragments, can be determined by conventional methods such as the dideoxy chain termination method (Sanger et al., PNAS, 1977, 74:5463-5467). Commercial sequencing kits can also be used for this nucleotide sequencing. To obtain the full-length cDNA sequence, sequencing must be repeated. Sometimes the cDNA sequences of several clones must be determined before they can be assembled into the full-length cDNA sequence.
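Assembling a full-length sequence from several clones amounts to joining reads where they overlap. The following is a minimal sketch of that idea for two reads, assuming an exact, error-free overlap between the end of one read and the start of the next. The function name `merge_by_overlap` and the toy reads are hypothetical; real assembly software also handles sequencing errors and read orientation.

```python
def merge_by_overlap(left: str, right: str, min_overlap: int = 4) -> str:
    """Join two reads using the longest exact suffix/prefix overlap.

    Assumption: the end of `left` overlaps the start of `right` without
    mismatches, and the overlap is at least `min_overlap` bases long.
    """
    longest_possible = min(len(left), len(right))
    for k in range(longest_possible, min_overlap - 1, -1):
        if left.endswith(right[:k]):
            return left + right[k:]
    raise ValueError("no overlap of the required length was found")


if __name__ == "__main__":
    # Toy reads for illustration only.
    read_1 = "ATGGCTCTGA"
    read_2 = "CTGAAATAA"
    print(merge_by_overlap(read_1, read_2))
```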
The present invention also relates to vectors comprising the polynucleotides of the invention, to host cells genetically engineered with the vectors of the invention or with the coding sequence of the protein with a cancer-suppressing function, and to methods of producing the polypeptides of the invention by recombinant technology.
Using conventional recombinant DNA technology, the polynucleotide sequences of the invention can be used to express or produce recombinant protein polypeptides with a cancer-suppressing function (Science, 1984; 224:1431). In general, the steps are as follows:
(1). transforming or transducing a suitable host cell with a polynucleotide (or variant) of the invention encoding the human protein with a cancer-suppressing function, or with a recombinant expression vector containing that polynucleotide;
(2). culturing the host cell in a suitable medium;
(3). isolating and purifying the protein from the medium or the cells.
In the present invention, the polynucleotide sequence of the human protein with a cancer-suppressing function can be inserted into a recombinant expression vector. The term "recombinant expression vector" refers to bacterial plasmids, bacteriophages, yeast plasmids, plant cell viruses, mammalian cell viruses such as adenovirus and retrovirus, or other vectors well known in the art. Vectors suitable for use in the present invention include, but are not limited to: T7-based expression vectors for expression in bacteria (Rosenberg, et al., Gene, 1987, 56:125); the pMSXND expression vector for expression in mammalian cells (Lee and Nathans, J Bio Chem. 263:3521, 1988); and baculovirus-derived vectors for expression in insect cells. In short, any plasmid or vector can be used as long as it can replicate and remain stable in the host. An important feature of expression vectors is that they usually contain an origin of replication, a promoter, a marker gene and translation control elements.
Methods well known to those skilled in the art can be used to construct expression vectors containing a DNA sequence encoding the human protein with a cancer-suppressing function and appropriate transcription/translation control signals. These methods include in vitro recombinant DNA technology, DNA synthesis technology and in vivo recombination technology (Sambrook, et al., Molecular Cloning, A Laboratory Manual, Cold Spring Harbor Laboratory, New York, 1989). The DNA sequence can be operably linked to a suitable promoter in the expression vector to direct mRNA synthesis. Representative examples of such promoters are: the lac or trp promoter of E. coli; the PL promoter of lambda phage; and eukaryotic promoters including the CMV immediate early promoter, the HSV thymidine kinase promoter, the early and late SV40 promoters, the LTRs of retroviruses, and other promoters known to control gene expression in prokaryotic or eukaryotic cells or their viruses. The expression vector also includes a ribosome binding site for translation initiation and a transcription terminator.
In addition, the expression vector preferably contains one or more selectable marker genes to provide a phenotypic trait for selecting transformed host cells, such as dihydrofolate reductase, neomycin resistance and green fluorescent protein (GFP) for eukaryotic cell culture, or tetracycline or ampicillin resistance for E. coli.
A vector containing the appropriate DNA sequence described above together with a suitable promoter or control sequence can be used to transform a suitable host cell so that it can express the protein.
The host cell may be a prokaryotic cell, such as a bacterial cell; a lower eukaryotic cell, such as a yeast cell; or a higher eukaryotic cell, such as a mammalian cell. Representative examples are: Escherichia coli and Streptomyces; bacterial cells of Salmonella typhimurium; fungal cells such as yeast; plant cells; insect cells such as Drosophila S2 or Sf9; and animal cells such as CHO, COS or Bowes melanoma cells.
When the polynucleotide of the invention is expressed in higher eukaryotic cells, transcription is enhanced if an enhancer sequence is inserted into the vector. Enhancers are cis-acting elements of DNA, usually about 10 to 300 base pairs long, that act on a promoter to enhance transcription of a gene. Examples include the SV40 enhancer of 100 to 270 base pairs on the late side of the replication origin, the polyoma enhancer on the late side of the replication origin, and adenovirus enhancers.
Those of ordinary skill in the art know how to select suitable vectors, promoters, enhancers and host cells.
Transformation of a host cell with recombinant DNA can be carried out by conventional techniques well known to those skilled in the art. When the host is a prokaryote such as E. coli, competent cells capable of taking up DNA can be harvested after the exponential growth phase and treated by the CaCl2 method, using steps well known in the art. An alternative is to use MgCl2. If desired, transformation can also be carried out by electroporation. When the host is a eukaryote, the following DNA transfection methods may be used: calcium phosphate coprecipitation, conventional mechanical methods such as microinjection, electroporation, liposome packaging, and so on.
The transformants obtained can be cultured by conventional methods to express the polypeptide encoded by the gene of the invention. Depending on the host cell used, the culture medium can be selected from various conventional media. Culture is carried out under conditions suitable for growth of the host cell. After the host cells have grown to an appropriate cell density, the selected promoter is induced by a suitable method (such as a temperature shift or chemical induction), and the cells are cultured for a further period.
The recombinant polypeptide in the above method may be expressed intracellularly or on the cell membrane, or secreted outside the cell. If desired, the recombinant protein can be isolated and purified by various separation methods that exploit its physical, chemical and other properties. These methods are well known to those skilled in the art. Examples include, but are not limited to: conventional renaturation treatment, treatment with a protein precipitant (salting out), centrifugation, osmotic lysis of bacteria, sonication, ultracentrifugation, molecular sieve chromatography (gel filtration), adsorption chromatography, ion exchange chromatography, high performance liquid chromatography (HPLC) and various other liquid chromatography techniques, and combinations of these methods.
Recombinant human proteins or polypeptides with a cancer-suppressing function have many uses. These include, but are not limited to: direct use as a drug to treat diseases caused by low or lost function of the protein with a cancer-suppressing function, and use in screening for antibodies, polypeptides or other ligands that promote or antagonize the function of the protein with a cancer-suppressing function. For example, antibodies can be used to activate or inhibit the function of the human protein with a cancer-suppressing function. Screening peptide libraries with the expressed recombinant human protein with a cancer-suppressing function can be used to find therapeutically valuable peptide molecules that inhibit or stimulate the function of the human protein with a cancer-suppressing function.
The present invention also provides methods of screening drugs to identify agents that enhance (agonists) or repress (antagonists) the human protein with a cancer-suppressing function. Agonists enhance biological functions of the human protein with a cancer-suppressing function, such as stimulating cell proliferation, while antagonists prevent and treat disorders related to excessive cell proliferation, such as various cancers. For example, mammalian cells or membrane preparations expressing the human protein with a cancer-suppressing function can be incubated with the labeled human protein with a cancer-suppressing function in the presence of a drug. The ability of the drug to enhance or repress this interaction is then measured.
Antagonists of the human protein with a cancer-suppressing function include antibodies, compounds, receptor deletion mutants and analogs that have been identified by screening. An antagonist can bind to the human protein with a cancer-suppressing function and eliminate its function, or inhibit production of the human protein with a cancer-suppressing function, or bind to the active site of the polypeptide so that the polypeptide cannot exert its biological function. Antagonists of the human protein with a cancer-suppressing function can be used therapeutically.
When screening compounds as antagonists, the protein with a cancer-suppressing function can be added to a bioanalytical assay, and whether a compound is an antagonist is determined by measuring the compound's effect on the interaction between the protein with a cancer-suppressing function and its receptor. Receptor deletion mutants and analogs with antagonist activity can be screened by the same method as described above for screening compounds.
The polypeptides of the present invention can be used directly to treat diseases, for example various malignant tumors and abnormal cell proliferation.
The polypeptides of the present invention, their fragments, derivatives and analogs, or cells expressing them, can be used as antigens to produce antibodies. These antibodies may be polyclonal or monoclonal. Polyclonal antibodies can be obtained by injecting the polypeptide directly into an animal. Techniques for preparing monoclonal antibodies include hybridoma technology, trioma technology, human B-cell hybridoma technology, EBV-hybridoma technology, and so on.
The polypeptides and antagonists of the present invention can be used in combination with a suitable pharmaceutical carrier. Such carriers may be water, glucose, ethanol, salts, buffers, glycerol and combinations thereof. The composition contains a safe and effective amount of the polypeptide or antagonist together with carriers and excipients that do not affect the drug's efficacy. These compositions can be used as medicines for treating disease.
The present invention also provides a pack or kit containing one or more containers filled with one or more ingredients of the pharmaceutical compositions of the invention. Accompanying these containers there may be a notice in the form prescribed by a government agency regulating the manufacture, use or sale of pharmaceuticals or biological products, which notice reflects approval by the agency of manufacture, use or sale for human administration. In addition, the polypeptides of the present invention can be used in combination with other therapeutic compounds.
The pharmaceutical composition can be administered in a convenient manner, such as by the topical, intravenous, intraperitoneal, intramuscular, subcutaneous, intranasal or intradermal route. The protein with a cancer-suppressing function is administered in an amount effective to treat and/or prevent the specific indication. The amount and dose range of the protein with a cancer-suppressing function administered to a patient will depend on many factors, such as the mode of administration, the health condition of the person to be treated and the judgment of the diagnosing physician.
Polynucleotides encoding the human protein with a cancer-suppressing function can also be used for a variety of therapeutic purposes. Gene therapy can be used to treat abnormal cell proliferation, development or metabolism caused by absent expression of the protein with a cancer-suppressing function or by expression of an abnormal or inactive form of that protein. Recombinant gene therapy vectors (such as viral vectors) can be designed to express a variant protein with a cancer-suppressing function so as to inhibit the activity of the endogenous protein. For example, a variant protein with a cancer-suppressing function may be a shortened protein that lacks the signal transduction domain and, although it can bind downstream substrates, lacks signaling activity. Recombinant gene therapy vectors can therefore be used to treat diseases caused by abnormal expression or activity of the protein with a cancer-suppressing function. Expression vectors derived from viruses such as retrovirus, adenovirus, adeno-associated virus (AAV), herpes simplex virus and parvovirus can be used to transfer the gene for the protein with a cancer-suppressing function into cells. Methods for constructing recombinant viral vectors carrying the gene for the protein with a cancer-suppressing function can be found in the literature (Sambrook, et al.). In addition, the recombinant gene for the human protein with a cancer-suppressing function can be packaged in liposomes and transferred into cells.
Oligonucleotides (including antisense RNA and DNA) and ribozymes that inhibit the mRNA of the human protein with cancer-suppressing function are also within the scope of the invention. A ribozyme is an enzyme-like RNA molecule that specifically cleaves a particular RNA; it acts by endonucleolytic cleavage after the ribozyme molecule has hybridized specifically to the complementary target RNA. Antisense RNA and DNA and ribozymes can be obtained by any existing RNA or DNA synthesis technique, such as the widely used solid-phase phosphoramidite chemical synthesis of oligonucleotides. Antisense RNA molecules can be obtained by in vitro or in vivo transcription of the DNA sequence encoding the RNA; this DNA sequence is integrated downstream of an RNA polymerase promoter in a vector. To increase the stability of the nucleic acid molecule, it can be modified in several ways, such as lengthening the flanking sequences or linking the ribonucleosides through phosphorothioate or peptide bonds rather than phosphodiester bonds.
Methods for introducing polynucleotides into tissues or cells include injecting the polynucleotide directly into tissue in vivo, or first introducing the polynucleotide into cells in vitro with a vector (such as a virus, phage, or plasmid) and then transplanting the cells into the body.
The polypeptides of the invention can also be used for peptide mapping. For example, a polypeptide can be specifically cleaved by physical, chemical, or enzymatic means and then analyzed by one-, two-, or three-dimensional gel electrophoresis.
The invention also provides antibodies against antigenic determinants of the human protein with cancer-suppressing function. These antibodies include, but are not limited to, polyclonal antibodies, monoclonal antibodies, chimeric antibodies, single-chain antibodies, Fab fragments, and fragments produced by a Fab expression library.
Antibodies against the human protein with cancer-suppressing function can be used in immunohistochemistry to detect the protein in biopsy specimens.
Monoclonal antibodies that bind the human protein with cancer-suppressing function can also be labeled with a radioisotope and injected into the body, where their location and distribution can be tracked. Such radiolabeled antibodies can serve as a non-invasive diagnostic method for locating tumor cells and determining whether metastasis has occurred.
The antibodies of the invention can be used to treat or prevent diseases related to the human protein with cancer-suppressing function. Administering an appropriate dose of antibody can stimulate or block the production or activity of the protein.
Antibodies can also be used to design immunotoxins directed at a particular site in the body. For example, a high-affinity monoclonal antibody against the human protein with cancer-suppressing function can be covalently bound to a bacterial or plant toxin (such as diphtheria toxin, ricin, or abrin). A common method is to attack the amino groups of the antibody with a thiol cross-linking agent such as SPDP and to couple the toxin to the antibody through disulfide exchange. The resulting hybrid antibody can be used to kill cells that are positive for the human protein with cancer-suppressing function.
Polyclonal antibodies can be produced by immunizing animals such as rabbits, mice, or rats with the human protein or polypeptide with cancer-suppressing function. Various adjuvants, including but not limited to Freund's adjuvant, can be used to enhance the immune response.
Monoclonal antibodies against the human protein with cancer-suppressing function can be produced with hybridoma technology (Kohler and Milstein, Nature, 1975, 256:495-497). Chimeric antibodies that combine a human constant region with a variable region of non-human origin can be produced with existing techniques (Morrison et al., PNAS, 1985, 81:6851). Existing techniques for producing single-chain antibodies (U.S. Pat. No. 4946778) can also be used to produce single-chain antibodies against the human protein with cancer-suppressing function.
Peptide molecules that bind the human protein with cancer-suppressing function can be obtained by screening random peptide libraries, bound to a solid phase, that are built from all possible combinations of amino acids. For screening, the human protein with cancer-suppressing function must be labeled.
The invention also relates to diagnostic test methods for quantifying and localizing the level of the human protein with cancer-suppressing function. Such tests are well known in the art and include FISH assays and radioimmunoassays. The level of the human protein with cancer-suppressing function detected in the test can be used to explain the importance of the protein in various diseases and to diagnose diseases in which the protein plays a role.
Polynucleotides encoding the protein with cancer-suppressing function can be used for the diagnosis and treatment of diseases related to the protein. In diagnosis, they can be used to detect whether the protein is expressed, or whether its expression is abnormal in a disease state. For example, the DNA sequence of the protein with cancer-suppressing function can be hybridized to biopsy specimens to determine whether expression of the protein is abnormal. Hybridization techniques include Southern blotting, Northern blotting, and in situ hybridization. These are published, mature techniques, and the relevant kits are commercially available. Part or all of a polynucleotide of the invention can be immobilized as a probe on a microarray or DNA chip (also called a "gene chip") for differential expression analysis of genes in tissues and for gene diagnosis. In vitro amplification by reverse transcription polymerase chain reaction (RT-PCR) with primers specific for the protein with cancer-suppressing function can also detect its transcripts.
Detecting mutations in the gene for the protein with cancer-suppressing function can also be used to diagnose diseases related to the protein. The forms of mutation include point mutations, translocations, deletions, rearrangements, and any other abnormality relative to the normal wild-type DNA sequence of the protein. Mutations can be detected with existing techniques such as Southern blotting, DNA sequence analysis, PCR, and in situ hybridization. In addition, a mutation may affect expression of the protein, so Northern blotting and Western blotting can indicate indirectly whether the gene is mutated.
The sequences of the invention are also valuable for chromosome identification. Each sequence is specifically targeted to, and can hybridize with, a particular location on an individual human chromosome. At present, the specific chromosomal site of each gene still needs to be identified, and only a few chromosome markers based on actual sequence data (repeat polymorphisms) are available for marking chromosomal locations. According to the invention, the important first step in correlating these sequences with disease-related genes is to map these DNA sequences accurately to chromosomes.
Briefly, sequences can be mapped to chromosomes by preparing PCR primers (preferably 15-35 bp) from the cDNA. These primers are then used for PCR screening of somatic cell hybrids, each containing an individual human chromosome. Only those hybrids that contain the human gene corresponding to the primers will yield an amplified fragment.
PCR mapping of somatic cell hybrids is a rapid method for assigning DNA to a particular chromosome. Using the oligonucleotide primers of the invention, sublocalization can be achieved in a similar way with a panel of fragments from specific chromosomes or with pools of large genomic clones. Other strategies that can similarly be used for chromosome mapping include in situ hybridization, prescreening with labeled flow-sorted chromosomes, and preselection by hybridization to construct chromosome-specific cDNA libraries.
Fluorescence in situ hybridization (FISH) of a cDNA clone to metaphase chromosomes can provide a precise chromosomal location in one step. For a review of this technique, see Verma et al., Human Chromosomes: a Manual of Basic Techniques, Pergamon Press, New York (1988).
Once a sequence has been mapped to a precise chromosomal location, its physical position on the chromosome can be correlated with genetic map data. Such data are found, for example, in V. McKusick, Mendelian Inheritance in Man (available online through the Johns Hopkins University Welch Medical Library). The relationship between genes and diseases that have been mapped to the same chromosomal region can then be determined by linkage analysis.
Next, differences in cDNA or genomic sequence between affected and unaffected individuals need to be determined. If a mutation is observed in some or all affected individuals but in no normal individual, the mutation is likely to be the cause of the disease. Comparing affected and unaffected individuals generally involves first looking for structural alterations in the chromosomes, such as deletions or translocations that are visible at the chromosome level or detectable by PCR based on the cDNA sequence. At the current resolution of physical mapping and gene mapping techniques, a cDNA precisely localized to a chromosomal region associated with a disease could be one of between 50 and 500 potential disease-causing genes (assuming 1 megabase mapping resolution and one gene per 20 kb).
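The gene-count estimate in the preceding paragraph follows from simple division, sketched below. The 1 Mb resolution and the density of one gene per 20 kb come from the text; the function name and the 10 Mb region used to reach the upper figure of 500 are illustrative assumptions, since the text does not say how that bound was derived.

```python
# Figures stated in the text: 1 megabase mapping resolution, one gene per 20 kb.
MAPPING_RESOLUTION_BP = 1_000_000
BP_PER_GENE = 20_000


def candidate_genes(region_bp: int, bp_per_gene: int = BP_PER_GENE) -> int:
    """Estimate how many genes fit in a mapped region of the given size."""
    return region_bp // bp_per_gene


# Lower bound: a region the size of the mapping resolution.
lower = candidate_genes(MAPPING_RESOLUTION_BP)

# Upper bound: a hypothetical 10 Mb region (assumption, not stated in the text).
upper = candidate_genes(10 * MAPPING_RESOLUTION_BP)

print(f"candidate disease genes: {lower} to {upper}")
```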
The full-length nucleotide sequence encoding the protein with cancer-suppressing function of the invention, or a fragment of it, can usually be obtained by PCR amplification, by recombinant methods, or by artificial synthesis. For PCR amplification, primers can be designed from the nucleotide sequences disclosed herein, especially the open reading frame sequence, and the relevant sequence can be amplified from a commercially available cDNA library, or from a cDNA library prepared by conventional methods known to those skilled in the art, as the template. When the sequence is long, two or more PCR amplifications are often needed, and the amplified fragments are then spliced together in the correct order.
Once the relevant sequence has been obtained, it can be produced in large quantities by recombinant methods. This usually involves cloning it into a vector, transferring the vector into cells, and then isolating the relevant sequence from the propagated host cells by conventional methods.
The relevant sequence can also be produced by artificial synthesis, especially when the fragment is short. Usually, a long fragment is obtained by first synthesizing several small fragments and then ligating them.
At present, a DNA sequence encoding the protein of the invention (or a fragment or derivative of it) can be obtained entirely by chemical synthesis. The DNA sequence can then be introduced into the various DNA molecules (such as vectors) and cells known in the art. Mutations can also be introduced into the protein sequence of the invention by chemical synthesis.
Furthermore, because the protein with cancer-suppressing function of the invention has a natural amino acid sequence of human origin, it is expected to show higher activity and/or fewer side effects (for example, lower or no immunogenicity in humans) when administered to humans than homologous proteins from other species.
The invention is further illustrated below with reference to specific examples. It should be understood that these examples serve only to illustrate the invention and not to limit its scope. Experimental methods for which specific conditions are not given in the following examples generally follow conventional conditions, such as those described in Sambrook et al., Molecular Cloning: A Laboratory Manual (New York: Cold Spring Harbor Laboratory Press, 1989), or the conditions recommended by the manufacturer.
Example 1: Isolation of the cDNA genes and their inhibition of cancer cell colony formation
PP3895, PP3993, PP4052, PP4068, PP4135, PP4189, and PP2500 were obtained by constructing a human placenta cDNA library with conventional methods. Placental tissue of 3, 6, and 10 months was taken; total RNA was extracted with Trizol reagent (GIBCO BRL) according to the manufacturer's instructions, and mRNA was isolated with an mRNA purification kit (Pharmacia). A cDNA library was constructed from this mRNA with the pCMV-Script XR cDNA library construction kit (Stratagene), except that the reverse transcriptase was replaced with MMLV-RT-Superscript II (GIBCO BRL) and the reverse transcription reaction was carried out at 42 ℃. XL10-Gold competent cells were transformed, giving a cDNA library with a titer of 1 × 10^6 cfu/μg cDNA. In the first round, cDNA clones were picked at random. Thereafter, the cDNA library was screened by hybridization, using as probes the high-abundance cDNA clones and the cDNA clones already shown to inhibit cancer cell growth, and weakly positive and negative clones were picked. Plasmid DNA was extracted with a Qiagen 96-well plasmid extraction kit according to the product instructions. Plasmid DNA and empty vector were transfected in parallel into the hepatoma cell line 7721. 100 ng of DNA was ethanol-precipitated, dried, and dissolved in 6 μl H2O for transfection. To each DNA sample, 0.74 μl of liposome and 9.3 μl of serum-free medium were added; after mixing, the sample was left at room temperature for 10 minutes. 150 μl of serum-free medium was added to each tube, and the mixture was divided equally among 3 wells of 7721 cells growing in a 96-well plate and incubated at 37 ℃ for 2 hours. A further 50 μl of serum-free medium was added to each well, followed by 24 hours at 37 ℃. The medium in each well was replaced with 100 μl of complete medium for 24 hours at 37 ℃, and then with 100 μl of complete medium containing G418 for 24 to 48 hours at 37 ℃. While the cells were observed, the medium was replaced with medium containing varying concentrations of G418. After about 2 to 3 changes, once colonies were visible under the microscope, they were counted. The above clones were found to inhibit cell colony formation; the results are shown in the table below.
Colony formation by 7721 cells transfected with the cDNA clones
cDNA clone   Colonies, cDNA clone (three replicates)   Colonies, empty vector (three replicates)
PP3895       3   9   0                                 33  34  38
PP3993       17  8   12                                35  24  32
PP4052       0   1   0                                 16  20  18
PP4068       0   1   1                                 27  30  23
PP4135       14  12  10                                27  30  23
PP4189       6   5   9                                 27  30  23
PP2500       0   2   0                                 28  30  27
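The colony counts in the table above can be summarized as mean counts and a relative reduction against the empty-vector control. The sketch below uses the counts exactly as tabulated; the "percent inhibition" metric (one minus the ratio of mean colony counts) is an illustrative assumption, not a statistic reported in the source.

```python
from statistics import mean

# Colony counts from the table: (cDNA clone replicates, empty-vector replicates).
COLONY_COUNTS = {
    "PP3895": ((3, 9, 0), (33, 34, 38)),
    "PP3993": ((17, 8, 12), (35, 24, 32)),
    "PP4052": ((0, 1, 0), (16, 20, 18)),
    "PP4068": ((0, 1, 1), (27, 30, 23)),
    "PP4135": ((14, 12, 10), (27, 30, 23)),
    "PP4189": ((6, 5, 9), (27, 30, 23)),
    "PP2500": ((0, 2, 0), (28, 30, 27)),
}


def percent_inhibition(cdna_counts, vector_counts):
    """Relative reduction in mean colony number versus the empty vector.

    This metric is an assumption made for illustration only.
    """
    return 100.0 * (1.0 - mean(cdna_counts) / mean(vector_counts))


for clone, (cdna, vector) in COLONY_COUNTS.items():
    print(
        f"{clone}: mean cDNA {mean(cdna):.1f}, "
        f"mean vector {mean(vector):.1f}, "
        f"inhibition {percent_inhibition(cdna, vector):.0f}%"
    )
```

With only three replicates per condition, such a summary is descriptive; it does not replace a statistical test.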
The cDNA clones were sequenced by the dideoxy chain-termination method on an ABI377 automated DNA sequencer, reading about 500 bp from one end. Clones judged after analysis to be novel genes were then sequenced from the other end. If the full-length cDNA sequence had still not been obtained, primers were designed and sequencing was repeated until the full-length sequence was obtained (SEQ ID NO: 1, 4, 7, 10, 13, 16, 19).
Example 2: Obtaining the gene clones from placenta cDNA by PCR
Human placenta of 3, 6, and 10 months was taken; total RNA was extracted with Trizol reagent (GIBCO BRL) according to the manufacturer's instructions, and mRNA was isolated with an mRNA purification kit (Pharmacia). Reverse transcription was carried out at 42 ℃ with MMLV-RT-Superscript II (GIBCO BRL) reverse transcriptase to obtain placenta cDNA. Using the gene-specific primers for each gene (listed in the table below), PCR amplification was carried out with 1 cycle of 90 ℃ for 3 minutes; 35 cycles of 94 ℃ for 30 seconds, 60 ℃ for 30 seconds, and 72 ℃ for 1 minute; and 1 cycle of 72 ℃ for 10 minutes. This gave amplification products containing the complete open reading frame of each protein gene. The amplification products were verified by sequencing and agreed with the sequences determined in Example 1. They were then transferred into host cells by conventional techniques to obtain recombinant protein.
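The thermal cycling program described above can be written down as data, which makes the total programmed hold time easy to check. This is a minimal sketch: temperatures, times, and cycle numbers are taken from the text as printed, while the data layout and function name are illustrative assumptions, and ramp times between temperatures are ignored.

```python
# Each stage: (number of cycles, [(temperature in deg C, hold time in seconds), ...]).
# Values are those given in the text; ramping between steps is not modeled.
PCR_PROGRAM = [
    (1, [(90, 180)]),                      # initial step, 3 minutes
    (35, [(94, 30), (60, 30), (72, 60)]),  # denature, anneal, extend
    (1, [(72, 600)]),                      # final extension, 10 minutes
]


def total_hold_seconds(program):
    """Sum the programmed hold times over all stages and cycles."""
    return sum(
        cycles * sum(seconds for _temp, seconds in steps)
        for cycles, steps in program
    )


total = total_hold_seconds(PCR_PROGRAM)
print(f"programmed hold time: {total} s ({total / 60:.0f} min)")
```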
Gene-specific primer sequences
Clone   Specific primer 1 (5' → 3')   Specific primer 2 (5' → 3')
PP3895 GGTTTACTGACACCCCCACCCCA GCGCCCGGCCTCTTTTTATCCTT
PP3993 ACTTGCATTTGCCCTGACACCCA GTTCTGCTTGGCCGAGCTGTTGA
PP4052 GTGTATGCTGCCCCCTTTCTGGG ACAGGATGGTAGTGGCGATGGCA
PP4068 CTGGGCCCAAGGACAAAGCTCAC ATCATGGGGCATGCACAGCATCT
PP4135 TGCCCCTAACCACTGAGACAGCA CAACTGCACATTTTGCTCATGTA
PP4189 TCAAGGTTGCTCTCCAGCTCAAGG GTTATTAGGCCCACCACTAAGAG
PP2500 CTTGCTGCTCTTCTCGTTCCCGA GCAGGGTCCTGGAACTTCTTGGC
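A quick consistency check on a primer pair is to locate both primers on the listed cDNA sequence: primer 1 should match the sequence directly, and primer 2 should match as its reverse complement. The sketch below does this for the PP3993 pair, using two short stretches copied from SEQ ID NO:4 as printed (bases 151-200 and 1901-1960). The helper names are illustrative assumptions.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def locate(site: str, fragment: str, fragment_start: int):
    """Return 1-based (start, end) of site within a numbered fragment, or None."""
    index = fragment.find(site)
    if index < 0:
        return None
    start = fragment_start + index
    return start, start + len(site) - 1


# PP3993 primers from the table above.
PRIMER_1 = "ACTTGCATTTGCCCTGACACCCA"
PRIMER_2 = "GTTCTGCTTGGCCGAGCTGTTGA"

# Stretches of SEQ ID NO:4 as printed: bases 151-200 and 1901-1960.
UPSTREAM = "GCCCGTGGACAACTTGCATTTGCCCTGACACCCACCCTCCCTGCAGCTCT"
DOWNSTREAM = "GAGGGAGGAGCTCTGAGAACAGTCTCCTTCAACAGCTCGGCCAAGCAGAACTGCTGTACC"

forward_site = locate(PRIMER_1, UPSTREAM, 151)
reverse_site = locate(reverse_complement(PRIMER_2), DOWNSTREAM, 1901)

print("primer 1 binds at", forward_site)
print("primer 2 binds at", reverse_site)
if forward_site and reverse_site:
    print("amplicon length:", reverse_site[1] - forward_site[0] + 1, "bp")
```

The amplicon bounded by these two sites spans the open reading frame listed for PP3993 (start codon at 317, stop codon ending at 871), which is in line with the statement that the amplification products contain the complete open reading frame.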
Example 3: Analysis of the cDNA clone sequences
1. PP3895 protein
A: Nucleotide sequence (SEQ ID NO:1), length: 1972 bp
1 GGTTTAGTGA CACCCCCACC CCACCCCATC TGCATATTTT TTCACCACCC
51 CTCCCTTCTG TATATGATGC TTCTGTAGCT CTGTAACGCC CCCTACATTT
101 ACCTTCCTTA TATCTCCCCC GTCTTCCTCT CCATAGATCT CCTCCCATTT
151 CCCCTTCCAT GGTCCCCATC TTCCTTCTGA AATGTCTACT CCTTCATGTT
201 CCTTTATGTA TGTCTTCCAA TCTTTCCTTC CATAGCTCTC ATCACCTTCA
251 TATATTTCTT CCATCTTTCT CCTCCCACCT GCCTCGCCCT CTGTATATAC
301 CCCCACTCTC CCCCTTTTAT ATCTTCTCCA TCTCCCCCCA TATCTTTCCT
351 CTATGTCCAC ATCTGTGTAT TCCCCCCAAC TTCCCCTCCA TATATCTTTT
401 TTTACTCCCC TTTTCCTCCC TGTATCCTCT GTGTTCCCCC CATCTTGCTC
451 TACATCATTC TTCCCAAGAT CTTTACGTCT CCCATCTTGA TCTCTCCATC
501 TCCACTTTCT CCTAACATTT TCATTTCCGT TCCTTAGTGT CTCTAGAGAG
551 ATCATTCTTG ATAGCCTCAG CTCTTTCTCT GTGTTTTTCA GGTTTGTATT
601 CTGCTCTGCT CTACCTCTCC TCCTTGCCCC TTTTCTCTCC CAGGATGTCT
651 CTCCTTTCCA AATCCTTTTT GTACCTGAAT ACCTTTTGCC CCACCCTGGG
701 CTCTCATTTC CATCTCAGAC CTTAGCCTGG GATCTAAAGG GCTGACAGTG
751 TCCCTTTCTT CATGCAGATG ACAGTCGTCT AGAGGAGCTC AAAGCCACTC
801 TGCCCAGCCC AGACAAGCTC CCTGGATTCA AGATGTACCC CATTGACTTT
851 GAGAAGGTAT GGGGTGGGGC TCAGGACAGG GAAGGAGGAT GGGCAAAGCA
901 TAGACAGGCT GGAGAAAACA GAAGTATCTG GAGCCAGCCC CGGGCCTTTG
951 TGGGGATCAG ATTGTGGGCC TGCCATATGG CTCTGAATGA GTAGGTGTTC
1001 CCAGCCATCC CTTTGTGATC TGGGAGAGTC CAGCAGGCAA TTGCAGTGGA
1051 GGATACACAT CTTCTTTATC TGATCCTCTC CCCACTGCCT TCACACCCTC
1101 CCCACTCATA ACAGGATGAT GACAGCAACT TTCATATGGA TTTCATCGTG
1151 GCTGCATCCA ACCTCCGGGC AGAAAACTAT GACATTCCTT CTGCAGACCG
1201 GCACAAGAGC AAGCTGATTG CAGGGAAGAT CATCCCAGCC ATTGCCACGA
1251 CCACAGCAGC CGTGGTTGGC CTTGTGTGTC TGGAGCTGTA CAAGGTTGTG
1301 CAGGGGCACC GACAGCTTGA CTCCTACAAG AATGGTTTCC TCAACTTGGC
1351 CCTGCCTTTC TTTGGTTTCT CTGAACCCCT TGCCGCACCA CGTCACCAGG
1401 TGGGGGCCTG CATCCGAAGC AGGGTTTGGG TGGGGTGTAT CTGTGTAGAT
1451 CTGGTTCTGA TTCACGTCAT ACCCTGTCAC CAGGGGAGGG TTTCTGTCTG
1501 TGTACCTACC CTTTTTGTGT ATCCTTTTTC ACTTATTCAT TAATCACATT
1551 ATTTGAGTAC GTGCGAAAAG ATGGGATATT TGAATTGTGC CCTGGGAGAT
1601 TATTAGTAAC TACACAATAA TGGCAGCCAA AATTTATTGG ACGCTTCCTA
1651 CACTTAAGTG CTTTGCTTGC TTCATTAATG AATTCACTCA AATATTTATT
1701 GAGCACCTTT TGTGGGCAGG GACTCTTCTA AGTTATGTTC CTCAAGTAGA
1751 TTATATAAAT AACCTATTAA ATGATTTTGG AATCAAAAAA GGATAAAAAG
1801 AGGCCGGGCG CGGTGGCTTA CGCCTGTAAT CCCAGCACTT TGGGAGGCCG
1851 AGGCACGTGG TTCACCTGAA GTCAGGAGTT TGAGACCAGC CTGGCCAACA
1901 TGATGAAACC CTGTCTCTAC TAAAAAAAAA AAAAAAAAAA AAAAAAAAAA
1951 AAAAAAAAAA AAAAAAAAAA AA
B: Amino acid sequence (SEQ ID NO:2), length: 135 amino acids
1 MDFIVAASNL RAENYDIPSA DRHKSKLIAG KIIPAIATTT AAVVGLVCLE LYKVVQGHRQ
61 LDSYKNGFLN LALPFFGFSE PLAAPRHQVG ACIRSRVWVG CICVDLVLIH VIPCHQGRVS
121 VCVPTLFVYP FSLIH
C: Combined nucleotide and amino acid sequence (SEQ ID NO:3)
Clone: PP3895
Start codon: 1136 ATG; stop codon: 1543 TAA
Protein molecular weight: 14799
1 G GTT TAC TGA CAC CCC CAC CCC ACC CCA TCT GCA TAT TTT TTC ACC 46
47 ACC CCT CCC TTC TGT ATA TGA TGC TTC TGT AGC TCT GTA ACG CCC CCT 94
95 ACA TTT ACC TTC CTT ATA TCT CCC CCG TCT TCC TCT CCA TAG ATC TCC 142
143 TCC CAT TTC CCC TTC CAT GGT CCC CAT CTT CCT TCT GAA ATG TCT ACT 190
191 CCT TCA TGT TCC TTT ATG TAT GTC TTC CAA TCT TTC CTT CCA TAG CTC 238
239 TCA TCA CCT TCA TAT ATT TCT TCC ATC TTT CTC CTC CCA CCT GCC TCG 286
287 CCC TCT GTA TAT ACC CCC ACT CTC CCC CTT TTA TAT CTT CTC CAT CTC 334
335 CCC CCA TAT CTT TCC TCT ATG TCC ACA TCT GTG TAT TCC CCC CAA CTT 382
383 CCC CTC CAT ATA TCT TTT TTT ACT CCC CTT TTC CTC CCT GTA TCC TCT 430
431 GTG TTC CCC CCA TCT TGC TCT ACA TCA TTC TTC CCA AGA TCT TTA CGT 478
479 CTC CCA TCT TGA TCT CTC CAT CTC CAC TTT CTC CTA ACA TTT TCA TTT 526
527 CCG TTC CTT AGT GTC TCT AGA GAG ATC ATT CTT GAT AGC CTC AGC TCT 574
575 TTC TCT GTG TTT TTC AGG TTT GTA TTC TGC TCT GCT CTA CCT CTC CTC 622
623 CTT GCC CCT TTT CTC TCC CAG GAT GTC TCT CCT TTC CAA ATC CTT TTT 670
671 GTA CCT GAA TAC CTT TTG CCC CAC CCT GGG CTC TCA TTT CCA TCT CAG 718
719 ACC TTA GCC TGG GAT CTA AAG GGC TGA CAG TGT CCC TTT CTT CAT GCA 766
767 GAT GAC AGT CGT CTA GAG GAG CTC AAA GCC ACT CTG CCC AGC CCA GAC 814
815 AAG CTC CCT GGA TTC AAG ATG TAC CCC ATT GAC TTT GAG AAG GTA TGG 862
863 GGT GGG GCT CAG GAC AGG GAA GGA GGA TGG GCA AAG CAT AGA CAG GCT 910
911 GGA GAA AAC AGA AGT ATC TGG AGC CAG CCC CGG GCC TTT GTG GGG ATC 958
959 AGA TTG TGG GCC TGC CAT ATG GCT CTG AAT GAG TAG GTG TTC CCA GCC 1006
1007 ATC CCT TTG TGA TCT GGG AGA GTC CAG CAG GCA ATT GCA GTG GAG GAT 1054
1055 ACA CAT CTT CTT TAT CTG ATC CTC TCC CCA CTG CCT TCA CAC CCT CCC 1102
1103 CAC TCA TAA CAG GAT GAT GAC AGC AAC TTT CAT ATG GAT TTC ATC GTG 1150
1 Met Asp Phe Ile Val 5
1151 GCT GCA TCC AAC CTC CGG GCA GAA AAC TAT GAC ATT CCT TCT GCA GAC 1198
6 Ala Ala Ser Asn Leu Arg Ala Glu Asn Tyr Asp Ile Pro Ser Ala Asp 21
1199 CGG CAC AAG AGC AAG CTG ATT GCA GGG AAG ATC ATC CCA GCC ATT GCC 1246
22 Arg His Lys Ser Lys Leu Ile Ala Gly Lys Ile Ile Pro Ala Ile Ala 37
1247 ACG ACC ACA GCA GCC GTG GTT GGC CTT GTG TGT CTG GAG CTG TAC AAG 1294
38 Thr Thr Thr Ala Ala Val Val Gly Leu Val Cys Leu Glu Leu Tyr Lys 53
1295 GTT GTG CAG GGG CAC CGA CAG CTT GAC TCC TAC AAG AAT GGT TTC CTC 1342
54 Val Val Gln Gly His Arg Gln Leu Asp Ser Tyr Lys Asn Gly Phe Leu 69
1343 AAC TTG GCC CTG CCT TTC TTT GGT TTC TCT GAA CCC CTT GCC GCA CCA 1390
70 Asn Leu Ala Leu Pro Phe Phe Gly Phe Ser Glu Pro Leu Ala Ala Pro 85
1391 CGT CAC CAG GTG GGG GCC TGC ATC CGA AGC AGG GTT TGG GTG GGG TGT 1438
86 Arg His Gln Val Gly Ala Cys Ile Arg Ser Arg Val Trp Val Gly Cys 101
1439 ATC TGT GTA GAT CTG GTT CTG ATT CAC GTC ATA CCC TGT CAC CAG GGG 1486
102 Ile Cys Val Asp Leu Val Leu Ile His Val Ile Pro Cys His Gln Gly 117
1487 AGG GTT TCT GTC TGT GTA CCT ACC CTT TTT GTG TAT CCT TTT TCA CTT 1534
118 Arg Val Ser Val Cys Val Pro Thr Leu Phe Val Tyr Pro Phe Ser Leu 133
1535 ATT CAT TAA TCA CAT TAT TTG AGT ACG TGC GAA AAG ATG GGA TAT TTG 1582
134 Ile His *** 136
1583 AAT TGT GCC CTG GGA GAT TAT TAG TAA CTA CAC AAT AAT GGC AGC CAA 1630
1631 AAT TTA TTG GAC GCT TCC TAC ACT TAA GTG CTT TGC TTG CTT CAT TAA 1678
1679 TGA ATT CAC TCA AAT ATT TAT TGA GCA CCT TTT GTG GGC AGG GAC TCT 1726
1727 TCT AAG TTA TGT TCC TCA AGT AGA TTA TAT AAA TAA CCT ATT AAA TGA 1774
1775 TTT TGG AAT CAA AAA AGG ATA AAA AGA GGC CGG GCG CGG TGG CTT ACG 1822
1823 CCT GTA ATC CCA GCA CTT TGG GAG GCC GAG GCA CGT GGT TCA CCT GAA 1870
1871 GTC AGG AGT TTG AGA CCA GCC TGG CCA ACA TGA TGA AAC CCT GTC TCT 1918
1919 ACT AAA AAA AAA AAA AAA AAA AAA AAA AAA AAA AAA AAA AAA AAA AAA 1966
1967 AAA AAA 1972
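The coordinates and translation in the PP3895 listing above can be cross-checked with a few lines of code. The sketch below builds the standard genetic code table, translates the first ten codons (bases 1136-1165) and the last four codons (bases 1532-1543) as printed in the combined listing, and confirms that an open reading frame running from 1136 to 1543 encodes 135 residues plus a stop codon. Function and variable names are illustrative assumptions.

```python
# Standard genetic code, with codons ordered by base T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate in frame from the first base, stopping at the first stop codon."""
    protein = []
    for pos in range(0, len(dna) - 2, 3):
        residue = CODON_TABLE[dna[pos:pos + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


# ORF coordinates given for clone PP3895 (1543 is the last base of the TAA stop).
ORF_START, ORF_END = 1136, 1543
codons = (ORF_END - ORF_START + 1) // 3
residues = codons - 1  # the final codon is the stop codon

# Bases 1136-1165 and 1532-1543 as printed in the combined listing.
FIRST_CODONS = "ATGGATTTCATCGTGGCTGCATCCAACCTC"
LAST_CODONS = "CTTATTCATTAA"

print(residues, translate(FIRST_CODONS), translate(LAST_CODONS))
```

The same arithmetic can be applied to the coordinates given for the other clones.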
2. PP3993 protein
A: Nucleotide sequence (SEQ ID NO:4), length: 2064 bp
1 TTCAGACAAA CCTCAGGTAA GATGGGACTG GGCTTCCCCG CCACTGCGCA
51 GCAGCTCCCG CCCTCCTGGG TGCTGTCCCA GGGGTGAAAG GAAGAGGTGG
101 GGAGGTACAG CTGGGAGTGT GGGGGATGGG GAAGGATGGG GAGGGAACGG
151 GCCCGTGGAC AACTTGCATT TGCCCTGACA CCCACCCTCC CTGCAGCTCT
201 ACACCCTCAC CGTGATCGGC CCAGGACCGC CAGATTGCCA GCCAGCCCAG
251 ATCTCTCGCC GTTACTCGGA CTTTGAGCGG CTGCACCGAA ACCTGCAGCG
301 GCAATTCCGG GGCCCAATGG CTGCCATCTC CTTCCCCCGT AAGCGGCTGC
351 GCCGGAATTT TACTGCAGAG ACCATTGCCC GCCGTAGCCG GGCCTTTGAG
401 CAGTTTTTGG GTCACCTGCA GGCAGTGCCT GAGCTGCGCC ATGCCCCGGA
451 CCTGCAGGAC TTCTTCGTGC TGCCGGAGCT GCGGCGGGCA CAGAGCCTCA
501 CCTGTACTGG CCTCTATCGT GAGGCTCTGG CACTCTGGGC CAATGCCTGG
551 CAGCTGCAAG CCCAGCTGGG CACCCCCTCT GGCCCAGACC GCCCCCTGCT
601 GACCCTGGCT GGGCTGGCCG TGTGCCACCA GGAGCTGGAA GACCCTGGAG
651 AGGCCCGGGC ATGCTGTGAG AAGGCCCTGC AGCTGCTTGG GGACAAGAGC
701 CTCCACCCTT TGCTGGCACC CTTTCTGGAG GCCCATGTCC GGCTCTCCTG
751 GCGCCTGGGC CTGGACAAAC GTCAATCAGA GGCTCGGCTC CAAGCCCTGC
801 AGGAGGCAGG CCTTACCCCC ACACCACCCC CCAGTCTCAA AGAATTGCTC
851 ATCAAGGAGG TGCTGGACTA ACCCTTGCCT AGATTTAAGG CCACTGTGAG
901 GAGAGGGGTT GCCCCAGAAG GCAGGGGAAG GACCTGATGA GAACAGAATA
951 GCTGGGAGGC TGCAGAGGGT GCTGGGAGCC CCTAGAAGTT CCAAAAGAGA
1001 ATGTGAAGCA GATCAAGGAA ACTTCTGTTG AGCTAGGCTC AGGGTGAGCT
1051 TTGGCTGGGG TTGCCCTTGT GTAGTACAGG GAAGTCTGAC ACAGCCTCTC
1101 CAGCCTATAA ACAGCCGGGG GGCTGTGGCA CAGGTTGGGG CAATGTTCCC
1151 TTGTTGGTGG GCCCCCAAGC TGGCAAGGCC TCTTGGCTGA AGGCCAGGGA
1201 CTCTGCCCCT GGAGTCCTGG AGTTAAGGGA TGAAGGCAAG GCTGCAGGTC
1251 TGGCCCAGGG GAATTAAAAG CCAGCCACTC CAGTGGTATC AGTCTCTTTA
1301 TTGGATGTGA GGGCCAAAAG GGACTGTAAC TCCTGTCTCA GGAATGGGGA
1351 TAGATGGGAG GTTCTTGAAG CCCCAGGCGA AGCTGGTACC TCTGGCTACA
1401 GCTTGCTCTC TGAGACCTGG GGCTTCACTC GGATCACGCC CTCCTGGGCA
1451 CAGGTCACAG CTAGGACTCC ATCCTGACGC CACAGCCGCC CATGGACCAG
1501 CCCCCGAGAG CCACCTGTGG GTGAGGTGAA GGGTGATGAT GGCCTGCTTC
1551 AGAACAGCCA AATACACTTT TTTTTTTTTT CCTGAAACAG AGTCCCACTA
1601 AGTTGCCAGG CTGGTCTCAA GCCGCCTGGG TTCAAGGGAT CCTCCCGCCT
1651 CAGCCTCCTG AGCAGCTGGG ATTACAGGCG CACATCACCA TGCCCAACCT
1701 CCAAGTGGAC TTCTTGCAAA GGGTCTGGCC CAGGGCAGGG CTGCCCCACA
1751 CAAGGGTGCA CTGAGTGTCG TGGCTGCTCC AAATGCCCCT TCATGAGCTT
1801 ATTATGGACC GTCATTGAGG GGTAACTCCT CCCACAGGAA CCCCAGTTGA
1851 CAGTTTAAAA GCACTTTTAC ACCTCTCCTC GCTTCCTCAA AAAGATCACA
1901 GAGGGAGGAG CTCTGAGAAC AGTCTCCTTC AACAGCTCGG CCAAGCAGAA
1951 CTGCTGTACC TCTGACCACT TGTGTTAGGA AAACTATCGG CTCCCTGTAT
2001 AATAAATCAA GCCAGGTCCT CCCCAAAAAA AAAAAAAAAA AAAAAAAAAA
2051 AAAAAAAAAA AAAA
B: Amino acid sequence (SEQ ID NO:5), length: 184 amino acids
1 MAAISFPRKR LRRNFTAETI ARRSRAFEQF LGHLQAVPEL RHAPDLQDFF VLPELRRAQS
61 LTCTGLYREA LALWANAWQL QAQLGTPSGP DRPLLTLAGL AVCHQELEDP GEARACCEKA
121 LQLLGDKSLH PLLAPFLEAH VRLSWRLGLD KRQSEARLQA LQEAGLTPTP PPSLKELLIK
181 EVLD
C: Combined nucleotide and amino acid sequence (SEQ ID NO:6)
Clone: PP3993
Start codon: 317 ATG; stop codon: 871 TAA
Protein molecular weight: 20611
1 T TCA GAC AAA CCT CAG GTA AGA TGG GAC TGG GCT TCC CCG CCA CTG 46
47 CGC AGC AGC TCC CGC CCT CCT GGG TGC TGT CCC AGG GGT GAA AGG AAG 94
95 AGG TGG GGA GGT ACA GCT GGG AGT GTG GGG GAT GGG GAA GGA TGG GGA 142
143 GGG AAC GGG CCC GTG GAC AAC TTG CAT TTG CCC TGA CAC CCA CCC TCC 190
191 CTG CAG CTC TAC ACC CTC ACC GTG ATC GGC CCA GGA CCG CCA GAT TGC 238
239 CAG CCA GCC CAG ATC TCT CGC CGT TAC TCG GAC TTT GAG CGG CTG CAC 286
287 CGA AAC CTG CAG CGG CAA TTC CGG GGC CCA ATG GCT GCC ATC TCC TTC 334
1 Met Ala Ala Ile Ser Phe 6
335 CCC CGT AAG CGG CTG CGC CGG AAT TTT ACT GCA GAG ACC ATT GCC CGC 382
7 Pro Arg Lys Arg Leu Arg Arg Asn Phe Thr Ala Glu Thr Ile Ala Arg 22
383 CGT AGC CGG GCC TTT GAG CAG TTT TTG GGT CAC CTG CAG GCA GTG CCT 430
23 Arg Ser Arg Ala Phe Glu Gln Phe Leu Gly His Leu Gln Ala Val Pro 38
431 GAG CTG CGC CAT GCC CCG GAC CTG CAG GAC TTC TTC GTG CTG CCG GAG 478
39 Glu Leu Arg His Ala Pro Asp Leu Gln Asp Phe Phe Val Leu Pro Glu 54
479 CTG CGG CGG GCA CAG AGC CTC ACC TGT ACT GGC CTC TAT CGT GAG GCT 526
55 Leu Arg Arg Ala Gln Ser Leu Thr Cys Thr Gly Leu Tyr Arg Glu Ala 70
527 CTG GCA CTC TGG GCC AAT GCC TGG CAG CTG CAA GCC CAG CTG GGC ACC 574
71 Leu Ala Leu Trp Ala Asn Ala Trp Gln Leu Gln Ala Gln Leu Gly Thr 86
575 CCC TCT GGC CCA GAC CGC CCC CTG CTG ACC CTG GCT GGG CTG GCC GTG 622
87 Pro Ser Gly Pro Asp Arg Pro Leu Leu Thr Leu Ala Gly Leu Ala Val 102
623 TGC CAC CAG GAG CTG GAA GAC CCT GGA GAG GCC CGG GCA TGC TGT GAG 670
103 Cys His Gln Glu Leu Glu Asp Pro Gly Glu Ala Arg Ala Cys Cys Glu 118
671 AAG GCC CTG CAG CTG CTT GGG GAC AAG AGC CTC CAC CCT TTG CTG GCA 718
119 Lys Ala Leu Gln Leu Leu Gly Asp Lys Ser Leu His Pro Leu Leu Ala 134
719 CCC TTT CTG GAG GCC CAT GTC CGG CTC TCC TGG CGC CTG GGC CTG GAC 766
135 Pro Phe Leu Glu Ala His Val Arg Leu Ser Trp Arg Leu Gly Leu Asp 150
767 AAA CGT CAA TCA GAG GCT CGG CTC CAA GCC CTG CAG GAG GCA GGC CTT 814
151 Lys Arg Gln Ser Glu Ala Arg Leu Gln Ala Leu Gln Glu Ala Gly Leu 166
815 ACC CCC ACA CCA CCC CCC AGT CTC AAA GAA TTG CTC ATC AAG GAG GTG 862
167 Thr Pro Thr Pro Pro Pro Ser Leu Lys Glu Leu Leu Ile Lys Glu Val 182
863 CTG GAC TAA CCC TTG CCT AGA TTT AAG GCC ACT GTG AGG AGA GGG GTT 910
183 Leu Asp *** 185
911 GCC CCA GAA GGC AGG GGA AGG ACC TGA TGA GAA CAG AAT AGC TGG GAG 958
959 GCT GCA GAG GGT GCT GGG AGC CCC TAG AAG TTC CAA AAG AGA ATG TGA 1006
1007 AGC AGA TCA AGG AAA CTT CTG TTG AGC TAG GCT CAG GGT GAG CTT TGG 1054
1055 CTG GGG TTG CCC TTG TGT AGT ACA GGG AAG TCT GAC ACA GCC TCT CCA 1102
1103 GCC TAT AAA CAG CCG GGG GGC TGT GGC ACA GGT TGG GGC AAT GTT CCC 1150
1151 TTG TTG GTG GGC CCC CAA GCT GGC AAG GCC TCT TGG CTG AAG GCC AGG 1198
1199 GAC TCT GCC CCT GGA GTC CTG GAG TTA AGG GAT GAA GGC AAG GCT GCA 1246
1247 GGT CTG GCC CAG GGG AAT TAA AAG CCA GCC ACT CCA GTG GTA TCA GTC 1294
1295 TCT TTA TTG GAT GTG AGG GCC AAA AGG GAC TGT AAC TCC TGT CTC AGG 1342
1343 AAT GGG GAT AGA TGG GAG GTT CTT GAA GCC CCA GGC GAA GCT GGT ACC 1390
1391 TCT GGC TAC AGC TTG CTC TCT GAG ACC TGG GGC TTC ACT CGG ATC ACG 1438
1439 CCC TCC TGG GCA CAG GTC ACA GCT AGG ACT CCA TCC TGA CGC CAC AGC 1486
1487 CGC CCA TGG ACC AGC CCC CGA GAG CCA CCT GTG GGT GAG GTG AAG GGT 1534
1535 GAT GAT GGC CTG CTT CAG AAC AGC CAA ATA CAC TTT TTT TTT TTT TCC 1582
1583 TGA AAC AGA GTC CCA CTA AGT TGC CAG GCT GGT CTC AAG CCG CCT GGG 1630
1631 TTC AAG GGA TCC TCC CGC CTC AGC CTC CTG AGC AGC TGG GAT TAC AGG 1678
1679 CGC ACA TCA CCA TGC CCA ACC TCC AAG TGG ACT TCT TGC AAA GGG TCT 1726
1727 GGC CCA GGG CAG GGC TGC CCC ACA CAA GGG TGC ACT GAG TGT CGT GGC 1774
1775 TGC TCC AAA TGC CCC TTC ATG AGC TTA TTA TGG ACC GTC ATT GAG GGG 1822
1823 TAA CTC CTC CCA CAG GAA CCC CAG TTG ACA GTT TAA AAG CAC TTT TAC 1870
1871 ACC TCT CCT CGC TTC CTC AAA AAG ATC ACA GAG GGA GGA GCT CTG AGA 1918
1919 ACA GTC TCC TTC AAC AGC TCG GCC AAG CAG AAC TGC TGT ACC TCT GAC 1966
1967 CAC TTG TGT TAG GAA AAC TAT CGG CTC CCT GTA TAA TAA ATC AAG CCA 2014
2015 GGT CCT CCC CAA AAA AAA AAA AAA AAA AAA AAA AAA AAA AAA AAA AAA 2062
2063 AA 2064
3. PP4052 protein
A: Nucleotide sequence (SEQ ID NO:7), length: 1794 bp
1 CTAAGAGAGC TTGGAAAGGG ATAGAGAAGT CTGACCCAAA TTTGCGGAGC
51 GACTGAGTGT ATGCTGCCCC CTTTCTGGGC CTTGGCTTCT TCCTCAATCA
101 TCTAGGCACA GTCCTATGAC TGCCTGTTTT TGAGGATGTG GGAAGGGTCT
151 GCAAATACAG TGCTTTCCCA TTGACACACG CTGGTGAGGA TGCAGGCTCC
201 CTGGCACCAG CAGTGAGGGC TCAGATTGCA AGAGTAAAAA CTTCCATCAC
251 TGGGAAGAGA AGTCTGCAGG GGACTGGAGG TGATCTGAAG ATTCTGAAAT
301 AACTCTTCCT CTCTCTGCAG AGAAGGATGG TGCTTCATCC TGTTTAGGTA
351 AGGGTGATAG CCAAGTGTGT TCAGGGTGGC TGCACCAACC CACCCTGAGT
401 CCATGTGGCT CAAGACTGCT GCTCAGGTGG GGTCAGCTGA GTGGGTAGGA
451 AGTCGGGAGG CACTGCCTAG CAGGTTTCAA CTTTAGTCTG GAGGCTGCAT
501 CTGTCTCCTC TAAACACAGT GGTTCTCATT GGTTATACCC CAAAATCACA
551 CCAGGAAACT TTAAAACTAG TGACGACTGA GTCCCGTCTC CAGAAACTCT
601 GACTTAATTG TGGGGGTGTG GCCTGGACAT CAGGATTTGT AATAACTCCT
651 CAGGTGATTC TCATGTAGTC AAGGTTGAGA ACCACTGCTT TAACTTTCTC
701 CAAATCCTAA TTACTTCTAG TGTGGCCTGA GGACTGCACT TTGAATGGCA
751 AGGGCCAAAT AACCAAGTTT TTGTTTTTTT CCAGGAAGTG GAGAAGAATT
801 TGGGGTTGGG ACTAGACTGG GGGTGAGCTG GGGAGATGGA AGATGGAAGT
851 GGGGTCAGTG GGAGGCAATG ATGGTAGGTA TCTTGGAAGA AGGATGCTTA
901 ATTTTAACAC GGAAAGACTG GAAGAAGGGA AGATACAAAG AGGTGGTCTC
951 CAGAAGACAG TCAGACAAAT AACAGAGCCT TAGAAATAAA ACCTTTTGGG
1001 CTGGGTGGCC CAGGTTCTCA ATTGCTGCCT CTACAGGAAA GTCTTTGGGT
1051 TCGGAGGATT CCAGAAACAT GAAGGAGAAG TTGGAGGACA TGGAGAGTGT
1101 CCTCAAGGAC CTGACAGAGG AGAAGAGAAA AGATGTGCTA AACTCCCTCG
1151 CTAAGTGCCT CGGCAAGGAG GATATTCGGC AGGATCTAGA GCAAAGAGTA
1201 TCTGAGGTCC TGATTTCCGG GGAGCTACAC ATGGAGGACC CAGACAAGCC
1251 TCTCCTAAGC AGCCTTTTTA ATGCTGCTGG GGTCTTGGTA GAAGCGCGTG
1301 CAAAAGCCAT TCTGGACTTC CTGGATGCCC TGCTAGAGCT GTCTGAAGAG
1351 CAGCAGTTTG TGGCTGAGGC CCTGGAGAAG GGGACCCTTC CTCTGTTGAA
1401 GGACCAGGTG AAATCTGTCA TGGAGCAGAA CTGGGATGAG CTGGCCAGCA
1451 GTCCTCCTGA CATGGACTAT GACCCTGAGG CACGAATTCT CTGTGCGCTG
1501 TATGTTGTTG TCTCTATCCT GCTGGAGCTG GCTGAGGGGC CTACCTCTGT
1551 CTCTTCCTAA CTACAAAAGC CCTTTCTCCC CACAAGCCTC TGGGTTTTCC
1601 CTTTACCAGT CTGTCCTCAC TGCCATCGCC ACTACCATCC TGTCACCAGT
1651 GGGACCTCTT TAAAACAAGC AGCCAACCAT TCTTTGATGT ATCCCATTCG
1701 CTCCATGTTA ACATCCAAAA CCAGCCTGGA TTTCATACAT GGACTTCTGA
1751 TTAAAAGTGG CAGGTTGTGC ATGTTAAAAA AAAAAAAAAA AAAA
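The rows above carry the 1-based position of their first base, followed by bases in groups of ten. A minimal Python sketch for joining such rows back into a single string, assuming exactly this layout; the function name `parse_numbered_rows` is hypothetical, and the two sample rows are the first two rows of SEQ ID NO:7:

```python
def parse_numbered_rows(rows):
    """Join position-prefixed sequence rows into one string.

    Each row is expected to look like '51 GACTGAGTGT ATGCTGCCCC ...':
    a 1-based start position followed by whitespace-separated groups.
    Raises ValueError when a row's stated position does not follow
    directly from the bases read so far.
    """
    pieces = []
    length = 0
    for row in rows:
        fields = row.split()
        start = int(fields[0])
        bases = "".join(fields[1:]).upper()
        if start != length + 1:
            raise ValueError(f"row starts at {start}, expected {length + 1}")
        pieces.append(bases)
        length += len(bases)
    return "".join(pieces)


rows = [
    "1 CTAAGAGAGC TTGGAAAGGG ATAGAGAAGT CTGACCCAAA TTTGCGGAGC",
    "51 GACTGAGTGT ATGCTGCCCC CTTTCTGGGC CTTGGCTTCT TCCTCAATCA",
]
seq = parse_numbered_rows(rows)
print(len(seq))  # 100
```

Because all whitespace inside a row is discarded, a row with a missing group separator still parses, while a dropped or duplicated row is reported through the position check.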
B. Amino acid sequence (SEQ ID NO:8), length: 163 amino acids
1 MKEKLEDMES VLKDLTEEKR KDVLNSLAKC LGKEDIRQDL EQRVSEVLIS GELHMEDPDK
61 PLLSSLFNAA GVLVEARAKA ILDFLDALLE LSEEQQFVAE ALEKGTLPLL KDQVKSVMEQ
121 NWDELASSPP DMDYDPEARI LCALYVVVSI LLELAEGPTS VSS
C. Nucleotide and amino acid composite sequence (SEQ ID NO:9)
Clone number: PP4052
Start codon: 1069 ATG; stop codon: 1560 TAA
Protein molecular weight: 18158
1 CTA AGA GAG CTT GGA AAG GGA TAG AGA AGT CTG ACC CAA ATT TGC GGA 48
49 GCG ACT GAG TGT ATG CTG CCC CCT TTC TGG GCC TTG GCT TCT TCC TCA 96
97 ATC ATC TAG GCA CAG TCC TAT GAC TGC CTG TTT TTG AGG ATG TGG GAA 144
145 GGG TCT GCA AAT ACA GTG CTT TCC CAT TGA CAC ACG CTG GTG AGG ATG 192
193 CAG GCT CCC TGG CAC CAG CAG TGA GGG CTC AGA TTG CAA GAG TAA AAA 240
241 CTT CCA TCA CTG GGA AGA GAA GTC TGC AGG GGA CTG GAG GTG ATC TGA 288
289 AGA TTC TGA AAT AAC TCT TCC TCT CTC TGC AGA GAA GGA TGG TGC TTC 336
337 ATC CTG TTT AGG TAA GGG TGA TAG CCA AGT GTG TTC AGG GTG GCT GCA 384
385 CCA ACC CAC CCT GAG TCC ATG TGG CTC AAG ACT GCT GCT CAG GTG GGG 432
433 TCA GCT GAG TGG GTA GGA AGT CGG GAG GCA CTG CCT AGC AGG TTT CAA 480
481 CTT TAG TCT GGA GGC TGC ATC TGT CTC CTC TAA ACA CAG TGG TTC TCA 528
529 TTG GTT ATA CCC CAA AAT CAC ACC AGG AAA CTT TAA AAC TAG TGA CGA 576
577 CTG AGT CCC GTC TCC AGA AAC TCT GAC TTA ATT GTG GGG GTG TGG CCT 624
625 GGA CAT CAG GAT TTG TAA TAA CTC CTC AGG TGA TTC TCA TGT AGT CAA 672
673 GGT TGA GAA CCA CTG CTT TAA CTT TCT CCA AAT CCT AAT TAC TTC TAG 720
721 TGT GGC CTG AGG ACT GCA CTT TGA ATG GCA AGG GCC AAA TAA CCA AGT 768
769 TTT TGT TTT TTT CCA GGA AGT GGA GAA GAA TTT GGG GTT GGG ACT AGA 816
817 CTG GGG GTG AGC TGG GGA GAT GGA AGA TGG AAG TGG GGT CAG TGG GAG 864
865 GCA ATG ATG GTA GGT ATC TTG GAA GAA GGA TGC TTA ATT TTA ACA CGG 912
913 AAA GAC TGG AAG AAG GGA AGA TAC AAA GAG GTG GTC TCC AGA AGA CAG 960
961 TCA GAC AAA TAA CAG AGC CTT AGA AAT AAA ACC TTT TGG GCT GGG TGG 1008
1009 CCC AGG TTC TCA ATT GCT GCC TCT ACA GGA AAG TCT TTG GGT TCG GAG 1056
1057 GAT TCC AGA AAC ATG AAG GAG AAG TTG GAG GAC ATG GAG AGT GTC CTC 1104
1 Met Lys Glu Lys Leu Glu Asp Met Glu Ser Val Leu 12
1105 AAG GAC CTG ACA GAG GAG AAG AGA AAA GAT GTG CTA AAC TCC CTC GCT 1152
13 Lys Asp Leu Thr Glu Glu Lys Arg Lys Asp Val Leu Asn Ser Leu Ala 28
1153 AAG TGC CTC GGC AAG GAG GAT ATT CGG CAG GAT CTA GAG CAA AGA GTA 1200
29 Lys Cys Leu Gly Lys Glu Asp Ile Arg Gln Asp Leu Glu Gln Arg Val 44
1201 TCT GAG GTC CTG ATT TCC GGG GAG CTA CAC ATG GAG GAC CCA GAC AAG 1248
45 Ser Glu Val Leu Ile Ser Gly Glu Leu His Met Glu Asp Pro Asp Lys 60
1249 CCT CTC CTA AGC AGC CTT TTT AAT GCT GCT GGG GTC TTG GTA GAA GCG 1296
61 Pro Leu Leu Ser Ser Leu Phe Asn Ala Ala Gly Val Leu Val Glu Ala 76
1297 CGT GCA AAA GCC ATT CTG GAC TTC CTG GAT GCC CTG CTA GAG CTG TCT 1344
77 Arg Ala Lys Ala Ile Leu Asp Phe Leu Asp Ala Leu Leu Glu Leu Ser 92
1345 GAA GAG CAG CAG TTT GTG GCT GAG GCC CTG GAG AAG GGG ACC CTT CCT 1392
93 Glu Glu Gln Gln Phe Val Ala Glu Ala Leu Glu Lys Gly Thr Leu Pro 108
1393 CTG TTG AAG GAC CAG GTG AAA TCT GTC ATG GAG CAG AAC TGG GAT GAG 1440
109 Leu Leu Lys Asp Gln Val Lys Ser Val Met Glu Gln Asn Trp Asp Glu 124
1441 CTG GCC AGC AGT CCT CCT GAC ATG GAC TAT GAC CCT GAG GCA CGA ATT 1488
125 Leu Ala Ser Ser Pro Pro Asp Met Asp Tyr Asp Pro Glu Ala Arg Ile 140
1489 CTC TGT GCG CTG TAT GTT GTT GTC TCT ATC CTG CTG GAG CTG GCT GAG 1536
141 Leu Cys Ala Leu Tyr Val Val Val Ser Ile Leu Leu Glu Leu Ala Glu 156
1537 GGG CCT ACC TCT GTC TCT TCC TAA CTA CAA AAG CCC TTT CTC CCC ACA 1584
157 Gly Pro Thr Ser Val Ser Ser *** 164
1585 AGC CTC TGG GTT TTC CCT TTA CCA GTC TGT CCT CAC TGC CAT CGC CAC 1632
1633 TAC CAT CCT GTC ACC AGT GGG ACC TCT TTA AAA CAA GCA GCC AAC CAT 1680
1681 TCT TTG ATG TAT CCC ATT CGC TCC ATG TTA ACA TCC AAA ACC AGC CTG 1728
1729 GAT TTC ATA CAT GGA CTT CTG ATT AAA AGT GGC AGG TTG TGC ATG TTA 1776
1777 AAA AAA AAA AAA AAA AAA 1794
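The coordinates given for clone PP4052 (ATG at 1069, stop at 1560) can be checked against the stated length of 163 amino acids. The sketch below assumes, as the composite listing suggests, that the start coordinate is the first base of the ATG and the stop coordinate is the last base of the stop codon; the function name `orf_residue_count` is hypothetical:

```python
def orf_residue_count(start, stop_end):
    """Number of encoded residues for an ORF given 1-based coordinates.

    start    : position of the A in the ATG start codon
    stop_end : position of the last base of the stop codon
    """
    span = stop_end - start + 1
    if span % 3 != 0:
        raise ValueError("ORF span is not a whole number of codons")
    return span // 3 - 1  # the stop codon encodes no residue


print(orf_residue_count(1069, 1560))  # 163
```

The span 1069 to 1560 covers 492 bases, that is 164 codons including the stop, which agrees with the 163 residues listed for SEQ ID NO:8.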
4. PP4068 protein
A. Nucleotide sequence (SEQ ID NO:10), length: 1949 bp
1 GGAAGGCAAA GGTAGAGCAA CTGGATCTCT GGCTCTCCAC ATAGCTTCTG
51 ATCTCAGACC TTACTAAAAT GCTTTCTGGG CCCAAGGACA AAGCTCACAT
101 GAACAAATGA TTTTGAGTCA TGAATGAAAA ATCTTGCTCT TTCCATAGTA
151 AAGAAGAATT AAGAGATGGA CAGGGTGAAA GATTGTCTGC TGGATATTCT
201 CCATCATATG ACAAGGACAA GAGTGTTCTG GCTTTCAGAG GAATCCCTAT
251 CTCAGAGTTG AAGAACCATG GCATTCTCCA GGCTCTGACC ACAGAAGCTT
301 ATGAATGGGA GCCACGTGTT GTGAGTACAG AGGTGGTCAG AGCCCAAGAA
351 GAATGGGAAG CTGTGGACAC CATCCAGCCA GAGACAGGGA GCCAAGCTAG
401 CTCAGAGCAG CCTGGGCAGC TAATCTCCTT CGGTGAGGCC CTGCAGCACT
451 TCCAGACTGT GGACCTTTCC CCCTTCAAGA AAAGAATCCA GCCAACTATT
501 CGAAGGACTG GGCTCGCCGC CCTCCGACAC TACCTCTTCG GGCCTCCAAA
551 GCTCCACCAG CGCCTTCGGG AAGAAAGGGA CTTGGTCCTG ACCATTGCTC
601 AGTGAGCGAA TCCAGCCACA GACCTGAGAG GCGCAGGCTT CCTTGCCCTC
651 CTGCATCTGC TCTACCTGGT GATGGACTCA AAGACCTTGC CGATGGCGCA
701 GGAGATTTTC CGCCTGTCTC GTCACCACAT CCAGCAATTC CCTTTCTGTT
751 TGATGTCCGT GAACATCACC CACATTGCCA TCCAGGCCTT GAGAGAGGAG
801 TGTCTCTCCA GAGAGTGTAA TCGGCAGCAG AAGGTCATCC CCGTGGTGAA
851 CAGCTTCTAT GCCGCCACAT TCCTCCACCT CGCACATGTC TGGAGGACAC
901 AGCGGAAGAC CATCTCAGAC TCGGGCTTTG TCCTCAAAGA GTTGGAAGTA
951 TTGGCCAAGA AGAGCCCACG GCGGCTGCTC AAGACCCTGG AGCTGTACTT
1001 GGCCAGGGTG TCAAAGGGAC AGGCCTCCTT GTTGGGAGCA CAGAAGTGCT
1051 ATGGGCCAGA AGCCCCTCCC TTCAAGGATC TCACCTTCAC AGGTGAGAGT
1101 GACCTGCAGT CTCACTCATC CGAAGGCGTA TGGCTGATCT GACCTCCGAG
1151 ATGAATGGAG GCTTAAAGGC TGAGCTGCAG GGGCTTTCAG GGGGTCAGTG
1201 GAGCCATGTC AGGAGCCTGG CCAGGCCGCA CCCCTTGCTG TCTCAGCAGA
1251 TGGGATATAG GAAGCTCCTG GGCTTAGCTG TGGGAAGCCA AGTACCCTCA
1301 CCGGCATGGG ACATGAGGGG CAGCTAGACT TCACCCCCTT CCCGCAGACC
1351 TGCCTCCAGA GCAAGGAGAA TTCTGCCTTA ATCTGTTGGG CTCCAGTCTC
1401 CGGGTTGAAT TCCAGTGTAT CCCACTGGGA GTGAATGGAT CATGAGGTGG
1451 GATGGCCCCA TCTGGATGTT CTGCAGATCC CCACATGGGA GGAGATTCCC
1501 AAGTAGAGGC AGCCAGAATT CTGACTCCCT GGCAGCAGCA GGGCTTTCAG
1551 GCACCAGTGT CTGTGTTGTT AGAACTCAGA GGAAGAGGCA GGGGCAGCAG
1601 CCGACAGGGG GCAGCATGAC CCAAGCAAGG GGCCTGAGGC CTCAGTGGGG
1651 GTAGGGCAAG GAGGTACCTC ACAGGTGGGT GTGAGGCCCC CTCTGGAGTT
1701 TCTGGCCATT CACTTACCCA CTCTCTTCTC CCCCTGACCC CCGCTCCATT
1751 GTTTATGATG GAAAAACGGA CATTTGGCTA GGTGTCTCTC ACGGCTGCTC
1801 CATCCTAGCC CCCACAAGCT GGGGCTTCCT CCTGTAGATG CTGTGCATGC
1851 CCCATGATGA GTTTCTGGCC TAATTGAGGG AAGGAGGAAA TTCATACCAG
1901 CAGTTTTCAA ATAAAAGAAT TGTTCTAATT AAAAAAAAAA AAAAAAAAA
B. Amino acid sequence (SEQ ID NO:11), length: 161 amino acids
1 MNEKSCSFHS KEELRDGQGE RLSAGYSPSY DKDKSVLAFR GIPISELKNH GILQALTTEA
61 YEWEPRVVST EVVRAQEEWE AVDTIQPETG SQASSEQPGQ LISFGEALQH FQTVDLSPFK
121 KRIQPTIRRT GLAALRHYLF GPPKLHQRLR EERDLVLTIA Q
C. Nucleotide and amino acid composite sequence (SEQ ID NO:12)
Clone number: PP4068
Start codon: 120 ATG; stop codon: 605 TGA
Protein molecular weight: 18261
1 GG AAG GCA AAG GTA GAG CAA CTG GAT CTC TGG CTC TCC ACA TAG CTT 47
48 CTG ATC TCA GAC CTT ACT AAA ATG CTT TCT GGG CCC AAG GAC AAA GCT 95
96 CAC ATG AAC AAA TGA TTT TGA GTC ATG AAT GAA AAA TCT TGC TCT TTC 143
1 Met Asn Glu Lys Ser Cys Ser Phe 8
144 CAT AGT AAA GAA GAA TTA AGA GAT GGA CAG GGT GAA AGA TTG TCT GCT 191
9 His Ser Lys Glu Glu Leu Arg Asp Gly Gln Gly Glu Arg Leu Ser Ala 24
192 GGA TAT TCT CCA TCA TAT GAC AAG GAC AAG AGT GTT CTG GCT TTC AGA 239
25 Gly Tyr Ser Pro Ser Tyr Asp Lys Asp Lys Ser Val Leu Ala Phe Arg 40
240 GGA ATC CCT ATC TCA GAG TTG AAG AAC CAT GGC ATT CTC CAG GCT CTG 287
41 Gly Ile Pro Ile Ser Glu Leu Lys Asn His Gly Ile Leu Gln Ala Leu 56
288 ACC ACA GAA GCT TAT GAA TGG GAG CCA CGT GTT GTG AGT ACA GAG GTG 335
57 Thr Thr Glu Ala Tyr Glu Trp Glu Pro Arg Val Val Ser Thr Glu Val 72
336 GTC AGA GCC CAA GAA GAA TGG GAA GCT GTG GAC ACC ATC CAG CCA GAG 383
73 Val Arg Ala Gln Glu Glu Trp Glu Ala Val Asp Thr Ile Gln Pro Glu 88
384 ACA GGG AGC CAA GCT AGC TCA GAG CAG CCT GGG CAG CTA ATC TCC TTC 431
89 Thr Gly Ser Gln Ala Ser Ser Glu Gln Pro Gly Gln Leu Ile Ser Phe 104
432 GGT GAG GCC CTG CAG CAC TTC CAG ACT GTG GAC CTT TCC CCC TTC AAG 479
105 Gly Glu Ala Leu Gln His Phe Gln Thr Val Asp Leu Ser Pro Phe Lys 120
480 AAA AGA ATC CAG CCA ACT ATT CGA AGG ACT GGG CTC GCC GCC CTC CGA 527
121 Lys Arg Ile Gln Pro Thr Ile Arg Arg Thr Gly Leu Ala Ala Leu Arg 136
528 CAC TAC CTC TTC GGG CCT CCA AAG CTC CAC CAG CGC CTT CGG GAA GAA 575
137 His Tyr Leu Phe Gly Pro Pro Lys Leu His Gln Arg Leu Arg Glu Glu 152
576 AGG GAC TTG GTC CTG ACC ATT GCT CAG TGA GCG AAT CCA GCC ACA GAC 623
153 Arg Asp Leu Val Leu Thr Ile Ala Gln *** 162
624 CTG AGA GGC GCA GGC TTC CTT GCC CTC CTG CAT CTG CTC TAC CTG GTG 671
672 ATG GAC TCA AAG ACC TTG CCG ATG GCG CAG GAG ATT TTC CGC CTG TCT 719
720 CGT CAC CAC ATC CAG CAA TTC CCT TTC TGT TTG ATG TCC GTG AAC ATC 767
768 ACC CAC ATT GCC ATC CAG GCC TTG AGA GAG GAG TGT CTC TCC AGA GAG 815
816 TGT AAT CGG CAG CAG AAG GTC ATC CCC GTG GTG AAC AGC TTC TAT GCC 863
864 GCC ACA TTC CTC CAC CTC GCA CAT GTC TGG AGG ACA CAG CGG AAG ACC 911
912 ATC TCA GAC TCG GGC TTT GTC CTC AAA GAG TTG GAA GTA TTG GCC AAG 959
960 AAG AGC CCA CGG CGG CTG CTC AAG ACC CTG GAG CTG TAC TTG GCC AGG 1007
1008 GTG TCA AAG GGA CAG GCC TCC TTG TTG GGA GCA CAG AAG TGC TAT GGG 1055
1056 CCA GAA GCC CCT CCC TTC AAG GAT CTC ACC TTC ACA GGT GAG AGT GAC 1103
1104 CTG CAG TCT CAC TCA TCC GAA GGC GTA TGG CTG ATC TGA CCT CCG AGA 1151
1152 TGA ATG GAG GCT TAA AGG CTG AGC TGC AGG GGC TTT CAG GGG GTC AGT 1199
1200 GGA GCC ATG TCA GGA GCC TGG CCA GGC CGC ACC CCT TGC TGT CTC AGC 1247
1248 AGA TGG GAT ATA GGA AGC TCC TGG GCT TAG CTG TGG GAA GCC AAG TAC 1295
1296 CCT CAC CGG CAT GGG ACA TGA GGG GCA GCT AGA CTT CAC CCC CTT CCC 1343
1344 GCA GAC CTG CCT CCA GAG CAA GGA GAA TTC TGC CTT AAT CTG TTG GGC 1391
1392 TCC AGT CTC CGG GTT GAA TTC CAG TGT ATC CCA CTG GGA GTG AAT GGA 1439
1440 TCA TGA GGT GGG ATG GCC CCA TCT GGA TGT TCT GCA GAT CCC CAC ATG 1487
1488 GGA GGA GAT TCC CAA GTA GAG GCA GCC AGA ATT CTG ACT CCC TGG CAG 1535
1536 CAG CAG GGC TTT CAG GCA CCA GTG TCT GTG TTG TTA GAA CTC AGA GGA 1583
1584 AGA GGC AGG GGC AGC AGC CGA CAG GGG GCA GCA TGA CCC AAG CAA GGG 1631
1632 GCC TGA GGC CTC AGT GGG GGT AGG GCA AGG AGG TAC CTC ACA GGT GGG 1679
1680 TGT GAG GCC CCC TCT GGA GTT TCT GGC CAT TCA CTT ACC CAC TCT CTT 1727
1728 CTC CCC CTG ACC CCC GCT CCA TTG TTT ATG ATG GAA AAA CGG ACA TTT 1775
1776 GGC TAG GTG TCT CTC ACG GCT GCT CCA TCC TAG CCC CCA CAA GCT GGG 1823
1824 GCT TCC TCC TGT AGA TGC TGT GCA TGC CCC ATG ATG AGT TTC TGG CCT 1871
1872 AAT TGA GGG AAG GAG GAA ATT CAT ACC AGC AGT TTT CAA ATA AAA GAA 1919
1920 TTG TTC TAA TTA AAA AAA AAA AAA AAA AAA 1949
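The composite listing pairs each codon with its amino acid. As an illustration of that pairing, the following Python sketch translates a codon string with the standard genetic code, stopping at the first stop codon. It assumes the standard code applies to these clones; the names `CODON_TABLE` and `translate` are hypothetical, and the sample input is the first eight codons of the PP4068 open reading frame:

```python
from itertools import product

# Standard genetic code, codons enumerated with bases in the order T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(codons_text):
    """Translate a DNA string (whitespace ignored) up to the first stop codon."""
    bases = "".join(codons_text.split()).upper()
    protein = []
    for i in range(0, len(bases) - len(bases) % 3, 3):
        aa = CODON_TABLE[bases[i:i + 3]]
        if aa == "*":  # stop codon: end of the open reading frame
            break
        protein.append(aa)
    return "".join(protein)


print(translate("ATG AAT GAA AAA TCT TGC TCT TTC"))  # MNEKSCSF
```

The result matches the first eight residues of SEQ ID NO:11. Running the same function over a full open reading frame is one way to cross-check a composite listing against its one-letter amino acid sequence.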
5. PP4135 protein
A. Nucleotide sequence (SEQ ID NO:13), length: 1585 bp
1 CATACTTGGT CTATCTTCTA CTTTGTCTTC TCTTAGGACC CAAGGTCTCT
51 TAGCACAAAC ACTGCCTCTT GAGTTCCCAG TGCATCTGTA TCAATCTGGA
101 GATTACGTCC TCATCAAAAG CTGGAAAGAA GAAAAACTCG AACCAACCTG
151 GGAGACCTTA TCTAGTGCCC CTAACCACTG AGACAGCAGT CTGGACCGTT
201 AAGAAAGGGT AGACCCATCA CACTCAGGTG AAAAAGGCAT GACCCCCTTT
251 GGAGGCATAG GTTGTCACTC CCGGGCCAAC ACCTTCCAAA CTAATATTCA
301 AAAAAACTTA ACCTGTCTAA TTTGCTTCCT CTTTCTTTCG TTAGCTACCC
351 AGGAACATTT TATTTTATCA ATGTAACCTG ATCATCGTTT CCTCAAACAA
401 TTACATTTGA TGCTTGTCTT GTTATGCCCT GTGGGGACCT ACAAACCCAA
451 AGGCAACTAG CCTCTTCAGG CAATAACATA CAGGACATAG GAATGGGCAA
501 AGACTTCATG ACTAAAACAC CAAAAGCAAC AGCAACAAAA GCCAAAATTG
551 ACAAATGGGA TCTAATTAAA CTAAAGAGAT TCTGCGCAGC AAAGGAAACT
601 ATCATCAGAG TGAACAGGCA ACCTACAGAA TGGGAGAAAA TTTTTGCAAT
651 CTATCCATTT GACAAAGGGC TAATATCCAG AATCTATAAA GAACTTAAAT
701 TTACAAGAAA AAAACAACCC CATCAAAAAG CGGGCGAAGG ATATGAACAG
751 ACACTTCTCC AAAGAAGACA TTTATGCAGC CAACAAACAT GAAAAATAGC
801 TGATCATCAC TGGTCATTAT AGAAATGCAA ATCAAAACCA CAGTAAGATA
851 CTAACTCATG CCAGTTAGAA TGGCGATCAT TAAAAAGTCA GGAAACAACA
901 GATGCTGGAG AGGATGTGCA GAAATAGGAA TGCTTTTTAT ACTGTTGGTG
951 AAAGTGTAAA TTAGTTCAAC CATTGTGGAA GACAGTGTGG CGATTCCTCA
1001 AGGATCTATA GAACCAGAAC TACCACATGA CCCAGCAATC CCATTACTGG
1051 GTATATACCC AAAGGATTAT ACATCATTCT GCTATAAAGA CACATGCACA
1101 CGTACGTTTA TTGCAGCACT ATTTACAATA GCAAAGACTT GGAATCAACC
1151 CAAATCCCCG TCAATGATAG ACTGGATAAA GACAATGTGG CACTTATACA
1201 CCATGGAATA CTATGCAGCC AAAAAAAGGA TGAATTCATG TCCTTTGCAG
1251 CGACATGGAT GAAGCTGAAA ACCATCATTC TCAGCAAACT AACACGAGAA
1301 CAGAAAACCA AACACTACAT GTTCTCACTC ATAAGTGGGA GTTGAACAAT
1351 GAGAACACAT GGACACACGG AGGGGAACAC CACACACCAG GGCCTGTCGG
1401 CGGGTGGGGA GGCTAGGGGA GGGATAGCAT TAGGAGAAAT ACCTAATGTA
1451 GATGACAGGT TGATGGGTCT GACAAACCAC CATGACACGT GTATACCTAT
1501 GTAATGCAAC TGCACATTTT GCTCATGTAC CCCAGAACTT AAACTATAAT
1551 TAAAAACACA TAATTTCAAA AAAAAAAAAA AAAAA
B. Amino acid sequence (SEQ ID NO:14), length: 122 amino acids
1 MPCGDLQTQR QLASSGNNIQ DIGMGKDFMT KTPKATATKA KIDKWDLIKL KRFCAAKETI
61 IRVNRQPTEW EKIFAIYPFD KGLISRIYKE LKFTRKKQPH QKAGEGYEQT LLQRRHLCSQ
121 QT
C. Nucleotide and amino acid composite sequence (SEQ ID NO:15)
Clone number: PP4135
Start codon: 424 ATG; stop codon: 792 TGA
Protein molecular weight: 14172
1 CAT ACT TGG TCT ATC TTC TAC TTT GTC TTC TCT TAG GAC CCA AGG TCT 48
49 CTT AGC ACA AAC ACT GCC TCT TGA GTT CCC AGT GCA TCT GTA TCA ATC 96
97 TGG AGA TTA CGT CCT CAT CAA AAG CTG GAA AGA AGA AAA ACT CGA ACC 144
145 AAC CTG GGA GAC CTT ATC TAG TGC CCC TAA CCA CTG AGA CAG CAG TCT 192
193 GGA CCG TTA AGA AAG GGT AGA CCC ATC ACA CTC AGG TGA AAA AGG CAT 240
241 GAC CCC CTT TGG AGG CAT AGG TTG TCA CTC CCG GGC CAA CAC CTT CCA 288
289 AAC TAA TAT TCA AAA AAA CTT AAC CTG TCT AAT TTG CTT CCT CTT TCT 336
337 TTC GTT AGC TAC CCA GGA ACA TTT TAT TTT ATC AAT GTA ACC TGA TCA 384
385 TCG TTT CCT CAA ACA ATT ACA TTT GAT GCT TGT CTT GTT ATG CCC TGT 432
1 Met Pro Cys 3
433 GGG GAC CTA CAA ACC CAA AGG CAA CTA GCC TCT TCA GGC AAT AAC ATA 480
4 Gly Asp Leu Gln Thr Gln Arg Gln Leu Ala Ser Ser Gly Asn Asn Ile 19
481 CAG GAC ATA GGA ATG GGC AAA GAC TTC ATG ACT AAA ACA CCA AAA GCA 528
20 Gln Asp Ile Gly Met Gly Lys Asp Phe Met Thr Lys Thr Pro Lys Ala 35
529 ACA GCA ACA AAA GCC AAA ATT GAC AAA TGG GAT CTA ATT AAA CTA AAG 576
36 Thr Ala Thr Lys Ala Lys Ile Asp Lys Trp Asp Leu Ile Lys Leu Lys 51
577 AGA TTC TGC GCA GCA AAG GAA ACT ATC ATC AGA GTG AAC AGG CAA CCT 624
52 Arg Phe Cys Ala Ala Lys Glu Thr Ile Ile Arg Val Asn Arg Gln Pro 67
625 ACA GAA TGG GAG AAA ATT TTT GCA ATC TAT CCA TTT GAC AAA GGG CTA 672
68 Thr Glu Trp Glu Lys Ile Phe Ala Ile Tyr Pro Phe Asp Lys Gly Leu 83
673 ATA TCC AGA ATC TAT AAA GAA CTT AAA TTT ACA AGA AAA AAA CAA CCC 720
84 Ile Ser Arg Ile Tyr Lys Glu Leu Lys Phe Thr Arg Lys Lys Gln Pro 99
721 CAT CAA AAA GCG GGC GAA GGA TAT GAA CAG ACA CTT CTC CAA AGA AGA 768
100 His Gln Lys Ala Gly Glu Gly Tyr Glu Gln Thr Leu Leu Gln Arg Arg 115
769 CAT TTA TGC AGC CAA CAA ACA TGA AAA ATA GCT GAT CAT CAC TGG TCA 816
116 His Leu Cys Ser Gln Gln Thr *** 123
817 TTA TAG AAA TGC AAA TCA AAA CCA CAG TAA GAT ACT AAC TCA TGC CAG 864
865 TTA GAA TGG CGA TCA TTA AAA AGT CAG GAA ACA ACA GAT GCT GGA GAG 912
913 GAT GTG CAG AAA TAG GAA TGC TTT TTA TAC TGT TGG TGA AAG TGT AAA 960
961 TTA GTT CAA CCA TTG TGG AAG ACA GTG TGG CGA TTC CTC AAG GAT CTA 1008
1009 TAG AAC CAG AAC TAC CAC ATG ACC CAG CAA TCC CAT TAC TGG GTA TAT 1056
1057 ACC CAA AGG ATT ATA CAT CAT TCT GCT ATA AAG ACA CAT GCA CAC GTA 1104
1105 CGT TTA TTG CAG CAC TAT TTA CAA TAG CAA AGA CTT GGA ATC AAC CCA 1152
1153 AAT CCC CGT CAA TGA TAG ACT GGA TAA AGA CAA TGT GGC ACT TAT ACA 1200
1201 CCA TGG AAT ACT ATG CAG CCA AAA AAA GGA TGA ATT CAT GTC CTT TGC 1248
1249 AGC GAC ATG GAT GAA GCT GAA AAC CAT CAT TCT CAG CAA ACT AAC ACG 1296
1297 AGA ACA GAA AAC CAA ACA CTA CAT GTT CTC ACT CAT AAG TGG GAG TTG 1344
1345 AAC AAT GAG AAC ACA TGG ACA CAC GGA GGG GAA CAC CAC ACA CCA GGG 1392
1393 CCT GTC GGC GGG TGG GGA GGC TAG GGG AGG GAT AGC ATT AGG AGA AAT 1440
1441 ACC TAA TGT AGA TGA CAG GTT GAT GGG TCT GAC AAA CCA CCA TGA CAC 1488
1489 GTG TAT ACC TAT GTA ATG CAA CTG CAC ATT TTG CTC ATG TAC CCC AGA 1536
1537 ACT TAA ACT ATA ATT AAA AAC ACA TTT TTT CAA AAA AAA AAA AAA AAA 1584
1585 A 1585
6. PP4189 protein
A. Nucleotide sequence (SEQ ID NO:16), length: 1762 bp
1 GAAAAAGGTA GAATTCTTAT GAATGTTTCA TGATCATGTA TGTGTGGTTA
51 ACGCAGATAC ATTTGGAAGT CGGTTATCAA GGTTGCTCTC CAGCTCAAGG
101 CCTTTGCCCT TTTTGTCAAA ACCAAAGAAG TTCCAACAAA AAGGAGTTTT
151 GAATGTAAAG AAAAATTGTG GAAATGCTGT CAGCAGCTAT TCACAGACCA
201 AACCAGCATC CATAGACATG TGGCAACACA ACATGCTGAT GAAATTTATC
251 ACCAGACAGC TTCTATTTTA AAGCAACTGG CTGTGACATT GAGCACCTCA
301 AAGAGTCTTT CGTCTGCAGA TGAAAAGAAC CCTTTAAAAG AGTGCCTTCC
351 ACATAGCCAT GACGTGTCTG CTTGGCTCCC TGATATAAGC TGCTTTAACC
401 CTGATGAGCT GATAAGTGGC CAGGGCAGTG AAGAAGGGGA GGTGCTCCTT
451 TATTACTGCT ACCATGACCT GGAGGATCCC CAATGGATCT GTGCCTGGCA
501 GACAGCTCTG TGTCAGCACC TGCACCTCAC AGGCAAGGAA TCCATTTATC
551 CCCAGGTGAA TTTCATAAAG AAGTAGAAAA GTTTTTATCT CAGGCAAATC
601 AAGAACAAAG TGATACTATC CTTCTTGATT GCAGAAACTT CTATGAAAGC
651 AAAATAGGAC GATTCCAAGG CTGCTTAGCC CCAGACATCA GGAAATTCAG
701 TTACTTCCCT AGCTACGTTG ACAAAAATCT AGAACTTTTC AGAGAGAAGA
751 GAGTGCTGAT GTACTGTACC GGGGGCATCC GCTGTGAGCG GGGTTCAGCC
801 TACCTCAAAG CCAAGGGAGT GTGCAAGGAG GTGTTCCAGC TCAAGGGTGG
851 CATCCACAAG TACCTGGAAG AGTTTCCTGA TGGCTTTTAC AAAGGGAAGT
901 TGTTTGTTTT TGATGAACGC TATGCTCTGT CCTACAACAG TGATGTGGTG
951 TCAGGTAGGT CAGCACAGGC TCAGAGCCCA AACTGAAATG AAGCACATTG
1001 TCAGTTCACC ATTCTAGAAA AATGACACAG GGAAGACAGG CCAGTGCTCA
1051 TTACTGAGCA CTGAATAAGC AGGGAAAATA AGTACATTGT GCCACCATTT
1101 TCCCAGCTGT GGAGCTGAGA GAACCCTAGC CCAGGAGTCA GGAGGCCTGG
1151 GTTGGGATCC TGGCTTCACC ATTGCTAGCT GGACAAGCCC ATTAACATGG
1201 GGATCATCTC ACCTGCCCTG CCTGCCTGTC TACCTGCCAA GAGCTGTACT
1251 ACTGGGCTAA TTCAGGGCTC TTAACCTGGA ATTGGTACAT AGATTTCAGG
1301 GATTCTGTGA ATTTGGATGG AAAAATAATT GTATCTTTGT TTTCAATAAC
1351 ACCTCACTAA AATGAAGCAT TTCCTTTAGT TATGAATGTA GGCAACAAAG
1401 TACCAGTTGT ATTAATGTAC CTGTGACTTT GTCTTCAGTA GGATTCACAA
1451 TACTTTCATA TCATGTTCTA GTTGCCTCAG ATATCTCAAA ATAGTATTTA
1501 TACTCATCAC TGCTTCAAAA TGAAAATAGT TATTAGGCCC ACCACTAAGA
1551 GTTGATATAT AATGTGTTAA TAAATGGCAC GTCTTATTAT ATATTACAGA
1601 TTTTGAAAAA GAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAACAAAAAA
1651 AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA
1701 AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA
1751 AAAAAAAAAA AA
B. Amino acid sequence (SEQ ID NO:17), length: 167 amino acids
1 MDLCLADSSV SAPAPHRQGI HLSPGEFHKE VEKFLSQANQ EQSDTILLDC RNFYESKIGR
61 FQGCLAPDIR KFSYFPSYVD KNLELFREKR VLMYCTGGIR CERGSAYLKA KGVCKEVFQL
121 KGGIHKYLEE FPDGFYKGKL FVFDERYALS YNSDVVSGRS AQAQSPN
C. Nucleotide and amino acid composite sequence (SEQ ID NO:18)
Clone number: PP4198
Start codon: 483 ATG; stop codon: 986 TGA
Protein molecular weight: 19000
1 GA AAA AGG TAG AAT TCT TAT GAA TGT TTC ATG ATC ATG TAT GTG TGG 47
48 TTA ACG CAG ATA CAT TTG GAA GTC GGT TAT CAA GGT TGC TCT CCA GCT 95
96 CAA GGC CTT TGC CCT TTT TGT CAA AAC CAA AGA AGT TCC AAC AAA AAG 143
144 GAG TTT TGA ATG TAA AGA AAA ATT GTG GAA ATG CTG TCA GCA GCT ATT 191
192 CAC AGA CCA AAC CAG CAT CCA TAG ACA TGT GGC AAC ACA ACA TGC TGA 239
240 TGA AAT TTA TCA CCA GAC AGC TTC TAT TTT AAA GCA ACT GGC TGT GAC 287
288 ATT GAG CAC CTC AAA GAG TCT TTC GTC TGC AGA TGA AAA GAA CCC TTT 335
336 AAA AGA GTG CCT TCC ACA TAG CCA TGA CGT GTC TGC TTG GCT CCC TGA 383
384 TAT AAG CTG CTT TAA CCC TGA TGA GCT GAT AAG TGG CCA GGG CAG TGA 431
432 AGA AGG GGA GGT GCT CCT TTA TTA CTG CTA CCA TGA CCT GGA GGA TCC 479
480 CCA ATG GAT CTG TGC CTG GCA GAC AGC TCT GTG TCA GCA CCT GCA CCT 527
1 Met Asp Leu Cys Leu Ala Asp Ser Ser Val Ser Ala Pro Ala Pro 15
528 CAC AGG CAA GGA ATC CAT TTA TCC CCA GGT GAA TTT CAT AAA GAA GTA 575
16 His Arg Gln Gly Ile His Leu Ser Pro Gly Glu Phe His Lys Glu Val 31
576 GAA AAG TTT TTA TCT CAG GCA AAT CAA GAA CAA AGT GAT ACT ATC CTT 623
32 Glu Lys Phe Leu Ser Gln Ala Asn Gln Glu Gln Ser Asp Thr Ile Leu 47
624 CTT GAT TGC AGA AAC TTC TAT GAA AGC AAA ATA GGA CGA TTC CAA GGC 671
48 Leu Asp Cys Arg Asn Phe Tyr Glu Ser Lys Ile Gly Arg Phe Gln Gly 63
672 TGC TTA GCC CCA GAC ATC AGG AAA TTC AGT TAC TTC CCT AGC TAC GTT 719
64 Cys Leu Ala Pro Asp Ile Arg Lys Phe Ser Tyr Phe Pro Ser Tyr Val 79
720 GAC AAA AAT CTA GAA CTT TTC AGA GAG AAG AGA GTG CTG ATG TAC TGT 767
80 Asp Lys Asn Leu Glu Leu Phe Arg Glu Lys Arg Val Leu Met Tyr Cys 95
768 ACC GGG GGC ATC CGC TGT GAG CGG GGT TCA GCC TAC CTC AAA GCC AAG 815
96 Thr Gly Gly Ile Arg Cys Glu Arg Gly Ser Ala Tyr Leu Lys Ala Lys 111
816 GGA GTG TGC AAG GAG GTG TTC CAG CTC AAG GGT GGC ATC CAC AAG TAC 863
112 Gly Val Cys Lys Glu Val Phe Gln Leu Lys Gly Gly Ile His Lys Tyr 127
864 CTG GAA GAG TTT CCT GAT GGC TTT TAC AAA GGG AAG TTG TTT GTT TTT 911
128 Leu Glu Glu Phe Pro Asp Gly Phe Tyr Lys Gly Lys Leu Phe Val Phe 143
912 GAT GAA CGC TAT GCT CTG TCC TAC AAC AGT GAT GTG GTG TCA GGT AGG 959
144 Asp Glu Arg Tyr Ala Leu Ser Tyr Asn Ser Asp Val Val Ser Gly Arg 159
960 TCA GCA CAG GCT CAG AGC CCA AAC TGA AAT GAA GCA CAT TGT CAG TTC 1007
160 Ser Ala Gln Ala Gln Ser Pro Asn *** 168
1008 ACC ATT CTA GAA AAA TGA CAC AGG GAA GAC AGG CCA GTG CTC ATT ACT 1055
1056 GAG CAC TGA ATA AGC AGG GAA AAT AAG TAC ATT GTG CCA CCA TTT TCC 1103
1104 CAG CTG TGG AGC TGA GAG AAC CCT AGC CCA GGA GTC AGG AGG CCT GGG 1151
1152 TTG GGA TCC TGG CTT CAC CAT TGC TAG CTG GAC AAG CCC ATT AAC ATG 1199
1200 GGG ATC ATC TCA CCT GCC CTG CCT GCC TGT CTA CCT GCC AAG AGC TGT 1247
1248 ACT ACT GGG CTA ATT CAG GGC TCT TAA CCT GGA ATT GGT ACA TAG ATT 1295
1296 TCA GGG ATT CTG TGA ATT TGG ATG GAA AAA TAA TTG TAT CTT TGT TTT 1343
1344 CAA TAA CAC CTC ACT AAA ATG AAG CAT TTC CTT TAG TTA TGA ATG TAG 1391
1392 GCA ACA AAG TAC CAG TTG TAT TAA TGT ACC TGT GAC TTT GTC TTC AGT 1439
1440 AGG ATT CAC AAT ACT TTC ATA TCA TGT TCT AGT TGC CTC AGA TAT CTC 1487
1488 AAA ATA GTA TTT ATA CTC ATC ACT GCT TCA AAA TGA AAA TAG TTA TTA 1535
1536 GGC CCA CCA CTA AGA GTT GAT ATA TAA TGT GTT AAT AAA TGG CAC GTC 1583
1584 TTA TTA TAT ATT ACA GAT TTT GAA AAA GAA AAA AAA AAA AAA AAA AAA 1631
1632 AAA AAA AAA AAA CAA AAA AAA AAA AAA AAA AAA AAA AAA AAA AAA AAA 1679
1680 AAA AAA AAA AAA AAA AAA AAA AAA AAA AAA AAA AAA AAA AAA AAA AAA 1727
1728 AAA AAA AAA AAA AAA AAA AAA AAA AAA AAA AAA AA 1762
7. PP2500 protein
A. Nucleotide sequence (SEQ ID NO:19), length: 2153 bp
1 CTGGGACGGG GGAAAGGAGA CGCTTCTTCC TCTTGCTGCT CTTCTCGTTC
51 CCGAGATCAG CGGCGGCGGT GACCGCGAGT GGGTCGGCAC CGTCTCCGGC
101 TCCGGGTGCG AACAATGCTG ACTGATAGCG GAGGCGGCGG CACCTCCTTT
151 GAGGAGGACC TGGACTCTGT GGCTCCGCGA TCCGCCCCAG CTGGGGCCTC
201 GGAGCCGCCT CCGCCGGGAG GGGTCGGTCT GGGGATCCGC ACCGTGAGGC
251 TCTTTGGGGA GGCCGGGCCA GCGTCGGGAG TCGGCAGCAG CGGCGGCGGC
301 GGCAGCGGCA GCGGTACGGG CGGAGGGGAC GCGGCGCTGG ATTTCAAGTT
351 GGCGGCTGCC GTGCTGAGGA CCGGGGGTGG AGGTGGTGCC TCTGGCAGTG
401 ACGAGGACGA AGTGTCCGAG GTTGAATCAT TTATTTTGGA CCAAGAAGAT
451 CTGGATAACC CAGTGCTTAA AACAACATCA GAGATATTCT TATCAAGTAC
501 TGCAGAAGGA GCAGACTTAC GCACTGTGGA TCCAGAGACA CAGGCACGAC
551 TAGAAGCATT GCTAGAAGCA GCAGGAATTG GCAAATTGTC AACTGCTGAT
601 GGTAAAGCTT TTGCAGATCC TGAGGTACTC CGGAGACTGA CATCCTCAGT
651 TAGTTGTGCA CTGGATGAAG CTGCTGCTGC ACTGACACGG ATGAAAGCAG
701 AAAACAGCCA CAATGCAGGA CAAGTGGACA CTCGCAGTCT AGCAGAAGCT
751 TGTTCAGATG GGGATGTTAA TGCTGTTCGT AAATTGCTAG ATGAAGGCAG
801 AAGTGTAAAT GAACATACAG AAGAAGGAGA AAGCCTGCTG TGTTTGGCTT
851 GTTCAGCAGG GTATTATGAA TTAGCACAAG TATTGCTTGC TATGCATGCT
901 AATGTTGAAG ATCGAGGGAA TAAAGGAGAC ATAACTCCCC TGATGGCAGC
951 TTCCAGTGGA GGTTACTTAG ATATTGTGAA ATTATTACTT CTTCATGATG
1001 CTGATGTCAA CTCCCAGTCT GCAACAGGAA ACACTGCGCT AACTTATGCA
1051 TGTGCTGGAG GATTTGTTGA CATTGTTAAA GTGCTCCTTA ATGAAGGTGC
1101 AAATATAGAA GATCATAATG AAAATGGACA TACTCCCTTA ATGGAAGCAG
1151 CCAGTGCAGG TCATGTGGAA GTTGCAAGAG TTCTTTTAGA TCATGGTGCA
1201 GGCATCAACA CTCATTCTAA TGAATTCAAA GAAAGTGCTC TAACACTTGC
1251 TTGCTACAAA GGCCATTTGG ATATGGTTCG CTTTCTACTT GAAGCTGGTG
1301 GAGATCAAGA GCACAAAACA GATGAGATGC ACACTGCCTT AATGGAGGCC
1351 TGCATGGATG GACATGTAGA GGTGGCACGT TTGCTTTTGG ATAGTGGTGC
1401 TCAAGTGAAC ATGCCTGCAG ATTCATTTGA ATCTCCATTG ACGCTAGCTG
1451 CCTGTGGAGG ACATGTTGAA TTGGCAGCTC TACTTATTGA AAGGGGAGCA
1501 AATCTTGAAG AAGTTAATGA TGAAGGATAC AGTCCCTTGA TGGAAGCTGC
1551 CCGGGAAGGA CATGAAGAAA TGGTGGCACT ACTCTTAGCA CAAGGAGCAA
1601 ATATAAATGC CCAGACAGAA GAAACTCAAG AAACTGCTCT TACCTTGGCT
1651 TGCTGTGGAG GATTTTCTGA AGTTGCAGAC TTTCTTATTA AGGCAGGGGC
1701 TGATATAGAA CTTGGCTGCT CCACACCTCT GATGGAGGCA TCTCAGGAGG
1751 GACACCTGGA ATTGGTTAAA TATTTGCTGG CTTCTGGCGC TAATGTGCAT
1801 GCTACAACAG CAACAGGAGA CACAGCCTTA ACCTATGCTT GTGAAAATGG
1851 ACATACGGAT GTTGCAGATG TTTTACTTCA AGCAGGGGCT GATTTAGACA
1901 AGCAGGAGGA CATGAAGACT ATTTTGGAGG GCATAGATCC GGCCAAGCAT
1951 CAGGTGAGGG TGGCCTTTGA TGCTTGTAAG CTACTACGTA AAGAATAGAT
2001 GTTGTAGGTA ACCAGAACTC TGGATATCTG AATTCCAGCC AAGAAGTTCC
2051 AGGACCCTGC TGGGTGACAA AGGAAATCCT CTTCAATTGA AAAAGATTAT
2101 GAAGTCCCAA TAAAAAGAGA TTTGTATTGC TAAAAAAAAA AAAAAAAAAA
2151 AAA
B. Amino acid sequence (SEQ ID NO:20), length: 627 amino acids
1 MLTDSGGGGT SFEEDLDSVA PRSAPAGASE PPPPGGVGLG IRTVRLFGEA GPASGVGSSG
61 GGGSGSGTGG GDAALDFKLA AAVLRTGGGG GASGSDEDEV SEVESFILDQ EDLDNPVLKT
121 TSEIFLSSTA EGADLRTVDP ETQARLEALL EAAGIGKLST ADGKAFADPE VLRRLTSSVS
181 CALDEAAAAL TRMKAENSHN AGQVDTRSLA EACSDGDVNA VRKLLDEGRS VNEHTEEGES
241 LLCLACSAGY YELAQVLLAM HANVEDRGNK GDITPLMAAS SGGYLDIVKL LLLHDADVNS
301 QSATGNTALT YACAGGFVDI VKVLLNEGAN IEDHNENGHT PLMEAASAGH VEVARVLLDH
361 GAGINTHSNE FKESALTLAC YKGHLDMVRF LLEAGADQEH KTDEMHTALM EACMDGHVEV
421 ARLLLDSGAQ VNMPADSFES PLTLAACGGH VELAALLIER GANLEEVNDE GYTPLMEAAR
481 EGHEEMVALL LAQGANINAQ TEETQETALT LACCGGFSEV ADFLIKAGAD IELGCSTPLM
541 EASQEGHLEL VKYLLASGAN VHATTATGDT ALTYACENGH TDVADVLLQA GADLDKQEDM
601 KTILEGIDPA KHQVRVAFDA CKLLRKE
C. Nucleotide and amino acid composite sequence (SEQ ID NO:21)
Clone number: PP2500
Start codon: 115 ATG; stop codon: 1998 TAG
Protein molecular weight: 64908
1 CTG GGA CGG GGG AAA GGA GAC GCT TCT TCC TCT TGC TGC TCT TCT CGT 48
49 TCC CGA GAT CAG CGG CGG CGG TGA CCG CGA GTG GGT CGG CAC CGT CTC 96
97 CGG CTC CGG GTG CGA ACA ATG CTG ACT GAT AGC GGA GGC GGC GGC ACC 144
1 Met Leu Thr Asp Ser Gly Gly Gly Gly Thr 10
145 TCC TTT GAG GAG GAC CTG GAC TCT GTG GCT CCG CGA TCC GCC CCA GCT 192
11 Ser Phe Glu Glu Asp Leu Asp Ser Val Ala Pro Arg Ser Ala Pro Ala 26
193 GGG GCC TCG GAG CCG CCT CCG CCG GGA GGG GTC GGT CTG GGG ATC CGC 240
27 Gly Ala Ser Glu Pro Pro Pro Pro Gly Gly Val Gly Leu Gly Ile Arg 42
241 ACC GTG AGG CTC TTT GGG GAG GCC GGG CCA GCG TCG GGA GTC GGC AGC 288
43 Thr Val Arg Leu Phe Gly Glu Ala Gly Pro Ala Ser Gly Val Gly Ser 58
289 AGC GGC GGC GGC GGC AGC GGC AGC GGT ACG GGC GGA GGG GAC GCG GCG 336
59 Ser Gly Gly Gly Gly Ser Gly Ser Gly Thr Gly Gly Gly Asp Ala Ala 74
337 CTG GAT TTC AAG TTG GCG GCT GCC GTG CTG AGG ACC GGG GGT GGA GGT 384
75 Leu Asp Phe Lys Leu Ala Ala Ala Val Leu Arg Thr Gly Gly Gly Gly 90
385 GGT GCC TCT GGC AGT GAC GAG GAC GAA GTG TCC GAG GTT GAA TCA TTT 432
91 Gly Ala Ser Gly Ser Asp Glu Asp Glu Val Ser Glu Val Glu Ser Phe 106
433 ATT TTG GAC CAA GAA GAT CTG GAT AAC CCA GTG CTT AAA ACA ACA TCA 480
107 Ile Leu Asp Gln Glu Asp Leu Asp Asn Pro Val Leu Lys Thr Thr Ser 122
481 GAG ATA TTC TTA TCA AGT ACT GCA GAA GGA GCA GAC TTA CGC ACT GTG 528
123 Glu Ile Phe Leu Ser Ser Thr Ala Glu Gly Ala Asp Leu Arg Thr Val 138
529 GAT CCA GAG ACA CAG GCA CGA CTA GAA GCA TTG CTA GAA GCA GCA GGA 576
139 Asp Pro Glu Thr Gln Ala Arg Leu Glu Ala Leu Leu Glu Ala Ala Gly 154
577 ATT GGC AAA TTG TCA ACT GCT GAT GGT AAA GCT TTT GCA GAT CCT GAG 624
155 Ile Gly Lys Leu Ser Thr Ala Asp Gly Lys Ala Phe Ala Asp Pro Glu 170
625 GTA CTC CGG AGA CTG ACA TCC TCA GTT AGT TGT GCA CTG GAT GAA GCT 672
171 Val Leu Arg Arg Leu Thr Ser Ser Val Ser Cys Ala Leu Asp Glu Ala 186
673 GCT GCT GCA CTG ACA CGG ATG AAA GCA GAA AAC AGC CAC AAT GCA GGA 720
187 Ala Ala Ala Leu Thr Arg Met Lys Ala Glu Asn Ser His Asn Ala Gly 202
721 CAA GTG GAC ACT CGC AGT CTA GCA GAA GCT TGT TCA GAT GGG GAT GTT 768
203 Gln Val Asp Thr Arg Ser Leu Ala Glu Ala Cys Ser Asp Gly Asp Val 218
769 AAT GCT GTT CGT AAA TTG CTA GAT GAA GGC AGA AGT GTA AAT GAA CAT 816
219 Asn Ala Val Arg Lys Leu Leu Asp Glu Gly Arg Ser Val Asn Glu His 234
817 ACA GAA GAA GGA GAA AGC CTG CTG TGT TTG GCT TGT TCA GCA GGG TAT 864
235 Thr Glu Glu Gly Glu Ser Leu Leu Cys Leu Ala Cys Ser Ala Gly Tyr 250
865 TAT GAA TTA GCA CAA GTA TTG CTT GCT ATG CAT GCT AAT GTT GAA GAT 912
251 Tyr Glu Leu Ala Gln Val Leu Leu Ala Met His Ala Asn Val Glu Asp 266
913 CGA GGG AAT AAA GGA GAC ATA ACT CCC CTG ATG GCA GCT TCC AGT GGA 960
267 Arg Gly Asn Lys Gly Asp Ile Thr Pro Leu Met Ala Ala Ser Ser Gly 282
961 GGT TAC TTA GAT ATT GTG AAA TTA TTA CTT CTT CAT GAT GCT GAT GTC 1008
283 Gly Tyr Leu Asp Ile Val Lys Leu Leu Leu Leu His Asp Ala Asp Val 298
1009 AAC TCC CAG TCT GCA ACA GGA AAC ACT GCG CTA ACT TAT GCA TGT GCT 1056
299 Asn Ser Gln Ser Ala Thr Gly Asn Thr Ala Leu Thr Tyr Ala Cys Ala 314
1057 GGA GGA TTT GTT GAC ATT GTT AAA GTG CTC CTT AAT GAA GGT GCA AAT 1104
315 Gly Gly Phe Val Asp Ile Val Lys Val Leu Leu Asn Glu Gly Ala Asn 330
1105 ATA GAA GAT CAT AAT GAA AAT GGA CAT ACT CCC TTA ATG GAA GCA GCC 1152
331 Ile Glu Asp His Asn Glu Asn Gly His Thr Pro Leu Met Glu Ala Ala 346
1153 AGT GCA GGT CAT GTG GAA GTT GCA AGA GTT CTT TTA GAT CAT GGT GCA 1200
347 Ser Ala Gly His Val Glu Val Ala Arg Val Leu Leu Asp His Gly Ala 362
1201 GGC ATC AAC ACT CAT TCT AAT GAA TTC AAA GAA AGT GCT CTA ACA CTT 1248
363 Gly Ile Asn Thr His Ser Asn Glu Phe Lys Glu Ser Ala Leu Thr Leu 378
1249 GCT TGC TAC AAA GGC CAT TTG GAT ATG GTT CGC TTT CTA CTT GAA GCT 1296
379 Ala Cys Tyr Lys Gly His Leu Asp Met Val Arg Phe Leu Leu Glu Ala 394
1297 GGT GCA GAT CAA GAG CAC AAA ACA GAT GAG ATG CAC ACT GCC TTA ATG 1344
395 Gly Ala Asp Gln Glu His Lys Thr Asp Glu Met His Thr Ala Leu Met 410
1345 GAG GCC TGC ATG GAT GGA CAT GTA GAG GTG GCA CGT TTG CTT TTG GAT 1392
411 Glu Ala Cys Met Asp Gly His Val Glu Val Ala Arg Leu Leu Leu Asp 426
1393 AGT GGT GCT CAA GTG AAC ATG CCT GCA GAT TCA TTT GAA TCT CCA TTG 1440
427 Ser Gly Ala Gln Val Asn Met Pro Ala Asp Ser Phe Glu Ser Pro Leu 442
1441 ACG CTA GCT GCC TGT GGA GGA CAT GTT GAA TTG GCA GCT CTA CTT ATT 1488
443 Thr Leu Ala Ala Cys Gly Gly His Val Glu Leu Ala Ala Leu Leu Ile 458
1489 GAA AGG GGA GCA AAT CTT GAA GAA GTT AAT GAT GAA GGA TAC ACT CCC 1536
459 Glu Arg Gly Ala Asn Leu Glu Glu Val Asn Asp Glu Gly Tyr Thr Pro 474
1537 TTG ATG GAA GCT GCC CGG GAA GGA CAT GAA GAA ATG GTG GCA CTA CTC 1584
475 Leu Met Glu Ala Ala Arg Glu Gly His Glu Glu Met Val Ala Leu Leu 490
1585 TTA GCA CAA GGA GCA AAT ATA AAT GCC CAG ACA GAA GAA ACT CAA GAA 1632
491 Leu Ala Gln Gly Ala Asn Ile Asn Ala Gln Thr Glu Glu Thr Gln Glu 506
1633 ACT GCT CTT ACT TTG GCT TGC TGT GGA GGA TTT TCT GAA GTT GCA GAC 1680
507 Thr Ala Leu Thr Leu Ala Cys Cys Gly Gly Phe Ser Glu Val Ala Asp 522
1681 TTT CTT ATT AAG GCA GGG GCT GAT ATA GAA CTT GGC TGC TCC ACA CCT 1728
523 Phe Leu Ile Lys Ala Gly Ala Asp Ile Glu Leu Gly Cys Ser Thr Pro 538
1729 CTG ATG GAG GCA TCT CAG GAG GGA CAC CTG GAA TTG GTT AAA TAT TTG 1776
539 Leu Met Glu Ala Ser Gln Glu Gly His Leu Glu Leu Val Lys Tyr Leu 554
1777 CTG GCT TCT GGC GCT AAT GTG CAT GCT ACA ACA GCA ACA GGA GAC ACA 1824
555 Leu Ala Ser Gly Ala Asn Val His Ala Thr Thr Ala Thr Gly Asp Thr 570
1825 GCC TTA ACC TAT GCT TGT GAA AAT GGA CAT ACG GAT GTT GCA GAT GTT 1872
571 Ala Leu Thr Tyr Ala Cys Glu Asn Gly His Thr Asp Val Ala Asp Val 586
1873 TTA CTT CAA GCA GGG GCT GAT TTA GAC AAG CAG GAG GAC ATG AAG ACT 1920
587 Leu Leu Gln Ala Gly Ala Asp Leu Asp Lys Gln Glu Asp Met Lys Thr 602
1921 ATT TTG GAG GGC ATA GAT CCG GCC AAG CAT CAG GTG AGG GTG GCC TTT 1968
603 Ile Leu Glu Gly Ile Asp Pro Ala Lys His Gln Val Arg Val Ala Phe 618
1969 GAT GCT TGT AAG CTA CTA CGT AAA GAA TAG ATG TTG TAG GTA ACC AGA 2016
619 Asp Ala Cys Lys Leu Leu Arg Lys Glu *** 628
2017 ACT CTG GAT ATC TGA ATT CCA GCC AAG AAG TTC CAG GAC CCT GCT GGG 2064
2065 TGA CAA AGG AAA TCC TCT TCA ATT GAA AAA GAT TAT GAA GTC CCA ATA 2112
2113 AAA AGA GAT TTG TAT TGC TAA AAA AAA AAA AAA AAA AAA AA 2153
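The codon rows and the three-letter amino acid rows in the listing above can be cross-checked mechanically. The following is a minimal sketch, assuming the standard genetic code; the function name `translate_codons` and the table names are hypothetical helpers introduced only for illustration and are not part of the patent.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order
# (the last base varies fastest). Assumption: the listing uses this code.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

# One-letter to three-letter codes; the stop codon is written "***" as in the listing.
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "***",
}


def translate_codons(row: str) -> str:
    """Translate a space-separated row of codons into three-letter residues."""
    return " ".join(THREE_LETTER[CODON_TABLE[codon]] for codon in row.split())


if __name__ == "__main__":
    # Nucleotides 1873-1920 of the listing.
    print(translate_codons(
        "TTA CTT CAA GCA GGG GCT GAT TTA GAC AAG CAG GAG GAC ATG AAG ACT"
    ))
    # Leu Leu Gln Ala Gly Ala Asp Leu Asp Lys Gln Glu Asp Met Lys Thr
```

Translating the row for nucleotides 1969-1998 in the same way ends in `***`, which corresponds to the TAG stop codon that closes the open reading frame.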
D. Blastp results
Query=PP2500 (627 amino acids)
>SP_IN:Q21920 Q21920 Caenorhabditis elegans R11A8.7 protein. 11/1999
Length=2606 amino acids
Score=184 bits (462), Expect=1e-45
Identities=115/306 (37%), Positives=168/306 (54%), Gaps=9/306 (2%)
Query:304 TGNTALTYACAGGFVDIVKVLLNEGANIEDHNENGHTPLMEAASAGHVEVARVLLDHGAG 363
T T LT ACA G DIV++LL EGANIE ++ G +PL+ AA+AGH V VLL + A
Sbjct:1220 TLETPLTIACANGHKDIVELLLKEGANIEHRDKKGFSPLIIAATAGHSSVVEVLLKNHAA 1279
Query:364 INTHSNEFKESALTLACYKGHLDMVRFLLEAGADQEHKTDEMHTALMEACMDGHVEVARL 423
I S+ K++AL+LAC G D+V LL GA++EH+ +T L A G++E+ +
Sbjct:1280 IEAQSDRTKDTALSLACSGGRKDVVELLLAHGANKEHRNVSDYTPLSLASSGGYIEIVNM 1339
Query:424 LLDSGAQVNMPADS--FESPLTLAACGGHVELAALLIERGANLE-EVNDEGYTPLMEAAR 480
LL +G+++N S SPL LA+ GH E +L+E+G+++ ++ T L A+
Sbjct:1340 LLTAGSEINSRTGSKLGISPLMLASMNGHREATRVLLEKGSDINAQIETNRNTALTLASF 1399
Query:481 EGHEEMVALLLAQGANINXXXXXXXXXXXXXXCCGGFSEVADFLIKAGAD-----IELGC 535
+G E+V LLLA AN+ GG+ +V +LI AGAD ++
Sbjct:1400 QGRTEVVKLLLAYNANVE-HRAKTGLTPLMECASGGYVDVGNLLIAAGADTNASPVQQTK 1458
Query:536 STPLMEASQEGHLELVKYLLASGANVHATTATGDTALTYACENGHTDVADVLLQAGADLD 595
T L ++++GH + V+ LL A V G TAL AC G+ A LL+GAD D
Sbjct:1459 DTALTISAEKGHEKFVRMLLNGDAAVDVRNKKGCTALWLACNGGYLSTAQALLEKGADPD 1518
Query:596 KQEDMK 601
++ K
Sbjct:1519 MFDNRK 1524
Score=183 bits (460), Expect=2e-45
Identities=129/397 (32%), Positives=214/397 (53%), Gaps=18/397 (4%)
Query:170 EVLRRLTSSVSCALDEAAAALTRMKAENSHNAGQVDTRSLAEACSDGDVNAVRKLLDEGR 229
E+++ +S + ++ +L ++ GQ L S+GD +
Sbjct:1159 EIQKKGKTSSGTLISTSSKSLMAKSVQSQQQQGQ-----LRRTHSEGD--GAERAKSRSN 1211
Query:230 SVNEHTEEG-ESLLCLACSAGYYELAQVLLAMHANVEDRGNKGDITPLMAASSGGYLDIV 288
++++ TE E+ L +AC+ G+ ++ ++LL AN+E R KG +PL+ A++ G+ +V
Sbjct:1212 AIDKATETTLETPLTIACANGHKDIVELLLKEGANIEHRDKKG-FSPLIIAATAGHSSVV 1270
Query:289 KLLLLHDADVNSQS-ATGNTALTYACAGGFVDIVKVLLNEGANIEDHNENGHTPLMEAAS 347
++LL +A + +QS T +TAL+ AC+GG D+V++LL GAN E N+ +TPL A+S
Sbjct:1271 EVLLKNHAAIEAQSDRTKDTALSLACSGGRKDVVELLLAHGANKEHRNVSDYTPLSLASS 1330
Query:348 AGHVEVARVLLDHGAGINTHS-NEFKESALTLACYKGHLDMVRFLLEAGADQEHKTD-EM 405
G++E+ +LL G+ IN+ + ++ S L LA GH + R LLE G+D + +
Sbjct:1331 GGYIEIVNMLLTAGSEINSRTGSKLGISPLMLASMNGHREATRVLLEKGSDINAQIETNR 1390
Query:406 HTALMEACMDGHVEVARLLLDSGAQVNMPADSFESPLTLAACGGHVELAALLIERGA--N 463
+TAL A G EV+LLL A V A + +PL A GG+V++ LLI GA N
Sbjct:1391 NTALTLASFQGRTEVVKLLLAYNANVEHRAKTGLTPLMECASGGYVDVGNLLIAAGADTN 1450
Query:464 LEEVNDEGYTPLMEAAREGHEEMVALLLAQGANINXXXXXXXXXXXXXXCCGGFSEVADF 523
V T L +A +GHE+ V +LL A ++ C GG+ A
Sbjct:1451 ASPVQQTKDTALTISAEKGHEKFVRMLLNGDAAVD-VRNKKGCTALWLACNGGYLSTAQA 1509
Query:524 LIKAGADIELGCS---TPLMEASQEGHLELVKYLLAS 557
L++ GAD ++ + +P+M A ++GH+E+VKY++ S
Sbjct:1510 LLEKGADPDMFDNRKISPMMAAFRKGHVEIVKYMVNS 1546
Score=160 bits (400), Expect=3e-38
Identities=120/400 (30%), Positives=206/400 (51%), Gaps=32/400 (8%)
Query:230 SVNEHTEEGESLLCLACSAGYYELAQVLLAMHANVEDRGNK-GDITPLMAASSGGYLDIV 288
S++ + ++L LA G + + + ++ RG+K ITPLM A++ IV
Sbjct:199 SIDSQIVQQNAMLLLAARVGIEQFVEYSHEIGV-MQFRGDKLSKITPLMEAAASSSETIV 257
Query:289 KLLLLHDADVNSQSATG-NTALTYACAGGFVDIVK-VLLNEGANIEDH---NENGHTPLM 343
+LL AD N S NTAL YA+ D+V+ +L +EG D N + H +M
Sbjct:258 RRLLELGADPNVASIPNCNTALIYAASTDGRDVVREILMTEGPKKPDVYLINNHYHDAMM 317
Query:344 EAASAGHVEVARVLLDHGAG---INTHSNEFKESALTLACYKGHLDMVRFLLEAGADQEH 400
E A G + + L+ G +N E ++SALTL+ KGH+ +V +++
Sbjct:318 EVALVGGTDTLKEFLEMGYRPRFLNLRQQE-RDSALTLSAQKGHIKIVTAIMDYYEKNPP 376
Query:401 KTDE--------MHTALMEACMDGHVEVARLLLDSGAQVNMPADSF--ESPLTLAACGGH 450
+T+E ++ALMEA M+GH++V +L+L G ++ + SPL +A+ GG+
Sbjct:377 QTEEEKQELCLERYSALMEAAMEGHIDVCKLMLSRGTPADLCTEVTIEPSPLIVASAGGY 436
Query:451 VELAALLIERGANLEEVNDEGYTPLMEAAREGHEE---MVALLLAQGANINXXXXXXXXX 507
E+ +L+ GA +EE++++ TPLMEA + +V LLL++ A ++
Sbjct:437 PEVVEVLLAAGAKIEELSNKKNTPLMEACAGDQGDQAGVVKLLLSKHAEVDVSNPDTGDT 496
Query:508 XXXXXCCGGFSEVADFLIKAGADIELGCSTPLMEASQEGHLELVKYLLASGANVHATTAT 567
G+ + LI+ G D+ G++P++EA++ GHLE ++++LA H T
Sbjct:497 PLSLAARNGYIAIMKMLIEKGGDLTAGKTSPIVEAARNGHLECIQFILA-----HCKTIP 551
Query:568 GD---TALTYACENGHTDVADVLLQAGADLDKQEDMKTIL 604
D AL A+G + + +++AGADL+ ++D +T L
Sbjct:552 QDQLSRALVSAADFGSLLIVEEVIRAGADLNFEQDERTAL 591
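The percentages reported for the three alignments above follow from the raw counts over the alignment length. The sketch below recomputes them; it assumes the percentages are truncated rather than rounded, which is inferred only from the figures in this report (for example, 168/306 is 54.9% but is reported as 54%). The names `blast_percent` and `ALIGNMENTS` are hypothetical and used only for illustration.

```python
def blast_percent(count: int, length: int) -> int:
    """Return count/length as a whole percentage, truncated toward zero."""
    return 100 * count // length


# (identical, similar, gap) counts and alignment length for each alignment,
# copied from the report above.
ALIGNMENTS = [
    {"identical": 115, "similar": 168, "gaps": 9, "length": 306},
    {"identical": 129, "similar": 214, "gaps": 18, "length": 397},
    {"identical": 120, "similar": 206, "gaps": 32, "length": 400},
]

if __name__ == "__main__":
    for hsp in ALIGNMENTS:
        n = hsp["length"]
        print(
            blast_percent(hsp["identical"], n),
            blast_percent(hsp["similar"], n),
            blast_percent(hsp["gaps"], n),
        )
    # 37 54 2
    # 32 53 4
    # 30 51 8
```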

Claims (10)

1. An isolated human polypeptide with cancer-suppressing function, characterized in that it comprises an amino acid sequence selected from the group consisting of: SEQ ID NO:2, SEQ ID NO:5, SEQ ID NO:8, SEQ ID NO:11, SEQ ID NO:14, SEQ ID NO:17, SEQ ID NO:20, wherein said polypeptide inhibits colony formation of the cultured hepatoma cell line 7721.
2. The polypeptide of claim 1, characterized in that the amino acid sequence of the polypeptide is selected from the group consisting of: SEQ ID NO:2, SEQ ID NO:8, SEQ ID NO:11, SEQ ID NO:17, SEQ ID NO:20.
3. An isolated polynucleotide, characterized in that it is selected from the group consisting of:
(a) a polynucleotide encoding the polypeptide of claim 1;
(b) a polynucleotide complementary to polynucleotide (a).
4. The polynucleotide of claim 3, characterized in that the polypeptide encoded by the polynucleotide has an amino acid sequence selected from the group consisting of: SEQ ID NO:2, SEQ ID NO:8, SEQ ID NO:11, SEQ ID NO:17, SEQ ID NO:20.
5. The polynucleotide of claim 3, characterized in that the sequence of the polynucleotide is selected from the group consisting of:
the coding region sequence or the full-length sequence of SEQ ID NO:3, SEQ ID NO:6, SEQ ID NO:9, SEQ ID NO:12, SEQ ID NO:15, SEQ ID NO:18, SEQ ID NO:21.
6. A vector, characterized in that it contains the polynucleotide of claim 3.
7. A genetically engineered host cell, characterized in that it is a host cell selected from the group consisting of:
(a) a host cell transformed or transduced with the vector of claim 6;
(b) a host cell transformed or transduced with the polynucleotide of claim 3.
8. A method for preparing a polypeptide having the activity of a human protein with cancer-suppressing function, characterized in that the method comprises:
(a) culturing the host cell of claim 7 under conditions suitable for expressing the human protein with cancer-suppressing function;
(b) isolating from the culture the polypeptide having the activity of the human protein with cancer-suppressing function, wherein said polypeptide inhibits colony formation of the cultured hepatoma cell line 7721.
9. An antibody capable of specifically binding to the human polypeptide with cancer-suppressing function of claim 1, wherein said polypeptide inhibits colony formation of the cultured hepatoma cell line 7721.
10. A pharmaceutical composition, characterized in that it contains a safe and effective amount of the polypeptide of claim 1 and a pharmaceutically acceptable carrier.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CNB001157442A CN1193040C (en) 2000-05-18 2000-05-18 New human protein with the function of inhibiting tumor cell growth and its encoding sequence

Publications (2)

Publication Number Publication Date
CN1324819A (en) 2001-12-05
CN1193040C (en) 2005-03-16

Family

ID=4585188

Family Applications (1)

Application Number Title Priority Date Filing Date
CNB001157442A Expired - Fee Related CN1193040C (en) 2000-05-18 2000-05-18 New human protein with the function of inhibiting tumor cell growth and its encoding sequence

Country Status (1)

Country Link
CN (1) CN1193040C (en)

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
C19 Lapse of patent right due to non-payment of the annual fee
CF01 Termination of patent right due to non-payment of annual fee