NZ564717A - Marker assisted selection of bovine for desired milk fat colour - Google Patents
Marker assisted selection of bovine for desired milk fat colour
- Publication number
- NZ564717A
- Authority
- NZ
- New Zealand
- Prior art keywords
- bovine
- polymorphism
- bcmo1
- milk
- gene
Landscapes
- Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)
Abstract
Provided is a method of determining genetic merit of a bovine, wherein the genetic merit is with respect to a milk content phenotype or a milk colour phenotype, the method comprising determining the BCMO1 allelic profile of the bovine and determining the genetic merit of the bovine on the basis of the BCMO1 allelic profile. Further provided are specific DNA probes and primers for the alleles, polypeptides corresponding to the alleles, bovines selected by the method and milk and milk products obtained from the selected bovines.
Description
564717
NEW ZEALAND PATENTS ACT, 1953
No: 564717/566978
Date: 21 December 2007/26 March 2008
COMPLETE SPECIFICATION
MARKER ASSISTED SELECTION OF BOVINE FOR DESIRED MILK CONTENT
We, VIALACTIA BIOSCIENCES (NZ) LIMITED, a New Zealand company of 9 Princes Street, Auckland, New Zealand, do hereby declare the invention for which we pray that a patent may be granted to us, and the method by which it is to be performed, to be particularly described in and by the following statement:
MARKER ASSISTED SELECTION OF BOVINE FOR DESIRED MILK CONTENT
FIELD OF THE INVENTION
[0001] This invention relates to an application of marker assisted selection of bovine for a quantitative trait locus (QTL) associated with milk colour and β-carotene content, particularly by assaying for the presence of polymorphisms in a gene which is associated with the QTL.
BACKGROUND
[0002] The genetic basis of bovine milk production is of immense significance to the dairy industry. An ability to modulate milk volumes and composition has the potential to alter farming practices and to produce products which are tailored to meet a range of requirements. In particular, a method of genetically evaluating bovine to select those which express desirable traits, such as desirable milk fat colour or composition, would be useful.
[0003] Genetic bases for variations in the composition of milk, for example, the relative amounts of major milk proteins, and the effect of these variations on milk production characteristics and milk processing properties, have been the subject of considerable research, debate, and review. For example, PCT International application PCT/NZ01/00245 (published as WO02/36824) reports that polymorphisms in the bovine Diacylglycerol-o-acyltransferase 1 (DGAT1) gene are associated with increased milk yield and altered milk composition, and in particular that the presence of a K232A mutation in the DGAT1 gene results in a decrease in milk fat percentage, milk fat yield, solid fat content and milk protein percentage, while increasing milk volume and milk protein yield. In another example, PCT International application PCT/NZ02/00157 (published as WO03/104492) reports that polymorphisms in the bovine growth hormone receptor (GHR) gene are associated with an increased milk volume and altered milk composition, and in particular that the presence of the F279Y amino acid variant results in increased milk yield and decreased milk fat and milk protein percentage, as well as a decrease in live weight. For other characteristics of milk composition, the basis for variation is less clear.
[0004] The yellow colour of milk and milk fat, caused primarily by the presence of β-carotene, is considered a negative characteristic in some consumer markets. Conversely, other markets prize the yellow colour, while foods enriched in β-carotene have been associated with health benefits. Consequently, strategies to modulate milk colour could be economically valuable. Although environmental factors, such as diet, lactation stage and milk volume, influence milk colour, previous research suggests that some of the variation in milk colour may be attributable to genetics (Winkelman et al., 1999).
[0005] Strategies to modulate milk colour or content (for example β-carotene content) could provide health benefits and are expected to be economically valuable. β-Carotene and vitamin A deficiencies are still major health problems (particularly in developing countries) leading to blindness and childhood mortality. Milk with increased β-carotene content would be of benefit, for example in markets where other dietary sources of β-carotene are scarce or not commonly consumed.
[0006] Marker assisted selection, which provides the ability to follow a specific favourable genetic allele, involves the identification of a DNA molecular marker or markers that segregate(s) with a gene or group of genes associated with or which in part defines a trait. DNA markers have several advantages. They are relatively easy to measure and are unambiguous, and as DNA markers are co-dominant, heterozygous and homozygous animals can be distinctively identified. Once a marker system is established, selection decisions are able to be made very easily as DNA markers can be assayed at any time after a DNA containing sample has been collected from an individual animal, whether embryonic, infant or adult.
[0007] It is an object of the present invention to provide a method for marker assisted selection of bovine with desired milk colour or milk content, particularly milk β-carotene content; and/or to provide animals selected using the method of the invention as well as milk produced by the selected animals; and/or to provide the public with a useful choice.
SUMMARY OF THE INVENTION
[0008] This invention relates to the elucidation of the role of the gene encoding β-carotene 15,15'-monooxygenase 1 (BCMO1) [EC: 1.14.99.36] in milk colour or β-carotene content, particularly milk fat colour and β-carotene content. In particular, the invention relates to the identification of the C-1054T polymorphism in the promoter of the BCMO1 gene, of the G15929A (G278R) polymorphism in exon 6 of the BCMO1 gene, and of the A18068G (N341D) polymorphism in exon 7 of the BCMO1 gene, and their association with variations in milk colour or content, for the first time. This includes the association of each of the T allele at the C-1054T polymorphism, the G allele at the G15929A (G278R) polymorphism, and the G allele at the A18068G (N341D) polymorphism with production of milk fat with increased β-carotene content, for the first time. Furthermore, this includes the association of each of the C allele at the C-1054T polymorphism, the A allele at the G15929A (G278R) polymorphism, and the A allele at the A18068G (N341D) polymorphism with production of milk fat with decreased β-carotene content, for the first time.
[0009] This gives rise to numerous, and separate, aspects of the invention.
[0010] In one aspect the invention provides a method of determining the genetic merit of a bovine with respect to milk colour or β-carotene content or with respect to capability of producing progeny that will have increased or decreased milk colour or β-carotene content, which comprises determining the BCMO1 allelic profile of the bovine, and determining the genetic merit of the bovine on the basis of the BCMO1 allelic profile.
[0011] In one embodiment, milk content is milk fat content, more preferably milk fat β-carotene content.
[0012] In one embodiment, milk colour is milk fat colour.
[0013] In one embodiment, the genetic merit with respect to milk colour or β-carotene content is production of milk with increased colour or β-carotene content, preferably production of milk with increased yellow colour.
[0014] In one embodiment, the genetic merit with respect to milk colour or β-carotene content is capability of producing progeny that will have increased milk colour or milk β-carotene content.
[0015] In various embodiments, the BCMO1 allelic profile is determined by determining the expression or activity of a BCMO1 gene product. It will be appreciated that methods comprising determining the expression or activity of a BCMO1 gene product encompass determining expression from or of a BCMO1 gene.
[0016] Accordingly, in various embodiments the invention provides a method for identifying or selecting a bovine that produces milk with increased β-carotene content, or that is capable of producing progeny that produce milk with increased β-carotene content, comprising determining the expression or activity of a BCMO1 gene product, and identifying or selecting the bovine on the basis of the determination.
[0017] In another embodiment, the genetic merit with respect to milk colour or β-carotene content is production of milk with decreased colour or β-carotene content, preferably production of milk with decreased yellow colour.
[0018] In a further embodiment, the genetic merit with respect to milk colour or β-carotene content is capability of producing progeny that will have decreased milk colour or milk β-carotene content.
[0019] Accordingly, in various embodiments the invention provides a method for identifying or selecting a bovine that produces milk with decreased β-carotene content, or capable of producing progeny that produce milk with decreased β-carotene content, comprising determining the expression or activity of a BCMO1 gene product, and identifying or selecting the bovine on the basis of the determination.
[0020] In one embodiment, expression or activity of the BCMO1 gene product is determined using BCMO1 mRNA, for example by determining the presence or amount of BCMO1 mRNA. In other embodiments, expression or activity of the BCMO1 gene product is determined using BCMO1 protein, preferably by determining the presence or amount of BCMO1 protein, for example the amount of BCMO1 protein, or by determining the activity of BCMO1 protein, for example the enzymatic activity of BCMO1 protein present in a sample obtained from the bovine. In still other embodiments, the expression or activity of a BCMO1 gene product is determined using BCMO1 DNA, preferably by determining the presence or absence of one or more polymorphisms associated with decreased or increased BCMO1 expression or activity, for example one or more promoter polymorphisms associated with increased or decreased expression, or one or more coding sequence polymorphisms associated with increased or decreased expression or activity.
[0021] In another embodiment, the BCMO1 allelic profile of the bovine is determined together with the allelic profile of the bovine at one or more genetic loci associated with milk colour or β-carotene content.
[0022] In one embodiment, the one or more genetic loci is one or more polymorphisms in one or more genes associated with milk colour or β-carotene content.
[0023] The one or more polymorphisms can be detected directly or by detection of one or more polymorphisms which are in linkage disequilibrium with said one or more polymorphisms.
[0024] Linkage disequilibrium (LD) is a phenomenon in genetics whereby two or more mutations or polymorphisms are in such close genetic proximity that they are co-inherited. This means that in genotyping, detection of one polymorphism as present infers the presence of the other. (Reich DE et al; Linkage disequilibrium in the human genome, Nature 2001, 411:199-204.)
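As an illustration only, the strength of linkage disequilibrium between two biallelic polymorphisms is commonly summarised by the statistic r², computed from phased haplotype counts. The following Python sketch is not part of the specification; the function name, argument names and example counts are hypothetical, and the use of r² as the measure is an assumption made for this example.

```python
def ld_r_squared(n_AB, n_Ab, n_aB, n_ab):
    """Return r^2 between two biallelic sites from phased haplotype counts.

    n_AB, n_Ab, n_aB, n_ab are hypothetical counts of the four haplotypes
    formed by alleles A/a at the first site and B/b at the second site.
    """
    total = n_AB + n_Ab + n_aB + n_ab
    if total == 0:
        raise ValueError("no haplotypes supplied")
    p_AB = n_AB / total              # frequency of the A-B haplotype
    p_A = (n_AB + n_Ab) / total      # frequency of allele A
    p_B = (n_AB + n_aB) / total      # frequency of allele B
    d = p_AB - p_A * p_B             # disequilibrium coefficient D
    denom = p_A * (1 - p_A) * p_B * (1 - p_B)
    if denom == 0:
        raise ValueError("at least one site is monomorphic")
    return d * d / denom


# Two sites that are always co-inherited give r^2 = 1;
# two sites that segregate independently give r^2 = 0.
complete = ld_r_squared(50, 0, 0, 50)
independent = ld_r_squared(25, 25, 25, 25)
```

An r² near 1 indicates that the genotype at one polymorphism is a close proxy for the genotype at the other, which is the situation in which indirect detection is informative.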
[0025] It will be apparent that as used herein, the phrase "BCMO1 allelic profile" contemplates data indicative of the presence or absence of one or more alleles at one or more polymorphisms in the BCMO1 gene or which affect expression from the BCMO1 gene or the expression or activity of a BCMO1 gene product or which are associated with variation in the expression from the BCMO1 gene or in the expression or activity of a BCMO1 gene product. In preferred embodiments, the BCMO1 allelic profile comprises data indicative of the presence or absence of one or more alleles at one or more polymorphisms associated with increased or decreased milk colour or β-carotene content. For example, in preferred embodiments the BCMO1 allelic profile comprises data indicative of the presence or absence of the C allele or of the presence or absence of the T allele at the C-1054T polymorphism, or data indicative of the presence or absence of the A allele or of the presence or absence of the G allele at the G15929A (G278R) polymorphism, or data indicative of the presence or absence of the A allele or of the presence or absence of the G allele at the A18068G (N341D) polymorphism in the BCMO1 gene. In other embodiments, the BCMO1 allelic profile comprises data indicative of the presence or absence of one or more alleles at one or more polymorphisms in the promoter of the BCMO1 gene, or in a regulatory region of the BCMO1 gene, or in an intron of the BCMO1 gene, or in a coding region of the BCMO1 gene, and preferably comprises data indicative of the presence or absence of one or more alleles which affect expression from the BCMO1 gene or the expression or activity of a BCMO1 gene product or which are associated with variation in the expression from the BCMO1 gene or in the expression or activity of a BCMO1 gene product.
[0026] In one embodiment, the BCMO1 allelic profile consists of data indicative of the presence or absence of the C allele or of the presence or absence of the T allele at the C-1054T polymorphism, or data indicative of the presence or absence of the A allele or of the presence or absence of the G allele at the G15929A (G278R) polymorphism, or data indicative of the presence or absence of the A allele or of the presence or absence of the G allele at the A18068G (N341D) polymorphism in the BCMO1 gene, or of any combination of such data.
[0027] It will further be appreciated that the BCMO1 allelic profile may comprise information correlating the presence or absence of one or more polymorphisms as described above with milk colour or β-carotene content.
[0028] In one embodiment, the allelic profile is determined using nucleic acid obtained from said bovine, preferably DNA obtained from said bovine, or alternatively, said allelic profile is determined using RNA obtained from said bovine.
[0029] In yet a further embodiment, the allelic profile is determined with reference to the amino acid sequence of BCMO1 protein obtained from said bovine.
[0030] In another embodiment, the allelic profile is determined with reference to the amount or activity of BCMO1 protein obtained from said bovine.
[0031] Conveniently, in said method the presence or absence of DNA encoding a wild type BCMO1 gene product, or of nucleotide sequence comprising a wild type BCMO1 gene, in said bovine is determined, directly or indirectly, for example using an expressed BCMO1 gene product.
[0032] Alternatively, in said method the presence or absence of at least one nucleotide difference from the nucleotide sequence of a wild type BCMO1 gene, for example, at least one nucleotide difference from the nucleotide sequence encoding wild type BCMO1, in said bovine is determined, directly or indirectly.
[0033] More specifically, in said method the presence or absence of one or more of the group comprising
the C allele at the C-1054T promoter polymorphism in the bovine BCMO1 gene,
the T allele at the C-1054T promoter polymorphism in the bovine BCMO1 gene,
the G allele at the G15929A (G278R) polymorphism in the bovine BCMO1 gene,
the A allele at the G15929A (G278R) polymorphism in the bovine BCMO1 gene,
the A allele at the A18068G (N341D) polymorphism in the bovine BCMO1 gene, and
the G allele at the A18068G (N341D) polymorphism in the bovine BCMO1 gene,
is determined, directly or indirectly.
[0034] For example, the presence of the C allele or the T allele at the C-1054T promoter polymorphism may be determined using a polymorphism in linkage disequilibrium with the C allele or with the T allele at the C-1054T promoter polymorphism. Similarly, the presence of the A allele or the G allele at the G15929A (G278R) polymorphism may be determined using a polymorphism in linkage disequilibrium with the A allele or with the G allele at the G15929A (G278R) polymorphism. Likewise, the presence of the A allele or the G allele at the A18068G (N341D) polymorphism may be determined using a polymorphism in linkage disequilibrium with the A allele or with the G allele at the A18068G (N341D) polymorphism.
[0035] In one embodiment, the method includes ascertaining, from a sample of material containing DNA obtained from the bovine, whether a sequence of the DNA encoding a protein "(A)" having biological activity of wild type BCMO1 is present, or whether a sequence of the DNA encoding an allelic protein "(B)" at least partially lacking the activity of (A) is present, or whether a sequence of the DNA encoding (A) and a sequence of the DNA encoding (B) are both present. The absence of the DNA encoding (A) and the presence of the DNA encoding (B) indicates an association with high relative β-carotene levels, particularly with the production of milk with, inter alia, increased fat colour. The reverse association holds true, where the presence of the DNA encoding (A) and the absence of the DNA encoding (B) indicates an association with low relative β-carotene levels, particularly with the production of milk with, inter alia, decreased fat colour. The presence of both the DNA encoding (A) and the DNA encoding (B) indicates an association with intermediate relative β-carotene levels, particularly with the production of milk with, inter alia, intermediate fat colour.
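By way of a minimal sketch, the three associations described above for sequences encoding (A) and (B) can be written as a simple lookup. The Python function below is illustrative only and is not part of the specification; the function and argument names are hypothetical, and the returned labels are shorthand for the relative β-carotene levels described in the text.

```python
def relative_beta_carotene_level(has_a, has_b):
    """Map presence of sequence encoding (A) and (B) to the associated level.

    has_a: True if a sequence encoding protein (A), with wild type BCMO1
           activity, was detected in the sample.
    has_b: True if a sequence encoding protein (B), at least partially
           lacking that activity, was detected in the sample.
    """
    if has_a and has_b:
        return "intermediate"   # both (A) and (B) present
    if has_b:
        return "high"           # (A) absent, (B) present
    if has_a:
        return "low"            # (A) present, (B) absent
    # Neither sequence detected: the text describes no association for this case.
    raise ValueError("neither (A) nor (B) detected; result not interpretable")


levels = [
    relative_beta_carotene_level(False, True),
    relative_beta_carotene_level(True, False),
    relative_beta_carotene_level(True, True),
]
```

The same lookup applies unchanged whether presence is ascertained from DNA, mRNA or protein, since the associations stated for each sample type are the same.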
[0036] As used herein, biological activity of wild type BCMO1 protein refers to both expression levels and activity characteristic of BCMO1 protein expressed from the wild type BCMO1 gene.
[0037] In another embodiment, the method includes ascertaining, from a sample of material containing DNA obtained from the bovine, whether a wild type BCMO1 gene sequence is present. In still another embodiment, the method includes ascertaining, from a sample of material containing DNA obtained from the bovine, the expression of the BCMO1 gene product, preferably by determining the presence or absence of one or more polymorphisms associated with decreased or increased BCMO1 expression, for example one or more promoter polymorphisms associated with increased or decreased expression.
[0038] In one embodiment, this method includes ascertaining, from a sample of material containing mRNA obtained from the bovine, whether mRNA encoding a protein "(A)" having biological activity of a wild type BCMO1 is present, or whether mRNA encoding a variant protein "(B)" at least partially lacking the activity of (A) is present, or whether mRNA encoding (A) and mRNA encoding (B) are both present. The absence of the mRNA encoding (A) and the presence of the mRNA encoding (B) again indicates an association with high relative β-carotene levels, particularly with the production of milk with, inter alia, increased fat colour. The reverse association again holds true. Again, the presence of both the mRNA encoding (A) and the mRNA encoding (B) indicates an association with intermediate relative β-carotene levels, particularly with the production of milk with, inter alia, intermediate fat colour.
[0039] In another embodiment, the method includes ascertaining the amount of BCMO1 mRNA present in a sample of material containing mRNA obtained from the bovine.
[0040] In another embodiment, the method includes ascertaining, from a sample of material containing protein obtained from the bovine, whether a protein "(A)" having biological activity of a wild type BCMO1 is present, or whether a variant protein "(B)" at least partially lacking the activity of (A) is present, or whether (A) and (B) are both present. The absence of (A) and the presence of (B) again indicates an association with high relative β-carotene levels, particularly with the production of milk with, inter alia, increased fat colour and/or high relative β-carotene content. The reverse association again holds true. Further, the presence of both (A) and (B) indicates an association with intermediate relative β-carotene levels, particularly with the production of milk with, inter alia, intermediate fat colour.
[0041] In another embodiment, the method includes ascertaining the amount or activity of BCMO1 protein present in a sample of material containing protein obtained from the bovine.
[0042] In a further embodiment, the invention provides a method of determining genetic merit of a bovine with respect to milk β-carotene content which comprises determining the BCMO1 allelic profile of the bovine, together with determining the allelic profile of the bovine at one or more genetic loci associated with milk β-carotene content.
[0043] In one embodiment, the one or more genetic loci is one or more polymorphisms in one or more genes associated with milk β-carotene content, preferably one or more polymorphisms in one or more genes involved in β-carotene uptake or metabolism.
[0044] In one embodiment the gene involved in β-carotene uptake is the SCARB1 gene (the sequence of which is available at NCBI accession number NM_174597.2, GI:31341575).
[0045] In another embodiment the gene involved in β-carotene metabolism is selected from BCO2 (the sequence of which is available at NCBI accession number NM_001101987, GI:156120622) or BCMO1.
[0046] For example, the one or more polymorphisms in a gene associated with β-carotene uptake is the C-321G promoter polymorphism in the SCARB1 gene. The association of the C allele at this polymorphism with production of milk and particularly milk fat with increased β-carotene content, and the association of the G allele at this polymorphism with production of milk and particularly milk fat with decreased β-carotene content, is discussed in the applicant's co-pending New Zealand Patent application NZ 561999, United Kingdom Patent application GB 0817719.8, Australian Patent application AU 2008227070, and Irish Patent application IE 2008/0786.
[0047] For example, the one or more polymorphisms in a gene associated with β-carotene metabolism is the W80Stop G/A polymorphism in the BCO2 gene. The association of the A allele with production of milk and particularly milk fat with increased β-carotene content, and the association of the G allele at this polymorphism with production of milk and particularly milk fat with decreased β-carotene content, is discussed in the applicant's co-pending New Zealand Patent application NZ 561998, United Kingdom Patent application GB 0815964.2, Australian Patent application AU 2008207705, and Irish Patent application IE 2008/0710.
[0048] Accordingly, in one embodiment the presence or absence of one or more of the group comprising
the C allele at the C-1054T promoter polymorphism in the bovine BCMO1 gene,
the T allele at the C-1054T promoter polymorphism in the bovine BCMO1 gene,
the G allele at the G15929A (G278R) polymorphism in the bovine BCMO1 gene,
the A allele at the G15929A (G278R) polymorphism in the bovine BCMO1 gene,
the A allele at the A18068G (N341D) polymorphism in the bovine BCMO1 gene, and
the G allele at the A18068G (N341D) polymorphism in the bovine BCMO1 gene,
is determined, directly or indirectly, together with the presence or absence of one or more of the group comprising
the C allele at the C-321G promoter polymorphism in the SCARB1 gene,
the G allele at the C-321G promoter polymorphism in the SCARB1 gene,
the G allele at the W80Stop G/A polymorphism in the BCO2 gene,
the A allele at the W80Stop G/A polymorphism in the BCO2 gene.
[0049] In a further aspect, the invention includes a probe comprising a nucleic acid molecule sufficiently complementary with a nucleic acid sequence comprising a bovine BCMO1 gene or encoding a bovine BCMO1 gene product, or its complement, so as to bind thereto under stringent conditions, as well as a diagnostic kit containing such a probe.
[0050] The invention also includes a primer composition useful for detection of the presence or absence of a wild type BCMO1 gene and/or the presence or absence of nucleic acid encoding a wild type BCMO1 gene product, such as wild type BCMO1 protein. In one form, the composition can include a nucleic acid primer substantially complementary to a nucleic acid sequence comprising a wild type BCMO1 gene or encoding a wild type BCMO1 gene product, or its complement. The nucleic acid sequence can in whole or in part be identified in SEQ ID NO:1 or SEQ ID NO:2. The invention also includes a primer composition useful for detection of the presence or absence of a variant BCMO1 gene and/or the presence of the DNA encoding a variant BCMO1 gene product, such as a variant BCMO1 protein at least partially lacking wild type BCMO1 activity. In one form, the composition can include a nucleic acid primer substantially complementary to a nucleic acid sequence comprising a variant BCMO1 gene or encoding a variant BCMO1 gene product, or its complement. Again, the nucleic acid sequence can in whole or in part be identified in SEQ ID NO:1 or SEQ ID NO:2. Diagnostic kits including such a composition are also included.
[0051] Particularly contemplated are primers comprising or substantially complementary to a nucleic acid sequence present in SEQ ID NO:1 and within approximately 1 to about 2000 bp of the C-1054T polymorphism, more preferably within approximately 1 to about 1000 bp, or within approximately 1 to about 500 bp, approximately 1 to about 400 bp, approximately 1 to about 300 bp, approximately 1 to about 200 bp, approximately 1 to about 100 bp, approximately 1 to about 50 bp, or approximately 1 to about 20 bp of the C-1054T polymorphism.
[0052] Also contemplated are primers comprising or substantially complementary to a nucleic acid sequence present in SEQ ID NO:1 and within approximately 1 to about 2000 bp of the G15929A (G278R) polymorphism, more preferably within approximately 1 to about 1000 bp, or within approximately 1 to about 500 bp, approximately 1 to about 400 bp, approximately 1 to about 300 bp, approximately 1 to about 200 bp, approximately 1 to about 100 bp, approximately 1 to about 50 bp, or approximately 1 to about 20 bp of the G15929A (G278R) polymorphism.
[0053] Further contemplated are primers comprising or substantially complementary to a nucleic acid sequence present in SEQ ID NO:1 and within approximately 1 to about 2000 bp of the A18068G (N341D) polymorphism, more preferably within approximately 1 to about 1000 bp, or within approximately 1 to about 500 bp, approximately 1 to about 400 bp, approximately 1 to about 300 bp, approximately 1 to about 200 bp, approximately 1 to about 100 bp, approximately 1 to about 50 bp, or approximately 1 to about 20 bp of the A18068G (N341D) polymorphism.
[0054] Examples of such primers are presented herein as SEQ ID NOs: 4-17.
[0055] It will be appreciated by those skilled in the art that a pair of such primers can be used to determine the identity of the nucleotide at a given polymorphism, by, for example the selective generation of an amplicon with one or more sequence-specific primers. Primer compositions comprising a pair of such primers are accordingly contemplated.
[0056] The invention also provides a diagnostic kit including a primer composition useful for determining the presence or absence of a wild type BCMO1 gene and/or the presence or absence of nucleic acid encoding wild type BCMO1, the diagnostic kit comprising one or more primers or primer compositions or probes or probe compositions as described herein.
[0057] The invention further includes an antibody composition useful for determining the presence or absence of wild type BCMO1 protein, or for determining the presence or absence of a variant BCMO1 protein such as a variant BCMO1 protein at least partially lacking wild type BCMO1 activity, or for determining the expression of BCMO1 protein, as well as a diagnostic kit containing such an antibody together with instructions for use, for example in a method of the invention.
[0058] The invention further provides a diagnostic kit useful in detecting DNA comprising a variant BCMO1 gene, or DNA or mRNA encoding a variant BCMO1 gene product at least partially lacking wild type activity, in a bovine which includes first and second primers for amplifying the DNA or mRNA, the primers being complementary to nucleotide sequences of the DNA or RNA upstream and downstream, respectively, of a polymorphism in the BCMO1 gene which results in or is associated with increased or decreased β-carotene levels (particularly increased or decreased milk colour or β-carotene content).
[0059] In one embodiment at least one of the nucleotide sequences is selected to be from a non-coding region of the wild type BCMO1 gene. The kit can also include a primer complementary to a naturally occurring mutation of a coding or non-coding portion of the wild type BCMO1 gene, for example a mutation in the promoter of the BCMO1 gene. Preferably the kit includes instructions for use, for example in accordance with a method of the invention.
[0060] In another embodiment the invention provides a method of assessing the genetic merit of a bovine with respect to milk content, the method comprising determining the presence or absence of the C-1054T promoter polymorphism in the bovine BCMO1 gene.
[0061] Thus, in another embodiment the invention provides a method of assessing the genetic merit of a bovine with respect to milk content which comprises the step of determining the presence or absence of one or more polymorphisms selected from the group comprising:
the C-1054T promoter polymorphism in the bovine BCMO1 gene,
the G15929A (G278R) polymorphism in the bovine BCMO1 gene, or
the A18068G (N341D) polymorphism in the bovine BCMO1 gene.
[0062] In another embodiment the invention provides a method of assessing the genetic merit of a bovine with respect to milk content which comprises the step of determining the presence or absence of one or more polymorphisms selected from the group comprising:
the C allele at the C-1054T promoter polymorphism in the bovine BCMO1 gene,
the T allele at the C-1054T promoter polymorphism in the bovine BCMO1 gene,
the G allele at the G15929A (G278R) polymorphism in the bovine BCMO1 gene,
the A allele at the G15929A (G278R) polymorphism in the bovine BCMO1 gene,
the A allele at the A18068G (N341D) polymorphism of the bovine BCMO1 gene, or
the G allele at the A18068G (N341D) polymorphism of the bovine BCMO1 gene.
[0063] Again, the one or more polymorphisms can be detected directly or by detection of one or more polymorphisms which are in linkage disequilibrium with the one or more polymorphisms.
[0064] In another aspect, the present invention provides a method for identifying or selecting a bovine with a genotype indicative of desired milk colour or of desired milk β-carotene content. The method comprises determining the bovine BCMO1 allelic profile of said bovine, and selecting the bovine on the basis of the determination.
[0065] In one embodiment, the invention provides a method for identifying or selecting a bovine with increased milk colour or increased milk β-carotene content, preferably increased milk fat colour or increased milk fat β-carotene content.
[0066] In one embodiment, the invention provides a method for identifying or selecting a bovine with a BCMO1 allelic profile indicative of increased milk colour or increased milk β-carotene content, preferably of increased milk fat colour or increased milk fat β-carotene content.
[0067] In one example, the method comprises determining the absence of one or more of the C allele at the C-1054T promoter polymorphism in the BCMO1 gene, the A allele at the G15929A (G278R) polymorphism in the BCMO1 gene, or the A allele at the A18068G (N341D) polymorphism in the BCMO1 gene, and identifying or selecting the bovine on the basis of the determination. Alternatively or additionally, the method comprises determining the presence of one or more of the T allele at the C-1054T promoter polymorphism in the BCMO1 gene, the G allele at the G15929A (G278R) polymorphism in the BCMO1 gene, or the G allele at the A18068G (N341D) polymorphism in the BCMO1 gene, and identifying or selecting the bovine on the basis of the determination.
[0068] In one example, the method comprises determining the presence of one or more of the TT genotype at the C-1054T promoter polymorphism in the BCMO1 gene, the GG genotype at the G15929A (G278R) polymorphism in the BCMO1 gene, or the GG genotype at the A18068G (N341D) polymorphism in the BCMO1 gene, and selecting the bovine on the basis of the determination. In a further example, the method comprises determining the presence of at least two of said genotypes, for example the method comprises determining the genotype at each polymorphism of the group comprising the C-1054T promoter polymorphism in the BCMO1 gene, the G15929A (G278R) polymorphism in the BCMO1 gene, and the A18068G (N341D) polymorphism in the BCMO1 gene.
[0069] In one embodiment, the invention provides a method for identifying or selecting a bovine with a BCMO1 allelic profile indicative of intermediate milk fat colour or intermediate milk fat β-carotene content.
[0070] In one embodiment, the method comprises determining the presence of the C allele and of the T allele at the C-1054T promoter polymorphism in the BCMO1 gene, and selecting the bovine on the basis of the determination.
[0071] In a further embodiment, the method comprises determining the presence of the CT genotype at the C-1054T promoter polymorphism in the BCMO1 gene, and selecting the bovine on the basis of the determination.
[0072] In a further embodiment the invention provides a method for selecting a bovine with a BCMO1 allelic profile indicative of decreased milk fat colour or decreased milk fat β-carotene content.
[0073] In one example, the method comprises determining the presence of one or more of the C allele at the C-1054T promoter polymorphism in the BCMO1 gene, the A allele at the G15929A (G278R) polymorphism in the BCMO1 gene, or the A allele at the A18068G (N341D) polymorphism in the BCMO1 gene, and selecting the bovine on the basis of the determination. Alternatively or additionally, the method comprises determining the absence of one or more of the T allele at the C-1054T promoter polymorphism in the BCMO1 gene, the G allele at the G15929A (G278R) polymorphism in the BCMO1 gene, or the G allele at the A18068G (N341D) polymorphism in the BCMO1 gene, and selecting the bovine on the basis of the determination.
[0074] In one example, the method comprises determining the presence of one or more of the CC genotype at the C-1054T promoter polymorphism in the BCMO1 gene, the AA genotype at the G15929A (G278R) polymorphism in the BCMO1 gene, or the AA genotype at the A18068G (N341D) polymorphism in the BCMO1 gene, and selecting the bovine on the basis of the determination. In a further example, the method comprises determining the presence of at least two of said genotypes, for example the method comprises determining the genotype at each polymorphism of the group comprising the C-1054T promoter polymorphism in the BCMO1 gene, the G15929A (G278R) polymorphism in the BCMO1 gene, and the A18068G (N341D) polymorphism in the BCMO1 gene.
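By way of illustration only, the relationship between genotype at the C-1054T promoter polymorphism and the increased, intermediate and decreased categories described in the preceding paragraphs may be sketched as a simple lookup. The function name `classify_c1054t` and the category labels are assumptions introduced for this sketch and form no part of the methods as claimed.

```python
# Illustrative sketch only: names and labels here are hypothetical.
# Maps a diploid genotype at the C-1054T promoter polymorphism to the
# milk fat colour / beta-carotene category described in the text.

C1054T_CATEGORY = {
    "TT": "increased",     # T allele: increased colour and beta-carotene
    "CT": "intermediate",  # heterozygote: intermediate
    "CC": "decreased",     # C allele: decreased colour and beta-carotene
}


def classify_c1054t(allele_1: str, allele_2: str) -> str:
    """Return the category for the two alleles observed at C-1054T."""
    alleles = (allele_1.upper(), allele_2.upper())
    if any(a not in ("C", "T") for a in alleles):
        raise ValueError(f"unexpected allele(s) at C-1054T: {alleles}")
    # Sort so that ("T", "C") and ("C", "T") both give "CT".
    genotype = "".join(sorted(alleles))
    return C1054T_CATEGORY[genotype]
```

For example, `classify_c1054t("T", "C")` returns the intermediate category regardless of the order in which the two alleles are supplied.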
[0075] In one embodiment, the presence or absence of any one or more of the above alleles is determined with respect to a BCMO1 polynucleotide (such as genomic DNA, mRNA or cDNA produced from mRNA) obtained from the bovine.
[0076] For example, the presence or absence of any one or more of the above alleles is determined by sequencing a BCMO1 polynucleotide obtained from the bovine.
[0077] In a further embodiment the determination comprises the step of amplifying a BCMO1 polynucleotide sequence from genomic DNA, mRNA or cDNA produced from mRNA derived from said bovine, for example by PCR.
[0078] Preferably the determination is by use of primers which comprise a nucleotide sequence having at least about 12 contiguous bases of or complementary to the sequence present in SEQ ID NO:1 or SEQ ID NO:2 or a naturally occurring flanking sequence.
[0079] In one embodiment at least one of the primers comprises sequence corresponding to at least one of the allele-specific nucleotides described herein.
[0080] In an alternative embodiment, the method comprises restriction enzyme digestion of a nucleotide derived from the bovine. Such digestion may also be performed on a product of the PCR amplification described above.
[0081] In a further embodiment, the presence or absence of any of the above alleles is determined by mass spectrometric analysis of a BCMO1 polynucleotide, such as that obtained from the bovine or from a method as described herein.
[0082] In an alternative embodiment, the presence or absence of any of the above alleles is determined by hybridisation of a probe or probes comprising a nucleotide sequence of or complementary to the sequence of SEQ ID NO:1 or SEQ ID NO:2.
[0083] Preferably the probe or probes comprises 12 or more contiguous nucleotides of or complementary to the sequence in SEQ ID NO:1 or SEQ ID NO:2.
[0084] Preferably the probe or probes comprise sequence corresponding to one of the allele-specific nucleotides described herein or complements thereof.
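By way of illustration only, the selection of a probe of 12 or more contiguous nucleotides spanning an allele-specific nucleotide, together with its complement, may be sketched as follows. The helper names `probe_window` and `reverse_complement`, the 0-based position argument and any input sequence are assumptions made for this sketch; no sequence or coordinate is taken from SEQ ID NO:1 or SEQ ID NO:2.

```python
# Illustrative sketch only: helper names and inputs are hypothetical and
# are not taken from SEQ ID NO:1 or SEQ ID NO:2.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def probe_window(seq: str, snp_index: int, allele: str, length: int = 12) -> str:
    """Return `length` contiguous nucleotides of `seq` spanning the 0-based
    position `snp_index`, with `allele` placed at that position."""
    if length < 12:
        raise ValueError("probe must contain at least 12 contiguous nucleotides")
    if len(seq) < length or not 0 <= snp_index < len(seq):
        raise ValueError("sequence too short or position out of range")
    # Centre the window on the polymorphic site, clamped to the sequence ends.
    start = min(max(snp_index - length // 2, 0), len(seq) - length)
    window = seq[start:start + length].upper()
    offset = snp_index - start
    return window[:offset] + allele.upper() + window[offset + 1:]
```

Calling `probe_window` once per allele yields a pair of allele-specific probes, and `reverse_complement` gives the corresponding complementary probes.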
[0085] In an alternative embodiment, the presence or absence of any of the above alleles is determined by analysis of a BCMO1 polypeptide obtained from the bovine.
[0086] In one embodiment, the presence of one or more of the C allele at the C-1054T promoter polymorphism, the A allele at the G15929A (G278R) polymorphism, or the A allele at the A18068G (N341D) polymorphism is determined by detecting a reduction in the amount of or an absence of BCMO1 gene product, or a reduction in the activity of BCMO1 polypeptide.
[0087] In one embodiment, the presence of one or more of the T allele at the C-1054T promoter polymorphism, the G allele at the G15929A (G278R) polymorphism, or the G allele at the A18068G (N341D) polymorphism is determined by detecting an increase in the amount of BCMO1 gene product, or an increase in the activity of BCMO1 polypeptide.
[0088] In a further aspect the invention provides a bovine selected by a process of the invention; milk produced by the selected bovine or the progeny thereof as well as dairy products produced from such milk; and ova or semen produced by the selected bovine.
[0089] In still a further aspect the invention provides a method of selecting a herd of bovine, comprising selecting individuals by a method of the present invention, and segregating and collecting the selected individuals to form the herd. The invention further provides a herd of bovine so selected, as well as a herd comprising bovine produced by bovine selected by the methods described herein.
[0090] In a still further aspect, the invention provides a method of determining genetic merit of a bovine with respect to one or more milk colour or β-carotene content phenotypes, or with respect to capability of producing progeny predisposed to or with one or more milk colour or β-carotene content phenotypes, the method comprising providing data about the BCMO1 allelic profile of said bovine, and determining the genetic merit of the bovine on the basis of the data.
[0091] In one embodiment, the data about the BCMO1 allelic profile comprises data representative of the presence or absence of one or more of the group comprising the C allele at the C-1054T promoter polymorphism in the bovine BCMO1 gene, the T allele at the C-1054T promoter polymorphism in the bovine BCMO1 gene, the G allele at the G15929A (G278R) polymorphism in the bovine BCMO1 gene, the A allele at the G15929A (G278R) polymorphism in the bovine BCMO1 gene, the A allele at the A18068G (N341D) polymorphism of the bovine BCMO1 gene, or the G allele at the A18068G (N341D) polymorphism of the bovine BCMO1 gene.
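By way of illustration only, data representative of the presence or absence of each of the six alleles listed above may be derived from a genotype at each polymorphism as sketched below. The dictionary keys, the two-letter genotype encoding and the function name `allele_presence` are assumptions made for this sketch, not a data format defined by the specification.

```python
# Illustrative sketch only: keys, encoding and function name are hypothetical.
# Each polymorphism is listed with its two alleles as named in the text.
LOCUS_ALLELES = {
    "C-1054T": ("C", "T"),
    "G15929A": ("G", "A"),
    "A18068G": ("A", "G"),
}


def allele_presence(profile: dict) -> dict:
    """Convert {locus: two-letter genotype} into {(locus, allele): bool}.

    Example input: {"C-1054T": "CT", "G15929A": "GG", "A18068G": "AA"}.
    """
    presence = {}
    for locus, alleles in LOCUS_ALLELES.items():
        genotype = profile[locus].upper()
        if len(genotype) != 2 or any(a not in alleles for a in genotype):
            raise ValueError(f"invalid genotype {genotype!r} at {locus}")
        for allele in alleles:
            # An allele is present if it appears on either chromosome.
            presence[(locus, allele)] = allele in genotype
    return presence
```

The resulting presence and absence flags are one possible form of the data on which a determination of genetic merit could then be based.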
[0092] In one example, the method additionally comprises providing data comprising the result of at least one analysis of one or more genetic loci associated with one or more milk colour or β-carotene content phenotypes, wherein the data is representative of the genetic merit of the bovine.
[0093] In one example, the one or more genetic loci are one or more polymorphisms associated with an increase or decrease in expression or activity of a BCMO1 gene product.
[0094] For example, the genetic locus is the BCMO1 gene (including all regulatory elements such as the promoter, introns and 3'UTR).
[0095] In one embodiment, the one or more milk colour or β-carotene content phenotypes is selected from the group comprising production of or capability of producing milk with increased milk colour or production of or capability of producing milk with increased milk β-carotene content.
[0096] In another embodiment, the one or more milk colour or β-carotene content phenotypes is selected from the group comprising production of or capability of producing milk with decreased milk colour or production of or capability of producing milk with decreased milk β-carotene content.
[0097] Accordingly, in one embodiment the invention provides a method of determining genetic merit of a bovine with respect to milk colour or β-carotene content, or with respect to capability of producing progeny that will have increased or decreased milk colour or β-carotene content, the method comprising providing data about the BCMO1 allelic profile of the bovine, and determining the genetic merit of the bovine on the basis of the data.
[0098] In one embodiment, the method additionally comprises providing data comprising the result of at least one analysis of one or more genetic loci associated with one or more milk or tissue colour or β-carotene content phenotypes, wherein the data is representative of the genetic merit of the bovine.
[0099] In one example, the one or more genetic loci are one or more polymorphisms associated with an increase or decrease in expression or activity of a BCMO1 gene product.
[00100] For example, the genetic locus is the BCMO1 gene (including all regulatory elements such as the promoter, introns and 3'UTR).
[00101] In one embodiment, the data comprises the result of one or more genetic tests of a sample from the bovine, and the determination comprises analysing the result for the presence or absence of one or more polymorphisms associated with increased or decreased expression or activity of BCMO1 gene product, or one or more polymorphisms in linkage disequilibrium with one or more polymorphisms associated with increased or decreased expression or activity of BCMO1 gene product,
wherein a result indicative of the presence or absence of one or more of said polymorphisms is indicative of a bovine with one or more desired milk or tissue colour or β-carotene content phenotypes; and identifying or selecting the bovine on the basis of the result.
[00102] In one example, the one or more polymorphisms associated with increased or decreased expression or activity of BCMO1 gene product is one or more polymorphisms in the BCMO1 gene.
[00103] In a further aspect the invention provides a method for selecting a bovine that produces milk fat with increased or decreased β-carotene content, or capable of producing progeny that produce milk fat with increased or decreased β-carotene content, the method comprising a) providing the result of one or more genetic tests of a sample from the bovine, and b) analysing the result for the presence or absence of one or more polymorphisms in the BCMO1 gene associated with increased or decreased expression or activity of BCMO1 gene product, or one or more polymorphisms in linkage disequilibrium with one or more polymorphisms in the BCMO1 gene associated with increased or decreased expression or activity of BCMO1 gene product,
wherein a result indicative of the presence or absence of one or more of said polymorphisms is indicative of a bovine that produces milk fat with increased or decreased β-carotene content, or that is capable of producing progeny that produce milk fat with increased or decreased β-carotene content.
[00104] In one embodiment, the one or more polymorphisms is selected from the group comprising:
the C-1054T promoter polymorphism in the bovine BCMO1 gene,
the G15929A (G278R) polymorphism in the bovine BCMO1 gene,
the A18068G (N341D) polymorphism of the bovine BCMO1 gene, or one or more polymorphisms in linkage disequilibrium with one or more of these polymorphisms.
[00105] In other aspects, the invention provides a system for performing one or more of the methods of the invention, said system comprising:
computer processor means for receiving, processing and communicating data;
storage means for storing data including a reference genetic database of the results of genetic analysis of a bovine with respect to one or more milk colour or β-carotene content phenotypes and optionally a reference milk colour or β-carotene content phenotype database of non-genetic factors for one or more bovine milk colour or β-carotene content phenotypes; and
a computer program embedded within the computer processor which, once data consisting of or including the result of a genetic analysis for which data is included in the reference genetic database is received, processes said data in the context of said reference databases to determine, as an outcome, the genetic merit of the bovine, said outcome being communicable once known, preferably to a user having input said data.
[00106] In one example, said system is accessible via the internet or by personal computer.
[00107] In one embodiment, said reference genetic database comprises or includes the results of one or more analyses of one or more genetic loci associated with one or more milk colour or β-carotene content phenotypes, more preferably the one or more genetic loci are one or more polymorphisms in one or more genes associated with one or more milk colour or β-carotene content phenotypes.
[00108] In yet a further aspect, the invention provides a computer program suitable for use in a system as defined above comprising a computer usable medium having program code embodied in the medium for causing the computer program to process received data consisting of or including the result of at least one genetic analysis of one or more genetic loci associated with one or more milk colour or β-carotene content phenotypes in the context of both a reference genetic database of the results of said at least one genetic analysis and optionally a reference database of non-genetic factors associated with one or more bovine milk colour or β-carotene content phenotypes.
[00109] In one embodiment, the one or more genetic loci are one or more polymorphisms in one or more genes associated with one or more milk colour or β-carotene content phenotypes.
[00110] In one example, the one or more polymorphisms are one or more polymorphisms associated with an increase or decrease in expression or activity of a BCMO1 gene product.
[00111] In still another aspect, the invention provides a method of determining the genetic merit of a bovine with respect to milk colour or β-carotene content, or with respect to capability of producing progeny that will have increased or decreased milk colour or β-carotene content, the method comprising determining milk or tissue colour or β-carotene content of the bovine,
determining the BCMO1 allelic profile of the bovine,
comparing the BCMO1 allelic profile of the bovine or the milk or tissue colour or β-carotene content of the bovine with that of a bovine having a known BCMO1 allelic profile; determining the genetic merit of the bovine on the basis of the comparison.
[00112] It will be appreciated that for the purposes of the comparison, the milk colour or β-carotene content associated with the known BCMO1 allelic profile is known. It will further be appreciated that the association of milk colour or β-carotene content with a particular BCMO1 allelic profile may be established by the methods described herein.
[00113] In another aspect, the invention relates to an isolated, purified or recombinant nucleic acid molecule comprising nucleotide sequence selected from the group comprising:
(a) at least 12 contiguous nucleotides of SEQ ID NO:1 and comprising one or more of the C-1054T promoter polymorphism, the G15929A (G278R) polymorphism, or the A18068G (N341D) polymorphism; or
(b) from 12 to 1791 contiguous nucleotides of SEQ ID NO:2 and comprising one or more of the C-1054T promoter polymorphism, the G15929A (G278R) polymorphism, or the A18068G (N341D) polymorphism; or
(c) at least 12 contiguous nucleotides of a functional variant of SEQ ID NO:1; or
(d) at least 12 contiguous nucleotides of a functional variant of SEQ ID NO:2; or
(e) at least 12 contiguous nucleotides of any one of SEQ ID NOs:4 - 17; or
(f) a complement of any one of (a) to (e); or
(g) a sequence of at least 12 contiguous nucleotides and capable of hybridising to the nucleotide sequence of any one of (a) to (f) under stringent conditions.
[00114] In one embodiment, the nucleic acid molecule comprises nucleotide sequence selected from
(a) at least 12 contiguous nucleotides of SEQ ID NO:1 and comprising a cytosine at the C-1054T promoter polymorphism; or
(b) at least 12 contiguous nucleotides of SEQ ID NO:2 and comprising a cytosine at the C-1054T polymorphism; or
(c) at least 12 contiguous nucleotides of SEQ ID NO:1 and comprising a thymine at the C-1054T polymorphism; or
(d) at least 12 contiguous nucleotides of SEQ ID NO:2 and comprising a thymine at the C-1054T polymorphism; or
(e) at least 12 contiguous nucleotides of SEQ ID NO:1 and comprising a guanine at the G15929A (G278R) polymorphism; or
(f) at least 12 contiguous nucleotides of SEQ ID NO:2 and comprising a guanine at the G15929A (G278R) polymorphism; or
(g) at least 12 contiguous nucleotides of SEQ ID NO:1 and comprising an adenine at the G15929A (G278R) polymorphism; or
(h) at least 12 contiguous nucleotides of SEQ ID NO:2 and comprising an adenine at the G15929A (G278R) polymorphism; or
(i) at least 12 contiguous nucleotides of SEQ ID NO:1 and comprising a guanine at the A18068G (N341D) polymorphism; or
(j) at least 12 contiguous nucleotides of SEQ ID NO:2 and comprising a guanine at the A18068G (N341D) polymorphism; or
(k) at least 12 contiguous nucleotides of SEQ ID NO:1 and comprising an adenine at the A18068G (N341D) polymorphism; or
(l) at least 12 contiguous nucleotides of SEQ ID NO:2 and comprising an adenine at the A18068G (N341D) polymorphism; or
(m) a complement of any one of (a) to (l); or
(n) a sequence of at least 12 contiguous nucleotides and capable of hybridising to the nucleotide sequence of any one of (a) to (m) under stringent conditions.
[00115] In one embodiment, the BCMO1 nucleic acid molecule consists of a nucleic acid sequence as defined in any one of (a) to (n) above.
[00116] In one embodiment, the BCMO1 nucleic acid molecule is a BCMO1 fragment as defined herein, for example a BCMO1 fragment comprising one or more of the C-1054T promoter polymorphism, the G15929A (G278R) polymorphism, the A18068G (N341D) polymorphism, or a combination of any two or more thereof.
[00117] In one embodiment, the nucleic acid molecule comprises nucleotide sequence selected from the group comprising:
(a) from 12 to at least about 20000, or at least about 10000, contiguous nucleotides of SEQ ID NO:1 and comprising one or more of the C-1054T promoter polymorphism, the G15929A (G278R) polymorphism, or the A18068G (N341D) polymorphism; or
(b) from 12 to 1791 contiguous nucleotides of SEQ ID NO:2 and comprising one or more of the C-1054T promoter polymorphism, the G15929A (G278R) polymorphism, or the A18068G (N341D) polymorphism; or
(c) from 12 to at least about 20000, or at least about 10000, contiguous nucleotides of a functional variant of SEQ ID NO:1; or
(d) from 12 to 1791 contiguous nucleotides of a functional variant of SEQ ID NO:2; or
(e) a complement of any one of (a) to (d); or
(f) a sequence of at least 12 contiguous nucleotides and capable of hybridising to the nucleotide sequence of any one of (a) to (e) under stringent conditions.
[00118] In one embodiment, the BCMO1 nucleic acid molecule consists of nucleotide sequence selected from the group comprising:
(a) from 12 to about 20000 contiguous nucleotides of SEQ ID NO:1 and comprising one or more of the C-1054T promoter polymorphism, the G15929A (G278R) polymorphism, or the A18068G (N341D) polymorphism; or
(b) from 12 to about 1791 contiguous nucleotides of SEQ ID NO:2 and comprising one or more of the C-1054T promoter polymorphism, the G15929A (G278R) polymorphism, or the A18068G (N341D) polymorphism; or
(c) from 12 to about 20000 contiguous nucleotides of a functional variant of SEQ ID NO:1; or
(d) from 12 to about 1791 contiguous nucleotides of a functional variant of SEQ ID NO:2; or
(e) any one of SEQ ID NOs:4 - 17; or
(f) a complement of any one of (a) to (e); or
(g) a sequence of from 12 to about 20000 contiguous nucleotides and capable of hybridising to the nucleotide sequence of any one of (a) to (f) under stringent conditions.
[00119] In another aspect the invention relates to an isolated, purified or recombinant polypeptide comprising at least about 10 contiguous amino acids of SEQ ID NO:3.
[00120] In one embodiment, the polypeptide comprises one or more of the following:
(a) arginine at the position corresponding to amino acid 278 of SEQ ID NO:3; or
(b) an amino acid other than glycine at the position corresponding to amino acid 278 of SEQ ID NO:3; or
(c) aspartate at the position corresponding to amino acid 341 of SEQ ID NO:3; or
(d) an amino acid other than asparagine at the position corresponding to amino acid 341 of SEQ ID NO:3; or
(e) any combination of (a) or (b) and (c) or (d).
[00121] In one embodiment, the polypeptide consists of at least about 10 contiguous amino acids of SEQ ID NO:3, at least about 20, at least about 30, at least about 40, at least about 50, at least about 100, at least about 150, at least about 200, or at least about 250 contiguous amino acids of SEQ ID NO:3, or consists of an amino acid sequence as defined in any one of (a) to (e) above.
[00122] In one embodiment the polypeptide is a variant as defined herein.
[00123] In one example, the polypeptide has at least about 80%, at least about 85%, at least about 90%, at least about 95%, or at least about 99% amino acid sequence identity with the amino acid sequence of SEQ ID NO:3 or with an amino acid sequence present in SEQ ID NO:3.
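By way of illustration only, a percentage amino acid sequence identity between two pre-aligned sequences may be computed as sketched below. This is one common convention, assumed here for illustration (identical residues divided by aligned columns, with gap characters counted as mismatches); the function name `percent_identity` is hypothetical, and the sketch does not define how sequence identity is to be determined for the purposes of this specification.

```python
# Illustrative sketch only: one common convention for percent identity,
# assumed for illustration. Inputs must already be aligned to equal
# length, with '-' marking alignment gaps.

def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Return percent identity over the aligned columns of two sequences."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    # Ignore columns where both sequences have a gap.
    columns = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if not (a == "-" and b == "-")
    ]
    if not columns:
        raise ValueError("no aligned columns to compare")
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)
```

Under this convention, two aligned ten-residue sequences differing at a single position have 90% identity.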
[00124] In another embodiment the polypeptide is a functional variant of a BCMO1, or is a functional fragment thereof.
[00125] In one example, the polypeptide (for example, the functional variant or functional fragment) has at least about 10%, at least about 20%, at least about 30%, at least about 40%, at least about 50%, at least about 60%, at least about 70%, at least about 80%, at least about 90%, at least about 100%, at least about 110%, at least about 120%, at least about 130%, at least about 140%, at least about 150%, or at least about 200% of the enzymatic activity of a BCMO1 polypeptide having the amino acid sequence of SEQ ID NO:3.
[00126] For example, in one embodiment the isolated, purified or recombinant polypeptide comprises an amino acid sequence having at least 95% sequence identity with a sequence present in SEQ ID NO:3, wherein the polypeptide comprises at least about 100 amino acids.
[00127] In a further example, the isolated, purified or recombinant polypeptide comprises an amino acid sequence having at least 95% sequence identity with a sequence present in SEQ ID NO:3, wherein the polypeptide comprises one or more of the following:
(a) arginine at the position corresponding to amino acid 278 of SEQ ID NO:3; or
(b) an amino acid other than glycine at the position corresponding to amino acid 278 of SEQ ID NO:3; or
(c) aspartate at the position corresponding to amino acid 341 of SEQ ID NO:3; or
(d) an amino acid other than asparagine at the position corresponding to amino acid 341 of SEQ ID NO:3; or
(e) any combination of (a) or (b) and (c) or (d).
[00128] The invention also provides a genetic construct comprising a BCMO1 nucleic acid molecule of the invention, a vector comprising the genetic construct or a nucleic acid sequence as described above, a host cell comprising the genetic construct or vector, a polypeptide encoded by a BCMO1 nucleic acid molecule of the invention, an antibody which selectively binds a polypeptide of the invention, and a method for recombinantly producing a polypeptide of the invention.
[00129] The term "comprising" as used in this specification means "consisting at least in part of". When interpreting each statement in this specification that includes the term "comprising", features other than that or those prefaced by the term may also be present. Related terms such as "comprise" and "comprises" are to be interpreted in the same manner.
[00130] In this specification where reference has been made to patent specifications, other external documents, or other sources of information, this is generally for the purpose of providing a context for discussing the features of the invention. Unless specifically stated otherwise, reference to such external documents is not to be construed as an admission that such documents, or such sources of information, in any jurisdiction, are prior art, or form part of the common general knowledge in the art.
BRIEF DESCRIPTION OF THE FIGURES
Figure 1 shows a graph showing the range and seasonal effect of milk β-carotene content determined as described in the Example. Data for peak (35 days post calving), mid (late November) and late (late February) lactation is shown. These data were obtained from a Friesian-Jersey crossbred trial, as described in the materials and methods.
Figure 2 shows a graph depicting a QTL for milk β-carotene content on bovine chromosome 18. The maximum F-value of 7.8 occurred at 15 centimorgans (cM). The grey bar shows the 95% confidence interval for the QTL.
Figure 3 shows a schematic showing a graphical representation of polymorphism in the bovine BCMO1 gene. A: Graphical representation of the predicted exonic structure of bovine BCMO1 gene. B: Identification (indicated by arrows) of the C/T promoter polymorphism at position -1054 relative to the +1 translation start site (i.e., in the promoter), a G/A polymorphism at position 15929 (G278R, exon 6), and an A/G polymorphism at position 18068 (N341D, exon 7). The -1054 and 15929 polymorphisms were heterozygous in 2 of the six F1 sires (sires 405 and 740) and the 18068 polymorphism was heterozygous in one sire (405).
Figure 4 shows the statistical effect of each genotypic state for the following BCMO1 polymorphisms: G15929A (G278R), A18068G (N341D) and C-1054T polymorphisms on milk β-carotene content at peak lactation. A) BCMO1_G15929A, B) BCMO1_A18068G, C) BCMO1_C-1054T. Similar effects were seen at the other two time points during lactation (mid, late lactation).
Figure 5 shows the adjusted statistical effect of BCMO1_G15929A (G278R) on bovine chromosome 18 milk β-carotene QTL (mid lactation data shown). Solid line: milk β-carotene QTL (residual of log-transformed, modeled data). Dotted line: milk β-carotene QTL after BCMO1_G15929A accounted for (residual of log-transformed, modeled data including BCMO1_G15929A as fixed effect). The decrease in QTL significance (F-value) suggests a significant association of the BCMO1 genotype with the milk β-carotene content QTL variation.
Figure 6 shows the adjusted statistical effect of BCMO1_A18068G (N341D) on bovine chromosome 18 milk β-carotene QTL (mid lactation data shown). Solid line: milk β-carotene QTL (residual of log-transformed, modeled data). Dotted line: milk β-carotene QTL after BCMO1_A18068G accounted for (residual of log-transformed, modeled data including BCMO1_A18068G as fixed effect). The decrease in QTL significance (F-value) suggests a significant association of the BCMO1 genotype with the milk β-carotene content QTL variation.
Figure 7 shows the adjusted statistical effect of BCMO1_C-1054T polymorphism on bovine chromosome 18 milk β-carotene QTL (mid lactation data shown). Solid line: milk β-carotene QTL (residual of log-transformed, modeled data). Dotted line: milk β-carotene QTL after BCMO1_C-1054T accounted for (residual of log-transformed, modeled data including BCMO1_C-1054T as fixed effect).
Figure 8 shows the enzyme activities for the following recombinant BCMO1 proteins: 278G/341N (wildtype), 278G/341D, 278R/341N, 278R/341D expressed in CHO-K1 cells, with enzyme activity assayed in cell lysates, as described in the example. Data presented are means ± standard error for n = 2 independent experiments.
DETAILED DESCRIPTION OF THE INVENTION
[00131] The present invention recognises for the first time that polymorphisms in the BCMO1 gene in bovine are associated with a QTL for variations in milk colour and variations in milk β-carotene content.
[00132] For the sake of clarity, the phrase "milk colour or β-carotene content" is to be read as referring to milk colour or milk β-carotene content. Grammatical equivalents or components thereof are to be read likewise.
[00133] It will be apparent to those skilled in the art that milk colour can readily be determined qualitatively or quantitatively. For example, a visual comparison may in many cases be sufficient to qualitatively determine a sample of milk having increased colour, or decreased colour, for example increased yellow colour, relative to another sample, for example relative to milk produced by bovine having wild type BCMO1. It will similarly be appreciated by those skilled in the art that reference herein to milk having increased colour may be considered as a reference to milk that is less white, and vice versa. Methods for quantitative determination of milk colour or β-carotene content are also known in the art, and examples are provided herein.
[00134] The invention provides methods of assessing the genetic merit of a bovine with respect to milk β-carotene content, more particularly milk fat β-carotene content. One such method comprises the step of determining the BCMO1 allelic profile of said bovine. Another such method comprises the step of determining the level of a BCMO1 gene product of said bovine.
[00135] The invention also provides a method for selecting a bovine with a genotype indicative of desired milk β-carotene content, particularly desired milk fat β-carotene content. One of the major applications of the present invention is in the selection of bovine having the T allele or the C allele at the C-1054T promoter polymorphism in the BCMO1 gene, which are associated with increased milk β-carotene content and milk colour, and decreased milk β-carotene content and milk colour, respectively. Accordingly, one method comprises determining the presence or absence of the C allele or of the T allele at the C-1054T promoter polymorphism of the BCMO1 gene, and selecting the bovine on the basis of the determination.
[00136] Another of the applications of the present invention is in the selection of bovine having the G allele or the A allele at the G15929A (G278R) polymorphism in the BCMO1 gene, which are associated with increased milk fat β-carotene content and milk fat colour, and decreased milk fat β-carotene content and milk fat colour, respectively. Accordingly, one method comprises determining the presence or absence of the G allele or the A allele at the G15929A (G278R) polymorphism of the bovine BCMO1 gene, and selecting the bovine on the basis of the determination.
[00137] Another of the applications of the present invention is in the selection of bovine having the G allele or the A allele at the A18068G (N341D) polymorphism in the BCMOl gene, which are associated with increased milk fat P-carotene content and milk fat colour, and decreased milk fat P-carotene content and milk fat colour, respectively. Accordingly, one method comprises determining the presence or absence of the G allele or the A allele at the A18068G (N341D) polymorphism of the bovine BCMOl gene, and selecting the bovine on the basis of the determination.
[00138] Additionally, the invention is directed towards the selected bovine and semen from the selected bovine which may be useful in further breeding programs. Bovine so selected will be useful for milk production. The invention is also directed towards milk produced by the selected bovine or the progeny thereof, as well as dairy products produced from such milk.
[00139] The production of a wide variety of dairy products is well known in the art, and dairy products contemplated herein include ice creams, yoghurts and cheeses, dairy based drinks (such as milk drinks including milk shakes, and yoghurt drinks), milk powders, dairy based sports supplements, food additives such as protein sprinkles and dietary supplement products including daily supplement tablets.
[00140] The present invention recognises that polymorphisms in the gene encoding BCMO1, as well as BCMO1 levels or activity, may be used as a selection tool to breed animals with higher or lower milk concentrations of β-carotene (and thus milk fat colour). This in turn may allow the production of milk products more suitable to markets favouring white milk and milk products, or the production of milk products more suitable to markets favouring yellow milk and milk products, or the production of milk and milk products, such as foods, high in β-carotene.
1 BCMO1
[00141] BCMO1 is a key regulatory enzyme for the metabolism of β-carotene to vitamin A. BCMO1 catalyses the symmetrical cleavage of β-carotene, resulting in the formation of two molecules of retinal. Retinal (also known as retinaldehyde) is the aldehyde isomer of vitamin A, and can be reversibly reduced to produce retinol or irreversibly oxidized to produce retinoic acid. In animals, the major isoform of vitamin A is retinol. The conversion of β-carotene to vitamin A in animals is usefully reviewed in von Lintig, J. and Vogt, K., 2004.
[00142] The genomic sequence comprising the bovine BCMO1 gene is presented herein as SEQ ID NO:1. The predicted amino acid coding sequence of bovine BCMO1 is presented herein as SEQ ID NO:2, and a provisional reference sequence is available as NCBI accession number NM_001024559.1 (GI:66792909). This coding sequence is derived from a cDNA clone, the sequence of which is available as NCBI accession number DQ008469.1 (GI:62999033). The amino acid sequence encoded by this coding sequence is presented herein as SEQ ID NO:3, and is itself available as NCBI accession number NP_001019730.1 (GI:66792910).
[00143] The present invention relates to the identification that one or more mutations in the BCMO1 gene lead to variation in milk content and composition, particularly variation in milk colour and milk β-carotene content, and are associated with one or more milk content or milk colour phenotypes.
[00144] As described herein, the BCMO1 mutations were closely associated with milk content phenotype. For example (and as described herein), animals homozygous for the C allele (CC genotype) at the C-1054T promoter polymorphism (wild type promoter) produced milk with less β-carotene than animals homozygous for the T allele (TT genotype). This effect was observed at three stages of lactation, and within two sire families that carried the mutation. Animals homozygous for the C allele (wild type promoter) also produced milk with less β-carotene than heterozygous animals. Likewise, animals homozygous for the A allele of the G15929A polymorphism and animals homozygous for the A allele of the A18068G polymorphism produced milk with less β-carotene than animals homozygous for either of the respective G alleles.
[00145] The C-1054T CC genotype for BCMO1 was present in 74.43% of F2 animals in the Holstein-Friesian x Jersey crossbred (FJXB) trial, while the CT genotype was present in 24.49% of F2 animals, and the TT genotype in 1.08% (see Table 2 herein). The G15929A GG genotype for BCMO1 was present in 74.64% of F2 animals in the FJXB trial, while the GA genotype was present in 24.05% of F2 animals, and the AA genotype in 1.31% (see Table 2 herein). The A18068G AA genotype for BCMO1 was present in 84.64% of F2 animals in the FJXB trial, while the AG genotype was present in 14.76% of F2 animals, and the GG genotype in 0.60% (see Table 2 herein). The FJXB trial is detailed in the materials and methods.
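For illustration, the genotype percentages recited in paragraph [00145] can be converted to allele frequencies by counting each homozygote in full and each heterozygote as one half. The Python sketch below does only this arithmetic; the function name allele_frequencies is hypothetical, and no claim is made that the F2 animals satisfy any population-genetic equilibrium.

```python
def allele_frequencies(homozygous_first, heterozygous, homozygous_second):
    """Return (first_allele_freq, second_allele_freq) from genotype percentages.

    Each heterozygote contributes one half to each allele.
    """
    total = homozygous_first + heterozygous + homozygous_second
    if total <= 0:
        raise ValueError("genotype percentages must sum to a positive value")
    first = (homozygous_first + heterozygous / 2.0) / total
    return first, 1.0 - first


# Genotype percentages of F2 animals as recited in paragraph [00145].
F2_GENOTYPES = {
    "C-1054T": ("C", "T", 74.43, 24.49, 1.08),   # CC, CT, TT
    "G15929A": ("G", "A", 74.64, 24.05, 1.31),   # GG, GA, AA
    "A18068G": ("A", "G", 84.64, 14.76, 0.60),   # AA, AG, GG
}

if __name__ == "__main__":
    for name, (a1, a2, hom1, het, hom2) in F2_GENOTYPES.items():
        f1, f2 = allele_frequencies(hom1, het, hom2)
        print(f"{name}: {a1} = {f1:.4f}, {a2} = {f2:.4f}")
```

On these figures the less common allele at each site is carried mainly by heterozygous animals, which is consistent with the small homozygous classes reported.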
[00146] A reference bovine BCMO1 nucleotide sequence referred to as "wild type BCMO1" determined by the applicants is presented as SEQ ID NO:1, the compiled coding sequence is presented as SEQ ID NO:2, and the corresponding amino acid sequence is presented as SEQ ID NO:3. It will be appreciated that the polymorphisms described herein are noted as variants in SEQ ID NO:1, where the wild-type allele is depicted in SEQ ID NO:1, and the non-wild-type allele is noted as the variant nucleotide. Accordingly, as used herein with respect to BCMO1, such as use with respect to a BCMO1 gene or a BCMO1 gene product, the term "wild type" recognizes the characteristics of the BCMO1 nucleotide sequences presented as SEQ ID NOs:1 and 2, and of the protein product encoded thereby. For example, when used with respect to enzymatic activity, the term "wild type" denotes activity associated with the wild type BCMO1 enzyme. Similarly, when used with respect to expression level, the term "wild type" denotes a level of expression associated with the wild type BCMO1 gene or with the wild type BCMO1 promoter.
[00147] It will be apparent that the term "activity" may refer both to the inherent enzymatic activity of a single molecule of BCMO1, which may be wild type activity or may be less or greater than wild type activity as may depend, for example, on the amino acid sequence, the presence of any amino acid substitutions, the availability of co-factors, and the like, as well as to the total enzymatic activity of the population of BCMO1 molecules present (for example, in a bovine or in a sample taken from a bovine), as may depend on both the enzymatic activity of each molecule present and the level of expression (for example, how many such molecules are present).
[00148] As used herein, such as when used in reference to an allelic protein lacking the activity of wild type BCMO1, the phrase "lacking the activity of (A)" contemplates activity both greater than that of (A) and less than that of (A). For example, an allelic protein lacking the activity of wild type BCMO1 may be a variant BCMO1 protein of greater or lesser enzymatic activity than that of wild type BCMO1.
[00149] Methods to assay the activity of BCMO1 are well known in the art. For example, one such method quantifies BCMO1 mRNA, for example using a method well known in the art including quantitative RT-PCR, TaqMan™ assays, or the like. Another exemplary method utilises the assay of the BCMO1 substrate β-carotene described herein, where the disappearance of β-carotene (and the associated reduction in absorbance at 450 nm) correlates with BCMO1 enzymatic activity.
[00150] The C-1054T promoter polymorphism, the G15929A (G278R) polymorphism in exon 6 of the BCMO1 gene, and the A18068G (N341D) polymorphism in exon 7 of the BCMO1 gene are each identified in the bovine BCMO1 gene sequence presented as SEQ ID NO:1 in the sequence ID listing as variants.
2 Identification and analysis of polymorphisms
[00151] It will be apparent to those skilled in the field that the convention of identifying polymorphisms by their position in the genomic sequence relative to the +1 translation start site of the gene in which they occur is followed herein. Accordingly, the C-1054T polymorphism in the BCMO1 gene described herein lies 1054 nucleotides upstream of the +1 translation start site of the BCMO1 gene. Those skilled in the art will also recognise that these positions can readily be expressed relative to the coding sequence (for example for non-intronic polymorphisms).
[00152] It will similarly be apparent to those skilled in the field that the convention of identifying polymorphisms effecting an amino acid substitution by their codon position in the gene in which they occur and the amino acid substitution effected thereby is also contemplated herein. Accordingly, the G15929A polymorphism and the A18068G polymorphism described herein may be referred to by reference to the codon of the BCMO1 gene within which they are located and the amino acid substitution effected, namely the G278R polymorphism, and the N341D polymorphism, respectively.
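The two naming conventions described in paragraphs [00151] and [00152] share one shape: a reference symbol, a signed position, and a variant symbol. As a minimal sketch only, the Python below splits such a name into its parts; the names Polymorphism, parse_name and describe_substitution are hypothetical. Because "G15929A" (nucleotides) and "G278R" (amino acids) are syntactically alike, the caller is assumed to know which convention a given name follows.

```python
import re
from typing import NamedTuple

# One-letter to three-letter codes for the residues named in the text only.
THREE_LETTER = {"G": "Gly", "R": "Arg", "N": "Asn", "D": "Asp"}

_NAME = re.compile(r"^([A-Z])(-?\d+)([A-Z])$")


class Polymorphism(NamedTuple):
    ref: str        # wild-type nucleotide or amino acid
    position: int   # negative values lie upstream of the +1 start site
    alt: str        # variant nucleotide or amino acid


def parse_name(name: str) -> Polymorphism:
    """Split a name such as 'C-1054T' or 'G278R' into its three parts."""
    match = _NAME.match(name)
    if match is None:
        raise ValueError(f"unrecognised polymorphism name: {name!r}")
    return Polymorphism(match.group(1), int(match.group(2)), match.group(3))


def describe_substitution(name: str) -> str:
    """Render an amino acid substitution name in three-letter form."""
    p = parse_name(name)
    return f"{THREE_LETTER[p.ref]}{p.position}{THREE_LETTER[p.alt]}"


if __name__ == "__main__":
    print(parse_name("C-1054T"))
    print(describe_substitution("G278R"))
    print(describe_substitution("N341D"))
```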
[00153] The polymorphisms described herein can be detected directly or by detection of one or more polymorphisms which are in linkage disequilibrium with these polymorphisms. Linkage disequilibrium is a phenomenon in genetics whereby two or more mutations or polymorphisms are in such close genetic proximity that they are co-inherited. This means that in genotyping, detection of one polymorphism as present implies the presence of the other. (Reich DE et al; Linkage disequilibrium in the human genome, Nature 2001, 411:199-204.)
[00154] Various degrees of linkage disequilibrium are possible. Preferably, the one or more polymorphisms in linkage disequilibrium with one or more of the polymorphisms specified herein are in greater than about 60% linkage disequilibrium, for example about 70%, about 75%, about 80%, about 85%, about 90%, about 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%, or about 100% linkage disequilibrium, with one or more of the polymorphisms selected from the group comprising the C-1054T promoter polymorphism of the BCMO1 gene, the G15929A (G278R) polymorphism of the BCMO1 gene, or the A18068G (N341D) polymorphism of the BCMO1 gene. (Devlin and Risch 1995; A comparison of linkage disequilibrium measures for fine-scale mapping, Genomics 29: 311-322).
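The specification does not state which measure of linkage disequilibrium underlies the percentages above. As a hedged sketch, the Python below computes three widely used pairwise measures (D, the normalised D', and r squared) from the frequency of a two-site haplotype and the two allele frequencies. The function name ld_measures is hypothetical, and reading "about 60% linkage disequilibrium" as D' or r squared of 0.6 is an assumption rather than something the text confirms.

```python
def ld_measures(p_ab, p_a, p_b):
    """Return (D, D_prime, r_squared) for two biallelic sites.

    p_ab : frequency of the haplotype carrying allele A at site 1 and B at site 2
    p_a  : frequency of allele A at site 1
    p_b  : frequency of allele B at site 2
    """
    if not (0.0 < p_a < 1.0 and 0.0 < p_b < 1.0):
        raise ValueError("both sites must be polymorphic (0 < frequency < 1)")
    d = p_ab - p_a * p_b
    # D' scales |D| by the largest value it could take at these frequencies.
    if d >= 0:
        d_max = min(p_a * (1.0 - p_b), (1.0 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1.0 - p_a) * (1.0 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    r_squared = d * d / (p_a * (1.0 - p_a) * p_b * (1.0 - p_b))
    return d, d_prime, r_squared


if __name__ == "__main__":
    # Hypothetical frequencies: the two minor alleles always travel together.
    print(ld_measures(p_ab=0.13, p_a=0.13, p_b=0.13))
    # Hypothetical frequencies: the two sites are independent.
    print(ld_measures(p_ab=0.10, p_a=0.20, p_b=0.50))
```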
[00155] There are numerous standard methods known in the art for determining whether a particular DNA sequence is present in a sample, many of which include the step of sequencing a DNA sample. Thus in one embodiment of the invention, the step of determining whether or not the specified nucleotides are present in a nucleic acid derived from a bovine includes the step of sequencing the nucleic acid. Methods for nucleotide sequencing are well known to those skilled in the art.
[00156] In one aspect, the present invention provides a method for determining the genetic merit of a bovine with respect to milk content, and particularly with respect to milk colour or β-carotene content. In one embodiment the method includes ascertaining, from a sample of material containing DNA obtained from the bovine, whether a sequence of the DNA encoding "(A)" a protein having biological activity of wild type BCMO1 is present, and whether a sequence of the DNA encoding "(B)" an allelic protein lacking the activity of (A) is present. In another embodiment, the method includes ascertaining, from a sample of material containing DNA obtained from the bovine, whether the wild type BCMO1 gene sequence is present. In still another embodiment, the method includes ascertaining, from a sample of material containing DNA obtained from the bovine, the expression of the BCMO1 gene product, preferably by determining the presence or absence of one or more polymorphisms associated with decreased or increased BCMO1 expression, for example one or more promoter polymorphisms associated with increased or decreased expression.
[00157] An example of another art standard method known for determining whether a particular DNA sequence is present in a sample is the Polymerase Chain Reaction (PCR). One embodiment of the invention thus includes a step in which ascertaining whether a sequence of the DNA encoding (A) is present, or whether a sequence of the DNA encoding (B) is present, includes amplifying the DNA in the presence of primers based on a nucleotide sequence encoding a protein having biological activity of wild type BCMO1, and/or in the presence of a primer containing at least a portion of a polymorphism known to naturally occur and which when present results in high relative β-carotene levels, and particularly in milk having inter alia a higher β-carotene content, and/or in the presence of a primer containing at least a portion of a polymorphism known to naturally occur and which when present results in low relative β-carotene levels, and particularly in milk having inter alia a lower β-carotene content.
[00158] A primer of the present invention, used in PCR for example, is a nucleic acid molecule sufficiently complementary to the sequence on which it is based and of sufficient length to selectively hybridise to the corresponding portion of a nucleic acid molecule intended to be amplified and to prime synthesis thereof under in vitro conditions commonly used in PCR. Likewise, a probe of the present invention is a molecule, for example a nucleic acid molecule of sufficient length and sufficiently complementary to the nucleic acid molecule of interest, which selectively binds under high or low stringency conditions with the nucleic acid sequence of interest for detection in the presence of nucleic acid molecules having differing sequences.
[00159] Accordingly, a preferred embodiment of the invention thus includes the step of amplifying a BCMO1 polynucleotide in the presence of at least one primer comprising a nucleotide sequence of, or complementary to, the BCMO1 gene (SEQ ID NO:1 and SEQ ID NO:2) or flanking sequence thereof, and/or in the presence of such a primer comprising sequence corresponding to or flanking the C-1054T polymorphism, the G15929A (G278R) polymorphism, or the A18068G (N341D) polymorphism, or comprising sequence including one or other of the allele-specific polymorphic nucleotides at the C-1054T polymorphism, the G15929A (G278R) polymorphism, or the A18068G (N341D) polymorphism as described herein. PCR methods are well known by those skilled in the art (Mullis et al., 1994). The template for amplification may be selected from genomic DNA, mRNA or first strand cDNA derived from a sample obtained from the bovine under test (Sambrook et al., 1987).
[00160] Primers suitable for use in PCR based methods of the invention should be sufficiently complementary to the BCMO1 gene sequence, such as SEQ ID NO:1 or SEQ ID NO:2 or flanking sequence thereof, and of sufficient length to selectively hybridise to the corresponding portion of a nucleic acid molecule intended to be amplified and to prime synthesis thereof under in vitro conditions commonly used in PCR. Such primers should comprise at least about 12 contiguous bases of or complementary to SEQ ID NO:1 or SEQ ID NO:2, or naturally occurring flanking sequences thereof. Examples of such PCR primers are presented herein as SEQ ID NOs: 4-17.
[00161] Suitable PCR primers may include sequence corresponding to the C-1054T C allele-specific or C-1054T T allele-specific BCMO1 nucleotides described herein. Similarly, PCR primers may include sequence corresponding to the G15929A (G278R) G allele-specific or G15929A (G278R) A allele-specific BCMO1 nucleotides, or sequence corresponding to the A18068G (N341D) A allele-specific or A18068G (N341D) G allele-specific BCMO1 nucleotides described herein. Generation of a corresponding PCR product, or the lack of product, may constitute a test for the presence or absence of the specified nucleotides in the BCMO1 gene of the test bovine.
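The presence-or-absence logic of paragraph [00161] can be sketched as follows, assuming two separate allele-specific reactions are run per animal, one with a primer specific for each allele. This is an illustrative Python outline only; the function name call_genotype is hypothetical, and treating the absence of product in both reactions as a failed assay is an assumption rather than a statement from the specification.

```python
from typing import Optional


def call_genotype(product_first: bool, product_second: bool,
                  first_allele: str = "C", second_allele: str = "T") -> Optional[str]:
    """Infer a genotype from two allele-specific PCR reactions.

    product_first / product_second indicate whether an amplification
    product was seen with the primer specific for each allele.
    Returns None when neither reaction gave product (assumed assay failure).
    """
    if product_first and product_second:
        return first_allele + second_allele      # heterozygous
    if product_first:
        return first_allele * 2                  # homozygous, first allele
    if product_second:
        return second_allele * 2                 # homozygous, second allele
    return None


if __name__ == "__main__":
    print(call_genotype(True, False))            # C-1054T defaults
    print(call_genotype(True, True, "G", "A"))   # e.g. G15929A
```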
[00162] Other methods for determining whether a particular nucleotide sequence is present in a sample may include the step of restriction enzyme digestion of a nucleotide sample. Separation and visualisation of the digested restriction fragments by methods well known in the art may form a diagnostic test for the presence of a particular nucleotide sequence. The nucleotide sequence digested may be a PCR product amplified as described above.
[00163] Still other methods for determining whether a particular nucleotide sequence is present in a sample include a step of hybridisation of a probe to a sample nucleotide sequence. Thus, methods for detecting for example the C-1054T C allele-specific or C-1054T T allele-specific nucleotides may comprise the additional steps of hybridisation of a probe derived from the BCMO1 sequence of SEQ ID NO:1 or SEQ ID NO:2.
[00164] Such probes should comprise a nucleic acid molecule of sufficient length and sufficiently complementary to the BCMO1 gene sequence, to selectively bind under high or low stringency conditions with the nucleic acid sequence of a sample to facilitate detection of the presence or absence of the allele-specific nucleotides described herein.
[00165] With respect to polynucleotide molecules greater than about 100 bases in length, typical stringent hybridization conditions are no more than 25 to 30 °C (for example, 10 °C) below the melting temperature (Tm) of the native duplex (see generally, Sambrook et al., 1987; Ausubel et al., 1987). Tm for polynucleotide molecules greater than about 100 bases can be calculated by the formula Tm = 81.5 + 0.41 % (G + C) - log (Na+).
[00166] With respect to polynucleotide molecules having a length less than 100 bases, exemplary stringent hybridization conditions are 5 to 10 °C below Tm. On average, the Tm of a polynucleotide molecule of length less than 100 bp is reduced by approximately (500/oligonucleotide length) °C.
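As a worked illustration of the two preceding paragraphs, the Python sketch below takes a melting temperature as given and returns the temperature window the text treats as stringent, together with the approximate length-dependent reduction in Tm for short molecules. The function names are hypothetical. Two readings are assumptions: the lower bound for long molecules is taken as 30 °C below Tm, and a molecule of exactly 100 bases is grouped with the longer class.

```python
from typing import Tuple


def stringent_window(tm_celsius: float, length: int) -> Tuple[float, float]:
    """Return (lowest, highest) hybridisation temperature in degrees C
    regarded as stringent for a molecule of the given length."""
    if length <= 0:
        raise ValueError("length must be positive")
    if length >= 100:
        # "no more than 25 to 30 degrees C below Tm": take 30 as the outer bound.
        return (tm_celsius - 30.0, tm_celsius)
    # Shorter molecules: "5 to 10 degrees C below Tm".
    return (tm_celsius - 10.0, tm_celsius - 5.0)


def short_molecule_tm_reduction(length: int) -> float:
    """Approximate reduction in Tm (degrees C) for molecules under 100 bp."""
    if not 0 < length < 100:
        raise ValueError("the 500/length rule is stated for lengths under 100")
    return 500.0 / length


if __name__ == "__main__":
    print(stringent_window(85.0, 500))
    print(stringent_window(60.0, 20))
    print(short_molecule_tm_reduction(20))
```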
[00167] Such a probe may be hybridised with genomic DNA, mRNA, or cDNA produced from mRNA, derived from a sample taken from a bovine under test.
[00168] Such probes would typically comprise at least 12 contiguous nucleotides of or complementary to the sequences presented as SEQ ID NO:1 or SEQ ID NO:2, and may comprise sequence corresponding to the allele-specific nucleotides described herein.
[00169] Such probes may additionally comprise means for detecting the presence of the probe when bound to sample nucleotide sequence. Methods for labelling probes such as radiolabelling are well known in the art (see for example, Sambrook et al., 1987).
[00170] As will be apparent to a person skilled in the art, promoter function may be determined by various well-known methods, for example use in reporter systems. For example, one approach of determining the allelic state of a promoter would be via a reporter construct, where the promoter of interest is fused upstream of a reporter gene (e.g. luciferase), and the activity of the reporter is determined and correlates with promoter function.
[00171] Accordingly, in one embodiment the method for determining the genetic merit of a bovine with respect to milk colour or β-carotene content includes ascertaining, from a sample of material containing DNA obtained from the subject, whether a sequence of the DNA encoding the promoter of the BCMO1 gene associated with higher or lower relative levels of expression is present.
[00172] In another aspect, the invention provides a method for determining the genetic merit of bovine with respect to milk colour or β-carotene content with reference to a sample of material containing mRNA obtained from the bovine. In one embodiment this method includes ascertaining whether a sequence of the mRNA encoding (A) a protein having biological activity of a wild type BCMO1 is present, and whether a sequence of the mRNA encoding (B) a protein at least partially lacking the activity of (A) is present, and may include determining the amount of mRNA.
[00173] In another aspect, the invention provides a method for determining the genetic merit of bovine with respect to milk content with reference to a sample of material containing mRNA obtained from the bovine. In one embodiment this method includes ascertaining whether a sequence of the mRNA encoding (A) a protein having biological activity of a wild type BCMO1 is present, and whether a sequence of the mRNA encoding (B) a protein at least partially lacking the activity of (A) is present, and may include determining the amount of mRNA. The absence of the mRNA encoding (A) and the presence of the mRNA encoding (B), or a decrease in the amount of the mRNA encoding (A) compared to wild type levels, again indicates an association with high relative β-carotene levels, particularly with the production of milk with, inter alia, increased fat colour. The reverse association again holds true.
[00174] Again, if an amplification method such as PCR is used in ascertaining whether a sequence of the mRNA encoding (A) is present, or whether a sequence of the mRNA encoding (B) is present, the method includes amplifying the mRNA, for example in the presence of a pair of primers complementary to a nucleotide sequence encoding a protein having biological activity of a wild type BCMO1, or in the presence of a pair of primers complementary to a nucleotide sequence encoding a variant BCMO1 protein. It will be appreciated that in embodiments of the invention reliant on assessing the amount of BCMO1 mRNA present in a sample, quantitative amplification methods well known in the art may be employed, for example quantitative RT-PCR, microarray analysis, and other methods described herein.
[00175] Other methods to quantitate or otherwise assess the amount of nucleic acid, particularly the amount of mRNA, are well known in the art. These include Northern analysis using probes able to hybridise to the target BCMO1 mRNA. Such probes should comprise a nucleic acid molecule of sufficient length and sufficiently complementary to the BCMO1 coding sequence to selectively bind under high or low stringency conditions with the nucleic acid sequence of a sample to facilitate detection and assessment of the amount of BCMO1 mRNA present. As is evident to the person skilled in the art, such quantitative methods generally utilise an internal control, for example in the case of Northern analysis quantitation may be done with reference to, for example, rRNA present in the sample.
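The use of an internal control mentioned above amounts to dividing the target signal by the control signal from the same sample before samples are compared. The Python sketch below shows that arithmetic only; the function names and the notion of a "reference" sample taken to represent wild type expression are hypothetical, and signal values would come from whatever densitometry or quantitative method is actually used.

```python
def normalised_level(target_signal: float, control_signal: float) -> float:
    """Express a target mRNA signal relative to the internal control
    (for example rRNA) measured in the same sample."""
    if control_signal <= 0:
        raise ValueError("control signal must be positive")
    if target_signal < 0:
        raise ValueError("target signal cannot be negative")
    return target_signal / control_signal


def relative_expression(sample_level: float, reference_level: float) -> float:
    """Ratio of a sample's normalised level to that of a reference sample."""
    if reference_level <= 0:
        raise ValueError("reference level must be positive")
    return sample_level / reference_level


if __name__ == "__main__":
    # Hypothetical band intensities, arbitrary units.
    reference = normalised_level(target_signal=800.0, control_signal=1000.0)
    test = normalised_level(target_signal=300.0, control_signal=750.0)
    print(relative_expression(test, reference))
```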
[00176] In a further aspect, the invention provides a method of determining genetic merit of a bovine with respect to milk content which comprises determining the BCMO1 allelic profile of said bovine, together with determining the allelic profile of said bovine at one or more genetic loci associated with milk content, including milk β-carotene content.
[00177] In one embodiment, said genetic locus is a polymorphism in a gene associated with milk β-carotene content, preferably a polymorphism in a gene involved in β-carotene uptake or metabolism. Preferably the gene involved in β-carotene uptake is the SCARB1 gene (the sequence of which is available at NCBI accession number NM_174597.2, GI:31341575). Preferably the gene involved in β-carotene metabolism is selected from BCO2 (the sequence of which is available at NCBI accession number NM_001101987, GI:156120622) or BCMO1.
[00178] The methods of the invention are reliant on genetic information such as that derived from methods suitable to the detection and identification of polymorphisms, particularly single nucleotide polymorphisms (SNPs) associated with the qualitative trait for which an assessment is desired. For the sake of convenience the following discussion refers particularly to SNPs, yet the art-skilled worker will appreciate that the methods discussed are amenable to the detection and identification of other genetic polymorphisms, such as triplet repeats or microsatellites.
[00179] A SNP is a single base change or point mutation resulting in genetic variation between individuals. SNPs are believed to occur in mammalian genomes approximately once every 100 to 300 bases, and can occur in coding or non-coding regions. Due to the redundancy of the genetic code, a SNP in the coding region may or may not change the amino acid sequence of a protein product. A SNP in a non-coding region can, for example, alter gene expression by, for example, modifying control regions such as promoters, transcription factor binding sites, processing sites, ribosomal binding sites, mRNA stability, and affect gene transcription, processing, and translation.
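To illustrate the point about redundancy of the genetic code, the Python sketch below builds the standard codon table and reports whether a single base change within a codon alters the encoded amino acid. The codons used in the examples are hypothetical choices that happen to be consistent with the G278R and N341D substitutions; the specification does not state the actual codons, and the function name snp_effect is an assumption.

```python
# Standard genetic code, codons enumerated in T, C, A, G order at each position.
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W"
               "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR"
               "VVVVAAAADDEEGGGG")
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def snp_effect(codon: str, offset: int, variant_base: str):
    """Return (wild_type_residue, variant_residue, is_synonymous) for a
    single base change at the given 0-based offset within the codon."""
    if len(codon) != 3 or offset not in (0, 1, 2) or variant_base not in BASES:
        raise ValueError("expected a 3-base codon, offset 0-2 and a DNA base")
    variant = codon[:offset] + variant_base + codon[offset + 1:]
    before, after = CODON_TABLE[codon], CODON_TABLE[variant]
    return before, after, before == after


if __name__ == "__main__":
    print(snp_effect("GGA", 0, "A"))  # hypothetical codon: Gly to Arg
    print(snp_effect("AAC", 0, "G"))  # hypothetical codon: Asn to Asp
    print(snp_effect("GGA", 2, "G"))  # third-position change, residue unchanged
```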
[00180] SNPs can facilitate large-scale association genetics studies, and there has recently been great interest in SNP discovery and detection. SNPs show great promise as markers for a number of phenotypic traits (including latent traits), such as for example, disease propensity and severity, wellness propensity, drug responsiveness including, for example, susceptibility to adverse drug reactions, and as described herein association with desirable phenotypic traits. Knowledge of the association of a particular SNP with a phenotypic trait, coupled with the knowledge of whether a subject has said particular SNP, can enable the targeting of diagnostic, preventative and therapeutic applications to allow better disease management, to enhance understanding of disease states, to develop selective breeding regimes, and to identify subjects of desirable genetic merit.
[00181] Indeed, a number of databases have been constructed of known SNPs, and for some such SNPs, the biological effect associated with a SNP. Understandably, there has been a focus on human genetics. For example, the NCBI SNP database "dbSNP" is incorporated into NCBI's Entrez system and can be queried using the same approach as the other Entrez databases such as PubMed and GenBank. This database has records for over 1.5 million SNPs mapped onto the human genome sequence. Each dbSNP entry includes the sequence context of the polymorphism (i.e., the surrounding sequence), the occurrence frequency of the polymorphism (by population or individual), and the experimental method(s), protocols, and conditions used to assay the variation, and can include information associating a SNP with a particular phenotypic trait. Similar databases are available for a number of species of commercial and scientific interest.
[00182] There has been and continues to be a great deal of effort to develop methods that reliably and rapidly identify new SNPs associated with a phenotypic trait. This is no trivial task, at least in part because of the complexity of mammalian genomic DNA (e.g., the haploid human genome is 3 x 10^9 base pairs, while current estimates of the size of the haploid bovine genome are in the range of 2.6 - 2.7 x 10^9 base pairs), and the associated sensitivity and discriminatory requirements.
[00183] Genotyping approaches to detect SNPs well-known in the art include DNA sequencing, methods that require allele specific hybridization of primers or probes, allele specific incorporation of nucleotides to primers bound close to or adjacent to the polymorphisms (often referred to as "single base extension", or "minisequencing"), allele-specific ligation (joining) of oligonucleotides (ligation chain reaction or ligation padlock probes), allele-specific cleavage of oligonucleotides or PCR products by restriction enzymes (restriction fragment length polymorphisms analysis or RFLP) or chemical or other agents, resolution of allele-dependent differences in electrophoretic or chromatographic mobilities, by structure specific enzymes including invasive structure specific enzymes, or mass spectrometry. Analysis of amino acid variation is also possible where the SNP lies in a coding region and results in an amino acid change.
[00184] DNA sequencing allows the direct determination and identification of SNPs. The benefits in specificity and accuracy are generally outweighed for screening purposes by the difficulties inherent in whole genome, or even targeted subgenome, sequencing.
[00185] Mini-sequencing involves allowing a primer to hybridize to the DNA sequence adjacent to the SNP site on the test sample under investigation. The primer is extended by one nucleotide using all four differentially tagged fluorescent dideoxynucleotides (A,C,G, or T), and a DNA polymerase. Only one of the four nucleotides (homozygous case) or two of the four nucleotides (heterozygous case) is incorporated. The base that is incorporated is complementary to the nucleotide at the SNP position.
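The inference step in mini-sequencing described above can be sketched in Python as follows: the set of labelled bases incorporated at the primer end is complemented to give the allele or alleles present on the template strand. The function name genotype_from_incorporated is hypothetical, and reporting the genotype on the template strand with alleles in alphabetical order is a presentational assumption.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def genotype_from_incorporated(incorporated) -> str:
    """Infer the template-strand genotype at a SNP from the dideoxynucleotide(s)
    incorporated in a single base extension reaction.

    One incorporated base implies a homozygote; two imply a heterozygote.
    """
    bases = set(incorporated)
    if not bases or len(bases) > 2 or not bases <= set(COMPLEMENT):
        raise ValueError("expected one or two of A, C, G, T")
    template_alleles = sorted(COMPLEMENT[b] for b in bases)
    if len(template_alleles) == 1:
        return template_alleles[0] * 2
    return "".join(template_alleles)


if __name__ == "__main__":
    print(genotype_from_incorporated({"G"}))
    print(genotype_from_incorporated({"G", "A"}))
```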
[00186] A number of sequencing methods and platforms are particularly suited to large-scale implementation, and are amenable to use in the methods of the invention. These include pyrosequencing methods, such as that utilised in the GS FLX pyrosequencing platform available from 454 Life Sciences (Branford, CT) which can generate 100 million nucleotide data in a 7.5 hour run with a single machine, and solid-state sequencing methods, such as that utilised in the SOLiD sequencing platform (Applied Biosystems, Foster City, CA).
[00187] A number of methods currently used for SNP detection involve site-specific and/or allele-specific hybridisation. These methods are largely reliant on the discriminatory binding of oligonucleotides to target sequences containing the SNP of interest. The techniques of Illumina (San Diego, CA), Affymetrix (Santa Clara, CA) and Nanogen Inc. (San Diego, CA) are particularly well-known, and utilize the fact that DNA duplexes containing single base mismatches are much less stable than duplexes that are perfectly base-paired. The presence of a matched duplex is usually detected by fluorescence. A number of whole-genome genotyping products and solutions amenable or adaptable for use in the present invention are now available, including those available from the above companies.
[00188] The majority of methods to detect or identify SNPs by site-specific hybridisation require target amplification by methods such as PCR to increase sensitivity and specificity (see, for example U.S. Pat. No. 5,679,524, PCT publication WO 98/59066, PCT publication WO 95/12607). US Application 20050059030 (incorporated herein in its entirety) describes a method for detecting a single nucleotide polymorphism in total human DNA without prior amplification or complexity reduction to selectively enrich for the target sequence, and without the aid of any enzymatic reaction. The method utilises a single-step hybridization involving two hybridization events: hybridization of a first portion of the target sequence to a capture probe, and hybridization of a second portion of said target sequence to a detection probe. Both hybridization events happen in the same reaction, and the order in which hybridisation occurs is not critical.
[00189] US Application 20050042608 (incorporated herein in its entirety) describes a modification of the method of electrochemical detection of nucleic acid hybridization of Thorp et al. (U.S. Pat. No. 5,871,918). Briefly, capture probes are designed, each of which has a different SNP base and a sequence of probe bases on each side of the SNP base. The probe bases are complementary to the corresponding target sequence adjacent to the SNP site. Each capture probe is immobilized on a different electrode having a non-conductive outer layer on a conductive working surface of a substrate. The extent of hybridization between each capture probe and the nucleic acid target is detected by detecting the oxidation-reduction reaction at each electrode, utilizing a transition metal complex. These differences in the oxidation rates at the different electrodes are used to determine whether the selected nucleic acid target has a single nucleotide polymorphism at the selected SNP site.
[00190] The technique of Lynx Therapeutics (Hayward, Calif.) using MEGATYPE™ technology can genotype very large numbers of SNPs simultaneously from small or large pools of genomic material. This technology uses fluorescently labeled probes and compares the collected genomes of two populations, enabling detection and recovery of DNA fragments spanning SNPs that distinguish the two populations, without requiring prior SNP mapping or knowledge.
[00191] A number of other methods for detecting and identifying SNPs exist. These include the use of mass spectrometry, for example, to measure probes that hybridize to the SNP. This technique varies in how rapidly it can be performed, from a few samples per day to a high throughput of many thousands of SNPs per day, using mass code tags. A preferred example is the use of mass spectrometric determination of a nucleic acid sequence which comprises the polymorphisms of the invention, for example, which includes the C-1054T promoter polymorphism, the G15929A (G278R) and the A18068G (N341D) exonic polymorphisms, in the BCMOl gene (whether the coding sequence or a complementary sequence). Such mass spectrometric methods are known to those skilled in the art, and the genotyping methods of the invention are amenable to adaptation for the mass spectrometric detection of the polymorphisms of the invention.
[00192] SNPs can also be determined by ligation-bit analysis. This analysis requires two primers that hybridize to a target with a one nucleotide gap between the primers. Each of the four nucleotides is added to a separate reaction mixture containing DNA polymerase, ligase, target DNA and the primers. The polymerase adds a nucleotide to the 3' end of the first primer that is complementary to the SNP, and the ligase then ligates the two adjacent primers together. Upon heating of the sample, if ligation has occurred, the now larger primer will remain hybridized and a signal, for example, fluorescence, can be detected. A further discussion of these methods can be found in U.S. Pat. Nos. 5,919,626; 5,945,283; 5,242,794; and 5,952,174.
[00193] US Patent 6,821,733 (incorporated herein in its entirety) describes methods to detect differences in the sequence of two nucleic acid molecules that includes the steps of: contacting two nucleic acids under conditions that allow the formation of a four-way complex and branch migration; contacting the four-way complex with a tracer molecule and a detection molecule under conditions in which the detection molecule is capable of binding the tracer molecule or the four-way complex; and determining binding of the tracer molecule to the detection molecule before and after exposure to the four-way complex. Competition of the four-way complex with the tracer molecule for binding to the detection molecule indicates a difference between the two nucleic acids.
[00194] Protein- and proteomics-based approaches are also suitable for polymorphism detection and analysis. Polymorphisms which result in or are associated with variation in expressed proteins can be detected directly by analysing said proteins. This typically requires separation of the various proteins within a sample, by, for example, gel electrophoresis or HPLC, and identification of said proteins or peptides derived therefrom, for example by NMR or protein sequencing such as chemical sequencing or more prevalently mass spectrometry. Proteomic methodologies are well known in the art, and have great potential for automation. For example, integrated systems, such as the ProteomIQ™ system from Proteome Systems, provide high throughput platforms for proteome analysis combining sample preparation, protein separation, image acquisition and analysis, protein processing, mass spectrometry and bioinformatics technologies.
[00195] The majority of proteomic methods of protein identification utilise mass spectrometry, including ion trap mass spectrometry, liquid chromatography (LC) and LC/MSn mass spectrometry, gas chromatography (GC) mass spectrometry, Fourier transform ion cyclotron resonance mass spectrometry (FT-MS), MALDI-TOF mass spectrometry, and ESI mass spectrometry, and their derivatives. Mass spectrometric methods are also useful in the determination of post-translational modification of proteins, such as phosphorylation or glycosylation, and thus have utility in determining polymorphisms that result in or are associated with variation in post-translational modifications of proteins.
[00196] Associated technologies are also well known, and include, for example, protein processing devices such as the "Chemical Inkjet Printer" comprising piezoelectric printing technology that allows in situ enzymatic or chemical digestion of protein samples electroblotted from 2-D PAGE gels to membranes by jetting the enzyme or chemical directly onto the selected protein spots. After in-situ digestion and incubation of the proteins, the membrane can be placed directly into the mass spectrometer for peptide analysis.
[00197] It will be apparent that the presence or absence of the C allele or of the T allele at the C-1054T promoter polymorphism, or the presence or absence of the G or of the A allele at the G15929A exonic polymorphism, or the presence or absence of the G or of the A allele at the A18068G exonic polymorphism, in the BCMO1 gene may also be determined by analysis of a polypeptide sample derived from a bovine.
[00198] Suitable polypeptide-based analyses include those able to discriminate between full-length and truncated protein products, and may include, but are not limited to, the following: native polyacrylamide gel electrophoresis (PAGE), isoelectric focussing, 2D PAGE, or Western blotting with specific antibodies. Mass spectroscopy, immunoprecipitation, and peptide fingerprinting are also suitable.
[00199] A large number of methods reliant on the conformational variability of nucleic acids have been developed to detect SNPs.
[00200] For example, Single Strand Conformational Polymorphism (SSCP, Orita et al., PNAS 1989 86:2766-2770) is a method reliant on the ability of single-stranded nucleic acids to form secondary structure in solution under certain conditions. The secondary structure depends on the base composition and can be altered by a single nucleotide substitution, causing differences in electrophoretic mobility under nondenaturing conditions. The various polymorphs are typically detected by autoradiography when radioactively labelled, by silver staining of bands, by hybridisation with detectably labelled probe fragments or the use of fluorescent PCR primers which are subsequently detected, for example by an automated DNA sequencer.
[00201] Modifications of SSCP are well known in the art, and include the use of differing gel running conditions, such as for example differing temperature, or the addition of additives, and different gel matrices. Other variations on SSCP are well known to the skilled artisan, including RNA-SSCP, restriction endonuclease fingerprinting-SSCP, dideoxy fingerprinting (a hybrid between dideoxy sequencing and SSCP), bi-directional dideoxy fingerprinting (in which the dideoxy termination reaction is performed simultaneously with two opposing primers), and Fluorescent PCR-SSCP (in which PCR products are internally labelled with multiple fluorescent dyes, may be digested with restriction enzymes, followed by SSCP, and analysed on an automated DNA sequencer able to detect the fluorescent dyes).
[00202] Other methods which utilise the varying mobility of different nucleic acid structures include Denaturing Gradient Gel Electrophoresis (DGGE), Temperature Gradient Gel Electrophoresis (TGGE), and Heteroduplex Analysis (HET). Here, variation in the dissociation of double stranded DNA (for example, due to base-pair mismatches) results in a change in electrophoretic mobility. These mobility shifts are used to detect nucleotide variations.
[00203] Denaturing High Pressure Liquid Chromatography (HPLC) is yet a further method utilised to detect SNPs, using HPLC methods well-known in the art as an alternative to the separation methods described above (such as gel electrophoresis) to detect, for example, homoduplexes and heteroduplexes which elute from the HPLC column at different rates, thereby enabling detection of mismatch nucleotides and thus SNPs.
[00204] Yet further methods to detect SNPs rely on the differing susceptibility of single stranded and double stranded nucleic acids to cleavage by various agents, including chemical cleavage agents and nucleolytic enzymes. For example, cleavage of mismatches within RNA:DNA heteroduplexes by RNase A, of heteroduplexes by, for example, bacteriophage T4 endonuclease VII or T7 endonuclease I, of the 5' end of the hairpin loops at the junction between single stranded and double stranded DNA by cleavase I, and the modification of mispaired nucleotides within heteroduplexes by chemical agents commonly used in Maxam-Gilbert sequencing chemistry, are all well known in the art.
[00205] Further examples include the Protein Truncation Test (PTT), used to resolve stop codons generated by variations which lead to a premature termination of translation and to protein products of reduced size, and the use of mismatch binding proteins. Variations are detected by binding of, for example, the MutS protein, a component of the Escherichia coli DNA mismatch repair system, or the human hMSH2 and GTBP proteins, to double stranded DNA heteroduplexes containing mismatched bases. DNA duplexes are then incubated with the mismatch binding protein, and variations are detected by mobility shift assay. For example, a simple assay is based on the fact that the binding of the mismatch binding protein to the heteroduplex protects the heteroduplex from exonuclease degradation.
[00206] Those skilled in the art will know that a particular SNP, particularly when it occurs in a regulatory region of a gene such as a promoter, can be associated with altered expression of a gene. Altered expression of a gene can also result when the SNP is located in the coding region of a protein-encoding gene, for example where the SNP is associated with codons of varying usage and thus with tRNAs of differing abundance. Such altered expression can be determined by methods well known in the art, and can thereby be employed to detect such SNPs. Similarly, where a SNP occurs in the coding region of a gene and results in a non-synonymous amino acid substitution, such substitution can result in a change in the function of the gene product. Similarly, in cases where the gene product is an RNA, such SNPs can result in a change of function in the RNA gene product. Any such change in function, for example as assessed in an activity or functionality assay, can be employed to detect such SNPs.
[00207] The above methods of detecting and identifying SNPs are amenable to use in the methods of the invention.
3 Polynucleotides, polypeptides, and variants
[00208] The term "polynucleotide(s)," as used herein, means a single or double-stranded deoxyribonucleotide or ribonucleotide polymer of any length but preferably at least 15 nucleotides, and includes as non-limiting examples, coding and non-coding sequences of a gene, sense and antisense sequences, complements, exons, introns, genomic DNA, cDNA, pre-mRNA, mRNA, rRNA, siRNA, miRNA, tRNA, ribozymes, recombinant polynucleotides, isolated and purified naturally occurring DNA or RNA sequences, synthetic RNA and DNA sequences, nucleic acid probes, primers and fragments. A number of nucleic acid analogues are well known in the art and are also contemplated.
[00209] A "fragment" of a polynucleotide sequence provided herein is a subsequence of contiguous nucleotides that is preferably at least 15 nucleotides in length. The fragments of the invention preferably comprise at least 20 nucleotides, more preferably at least 30 nucleotides, more preferably at least 40 nucleotides, more preferably at least 50 nucleotides and most preferably at least 60 contiguous nucleotides of a polynucleotide of the invention. A fragment of a polynucleotide sequence can be used in antisense, gene silencing, triple helix or ribozyme technology, or as a primer, a probe, included in a microarray, or used in polynucleotide-based selection methods.
[00210] The term "fragment" in relation to promoter polynucleotide sequences is intended to include sequences comprising cis-elements and regions of the promoter polynucleotide sequence capable of regulating expression of a polynucleotide sequence to which the fragment is operably linked.
[00211] Preferably fragments of polynucleotide sequences of the invention comprise at least 20, more preferably at least 30, more preferably at least 40, more preferably at least 50, more preferably at least 100, more preferably at least 200, more preferably at least 300, more preferably at least 400, more preferably at least 500, more preferably at least 600, more preferably at least 700, more preferably at least 800, more preferably at least 900 and most preferably at least 1000 contiguous nucleotides of a polynucleotide of the invention.
[00212] A "fragment" of a polypeptide or an amino acid sequence provided herein is a subsequence of contiguous amino acids that is preferably at least 10 amino acids in length. The fragments of the invention preferably comprise or consist of at least about 15 amino acids, at least 20 amino acids, more preferably at least 30 amino acids, more preferably at least 40 amino acids, more preferably at least 50 amino acids and most preferably at least 60 contiguous amino acids of a polypeptide of the invention. A fragment of a polypeptide sequence can be used in the methods described herein, for example in polypeptide-based selection methods as described herein, and polypeptide fragments retaining functional activity may be used in methods of the invention that are reliant on such activity.
[00213] The term "primer" refers to a short polynucleotide, usually having a free 3'OH group, that is hybridized to a template and used for priming polymerization of a polynucleotide complementary to the template. Such a primer is preferably at least 5, more preferably at least 6, more preferably at least 7, more preferably at least 9, more preferably at least 10, more preferably at least 11, more preferably at least 12, more preferably at least 13, more preferably at least 14, more preferably at least 15, more preferably at least 16, more preferably at least 17, more preferably at least 18, more preferably at least 19, more preferably at least 20 nucleotides in length.
[00214] The term "probe" refers to a short polynucleotide that is used to detect a polynucleotide sequence that is complementary to the probe, in a hybridization-based assay. The probe may consist of a "fragment" of a polynucleotide as defined herein. Preferably such a probe is at least 5, more preferably at least 10, more preferably at least 20, more preferably at least 30, more preferably at least 40, more preferably at least 50, more preferably at least 100, more preferably at least 200, more preferably at least 300, more preferably at least 400 and most preferably at least 500 nucleotides in length.
[00215] The term "variant" as used herein refers to polynucleotide or polypeptide sequences different from the specifically identified sequences, wherein one or more nucleotides or amino acid residues is deleted, substituted, or added. Variants may be naturally occurring allelic variants, or non-naturally occurring variants. Variants may be from the same or from other species and may encompass homologues, paralogues and orthologues. In certain embodiments, variants of the polynucleotides and polypeptides possess biological activities that are the same or similar to those of the wild type polynucleotides or polypeptides. The term "variant" with reference to polynucleotides and polypeptides encompasses all forms of polynucleotides and polypeptides as defined herein.
3.1 Polynucleotide variants
[00216] Variant polynucleotide sequences preferably exhibit at least 50%, more preferably at least 51%, at least 52%, at least 53%, at least 54%, at least 55%, at least 56%, at least 57%, at least 58%, at least 59%, at least 60%, at least 61%, at least 62%, at least 63%, at least 64%, at least 65%, at least 66%, at least 67%, at least 68%, at least 69%, at least 70%, at least 71%, at least 72%, at least 73%, at least 74%, at least 75%, at least 76%, at least 77%, at least 78%, at least 79%, at least 80%, at least 81%, at least 82%, at least 83%, at least 84%, at least 85%, at least 86%, at least 87%, at least 88%, at least 89%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identity to a specified polynucleotide sequence. Identity is found over a comparison window of at least 20 nucleotide positions, preferably at least 50 nucleotide positions, at least 100 nucleotide positions, or over the entire length of the specified polynucleotide sequence.
[00217] Polynucleotide sequence identity can be determined in the following manner. The subject polynucleotide sequence is compared to a candidate polynucleotide sequence using BLASTN (from the BLAST suite of programs, version 2.2.10 [Oct 2004]) in bl2seq (Tatiana A. Tatusova, Thomas L. Madden (1999), "Blast 2 sequences - a new tool for comparing protein and nucleotide sequences", FEMS Microbiol Lett. 174:247-250), which is publicly available from NCBI (ftp://ftp.ncbi.nih.gov/blast/). The default parameters of bl2seq are utilized except that filtering of low complexity parts should be turned off.
[00218] The identity of polynucleotide sequences may be examined using the following unix command line parameters:
[00219] bl2seq -i nucleotideseq1 -j nucleotideseq2 -F F -p blastn
[00220] The parameter -F F turns off filtering of low complexity sections. The parameter -p selects the appropriate algorithm for the pair of sequences. The bl2seq program reports sequence identity as both the number and percentage of identical nucleotides in a line "Identities = ".
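As a minimal sketch of how the "Identities = " line described above could be read programmatically, the following Python extracts the identical and aligned position counts from a saved report and checks them against an identity threshold and comparison window. The function names, the sample report text and the exact layout of the surrounding report lines are assumptions made for illustration; they are not part of the specification.

```python
import re

# Matches a summary line of the form "Identities = 95/100 (95%)".
# The surrounding report layout is assumed, not taken from the specification.
_IDENTITIES = re.compile(r"Identities\s*=\s*(\d+)/(\d+)\s*\((\d+)%\)")


def parse_identities(report_text):
    """Return (identical, aligned_length, percent) for each 'Identities =' line."""
    results = []
    for match in _IDENTITIES.finditer(report_text):
        identical = int(match.group(1))
        length = int(match.group(2))
        results.append((identical, length, 100.0 * identical / length))
    return results


def meets_identity_threshold(report_text, min_percent, min_window=20):
    """True if any alignment spans >= min_window positions at >= min_percent identity."""
    return any(
        length >= min_window and percent >= min_percent
        for _, length, percent in parse_identities(report_text)
    )


# Hypothetical report fragment, used only to exercise the parser.
sample = (
    " Score = 180 bits (91), Expect = 2e-50\n"
    " Identities = 95/100 (95%)\n"
    " Strand = Plus / Plus\n"
)
print(parse_identities(sample))
print(meets_identity_threshold(sample, min_percent=90))
```

The percentage is recomputed from the two counts rather than taken from the rounded figure in parentheses, so that thresholds such as "at least 99%" are not satisfied by rounding alone.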
[00221] Polynucleotide sequence identity may also be calculated over the entire length of the overlap between a candidate and subject polynucleotide sequences using global sequence alignment programs (e.g. Needleman, S. B. and Wunsch, C. D. (1970) J. Mol. Biol. 48, 443-453). A full implementation of the Needleman-Wunsch global alignment algorithm is found in the needle program in the EMBOSS package (Rice, P., Longden, I. and Bleasby, A. EMBOSS: The European Molecular Biology Open Software Suite, Trends in Genetics June 2000, vol 16, No 6, pp. 276-277) which can be obtained from http://www.hgmp.mrc.ac.uk/Software/EMBOSS/. The European Bioinformatics Institute server also provides the facility to perform EMBOSS-needle global alignments between two sequences on line at http://www.ebi.ac.uk/emboss/align/.
[00222] Alternatively the GAP program may be used which computes an optimal global alignment of two sequences without penalizing terminal gaps. GAP is described in the following paper: Huang, X. (1994) On Global Sequence Alignment. Computer Applications in Biosciences 10, 227-235.
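The global alignment approach referred to above can be illustrated with a minimal Needleman-Wunsch sketch in Python. It is not the EMBOSS needle or GAP implementation: the scoring values (match +1, mismatch -1, linear gap -1) are assumptions chosen for simplicity, terminal gaps are penalized like any other gap, and percent identity is computed over the full alignment length including gap columns, which is also an assumption.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Globally align strings a and b; return the two gapped strings."""
    n, m = len(a), len(b)
    # score[i][j] holds the best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + sub,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,      # gap in b
                score[i][j - 1] + gap,      # gap in a
            )
    # Trace back from the bottom-right corner to recover one optimal alignment.
    i, j = n, m
    out_a, out_b = [], []
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            sub = match if a[i - 1] == b[j - 1] else mismatch
            if score[i][j] == score[i - 1][j - 1] + sub:
                out_a.append(a[i - 1])
                out_b.append(b[j - 1])
                i -= 1
                j -= 1
                continue
        if i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def percent_identity(aligned_a, aligned_b):
    """Identical, non-gap columns as a percentage of the alignment length."""
    same = sum(1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-")
    return 100.0 * same / len(aligned_a)


aligned = needleman_wunsch("ACGT", "AGT")
print(aligned, percent_identity(*aligned))
```

A production tool would normally use affine gap penalties and a nucleotide or amino acid substitution matrix; the linear scheme here only shows the structure of the dynamic programme.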
[00223] Polynucleotide variants of the present invention also encompass those which exhibit a similarity to one or more of the specifically identified sequences that is likely to preserve the functional equivalence of those sequences and which could not reasonably be expected to have occurred by random chance. Such sequence similarity with respect to polypeptides may be determined using the publicly available bl2seq program from the BLAST suite of programs (version 2.2.10 [Oct 2004]) from NCBI (ftp://ftp.ncbi.nih.gov/blast/).
[00224] The similarity of polynucleotide sequences may be examined using the following unix command line parameters:
[00225] bl2seq -i nucleotideseq1 -j nucleotideseq2 -F F -p tblastx
[00226] The parameter -F F turns off filtering of low complexity sections. The parameter -p selects the appropriate algorithm for the pair of sequences. This program finds regions of similarity between the sequences and for each such region reports an "E value" which is the expected number of times one could expect to see such a match by chance in a database of a fixed reference size containing random sequences. The size of this database is set by default in the bl2seq program. For small E values, much less than one, the E value is approximately the probability of such a random match.
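The relationship between a small E value and the probability of a chance match can be made concrete with a short Python sketch. It assumes, as is usual in BLAST statistics, that the number of chance matches follows a Poisson distribution with mean E, so that the probability of at least one such match is 1 - exp(-E). The function name is illustrative only.

```python
import math


def e_value_to_probability(e_value):
    """Probability of at least one chance match, assuming Poisson-distributed hits."""
    # -expm1(-E) equals 1 - exp(-E) but stays accurate for very small E.
    return -math.expm1(-e_value)


for e in (1e-10, 0.01, 1.0, 5.0):
    print(e, e_value_to_probability(e))
```

For E values far below one the two quantities are nearly indistinguishable, which is why thresholds are conveniently stated directly as E values.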
[00227] Variant polynucleotide sequences preferably exhibit an E value of less than 1 x 10^-10, more preferably less than 1 x 10^-20, less than 1 x 10^-30, less than 1 x 10^-40, less than 1 x 10^-50, less than 1 x 10^-60, less than 1 x 10^-70, less than 1 x 10^-80, less than 1 x 10^-90, less than 1 x 10^-100, less than 1 x 10^-110, less than 1 x 10^-120 or less than 1 x 10^-123 when compared with any one of the specifically identified sequences.
[00228] Alternatively, variant polynucleotides of the present invention hybridize to a specified polynucleotide sequence, or complements thereof under stringent conditions.
[00229] The term "hybridize under stringent conditions", and grammatical equivalents thereof, refers to the ability of a polynucleotide molecule to hybridize to a target polynucleotide molecule (such as a target polynucleotide molecule immobilized on a DNA or RNA blot, such as a Southern blot or Northern blot) under defined conditions of temperature and salt concentration. The ability to hybridize under stringent hybridization conditions can be determined by initially hybridizing under less stringent conditions then increasing the stringency to the desired stringency.
[00230] With respect to polynucleotide molecules greater than about 100 bases in length, typical stringent hybridization conditions are no more than 25 to 30°C (for example, 10°C) below the melting temperature (Tm) of the native duplex (see generally, Sambrook et al., Eds, 1987, Molecular Cloning, A Laboratory Manual, 2nd Ed. Cold Spring Harbor Press; Ausubel et al., 1987, Current Protocols in Molecular Biology, Greene Publishing). Tm for polynucleotide molecules greater than about 100 bases can be calculated by the formula Tm = 81.5 + 0.41% (G + C) - log (Na+) (Sambrook et al., Eds, 1987, Molecular Cloning, A Laboratory Manual, 2nd Ed. Cold Spring Harbor Press; Bolton and McCarthy, 1962, PNAS 48:1390). Typical stringent conditions for a polynucleotide of greater than 100 bases in length would be hybridization conditions such as prewashing in a solution of 6X SSC, 0.2% SDS; hybridizing at 65°C, 6X SSC, 0.2% SDS overnight; followed by two washes of 30 minutes each in 1X SSC, 0.1% SDS at 65°C and two washes of 30 minutes each in 0.2X SSC, 0.1% SDS at 65°C.
[00231] With respect to polynucleotide molecules having a length less than 100 bases, exemplary stringent hybridization conditions are 5 to 10°C below Tm. On average, the Tm of a polynucleotide molecule of length less than 100 bp is reduced by approximately (500/oligonucleotide length)°C.
[00232] With respect to the DNA mimics known as peptide nucleic acids (PNAs) (Nielsen et al., Science. 1991 Dec 6;254(5037):1497-500) Tm values are higher than those for DNA-DNA or DNA-RNA hybrids, and can be calculated using the formula described in Giesen et al., Nucleic Acids Res. 1998 Nov 1;26(21):5004-6. Exemplary stringent hybridization conditions for a DNA-PNA hybrid having a length less than 100 bases are 5 to 10°C below the Tm.
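The temperature windows described above can be summarised in a small Python sketch that takes an already determined Tm and a molecule length and returns the corresponding stringent hybridization range. The function names are hypothetical, a molecule of exactly 100 bases is treated as "long" (the text leaves this boundary open), and the length correction is read as a subtraction from a Tm computed for a long duplex; each of these is an assumption made for illustration.

```python
def stringent_temperature_range(tm_celsius, length_bases):
    """Return (lowest, highest) stringent hybridization temperature in degrees C."""
    if length_bases >= 100:
        # Long molecules: no more than 25 to 30 degrees C below Tm.
        return (tm_celsius - 30.0, tm_celsius)
    # Short molecules (and DNA-PNA hybrids under 100 bases): 5 to 10 degrees C below Tm.
    return (tm_celsius - 10.0, tm_celsius - 5.0)


def length_corrected_tm(tm_long_celsius, length_bases):
    """Apply the approximate (500 / length) degrees C reduction for short molecules."""
    if length_bases >= 100:
        return tm_long_celsius
    return tm_long_celsius - 500.0 / length_bases


print(stringent_temperature_range(85.0, 500))
print(stringent_temperature_range(length_corrected_tm(80.0, 25), 25))
```

For a 25-base oligonucleotide the correction is 500/25 = 20°C, so a nominal Tm of 80°C becomes 60°C and the stringent window is 50 to 55°C.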
[00233] Variant polynucleotides of the present invention also encompass polynucleotides that differ from the sequences of the invention but that, as a consequence of the degeneracy of the genetic code, encode a polypeptide having similar activity to a polypeptide encoded by a polynucleotide of the present invention. A sequence alteration that does not change the amino acid sequence of the polypeptide is a "silent variation". Except for ATG (methionine) and TGG (tryptophan), other codons for the same amino acid may be changed by art recognized techniques, e.g., to optimize codon expression in a particular host organism.
[00234] Polynucleotide sequence alterations resulting in conservative substitutions of one or several amino acids in the encoded polypeptide sequence without significantly altering its biological activity are also included in the invention. A skilled artisan will be aware of methods for making phenotypically silent amino acid substitutions (see, e.g., Bowie et al., 1990, Science 247, 1306).
[00235] Variant polynucleotides due to silent variations and conservative substitutions in the encoded polypeptide sequence may be determined using the publicly available bl2seq program from the BLAST suite of programs (version 2.2.10 [Oct 2004]) from NCBI (ftp://ftp.ncbi.nih.gov/blast/) via the tblastx algorithm as previously described.
3.2 Polypeptide Variants
[00236] The term "variant" with reference to polypeptides encompasses naturally occurring, recombinantly and synthetically produced polypeptides. Variant polypeptide sequences preferably exhibit at least 50%, more preferably at least 51%, at least 52%, at least 53%, at least 54%, at least 55%, at least 56%, at least 57%, at least 58%, at least 59%, at least 60%, at least 61%, at least 62%, at least 63%, at least 64%, at least 65%, at least 66%, at least 67%, at least 68%, at least 69%, at least 70%, at least 71%, at least 72%, at least 73%, at least 74%, at least 75%, at least 76%, at least 77%, at least 78%, at least 79%, at least 80%, at least 81%, at least 82%, at least 83%, at least 84%, at least 85%, at least 86%, at least 87%, at least 88%, at least 89%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identity to a sequence of the present invention. Identity is found over a comparison window of at least 20 amino acid positions, preferably at least 50 amino acid positions, at least 100 amino acid positions, or over the entire length of a polypeptide of the invention.
[00237] Polypeptide sequence identity can be determined in the following manner. The subject polypeptide sequence is compared to a candidate polypeptide sequence using BLASTP (from the BLAST suite of programs, version 2.2.10 [Oct 2004]) in bl2seq, which is publicly available from NCBI (ftp://ftp.ncbi.nih.gov/blast/). The default parameters of bl2seq are utilized except that filtering of low complexity regions should be turned off.
[00238] Polypeptide sequence identity may also be calculated over the entire length of the overlap between a candidate and subject polypeptide sequences using global sequence alignment programs. EMBOSS-needle (available at http://www.ebi.ac.uk/emboss/align/) and GAP (Huang, X. (1994) On Global Sequence Alignment. Computer Applications in the Biosciences 10, 227-235), as discussed above, are also suitable global sequence alignment programs for calculating polypeptide sequence identity.
[00239] Polypeptide variants of the present invention also encompass those which exhibit a similarity to one or more of the specifically identified sequences that is likely to preserve the functional equivalence of those sequences and which could not reasonably be expected to have occurred by random chance. Such sequence similarity with respect to polypeptides may be determined using the publicly available bl2seq program from the BLAST suite of programs (version 2.2.10 [Oct 2004]) from NCBI (ftp://ftp.ncbi.nih.gov/blast/). The similarity of polypeptide sequences may be examined using the following unix command line parameters:
bl2seq -i peptideseq1 -j peptideseq2 -F F -p blastp
[00240] Variant polypeptide sequences preferably exhibit an E value of less than 1 x 10^-10, more preferably less than 1 x 10^-20, less than 1 x 10^-30, less than 1 x 10^-40, less than 1 x 10^-50, less than 1 x 10^-60, less than 1 x 10^-70, less than 1 x 10^-80, less than 1 x 10^-90, less than 1 x 10^-100, less than 1 x 10^-110, less than 1 x 10^-120 or less than 1 x 10^-123 when compared with any one of the specifically identified sequences.
[00241] The parameter -F F turns off filtering of low complexity sections. The parameter -p selects the appropriate algorithm for the pair of sequences. This program finds regions of similarity between the sequences and for each such region reports an "E value" which is the expected number of times one could expect to see such a match by chance in a database of a fixed reference size containing random sequences. For small E values, much less than one, this is approximately the probability of such a random match.
[00242] Conservative substitutions of one or several amino acids of a described polypeptide sequence without significantly altering its biological activity are also included in the invention. A skilled artisan will be aware of methods for making phenotypically silent amino acid substitutions (see, e.g., Bowie et al., 1990, Science 247, 1306).
[00243] A polypeptide variant of the present invention also encompasses that which is produced from the nucleic acid encoding a polypeptide, but differs from the wild type polypeptide in that it is processed differently such that it has an altered amino acid sequence. For example a variant may be produced by an alternative splicing pattern of the primary RNA transcript to that which produces a wild type polypeptide.
4 Diagnostic kits
[00244] The invention further provides diagnostic kits useful in determining the BCMO1 allelic profile of a bovine, for example for use in the methods of the present invention.
[00245] Accordingly, in one embodiment the invention provides a diagnostic kit which can be used to determine the BCMO1 genotype of bovine genetic material. One kit includes a set of primers used for amplifying the genetic material. A kit can contain a primer including a nucleotide sequence for amplifying a region of the genetic material containing a non-wild type allele at a polymorphism. Such a kit could also include a primer for amplifying the corresponding region of the reference gene, for example one that produces a wild type BCMO1 or a functionally wild type BCMO1. Usually, such a kit would also include another primer upstream or downstream of the region of the gene. These primers are used to amplify the segment containing the polymorphism of interest. The actual genotyping is carried out using primers that target specific alleles such as those described herein, and that could function as allele-specific oligonucleotides in conventional hybridisation, Taqman assays, OLE assays, etc. Alternatively, primers can be designed to permit genotyping by microsequencing.
[00246] One kit of primers can include first, second and third primers, (a), (b) and (c), respectively. Primer (a) is based on a region containing a BCMO1 mutation. Primer (b) encodes a region upstream or downstream of the region to be amplified by primer (a) so that genetic material containing the mutation is amplified, by PCR, for example, in the presence of the two primers. Primer (c) is based on the region corresponding to that on which primer (a) is based, but lacking the mutation. Thus, genetic material containing the non-mutated region will be amplified in the presence of primers (b) and (c). Genetic material homozygous for the wild type gene will thus provide amplified products in the presence of primers (b) and (c). Genetic material homozygous for the mutated gene will thus provide amplified products in the presence of primers (a) and (b). Heterozygous genetic material will provide amplified products in both cases.
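The interpretation of the three-primer kit described above amounts to a small decision table, sketched here in Python. The function name and the string labels are hypothetical, and the "no call" outcome for a sample that amplifies in neither reaction (for example, a failed PCR) is an assumption added for completeness; it is not stated in the specification.

```python
def call_genotype(product_with_a_and_b, product_with_b_and_c):
    """Infer genotype from two allele-specific amplification results.

    product_with_a_and_b: True if primers (a) and (b) yield a product
        (primer (a) matches the mutated region).
    product_with_b_and_c: True if primers (b) and (c) yield a product
        (primer (c) matches the non-mutated region).
    """
    if product_with_a_and_b and product_with_b_and_c:
        return "heterozygous"
    if product_with_a_and_b:
        return "homozygous mutant"
    if product_with_b_and_c:
        return "homozygous wild type"
    # Neither reaction amplified: treated here as an uninformative result.
    return "no call"


for ab, bc in [(True, True), (True, False), (False, True), (False, False)]:
    print(ab, bc, call_genotype(ab, bc))
```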
[00247] For example, the kit may include a primer comprising a cytosine at the position corresponding to the C-1054T promoter polymorphism in the BCMO1 gene or comprising a nucleotide capable of hybridising to a thymine at the position corresponding to the C-1054T promoter polymorphism in the BCMO1 gene. Those skilled in the art will recognise that in such a primer, the cytosine, or the nucleotide capable of hybridising to a thymine, as applicable, may be substituted for a nucleotide analogue having the same discriminatory base-pairing as the substituted nucleotide.
[00248] In another example, the kit may include a primer comprising a thymine at the position corresponding to the C-1054T promoter polymorphism in the BCMO1 gene, or comprising a nucleotide capable of hybridising to an adenine at the position corresponding to the C-1054T promoter polymorphism in the BCMO1 gene. Those skilled in the art will recognise that in such a primer, the thymine, or the nucleotide capable of hybridising to an adenine, as applicable, may be substituted for a nucleotide analogue having the same discriminatory base-pairing as the substituted nucleotide.
[00249] Those skilled in the art will appreciate that the invention provides kits comprising primers similarly directed to the G15929A (G278R) polymorphism, or to the A18068G (N341D) polymorphism.
[00250] In one embodiment, the diagnostic kit is useful in detecting DNA comprising a variant BCMO1 gene or DNA or mRNA encoding a variant BCMO1 polypeptide at least partially lacking reference activity in a bovine which includes first and second primers for amplifying the DNA or mRNA, the primers being complementary to nucleotide sequences of the DNA or mRNA upstream and downstream, respectively, of a polymorphism in the portion of the DNA encoding BCMO1 which results in increased β-carotene levels (particularly increased β-carotene content in milk fat), preferably wherein at least one of the nucleotide sequences is selected to be from a non-coding region of the wild type BCMO1 gene. The kit can also include a third primer complementary to a naturally occurring mutation of a coding portion of the wild type BCMO1 gene. Preferably the kit includes instructions for use, for example in accordance with a method of the invention.
[00251] In one embodiment, the diagnostic kit comprises a nucleotide probe complementary to the sequence, or an oligonucleotide fragment thereof, shown in SEQ ID NO:1 or SEQ ID NO:2, for example, for hybridisation with mRNA from a sample of cells; and means for detecting the nucleotide probe bound to mRNA in the sample, together with a standard. In a particular aspect, the kit of this aspect of the invention includes a probe having a nucleic acid molecule sufficiently complementary with a sequence presented in SEQ ID NO:1 or SEQ ID NO:2 or complements thereof, so as to bind thereto under stringent conditions. "Stringent hybridisation conditions" takes on its common meaning to a person skilled in the art. Appropriate stringency conditions which promote nucleic acid hybridisation, for example, 6x sodium chloride/sodium citrate (SSC) at about 45°C, are known to those skilled in the art, including in Current Protocols in Molecular Biology, John Wiley & Sons, NY (1989). Appropriate wash stringency depends on degree of homology and length of probe. If homology is 100%, a high temperature (65°C to 75°C) may be used. However, if the probe is very short (<100bp), lower temperatures must be used even with 100% homology. In general, one starts washing at low temperatures (37°C to 40°C), and raises the temperature by 3-5°C intervals until background is low enough not to be a major factor in autoradiography. The diagnostic kit can also contain an instruction manual for use of the kit.
[00252] The invention also includes kits for detecting the presence of BCMO1 protein in a biological sample. For example, the kit can include a compound or agent capable of detecting BCMO1 protein in a biological sample; and a standard. The compound or agent can be packaged in a suitable container. The kit can further comprise instructions for using the kit to detect BCMO1 protein.
[00253] In one embodiment, the diagnostic kit comprises an antibody or an antibody composition useful for detection of the presence or absence of wild type BCMO1, or determining wild type BCMO1 and/or the presence or absence of a variant protein at least partially lacking wild type activity, or determining the level of expression of the BCMO1 protein, together with instructions for use, for example in a method of the invention.
[00254] For antibody-based kits, the kit can include: (1) a first antibody (e.g., attached to a solid support) which binds to a polypeptide corresponding to a marker; and, optionally, (2) a second, different antibody which binds to either the polypeptide or the first antibody and is conjugated to a detectable agent.
[00255] The kit can also include a buffering agent, a preservative, or a protein stabilizing agent. The kit can also include components necessary for detecting the detectable agent (e.g., an enzyme or a substrate). The kit can also contain a control sample or a series of control samples which can be assayed and compared to the test sample contained. Each component of the kit can be enclosed within an individual container and all of the various containers can be within a single package, along with instructions for interpreting the results of the assays performed using the kit.
Sample preparation
[00256] As will be apparent to persons skilled in the art, samples suitable for use in the methods of the present invention may be obtained from tissues or fluids as convenient, and so that the sample contains the moiety or moieties to be tested. For example, where nucleic acid is to be analysed, tissues or fluids containing nucleic acid will be used.
[00257] Conveniently, samples may be taken from milk, tissues including blood, serum, and plasma, cerebrospinal fluid, urine, semen or saliva. Tissue samples may be obtained using standard techniques such as cell scrapings or biopsy techniques. For example, the cell or tissue samples may be obtained by using an ear punch to collect ear tissue from bovine. Similarly, blood sampling is routinely performed, for example for pathogen testing, and methods for taking blood samples are well known in the art. Likewise, methods for storing and processing biological samples are well known in the art. For example, tissue samples
may be frozen until tested if required. In addition, one of skill in the art would realize that some test samples would be more readily analyzed following a fractionation or purification procedure, for example, separation of whole blood into serum or plasma components.
6 Computer-Related Embodiments
[00258] It will also be appreciated that the methods of the invention are amenable to use with, and their results to analysis by, computer systems, software and processes. Computer systems, software and processes to identify and analyse genetic polymorphisms are well known in the art. For example, the results of one or more genetic analyses as described herein may be analysed using a computer system and processed by such a system.
[00259] Both the SNPs and the results of an analysis of the SNPs utilised in the present invention may be "provided" in a variety of mediums to facilitate use thereof. As used in this section, "provided" refers to a manufacture, other than an isolated nucleic acid molecule, that contains SNP information of the present invention. Such a manufacture provides the SNP information in a form that allows a skilled artisan to examine the manufacture using means not directly applicable to examining the SNPs or a subset thereof as they exist in nature or in purified form. The SNP information that may be provided in such a form includes any of the SNP information provided by the present invention such as, for example, polymorphic nucleic acid and/or amino acid sequence information, information about observed SNP alleles, alternative codons, populations, allele frequencies, SNP types, and/or affected proteins, phenotypic effect or association, or any other information provided by the present invention in Tables 2 and 3 and/or the Sequence ID Listing.
[00260] In one application of this embodiment, the SNPs and the results of an analysis of the SNPs utilised in the present invention can be recorded on a computer readable medium. As used herein, "computer readable medium" refers to any medium that can be read and accessed directly by a computer. Such media include, but are not limited to: magnetic storage media, such as floppy discs, hard disc storage medium, and magnetic tape; optical storage media such as CD-ROM; electrical storage media such as RAM and ROM; and hybrids of these categories such as magnetic/optical storage media. A skilled artisan can readily appreciate how any of the presently known computer readable media can be used to create a manufacture comprising computer readable medium having recorded thereon SNP information of the present invention. One such medium is provided with the present application, namely, the present application contains computer readable medium (floppy disc) that has nucleic acid sequences used in analysing the SNPs utilised in the present invention,
together with derived amino acid sequence, provided/recorded thereon in ASCII text format in a Sequence ID Listing.
[00261] As used herein, "recorded" refers to a process for storing information on computer readable medium. A skilled artisan can readily adopt any of the presently known methods for recording information on computer readable medium to generate manufactures comprising the SNP information of the present invention.
[00262] A variety of data storage structures are available to a skilled artisan for creating a computer readable medium having recorded thereon SNP information of the present invention. The choice of the data storage structure will generally be based on the means chosen to access the stored information. In addition, a variety of data processor programs and formats can be used to store the SNP information of the present invention on computer readable medium. For example, sequence information can be represented in a word processing text file, formatted in commercially-available software such as WordPerfect and Microsoft Word, represented in the form of an ASCII file, or stored in a database application, such as DB2, Sybase, Oracle, or the like. A skilled artisan can readily adapt any number of data processor structuring formats (e.g., text file or database) in order to obtain computer readable medium having recorded thereon the SNP information of the present invention.
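By way of illustration only, the database option mentioned above might be sketched as follows. This is a minimal sketch, not the storage format of the invention: SQLite (via Python's standard `sqlite3` module) is substituted here for the named database applications purely for convenience, and the table name, column names and the helper `build_snp_store` are hypothetical. The three records use only the polymorphism names and amino acid changes given in this specification.

```python
import sqlite3

# Polymorphisms named in this specification: (name, first allele, second allele, consequence).
SNP_RECORDS = [
    ("C-1054T", "C", "T", "promoter"),
    ("G15929A", "G", "A", "G278R"),
    ("A18068G", "A", "G", "N341D"),
]


def build_snp_store(records):
    """Create an in-memory database holding one row per SNP (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE snp ("
        "name TEXT PRIMARY KEY, "
        "allele_1 TEXT NOT NULL, "
        "allele_2 TEXT NOT NULL, "
        "consequence TEXT)"
    )
    conn.executemany("INSERT INTO snp VALUES (?, ?, ?, ?)", records)
    conn.commit()
    return conn


def consequence_of(conn, name):
    """Look up the recorded consequence for a SNP, or None if it is not stored."""
    row = conn.execute(
        "SELECT consequence FROM snp WHERE name = ?", (name,)
    ).fetchone()
    return row[0] if row else None


if __name__ == "__main__":
    store = build_snp_store(SNP_RECORDS)
    print(consequence_of(store, "A18068G"))
```

The same records could equally be written to an ASCII text file; the choice would depend, as stated above, on the means chosen to access the stored information.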
[00263] By providing the SNPs and/or the results of an analysis of the SNPs utilised in the present invention in computer readable form, a skilled artisan can routinely access the SNP information for a variety of purposes. Computer software is publicly available which allows a skilled artisan to access sequence information provided in a computer readable medium. Examples of publicly available computer software include BLAST (Altschul et al., J. Mol. Biol. 215:403-410 (1990)) and BLAZE (Brutlag et al., Comp. Chem. 17:203-207 (1993)) search algorithms.
[00264] The present invention further provides systems, particularly computer-based systems, which contain the SNP information described herein. Such systems may be designed to store and/or analyze information on, for example, a number of SNP positions, or information on SNP genotypes from a number of subjects. The SNP information of the present invention represents a valuable information source. The SNP information of the present invention stored/analyzed in a computer-based system may be used for such applications as identifying or selecting subjects, in addition to computer-intensive applications as determining or analyzing SNP allele frequencies in a population, mapping disease genes, genotype-phenotype association studies, grouping SNPs into haplotypes,
correlating SNP haplotypes with response to particular drugs, or for various other bioinformatic, pharmacogenomic, drug development, or selection or identification applications.
[00265] As used herein, "a computer-based system" refers to the hardware, software, and data storage used to analyze the SNP information of the present invention. The minimum hardware of the computer-based systems of the present invention typically comprises a central processing unit (CPU), an input, an output, and data storage. A skilled artisan can readily appreciate that any one of the currently available computer-based systems are suitable for use in the present invention. Such a system can be changed into a system of the present invention by utilizing the SNP information, such as that provided herewith on the floppy disc, or a subset thereof, without any experimentation.
[00266] As stated above, the computer-based systems of the present invention comprise data storage having stored therein SNP information, such as SNPs and/or the results of an analysis of the SNPs utilised in the present invention, and the necessary hardware and software for supporting and implementing one or more programs or algorithms. As used herein, "data storage" refers to memory which can store SNP information of the present invention, or a memory access facility which can access manufactures having recorded thereon the SNP information of the present invention.
[00267] The one or more programs or algorithms are implemented on the computer-based system to identify or analyze the SNP information stored within the data storage. For example, such programs or algorithms can be used to determine which nucleotide is present at a particular SNP position in a target sequence, or to analyse the results of a genetic analysis of the SNPs described herein. As used herein, a "target sequence" can be any DNA sequence containing the SNP position(s) to be analysed, searched or queried.
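As a purely illustrative sketch of such a program, the following shows one way the nucleotide present at a SNP position in a target sequence might be reported. The function names `allele_at` and `call_allele`, the 1-based position convention and the example sequence are assumptions made for this sketch and are not taken from the specification.

```python
from typing import Optional


def allele_at(target_sequence: str, snp_position: int) -> str:
    """Return the nucleotide at a 1-based position in the target sequence."""
    if not 1 <= snp_position <= len(target_sequence):
        raise ValueError("SNP position lies outside the target sequence")
    return target_sequence[snp_position - 1].upper()


def call_allele(target_sequence: str, snp_position: int,
                expected_alleles: str) -> Optional[str]:
    """Return the observed allele if it is one of the expected alleles, else None."""
    base = allele_at(target_sequence, snp_position)
    return base if base in expected_alleles.upper() else None


if __name__ == "__main__":
    # Hypothetical 9-base target sequence with the SNP at position 5.
    print(call_allele("ACGTTGCAA", 5, "CT"))
```

Returning `None` for an unexpected base, rather than raising an error, is a design choice of this sketch; it lets a caller distinguish an unscorable read from an out-of-range position.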
[00268] A variety of structural formats for the input and output can be used to input and output the information in the computer-based systems of the present invention. An exemplary format for an output is a display that depicts the SNP information, such as the presence or absence of specified nucleotides (alleles) at particular SNP positions of interest. Such presentation can provide a rapid, binary scoring system for many SNPs or subjects simultaneously. It will be appreciated that such output may be accessed remotely, for example over a LAN or the internet. Typically, given the nature of SNP information, such remote accessing of such output or of the computer system itself is available only to verified users so that the security of the SNP information and/or the computer system is maintained. Methods
to control access to computer systems and the data residing thereon are well-known in the art, and are amenable to the embodiments of the present invention.
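The binary scoring output described above might, for example, be produced along the following lines. This is a minimal sketch under stated assumptions: the animal identifiers, the genotype calls and the function names `presence_matrix` and `render` are hypothetical, and a missing genotype call is simply scored 0, which a real system would likely report separately.

```python
from typing import Dict, List


def presence_matrix(genotypes: Dict[str, Dict[str, str]],
                    alleles_of_interest: Dict[str, str]) -> Dict[str, Dict[str, int]]:
    """Score 1 if the allele of interest is present in an animal's genotype, else 0."""
    matrix = {}
    for animal, calls in genotypes.items():
        matrix[animal] = {
            snp: int(allele in calls.get(snp, ""))  # missing call is scored 0 here
            for snp, allele in alleles_of_interest.items()
        }
    return matrix


def render(matrix: Dict[str, Dict[str, int]], snps: List[str]) -> str:
    """Format the matrix as tab-separated text, one animal per row."""
    lines = ["animal\t" + "\t".join(snps)]
    for animal, row in matrix.items():
        lines.append(animal + "\t" + "\t".join(str(row[snp]) for snp in snps))
    return "\n".join(lines)


# Hypothetical genotype calls for two animals.
EXAMPLE_GENOTYPES = {
    "cow_001": {"C-1054T": "CT", "A18068G": "AA"},
    "cow_002": {"C-1054T": "CC", "A18068G": "AG"},
}
ALLELES_OF_INTEREST = {"C-1054T": "T", "A18068G": "G"}

if __name__ == "__main__":
    scores = presence_matrix(EXAMPLE_GENOTYPES, ALLELES_OF_INTEREST)
    print(render(scores, list(ALLELES_OF_INTEREST)))
```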
[00269] One exemplary embodiment of a computer-based system comprising SNP information of the present invention that can be used to implement the present invention includes a processor connected to a bus. Also connected to the bus are a main memory (preferably implemented as random access memory, RAM) and a variety of secondary storage devices, such as a hard drive and a removable medium storage device. The removable medium storage device may represent, for example, a floppy disc drive, a CD-ROM drive, a magnetic tape drive, etc. A removable storage medium (such as a floppy disc, a compact disc, a magnetic tape, etc.) containing control logic and/or data recorded therein may be inserted into the removable medium storage device. The computer system includes appropriate software for reading the control logic and/or the data from the removable storage medium once inserted in the removable medium storage device. The SNP information of the present invention may be stored in a well-known manner in the main memory, any of the secondary storage devices, and/or a removable storage medium. Software for accessing and processing the SNP information (such as SNP scoring tools, search tools, comparing tools, etc.) preferably resides in main memory during execution.
[00270] Accordingly, the present invention provides a system for performing one or more of the methods of the invention, said system comprising:
computer processor means for receiving, processing and communicating data;
storage means for storing data including a reference genetic database of the results of genetic analysis of a bovine with respect to one or more milk colour or β-carotene content phenotypes and optionally a reference milk colour or β-carotene content phenotypes database of non-genetic factors for bovine milk colour or β-carotene content phenotypes; and
a computer program embedded within the computer processor which, once data consisting of or including the result of a genetic analysis for which data is included in the reference genetic database is received, processes said data in the context of said reference databases to determine, as an outcome, the genetic merit of the bovine, said outcome being communicable once known, preferably to a user having input said data.
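The system elements listed above may be pictured with the following minimal sketch, which is offered as an illustration under stated assumptions and not as the claimed system. The reference genetic database is reduced to a small in-memory mapping, populated with the mid-lactation means for C-1054T reported in Table 3 of the Example; the names `REFERENCE_GENETIC_DB`, `normalise_genotype` and `genetic_merit`, and the use of a simple rank as the "outcome", are hypothetical.

```python
from typing import Dict, Tuple

# Hypothetical reference genetic database: (polymorphism, genotype) -> mean milk fat
# colour in micrograms beta-carotene per gram milk fat (mid lactation, Table 3).
REFERENCE_GENETIC_DB: Dict[Tuple[str, str], float] = {
    ("C-1054T", "CC"): 7.78,
    ("C-1054T", "CT"): 8.64,
    ("C-1054T", "TT"): 9.37,
}


def normalise_genotype(genotype: str) -> str:
    """Upper-case and sort the two alleles so that 'tc' and 'CT' are treated alike."""
    return "".join(sorted(genotype.strip().upper()))


def genetic_merit(polymorphism: str, genotype: str,
                  reference: Dict[Tuple[str, str], float] = REFERENCE_GENETIC_DB) -> dict:
    """Process a received genotype in the context of the reference database."""
    key = (polymorphism, normalise_genotype(genotype))
    if key not in reference:
        raise KeyError(f"no reference data for {key}")
    expected = reference[key]
    # Rank the genotype among all genotypes stored for the same polymorphism
    # (1 = lowest expected milk fat colour).
    ranked = sorted(value for (name, _), value in reference.items() if name == polymorphism)
    return {
        "polymorphism": polymorphism,
        "genotype": key[1],
        "expected_fat_colour": expected,
        "rank": ranked.index(expected) + 1,
        "of": len(ranked),
    }


if __name__ == "__main__":
    print(genetic_merit("C-1054T", "tc"))
```

Whether a higher or lower rank represents greater merit would depend on the milk colour phenotype being selected for.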
[00271] Preferably, said system is accessible via the internet or by personal computer.
[00272] Preferably, said reference genetic database comprises or includes the results of one or more analyses of one or more genetic loci associated with one or more milk colour or β-carotene content phenotypes, more preferably the one or more genetic loci are one or more polymorphisms in one or more genes associated with one or more milk colour or β-carotene content phenotypes, preferably one or more polymorphisms in one or more genes involved in β-carotene uptake or metabolism.
[00273] Preferably the one or more genes involved in β-carotene uptake is the SCARB1 gene. Preferably the one or more genes involved in β-carotene metabolism is selected from BCO2 or BCMO1.
[00274] In yet a further aspect, the invention provides a computer program suitable for use in a system as defined above comprising a computer usable medium having program code embodied in the medium for causing the computer program to process received data consisting of or including the result of at least one genetic analysis of one or more genetic loci associated with one or more milk colour or β-carotene content phenotypes in the context of both a reference genetic database of the results of said at least one genetic analysis and optionally a reference database of non-genetic factors associated with bovine milk colour or β-carotene content phenotypes.
[00275] Preferably, the one or more genetic loci are one or more polymorphisms in one or more genes associated with one or more milk colour or β-carotene content phenotypes, preferably one or more polymorphisms in one or more genes involved in β-carotene uptake or metabolism.
[00276] Preferably the one or more genes involved in β-carotene uptake is the SCARB1 gene. Preferably the one or more genes involved in β-carotene metabolism is selected from BCO2 or BCMO1.
[00277] It will be appreciated that it is not intended to limit the invention to the above example only, many variations, which may readily occur to a person skilled in the art, being possible without departing from the scope thereof as defined in the accompanying claims.
[00278] This invention may also be said broadly to consist in the parts, elements and features referred to or indicated in the specification of the application, individually or collectively, and any or all combinations of any two or more said parts, elements or features, and where specific integers are mentioned herein which have known equivalents in the art to which this invention relates, such known equivalents are deemed to be incorporated herein as if individually set forth.
[00279] The invention consists in the foregoing and also envisages constructions of which the following gives examples only.
EXAMPLE - Analysis of the genetic basis for milk fat colour
[00280] This example describes the investigation of the genetic basis for observed variations in milk fat colour using the results of a Holstein-Friesian X Jersey cross-bred trial conducted to facilitate the discovery of QTLs, genes and mutations associated with economically important milk traits.
Materials and methods
1. Trial design
[00281] A Holstein-Friesian x Jersey crossbred trial was conducted using an F2 trial design with a half-sibling family structure. Reciprocal crosses of Holstein-Friesian and Jersey animals were carried out to produce six F1 bulls of high genetic merit. 850 F2 female progeny forming the basis of the trial herd were then produced through mating of high genetic merit F1 cows with these F1 bulls. The herd was formed over two seasons; animals in cohort one were born in spring 2000, and entered their first lactation in spring 2002, while animals in cohort two were born in spring 2001 and entered their first lactation in spring 2003. A total of 724 F2 cows entered their second lactations (during which milk fat colour was measured). The animals were farmed under standard New Zealand dairy farming practices using a pasture based management system. All animal work was conducted in accordance with the Ruakura Animal Ethics committee.
2. Milk fat colour measurement
[00282] Cows were milked twice daily; milk volume was recorded at each milking. Milk fat colour was measured at three time points during the second lactation: peak lactation (35 days post-calving), mid lactation (mid November) and late lactation (late February). On each collection day, samples were collected from the a.m. and p.m. milkings and combined to make a single composite sample for each animal. Milk fat colour was measured as previously described (Winkelman et al., 1999). Briefly, nonsaponifiable material (including carotenoids) was extracted from fresh milk samples and the absorbance at 450nm was measured. Fat colour (µg β-carotene/g milk fat) was calculated (Winkelman et al., 1999).
3. Genotyping
[00283] Genomic DNA was prepared from whole blood from a total of 1679 animals within the trial pedigree (846 F2 daughters, six F1 sires, 796 F1 dams, and 13 selected F0 sires). An initial whole genome scan was conducted by genotyping each animal for 285 microsatellite markers, obtained primarily from published marker maps. Subsequently, the
pedigree was genotyped using the Affymetrix Bovine 10K SNP GeneChip. A total of 6634 informative SNP markers were used for QTL analysis.
4. Candidate gene sequencing
[00284] BCMO1 was identified as a candidate gene for the milk fat colour QTL on chromosome 18, based on its mapped location to bovine chromosome 18, within the 95% confidence interval of the QTL. Intron/exon boundaries were determined by homology with the human gene sequence. The promoter was amplified using the primers presented as SEQ ID NOS:4-7 and sequenced in both directions. Exons were also sequenced in both directions. The determined wild-type gene sequence for the BCMO1 gene is shown in SEQ ID NO:1.
5. Statistical analysis
[00285] Data analysis was performed using SAS (version 9.1). Phenotype data for milk fat colour was recorded at three stages of lactation (peak, mid and late lactation). These data were matched with the following covariates: cohort (cohort 1 or cohort 2), sire (sires 1 - 6), milk fat%, milk protein%, lactose%, milk solids%, milk yield, condition score, live weight (average taken for ± seven days around each of the milk fat colour time points at peak, mid and late lactation), somatic cell count (threshold of 200,000 cells during ± seven days around sampling times), free fatty acids (as an indicator of milk fat quality, measured in the same sample as milk fat colour at peak, mid and late lactation), calving week, and estrus week. Animals with missing data points for any of the measurements were excluded and the final datasets included 597, 648 and 632 observations at peak, mid and late lactation, respectively.
[00286] Analyses were conducted using both raw and log-transformed data. ANOVA was conducted for the milk fat colour phenotype at peak, mid and late lactation. The final ANOVA models for each of the lactation stages were produced using a backward elimination process: all the covariates were included in the model at the first stage of the modeling process, and the least significant covariate was removed at each subsequent stage until all the remaining covariates were found to be significant (significance level set at 0.1). Thus, the final models were as follows: peak lactation (sire, cohort, milk protein%, calving week), mid lactation (sire, cohort) and late lactation (sire, cohort, milk protein%, milk solids%, and somatic cell count).
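The backward elimination procedure described above can be sketched generically as follows. This is only an outline of the control flow, not the SAS analysis actually performed: model fitting is delegated to a caller-supplied function (`fit_p_values`, a hypothetical name) that returns a p-value for each covariate in the current model, and the p-values in the usage example are invented for illustration and held fixed between rounds, whereas a real analysis would refit the model after each removal.

```python
from typing import Callable, Dict, List, Sequence


def backward_eliminate(covariates: Sequence[str],
                       fit_p_values: Callable[[List[str]], Dict[str, float]],
                       alpha: float = 0.1) -> List[str]:
    """Drop the least significant covariate until all remaining ones have p <= alpha."""
    remaining = list(covariates)
    while remaining:
        p_values = fit_p_values(remaining)          # refit with the current covariates
        worst = max(remaining, key=lambda name: p_values[name])
        if p_values[worst] <= alpha:                # everything left is significant
            break
        remaining.remove(worst)                     # remove one covariate per stage
    return remaining


# Invented p-values, used only to exercise the loop.
ILLUSTRATIVE_P = {"sire": 0.001, "cohort": 0.02, "milk_yield": 0.6, "live_weight": 0.3}


def illustrative_fit(names: List[str]) -> Dict[str, float]:
    """Stand-in for a real model fit: returns the fixed illustrative p-values."""
    return {name: ILLUSTRATIVE_P[name] for name in names}


if __name__ == "__main__":
    print(backward_eliminate(list(ILLUSTRATIVE_P), illustrative_fit, alpha=0.1))
```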
6. QTL detection
[00287] The data used for QTL analysis were the residuals from each model for both non-transformed and log-transformed data. The raw phenotype data (no covariates or modeling) was also used to detect QTLs. Since the same results were detected with each kind of data,
results presented below used non-transformed, modeled data. QTL analysis was conducted using a line of descent model and a half-sib model. Subsequently, the BCMO1 mutation was included as a covariate into the models for peak, mid and late lactation, and QTL analysis was performed to test for an association between the BCMO1 mutation and QTL significance.
7. Cloning and expression of recombinant BCMO1
[00288] Bovine cDNA was obtained by reverse-transcriptase PCR from mRNA extracted from small intestine tissue of a Friesian cow. The full length BCMO1 coding sequence was obtained by PCR using the Advantage GC polymerase (Clontech, California, USA). The forward primer (TGTGCGGCCGCCATGGAAATAATATTTGG [SEQ ID NO: 12]) contained a NotI site (GCGGCCGC, underlined) and the reverse primer (GATAGTCCTCACGGCCAAAA [SEQ ID NO: 13]) was placed in the 3' UTR. The resultant PCR product was T/A cloned, using the PCR-produced 3'-adenosine overhang, into pGEMT-easy (Promega, Wisconsin, USA), and the inserted DNA from selected clones was sequenced to verify BCMO1 sequence integrity. The BCMO1 gene was subsequently digested out of pGEMT-easy with NotI, and ligated into the pFLAG-CMV-2 vector (Sigma-Aldrich), also linearised with NotI, to create a FLAG-BCMO1 fusion protein.
[00289] Resulting clones were digested with NotI to determine the presence of full-length BCMO1 insert, and with BamHI, to determine the orientation of the insert. The inserts from selected clones were sequenced to verify the full length sequence of the BCMO1 coding sequence, and to verify the reading frame. Site directed mutagenesis was then performed to generate clones containing each of the following amino acid combinations: 278G/341D, 278R/341N and 278R/341D (with the wildtype protein being 278G/341N), using the QuikChange Mutagenesis Kit (Stratagene, La Jolla, California), following manufacturer's instructions. The primers used for the site-directed mutagenesis are presented in Table 1 below.
Table 1: Primer sequences used for site-directed mutagenesis
Primer name                            Primer sequence
G15929A (G278R) - sense strand         cctggccttccacagggaggacaagac [SEQ ID NO: 14]
G15929A (G278R) - anti-sense strand    gtcttgtcctccctgtggaaggccagg [SEQ ID NO: 15]
A18068G (N341D) - sense strand         tctacttggccaacctggacgaggactttaaggag [SEQ ID NO: 16]
A18068G (N341D) - anti-sense strand    ctccttaaagtcctcgtccaggttggccaagtaga [SEQ ID NO: 17]
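As a simple consistency check on Table 1, each anti-sense primer should be the reverse complement of its sense partner. The following minimal sketch performs that check on the four sequences listed above; the function and variable names are illustrative only.

```python
# Translation table mapping each base to its Watson-Crick complement.
COMPLEMENT = str.maketrans("acgtACGT", "tgcaTGCA")


def reverse_complement(sequence: str) -> str:
    """Return the reverse complement of a DNA sequence, preserving letter case."""
    return sequence.translate(COMPLEMENT)[::-1]


# (sense strand, anti-sense strand) as listed in Table 1.
PRIMER_PAIRS = {
    "G15929A (G278R)": ("cctggccttccacagggaggacaagac",
                        "gtcttgtcctccctgtggaaggccagg"),
    "A18068G (N341D)": ("tctacttggccaacctggacgaggactttaaggag",
                        "ctccttaaagtcctcgtccaggttggccaagtaga"),
}


def check_pairs(pairs):
    """Report, per primer pair, whether the anti-sense strand matches the sense strand."""
    return {name: reverse_complement(sense) == antisense
            for name, (sense, antisense) in pairs.items()}


if __name__ == "__main__":
    for name, consistent in check_pairs(PRIMER_PAIRS).items():
        print(name, "consistent" if consistent else "MISMATCH")
```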
[00290] Sequencing verified the site-directed mutagenesis and the plasmid DNA was purified by midi-prep (Qiagen, following manufacturer's instructions). For transfection experiments, CHO-K1 cells (ATCC, Manassas, VA) were seeded into 75 cm2 flasks (1.5 million cells per flask, with 20mL DMEM). After 24 hours, at approximately 50% confluency, the cells were transfected with 10 µg plasmid DNA, with each of the clones separately transfected, using Fugene 6 as a transfection agent as per manufacturer's instructions (Roche Applied Science). One flask was transfected per plasmid, per experiment. Cells were harvested 48 hours following transfection for measurement of BCMO1 activity. Expression of a ~64 kDa FLAG-BCMO1 protein in cells was verified by western blotting, for each of the protein variants, using the FLAG M2 monoclonal antibody (Sigma-Aldrich) at a dilution of 1:1000, and detected using the ECL detection kit (GE Healthcare), according to manufacturer's instructions.
[00291] BCMO1 activity was measured as follows. Cells were harvested 48 hours following transfection using the protocol of During et al., 1996. Briefly, cells were rinsed twice with Solution A (50 mM HEPES, 1.15% KCl, 1 mM EDTA, 0.1 mM DTT) and scraped into two mL Solution A. The cell suspension was centrifuged at 2000 x g, for 10 minutes at 4°C, and the cell pellet was resuspended into 350 µl Solution A. The cells were then disrupted using the FastPrep instrument (MP Biomedicals, Solon, Ohio) and using lysis matrix D (2mL capped tube containing 1.4mm ceramic spheres; QBiogene). The lysate was then centrifuged at 9000 x g, for 30 minutes at 4°C. The resulting supernatant was pipetted off the pellet, snap frozen in liquid nitrogen and stored at -80°C for subsequent enzyme assay. Enzyme activity was determined by measuring the conversion of β-carotene to retinal, as described in During et al., 1996, but using 25 µL cell lysate in a total reaction volume of 50 µL. Relative enzyme activity was calculated as fmol retinal formed per min (fmol retinal/min). Protein assays were conducted using a modified Lowry assay (Bio-Rad protein assay).
Results
1. β-Carotene concentration in milk is influenced by sire and by stage of lactation.
[00292] The variation in β-carotene concentration in milk throughout lactation is shown in Figure 1.
2. Detection of a major QTL for milk fat colour on bovine chromosome 18
[00293] Analysis of the β-carotene data within the half-sib model of QTL analysis showed the presence of a significant QTL on bovine chromosome 18 (Figure 2). The maximum F value for the QTL was 7.8, and the most likely position was estimated at 15 cM. Bootstrap analysis (n = 1000) showed that the 95% confidence interval for the QTL was 5-30 cM (shown as a grey bar in Figure 1). The information content for the markers used in mapping on chromosome 18 ranged between 0.625 and 0.847 and averaged 0.774. There were a total of 190 markers (6 microsatellite markers and 184 single nucleotide polymorphisms) that comprised the chromosome 18 genetic map.
3. Identification of BCMO1 as a candidate gene and detection of a polymorphism
[00294] β-carotene 15,15'-monooxygenase (BCMO1) catalyses the symmetrical cleavage of β-carotene to vitamin A (see von Lintig, J. and Vogt, K., 2004) and is located within the milk β-carotene QTL confidence interval on bovine chromosome 18 (Figure 2). Therefore, this gene was identified as a strong candidate for the chromosome 18 milk fat colour QTL. To determine whether this gene explained the observed variation, the BCMO1 region in the six F1 sires was sequenced to identify any genetic polymorphisms that could potentially alter the function, activity, or expression of this enzyme. Intron/exon boundaries were determined using the UniGene bovine gene prediction Bt.15605 and by sequence comparison with the human BCMO1 gene sequence (UniGene Hs.212172). The predicted structure of the bovine gene is shown in Figure 3. Primers were designed within introns so that complete sequence was obtained from each exon.
[00295] Three polymorphisms in the bovine BCMO1 gene were identified. The first was a 5' C to T substitution at genomic nucleotide position -1054 (relative to the +1 translation start site), and was heterozygous in two of the six F1 sires. Two single nucleotide polymorphisms causing amino acid substitutions were discovered: one in exon 6, G15929A (G278R), and one in exon 7, A18068G (N341D). G15929A was heterozygous in two F1 sires, with the remaining four F1 sires homozygous for the G allele, while A18068G was heterozygous in one F1 sire, with the remaining five sires homozygous for the A allele. To determine whether these polymorphisms were associated with the QTL effect, the remainder of the FJXB trial pedigree was genotyped. The frequency for each BCMO1 genotype is shown in Table 2 below.
Table 2: Genotype frequencies of F2 population
Polymorphism    Genotype    Total    %
C-1054T         CC          623      74.43
C-1054T         CT          205      24.49
C-1054T         TT          9        1.08
G15929A         GG          627      74.64
G15929A         GA          202      24.05
G15929A         AA          11       1.31
A18068G         AA          711      84.64
A18068G         AG          124      14.76
A18068G         GG          5        0.60
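The percentages in Table 2 follow directly from the genotype counts, and allele frequencies can be derived from the same counts. The following sketch shows the arithmetic. It rests on one assumption that should be noted: the count for the A18068G GG genotype is taken as 5, which is the value implied by the reported 0.60% and the other two counts for that polymorphism. The function names are illustrative only.

```python
# Genotype counts from Table 2.
GENOTYPE_COUNTS = {
    "C-1054T": {"CC": 623, "CT": 205, "TT": 9},
    "G15929A": {"GG": 627, "GA": 202, "AA": 11},
    # Assumption: GG count of 5, as implied by the reported 0.60% of 840 animals.
    "A18068G": {"AA": 711, "AG": 124, "GG": 5},
}


def genotype_percentages(counts):
    """Percentage of animals carrying each genotype, rounded to two decimals."""
    total = sum(counts.values())
    return {genotype: round(100 * n / total, 2) for genotype, n in counts.items()}


def allele_frequencies(counts):
    """Allele frequencies obtained by counting two alleles per genotyped animal."""
    allele_counts = {}
    for genotype, n in counts.items():
        for allele in genotype:                      # e.g. "CT" contributes one C and one T
            allele_counts[allele] = allele_counts.get(allele, 0) + n
    total_alleles = sum(allele_counts.values())
    return {allele: c / total_alleles for allele, c in allele_counts.items()}


if __name__ == "__main__":
    for name, counts in GENOTYPE_COUNTS.items():
        print(name, genotype_percentages(counts), allele_frequencies(counts))
```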
4. BCMO1 polymorphisms have an effect on β-carotene concentration in milk
[00296] The association of the BCMO1 polymorphisms with variation in milk fat colour (β-carotene concentration) within offspring of the two heterozygous sires is shown in Table 3 below and in Figure 4.
Table 3: Genotype effect on milk fat colour.
Milk fat colour (µg β-carotene/g milk fat)
Polymorphism    Genotype    Peak Lactation    Mid Lactation    Late Lactation
C-1054T         CC          9.22 (0.27)       7.78 (0.27)      6.56 (1.15)
C-1054T         CT          10.37 (0.20)      8.64 (0.21)      6.72 (0.76)
C-1054T         TT          10.90 (0.18)      9.37 (0.13)      7.45 (0.39)
G15929A         GG          11.34 (0.17)      9.32 (0.13)      7.32 (0.09)
G15929A         GA          9.67 (0.24)       8.15 (0.20)      6.71 (0.16)
G15929A         AA          8.34 (0.60)       7.94 (0.75)      5.73 (0.58)
A18068G         AA          10.84 (0.15)      9.0 (0.12)       7.12 (0.09)
A18068G         AG          11.15 (0.39)      9.11 (0.29)      7.35 (0.19)
A18068G         GG          15.95 (0.9)       11.45 (0.38)     8.70 (0.72)
Data shown are means with standard errors in brackets.
[00297] Inclusion of the BCMO1 C-1054T, or G15929A, or A18068G genotypes as fixed effects in the statistical model reduced the QTL effect for milk β-carotene to below a statistically significant threshold. The QTLs resulting from the adjusted statistical effects are shown in Figures 5, 6 and 7.
[00298] For the BCMO1 C-1054T polymorphism, animals homozygous for the T allele produce milk with greater concentrations of β-carotene than homozygous C animals. For the G15929A polymorphism, animals homozygous for the G allele produce milk with greater concentrations of β-carotene than homozygous A animals. For the A18068G polymorphism, animals homozygous for the G allele produce milk with greater concentrations of β-carotene than homozygous A animals. For each polymorphism, the effects were similar at each of the three lactation time points that were measured.
[00299] For the C-1054T polymorphism, animals homozygous for the C allele produced milk with approximately 15.4%, 17% and 12% less β-carotene than animals homozygous for the T allele, at peak, mid and late lactation, respectively. For the G15929A polymorphism, animals homozygous for the A allele produced milk with approximately 26%, 14.8% and 21.7% less β-carotene than animals homozygous for the G allele, at peak, mid and late lactation, respectively. For the A18068G polymorphism, animals homozygous for the A allele produced milk with approximately 32%, 21% and 9.4% less β-carotene than animals homozygous for the G allele, at peak, mid and late lactation, respectively.
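The percentage differences quoted in paragraph [00299] can be reproduced from the genotype means in Table 3. The Python sketch below is illustrative only, uses hypothetical names, and is limited to the mid lactation column. It assumes that each quoted percentage is the difference between the two homozygous means expressed relative to the higher mean.

```python
# Mid lactation genotype means copied from Table 3
# (lower-mean homozygote first, higher-mean homozygote second).
MID_LACTATION_MEANS = {
    "C-1054T": {"CC": 7.78, "TT": 9.37},
    "G15929A": {"AA": 7.94, "GG": 9.32},
    "A18068G": {"AA": 9.0, "GG": 11.45},
}


def percent_less(lower_mean, higher_mean):
    """Percentage by which lower_mean falls short of higher_mean."""
    return 100 * (higher_mean - lower_mean) / higher_mean


if __name__ == "__main__":
    for name, means in MID_LACTATION_MEANS.items():
        (low_gt, low), (high_gt, high) = means.items()
        print(f"{name}: {low_gt} is {percent_less(low, high):.1f}% below {high_gt}")
```

Under this assumption the three mid lactation comparisons come to about 17.0%, 14.8% and 21.4%, in line with the figures of 17%, 14.8% and 21% given in the text.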
5. Effect of BCMO1 polymorphisms on enzymatic activity of recombinant BCMO1
[00300] The effect of the BCMO1 polymorphisms on the enzymatic activity of recombinant BCMO1 was determined as described above.
[00301] The enzymatic activity of recombinant BCMO1 carrying the 341D and 278G residues, the 341N and 278R residues, or the 341D and 278R residues (compared to wild type, 278G/341N) is shown in Figure 8. Data are representative of two independent experiments, and the protein content was equal for each lysate within an experiment.
[00302] The above data show that the presence of the 341D residue (instead of wild-type 341N) results in markedly decreased enzyme activity. Notably, the 278R version has similar enzymatic activity to wild-type BCMO1, as defined by 341N and 278G. Indeed, this activity accords well with the milk β-carotene content data presented above and in Figure 4. Animals homozygous for the 341D amino acid produce milk that is higher in β-carotene (Figure 4B). Further, the N341D polymorphism explains more variation than the G278R polymorphism when fitted as a fixed effect in the QTL analysis (Figures 5 and 6). The BCMO1 enzyme cleaves β-carotene, resulting in colourless retinol, and consequently reduced enzyme activity would be expected to result in higher milk β-carotene content (increased yellow colour).
Discussion
[00303] The present invention recognises that the BCMO1 polymorphisms described above, together with polymorphisms in linkage disequilibrium with these polymorphisms, are useful as a selection tool to identify and breed animals with higher or lower milk concentrations of β-carotene, and thus milk fat colour. Such a strategy may allow the production of milk products more suitable to particular markets, depending on the preference for white or yellow milk and milk products, or the dietary or health requirements prevalent in a market.
INDUSTRIAL APPLICATION
[00304] The present invention is directed to methods of genotyping bovine to facilitate the selection of animals with altered milk production phenotypes. In particular, such phenotypes include desired milk colour and desired milk composition. It is anticipated that herds of bovine selected for such traits will produce milk fat of more desirable colour (whether more or less yellow), or more desirable composition, and will therefore be of significant economic benefit to farmers.
SEQUENCE ID LISTING
<110> Vialactia Biosciences (NZ) Limited
<120> Marker Assisted Selection of Bovine for Desired Milk Content
<130> 576277 JBM
<150> NZ 564717
<151> 2007-12-24
<150> NZ 566978
<151> 2008-03-26
<160> 17
<170> PatentIn version 3.5
<210> 1
<211> 32605
<212> DNA
<213> Bos taurus
<220>
<221> variant
<222> (135)..(135)
<223> -1054 C/T polymorphism
<220>
<221> misc_feature
<222> (3158)..(3207)
<223> n is a, c, g, or t
<220>
<221> variant
<222> (17117)..(17117)
<223> G15929A (G278R) polymorphism
<220>
<221> variant
<222> (19256)..(19256)
<223> A18068G (N341D) polymorphism
<400> 1
agtgcctctc acttttctca ctccacccag agaggcaaca gctggttctt tcttgaagca
60
acgtaagtaa agtgaggtca gtaactcatc atccaacgta gcaacgtgag tgtggatttc
120
tggtgataac atcacctcga gattcagaaa tgcaggggaa ccttcagtac tattccttac
180
agttcgtgcg tagtttttgg ttcccagtat ttattattta aaagggtaca tgaaacgttt
240
tatgtttact tcttcaaaaa gtgaaatata attaagttgc tctttacata ggattctttt
300
tcttgctggg aggtaggggg tggtgaggag actcagagac ctttagtcaa aataaaacca
360
aacaattttc cctaaatgat cttgccagtt tttttttttt tttttttttc atttttattg
420
tagtatagtt gatttggggc ttccctagtg gctcagtgat aaagaaccca cctgctaatg
480
caggagatgt gggtttgatc cctgggtcag gaggatgccc tggagaaaga aatggcaaca
540
cactccagta tttttgcctg gacagaggag tctggcggcc tacagtccat ggggtcgcaa
600
aagaactgga cacaacacag cgactaaagc aatagcgtag ctgatttact aaagtaagag
660
cataacatag cgttatgtta atttcaggta tacagcaaaa cgaatcagtt atacctatac
720
agtatagaca ctcttttttt aggttctttc ccatataggc cattatagag tattgagtag
780
agtttcctgt gctatacaga agctccttgt tatctatttt atatattagt gtgtatgtat
840
caatcccaat cttccaattt atccttcccc ccatcccacc agtttaaacc catcaagaaa
900
gttgtttgtt agttttggtt cagccaaatg tggaagtgga aagggagagc tggggatgta
960
atcggaggat taaagattag ctgccacaca tggccgggga actgaccttt gaccaaatat
1020
aagtcacgcc agcacagctt ctccctgtgc aggaagagat cccaggccct cgcagtgcca
1080
tctgaaggga gggagatgta aaggaagctg cagggaggga agacaaggag tggccaagag
1140
cagtccctga acacggacga gcatcgctct cgctcagagc cctgcacaat ggaaataata
1200
tttggcagaa ataagaagga gcaactggag cctgtgaggg ccagagtaac aggtgagcat
1260
tttgtataaa ccacgggtac ttacatgttt agatgtacac tttttttttt cctgcttata
1320
aaagtaatgc ttactgtaaa tcatttttgg aaaatctgga aaaccgtaaa tagaggtgac
1380
atgatccata ttctaggcat attccctgtg aaaatctcag tgtgctttta tgcattataa
1440
ctcagctttc aaggggagtc agtggttctc actacttcca ggagaaaagt ccgaggacta
1500
ctgctattgg ctcatagcca gtgggattat cctgcaaact gtgctctgtc cccagggtgc
1560
ggtatttgta aaagtgggtg catcctaaat taatttcaaa atgatacgag aagcgtgtgc
1620
gagggcccgt tcttgtgcat gcgcggtgcg ggagggcacc gcagatcccg atgaaatttc
1680
ctcaaaggtg cggtttcagg gtgcgagcac tgcagagcag atgagacagc tccagggtgg
1740
gagctgccac atgaaccctc cttgcatagc tttctcctcc tcttctgcat ggccactttc
1800
ctgggaccat cttgtttaga ctgtacaggg cagagagtgg tctgcacttc gggtcccagg
1860
agccgtgctg agaacaagga agggggtcag ctcagccaca ccactgtttt ggacagatgg
1920
gcagttttca caaggacgag acagcactgc cctctgagag gcaagtcaca ttggagcttt
1980
acagcagacg aattgccttg gaaggctttt ggaagaaaga aatagcatgt atgtgtattt
2040
tttctgatta cagaggtaat gtgcaatcta tatatgtatg tatattcagt tcagttcagt
2100
cactcagtcg tgtccgattc tttgtacttt atatccagag ggatataaaa atcagctgta
2160
tttcttctgt ctcttaagga ctgttaatat ttgagtctat tttctttaag tttttaaaag
2220
aaacatcaat atagatagta tatgcatgcg tcctccgtcg cttcagtcat gtctgaccct
2280
ttgcaaccat atggactgta gcccaccagg ctcctctgtc aatgtgattc cccaggcaag
2340
aatactggag tgggttgccc tgccctcctc caggggatct tcccaaccca gggattgaac
2400
ccgcgtctcc tgcattgcag gtggattctt taccactgag ccacccggga aacccataag
2460
attaatctga tgataggtag actttatata caaatacttt ttcaaaaaac aaatgtacac
2520
ttactattaa ctttttattt ttaaaaaaag tggatctgtg atacccatca ttgtatgctc
2580
tggggtacgg tgggcatttt gccattatac tcccttgcag ctcatgttta ttggctgcat
2640
gtcaatactt ggaaatttta gagaacaaga atcccgggac ctgagatggt gaatgaacct
2700
ctagatctaa gtccgttgtt ttcacaggat ggtccccagc cctgcagcat gagcactccc
2760
caggctcttg ttagaatctc aaattcctgg acccacttca gaccctctga atcagacacc
2820
caggggtgga gcccagaaat gtgtgttttc agagccctct ggctgcgtgc aggtggggga
2880
ccactgctgt aaactgggcc cccttccttg acaggtgtga agctgcgggc ctgggagaag
2940
catcgacttg cccgaggcca ggcggccagc agaggacagg ctccagccca gtgccagcct
3000
ccctgcaggg tcacgtctaa tccaccagtc agggtctgct ggcaggcagc tcactacaca
3060
cggtgctgtg tggccgcatc tgtccaggga agccaggcca aggctctggc ctcacatccg
3120
gcagggcttt cgtgactctg attcttcctt tacaagannn nnnnnnnnnn nnnnnnnnnn
3180
nnnnnnnnnn nnnnnnnnnn nnnnnnngca attgccacat tattagaaac caaatgtgaa
3240
tgcatcttgc tgaatggaat gaagccaata tctcctatgt cctctaatct ctctctcctt
3300
cctaatgaat ttgaattcct tagggtgtga aaagggggga gaagagtaga gagagatgtg
3360
gtcatgtttt cactgtccct gagagggacc aaaagtacta ggaagtcacc catttgagta
3420
tccaggaaac ttcagaaaaa ggaaggactc aagcttttta ttttaaatga ccaaccaaat
3480
tcagtcttgt ggaggtgggg ggaagtcaga tttaacagag gaagttagtt aaataagttt
3540
cacaatgtgg aagggaagct aacatgacct tgtgggtggt gccctgtgag ccaactcaaa
3600
ttatagtttt agccacttct gccaggtcgc gctagtgata aagaacctgc ctgccaatgc
3660
aggagacaaa agagatgcag gtttgatccc tgagtgggga agatctcttg gtggagaggg
3720
tggtaaccca ctccagtatt cttgcctgca gaatcccatg ggcagaggag cctggtggcc
3780
tatagtccac agggtcgcaa gagttgaaca cgactgaagc ggcttagcac gcatgctgat
3840
tgcataatgc ttttgtttta cttgcttcct ctctgatgta ctaaattata gagaggattc
3900
tttttgatcc ggaagaattc atgtataaaa gtgctaatgg gataaaagga tagctccaat
3960
tttttttccc aattcagtct tgggcagtcc aatatacctt gtggctcagc tggtaaagaa
4020
tccgcccgca atgcgggaga cctgggtttg atccctgggt ggggaagatc ccctggggaa
4080
gtgcatggca acccactcca gtattcttgc ctggagaatc cccatggaca gaggagcctg
4140
gtgggccata gttcatgggg ttgcaaagag atgggcatga ctgagcgact aagcacagca
4200
caaatgttct ttttacatct agggctttac ttttaagctc agattcatct ttccctgctc
4260
tgcctcaacc tgctccttct cctgtgtccc aggtcactgt gaatagtcac acccagggca
4320
ggagactcgg ctggcacttt ttcagtgttc cccttccatt tcccctacaa ttagtcaaac
4380
tctaagtccc tttgttttat cagtttctct ctgtcccact gctactatct tcttcctctc
4440
tcaggtcttc ccacctctct ctagcttctc ctccaaccca gagtccacac catagccaga
4500
aataactttt ggtaccccaa atctgatcac attccattag tatgagaaga ggtaggtcct
4560
tagagacaga acgtggatta ggggttgcta ggagctgggt ggcatatgca gactaactgc
4620
taatgagcac agggcttctt ttgggggtga tggaaatgct cctgggttgg tggtatggtt
4680
gcacaacatt gtaaatatac taaacacaca cacacacaca ccctgaattg tatactttga
4740
aatgatggat ttatgttatg tgaattatat ctcaattttt aaaattgcaa elelC[d.3.2.cld.d.d
4800
ttaatcctat tattccctgt cacacccatc aaaagtgtct ctttcttcta agcaccaaat
4860
caaagctaga tttggtctta ctgatcttgc cctctactta gccttgcagc ctcaatactt
4920
gctataccct cacttgaaag ttacaggcca gctcttacta aacagcttgc atacttcttg
4980
gtctttgaac cagtctctgg tctctgggtt caggtttttc cagagataag aattcaggag
5040
taagtcattt atttgggatg tgcctcccag ataatcacga tgatgtgatc actgacctag
5100
agccagacat cctggaatgt gaagtcaagt gggccttaga aagcatcact acaaacaaag
5160
ctagtggagg tgatggaatt ccagttgagc tgttcccaat cctgaaagat gatgctgtga
5220
aagtgctgca ttcaatatgc cagcaaattt ggaaaactca gcagtggcca caggactgga
5280
aaaggtcagt tttcattcca atcccaaaga aaggcaatgc caaagaatgc tcaaactacc
5340
gcacaattgc actcatctca cacactagta aagtaatgct caaaattctc caagccaggc
5400
ttcagcaata tgtgaactgt gaacttcctg atgttcaagc tggttttaga aaaggcagag
5460
gaaccagaga tcaaactgcc aacatccact ggatcatgga aaaagcaaga gagttccaga
5520
aaaacatcta tttctgcttt attgactatg ccaaagcctt tgactgtgtg gatcacaata
5580
aactgtggaa aattctgaaa gagatgggaa taccagatca cctgatttgc ctcttgagaa
5640
atttgtatgc aggtcaggaa gcaacagtta gaactggaca tggaacaaca gactggttcc
5700
aaataggaaa aggagttcgt caaggctgta tattgtcacc ctgcttattt aacttatatg
5760
cagagtacat catgagaaac gctggactgg aagaagcaca agctggaatc aagattgctg
5820
ggagaaatat caataacctc agatacgcag atgacaccac ccttagggca gaaagtgaag
5880
aggaactaaa aagcctcttg atgaaagtga aagaggagag tgaaaaagtt ggcttaaagc
5940
tcaacattca gaaaaccaag atcatggcat ccggtcccat cacttcatgg gaaatagatg
6000
ggaaacagtg gaaaccgtgt cagactttat ttttctgggc tccaaaatca ctacagatgg
6060
tgactgcagc catgaaatta aaagacgctt actccttgga aggaaagtta tgaccaacct
6120
agatagcata ttcaaaagca gagacattac tttgccaaca aaggtttgtc tagtcaaggc
6180
tatggtcttt cctgtggtca tgtatggatg tgagagttgg actgtgaaga aggctgaggg
6240
ccgaagaatt gatgcttttg aactatggtg ttggagaaga ctcttgcaag tcccttggac
6300
tgcaaggaga ttcaaccagt ccattctgaa ggagatcagc cctgggattt ctttggaagg
6360
aatgatgcta aagctgaaac tccagtactt tggccacctc acgcaaagag ttgactcata
6420
ggaaaagact ctgatgctgg gagggattgg gggcaagagg agaaggggac gacagaggat
6480
gagatggctg catggcatca ctgactcgat ggacgtgagt ctgagtgaac tccaggagtt
6540
ggtgatggac agggaggcct ggcgtgctgc gattcatggg gtcgcaaaag agtcagacac
6600
gactgagcaa ctgatctgat ctgatcccaa ggaacactgg gcgaaggagg cgttgtcaag
6660
ccattgacca ccaggagtta ttggagctgc atcctccggg caaagcataa accacgcctc
6720
acattgtgtc caccgagggc aaggaagtgg tggccttgga cccacagcac cccatctgtc
6780
gtcgggtgag agcagcttcc aggggcatta actctccaac gtggttctgg cttgtccatg
6840
tgtgtgcaga gctgctccac agtgagaaca aagccctcag gtgaagagct ttaggtattt
6900
gcagaaagca gccttcaggg agagatgaaa gccaagggga cgtgggcagg gcgccgacag
6960
cacctgcaac actgtcccaa cagccgtgga atgcctgcct cacgtataat ttaaactttt
7020
ctaatagccc cgtaccaaaa agaacaaaga aacagatggc attaatttta atagtgtttt
7080
tcctttaatg ccacatatcc aaaatgttat cagttcaaca tatagctaat gttcttaaat
7140
tgctaatgag atacatttct ttttcatcct aagtctctgg aatctggtgt gcatttcaca
7200
attactgtac atctcagttg ggactagcca agtttccagt acccagttcc cctacatggg
7260
acagcacagc tctagaatgt tctttcctca tgcctttgca aggcaaacac caactctccc
7320
tcaaaactca accaggtcac ctcctccagg aagtgttcct caccctcctg ctctcaaccc
7380
accttggagc ggagttctgc tccacccagc acccttgctg agtctgtgga actgccttta
7440
tcacagtgta ttttgggtat tttccttagt ctttgtctct taatggatga ggagttcctt
7500
ggaagcagtg tgcatgggtg ctcagttggg tccaactctt tgtgtctcca tggactgtaa
7560
cctgccaggc tcctctgtcc atgggattcc aggcaagaat accggagtgg gttgccatgc
7620
cctcccccag gggtcttccc aacccatgga tggaacccgt gtcttacgtg tcctgcactg
7680
acaggcaggt tctttcccac tagtgccacc tgggaagccc cacttgaaca taggcatgct
7740
7800
7860
7920
7980
8040
8100
8160
8220
8280
8340
8400
8460
8520
8580
8640
8700
8760
8820
8880
8940
9000
9060
9120
9180
9240
9300
9360
9420
9480
9540
9600
9660
9720
9780
9840
9900
9960
10020
10080
10140
gcctgttcat gaacagattc ggtgcagagc aggggatcct actggttcga catggtttgc caattctatt gcacggtaac ggctgccatt ggctcctgca tctagtgagt tctgagatgg ctcctctaga taagtacttc gcaagtaaca ccccaggatg atagagcatc cgtggtcttc ccaaaccggg ccttagcaga aggccaagtg caaaattaaa gatagatttg tgaattccaa ccgtgttagt ctcttttttt catggacaga aactgagtga gttgagtaga agtagcacat gccatccttg gtgtcttggg tactacagca accccaccaa agtcacagcc gctgctctca gggctgtgct acctcgccct aagggctttg aggaaaacag ctcggtaccc agtgagtggt tggaaactaa gctccgcaat tggcctggcc cgtgggtgct tctcttctgg ccaccaggct tcctcttccg ttgcaggtgc ctcagactaa gcagtcaggt gctctcaatg tatacacaat taaagccagt ctcatttaag tgtgttccca aaaaccaggc aagacaagaa ggccaccaga gagatgggga attttttata attgttttcc gagtcaaagt ttcaggtgtg tttggagaag ggagcctgct ctgaacaact gttcccgttc gtaggtcaat tgacagtgga ccccactttt aggaatgtga actctagagg ctgaagacct agcacctaac ccccaacccc ctttcctagc aaaagttgtg actctccctg tcagcgccca agaggggctg acattctttc gggcctggga ttgctccaca gtgacaaatg tttagtagct cctctgtcca ggggtcttcc agcctttacc aactttactc tgccagtgtt cctcccctgc ttaagagggc acttctcaaa ctgtgaatcc gatgaccccg ggttctagtg ctggagtaag agataatgct agagtataca aagcagattt aataagaaac cacagagaga cagcaaagtg gaaatggtaa ggcatgctgt ctttttttag tatacaggaa ccctgggcga tatgtgatgg tggcatatta aaagtcaaca tttcaaatta ggtggggtcc tggtgcttca acgtcttgac cccacctccc gcaagggaat cgagctccaa
cacaatgcca tgtggacact tctcatcagg tgcacacggt gcttcaccat accacagatg aaatcatgtc ggggattttc tgacccaggg actgagccac tcagaggtaa ggtgcagcac gtctcacagt acacataagg agatggtctg tgagaccctg agcacactgg cacagttggc aacagcaaag ttcaaacact ggagcgcttt gaagatgagt caagccgtca ttttttttaa aatcagttat cccgctccag agtccatggg attcttttcc gtccttatta gaattccctg gcatttctct taattgcacc taatgtcatg cccaccaaaa ccacgggtcc gcctctgtgg ttacagactc cgattcgtgg cagagatcag gaggaacccg ggcacattct tctttgagct caagattcca gggcgagacc cagagatggt gagtatctta tgactctttt caggcaagaa atccaacctg ctggaaagca gcagctgttt agcctggcct ggaaatgtaa caatcagggc tgggtttctg cccagggctc tggtgtgggc atgaggaacg acagaaaaat aacaaagatt tttcaagtct caacccaggt tcccatgagg attatagttg acacatacat tactcttgcc gtcgaaaaag catataggct gttatctact agatgcagcc gatgggaacg aaccatggaa tgctaattaa tgaaggcccc cccagcctcc actcagccct tggggagact ttcagttcca agtgacacag tcctgctgac ttctggaagt ttttggttta gcctggctgc agatacaacc aagagcaccg aagcaacaca gtgatcccat tactggagtg catctcctgt cctaaaagtt cctgttcttt ggtctctcct acaaaaacat cctccgattt catcagaatc aaagggaagg ctccggcctc gaaatcagtg gactcataac tcaatcataa aaaattcttt cccccttagc atcgaacgtg atttacaatg gtatctactc aggagaatcc agtcggacac ataacagagt ttatacatat tctccttgaa agttgtcagt gcataggtta tggcaaaaaa aggcctaagt ccccattcct cacaggaagt gactgggaac cttatccttc ccacaaacca actttgattt
ttgccctgaa aaacccactt cagacctgtc tttcaaaaca gataacaaat ttgtgtttcg
10200
tgaagccact acatttatga taatttgtta cagcagccac aagaaattaa tacaggtcta
10260
gaatcatcta gtatattttc acgttggtct tggcagactg gtcaagggtc catgcttttc
10320
tggagtgagt gagattctct ccagtgtcat caacacatgt actgaccctc tcctgtgtgc
10380
agaatgggta taagagtgct ccctggttgc agagtacggc agaagccaac acaatattgt
10440
aaaacaaata tcttccaatt aaaaattttt taaagatttg aaaaataatt ttagtgagag
10500
caaaaagtct tctgtaccat taataattaa aaatgaaacc atgtttattt caattaaaaa
10560
aatgttactc caggggctaa attgttaatg taaggctccc agaactgaac tgggtccttg
10620
ggaaacccag cctgagtatg gggccttctg gatcagtcct ctaaacctgg accttacctg
10680
gagggactcc accgtggtgg aaggcatggt cagtggcctg ggccagcagg tctctgggtg
10740
atgctggaca aggcgatgga acatttctga gcctcagttc cgccatacgt agatggagct
10800
gggactagga atacttacct tgccgtattg ttgtgagtca accttgccaa aatgtatgtg
10860
agagttcaca ccagcgctta ttgtatccag cacacacatt taagagatgg tctcagtaag
10920
ccacccccac ccccaccccg accccttcac atccctggaa tggctcaaat aagttcagat
10980
aagtttcaaa taagcctgtg atgattctgc tgcttctctt tcaggtgaag tctactacag
11040
gagcaaatac ctgagaagtg atacctacac tgccaacatc gaagcaaaca ggattgtggt
11100
gtcagaattc ggaacaatgg cctatccaga cccctgcaaa aacatatttt ccaagtaact
11160
gtccgttttg tatctagtcc cgctttccaa tatccttgac ggtcgagaca accagcgttt
11220
gctcctctga cggtgtgttt agggtggatg gaaaacttcc catgatggcc tttgaccatg
11280
caggctacgt acttctggat tacagatagc aggcaaggca aagagcatct ggcagattgt
11340
cagatgttct cccacatggc tggtgggaga gcactttgaa gtcacctatc aaaattacaa
11400
acacacattc ccttcagtcc agagaggtac ttgcctgttt ctgaaatgac aaatgtacaa
11460
ggttattcgt ggctgaatga tttgttatga tgaggaactg aaaataactc aggtgatcat
11520
taacagaaga gtgttaaata aatcacagta acactataca ggaaaacacc aggcagatat
11580
aaaaagcact gaggatgctc tctatgtact gaatggaaaa ttccccactg ctaagtgggg
11640
aaaaaggaaa aaaaagctaa gtgtagaaag agcatataat atgctgtaat ttgtgctata
11700
aggcagggga tatctaggtt tgtttttgaa tgtgcagaaa gcctgggaga aggcatgggc
11760
gaggtaggga accaacggga ggaaggacgg gggagatttt cagttgtaca catttaaaac
11820
catctgggtt gtgatgcata tgaatgcctt acgcgttctg aattgttggc cagggaatca
11880
ggccagtggt catcatctcc tttctgatag gctttagaca tagaaggggt acattagggg
11940
tagactggcc tggagaggga agctggtttc tgcttctcca gaactgacat catctctcca
12000
atcccagctt ccagcgctgg gggtcccgca aatagcaggc tgggaacagc cattgcagac
12060
gggggaggag ccagggggca gcgggaggga aggaattgaa accgcagctt gtgcggcatc
12120
cttgctgtta cagtgattgc cagtaggggg agcactcacc agtactcggg gccaagactt
12180
cctgctttga gtgtgccttc taataggatg atcagttctt aactcagttt actttccttt
12240
ttcaaagaaa cgaatttctt ggacaaagtt gaggtaacga gggtggattg tcattcctgc
12300
tcaaagcaga gatcaactgc ttttctagtt tcgggggagc cattatgggt atgtgtatga
12360
tttctataga gcttcttaaa atacttgaga actgtccacc agctcttgat ggctctggtg
12420
ggggagacct ttggatctgg ggagctgggg agggctcagt gggcagcctt gctgtcccag
12480
ctgactccct tcttagctga ttggttgtgg tggtgctttg ggttaactcg atctctggac
12540
tgtcagattc agtgttttaa cacagtggtt ctatccccga acgcgaccac agaaggtatc cagggaggcc ggcatcttag acaaaggaac aggccggaag ggaaacttct tccaatctct ctatttccaa tccagttcga tttgaatgaa ggcattaggc agaagcgagt ctgttcaaag tcttagccat agcactttcc tttggctgaa cccattgctc gtgtttccct agaggcctgg cagggccggg cggtgaggca cagagaacac aggttacatc aggttgggag ggcgggactt ttttcgtaaa aaatgttctc taagatccct aaggtggctc ttgcccacat aggatgctac ttcgattcct cactgatgtt tctgacaaac aggctccccg ctcaaatgac gtggtaggcg gacctgctga tttcacagac agagaccagt agtgagtcta aggaggctgc cactctgccc acaaactagg tctgaaatca ttgcctcttc tgtaaagata aatatagaca cccctgacag tgacagacag agagagcatg ctacgttttc ccttgatgtc gaaatcatgt acaaagctgc ggttgtgccg tatgattgtg ttgaccaaat ggatttatgc attagcaccc tcgggtttcc acgtagctca aggtcgcaag acatggaaaa tgccctctcc tatgtggctg aatgttggca gccccagtcc tttgggaaga ccatgactca gtttcctttt gttcattatt gtcatccatc ctccatctgt ggttgttcag cagttgtgac aactaaagcc cttgctttcc aactgtctga tacatcagga caagcagcct aggcactcag tgccatctct tggcttaggt ggtgctggca ctgcttctgg caaatcaccc cgagcggtta tgtcgacagt tgctgggctg ggtgcaaaaa acaaagatca tttgcaatac ccagctcctc agagaggaat ttctcaggga gcctgtttgg gctttttgta ctctgtggga aggctgcctc acaccgaggc ccaagccccc aggaagcgtc atagccatca cctttttaaa taaatctggc cgtccatcgt caggtgggtc gttctggttc gaggaccatg ggaagaagtg tcattaattt catcatccat cagggctgtg ggagctcagg
ccagcacttt atctggagta tccccagagc tcaacatcat ggatcaaccc cagtgcctga agaggagccc gtgtctgaat caacagaaat ggccctgttc tggtcgcccc accctattcc ggacctattt cctcggcact tccaggaccc tcagcgccca gtagacccac atcttaacga tgaagctagg cactaggtga ggaagggcgg agtcgggatc acctctccgt tacaaaaggg tctgcccacg cacccggtgg tgcagcctgt tcctttaaag cctctctgag agcagctgac aacttcacat ggacaagggg atcctggggc tggttgatgg gactcgcttt aatccctttg atctgttaat ccttcctttc ctgctttttc cctgctggca ccctccctta gaaccttcca tttctcctac gaggtgtgga ccagaccctg cttaaggaat cctctgagcc tcgttcctag gtgttccccc tctctgaagc ttggtttaca agaatgacct caacttgcct gagtggcccc ccagcatccc cgtggatacc tttcagtttt tctgttcaag cagcataggt gaaggaggta acagctggtt gttactgagg tcccagaata gaggatttaa atgttacttg ggccagagag gtgctgtggg gcaatgtctg cctggagtca tgtcgtgttc cctcactacg aagacaaaat tgggatgagg cacaacactc tctcctttag ctctttcaac catgtattta attcattcat tcatccaagc tccactgcag gttgaagtag gctgagtgga ctgtcccaca gaagacttct gaaaccctgg ggaccctctg cctcgtctcg cactcccgtg acagttctaa ttctagggaa gatgcctcgt cattttaacc ttttggaagg tcagtggaaa tggagaggaa tttcctgggc ctccagctgg ggagtgcttt ccaactgagg gcgcccgact ccctggccac tcgttagtca accctaggag atgacttcac gagctctctg ggacaaccgt agcccccacg ccttattttc ggcagctctt tgcaggttga acgctgctgg acgtgatctt cgtgaacacc aacctgatac gagaaaaggg ctccttttct tcaatctatc tttttagttt cccaaaagcc attccgccct
12600 12660 12720 12780 12840 12900 12960 13020 13080 13140 13200 13260 13320 13380 13440 13500 13560 13620 13680 13740 13800 13860 13920 13980 14040 14100 14160 14220 14280 14340 14400 14460 14520 14580 14640 14700 14760 14820 14880 14940
gttccctgag ccttctagac attcttcact caggcccggg agcctggcat ggcgcccagg
15000
ctgattgtga ttgcgtgttc ctgcggggga gaagggtggg actcctagcc cttatttggg
15060
caggaccctc tcgcaaccac acccaggaga ccaggcctcc gagggtggct gtgcgggctc
15120
cggggatggt gcaggacttc gagggtgctc agctagacct tctggtttag gaagacaggt
15180
ttccatagga gattgcccct ggcccctccc ttaccattct gtagagcacc cccgtccagc
15240
aaattccagg ctccactggc agacgggaat ccctgggtcc catggtgcag tgagtgctct
15300
tgttaaatgg aacccccgtt aattaagcta ataaagggat gcagggtccg gctcaggatc
15360
ctagaatgtc ccacctgagg gtaggctagc catttgagtt tgccacctca tgttcctgag
15420
gaccgggagc cccaggggat cgggtgacgt ccccacggct cacagcaggt accctccgca
15480
cctgctccag gctggcgctt caccaggatt cacacctgcc tcccgcctcc caggccaggg
15540
tgcctttcct tcccgtttgc tttgtgttcc atttgtttag tgaagaaaag acctacctac
15600
cccacaaccc gacagcctgc ctcccaccag ccagctgttt acagggtgca caggtggcca
15660
caggaatggc agggacgctt gcatgtaact cattataacc cagccgcagg cctgggtggg
15720
gaggcctttt ggccagcagc tgactcatcc ttgcagagga ataaggaaac caagcagtcc
15780
caagcacctc tgtgctgggt gccgggtgag gcgcctggcc tcctgcagta tttccttcag
15840
ttcccccctt ccctccaagg gttgtccctc caaagaacag aggaggtggc caggctgagg
15900
ggactgtgtc ctttggctta accctgtgcc tctggctgcc cegccaagcc tcggtttccc
15960
atctgtaaaa tgggaatcat acagccgacc ctgactgcca cagtctgtca catcaacaca
16020
ggcacttggg ctcactccta ggtccttcag aacaggaaat gccaagagaa attcacagaa
16080
accctgtgaa atgcatgtca gatgatggac ctcagcagcg tgtcaaagct tcatcttggg
16140
aagctagccc acggcccggg cgtggctgtg ctcagtcgct cagtcgtgtc cgactgtttg
16200
cgaccccatg gactgtagcc caccaggctc ctctgtccat ggggattctc caggcaggaa
16260
cactggagtg ggttgccatg cctcactcca gggcatggcc tgggagactg acggaaatac
16320
agagccgtgg gccctgcccc acaggctgag tcagcctcca cactttgtca tgacccccag
16380
acaatctgcg tgcacgttca atgcttcagg agcattgctc cagagccggg gttgcaaatg
16440
ttttctgtaa agggccccag agtaactatt tgagacattg ccggctgtgt ggtctgcgcc
16500
acgtctgctg gactctgaat cctgaaggca aagcagctgt ggactctgca ggaggcggca
16560
gcgtggccct cgctccagat aaacctttat tttggaagca ggcggaggcc agcttagccc
16620
caggccagag ttactgacac ctggccctaa ggcccagaga cccaggggcc ttgagcctgg
16680
ggaggcagcg gtttcagatg ctaaaccagg cccctgggtc gggtgggggg aggtccagtc
16740
cctgcctcag ccaggcagca gcccttcacc cggtttgctg tcgactccca aagattcttc
16800
tcagcggggc gggctgaaag gctgggctgt aggatgcggt ggccctgtgg tgtccttggg
16860
aggcagtcag gtcccagctg ctcctgaggc tggcttctct ccagggggca ggaaggaggg
16920
ccggagcccc ctgaaggaca cggaggtctt ctgctccatc gccgcccact ccctcctctc
16980
cccgagctat taccacagct tcggagtcag cgagaactac atcattttcc tcgagcagcc
17040
tttcaagttg gacatcctca agatggccac ggcttacatt cggggtgtga gctgggcttc
17100
ctgcctggcc ttccacgggg aggacaaggt aaggcctagc caggcaaccc cccaccccgc
17160
agccccagga cagccagagc cagccggagg tcagcaggca ggggcgggcg gctcagcggg
17220
ggaggaagcg agctcccagg cagtgtcccc agaggcacca gctagccttc cagagacttc
17280
tgaggctggg tgggccttca caggggtccc gagttagggc aaggggctca gcatcatctg
17340
17400
17460
17520
17580
17640
17700
17760
17820
17880
17940
18000
18060
18120
18180
18240
18300
18360
18420
18480
18540
18600
18660
18720
18780
18840
18900
18960
19020
19080
19140
19200
19260
19320
19380
19440
19500
19560
19620
19680
19740
ttcaaacagc cctgccagag cacgagcatc tgcattctca ccttctgctg ggaaggccct tgcacacacg tccccagctg caccaaggtc acctcatttt acccttggcc ctagttcccc cagggggccg ggcctccagg gcccccgagc gctgacccct tcccactgtc cttgattttg ctccctcctt tcaccctcgg cgcctatgac ctggtcactc gccggggctt ctgaggaagc tgccgcccac tccctgaaag tccccatggc aggggttgct tgctgcagcg tcacatccac ccccatggtg tgtcatcacc ggactttaag cctccacgtg gctttggttc gaaaatgctt tacagattaa cctcccaagt atattagtat gctgttaaaa acacgagtgt ggccccctcc cttgggcatc gggaggagag gttccccgtc catcatgccc ttcctgcctc gctccctccc cccctagctg cccagtcgtg tcacccccac tgaaaatatg gcggaagtgg ctgcagggat cctggcaggg ggattgggtt ccttctcctc gggaaagaga tcagaggagg attaagcaac acccagaggt agcactgcag tggtgggagg tgagagggct ctggatctca gtctggtcct ttctggtgag tcagggtccc tggggctggc atcatcgacc gtatttcacc tacgaggacg gagaactcca gacaaggtaa tgatttcatt tcattatttt gtgaaaagga ctctctctct ataatgatat agtatgtaca agaaagcagc gtcgtccatt cgtacagagg gaagcaaagc ttccccagaa tcagcccagc agggcctctg cttgtgaatc tcacagaccc ctcagccatc tggactataa tcagtgtttg gtttgccctg cccccaggga ctttgaatgc tatgggaaag tgcaacactc gaaagagacg atggttaggc tagggctgta gtgtacggag ccctccaggg tccaggacgg ggtcctcccc gttgaatgaa ccacgtgagg gtttgccctc gagcaggtga agagggagcg gaaggacgcg acgtcaatgc gcagcctcta ggctcacctc cggcttgagt gacattgaaa tccaaatatt aaaaaatcac ccgtagatag atatgcatat gtatgatttt
tgccaagccc cagcaccgct tgtctttgtg agctccccga tcggacccac ctctggctcc cactttctgc caggtctggc ctccccacta atctttcaca actgcatgaa tggcacgaat ggggaccacc tctctggccc atttgtagag aagtgcctgt agccctcatc atgagccaga cagagctgcc gatacggcta gcctactggg cagcctggtt ggtttcaggg aacttggcca ggtcggggag ccgagacccc tggagtggga accgttggct ctgagccccg gaagcccgtg ctacgaggag ccagctcttc catgcccacc tcttggggag ttcaaaatga gaacatattc cataagccga acaggcaaat aacatatgca cttttetgaa cctacttctc ctgctcccct gccaccccct ggtccctccc aggcctctct ctcctacacc tccctccact tctaaggccc ccaggctttc tgttgtacat agtgtggggg gagctgggca tggagcatca cagagagtcc ttttctgagc cacttcaccc acccccactt gaaggtggct aaaagaaaga attgccaccc ctgatatcag gctgggacaa gctctggcca ttgccaggga acgctgcact atgcggcaaa cagagaggct ggggcttgtc ggctgcatct ccgaccaagt gacggctgcc tacttggcca ctcaagaggt gtgagcccac ggtgttctta agggtagaaa ttgcaaatat agacaggaag tattgtatat actaatatat cccatccagc tcctgtgcac ctttaatgcc ctgctcagaa cactggcctg tcagacctgc ggaacattct tcacctccat tccacctccc ttggtcactg gtttatctct agggggccat gggccccatt cacagtgaga agtccctgca tgtgttcatc aacttcagag ggagcctcct aaacacccct agagcgatac cttgactaga tctgcagtgg tgggtgggac gctggccact ttcagggccc gcttccctcc cagaaagggg ccccacccag cctttcagac atcacacgga tcctgttcga acctgaacga tcgtgcttcc tcgggaggtg gagaaaggca ttttagaaaa atttgtattg atatacatac atacatcaaa agcatataaa
tatgtatttc ttaattttag agaaaaattg cgctgtgaat aatttttaac agccttaatg
19800
aaatataact cacgtaccat actattcacc cattttcact catttaaagt gtacagctca
19860
atggctttta gtatattcac agagttgcac tacaatcaat tttacaatat tttcatttcc
19920
cccagaagag cctcccccat tccttccacc cccaggtaac cactgcccta ctttctatct
19980
ctgtggattt gcctattttg tttcatataa atggaatcat accgtaagca gtctttgtgg
20040
ctggctcctt tcacttagca tgttttcaaa gttcgtcagc accgtggcac gtatccgtgc
20100
ctcttttttt atgactgaat atgtacacac tgttttttgt actgcttttc cgtcctcctc
20160
atatatctaa acatctttcc aagacattaa acattcttct gtgacatcat tttaaatatt
20220
tgcctgacaa tctgcggtgg gaattcatct ccagttgctt acccatttcc ctgatgttga
20280
acagtcctgt tcttgcattc aaaatgacca ccaggccaga cattaccgtc aattagggtg
20340
ctgctgccta ggctctgggc ttggttttaa aggctacaga ggcttctggc tccagaactt
20400
ccttacttca tgacttaatt cgtttaaatg aaacccagtt gagttcaatt caattttgat
20460
ggctcatctg tactgaacac actctatgta ccaggtacgt tgctaagcat tttattttct
20520
ttacctcact tagtgatcaa aacctatgca agagatccac acgttagcct cattttacag
20580
atgaagaaac tgaggctcag agaggttaag gaatgtgccc aaggtcacac agcctggtag
20640
caaagccaag actggctctg agcccttcct tggagccggt aaagggtata actctgcaga
20700
tcgccccagg gggtcacctt gacccctcac agggcccccc tgcaggccta gggaacagga
20760
gacactccta catgcccagg gaggcttccg ccacccccag acccctccca gccagaggcc
20820
ctggggcctg catttgagcc cagctgcgtc agaccacaga gctgggtcct ttctcctgtc
20880
cagcatttcc cccgtctgca gtcacttgtg tatctctgtc atgatgttca ctgtatttcc
20940
taccacttct gttaaattta caaaatcaca ttttcattta aatcaactca ctctttccct
21000
agctcatttg gaagggaacc tatgacaacc cagggcaaag ccagcgttgc gcgctgccaa
21060
aggagaaagt gatgagcgcc gagtgctgtg atgaagcaca ggtcccgagc actggcctga
21120
tgctgttgct cttgccaaga ccccatctct cccttaaaga gaacactggg aaagtattca
21180
gtgttaaggg tgtagtggca cctaactgag acgctctcct tgacccggct gggaagactg
21240
gaagagaaga aaggattaac tttctcctgg tatgattcag ggctaacgag tgaacagcag
21300
ggcccacaga atctccacct gagcccctgg cctccgcgcc tcccccccca ctctgggaag
21360
ttccacagtg cagcgcttgg agcatttcct ggctccaggg tgcagtgggg gcggggtggg
21420
gggctgtcac ccagagggcg agtccccagg ccgacccttc cccctgcctc acagacctca
21480
gcagagagga agcactctcc cccgacgccc gcagcacggg attctgtcac ctcttaacaa
21540
gctcgagaaa caatctgctt ccttggaacc aagctaagaa gtttagcctg tacagtttaa
21600
acaggcgctg cgtgaaccat ctgcaaagaa aagtgagcgc tgggaagggt agctctggcc
21660
acgcttttta attgctattt attgtccctt aaaaaaatgt cttcttgctt cactgcatgt
21720
gaacagggga aggttgagca ggcagacatg gcctgtgtgt tcagatccag cagctccgat
21780
tccatcccat ccacagtcta cattacggag gggccttggg ggcgggcgat gccaccgttt
21840
ctcaaatgcc cttcattctg gggcctttgc acatcgctgt ccctctgttg tcctctttcc
21900
atgagtcctc cattctcgtc ccggccgcag gtcggactgg cctgggggct ccgggaccac
21960
cagtccactg aaggcagaac ctctagggga ggggttggga tgggcatggc tcccaggcac
22020
cccccacccc caccccagga tcctagttcc tgctgatcct tcaggtctgg cgtcctgact
22080
ccctcagtgg gtttgtacat tccagtggca cctacagctc tttgctcaca ttgatgtcag
22140
gagtaatgag cttgttgggt cacctcgtcc tgtgaaaggt cccagggggc ccggactctt
22200
ggtcttgcca gccatttatc ccctgcaccc atcacagggt ctggttctcc gcaggttctc
22260
ttgaagccct tagatgtttt aaatgaatga atgagaaagg tgatgcaagg tctctcaaag
22320
gacacaaggc cttcctttag cagttggaac acttccaaag tggaaagcgc cttccacgta
22380
gaatgggcta aagtgaatga ggtgtgttca gaagtctgag gtctcccagc tgtcccagat
22440
tctctggcat ttgggccagc tggggtcaga ggaaggaggg gcccgttggt acccttggcc
22500
tgtttgttgg ggtggcgtgg gggttgtcta aggcggagct aggcctccag gggcctgggt
22560
ggtgtggccc atctgtgcca gctgccctgc cctcaccctg cagggctccg tgcgctgcca
22620
tgggggtggc atctttgacc ttgaaggggt ggctccaact cctggctccg tgaatttgat
22680
ttgtcttgtg ttcaaatagc aagagaagcc ccgtttctcg agggccccct ctgcgctggg
22740
ccctgtgtta gacgcctttt cttctcttag acccaggcca gaggggcccc tgctctggct
22800
tttctccagt ggcccttctc cacagggtga ggaacccacg gggccacagg gtgcccaccc
22860
agaagggcac ccttccgacc ccagccatac ctgggatccc tgagttcctt gcccagatgt
22920
cccacacctg cctccagcgc ctgcctctgt cctcctgagg gcagacagtg ctctggtgtg
22980
tcggcctcag gcccacgagg aaggcaggga gaaaaggggg cggggagcct gggcccagag
23040
gccggccttc attctcccct gcagcccatt ccagaaggcc aaggagtcct ggaattctaa
23100
attcacacgc gtccttccag acccagcact gtgcactaga aacctgccgc gatgaagcag
23160
tgttctctgt ccaacgcagc ggccgctgga cgggatgcct gggctattga aatgcagcca
23220
gtgaagccga ggaatggaat tttaaatttt actgaattta aatttaaata gccacctgtg
23280
gctgggggtg gctgcctcac cccgggcagc acagttccac aacatgttga agctctgtgt
23340
ataaagtaag ggaagataaa atggtatttg acttaaccat tcaatatatg tggacaaagg
23400
gttggtgggc ctctgcgctc atcctgggtc ctgcaaatgt tgtgtgagtg ctgagtcact
23460
tcggtcgtgt ccgactctgt gcaaccccat ggactgtggc ccaccaggct cctctgtcca
23520
tgcgggttct ccaggcaaga atactggagt ggggtgccat gcccccctca aacgtgtggg
23580
ggcgtgtaat ggtctgcatt ctcctcactg aatctgcatg tgagtgcttg aagttatttg
23640
tcatccccat gacaacagat gagaaagaaa gtaacgtgcc aaacatcgca gaatcatcgt
23700
aggatttgag ctgggatttg ggtgcttgtc tgtctcattc caagttacta actacagtgg
23760
gtttggccct ggaccaagtg ttttcataac cattatttaa tttaatccta accacggtgc
23820
tcagacgctg gtcgaaagaa ccttcgcttg cactcgccat ctgtccagaa gttacagatt
23880
gcccccagtt ttacaaacaa ggaagcaggc tccagttcac ctagtctcag ggaggcctgc
23940
cggaactcac gaagaaacct atagagtcag atttcacacc caggactgcc aaacttcaaa
24000
gccagtgttt tggggggatg actgagatta acttcttctc ctgggacgtc cctcgcatcg
24060
aagaatttca aagccagtct gcaagatttt attgaattct ggctggagag cgatttcttc
24120
ccaaagtcat gcacactggc cacagaaagc agccgagtca tgctcgttac tgctatgtgt
24180
ggaaaaattc atggtccatg cctcaaccag tctgaccagt ttcatgctgg agctagaaaa
24240
catgtccgca gaccgcctcc ctgagaatca agcccaggac atgcactaga ggcccgtggc
24300
tgctgcctcg tgactacaag gcaacgggca gggcctctca ccgtctgtga ggctgcgcgc
24360
acaggcggag gcaaacaaga gctgctccca acaagacgcc cggtgtgccg ggcgctggac
24420
gactgattct tacaaccacc tgcaaagcac gtgttctgcc tctcgagcgt tcacacattc
24480
ctttagcagt tgcttcttag gagctgcctg tgtgccatgc acagggtgaa atgtggtcac
24540
tgtcttcaca gggaagacag catcaaagaa ggagccaggt ccataaagga gataattaca
24600
gtgtgggaca gacaccttgg agacactgca gactgagctg tcagggaaag ggacgctgga
24660
gctgagacct gtgggagggg aatgggccag cctcagggtc tttgcacctg ctttccctgc
24720
tgcctgcagt gcttctgttt ctgagcagaa gaaccccagt gtggaggtct gggctcagct
24780
tccgctcccc tagtgcgacc ttgggcaagc cactttccta ccctgggcct cagtttcctc
24840
ctctgtcaac tgggattgca ctaaacacat ctcaaaattc agcttctaag atgcagtaaa
24900
tgtcctgcct tcctagcaaa aagctgctgt tgttcaaatt tgttgttggg tagttgctaa
24960
gtcatgtccg actcttttgt gaccccatgg actttagctc accaggccct tctgtccatg
25020
ggatttttcc aggcaagaat actggagtgg gttgccattt atcgatttaa ttataaaaac
25080
ttgtttggag acatttggaa agtcgagaaa agtttaaatg agaagtcagg catggcagta
25140
attctccttc tggaatttca cagttgccgg tggtcttgct cctcatcccc tcccaggtgc
25200
ccaactgaac tcacccgcca ggggtaatgg caataacaat aataataata atacaattat
25260
gtgatcaaca gcagttaaca tttgtttagc acttactgtt aacctggcct tcttgtaagc
25320
caacagcatg ggttatttta tttaatttgc atgataactc agagatgcat atgctattta
25380
tacaatgttt gtgctatacc tcattataca ggtggggaaa ctgaacttca gagatgtcaa
25440
gtaacttgtt caagtaccat ggcaagtgtg tggggacaca gggagtaact gctgctcgat
25500
agatatttaa ttagttgtgt tacgtctctc tcccgggtta aagtactgaa ccaagtcttg
25560
actgccactt aagaccaaga gacgtccttc atagatattt tttcctccca agttattttt
25620
atctcatgtt tgtttttctt tcctgctggt ttgtttttct cctatttttt cataggagaa
25680
aatactgaaa atctagaaaa acatagacaa agaaattaaa gtctgcatca cccgctgcat
25740
gctctattgt tactatttgg cacatttctc ccaggctctt tccactgtga tttcttttct
25800
tttttaacac ggctgagctc agaccgtgtc tgcggttctc tctgctgctc tttcactcgg
25860
tgttctctgt gttattatga actctttata aacacctttt taataactga tgcagccttt
25920
gacaagtgac tcatacagta tttaaaaagc gagaacttgt aagtacttcc atttcactta
25980
taagtacatc catttcacct gtgagggatc aggcagccca gagcaggaga aaagttccag
26040
ggtttcggag cccccttcca caacccacca gctttgcctt cttcggggag tcacctccag
26100
tccccgattc tcatttttaa ctgagaactc caatcatgta atagctgctt ctggtggcta
26160
gcatgaggcc
3.a.3.6.C30.Q3.C
actgaagcat tcttttaaaa atgtgcacat aaagccagtg
26220
tttgctctgg acaacgtttt gatttatttg gaaaaactga tctctttcag aatgcagaag
26280
tgggctccaa tttaatcaaa ctgtcgtcta caacagcacg agccctaaag gaaaaagatg
26340
accaggtcta ctgtcagccg gagttgctct gtgaaggtaa aacgcatctt ctctcctgtg
26400
ttcagaggaa ggggtggatg tcctctttac gtaacgcctc cttccagtgc agaggagctg
26460
tgtgtggacc agggaggcaa gggaaggctg tgggactcta ggcattgtac tggagacgct
26520
tctctctcat gtaccttcat acccctccca gcactggggt ggtctctaga cccaggaact
26580
ttcacggctg gccttttgaa tgctgccatg taattttttt gctttctttt aaaacattta
26640
tttatttgtt gcatttaaat ttttatacac tttttaaagg ttactttcca cttagttatt
26700
ccaaaatatt ggctgtgttc cccatgttgt acaatcgtcc ttgagcctgt cttacatcca
26760
atagtctgta cttccctgtc cccaccccta tatcatcccc tccttccccc tcactggtaa
26820
ccactaactt gttctctgtc tcaaatgctg ccatgcattt gcaaaaaatg ctcgacatcg
26880
cttattatta gagaaatgga aatcaaaact gcagggaggt atcagctcac accagtcaga
26940
27000
27060
27120
27180
27240
27300
27360
27420
27480
27540
27600
27660
27720
27780
27840
27900
27960
28020
28080
28140
28200
28260
28320
28380
28440
28500
28560
28620
28680
28740
28800
28860
28920
28980
29040
29100
29160
29220
29280
29340
atggccatca gtatggagaa ctatggaaaa gcaatcccac ggacagcaca ccatctcctc aacgtaaagc tgcggctgcg cctgcaaccc cccgtgctca agtgcgtagg ggctccaggc ttgacacagc tttggattac atataaacta tttacacttg atggggacat ttcattagat ttcacgccca tgcagtggcc gatgtttata gctggggtaa gggcctgagt gcctcgatac tcgtctcttt ccgctatatc tggcgggtgg aggactgaga atatgctgga caggtaagac gagcaccagc caaggaactc tttttaaaac taggagacga cactccagtg atggggtcac tacaatatca gctatttgtt ttttccttag atcagtgcag tcagccaaaa aagggaaccc cagtatggag tgcagggcct gcatacgcgg tgctggccag ccctcggaga gggcggaggg gactgggaac gtgctgtcac tgaggagcga tcgccacatc ggacatatgt aaaagtgata cgagccacaa taacatagtg tttcccatgt tcatctaact catacatttg tctgatttta cacttaaaaa ttatctgaag gatgttcagg cacgaagggc ccaggcttag tttgctgctg cagcctggga tgcccttggg agcaatgacg gtggtggctg ttaggcacac atagtctcaa ttttttattc gggttcaatc ttcttacctg agagtcggac tgttagtttc gaacacttgc tagcttcagg gaatgtacaa aacaaacaaa cctcttgcac atttctttga acaccctgag gtctccacct aggagaaaac gaagccacag ggcagcgctg cactgtccta cagcacgaga actggcctgc gctgccccat aaatatgtat gacataatga taagaaagtt gtgataattc tactaaaatt attccccact cacgcacata atcatgtaat ccaaggtcta ttccctcctc gatcctggct gggtggtccc aactgcctca gagtccagtg tggtcatgtc gatgggggta gagcatccag gagtttatcc gcagagaaaa aagtatagtg cctgggtccg ggaaatctca acaacttggc aggtgttcag tgtttcccag ttttagtaag tgagaaaaat
aaaaaatcta agttggtggg aaactaggaa gaaaccacag gggcacactg tgggcctcag cccccggagc ggcattgcag gagcagaagc caggaaccag ccaggacacg gccatggtgg acatggacat aattttcaca atcagcccag tctaggcagc ctttgtaaat gatgcttcct cttttttttt tctctttgat tcttaccttg ctgccctttg tggaagccat accctcctga catcaattat gagccccatc tgtctgcaaa ggtcagagag gagagggaac agtgatttct agtgaacagg cagagctgaa gctcagtggt gaagatcccc tagacagagc aactaaacag eacagaggtt ggaccatgct aaaatggaaa taaggttccc taaaaaaaat aatgtaaact taaaactacc ttccgatgct agcaccctgc aaaggctgtg aaggcttccc ctgttgagca ctgtggagca gctcagagca gctggggagt gggggacttg ggatgaaatt tgagaaactc tagcacttct cttttacgtc attacttttt ccatgcttca tttaagagct ttctatcttg ctcccgaagc accacccttg ttttgtagaa agggggcttt gcccacaatg ccaaccaagg tctgagttct tgagctcagt tgctcctaat ttattggatg aggatcctgg acttggagga aaagaattca tggagaaaga agcctggcag ctgtaacaac catattttgg gagtgctgta atcttttttt acatgctcat gctggagagg gatatagcca acgtgacaca gccatgtgct agaacaggcg cgccacgcag tgcccagcac gcgtccgggg gggcctgtgc gggtgtgctc ggaggggcag gaaatttaat tgaaggcttt agaatggaca aaacttttcc taacgttatc attatacgat gaagtttctg tttaaaaagt gataattaaa ttctaataga atccaactca aggatgcctg gtcagccatg ggcagccata tactggtccc cagggatgtc ttgggacaaa gggcttcact catatttact ttcctgccct tttattttgc cctgccaatg aatgggaacc gctacagtcc ggggctgatt ttcctaaact cagacatcac aaaaaacgca ccacacgccc
cgagtcactt cccattaatc ccacacccta aggggaaaga ggcttattag agacagagag
29400
aaggccaggg actcacgctt ggaaagcaat gaagaaagca cattatgctt taaaggctgg
29460
gcaatgaata tgttagatta tctgatacgt ataaaaattg caatatagtg cccaatatca
29520
gataaatcca gggttcttat ttcttaaaga attcagagtg tatatcagct ctgctgtggt
29580
ggggggacct tagcttaaag ccggagaagg aagggagctg ctttcctgtt tttgtgctga
29640
ccagaggcag gtgtcatctg gggcctatgc tcgctggtgt gaggtcccca tctgaggcgc
29700
tcaccccttc tctggggcct tgacgccttc ctattcgccg ggaatgagca tgctctgcaa
29760
ccattcattt cgcgtttctc ccctttgcat gcgcagataa taaaatacga cattctcacc
29820
aagtcctcct tgacgtggag agaggagcac tgctggccgg cggagcccct gttcgtgccc
29880
acaccaggtg ccaaggacga ggatgatggt aaggcccggg aagggcaacc caccttctgc
29940
actgtgcttg tggtcacccc tcaaactcag acaccgttca cacagctggt gcagccatgc
30000
tttgaagtcc aggcggttct gagtgcaaag cagtagggcg agggagtccc caagcccacc
30060
cccacttctg acatcagctg caagttcagg ggtccccaag accaccctca gcttcactag
30120
gagcacccac agaagactct gaaagctgtt acactcgtgg ggacagttat ctcagcggaa
30180
gggtacagac tgaaagagaa aaggcgagtg ggatggaatc cgggagaatc ccatgggtgg
30240
tctccggtgg tcctccctgc gggagtcacg gaccactccc acttccccaa cgagccccac
30300
ggagcgcggc caaccagggg tgctccctca gcctcggagc ccatgtgggg cgtggtcgtg
30360
cggaccgcgc tgacttctag tcttcacccc tctggaggtc gagctgacgg ctgtggccca
30420
agtactaaat tgcttccgtc gtgtcccact ctttgcgtcc ccatggactg ttgccgacca
30480
ggctcctctg tccatcggat tctccaggca ggaacactgg agtgggctgc catgccctcc
30540
tcctggggat cttcctgacc caaggatcca acctgcatct cttacgtctc ctgcattggc
30600
aggtgagacc tttaccatta gtgcctcctg ggaagccctg tggcccaagg ccgggttgga
30660
tggcacagcc tcaggtaaac aaagacctct cgtcagaccg acagtccaag gactcgggcg
30720
tcacctccca ggggcagagg agagacctcc ttttggggga gactaacaca aaagcccact
30780
tccttgccac agggcaacac tgccctggcc cagagtgtgg aatcctggtt ttaaaattca
30840
ttgcagtggc tgggtcagaa cttgtaactc ctctgagcac ataaactttc cagggagccc
30900
cttaaaaagc ttgtgggact cctctctcat ccaggtttcc acctgaccta aaggtccttt
30960
ccctccttga ttttgcccca gaaatgccct gcccgtcccc ctggggatgc caaggagggt
31020
aaatagaagc ctgggtgggg ctgccaacac ctggggggag ggcaggctgc ccgctgagcc
31080
tgccagctgc ttccctcggg cgagaaggga gtggttttct acctgggagt gtcttcctat
31140
tgacctgcac ggcacagcgt gccaggcagg ggacccgtgt cactgtcgcc tctaaggtag
31200
gggacggttg gtgccacctc ctgccgaggc tggcctcccc cgggttgggc gtgtacgccc
31260
tgcatattct cctgtggaca ggaagcagcc ttgtcttcat tccaccagaa gaaaggctca
31320
gagcttgagg tcacttaccc acggttgcac gcccagtaca tacacagggg agggtttgag
31380
tccaggctga tttgacctca gttgtgctca gccactctga aacccgggag gcctcgggaa
31440
gaatcaggaa tatttgctgg aagtttattg ctctattttg tttagccctc tgtccctttg
31500
agagtgaata tttagactat tctcatttca caaaggggaa agctgcacac gcatgaggtg
31560
gttgcactta ttcaaacgcc atttgaatgc agcaggtctg agtttaggct aagtgcagtg
31620
ttgaagctga ttcgctgtaa ctggaatttc taaatagtag gaaaatagcc atttttctgt
31680
atgtgcaata attcacaaag tgctttcata actgaccgtg caagatcctc tcagcaaccc
31740
cctgagagtg gggagattat ccccactcta cagatgaaga gactgaggct cagaaaggtg
31800
aggggatggg cccagccaag ccttccgatt tccctggcct ttcccttcag ctcacacaga
31860
atatcaaatt cacttccctt ctgtttcgat tcaattcagt tcagttcagt tcagtcgctc
31920
agtcatgtcg actctttgcg accccatgaa tcacagcaca ccaggcctcc ctgtccatca
31980
ccaactcccg gagttcaccc aaactcacgt ccatcgagtc agtgatgcca tccagccatt
32040
tcatcctctg tcatccttct cttcctgccc ccaatccctc cccagcatca gggtcttttc
32100
caatgagtca actcttcgca ttaggtacaa agaacaggga gacagagctg agaaacttag
32160
cacccccctc cccaccccag ctggaagagg ttcgagggtg cacttttaca aaatgtctct
32220
ctttgtttcc ccaaagggat tatcttatcg gccatagtct ctaccgatcc ccagaagtcg
32280
ccttttctgc tggttctcga tgccagaact tttacggaac tggcccgcgc gtccgtcgat
32340
gtggagatgc acctggattt ccacgggctg ttcatcccag atgcaggcag ggacccgggg
32400
aagcaggccc cttcccagga ggcgccggcc agggctgccg ccggacgtgc ggccccaaga
32460
acctgacagc ctggaggctt tggtcctggg gaccagctcc gcccagctca cggctgtccc
32520
cgccccgggg gaggtgcggg agaggtggtc cttcgttcca tttcgcactt attctttccg
32580
cagctgcttt gagtcaacat tctga
32605
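The running position printed with each row of the listing allows a simple consistency check on any transcribed copy: the number of bases in a row, added to the previous printed position, should equal the position printed for that row. The following is a minimal Python sketch of that check, assuming rows are supplied as (text, position) pairs. The function name `check_rows` is illustrative only and is not part of the specification.

```python
def check_rows(rows, start=0):
    """Return the printed positions of rows that fail the check.

    rows  -- iterable of (sequence_text, printed_position) pairs
    start -- printed position of the row preceding the first pair
    """
    valid = set("acgt")
    failed = []
    previous = start
    for text, position in rows:
        bases = text.replace(" ", "")
        # A row fails if it holds a non-nucleotide character or if its
        # length does not account for the step in the printed position.
        if not set(bases) <= valid or previous + len(bases) != position:
            failed.append(position)
        previous = position
    return failed


# Last two rows of SEQ ID NO: 1 as printed above.
rows = [
    ("cgccccgggg gaggtgcggg agaggtggtc cttcgttcca tttcgcactt attctttccg", 32580),
    ("cagctgcttt gagtcaacat tctga", 32605),
]
print(check_rows(rows, start=32520))  # prints [] when both rows are consistent
```

A row containing a character other than a, c, g or t, or a row whose length does not match the step in the printed position, is reported by its printed position.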
<210> 2 <211> 1791 <212> DNA <213> Bos taurus <220>
<221> CDS <222> (1)..(1791)
<400> 2
atg gaa ata ata ttt ggc aga aat aag aag gag caa ctg gag cct gtg 48
Met Glu Ile Ile Phe Gly Arg Asn Lys Lys Glu Gln Leu Glu Pro Val 1 5 10 15
agg gcc aga gta aca ggc aag att cca gcc tgg ctg cag ggg atc ctg 96
Arg Ala Arg Val Thr Gly Lys Ile Pro Ala Trp Leu Gln Gly Ile Leu 20 25 30
ctc cgc aat ggg cct ggg atg cac acg gtg ggc gag acc aga tac aac 144
Leu Arg Asn Gly Pro Gly Met His Thr Val Gly Glu Thr Arg Tyr Asn 35 40 45
cac tgg ttc gat ggc ctg gcc ttg ctc cac agc ttc acc atc aga gat 192
His Trp Phe Asp Gly Leu Ala Leu Leu His Ser Phe Thr Ile Arg Asp 50 55 60
ggt gaa gtc tac tac agg agc aaa tac ctg aga agt gat acc tac act 240
Gly Glu Val Tyr Tyr Arg Ser Lys Tyr Leu Arg Ser Asp Thr Tyr Thr 65 70 75 80
gcc aac atc gaa gca aac agg att gtg gtg tca gaa ttc gga aca atg 288
Ala Asn Ile Glu Ala Asn Arg Ile Val Val Ser Glu Phe Gly Thr Met 85 90 95
gcc tat cca gac ccc tgc aaa aac ata ttt tcc aaa gct ttc tcc tac 336
Ala Tyr Pro Asp Pro Cys Lys Asn Ile Phe Ser Lys Ala Phe Ser Tyr 100 105 110
ctg tcc cac act atc ccc gat ttc aca gac aac tgt ctg atc aac atc 384
Leu Ser His Thr Ile Pro Asp Phe Thr Asp Asn Cys Leu Ile Asn Ile 115 120 125
agg agg tgc gga gaa gac ttc tac gcg acc aca gag acc agt tac atc 432
Arg Arg Cys Gly Glu Asp Phe Tyr Ala Thr Thr Glu Thr Ser Tyr Ile 130 135 140
agg agg atc aac ccc cag acc ctg gaa acc ctg gag aag gtt gat ttt 480
Arg Arg Ile Asn Pro Gln Thr Leu Glu Thr Leu Glu Lys Val Asp Phe 145 150 155 160
cgt aaa tat gtg gct gta aat ctg gca act tca cat cct cac tac gac 528
Arg Lys Tyr Val Ala Val Asn Leu Ala Thr Ser His Pro His Tyr Asp 165 170 175
gct gct gga aat gtt ctc aat gtt ggc acg tcc atc gtg gac aag ggg 576
Ala Ala Gly Asn Val Leu Asn Val Gly Thr Ser Ile Val Asp Lys Gly 180 185 190
aag aca aaa tac gtg atc ttt aag atc cct gcc cca gtc cca ggg ggc 624
Lys Thr Lys Tyr Val Ile Phe Lys Ile Pro Ala Pro Val Pro Gly Gly 195 200 205
agg aag gag ggc cgg agc ccc ctg aag gac acg gag gtc ttc tgc tcc 672
Arg Lys Glu Gly Arg Ser Pro Leu Lys Asp Thr Glu Val Phe Cys Ser 210 215 220
atc gcc gcc cac tcc ctc ctc tcc ccg agc tat tac cac agc ttc gga 720
Ile Ala Ala His Ser Leu Leu Ser Pro Ser Tyr Tyr His Ser Phe Gly 225 230 235 240
gtc agc gag aac tac atc att ttc ctc gag cag cct ttc aag ttg gac 768
Val Ser Glu Asn Tyr Ile Ile Phe Leu Glu Gln Pro Phe Lys Leu Asp 245 250 255
atc ctc aag atg gcc acg gct tac att cgg ggt gtg agc tgg gct tcc 816
Ile Leu Lys Met Ala Thr Ala Tyr Ile Arg Gly Val Ser Trp Ala Ser 260 265 270
tgc ctg gcc ttc cac ggg gag gac aag act cac atc cac atc atc gac 864
Cys Leu Ala Phe His Gly Glu Asp Lys Thr His Ile His Ile Ile Asp 275 280 285
cga agg acg cgg aag ccc gtg ccg acc aag tat cac acg gac ccc atg 912
Arg Arg Thr Arg Lys Pro Val Pro Thr Lys Tyr His Thr Asp Pro Met 290 295 300
gtg gta ttt cac cac gtc aat gcc tac gag gag gac ggc tgc ctc ctg 960
Val Val Phe His His Val Asn Ala Tyr Glu Glu Asp Gly Cys Leu Leu 305 310 315 320
ttc gat gtc atc acc tac gag gac ggc agc ctc tac cag ctc ttc tac 1008
Phe Asp Val Ile Thr Tyr Glu Asp Gly Ser Leu Tyr Gln Leu Phe Tyr 325 330 335
ttg gcc aac ctg aac gag gac ttt aag gag aac tcc agg ctc acc tcc 1056
Leu Ala Asn Leu Asn Glu Asp Phe Lys Glu Asn Ser Arg Leu Thr Ser 340 345 350
atg ccc acc ctc aag agg ttc gtg ctt ccc ctc cac gtg gac aag aat 1104
Met Pro Thr Leu Lys Arg Phe Val Leu Pro Leu His Val Asp Lys Asn 355 360 365
gca gaa gtg ggc tcc aat tta atc aaa ctg tcg tct aca aca gca cga 1152
Ala Glu Val Gly Ser Asn Leu Ile Lys Leu Ser Ser Thr Thr Ala Arg 370 375 380
gcc cta aag gaa aaa gat gac cag gtc tac tgt cag ccg gag ttg ctc 1200
Ala Leu Lys Glu Lys Asp Asp Gln Val Tyr Cys Gln Pro Glu Leu Leu 385 390 395 400
tgt gaa ggc tta gaa ctg cct cac atc aat tat gcc cac aat ggg cag 1248
Cys Glu Gly Leu Glu Leu Pro His Ile Asn Tyr Ala His Asn Gly Gln 405 410 415
cca tac cgc tat atc ttt gct gct gga gtc cag tgg agc cct agg cca 1296
Pro Tyr Arg Tyr Ile Phe Ala Ala Gly Val Gln Trp Ser Pro Arg Pro 420 425 430
ttg att tac gcc gcg att cgc ctt gcc aag tcc tcc ttg acg tgg aaa 1344
Leu Ile Tyr Ala Ala Ile Arg Leu Ala Lys Ser Ser Leu Thr Trp Lys 435 440 445
gag gag cac tgc tgg ccg gcg gag ccc ctg ttc gtg ccc aca cca ggt 1392
Glu Glu His Cys Trp Pro Ala Glu Pro Leu Phe Val Pro Thr Pro Gly 450 455 460
gcc aag gac gag gat gat ggg att atc tta tcg gcc ata gtc tct acc 1440
Ala Lys Asp Glu Asp Asp Gly Ile Ile Leu Ser Ala Ile Val Ser Thr 465 470 475 480
gat ccc cag aag tcg cct ttt ctg ctg gtt ctc gat gcc aga act ttt 1488
Asp Pro Gln Lys Ser Pro Phe Leu Leu Val Leu Asp Ala Arg Thr Phe 485 490 495
acg gaa ctg gcc cgc gcg tcc gtc gat gtg gag atg cac ctg gat ttc 1536
Thr Glu Leu Ala Arg Ala Ser Val Asp Val Glu Met His Leu Asp Phe 500 505 510
cac ggg ctg ttc atc cca gat gca ggc agg gac ccg ggg aag cag gcc 1584
His Gly Leu Phe Ile Pro Asp Ala Gly Arg Asp Pro Gly Lys Gln Ala 515 520 525
cct tcc cag gag gcg ccg gcc agg gct gcc gcc gga cgt gcg gcc cca 1632
Pro Ser Gln Glu Ala Pro Ala Arg Ala Ala Ala Gly Arg Ala Ala Pro 530 535 540
aga act gac agc ctg gag gct ttg gtc ctg ggg acc agc tcc gcc cag 1680
Arg Thr Asp Ser Leu Glu Ala Leu Val Leu Gly Thr Ser Ser Ala Gln 545 550 555 560
ctc acg gct gtc ccc gcc ccg ggg gaa ggg cgg gag agt ggt cct tcg 1728
Leu Thr Ala Val Pro Ala Pro Gly Glu Gly Arg Glu Ser Gly Pro Ser 565 570 575
ttc cat ttc gca cat att ctt tcc gca gct gct ttg agt caa aat tct 1776
Phe His Phe Ala His Ile Leu Ser Ala Ala Ala Leu Ser Gln Asn Ser 580 585 590
gaa acg gaa aca taa 1791
Glu Thr Glu Thr 595
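The pairing of each codon row with its amino acid row in SEQ ID NO: 2 can be checked by translating the codons with the standard genetic code. The following is a minimal Python sketch of such a check, assuming the standard nuclear code applies to this coding sequence. The names `CODON_TABLE`, `THREE_LETTER` and `translate` are illustrative only, and residues are written with the usual three-letter names (for example Ile and Gln).

```python
from itertools import product

# Standard genetic code, with codons enumerated in t, c, a, g order
# (first base varies slowest, third base fastest).
BASES = "tcag"
ONE_LETTER = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): residue
    for codon, residue in zip(product(BASES, repeat=3), ONE_LETTER)
}

THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "Stop",
}


def translate(codons):
    """Translate space-separated codons into three-letter residue names."""
    return " ".join(THREE_LETTER[CODON_TABLE[c]] for c in codons.split())


# First and last codon rows of SEQ ID NO: 2.
first_row = "atg gaa ata ata ttt ggc aga aat aag aag gag caa ctg gag cct gtg"
last_row = "gaa acg gaa aca taa"
print(translate(first_row))
print(translate(last_row))
```

Any codon row whose translation differs from the residue row printed beneath it points to a transcription error in one of the two rows.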
<210> 3 <211> 596 <212> PRT <213> Bos taurus <400> 3
Met Glu Ile Ile Phe Gly Arg Asn Lys Lys Glu Gln Leu Glu Pro Val 1 5 10 15
Arg Ala Arg Val Thr Gly Lys Ile Pro Ala Trp Leu Gln Gly Ile Leu 20 25 30
Leu Arg Asn Gly Pro Gly Met His Thr Val Gly Glu Thr Arg Tyr Asn 35 40 45
His Trp Phe Asp Gly Leu Ala Leu Leu His Ser Phe Thr Ile Arg Asp 50 55 60
Gly Glu Val Tyr Tyr Arg Ser Lys Tyr Leu Arg Ser Asp Thr Tyr Thr 65 70 75 80
Ala Asn Ile Glu Ala Asn Arg Ile Val Val Ser Glu Phe Gly Thr Met 85 90 95
Ala Tyr Pro Asp Pro Cys Lys Asn Ile Phe Ser Lys Ala Phe Ser Tyr 100 105 110
Leu Ser His Thr Ile Pro Asp Phe Thr Asp Asn Cys Leu Ile Asn Ile 115 120 125
Arg Arg Cys Gly Glu Asp Phe Tyr Ala Thr Thr Glu Thr Ser Tyr Ile 130 135 140
Arg Arg Ile Asn Pro Gln Thr Leu Glu Thr Leu Glu Lys Val Asp Phe 145 150 155 160
Arg Lys Tyr Val Ala Val Asn Leu Ala Thr Ser His Pro His Tyr Asp 165 170 175
Ala Ala Gly Asn Val Leu Asn Val Gly Thr Ser Ile Val Asp Lys Gly 180 185 190
Lys Thr Lys Tyr Val Ile Phe Lys Ile Pro Ala Pro Val Pro Gly Gly 195 200 205
Arg Lys Glu Gly Arg Ser Pro Leu Lys Asp Thr Glu Val Phe Cys Ser 210 215 220
Ile Ala Ala His Ser Leu Leu Ser Pro Ser Tyr Tyr His Ser Phe Gly 225 230 235 240
Val Ser Glu Asn Tyr Ile Ile Phe Leu Glu Gln Pro Phe Lys Leu Asp 245 250 255
Ile Leu Lys Met Ala Thr Ala Tyr Ile Arg Gly Val Ser Trp Ala Ser 260 265 270
Cys Leu Ala Phe His Gly Glu Asp Lys Thr His Ile His Ile Ile Asp 275 280 285
Arg Arg Thr Arg Lys Pro Val Pro Thr Lys Tyr His Thr Asp Pro Met 290 295 300
Val Val Phe His His Val Asn Ala Tyr Glu Glu Asp Gly Cys Leu Leu 305 310 315 320
Phe Asp Val Ile Thr Tyr Glu Asp Gly Ser Leu Tyr Gln Leu Phe Tyr 325 330 335
Leu Ala Asn Leu Asn Glu Asp Phe Lys Glu Asn Ser Arg Leu Thr Ser 340 345 350
Met Pro Thr Leu Lys Arg Phe Val Leu Pro Leu His Val Asp Lys Asn 355 360 365
Ala Glu Val Gly Ser Asn Leu Ile Lys Leu Ser Ser Thr Thr Ala Arg 370 375 380
Ala Leu Lys Glu Lys Asp Asp Gln Val Tyr Cys Gln Pro Glu Leu Leu 385 390 395 400
Cys Glu Gly Leu Glu Leu Pro His Ile Asn Tyr Ala His Asn Gly Gln 405 410 415
Pro Tyr Arg Tyr Ile Phe Ala Ala Gly Val Gln Trp Ser Pro Arg Pro 420 425 430
Leu Ile Tyr Ala Ala Ile Arg Leu Ala Lys Ser Ser Leu Thr Trp Lys 435 440 445
Glu Glu His Cys Trp Pro Ala Glu Pro Leu Phe Val Pro Thr Pro Gly 450 455 460
Ala Lys Asp Glu Asp Asp Gly Ile Ile Leu Ser Ala Ile Val Ser Thr 465 470 475 480
Asp Pro Gln Lys Ser Pro Phe Leu Leu Val Leu Asp Ala Arg Thr Phe 485 490 495
Thr Glu Leu Ala Arg Ala Ser Val Asp Val Glu Met His Leu Asp Phe 500 505 510
His Gly Leu Phe Ile Pro Asp Ala Gly Arg Asp Pro Gly Lys Gln Ala 515 520 525
Pro Ser Gln Glu Ala Pro Ala Arg Ala Ala Ala Gly Arg Ala Ala Pro 530 535 540
Arg Thr Asp Ser Leu Glu Ala Leu Val Leu Gly Thr Ser Ser Ala Gln 545 550 555 560
Leu Thr Ala Val Pro Ala Pro Gly Glu Gly Arg Glu Ser Gly Pro Ser 565 570 575
Phe His Phe Ala His Ile Leu Ser Ala Ala Ala Leu Ser Gln Asn Ser 580 585 590
Glu Thr Glu Thr 595
<210> 4 <211> 20 <212> DNA <213> Artificial <220>
<223> Synthetic <400> 4
tctcactcca cccagagagg 20
<210> 5
<211> 20
<212> DNA
<213> Artificial <220>
<223> Synthetic
<400> 5
agtcgctgtg ttgtgtccag 20
<210> 6
<211> 20
<212> DNA
<213> Artificial <220>
<223> Synthetic
<400> 6
ttgcctggac agaggagtct
<210> 7
<211> 20
<212> DNA
<213> Artificial
<220>
<223> Synthetic
<400> 7
aggctccagt tgctccttct
<210> 8
<211> 20
<212> DNA
<213> Artificial
<220>
<223> Synthetic
<400> 8
ggccctcgct ccagataaac
<210> 9
<211> 20
<212> DNA
<213> Artificial
<220>
<223> Synthetic
<400> 9
gacactgcct gggagctcgc
<210> 10
<211> 18
<212> DNA
<213> Artificial <220>
<223> Synthetic
<400> 10
ggcagaggga gcgctgag 18
<210> 11
<211>
<212> DNA
<213> Artificial
<220>
<223> Synthetic
<400> 11
tttgcaatcg gcttatggtg
<210> 12
<211> 29
<212> DNA
<213> Artificial
<220>
<223> Synthetic
<400> 12
tgtgcggccg ccatggaaat
<210> 13
<211>
<212> DNA
<213> Artificial
<220>
<223> Synthetic
<400> 13
gatagtcctc acggccaaaa
<210> 14
<211> 27
<212> DNA
<213> Artificial
<220>
<223> Synthetic
<400> 14
cctggccttc cacagggagg acaagac
<210> 15
<211> 27
<212> DNA
<213> Artificial
<220>
<223> Synthetic <400> 15
gtcttgtcct ccctgtggaa ggccagg
<210> 16
<211>
<212> DNA
<213> Artificial
<220>
<223> Synthetic
<400> 16
tctacttggc caacctggac gaggacttta aggag
<210> 17
<211> 35
<212> DNA
<213> Artificial <220>
<223> Synthetic
<400> 17
ctccttaaag tcctcgtcca ggttggccaa gtaga
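The synthetic sequences SEQ ID NOs: 14 to 17 can be related to one another and to the coding sequence by two simple operations, reverse complementation and position-wise comparison. The following is a minimal Python sketch of both. The two coding-sequence stretches are read from SEQ ID NO: 2 around codons 278 and 341 as transcribed in this listing, which is an assumption of the sketch, and the helper names are illustrative only.

```python
COMPLEMENT = str.maketrans("acgt", "tgca")


def reverse_complement(seq):
    """Return the reverse complement of a lower-case DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def mismatches(a, b):
    """List (1-based position, base in a, base in b) where a and b differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return [(i, x, y) for i, (x, y) in enumerate(zip(a, b), start=1) if x != y]


# Synthetic sequences from the listing, spaces removed.
SEQ_14 = "cctggccttccacagggaggacaagac"
SEQ_15 = "gtcttgtcctccctgtggaaggccagg"
SEQ_16 = "tctacttggccaacctggacgaggactttaaggag"
SEQ_17 = "ctccttaaagtcctcgtccaggttggccaagtaga"

# Assumption: matching stretches of SEQ ID NO: 2 around codons 278 and 341,
# as transcribed in this listing.
CDS_AT_278 = "cctggccttccacggggaggacaagac"
CDS_AT_341 = "tctacttggccaacctgaacgaggactttaaggag"

print(reverse_complement(SEQ_14) == SEQ_15)
print(reverse_complement(SEQ_16) == SEQ_17)
print(mismatches(SEQ_14, CDS_AT_278))
print(mismatches(SEQ_16, CDS_AT_341))
```

Under these assumptions SEQ ID NOs: 14 and 15 form one complementary pair and SEQ ID NOs: 16 and 17 another, and each pair differs from the coding sequence at a single base. The changes are ggg to agg at codon 278 and aac to gac at codon 341, which would be consistent with the G278R and N341D substitutions named in the claims.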
Claims (43)
1. A method of determining genetic merit of a bovine, wherein the genetic merit is with respect to a milk content phenotype or a milk colour phenotype, the method comprising determining the BCMO1 allelic profile of the bovine and determining the genetic merit of the bovine on the basis of the BCMO1 allelic profile.
2. The method as claimed in claim 1, wherein the milk content phenotype is milk β-carotene content.
3. The method as claimed in claim 2, wherein the milk β-carotene content is milk fat β-carotene content.
4. The method as claimed in any one of claims 1 to 3, wherein the allelic profile is determined with respect to DNA, mRNA and/or protein obtained from said bovine.
5. The method as claimed in any one of claims 1 to 4, wherein the allelic profile is determined by determining the presence or absence of one or more of the following: the C allele at the C-1054T promoter polymorphism in the BCMO1 gene, the T allele at the C-1054T promoter polymorphism in the BCMO1 gene, the G allele at the G15929A (G278R) polymorphism in the BCMO1 gene, the A allele at the G15929A (G278R) polymorphism in the BCMO1 gene, the A allele at the A18068G (N341D) polymorphism in the BCMO1 gene, and the G allele at the A18068G (N341D) polymorphism in the BCMO1 gene.
6. The method as claimed in claim 5 wherein the presence or absence of one or more of the alleles is determined by the use of one or more polymorphisms in linkage disequilibrium with the allele.
7. The method as claimed in any one of claims 1 to 6, wherein the allelic profile is determined by determining the expression of a BCMO1 gene or the expression or activity of a BCMO1 gene product.
8. The method as claimed in any one of claims 1 to 7, the method comprising determining milk colour or milk β-carotene content of the bovine, determining the BCMO1 allelic profile of the bovine, comparing the BCMO1 allelic profile of the bovine or the milk colour or milk β-carotene content of the bovine with that of a bovine having a known BCMO1 allelic profile; and determining the genetic merit of the bovine on the basis of the comparison.
9. A method of identifying or selecting a bovine having a desired BCMO1 allelic profile comprising determining a BCMO1 allelic profile of the bovine according to the method of any one of claims 1 to 8 and identifying or selecting said bovine on the basis of the determination.
10. A method of determining genetic merit of a bovine with respect to milk colour or milk β-carotene content, the method comprising providing data about the BCMO1 allelic profile of said bovine, and determining the genetic merit of the bovine on the basis of the data.
11. The method of claim 10 wherein the genetic merit is a capability of producing progeny that will have increased milk colour or milk β-carotene content.
12. The method of claim 10 wherein the genetic merit is a capability of producing progeny that will have decreased milk colour or milk β-carotene content.
13. A method for identifying or selecting a bovine with respect to milk colour or milk β-carotene content, or with respect to capability of producing progeny that will have increased or decreased milk colour or milk β-carotene content, the method comprising providing data about the BCMO1 allelic profile of said bovine, and identifying or selecting the bovine on the basis of the data.
14. The method as claimed in any one of claims 10 to 13, wherein the data about the BCMO1 allelic profile is indicative of the presence or absence of one or more polymorphisms selected from the group comprising: the C-1054T promoter polymorphism in the bovine BCMO1 gene, the G15929A (G278R) polymorphism in the bovine BCMO1 gene, the A18068G (N341D) polymorphism in the bovine BCMO1 gene, or one or more polymorphisms in linkage disequilibrium with one or more of the group comprising: the C-1054T promoter polymorphism of the bovine BCMO1 gene, the G15929A (G278R) polymorphism of the bovine BCMO1 gene, or the A18068G (N341D) polymorphism of the bovine BCMO1 gene.
15. The method as claimed in any one of claims 2 to 14, wherein the milk colour or milk content is increased milk colour or increased milk β-carotene content.
16. The method as claimed in claim 15, the method comprising determining
(a) the presence of the T allele at the C-1054T promoter polymorphism in the bovine BCMO1 gene, or
(b) the absence of the C allele at the C-1054T promoter polymorphism in the bovine BCMO1 gene, or
(c) the presence of the G allele at the G15929A (G278R) polymorphism in the bovine BCMO1 gene, or
(d) the absence of the A allele at the G15929A (G278R) polymorphism in the bovine BCMO1 gene, or
(e) the presence of the G allele at the A18068G (N341D) polymorphism in the bovine BCMO1 gene, or
(f) the absence of the A allele at the A18068G (N341D) polymorphism in the bovine BCMO1 gene, or
(g) any combination of two or more of (a) to (f),
and identifying or selecting the bovine on the basis of the determination.
17. The method as claimed in any one of claims 2 to 14, wherein the milk colour or milk content is decreased milk colour or decreased milk β-carotene content.
18. The method as claimed in claim 17, the method comprising determining
(a) the presence of the C allele at the C-1054T promoter polymorphism in the bovine BCMO1 gene, or
(b) the absence of the T allele at the C-1054T promoter polymorphism in the bovine BCMO1 gene, or
(c) the presence of the A allele at the G15929A (G278R) polymorphism in the bovine BCMO1 gene, or
(d) the absence of the G allele at the G15929A (G278R) polymorphism in the bovine BCMO1 gene, or
(e) the presence of the A allele at the A18068G (N341D) polymorphism in the bovine BCMO1 gene, or
(f) the absence of the G allele at the A18068G (N341D) polymorphism in the bovine BCMO1 gene, or
(g) any combination of two or more of (a) to (f),
and identifying or selecting the bovine on the basis of the determination.
19. The method as claimed in any one of claims 1 to 18, further comprising the step of amplifying at least a fragment of the bovine BCMO1 gene to determine the presence or absence of one or more polymorphisms associated with increased or decreased expression or activity of a BCMO1 gene product.
20. The method as claimed in claim 19, wherein the primers used in the amplification are selected from the group consisting of SEQ ID NOs: 4 to 17.
21. The method of any one of claims 10 to 20, wherein the method additionally comprises providing data indicative of the presence or absence of one or more of the group comprising the C allele at the C-321G promoter polymorphism in the SCARB1 gene, the G allele at the C-321G promoter polymorphism in the SCARB1 gene, the G allele at the W80Stop G/A polymorphism in the BC02 gene, the A allele at the W80Stop G/A polymorphism in the BC02 gene, or one or more polymorphisms in linkage disequilibrium with one or more of the group comprising the C allele at the C-321G promoter polymorphism in the SCARB1 gene, the G allele at the C-321 G promoter polymorphism in the SCARB1 gene, the G allele at the W80Stop G/A polymorphism in the BC02 gene, or the A allele at the W80Stop G/A polymorphism in the BC02 gene.
22. A probe or primer comprising a nucleotide sequence selected from the group comprising: (a) from 12 to 10000 contiguous nucleotides of SEQ ID NO:l and comprising a cytosine at the C-1054T promoter polymorphism or a nucleotide capable of hybridising to a thymine at the C-1054T promoter polymorphism; or (b) from 12 to 1791 contiguous nucleotides of SEQ ID NO:2 and comprising a cytosine at the C-1054T polymorphism or a nucleotide capable of hybridising to a thymine at the C-1054T promoter polymorphism; or (c) from 12 to 10000 contiguous nucleotides of SEQ ID NO:l and comprising a thymine at the C-1054T polymorphism or a nucleotide capable of hybridising to an adenine at the C-1054T promoter polymorphism; or (d) from 12 to 1791 contiguous nucleotides of SEQ ID NO:2 and comprising a thymine at the C-1054T polymorphism or a nucleotide capable of hybridising to an adenine at the C-1054T promoter polymorphism; or 564717 95 (e) from 12 to 10000 contiguous nucleotides of SEQ ID NO:l and comprising a guanine at the G15929A (G278R) polymorphism or a nucleotide capable of hybridising to a cytosine at the G15929A (G278R) polymorphism; or (f) from 12 to 1791 contiguous nucleotides of SEQ ID NO:2 and comprising a guanine at the G15929A (G278R) polymorphism or a nucleotide capable of hybridising to a cytosine at the G15929A (G278R) polymorphism; or (g) from 12 to 10000 contiguous nucleotides of SEQ ID NO:l and comprising an adenine at the G15929A (G278R) polymorphism or a nucleotide capable of hybridising to a thymine at the G15929A (G278R) polymorphism; or (h) from 12 to 1791 contiguous nucleotides of SEQ ID NO:2 and comprising an adenine at the G15929A (G278R) polymorphism or a nucleotide capable of hybridising to a thymine at the G15929A (G278R) polymorphism; or (i) from 12 to 10000 contiguous nucleotides of SEQ ID NO:l and comprising a guanine at the the A18068G (N341D) polymorphism or a nucleotide capable of hybridising to a cytosine at the Al 8068G (N341D) polymorphism; or 
(j) from 12 to 1791 contiguous nucleotides of SEQ ID NO:2 and comprising a guanine at the A18068G (N341D) polymorphism or a nucleotide capable of hybridising to a cytosine at the A18068G (N341D) polymorphism; or (k) from 12 to 10000 contiguous nucleotides of SEQ ID NO:l and comprising an adenine at the A18068G (N341D) polymorphism or a nucleotide capable of hybridising to a thymine at the A18068G (N341D) polymorphism; or (1) from 12 to 1791 contiguous nucleotides of SEQ ID NO:2 and comprising an adenine at the A18068G (N341D) polymorphism or a nucleotide capable of hybridising to a thymine at the A18068G (N341D) polymorphism.
23. A probe or primer having at least about 12 contiguous bases of any one of SEQ ID NOs: 4-17.
24. A probe or primer as claimed in claim 22 comprising a nucleotide sequence comprising at least about 12 contiguous bases of SEQ ID NO: 1 or SEQ ID NO: 2 wherein the about 12 contiguous bases comprise or are within about 1 to about 2000 nucleotides of one or more of the group consisting of the C-1054T promoter polymorphism of the bovine BCMO1 gene, the G15929A (G278R) polymorphism of the bovine BCMO1 gene, or the A18068G (N341D) polymorphism of the bovine BCMO1 gene.
25. A pair of primers comprising any two primers as claimed in any one of claims 22 to 24.
26. A bovine identified by the method of any one of claims 9 or 13 to 20.
27. A bovine as claimed in claim 26, wherein the bovine is a bull.
28. Collected semen produced by a bovine as claimed in claim 27.
29. A bovine as claimed in claim 26, wherein the bovine is a cow.
30. A method of selecting a herd of bovine, comprising selecting individuals by the method of any one of claims 9 or 13 to 20, and segregating and collecting the selected individuals to form the herd.
31. A herd of bovine selected by the method of claim 30.
32. A herd of bovine comprising two or more bovine, wherein the bovine are the progeny of one or more bovine selected by the method of any one of claims 9 or 13 to 20.
33. Collected or pooled milk produced by bovine as claimed in claim 29 or by a herd of bovine as claimed in claim 31 or 32.
34. Collected or pooled milk as claimed in claim 33 having increased or decreased colour or increased or decreased β-carotene content when compared to milk produced by a bovine having a BCMO1 gene comprising the nucleotide sequence of SEQ ID NO: 1.
35. A dairy product made from the milk as claimed in any one of claims 33 to 34.
36. A kit for genotyping a bovine with respect to one or more milk colour or β-carotene content phenotypes, comprising a probe or primer as defined in any one of claims 22 to 24 or a pair of primers as defined in claim 25.
37. An isolated, purified or recombinant nucleic acid molecule comprising a nucleotide sequence selected from the group comprising:
(a) from 12 to 20000 contiguous nucleotides of SEQ ID NO:1 and comprising one or more of the C-1054T promoter polymorphism, the G15929A (G278R) polymorphism, or the A18068G (N341D) polymorphism; or
(b) from 12 to 1791 contiguous nucleotides of SEQ ID NO:2 and comprising one or more of the C-1054T promoter polymorphism, the G15929A (G278R) polymorphism, or the A18068G (N341D) polymorphism; or
(c) from 12 to 20000 contiguous nucleotides of a functional variant of SEQ ID NO:1; or
(d) from 12 to 1791 contiguous nucleotides of a functional variant of SEQ ID NO:2; or
(e) at least 12 contiguous nucleotides of any one of SEQ ID NOs: 4-17; or
(f) a complement of any one of (a) to (e); or
(g) a sequence of at least 12 contiguous nucleotides and capable of hybridising to the nucleotide sequence of any one of (a) to (f) under stringent conditions.
38. A vector comprising the nucleic acid of claim 37.
39. A host cell comprising a vector as claimed in claim 38, wherein the host cell does not form part of a human being.
40. An isolated, purified or recombinant polypeptide comprising an amino acid sequence having at least 95% sequence identity with a sequence of at least 10 contiguous amino acids of SEQ ID NO:3, wherein the polypeptide has one or more of the following:
(a) arginine at the position corresponding to amino acid 278 of SEQ ID NO:3; or
(b) an amino acid other than glycine at the position corresponding to amino acid 278 of SEQ ID NO:3; or
(c) aspartate at the position corresponding to amino acid 341 of SEQ ID NO:3; or
(d) an amino acid other than asparagine at the position corresponding to amino acid 341 of SEQ ID NO:3; or
(e) any combination of (a) or (b) and (c) or (d).
41. The polypeptide as claimed in claim 40, wherein the polypeptide comprises at least 10 amino acids.
42. The polypeptide as claimed in claim 40 wherein the polypeptide has at least about 20% of the enzymatic activity of a BCMO1 polypeptide consisting of the amino acid sequence of SEQ ID NO:3.
43. An antibody capable of binding a polypeptide as claimed in any one of claims 40 to 42.
AJ Park
AGENTS FOR THE APPLICANT
ABSTRACT
The present invention provides methods of genotyping bovine for desired milk production phenotypes, such as milk colour or milk β-carotene content phenotypes, by determining the BCMO1 genotype of said bovine. In particular, the methods are directed to determining the genotype at one or more of the C-1054T promoter polymorphism in the BCMO1 gene, the G15929A (G278R) polymorphism in the BCMO1 gene, or the A18068G (N341D) polymorphism in the BCMO1 gene, wherein each is associated with variation in milk colour or milk β-carotene content. Isolated, purified or recombinant polynucleotides and polypeptides are also provided, as are milk and milk products from bovine selected by the methods described herein.
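To make the notion of a genotype at the three named polymorphisms concrete, the following Python sketch summarises two-letter genotype calls into a simple allelic profile. It is a minimal sketch under stated assumptions: the allele pairs are read off the polymorphism names themselves (C-1054T, G15929A, A18068G), while the function name, the input format, and the output labels are hypothetical choices. It does not estimate genetic merit or predict any milk phenotype.

```python
# Allele pairs read off the polymorphism names used in the specification
# (first-named allele first). Everything else here is a hypothetical
# convention chosen for illustration.
POLYMORPHISMS = {
    "C-1054T": ("C", "T"),
    "G15929A": ("G", "A"),
    "A18068G": ("A", "G"),
}


def allelic_profile(calls: dict[str, str]) -> dict[str, str]:
    """Summarise two-letter genotype calls, e.g. {"C-1054T": "CT"}.

    Each call is labelled as homozygous for one allele or heterozygous.
    Unknown polymorphism names or unexpected alleles raise an error.
    """
    profile = {}
    for name, call in calls.items():
        if name not in POLYMORPHISMS:
            raise KeyError(f"unknown polymorphism: {name}")
        first, second = POLYMORPHISMS[name]
        alleles = call.upper()
        if len(alleles) != 2 or any(a not in (first, second) for a in alleles):
            raise ValueError(f"unexpected alleles {call!r} at {name}")
        # Count copies of the second-named allele: 0, 1 or 2.
        n_second = alleles.count(second)
        labels = (f"homozygous {first}", "heterozygous", f"homozygous {second}")
        profile[name] = labels[n_second]
    return profile


# Usage with made-up calls for a single animal.
example = allelic_profile({"C-1054T": "CT", "G15929A": "GG", "A18068G": "GG"})
print(example)
```

Keeping the site definitions in one table means a further polymorphism could be added without touching the classification logic.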
Priority Applications (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
NZ56471707A NZ564717A (en) | 2007-12-21 | 2007-12-21 | Marker assisted selection of bovine for desired milk fat colour |
AU2008261149A AU2008261149A1 (en) | 2007-12-21 | 2008-12-19 | Marker Assisted Selection Of Bovine For Desired Milk Content |
IE20081015A IE20081015A1 (en) | 2007-12-21 | 2008-12-19 | Marker assisted selection of bovine for desired milk content |
GB0823253A GB2455657A (en) | 2007-12-21 | 2008-12-19 | b-carotene 15,15-monooxygenase for genotyping bovine milk for colour or b-carotene content |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
NZ56471707A NZ564717A (en) | 2007-12-21 | 2007-12-21 | Marker assisted selection of bovine for desired milk fat colour |
Publications (1)
Publication Number | Publication Date |
---|---|
NZ564717A (en) | 2010-04-30 |
Family
ID=41129387
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
NZ56471707A NZ564717A (en) | 2007-12-21 | 2007-12-21 | Marker assisted selection of bovine for desired milk fat colour |
Country Status (2)
Country | Link |
---|---|
IE (1) | IE20081015A1 (en) |
NZ (1) | NZ564717A (en) |
- 2007-12-21: NZ application NZ56471707A, patent NZ564717A (en); not active, IP Right Cessation
- 2008-12-19: IE application IE20081015A, patent IE20081015A1 (en); not active, IP Right Cessation
Also Published As
Publication number | Publication date |
---|---|
IE20081015A1 (en) | 2009-09-02 |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
DK2380985T3 (en) | Cells expressing the vitamin K epoxide reductase, as well as their use | |
JP6424027B2 (en) | DNA markers related to the 6 traits of pigs developed by analyzing the relations with traits and their discrimination systems | |
AU2006301578B9 (en) | Method for diagnosing thromboembolic disorders and coronary heart diseases | |
EP0972075B1 (en) | Methods for assessing cardiovascular status and compositions for use thereof | |
KR101890350B1 (en) | SNP maker for predicting meat quality of pig and use thereof | |
Winter et al. | Assessment of the gene content of the chromosomal regions flanking bovine DGAT1 | |
US7238479B2 (en) | Single nucleotide polymorphism markers in the bovine CAPN1 gene to identify meat tenderness | |
WO2000022166A2 (en) | Genes for assessing cardiovascular status and compositions for use thereof | |
WO2014198686A1 (en) | Genetic test | |
WO1997041217A1 (en) | ob PROTEIN RECEPTOR GENES AND USE OF THE SAME | |
JP5424519B2 (en) | Parkin gene mutations, compositions, methods and uses | |
CN106957907B (en) | Genetic test for liver copper accumulation in dogs | |
Kale et al. | FASN gene and its role in bovine milk production | |
EP1537237A2 (en) | Use of pp2a phosphatase modulators in the treatment of mental disorders | |
NZ564717A (en) | Marker assisted selection of bovine for desired milk fat colour | |
KR101911074B1 (en) | Single Nucleotide Polymorphisms Determining Trypanosomiasis-resistance of N'Dama breeds and Use Thereof | |
GB2455657A (en) | b-carotene 15,15-monooxygenase for genotyping bovine milk for colour or b-carotene content | |
US20090148844A1 (en) | Dna marker for meat tenderness in cattle | |
NZ561998A (en) | Marker assisted selection of bovine for milk fat colour | |
JP3682688B2 (en) | Osteoporosis drug sensitivity prediction method and reagent kit therefor | |
EP2501825B1 (en) | Methods for diagnosing skin diseases | |
WO2012176125A1 (en) | Marker assisted selection of mammalian subjects with desired phenotype | |
JP3684921B2 (en) | Osteoporosis drug sensitivity prediction method | |
NZ561999A (en) | Marker assisted selection of bovine for milk fat colour | |
JP2002521061A (en) | Genetic polymorphisms in the human neurokinin 2 receptor gene and their use in diagnosis and treatment of disease |
Legal Events
Code | Title |
---|---|
PSEA | Patent sealed |
LAPS | Patent lapsed |