WO2019117704A1

WO2019117704A1 - Methods for detecting pathogenicity of ganoderma sp.

Info

Publication number: WO2019117704A1
Application number: PCT/MY2018/050010
Authority: WO
Inventors: Peck Lei CHEONG; Sue Sean TEE; Eng Piew KOK; Mei Ling CHONG; Weng Wah LEE; Suan Choo Cheah
Original assignee: Acgt Sdn Bhd
Priority date: 2017-12-14
Filing date: 2018-03-01
Publication date: 2019-06-20
Also published as: MY189736A

Abstract

The present invention generally relates to markers that can be used for detecting Ganoderma species. The present invention also relates to use of the markers in methods of screening and determining pathogenicity of a Ganoderma species, such as Ganoderma boninense (G. boninense), Ganoderma steyaertanum and Ganoderma orbiforme.

Description

METHODS FOR DETECTING PATHOGENICITY OF GANODERMA SP.

TECHNICAL FIELD

[0001] The present disclosure generally relates to markers that can be used to detect Ganoderma. More specifically, the present disclosure relates to markers that can be used to detect pathogenic Ganoderma species. The present disclosure also relates to methods of screening and determining Ganoderma pathogenicity in the Ganoderma boninense (G. boninense) species as well as other species. The disclosure further relates to methods for analyzing a sample such as, but not limited to, a soil sample, to detect the presence of G. boninense and methods for predicting the pathogenicity of the G. boninense detected using one or more markers of the present disclosure.

BACKGROUND

[0002] G. boninense is a type of white rot fungus which is threatening sustainable oil palm production in South East Asia, especially Malaysia and Indonesia. G. boninense breaks down the lignin in wood, leaving the lighter-colored cellulose behind, hence leading to the name “white rot fungus”. This fungus is a threat to the oil palm industry as it causes the basal stem rot (BSR) disease in oil palm. BSR is characterized by a decay of the bole, visible symptoms on the diseased palm such as multiple unopened spears and production of G. boninense fruiting bodies on the base of the trunk.

[0003] Colonization of G. boninense at the basal stem of the oil palm will cause disruption to the plant vascular system, resulting in reduced fruit yield. As oil palm is a food crop, BSR disease can cause severe economic losses through the direct reduction in oil palm bunch numbers and weight of fruit bunches from standing diseased palms. A survey conducted by Idris et al. in 2011 found that 3.71% of 1.59 million hectares of oil palm plantation were infected with Ganoderma, and this figure is increasing, with the reported disease incidence higher in replanting areas and in younger palms. Besides the economic loss caused by the disease, the disease also threatens the safety of workers in the field, as an infected oil palm often looks sturdy before it topples over, which can lead to injuries.

[0004] Studies have been conducted for the past 40 years to understand the disease and spread of the disease. However, to date, the understanding of the fungus and the disease caused is still insufficient to efficiently manage the spread of the disease.

[0005] Visual observation of the symptoms is the current detection method of the disease. However, little can be done once the physical symptoms are observed, as the damage to the plant has already been done. It should be noted that though BSR is often associated with G. boninense, it may be possible for other Ganoderma species to also have pathogenic ability in oil palm, and it is also possible for different varieties of G. boninense to have varying degrees of pathogenicity.

[0006] Therefore, there is a need for a means of detecting the presence of pathogenic Ganoderma, in particular pathogenic strains of G. boninense, at an early stage of infection to enable early treatment of an affected plant, and to restrict spread of the disease. This enables planters to better manage the spread of disease and consequently reduce loss of fresh fruit bunch yield and plants before their full commercial value is realized.

[0007] There is also a need to detect the presence of pathogenic Ganoderma species, in particular pathogenic strains of G. boninense, directly from an environmental sample, such as but not limited to, a soil sample.

[0008] However, in the current state of the art, there are only methods to identify the Ganoderma isolate or strain but there are no known methods for identifying and differentiating between pathogenic and non-pathogenic Ganoderma isolates using any available biological markers.

[0009] Therefore, there is further need to provide molecular markers that are capable of differentiating between pathogenic and non-pathogenic Ganoderma species to enable detecting and identifying a pathogenic Ganoderma species.

SUMMARY

[0010] In one aspect, there is provided a method for determining the pathogenicity of a Ganoderma species, wherein the method comprises the step of detecting the presence or absence of one or more sequences selected from a group consisting of SEQ ID NOs: 1 to 123 or a variant thereof, wherein the presence of one or more sequences selected from the group consisting of SEQ ID NOs: 1 to 123 or a variant thereof indicates that the Ganoderma species is pathogenic, and the absence of one or more sequences selected from the group consisting of SEQ ID NOs: 1 to 123 or a variant thereof indicates that the Ganoderma species is non-pathogenic.

[0011] Advantageously, the method allows for direct identification of Ganoderma and its pathogenicity from a sample, such as a soil sample. This translates into a shorter time for analyzing a sample, such as a soil sample, as the method eliminates the need for isolation of Ganoderma strains before analysis can be performed.

[0012] In another aspect, there is provided a method for detecting presence of a pathogenic Ganoderma species in a sample, comprising the steps of: (a) extracting DNA from the sample; (b) subjecting the DNA to sequencing to determine sequences of the extracted DNA; (c) comparing the DNA sequences determined from step (b) to a reference DNA sequence selected from a group consisting of SEQ ID NOs: 1 to 123 or a variant thereof; and (d) determining the absence or presence of one or more sequences selected from a group consisting of SEQ ID NOs: 1 to 123 or a variant thereof, wherein the presence of one or more sequences selected from the group consisting of SEQ ID NOs: 1 to 123 or a variant thereof indicates that the Ganoderma species is pathogenic, and the absence of one or more sequences selected from the group consisting of SEQ ID NOs: 1 to 123 or a variant thereof indicates that the Ganoderma species is non-pathogenic.
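Steps (c) and (d) of this method can be illustrated with a minimal sketch. It assumes the sequencing reads are already available as plain strings and uses exact substring matching on either strand as a simplified stand-in for a real sequence comparison. The function names, the toy marker sequences and their identifiers are hypothetical and are not taken from the sequence listing.

```python
# Hypothetical sketch: a marker is called "present" if any read contains it,
# or its reverse complement, as an exact substring. A real workflow would
# use an aligner and tolerate mismatches (see the definition of "variant").

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def detect_markers(reads, markers):
    """Map each marker identifier to True (present) or False (absent)."""
    upper_reads = [read.upper() for read in reads]
    calls = {}
    for marker_id, marker_seq in markers.items():
        forward = marker_seq.upper()
        reverse = reverse_complement(forward)
        calls[marker_id] = any(
            forward in read or reverse in read for read in upper_reads
        )
    return calls


def is_pathogenic(calls):
    """Presence of one or more markers indicates a pathogenic species."""
    return any(calls.values())


# Toy data, invented for illustration only.
toy_markers = {"toy_marker_A": "ACGTTGAC", "toy_marker_B": "GGGCCCAT"}
toy_reads = ["TTTACGTTGACGG", "CCCCCCCC"]
toy_calls = detect_markers(toy_reads, toy_markers)
```

With the toy data, only `toy_marker_A` is found in the reads, so the sample would be called pathogenic under the one-or-more rule of step (d).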

[0013] In another aspect, there is provided a method for identifying a nucleic acid sequence containing a marker for Ganoderma pathogenesis comprising the steps of: (a) extracting DNA from a group of pathogenic and non-pathogenic Ganoderma strains; (b) preparing whole-genome sequencing libraries of each of the pathogenic and non-pathogenic Ganoderma strain using the extracted DNA of step (a); (c) sequencing the libraries from step (b) to form a plurality of sequencing reads; (d) assembling the sequencing reads from step (c) to generate a backbone sequence for the pathogenic Ganoderma strain and a backbone sequence for the non-pathogenic Ganoderma strain; (e) mapping the sequencing reads from step (c) from each strain to the backbone sequence of step (d) to perform comparative genomics to identify variant sequence(s); (f) designing and preparing primers from the variant sequence(s) from step (e); (g) applying the primers prepared from step (f) to DNA extracted from the pathogenic and non-pathogenic Ganoderma strains for DNA amplification and sequencing; and (h) classifying the variant sequence(s) into those associated with the pathogenic Ganoderma strains and those associated with the non-pathogenic Ganoderma strains.
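Step (h) can be sketched as follows. The disclosure does not state the exact classification rule at this point, so the rule used here is an assumption made for illustration: an allele is treated as associated with a group only when every strain carrying it belongs to that group. The function name, strain labels and genotype calls are hypothetical.

```python
# Hypothetical sketch of step (h): label each allele observed at one variant
# site according to which group of strains carries it. The all-or-nothing
# rule below is an assumption, not the rule used in the disclosure.

def classify_variant(genotypes, pathogenic_strains):
    """genotypes maps strain name to allele; returns allele -> label."""
    pathogenic_strains = set(pathogenic_strains)
    labels = {}
    for allele in sorted(set(genotypes.values())):
        carriers = {s for s, a in genotypes.items() if a == allele}
        if carriers <= pathogenic_strains:
            labels[allele] = "pathogenic-associated"
        elif carriers.isdisjoint(pathogenic_strains):
            labels[allele] = "non-pathogenic-associated"
        else:
            labels[allele] = "not discriminating"
    return labels


# Toy genotype calls for two pathogenic (P1, P2) and two non-pathogenic
# (N1, N2) strains, invented for illustration only.
toy_labels = classify_variant(
    {"P1": "A", "P2": "A", "N1": "G", "N2": "G"},
    pathogenic_strains={"P1", "P2"},
)
```

A less strict rule, for example one based on the fraction of strains in each group that carry an allele, would yield markers with intermediate accuracy rather than a yes-or-no label.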

[0014] In another aspect, there is provided an oligonucleotide primer comprising a nucleic acid sequence selected from the group consisting of SEQ ID NOs: 124 to 369.

[0015] Advantageously, the methods of the present disclosure are able to determine the presence of pathogenic Ganoderma in a sample, such as a soil sample, with confidence level of up to about 100%. For example, using one of the markers (SEQ ID NO: 83) isolated by the methods of the present disclosure, it is possible to determine the presence of pathogenic Ganoderma at about 100% confidence level.

[0016] Advantageously, the methods of the present disclosure are also highly sensitive in detecting DNA from pathogenic Ganoderma in a sample. For example, one of the markers identified by the present disclosure (SEQ ID NO: 25) is capable of detecting DNA from pathogenic Ganoderma in a soil sample that is present in an amount as low as 10⁻² ng.

[0017] In another aspect, there is provided a kit for use in the methods defined herein, wherein the kit comprises: (a) one or more primer(s) selected from the group consisting of SEQ ID NOs: 124 to 369 for determining the presence or absence of one or more sequences selected from the group consisting of SEQ ID NOs: 1 to 123; (b) one or more container(s) suitable for sample collection; (c) one or more container(s) suitable for sample processing; (d) one or more reagents for sample processing; (e) one or more buffer(s); (f) one or more PCR reagent(s); and (g) instructions for use in accordance with the methods described herein.

DEFINITION OF TERMS

[0018] The following words and terms used herein shall have the meaning indicated:

[0019] The term “molecular marker” or “marker”, as used herein, refers to a sequence of genetic information that is associated with a location in the genome, a trait associated with the marker or a DNA-based feature, such as a repeat region or identifiable and conserved DNA region. A molecular marker may refer to the entirety of the sequence, or fragments of that sequence, or a flanking genetic region, that will allow for the identification of the molecular marker. Ideally, a molecular marker is polymorphic in nature so that it can be used to differentiate between the different features that it is related to. In the present disclosure, the feature related to the marker is pathogenicity of a Ganoderma isolate.

[0020] In one example, molecular markers may comprise but are not limited to single nucleotide polymorphisms (SNPs), simple sequence repeats (SSR), microsatellites, insertion-deletion of bases (INDELs), amplified fragment length polymorphism (AFLP), random amplification of polymorphic DNA (RAPD), restriction fragment length polymorphism (RFLP), unique-event polymorphism (UEP), transposable element position, copy number variation, conserved DNA regions and such.

[0021] The term “single nucleotide polymorphism” or “SNP”, as defined herein, is a DNA sequence variation that occurs when a single nucleotide (A, T, C, or G) in the genome sequence is altered or differs between members of a biological species. Thus, as used herein, a SNP is any polymorphism characterized by a different single nucleotide at a particular physical position in at least one allele. Each individual in a given population has many single nucleotide polymorphisms that together create a unique DNA pattern for that individual.

[0022] As used herein, “sequence identity” refers to the residues in two sequences that are the same when aligned for maximum correspondence over a specified window of comparison by means of computer programs known in the art, such as Burrows-Wheeler Alignment (BWA) version 0.5.9 and GAP provided in the GCG program package (Program Manual for the Wisconsin Package, Version 8, August 1996, Genetics Computer Group, 575 Science Drive, Madison, Wisconsin, USA 53711) (Needleman, S.B. and Wunsch, C.D., (1970), Journal of Molecular Biology, 48, 443-453).
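As an illustration of this definition, the sketch below computes percent identity for two sequences that have already been aligned (for example by one of the programs named above), with gaps written as "-". Using the full alignment length as the denominator is an assumption; alignment programs differ on this convention. The function name is hypothetical.

```python
# Hypothetical sketch: percent identity over a pre-computed pairwise
# alignment. Both strings must have the same length; "-" marks a gap.

def percent_identity(aligned_a, aligned_b):
    """Return identical, non-gap columns as a percentage of all columns."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    matches = sum(
        1
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if a == b and a != "-"
    )
    return 100.0 * matches / len(aligned_a)
```

For example, `percent_identity("ACGT", "ACGA")` evaluates to 75.0, because three of the four aligned positions carry the same base.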

[0023] Percentage of mappable reads is defined as the percentage of alignment of the trimmed and filtered reads to a G. boninense backbone sequence.
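The definition above amounts to a simple ratio, sketched here under the assumption that the number of trimmed and filtered reads and the number of those reads that align to the backbone are already known. The function name is hypothetical.

```python
# Hypothetical sketch: percentage of mappable reads as defined above.

def percent_mappable_reads(mapped_reads, total_reads):
    """Share of trimmed and filtered reads that align to the backbone."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    if not 0 <= mapped_reads <= total_reads:
        raise ValueError("mapped_reads must lie between 0 and total_reads")
    return 100.0 * mapped_reads / total_reads
```

For instance, 600 aligned reads out of 1000 trimmed and filtered reads gives a value of 60.0.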

[0024] The term “variant” as used herein includes a reference to substantially similar sequences. These sequence variants may have at least 25%, 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98% or 99% sequence identity to the “non-variant” nucleic acid, or to a section within the “non-variant” nucleic acid sequence. For example, a variant of any one of SEQ ID NOs: 1 to 123 may have at least about 85%, at least about 86%, at least about 87%, at least about 88%, at least about 89%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98% or at least about 99% sequence identity to that sequence. In one example, where the variant sequence includes a molecular marker, the differences that make up the variants may occur outside of the section where the molecular marker is located. That is to say that the differences that constitute the variant sequences can be found outside the location of the molecular marker. By way of an example, if a sequence comprises a SNP at base pair 267, the differences that make up the variants then can be found at any location within the variant sequence other than base pair 267.

[0025] The term “nucleic acid” includes a deoxyribonucleotide or ribonucleotide polymer in either single- or double-stranded form, and unless otherwise limited, encompasses known analogues of natural nucleotides that hybridize to nucleic acids in a manner similar to naturally occurring nucleotides. The terms “nucleic acid”, “nucleic acid molecule”, “nucleic acid sequence”, “polynucleotide” etc. may be used interchangeably herein unless the context indicates otherwise.

[0026] The term “fragment” as used herein includes a reference to a nucleic acid molecule that is a constituent of a particular nucleic acid or variant thereof. Fragments of a nucleic acid sequence, for example hybridisation probes or PCR primers, do not necessarily need to encode polypeptides which retain biological activity.

[0027] The term “oligonucleotide” as used herein refers to short nucleic acid molecules useful, for example, as hybridizing probes, nucleotide array elements or amplification primers. Oligonucleotide molecules are comprised of two or more nucleotides, i.e. deoxyribonucleotides or ribonucleotides, preferably more than five and up to 30 or more. The exact size will depend on many factors, which in turn depend on the ultimate function or use of the oligonucleotide. Oligonucleotides can comprise ligated natural nucleic acid molecules or synthesized nucleic acid molecules and comprise between 10 and 150 nucleotides or between about 12 and about 100 nucleotides which have a nucleotide sequence which can hybridize to a strand of polymorphic DNA, e.g. to permit detection of a polymorphism. Such oligonucleotides may be nucleic acid elements for use on solid arrays (e.g. synthesized or spotted). In some examples, such oligonucleotides can comprise as few as 12 hybridizing nucleotides, e.g. for assays where the oligonucleotide also comprises a detectable label. In other examples, the oligonucleotide can comprise as few as about 15 hybridizing nucleotides, e.g. for single base extension assays. Such oligonucleotides may also be primers for use in polymerase chain reaction (PCR) or other reactions.

[0028] The term “primer” as used herein refers to a nucleic acid molecule, such as an oligonucleotide, whether derived from a naturally occurring molecule such as one isolated from a restriction digest or one produced synthetically, which is capable of acting as a point of initiation of synthesis when placed under conditions in which synthesis of a primer extension product which is complementary to a nucleic acid strand is induced, i.e., in the presence of nucleotides and an agent for polymerization such as DNA polymerase and at a suitable temperature and pH. The primer is preferably single stranded for maximum efficiency in amplification, but may alternatively be double stranded. If double stranded, the primer is first treated to separate its strands before being used to prepare extension products. The primer may be an oligodeoxyribonucleotide. The primer must be sufficiently long to prime the synthesis of extension products in the presence of the agent for polymerization. The exact lengths of the primers will depend on many factors according to the particular application, including temperature and source of primer. For example, depending on the complexity of the target sequence, the oligonucleotide primer typically contains at least 18, at least 19, at least 20, at least 21, at least 22, at least 23 or at least 24 nucleotides, which are identical or complementary to the template and optionally a tail of variable length which need not match the template. It is understood the longer the nucleic acid binding sequence of the primer, the more specific the binding of the primer to the intended target sequence. However, the length of the tail should not be so long that it interferes with the recognition of the template. Short primer molecules generally require cooler temperatures to form sufficiently stable hybrid complexes with the template.

[0029] The primers herein are selected to be “substantially” complementary to the different strands of each specific sequence to be amplified. This means that the primers must be sufficiently complementary to hybridize with their respective strands. Therefore, the primer sequence need not reflect the exact sequence of the template. For example, a non-complementary nucleotide fragment may be attached to the 5' end of the primer, with the remainder of the primer sequence being complementary to the strand. Alternatively, non-complementary bases or longer sequences can be interspersed into the primer, provided that the primer sequence has sufficient complementarity with the sequence of the strand to be amplified to hybridize therewith and thereby form a template for synthesis of the extension product of the other primer. Computer generated searches using programs such as Primer3 (www-genome.wi.mit.edu/cgi-bin/primer/primer3.cgi), STSPipeline (www-genome.wi.mit.edu/cgi-bin/www-STS_Pipeline), or GeneUp, for example, can be used to identify potential PCR primers. Exemplary primers include primers that are 18 to 24 bases long, where at least 18 bases are identical or complementary to at least 18 bases of a segment of the template sequence. In some examples described herein, primers that are 20 bases long are used. In other examples, primers that are 19, 21, 22, 23, or 24 bases long are used.

[0030] The term “backbone sequence” refers to an artificial sequence generated from assembly of sequencing reads from multiple strains of an organism (such as Ganoderma in this case), which may share a similar characteristic (such as being pathogenic or non-pathogenic). The assembly which generates such a “backbone sequence” may be performed using CLC Bio Assembly Cell Version 4.06beta.67l89.

[0031] This disclosure also contemplates and provides primer pairs for amplification of nucleic acid molecules in order to detect, for example, polymorphisms or single nucleotide polymorphisms (SNPs). As used herein “primer pair” refers to a set of two oligonucleotide primers based on two separated sequence segments of a target nucleic acid sequence. One primer of the pair is a “forward primer” or “5' primer” having a sequence which is identical to the more 5' of the separated sequence segments (sometimes also denoted as the “+” strand). The other primer of the pair is a “reverse primer” or “3' primer” having a sequence which is the reverse complement of the more 3' of the separated sequence segments (sometimes denoted as the “-” strand). A primer pair allows for amplification of the nucleic acid sequence located between the binding sites of the two primers on the double-stranded template nucleic acid sequence. Optionally, each primer pair can comprise additional sequences, e.g. universal primer sequences or restriction endonuclease sites, at the 5' end of each primer, e.g. to facilitate cloning, DNA sequencing, or re-amplification of the target nucleic acid sequence.
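The relationship between a target sequence and its primer pair can be sketched as follows. This is a minimal illustration that assumes the two sequence segments sit at the very ends of the target and that both primers have the same length; the function names and the toy target are hypothetical, and no melting-temperature or specificity checks are made.

```python
# Hypothetical sketch: derive a forward and a reverse primer from the two
# ends of a target written 5'->3' on the "+" strand.

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def primer_pair(target, length=20):
    """Return (forward_primer, reverse_primer) for the given target."""
    target = target.upper()
    if len(target) < 2 * length:
        raise ValueError("target too short for two non-overlapping primers")
    # Forward primer: identical to the more 5' segment of the target.
    forward = target[:length]
    # Reverse primer: reverse complement of the more 3' segment.
    reverse = reverse_complement(target[-length:])
    return forward, reverse


# Toy target and a short primer length, chosen only to keep the example small.
toy_forward, toy_reverse = primer_pair("AACCGGTTACGTTTGGCCAA", length=4)
```

Here the forward primer is the first four bases of the target (AACC) and the reverse primer is the reverse complement of the last four bases (CCAA becomes TTGG). The default length of 20 follows the statement above that primers 20 bases long are used in some examples.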

[0032] The term “isolate” when used in reference to a microorganism isolate, such as but not limited to, Ganoderma isolates, refers to a culture of microorganism removed from its original environment.

[0033] Unless specified otherwise, the terms “comprising” and “comprise”, and grammatical variants thereof, are intended to represent “open” or “inclusive” language such that they include recited elements but also permit inclusion of additional, unrecited elements.

[0034] As used herein, the term “about”, in the context of, but not limited to, concentrations of DNA, chemicals, chemical solutions, enzymes or components of a buffer, typically means +/- 5% of the stated value, more typically +/- 4% of the stated value, more typically +/- 3% of the stated value, more typically, +/- 2% of the stated value, even more typically +/- 1% of the stated value, and even more typically +/- 0.5% of the stated value.

[0035] Throughout this disclosure, certain embodiments may be disclosed in a range format. It should be understood that the description in range format is merely for convenience and brevity and should not be construed as an inflexible limitation on the scope of the disclosed ranges. Accordingly, the description of a range should be considered to have specifically disclosed all the possible sub-ranges as well as individual numerical values within that range. For example, description of a range such as from one to six should be considered to have specifically disclosed sub-ranges such as from one to three, from one to four, from one to five, from two to four, from two to six, from three to six etc., as well as individual numbers within that range, for example, one, two, three, four, five, and six. This applies regardless of the breadth of the range.

[0036] Certain embodiments may also be described broadly and generically herein. Each of the narrower species and subgeneric groupings falling within the generic disclosure also form part of the disclosure. This includes the generic description of the embodiments with a proviso or negative limitation removing any subject matter from the genus, regardless of whether or not the excised material is specifically recited herein.

[0037] The disclosure illustratively described herein may suitably be practiced in the absence of any element or elements, limitation or limitations, not specifically disclosed herein. Additionally, the terms and expressions employed herein have been used as terms of description and not of limitation, and there is no intention in the use of such terms and expressions of excluding any equivalents of the features shown and described or portions thereof, but it is recognized that various modifications are possible within the scope of the disclosure claimed. Thus, it should be understood that although the present disclosure has been specifically disclosed by preferred embodiments and optional features, modification and variation of the disclosures embodied herein may be resorted to by those skilled in the art, and that such modifications and variations are considered to be within the scope of this disclosure.

DETAILED DISCLOSURE OF THE EMBODIMENTS

[0038] The present disclosure provides methods for identifying markers indicative of pathogenicity of a Ganoderma species. These markers may be identified by comparing genomic DNA sequences between pathogenic and non-pathogenic species of Ganoderma. For example, when a sequence is only present in the DNA of a pathogenic Ganoderma species and not in the DNA of a non-pathogenic Ganoderma species, the sequence may be selected for further use as a marker indicative of pathogenicity of a Ganoderma strain or isolate. For example, the present disclosure provides methods for the identification of single nucleotide polymorphic (SNP) markers through amplification of DNA that are able to identify and differentiate between pathogenic or non-pathogenic Ganoderma species in varying confidence intervals using the strategy as set out in Figure 2.

[0039] The present disclosure also provides uses of the identified markers for identifying a pathogenic Ganoderma species, differentiating a pathogenic Ganoderma species from a non-pathogenic Ganoderma species, and detecting a pathogenic Ganoderma species in a given sample. For example, the present disclosure provides means for screening a panel of Ganoderma isolates with known pathogenicity (pathogenic or non-pathogenic) using the MiSeq genotyping system.

[0040] A pathogenic Ganoderma species is one that is capable of causing disease in plants, such as but not limited to, stem rot disease. The diseased plants may present one or more symptoms such as, but not limited to, white mass of mycelia, leaf chlorosis, drying up of the plant, and plant death. Table 1 provides the symptoms visible on plants, according to the severity of the infection. These symptoms may be present in one or more parts of a plant infected with a pathogenic Ganoderma species such as, but not limited to, root, shoot, stem, bark, leaves, fruits and seed. A non-pathogenic Ganoderma species, on the other hand, is one that does not cause disease in plants, typically with no visible symptoms as described above.

[0041] The inventors have identified 123 markers (comprising 123 SNP markers) that are able to differentiate between pathogenic and non-pathogenic Ganoderma species, with up to 100% accuracy, as shown in Table 4. For example, the markers comprising SEQ ID NO: 36, SEQ ID NO: 39, SEQ ID NO: 60, SEQ ID NO: 61, SEQ ID NO: 65, SEQ ID NO: 71, SEQ ID NO: 72, SEQ ID NO: 78, SEQ ID NO: 84, SEQ ID NO: 96, and SEQ ID NO: 116 are able to achieve the differentiation at about 50% accuracy. In another example, the markers comprising SEQ ID NO: 10, SEQ ID NO: 13, SEQ ID NO: 45, SEQ ID NO: 48, SEQ ID NO: 51, SEQ ID NO: 54, SEQ ID NO: 57, SEQ ID NO: 59, SEQ ID NO: 62, SEQ ID NO: 66, SEQ ID NO: 70, SEQ ID NO: 76, SEQ ID NO: 77, SEQ ID NO: 87, SEQ ID NO: 94, SEQ ID NO: 99, SEQ ID NO: 101, SEQ ID NO: 103, SEQ ID NO: 112, SEQ ID NO: 119, and SEQ ID NO: 122 are able to achieve the differentiation at about 60% accuracy. In another example, the markers comprising SEQ ID NOs: 25 to 27, SEQ ID NO: 33, SEQ ID NO: 37, SEQ ID NO: 53, SEQ ID NO: 64, SEQ ID NO: 74, SEQ ID NO: 81, SEQ ID NO: 85, SEQ ID NO: 86, SEQ ID NO: 93, SEQ ID NO: 106, SEQ ID NO: 117, and SEQ ID NO: 120 are able to achieve the differentiation at about 70% accuracy. In another example, the markers comprising SEQ ID NO: 38, SEQ ID NO: 46, SEQ ID NO: 49, SEQ ID NO: 55, SEQ ID NO: 56, SEQ ID NO: 63, SEQ ID NO: 67, SEQ ID NO: 69, SEQ ID NO: 90, SEQ ID NO: 95, SEQ ID NO: 97, SEQ ID NO: 98, SEQ ID NO: 102, SEQ ID NO: 104, SEQ ID NO: 109, SEQ ID NO: 111, and SEQ ID NO: 113 are able to achieve the differentiation at about 80% accuracy. Markers such as those comprising SEQ ID NO: 19, SEQ ID NO: 28, SEQ ID NO: 50, SEQ ID NO: 52, SEQ ID NO: 58, SEQ ID NO: 92, SEQ ID NO: 100, SEQ ID NO: 105, SEQ ID NO: 107, SEQ ID NO: 108, and SEQ ID NO: 110 are able to achieve the differentiation at about 90% accuracy. In another example, the marker comprising SEQ ID NO: 83 is able to achieve the differentiation at about 100% accuracy.
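The accuracy groupings in paragraph [0041] can be held in a small lookup table and filtered by a minimum accuracy. The sketch below includes only the markers named in that paragraph, keyed by the approximate accuracy stated for each group; the variable and function names are hypothetical.

```python
# SEQ ID NO numbers grouped by the approximate accuracy (in percent) stated
# in paragraph [0041]. Markers not named in that paragraph are not included.
ACCURACY_TIERS = {
    50: [36, 39, 60, 61, 65, 71, 72, 78, 84, 96, 116],
    60: [10, 13, 45, 48, 51, 54, 57, 59, 62, 66, 70, 76, 77, 87, 94, 99,
         101, 103, 112, 119, 122],
    70: [25, 26, 27, 33, 37, 53, 64, 74, 81, 85, 86, 93, 106, 117, 120],
    80: [38, 46, 49, 55, 56, 63, 67, 69, 90, 95, 97, 98, 102, 104, 109,
         111, 113],
    90: [19, 28, 50, 52, 58, 92, 100, 105, 107, 108, 110],
    100: [83],
}


def select_markers(min_accuracy):
    """Return sorted SEQ ID NO numbers with accuracy >= min_accuracy."""
    selected = []
    for accuracy, seq_ids in ACCURACY_TIERS.items():
        if accuracy >= min_accuracy:
            selected.extend(seq_ids)
    return sorted(selected)
```

Calling `select_markers(70)` yields 44 markers, which is the same set recited in paragraph [0048], and `select_markers(100)` yields only SEQ ID NO: 83.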

[0042] The present disclosure also provides the determination of pathogenic and non-pathogenic Ganoderma species by performing artificial infection in oil palm seedlings. The present disclosure further provides the use of Ganoderma isolates to test the pathogenicity (pathogenic or non-pathogenic) detection ability of the SNP-linked PCR primers. Exemplary isolates which may be useful include, but are not limited to, the Ganoderma isolates as set out in Table 2 of the present disclosure. The skilled person can further determine other Ganoderma strains/isolates suitable for testing pathogenicity using methods disclosed in the present disclosure. The flanking regions containing the SNP markers are provided in Table 7.

[0043] These SNP marker related primer sets can be used as a tool for early detection of Ganoderma in the field and in the laboratory and do not rely on the exhibition of physical symptoms of the Ganoderma wood rot disease (which are typically only exhibited at a late stage of infection). The sequences for these SNP marker related PCR primer sets are provided in Table 8 which may be used to amplify a DNA region from a Ganoderma isolate that is flanked by sequences complementary with the PCR primer sets, to thereby identify the SNP sequence linked to the pathogenicity (pathogenic or non-pathogenic) prediction and make a pathogenicity (pathogenic or non-pathogenic) prediction for the isolate. With the disclosed early detection method, planters are able to better manage the spread of the disease and subsequently reduce the loss of fresh fruit bunch yield and loss of plants before their full commercial value is realized.

[0044] In one aspect, there is provided a method for determining the pathogenicity of a Ganoderma species, wherein the method comprises the step of detecting the presence or absence of one or more sequences selected from a group consisting of SEQ ID NOs: 1 to 123 or a variant thereof, wherein the presence of one or more sequences selected from the group consisting of SEQ ID NOs: 1 to 123 or a variant thereof indicates that the Ganoderma species is pathogenic, and the absence of one or more sequences selected from the group consisting of SEQ ID NOs: 1 to 123 or a variant thereof indicates that the Ganoderma species is non-pathogenic.

[0045] In the context of detecting marker sequences, “absence” can refer to when the marker cannot be detected using a particular detection method or if the level of marker detected is below a pre-determined threshold value. For example, a marker may be considered to be absent if its detected level in a sample containing Ganoderma DNA is below a pre-determined threshold value, such as a concentration of less than 10⁻² ng.

[0046] In the context of detecting marker sequences, “presence” can refer to when the marker can be detected using a particular detection method or if the level of marker detected is above a pre-determined threshold value. For example, a marker may be considered to be present if its detectable level in a sample containing Ganoderma DNA is above the pre-determined threshold value, such as a concentration of more than 10⁻² ng. Symptoms of infection (for example, by a Ganoderma species), such as those that are listed in Table 1, may or may not be visible on the plant when one or more of the markers is detected and determined to be present.
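The threshold-based reading of “absence” and “presence” can be sketched as a single comparison. The default threshold of 10⁻² ng is the example value given above; treating a level exactly equal to the threshold as present is an assumption, since the text only addresses levels below and above it. The function name is hypothetical.

```python
# Hypothetical sketch: call a marker present or absent from its detected
# level (in ng) relative to a pre-determined threshold.

def call_marker(detected_ng, threshold_ng=1e-2):
    """Return "present" or "absent" for one marker measurement."""
    if detected_ng < 0:
        raise ValueError("detected amount cannot be negative")
    # Assumption: a level exactly at the threshold counts as present.
    return "present" if detected_ng >= threshold_ng else "absent"
```

For example, a detected level of 0.05 ng would be called present and a level of 0.001 ng would be called absent under the default threshold.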

[0047] Variants of the markers of the present disclosure may have at least 80% identity, at least 85% identity, at least 90% identity, at least 95% identity or at least 99% identity to any one of SEQ ID NOs: 1 to 123. The variants may have about 80%, about 81%, about 82%, about 83%, about 84%, about 85%, about 86%, about 87%, about 88%, about 89%, about 90%, about 91%, about 92%, about 93%, about 94%, about 95%, about 96%, about 97%, about 98% or about 99% identity to any one of SEQ ID NOs: 1 to 123. In one embodiment, the variant has at least 85%, at least 86%, at least 87%, at least 88%, at least 89%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% sequence identity to any one of SEQ ID NOs: 1 to 123.

[0048] In one embodiment, the one or more marker sequences are selected from the group consisting of SEQ ID NO: 19, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 33, SEQ ID NO: 37, SEQ ID NO: 38, SEQ ID NO: 46, SEQ ID NO: 49, SEQ ID NO: 50, SEQ ID NO: 52, SEQ ID NO: 53, SEQ ID NO: 55, SEQ ID NO: 56, SEQ ID NO: 58, SEQ ID NO: 63, SEQ ID NO: 64, SEQ ID NO: 67, SEQ ID NO: 69, SEQ ID NO: 74, SEQ ID NO: 81, SEQ ID NO: 83, SEQ ID NO: 85, SEQ ID NO: 86, SEQ ID NO: 90, SEQ ID NO: 92, SEQ ID NO: 93, SEQ ID NO: 95, SEQ ID NO: 97, SEQ ID NO: 98, SEQ ID NO: 100, SEQ ID NO: 102, SEQ ID NO: 104, SEQ ID NO: 105, SEQ ID NO: 106, SEQ ID NO: 107, SEQ ID NO: 108, SEQ ID NO: 109, SEQ ID NO: 110, SEQ ID NO: 111, SEQ ID NO: 113, SEQ ID NO: 117, and SEQ ID NO: 120, or a variant thereof. In one embodiment, the sequences show a confidence interval of at least 70%.

[0049] In another embodiment, the one or more sequences are selected from the group consisting of SEQ ID NO: 25, SEQ ID NO: 28, SEQ ID NO: 38, SEQ ID NO: 46, SEQ ID NO: 56, SEQ ID NO: 64, SEQ ID NO: 67, SEQ ID NO: 81, SEQ ID NO: 83, SEQ ID NO: 93, SEQ ID NO: 95, SEQ ID NO: 98, SEQ ID NO: 107, and SEQ ID NO: 111, or a variant thereof. Advantageously, these sequences have a percentage of mappable reads of at least 60% and are detectable in a soil sample when present in an amount as low as 10^-2 ng using the methods of the present disclosure.

[0050] The one or more sequences, or a variant thereof, as described above may comprise a single nucleotide polymorphism (SNP). For example, a sequence selected from the group consisting of SEQ ID NO: 1 to SEQ ID NO: 123, or a variant thereof, comprises a SNP. The nucleotide sequences flanking and containing SNP markers that can be used to identify and differentiate between pathogenic or non-pathogenic Ganoderma species with varying confidence levels using MiSeq are shown in Tables 4 and 7 of the present disclosure (in Example 3).

[0051] Exemplary methods which may be used for detecting SNPs include, but are not limited to, sequencing, hybridization-based methods (such as, but not limited to, dynamic allele-specific hybridization, molecular beacons, and SNP microarrays), enzyme-based methods (such as, but not limited to, restriction fragment length polymorphism, Flap endonuclease, primer extension, 5’-nuclease, oligonucleotide ligation assay, and other PCR-based methods), and other post-amplification methods which are based on physical properties of the SNP-containing amplified DNA (such as, but not limited to, single strand conformation polymorphism, temperature gradient gel electrophoresis, denaturing high performance liquid chromatography, high-resolution melting of the entire amplicon, use of DNA mismatch binding proteins, SNPlex and surveyor nuclease assay).

[0052] The detection using the methods of the disclosure may be carried out on any suitable sample. For example, the detection may be conducted on a sample that potentially contains or is suspected of containing a pathogenic Ganoderma strain. Alternatively, the detection may also be conducted on a sample where there is no suspicion of pathogenic Ganoderma strain infection or presence, for example in a routine check for infection.

[0053] Samples on which the methods of the disclosure may be applied include, but are not limited to, environmental samples such as soil samples, and plant parts selected from root, shoot, stem, bark, leaves, fruits, seed, and laboratory samples such as an isolated microbial strain. The “isolated microbial strain” may be one that has been isolated from environmental samples such as soil samples, and plant parts selected from root, shoot, stem, bark, leaves, fruits and seed, and may be present, for example, as a cell culture.

[0054] The detection in the methods of the disclosure may be achieved using techniques available in the art, such as but not limited to sequencing (such as MiSeq sequencing and HiSeq sequencing), hybridization-based methods, enzyme-based methods, PCR-based methods, and post-amplification methods. In one embodiment, the PCR in a PCR-based method is conducted using at least one set of primers selected from the group consisting of SEQ ID NOs: 124 to 369.

[0055] In one embodiment, the method of the disclosure comprises the steps of: (a) extracting DNA from a sample; (b) subjecting the DNA to sequencing to determine sequences of the extracted DNA; (c) comparing the DNA sequences determined from step (b) to a reference DNA sequence selected from the group consisting of SEQ ID NOs: 1 to 123 or a variant thereof; and (d) determining the absence or presence of one or more sequences selected from the group consisting of SEQ ID NOs: 1 to 123 or a variant thereof, wherein the presence of one or more sequences selected from the group consisting of SEQ ID NOs: 1 to 123 or a variant thereof indicates that the Ganoderma species is pathogenic, and the absence of one or more sequences selected from the group consisting of SEQ ID NOs: 1 to 123 or a variant thereof indicates that the Ganoderma species is non-pathogenic. Exemplary methods which may be used for extracting DNA from Ganoderma, include, but are not limited to, the rapid preparation protocol based on Reader and Broda (1985), CTAB extraction method and Mericon extraction method. Other protocols suitable for extracting DNA from a fungal sample may also be used. Exemplary methods which may be used for extracting DNA from a soil sample include, but are not limited to, the use of the MOBIO Powerlyzer PowerSoil DNA Isolation Kit. Other protocols suitable for extracting DNA from soil may also be used.
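The decision rule in step (d) above can be sketched as a set intersection between the markers detected in a sample and a chosen marker panel. This is an illustrative sketch, not the applicant's implementation: the function name `call_pathogenicity` is hypothetical, markers are represented only by their SEQ ID numbers, the default panel is the 14-marker group listed in paragraph [0049], and treating a sample with no panel marker detected as non-pathogenic is an assumed reading of the "absence" clause.

```python
from typing import FrozenSet, Iterable

# SEQ ID numbers of the 14-marker group listed in paragraph [0049].
SOIL_PANEL: FrozenSet[int] = frozenset(
    {25, 28, 38, 46, 56, 64, 67, 81, 83, 93, 95, 98, 107, 111}
)


def call_pathogenicity(detected_seq_ids: Iterable[int],
                       panel: FrozenSet[int] = SOIL_PANEL) -> str:
    """Call a sample 'pathogenic' if at least one panel marker is detected.

    A sample in which no panel marker is detected is called
    'non-pathogenic' (assumed interpretation of the absence clause).
    """
    hits = panel.intersection(detected_seq_ids)
    return "pathogenic" if hits else "non-pathogenic"


print(call_pathogenicity([25, 111]))  # two panel markers detected
print(call_pathogenicity([]))         # no marker detected
```

In practice the "detected" list would come from the sequence comparison in step (c), for example after applying a presence threshold to each marker.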

[0056] The methods and uses described herein may be used for detection of pathogenic Ganoderma infection in a wide variety of plants such as, but not limited to, plants or trees susceptible to Ganoderma infection. Exemplary plants which may be susceptible to Ganoderma infection, include, but are not limited to crops (such as oil palm, Mucuna bracteata, coconut, betel nut, sugar palm, sago palm, rubber tree, Acacia mangium ), forest trees, and ornamental plants (such as ornamental palm Chrysalidocarpus lutescens ).

[0057] A pathogenic Ganoderma strain that may be detected or identified using the methods of the disclosure includes, but is not limited to, a strain from the Ganoderma boninense species. Additional exemplary Ganoderma species which may also be detected include, but are not limited to, G. steyaertanum, G. weberianum, G. lucidum, G. orbiforme, G. pseudoferreum, G. fornicatum, G. williamsianum and G. applanatum. Exemplary pathogenic species which may be detected include, but are not limited to, G. boninense, G. steyaertanum and G. orbiforme. Exemplary non-pathogenic species which may be detected include, but are not limited to, G. weberianum, G. lucidum, G. fornicatum, G. williamsianum, G. pseudoferreum and G. applanatum.

[0058] In another aspect, there is provided a method for detecting presence of a pathogenic strain of Ganoderma in a sample, comprising the steps of: (a) extracting DNA from the sample; (b) subjecting the DNA to sequencing to determine sequences of the extracted DNA; (c) comparing the DNA sequences determined from step (b) to a reference DNA sequence selected from a group consisting of SEQ ID NOs: 1 to 123 or a variant thereof; and (d) determining the absence or presence of one or more sequences selected from a group consisting of SEQ ID NOs: 1 to 123 or a variant thereof, wherein the presence of one or more sequences selected from the group consisting of SEQ ID NOs: 1 to 123 or a variant thereof indicates that the Ganoderma strain is a pathogenic strain, and the absence of one or more sequences selected from the group consisting of SEQ ID NOs: 1 to 123 or a variant thereof indicates that the Ganoderma strain is a non-pathogenic strain. Exemplary methods which may be used for extracting DNA from Ganoderma, include, but are not limited to the rapid preparation protocol based on Reader and Broda (1985), CTAB extraction method and Mericon extraction method. Other protocols suitable for extracting DNA from a fungal sample may also be used. Exemplary methods which may be used for extracting DNA from a soil sample, include, but are not limited to the use of the MOBIO Powerlyzer PowerSoil DNA Isolation Kit. Other protocols suitable for extracting DNA from soil may also be used.

[0059] In another aspect, there is provided a method for identifying a nucleic acid sequence containing a marker for Ganoderma pathogenesis comprising the steps of: (a) extracting DNA from a group of pathogenic and non-pathogenic Ganoderma strains; (b) preparing whole-genome sequencing libraries of each of the pathogenic and non-pathogenic Ganoderma strain using the extracted DNA of step (a); (c) sequencing the libraries from step (b) to form a plurality of sequencing reads; (d) assembling the sequencing reads from step (c) to generate a backbone sequence for the pathogenic Ganoderma strain and a backbone sequence for the non-pathogenic Ganoderma strain; (e) mapping the sequencing reads from step (c) from each strain to the backbone sequence of step (d) to perform comparative genomics to identify variant sequence(s); (f) designing and preparing primers from the variant sequence(s) from step (e); (g) applying the primers prepared from step (f) to DNA extracted from the pathogenic and non-pathogenic Ganoderma strains for DNA amplification and sequencing; and (h) classifying the variant sequence(s) into those associated with the pathogenic Ganoderma strains and those associated with the non-pathogenic Ganoderma strains. Comparative genomics is a technique which enables comparison of a genomic feature (such as a DNA sequence) from different organisms in order to identify similar or different features. In the methods of the present disclosure, comparative genomics may be used to identify differences in DNA sequence between pathogenic and non-pathogenic species of Ganoderma which may subsequently be used as molecular markers to determine pathogenicity of any given strain of Ganoderma.

[0060] The extraction of DNA from a sample may be conducted using methods known in the art, such as, but not limited to the rapid preparation protocol based on Reader and Broda (1985), CTAB extraction method and Mericon extraction method; or using commercial kits such as, but not limited to DNeasy Qiagen Plant Mini Kit and MOBIO Powerlyzer PowerSoil DNA Isolation Kit. Sequencing may be performed using methods known in the art, such as, but not limited to next-generation sequencing (such as MiSeq and HiSeq), chain-termination methods and de novo sequencing.

[0061] In another aspect, there is provided an oligonucleotide primer comprising a nucleic acid sequence selected from the group consisting of SEQ ID NOs: 124 to 369.

[0062] In another aspect, there is provided a kit for use in the methods defined herein, wherein the kit comprises: (a) one or more primer(s) selected from the group consisting of SEQ ID NOs: 124 to 369 for determining the presence or absence of one or more sequences selected from the group consisting of SEQ ID NOs: 1 to 123; (b) one or more container(s) suitable for sample collection; (c) one or more container(s) suitable for sample processing; (d) one or more reagents for sample processing; (e) one or more buffer(s); (f) one or more PCR reagent(s) and (g) instructions for use in accordance with the methods described herein. The kit may further comprise other components such as, but not limited to, filtration columns for sample purification. PCR reagents may include, but are not limited to, nucleotides, template DNA (to be used as negative or positive controls in a reaction), PCR buffers, and enzymes (such as DNA polymerase).

[0063] Establishing a platform and method for the detection of Ganoderma in an environmental sample such as soil is challenging due to the complexity of soil DNA as compared to plant samples or pure pathogen. The SNP markers of the present disclosure allow for the early prediction of the presence of a pathogenic Ganoderma strain from any plant or environmental sample and will be useful for planning how to address the infection. The markers of the present disclosure can be used in different formats, kits and protocols as determined by a person skilled in the art, for determining the presence of a pathogenic Ganoderma strain.

[0064] Unlike other known methods that detect Ganoderma in diseased palms, the development of SNP markers in the present disclosure allows the differentiation of pathogenic Ganoderma strains from non-pathogenic Ganoderma strains. Known methods for detecting Ganoderma involve detection of ergosterol (found in cell membrane of fungi), detection of internal transcribed spacer (ITS) region, protein profiling (WO 2014/109629 A1) and fragment length analysis (WO 2013/066144 A1). In contrast, the present disclosure utilizes molecular markers (particularly SNP markers) that can differentiate between, and thereby detect pathogenic and non-pathogenic Ganoderma strains. SNP markers are more sensitive and specific compared to biochemical markers such as ergosterol and protein profiling. A SNP marker which is based on a particular sequence is also more specific than fragment length analysis which is based on detection of the length of a DNA fragment. Advantageously, a SNP marker which is present in a specific location in the genome is unique and thus more specific than an ITS marker which may be present as multiple copies in a genome due to its high copy number.

[0065] Other known methods also use gel electrophoresis systems (which are of a smaller scale), whereas the present disclosure utilizes high-throughput sequencing with the ability to screen 384 samples within 40 hours. In addition, known methods are restricted to pure isolates, while the present disclosure provides methods that can be applied directly to environmental samples, such as soil samples. The methods of the present disclosure circumvent the additional steps needed for growing and isolating microbes from samples prior to identification of Ganoderma, which results in a shorter detection time.

BRIEF DESCRIPTION OF THE DRAWINGS

[0066] The disclosure will be better understood with reference to the detailed description when considered in conjunction with the non-limiting examples and the accompanying drawings, in which:

[0067] Fig. 1. The figure shows visual symptoms of Ganoderma infection as described in Table 1 for categorizing the disease into one of the 5 disease classes.

[0068] Fig. 2. The flow chart describes the process of identification of the markers for determining the presence of pathogenic or non-pathogenic Ganoderma directly from soil sample as described in Example 3.

TABLES

[0069] Table 1: Disease severity index for Ganoderma disease in oil palm.

[0070] Table 2: Test list of pathogenic and non-pathogenic Ganoderma strains.

[0071] Table 3: List of pathogenic and non-pathogenic Ganoderma strains used for amplicon screening.

[0072] Table 4: Percentage confidence level for each SNP marker in differentiating pathogenic and non-pathogenic Ganoderma.

[0073] Table 5: List of markers showing the associated mappable reads (criterion 1) and their detectability at 10^-2 ng in soil samples (criterion 2). The markers which were ultimately selected were those which have mappable reads of at least 60% and are able to consistently detect Ganoderma DNA in soil present in an amount as low as 10^-2 ng.

[0074] Table 6: Shortlist of markers showing the associated mappable reads (criterion 1) and their detectability at 10^-2 ng in soil samples (criterion 2).

[0075] Table 7: Flanking sequences containing SNPs associated with pathogenicity of Ganoderma.

[0076] Table 8: Primer sequences for amplicon sequencing.

EXAMPLES

Example 1 : Materials and Methods

Method for identifying a nucleic acid sequence containing a marker for Ganoderma pathogenesis

[0077] In order to mine for candidate markers for distinguishing between pathogenic and non-pathogenic Ganoderma, comparative genomics was performed between both groups. The genome size of Ganoderma species is estimated to be approximately 50 Mbp. For the marker discovery, a total of 16 pathogenic Ganoderma isolates (i.e. strains) and six non-pathogenic Ganoderma isolates were sequenced. Whole genome sequencing of at least 100x coverage for each isolate was performed on Illumina HiSeq2500. After the whole genome sequencing, all the reads generated were pooled together to be assembled as a combined assembly using CLC Bio Assembly Cell version 4.06beta.67189. The assembled sequence serves as the backbone for the comparative genomics used for the SNP mining. Upon establishing the backbone sequence, all the reads from each isolate were mapped to the backbone using Burrows-Wheeler Alignment (BWA) version 0.5.9 and variants were called using SAMtools (version 0.1.16) with base quality cut off of >Q25 and base coverage of >4 reads. Thereafter, polymorphic SNPs between pathogenic and non-pathogenic Ganoderma were identified. A total of 123 SNPs were shortlisted to be used for amplicon screening using the 86 isolates from Table 3.

Whole-genome Sequencing (WGS)

[0078] WGS Illumina libraries were prepared using DNA isolated from 16 pathogenic and six non-pathogenic Ganoderma isolates (listed in Table 2). DNA extraction was performed according to the rapid preparation protocol based on Reader and Broda (1985). Genomic DNA (2 µg) was sheared using the Covaris S220 to generate an insert size of approximately 550 bp. The sheared DNA was end repaired, size selected, A-tailed and ligated with Illumina adaptors using the TruSeq DNA PCR-free Library Prep Kit (from Illumina). The prepared libraries were verified using the Agilent Technologies 2100 Bioanalyzer for qualitative purposes on the library size distribution. A SYBR green quantitative PCR (qPCR) assay with primers specific to the Illumina adapters (KAPA Library Quantification Kits) was used to determine the concentration of the libraries. The final library was mixed with Illumina-generated PhiX control libraries and denatured using sodium hydroxide. The detailed protocol with index sequences is provided by the Illumina protocol and the supplemental material (which can be found on the Illumina website at https://support.illumina.com/content/dam/illumina-support/documents/documentation/chemistry_documentation/samplepreps_truseq/truseqdnapcrfree/truseq-dna-pcr-free-library-prep-guide-15036187-d.pdf, https://support.illumina.com/content/dam/illumina-support/documents/documentation/system_documentation/hiseqkits/hiseq-ga-denaturing-diluting-libraries-reference-guide-15050107-03.pdf and https://support.illumina.com/content/dam/illumina-support/documents/documentation/system_documentation/hiseq2500/hiseq-2500-system-guide-15035786-01.pdf). The denatured libraries were loaded into cBot for cluster generation on a HiSeq flowcell (TruSeq Cluster Kit v3) and were sequenced on the Illumina HiSeq 2500 using TruSeq SBS Kit v3, paired-end 2x100 cycles.

Ganoderma Genotyping-By-Sequencing (GBS) Analysis

[0079] Raw reads generated from Next-Generation Sequencing (NGS) of 16 pathogenic and six non-pathogenic Ganoderma isolates (listed in Table 2) using the HiSeq2000 2 x 100 bp paired-end approach were filtered at a base quality cut off of >Q25 and trimmed at sequence length >72 bp using trim_fastq.pl. Filtered and trimmed reads were mapped to the G. boninense backbone sequence using Burrows-Wheeler Alignment (BWA) version 0.5.9. Single nucleotide polymorphism (SNP) variant calling was performed using SAMtools version 0.1.16 with base quality cut off of >Q25 and base coverage of >4 reads.
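The quality filtering and trimming step can be illustrated with the minimal sketch below. It does not reproduce trim_fastq.pl, whose exact behaviour is not described in the text. It assumes Phred+33 quality encoding, trims low-quality bases from the 3' end only, and keeps a read when at least 72 bases remain; the function names and these interpretations of the ">Q25" and ">72 bp" cut-offs are assumptions.

```python
from typing import List, Optional, Tuple


def phred_scores(quality: str, offset: int = 33) -> List[int]:
    """Convert a FASTQ quality string to Phred scores (Phred+33 assumed)."""
    return [ord(ch) - offset for ch in quality]


def trim_and_filter(seq: str, qual: str, min_q: int = 25,
                    min_len: int = 72) -> Optional[Tuple[str, str]]:
    """Trim 3' bases with quality below min_q, then apply a length filter.

    Returns the trimmed (sequence, quality) pair, or None when fewer
    than min_len bases remain after trimming.
    """
    if len(seq) != len(qual):
        raise ValueError("sequence and quality must have the same length")
    scores = phred_scores(qual)
    end = len(seq)
    while end > 0 and scores[end - 1] < min_q:
        end -= 1
    if end < min_len:
        return None
    return seq[:end], qual[:end]


# Hypothetical 80-base read: 75 high-quality bases, then 5 low-quality bases.
read = trim_and_filter("A" * 80, "I" * 75 + "#" * 5)
print(len(read[0]) if read else None)
```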

Shortlisting of the Candidate markers from GBS results

[0080] SNP markers were selected based on variation that shows homozygous / heterozygous bases differentiating at least 50% of each of the pathogenic and non-pathogenic Ganoderma isolates. The SNP markers differentiating pathogenic and non-pathogenic Ganoderma isolates were shortlisted for MiSeq sequencing analysis.

Sample preparation for sequencing

[0081] A total of 71 libraries were constructed using the pool of 123 amplicons of the regions of interest from 71 Ganoderma samples listed in Table 3. The DNA used for constructing the libraries was isolated from Ganoderma using the rapid preparation protocol based on Reader and Broda (1985). Amplicons were generated using a high-fidelity polymerase (KAPA HiFi HotStart ReadyMix) and then were purified using a magnetic bead capture kit (Agencourt AMPure XP) and quantified using a fluorometric kit (QuantIT PicoGreen; Invitrogen) and Agilent Technologies 2100 Bioanalyzer (Agilent High Sensitivity DNA Kit). The purified amplicons were pooled according to the sample and indexed using Nextera XT index kit. The libraries were then purified using magnetic bead capture kit and verified using Agilent Technologies 2100 Bioanalyzer (Agilent High Sensitivity DNA Kit) on the library size distribution. Samples from all libraries were pooled using equal molar quantities of DNA into one final library and the final concentration of the pooled library was determined using qPCR assay as described above in the WGS protocol using Illumina HiSeq 2500. The final pooled library was mixed with Illumina-generated PhiX control libraries and denatured using sodium hydroxide. The detailed protocol with index sequences is provided by Illumina protocol for 16S metagenomics sequencing library preparation (https://support.illumina.com/content/dam/illumina-support/documents/documentation/chemistry_documentation/16s/16s-metagenomic-library-prep-guide-15044223-b.pdf and https://support.illumina.com/content/dam/illumina-support/documents/documentation/system_documentation/miseq/miseq-reagent-kit-v3-reagent-prep-guide-15044983-b.pdf) and the supplemental material. The sequencing runs were performed on MiSeq using MiSeq Reagent V3 kit (paired-end 2x300 cycles) with Illumina Real Time Analysis (RTA) and HiSeq Control Software (HCS).

[0082] Subsequent screenings of amplicon sequencing using MiSeq system to further shortlist the markers were carried out using similar methods as described above using the appropriate number of markers and samples.

MiSeq Sequencing Analysis

[0083] Sequence base calling of the MiSeq flow cell was performed using MiSeq Reporter with zero base mismatch allowed for de-multiplexing based on sample index sequences. For the MiSeq run with Ganoderma DNA, de-multiplexed reads were mapped to the targeted region of the SNP location without a secondary de-multiplexing step. For the MiSeq run with soil DNA, reads de-multiplexed by sample were de-multiplexed a second time based on amplicon primer sequences with minimal three base mismatches using the fastx_barcode_splitter.pl script in fastx toolkit version 0.0.13.2. Raw sequences were filtered with base quality cut off of >Q25 and trimmed at sequence length >72 bp using trim_fastq.pl. Filtered and trimmed sequences were mapped to the backbone sequence prepared as described above using Burrows-Wheeler Alignment (BWA) version 0.5.9 with default parameter settings (<4% mismatch of the read length). Single nucleotide polymorphism (SNP) variant calling was performed using SAMtools version 0.1.16 with base quality cut off of >Q25 and base coverage of >4 reads.
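The secondary de-multiplexing of soil reads by amplicon primer can be pictured with the sketch below. It is not the fastx_barcode_splitter.pl script; it is a plain Python illustration that compares the start of each read with each primer and assigns the read to the best-matching amplicon. Reading the three-mismatch setting as an upper limit on mismatches, the function names, and the primer sequences shown (which are not taken from SEQ ID NOs: 124 to 369) are all assumptions.

```python
from typing import Dict, Optional


def count_mismatches(read_prefix: str, primer: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    return sum(1 for a, b in zip(read_prefix, primer) if a != b)


def assign_amplicon(read: str, primers: Dict[str, str],
                    max_mismatches: int = 3) -> Optional[str]:
    """Assign a read to the amplicon whose primer best matches its 5' end.

    Returns None when no primer matches within max_mismatches.
    On a tie, the primer listed first wins.
    """
    best_name: Optional[str] = None
    best_mm = max_mismatches + 1
    for name, primer in primers.items():
        if len(read) < len(primer):
            continue  # read too short to contain this primer
        mm = count_mismatches(read[:len(primer)], primer)
        if mm < best_mm:
            best_name, best_mm = name, mm
    return best_name


# Hypothetical primers for two amplicons.
primers = {"amp1": "ACGTACGTAC", "amp2": "TTTTGGGGCC"}
print(assign_amplicon("ACGTACGTTCGGG", primers))  # one mismatch to amp1
print(assign_amplicon("GGGGGGGGGGGG", primers))   # no primer within 3 mismatches
```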

Example 2: Determination of Ganoderma Pathogenicity Level

[0084] To measure and observe the effect that isolated Ganoderma strains have on oil palm, the oil palm seeds were germinated and then infected with the Ganoderma strains through the use of rubber wood blocks that were inoculated with the Ganoderma strains. The rubber wood blocks were prepared using the following method:

[0085] Step 1: Rubber wood blocks measuring 2.5” x 2.5” x 5” were placed into an autoclavable plastic bag with media suitable for Ganoderma growth, sealed and then autoclave sterilized.

[0086] Step 2: The prepared rubber wood blocks containing media were then inoculated with Ganoderma by introducing media with cultivated Ganoderma on it.

[0087] Step 3: Using this method, each of the strains of Ganoderma were localized onto rubber wood blocks which were then used to challenge the oil palm seedlings as part of a pathogenicity trial so that the pathogenicity (pathogenic or non-pathogenic) of the Ganoderma strains can be determined.

[0088] A total of 22 strains of Ganoderma (listed in Table 2) were tested using this method.

[0089] The experimental design of the pathogenicity trial consisted of a randomized complete block design with four replicates and 12 plots per replicate for each of the Ganoderma strains. After the germinated oil palm seeds were introduced to the Ganoderma impregnated rubber wood blocks, bi-weekly observations were carried out to observe the development of the disease symptoms until 36 weeks post-infection. A disease severity index was created as shown in Table 1 and the Ganoderma strains were assessed to be either pathogenic or non-pathogenic based on visual inspection of the symptoms shown in Figure 1. In cases where the oil palm seedlings did not exhibit symptoms of Ganoderma disease, the Ganoderma strain used was categorized as non-pathogenic.

[0090] Table 1: Disease severity index for Ganoderma disease in oil palm.

Example 3: Pathogenicity Testing

Marker Identification

[0091] A total of 22 Ganoderma isolates (listed in Table 2) which are either non-pathogenic or pathogenic were selected for genome sequencing. On average, each genome has a genome coverage of 100X to provide a good assembly. The genomes were individually mapped to the Ganoderma backbone sequence.

[0092] Table 2: Test list of pathogenic and non-pathogenic Ganoderma strains.

[0093] Comparative genomics of each of the 16 pathogenic and 6 non-pathogenic genomes (from the 22 strains in Table 2) against the backbone sequence found that the number of SNPs varies from isolate to isolate. The number of SNPs discovered in each genome ranges from 8,841 to 958,490. In silico comparison of SNPs that are able to differentiate the pathogenic and non-pathogenic groups was done and 123 unique SNPs were identified (SEQ ID NOs: 1 to 123 as set out in Table 7). These SNPs have no flanking SNPs within 300 bp. The criterion for selection is that each SNP has to be able to predict the pathogenicity or non-pathogenicity of an isolate for at least 50% of a test list of pathogenic and non-pathogenic isolates. Each SNP is compared against the list and only those that satisfy the criterion of correctly predicting pathogenicity or non-pathogenicity at least 50% of the time are selected.
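One possible reading of the 50% selection criterion is sketched below. The disclosure does not give the exact computation, so the sketch assumes that a candidate SNP has one allele associated with pathogenicity, that it must be carried by at least half of the pathogenic isolates, and that it must be absent from at least half of the non-pathogenic isolates. The function name `passes_criterion` and the allele calls used in the example are hypothetical.

```python
from typing import Sequence


def passes_criterion(path_calls: Sequence[str],
                     nonpath_calls: Sequence[str],
                     path_allele: str,
                     min_fraction: float = 0.5) -> bool:
    """Check a candidate SNP against the assumed 50% selection criterion.

    path_calls / nonpath_calls hold one base call per isolate at the SNP
    position. The SNP passes when the pathogenic-associated allele is
    seen in at least min_fraction of pathogenic isolates and is not seen
    in at least min_fraction of non-pathogenic isolates.
    """
    if not path_calls or not nonpath_calls:
        raise ValueError("both groups need at least one isolate")
    frac_path = sum(c == path_allele for c in path_calls) / len(path_calls)
    frac_nonpath = sum(c != path_allele for c in nonpath_calls) / len(nonpath_calls)
    return frac_path >= min_fraction and frac_nonpath >= min_fraction


# Hypothetical calls for 16 pathogenic and 6 non-pathogenic isolates.
pathogenic = ["A"] * 10 + ["G"] * 6
non_pathogenic = ["G"] * 4 + ["A"] * 2
print(passes_criterion(pathogenic, non_pathogenic, "A"))
```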

[0094] Primers designed for the 123 SNP markers (as set out in Table 8) were screened on 71 Ganoderma isolates using MiSeq, which amplified 800 bp amplicons. Mapping of the sequences by amplicon found that the 123 markers have different percentage confidence levels for differentiating pathogenic and non-pathogenic Ganoderma. The fragments of the sequences generated from the sequencing were mapped to the expected SNP position on the backbone sequence to identify the SNP call on the sequenced fragment (of each sample).

[0095] Table 3: List of pathogenic and non-pathogenic Ganoderma strains used for amplicon screening.

[0096] To determine the confidence level of a SNP marker, if for example 50 out of 56 of the pathogenic isolates tested have the expected SNP call associated with pathogenicity, then the confidence level is 89%. The percentage confidence level for each marker based on this calculation method is as tabulated in Table 4.
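The calculation in the example above is a simple proportion, sketched below. The function name `confidence_level` is hypothetical; the figures 50 and 56 are those given in the text.

```python
def confidence_level(expected_calls: int, isolates_tested: int) -> float:
    """Percentage of tested isolates that show the expected SNP call."""
    if isolates_tested <= 0:
        raise ValueError("isolates_tested must be positive")
    if not 0 <= expected_calls <= isolates_tested:
        raise ValueError("expected_calls must be between 0 and isolates_tested")
    return 100.0 * expected_calls / isolates_tested


# Worked example from the text: 50 of 56 pathogenic isolates.
print(round(confidence_level(50, 56)))  # 89
```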

[0097] There is one SNP marker that showed 100% confidence level in differentiating pathogenic and non-pathogenic Ganoderma, 11 markers at 90% confidence level, 17 SNP markers at 80% confidence level, followed by 15 markers at 70% confidence level, 21 SNP markers at 60% confidence level, 11 SNP markers at 50% confidence level and 47 SNP markers at less than 50% confidence level. The assay was repeated and the results were reproducible.
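As a quick arithmetic check on the counts above, the sketch below sums the markers per confidence level. It assumes that each stated level is the lower bound of a bin (for example, "90%" covering 90% up to but not including 100%); the variable names are illustrative.

```python
# Number of SNP markers reported at each confidence level in the text.
# Keys are the stated levels in percent; 0 stands for "less than 50%".
markers_per_level = {100: 1, 90: 11, 80: 17, 70: 15, 60: 21, 50: 11, 0: 47}

total_markers = sum(markers_per_level.values())
at_least_70 = sum(n for level, n in markers_per_level.items() if level >= 70)

print(total_markers)  # 123
print(at_least_70)    # 44
```

The totals agree with the 123 markers screened and with the 44 markers at a confidence level of at least 70% that were taken forward for testing on soil samples.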

[0098] Table 4: Percentage confidence level for each SNP marker in differentiating pathogenic and non-pathogenic Ganoderma. Each SNP marker was screened on 71 pure Ganoderma samples. SNP call [P] refers to the SNP observed in pathogenic Ganoderma.

[0099] Based on the screening results of 123 markers on 71 pure Ganoderma samples, 44 markers with at least 70% confidence level (as listed in Table 5) were selected to be tested on environmental samples such as soil to assess the sensitivity of the SNP markers in detecting pathogenic Ganoderma in the soil using MiSeq.

[00100] Six soil samples were collected from oil palm estate (three of which are from Ganoderma infected areas and another three from non-infected areas).

[00101] In order to test the sensitivity limit of the 44 markers, five sets of the same soil samples were spiked with different concentrations of DNA isolated from a pathogenic strain of Ganoderma (Isolate ID: SG12-G1-A): 1 ng, 10^-2 ng, 10^-4 ng, 10^-6 ng, and 10^-8 ng. DNA was extracted from Ganoderma using the rapid preparation protocol based on Reader and Broda (1985). DNA was extracted from these six sets of soil samples (one set of original soil and five sets of soil spiked with Ganoderma DNA) for the marker screening using MiSeq, using the MOBIO Powerlyzer PowerSoil DNA Isolation Kit according to the manufacturer’s recommendation. Similar protocols and parameters were adopted for the DNA sequencing and bioinformatics analysis. The results showed that detection of Ganoderma in the soil could be achieved even at a concentration as low as 10^-2 ng. These 44 markers were further shortlisted to 14 markers which can effectively detect Ganoderma in soil samples.

[00102] The criterion for SNP marker selection was that there must be more than 60% of the total reads mappable to the targeted regions in both spiked and non-spiked samples, and also be able to detect Ganoderma DNA at concentration as low as 10^-2 ng.
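The two selection criteria can be expressed as a simple filter, sketched below. The record layout, the field names and the marker values are hypothetical and are not taken from Table 5. The strict "more than 60%" comparison follows this paragraph, whereas the Table 5 caption says "at least 60%".

```python
from dataclasses import dataclass
from typing import List


@dataclass
class MarkerResult:
    name: str                  # marker label (hypothetical)
    mappable_spiked: float     # % of reads mappable to target, spiked soil
    mappable_unspiked: float   # % of reads mappable to target, non-spiked soil
    detected_at_1e_2_ng: bool  # detected when spiked with 10^-2 ng DNA


def shortlist(results: List[MarkerResult],
              min_mappable: float = 60.0) -> List[str]:
    """Return the names of markers that satisfy both criteria."""
    return [
        r.name
        for r in results
        if r.mappable_spiked > min_mappable
        and r.mappable_unspiked > min_mappable
        and r.detected_at_1e_2_ng
    ]


# Hypothetical screening results for three markers.
results = [
    MarkerResult("marker_A", 75.0, 68.0, True),
    MarkerResult("marker_B", 55.0, 80.0, True),   # fails criterion 1
    MarkerResult("marker_C", 90.0, 85.0, False),  # fails criterion 2
]
print(shortlist(results))  # ['marker_A']
```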

[00103] Table 5 summarizes the results of screening 44 markers on soil samples. A total of 14 SNP markers in the table were shortlisted based on the two criteria described above.

[00104] Table 5: List of markers showing the associated mappable reads (criterion 1) and their detectability at 10^-2 ng in soil samples (criterion 2). The markers which were ultimately selected were those which have mappable reads of at least 60% and are able to consistently detect Ganoderma DNA in soil present in an amount as low as 10^-2 ng.

[00105] Table 6: Shortlist of markers showing the associated mappable reads (criterion 1) and their detectability at 10^-2 ng in soil samples (criterion 2).

[00106] Table 7: Flanking sequences containing SNPs associated with pathogenicity of Ganoderma.

[00107] Table 8: Primer sequences for the amplicon sequencing.

[00108] References:

IDRIS, A S; MIOR, M H A Z; MAIZATUL, S M and KUSHAIRI, A (2011). Survey on status of Ganoderma disease of oil palm. Proc. of the PIPOC 2011 International Palm Oil Congress - Agriculture Conference. MPOB, Bangi. p. 235-238

READER, U. and BRODA, P (1985) Rapid preparation of DNA from filamentous fungi. Letters in Applied Microbiology, 1:17-20

Claims

1. A method for determining the pathogenicity of Ganoderma species, wherein the method comprises the step of detecting the presence or absence of one or more sequences selected from a group consisting of SEQ ID NOs: 1 to 123 or a variant thereof, wherein the presence of one or more sequences selected from the group consisting of SEQ ID NOs: 1 to 123 or a variant thereof indicates that the Ganoderma species is pathogenic, and the absence of one or more sequences selected from the group consisting of SEQ ID NOs: 1 to 123 or a variant thereof indicates that the Ganoderma species is not pathogenic.

2. The method of claim 1, wherein the one or more sequences are selected from the group consisting of SEQ ID NO: 19, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 33, SEQ ID NO: 37, SEQ ID NO: 38, SEQ ID NO: 46, SEQ ID NO: 49, SEQ ID NO: 50, SEQ ID NO: 52, SEQ ID NO: 53, SEQ ID NO: 55, SEQ ID NO: 56, SEQ ID NO: 58, SEQ ID NO: 63, SEQ ID NO: 64, SEQ ID NO: 67, SEQ ID NO: 69, SEQ ID NO: 74, SEQ ID NO: 81, SEQ ID NO: 83, SEQ ID NO: 85, SEQ ID NO: 86, SEQ ID NO: 90, SEQ ID NO: 92, SEQ ID NO: 93, SEQ ID NO: 95, SEQ ID NO: 97, SEQ ID NO: 98, SEQ ID NO: 100, SEQ ID NO: 102, SEQ ID NO: 104, SEQ ID NO: 105, SEQ ID NO: 106, SEQ ID NO: 107, SEQ ID NO: 108, SEQ ID NO: 109, SEQ ID NO: 110, SEQ ID NO: 111, SEQ ID NO: 113, SEQ ID NO: 117, and SEQ ID NO: 120, or a variant thereof.

3. The method of claim 1 or 2, wherein the one or more sequences are selected from the group consisting of SEQ ID NO: 25, SEQ ID NO: 28, SEQ ID NO: 38, SEQ ID NO: 46, SEQ ID NO: 56, SEQ ID NO: 64, SEQ ID NO: 67, SEQ ID NO: 81, SEQ ID NO: 83, SEQ ID NO: 93, SEQ ID NO: 95, SEQ ID NO: 98, SEQ ID NO: 107, and SEQ ID NO: 111, or a variant thereof.

4. The method of any one of the preceding claims, wherein the one or more sequences, or a variant thereof, comprises a single nucleotide polymorphism (SNP).

5. The method of claim 4, wherein the one or more sequences, or a variant thereof, comprises a SNP selected from the group consisting of SEQ ID NO: 1 to SEQ ID NO: 123.

6. The method of any one of the preceding claims, wherein the variant has at least 85%, at least 86%, at least 87%, at least 88%, at least 89%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% sequence identity to one of SEQ ID NOs 1 to 123.
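
By way of illustration only, the percent-identity thresholds recited above can be checked with a short Python sketch. It assumes the two sequences have already been aligned (gaps written as `-`) and takes identity as matching columns divided by alignment length; alignment tools differ on the denominator, so this is one possible convention rather than a definition taken from the present disclosure. The function names are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over a pre-computed pairwise alignment.

    Both inputs must be the same length, with '-' marking gaps.
    Assumption: the denominator is the full alignment length.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    matches = sum(
        1
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if x == y and x != "-"  # a gap never counts as a match
    )
    return 100.0 * matches / len(aligned_a)


def meets_identity_threshold(aligned_query: str,
                             aligned_reference: str,
                             min_identity_pct: float = 85.0) -> bool:
    """True if the query reaches the chosen identity threshold."""
    return percent_identity(aligned_query, aligned_reference) >= min_identity_pct


# Toy sequences for illustration; not taken from the sequence listing.
print(percent_identity("ACGTACGTAC", "ACGTACGTAA"))          # prints 90.0
print(meets_identity_threshold("ACGTACGTAC", "ACGT--GTAA"))  # prints False
```

Raising `min_identity_pct` from 85.0 toward 99.0 corresponds to the progressively narrower alternatives listed in the claim.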

7. The method of any one of the preceding claims, wherein the detecting is conducted on a sample potentially containing or suspected of containing said Ganoderma species.

8. The method of claim 7, wherein the sample is selected from the group consisting of soil, root, shoot, stem, bark, leaves, fruits, seed and an isolated microbial strain.

9. The method of any one of the preceding claims, wherein the detecting is conducted using a technique selected from the group consisting of sequencing, hybridization-based methods, enzyme-based methods, PCR-based methods, and post-amplification methods.

10. The method of claim 9, wherein the PCR is conducted using at least one set of primers selected from the group consisting of SEQ ID NOs: 124 to 369.

11. The method of any one of claims 7 to 10, comprising the steps of:

(a) extracting DNA from the sample;

(b) subjecting the DNA to sequencing to determine sequences of the extracted DNA;

(c) comparing the DNA sequences determined from step (b) to a reference DNA sequence selected from the group consisting of SEQ ID NOs: 1 to 123 or a variant thereof;

(d) determining the absence or presence of one or more sequences selected from the group consisting of SEQ ID NOs: 1 to 123 or a variant thereof.

12. The method of any one of the preceding claims, wherein the Ganoderma species is pathogenic in a plant selected from the group consisting of crops, wherein the crop is optionally selected from a group consisting of oil palm, Mucuna bracteata, coconut, betel nut, sugar palm, sago palm, rubber tree, and Acacia mangium forest trees; and ornamental plants, optionally selected from Chrysalidocarpus lutescens.

13. The method of any one of the preceding claims, wherein the Ganoderma is selected from the group consisting of a Ganoderma boninense, Ganoderma orbiforme, Ganoderma sp., Ganoderma fornicatum, Ganoderma lucidum, Ganoderma weberianum, Ganoderma williamsianum, Ganoderma steyaertanum and Ganoderma applanatum.

14. The method of claim 13, wherein the Ganoderma sp. is selected from the group consisting of Ganoderma sp. BL-6 18S (accession no. JN400509.1), Ganoderma sp. BL-9 18S (accession no. JN400510.1), Ganoderma sp. BL-94 18S (accession no. JN400511.1), Ganoderma sp. BP-16 18S (accession no. JN400513.1), Ganoderma sp. BRIUMSa (accession no. JN234427.1), Ganoderma sp. BRIUMSb (accession no. JN234428.1), Ganoderma sp. BRIUMSc (accession no. JN234429.1), Ganoderma sp. C 16452 (accession no. EU239386.1), Ganoderma sp. MT-l (accession no. AY220543.1), Ganoderma sp. G31 (accession no. KR093030.1), and Ganoderma sp. MEL 2382607 (accession no. KP012934.1).

15. The method of claim 13, wherein the Ganoderma is a Ganoderma boninense, Ganoderma steyaertanum or Ganoderma orbiforme species.

16. A method for detecting presence of pathogenic Ganoderma in a sample, comprising the steps of:

(a) extracting DNA from the sample;

(b) subjecting the DNA to sequencing to determine the sequences of the extracted DNA;

(c) comparing the DNA sequences determined from step (b) to a reference DNA sequence selected from a group consisting of SEQ ID NOs: 1 to 123 or a variant thereof; and

(d) determining the absence or presence of one or more sequences selected from a group consisting of SEQ ID NOs: 1 to 123 or a variant thereof, wherein the presence of one or more sequences selected from the group consisting of SEQ ID NOs: 1 to 123 or a variant thereof indicates that the Ganoderma species is pathogenic, and the absence of one or more sequences selected from the group consisting of SEQ ID NOs: 1 to 123 or a variant thereof indicates that the Ganoderma species is not pathogenic.
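
By way of illustration only, the comparison and determination of steps (c) and (d) can be sketched in Python. The sketch assumes sequencing reads are already available as strings (steps (a) and (b) happen in the laboratory) and uses exact substring matching on either strand as a stand-in for read mapping; a practical pipeline would use an aligner and tolerate variants. The function names and the toy sequences are hypothetical and are not taken from the sequence listing.

```python
from typing import Dict, Iterable

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def detect_markers(reads: Iterable[str],
                   references: Dict[str, str]) -> Dict[str, bool]:
    """Step (c): report, per reference marker, whether any read contains it.

    Assumption: exact matching on either strand stands in for alignment.
    """
    read_list = [r.upper() for r in reads]
    present = {}
    for marker_id, ref in references.items():
        fwd = ref.upper()
        rev = reverse_complement(fwd)
        present[marker_id] = any(fwd in r or rev in r for r in read_list)
    return present


def indicates_pathogenic(present: Dict[str, bool]) -> bool:
    """Step (d): presence of one or more markers indicates pathogenicity."""
    return any(present.values())


# Toy placeholders, not real SEQ ID NOs or real reads.
references = {"marker_A": "ACGTGG", "marker_B": "CATTAG", "marker_C": "TTTTCCCC"}
reads = ["TTACGTGGAA", "GGCTAATGCC"]

calls = detect_markers(reads, references)
print(calls)                        # marker_A and marker_B found, marker_C not
print(indicates_pathogenic(calls))  # prints True
```

Here `marker_B` is found only through its reverse complement, which is why both strands are checked.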

17. The method of claim 16, wherein the sample is selected from the group consisting of root, shoot, stem, bark, leaves, fruits, seed and an isolated microbial strain.

18. A method for identifying a nucleic acid sequence containing a marker for Ganoderma pathogenesis comprising the steps of:

(a) extracting DNA from a group of pathogenic and non-pathogenic Ganoderma strains;

(b) preparing whole-genome sequencing libraries of each of the pathogenic and non-pathogenic Ganoderma strains using the extracted DNA of step (a);

(c) sequencing the libraries from step (b) to form a plurality of sequencing reads;

(d) assembling the sequencing reads from step (c) to generate a backbone sequence for the pathogenic Ganoderma strain and a backbone sequence for the non-pathogenic Ganoderma strain;

(e) mapping the sequencing reads from step (c) from each strain to the backbone sequence of step (d) to perform comparative genomics to identify variant sequence(s);

(f) designing and preparing primers from the variant sequence(s) from step (e);

(g) applying the primers prepared from step (f) to DNA extracted from the pathogenic and non-pathogenic Ganoderma strains for DNA amplification and sequencing; and

(h) classifying the variant sequence(s) into those associated with the pathogenic Ganoderma strains and those associated with the non-pathogenic Ganoderma strains.
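
By way of illustration only, the classification of step (h) can be sketched in Python. The claim does not fix a decision rule, so the sketch assumes a strict one: a variant is associated with a group only if it is observed in at least one strain of that group and in no strain of the other. The function name, the input layout, and the strain and variant labels are hypothetical.

```python
from typing import Dict


def classify_variants(observed: Dict[str, Dict[str, bool]],
                      strain_is_pathogenic: Dict[str, bool]) -> Dict[str, str]:
    """Assign each variant to 'pathogenic', 'non-pathogenic' or 'unassociated'.

    observed[variant_id][strain] is True when amplicon sequencing (step (g))
    found the variant in that strain.
    Assumption: association requires exclusive presence in one group.
    """
    classes = {}
    for variant_id, by_strain in observed.items():
        in_pathogenic = any(
            seen for strain, seen in by_strain.items()
            if strain_is_pathogenic[strain]
        )
        in_non_pathogenic = any(
            seen for strain, seen in by_strain.items()
            if not strain_is_pathogenic[strain]
        )
        if in_pathogenic and not in_non_pathogenic:
            classes[variant_id] = "pathogenic"
        elif in_non_pathogenic and not in_pathogenic:
            classes[variant_id] = "non-pathogenic"
        else:
            classes[variant_id] = "unassociated"
    return classes


# Invented strains and calls for illustration only.
strain_is_pathogenic = {"P1": True, "P2": True, "N1": False, "N2": False}
observed = {
    "var_1": {"P1": True, "P2": True, "N1": False, "N2": False},
    "var_2": {"P1": False, "P2": False, "N1": True, "N2": False},
    "var_3": {"P1": True, "P2": False, "N1": True, "N2": False},
}
print(classify_variants(observed, strain_is_pathogenic))
```

With more strains, the exclusive-presence rule could be replaced by a frequency threshold or a statistical test without changing the input layout.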

19. An oligonucleotide primer comprising a nucleic acid sequence selected from the group consisting of SEQ ID NOs: 124 to 369.

20. A kit for use in the methods of any one of claims 1 to 18, wherein the kit comprises:

(a) one or more primer(s) selected from the group consisting of SEQ ID NOs: 124 to 369 for determining the presence or absence of one or more sequences selected from the group consisting of SEQ ID NOs: 1 to 123;

(b) one or more container(s) suitable for sample collection;

(c) one or more container(s) suitable for sample processing;

(d) one or more reagents for sample processing;

(e) one or more buffer(s);

(f) one or more PCR reagent(s); and

(g) instructions for use in accordance with the methods of any one of claims 1 to 18.