CN113913499B - Method for detecting target mutation by using Cas12j effector protein

Info

Publication number: CN113913499B
Application number: CN202011567811.1A
Authority: CN
Inventors: 梁亚峰; 段志强
Original assignee: Shandong Shunfeng Biotechnology Co Ltd
Current assignee: Shandong Shunfeng Biotechnology Co Ltd
Priority date: 2020-12-25
Filing date: 2020-12-25
Publication date: 2024-07-16
Anticipated expiration: 2040-12-25
Also published as: CN113913499A

Abstract

The invention provides a method for detecting a mutation of interest using a Cas12j effector protein. The method detects the presence or absence of the mutation of interest in a target nucleic acid and comprises contacting a sample with a type V CRISPR/Cas effector protein, a gRNA (guide RNA) comprising a region that binds to the CRISPR/Cas effector protein and a guide sequence that hybridizes to the mutant target nucleic acid, and a single-stranded nucleic acid detector; and detecting the detectable signal generated when the CRISPR/Cas effector protein cleaves the single-stranded nucleic acid detector. The mutation of interest is located at positions 1-20, preferably positions 9-10, of the gRNA targeting sequence, counted from the 5' end.

Description

Method for detecting target mutation by using Cas12j effector protein

Technical Field

The invention relates to the field of nucleic acid detection, in particular to a method for detecting a target mutation using CRISPR technology, and more particularly to a method for detecting a target mutation using a Cas12j effector protein.

Background

Methods for the specific detection of nucleic acid molecules (nucleic acid detection) have important applications, such as pathogen detection and genetic disease detection. In pathogen detection, because each pathogenic microorganism has a unique characteristic nucleic acid sequence, species-specific nucleic acid detection, also called nucleic acid diagnostics (NADs), can be developed; it is of great significance in food safety, detection of environmental microbial contamination, detection of human pathogen infection and similar fields. Another application is the detection of single nucleotide polymorphisms (SNPs) in humans or other species. Understanding the relationship between genetic variation and biological function at the genomic level provides a new perspective for modern molecular biology, and SNPs are closely related to biological function, evolution, disease and the like, so the development of SNP detection and analysis techniques is particularly important.

Established methods for detecting specific nucleic acid molecules usually require two steps: amplification of the nucleic acid of interest, followed by its detection. Existing detection technologies include the restriction endonuclease method, Southern blotting, Northern blotting, dot blot hybridization, fluorescent PCR, loop-mediated isothermal amplification (LAMP), recombinase polymerase amplification (RPA) and other methods. After 2012, CRISPR gene editing technology emerged. Building on RPA, Zhang Feng's group developed an RNA-targeting nucleic acid diagnostic technology with Cas13 at its core (SHERLOCK); Doudna's group developed a diagnostic technology with the Cas12 enzyme at its core (DETECTR); and Dr. Wang and colleagues at the Shanghai Institute of Plant Physiology and Ecology, Chinese Academy of Sciences, developed a Cas12-based nucleic acid detection technology (HOLMES). Nucleic acid detection techniques based on CRISPR technology are playing an increasingly important role.

The invention applies CRISPR nucleic acid detection technology to detect whether a target nucleic acid carries a mutation in a target region and whether it carries a specific target mutation, and in particular provides an efficient detection method.

Disclosure of Invention

The invention provides a method, a system and a kit for detecting whether a target mutation site exists in target nucleic acid and detecting whether mutation exists in a target region of the target nucleic acid by using Cas12j effector protein.

Method for detecting a mutation of interest

In one aspect, the invention provides a method for detecting the presence or absence of a mutation site of interest in a target nucleic acid using a Cas12j effector protein, the method comprising contacting the target nucleic acid with a type V CRISPR/Cas effector protein, a gRNA (guide RNA) comprising a region that binds to the CRISPR/Cas effector protein and a guide sequence that hybridizes to a mutant target nucleic acid containing the mutation of interest, the guide sequence comprising a base that pairs with the mutation site of interest, and a single-stranded nucleic acid detector; and detecting the detectable signal generated when the CRISPR/Cas effector protein cleaves the single-stranded nucleic acid detector.

In one embodiment, the mutation of interest is a site, within the region targeted by the gRNA targeting sequence, at which the wild-type target nucleic acid and the mutant target nucleic acid differ. Because the guide sequence of the gRNA hybridizes to the mutant target nucleic acid and includes the base that pairs with the target mutation site, the target mutation site is also the site at which the guide sequence of the gRNA does not match the wild-type target nucleic acid sequence that it targets.

In one embodiment, the guide sequence of the gRNA comprises at least 13 bases, e.g., 13-30 bases, e.g., 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, or 29 bases.

In one embodiment, the mutation of interest comprises a single base mutation, two, three, four, five, six, seven, eight, nine, ten, eleven, twelve, thirteen, fourteen, fifteen base mutations, or more base mutations; the target mutation may be a continuous base mutation or a discontinuous base mutation; preferably, the mutation of interest comprises a single base mutation or a two base mutation, preferably, the two base mutation is a continuous two base mutation.

In one embodiment, the base paired with the mutation site of interest is placed at one or more of positions 1-20 of the gRNA targeting sequence, counted from the 5' end, specifically at one or more of positions 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19 and 20; preferably at one or more of positions 7-16, and more preferably at position 9 and/or position 10.

In one embodiment, the target mutation is a single base mutation, and the base paired with the target mutation site is placed at one of positions 1-20 of the gRNA targeting sequence, counted from the 5' end, specifically at position 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19 or 20; preferably at one of positions 7-16, and more preferably at position 9 or 10.
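The placement rule above can be illustrated with a minimal Python sketch. It assumes, for illustration only, that the input is the mutant sequence written 5' to 3' in the same sense as the guide (that is, the strand complementary to the one the guide hybridizes to) and that any PAM requirement is ignored. The function name `design_guide` and its parameters are hypothetical and are not taken from this disclosure.

```python
def design_guide(mutant_sense_strand: str, mutation_index: int,
                 position_in_guide: int = 9, guide_length: int = 20) -> str:
    """Return an RNA guide sequence in which the mutated base sits at
    `position_in_guide` (1-based, counted from the 5' end of the guide).

    mutant_sense_strand: mutant DNA, 5'->3', same sense as the guide (assumption).
    mutation_index: 0-based index of the mutated base in that strand.
    """
    if not 13 <= guide_length <= 30:
        raise ValueError("guide length should be 13-30 bases")
    if not 1 <= position_in_guide <= min(20, guide_length):
        raise ValueError("mutation must fall within positions 1-20 of the guide")
    start = mutation_index - (position_in_guide - 1)
    end = start + guide_length
    if start < 0 or end > len(mutant_sense_strand):
        raise ValueError("not enough flanking sequence around the mutation")
    window = mutant_sense_strand[start:end].upper()
    return window.replace("T", "U")  # write the guide as RNA
```

Calling `design_guide(seq, idx, position_in_guide=10)` would shift the window by one base so that the mutated base lands at position 10 instead of position 9.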

In the above method, the detectable signal obtained with a mutant target nucleic acid differs significantly in intensity from that obtained with a wild-type target nucleic acid; specifically, the signal for the mutant target nucleic acid is significantly stronger than that for the wild-type target nucleic acid, so the presence or absence of the target mutation site in the target nucleic acid can be determined from the strength of the detectable signal.

In one embodiment, the detectable signal of the wild-type target nucleic acid and the detectable signal of the mutant target nucleic acid may be different detectable signals or the same detectable signal.

Preferably, the method further comprises the step of detecting a wild-type target nucleic acid for comparison, and the method further comprises the step of providing a standard wild-type target nucleic acid.

Preferably, the method further comprises the step of detecting the mutant target nucleic acid for comparison, and the method further comprises the step of providing a standard mutant target nucleic acid.

In one embodiment, the invention also provides the use of a V-type CRISPR/CAS effector protein, a gRNA (guide RNA) and a single stranded nucleic acid detector as described above for the preparation of a reagent, composition or kit for detecting the presence or absence of a mutation site of interest in a target nucleic acid.

In one embodiment, the mutation site of interest includes a substitution, insertion or deletion; preferably, the mutation is a point mutation.

In one embodiment, the target nucleic acid is amplified in order to confirm that it contains a fragment spanning from 500 bp upstream (5' end) to 500 bp downstream (3' end) of the target mutation site, preferably from 300 bp upstream to 300 bp downstream or from 200 bp upstream to 200 bp downstream, and more preferably from 100 bp upstream to 100 bp downstream. Amplification is carried out by a conventional method such as PCR, NASBA, RPA, SDA, LAMP, HDA, NEAR, MDA, RCA, LCR or RAM, preferably PCR, and the presence of the fragment upstream and downstream of the target mutation site is confirmed by conventional means such as electrophoresis or qPCR. This confirms that an amplification product has been obtained, but it cannot confirm the specific sequence, and in particular cannot confirm whether the target mutation site is present.
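As a small illustration of the flanking windows named above, the following Python sketch computes the coordinates of the region to be covered by the amplicon. The function name and the 0-based, half-open coordinate convention are assumptions made for this example, and clipping at the sequence ends is a simplification.

```python
# Flank sizes named in the text, from broadest to most preferred (in bp).
FLANKS_BP = (500, 300, 200, 100)

def flanking_window(site: int, flank: int, seq_length: int) -> tuple[int, int]:
    """Return (start, end) covering `flank` bp on each side of `site`.

    Coordinates are 0-based and half-open; the window is clipped to the
    sequence boundaries when the site lies close to either end.
    """
    if not 0 <= site < seq_length:
        raise ValueError("site lies outside the sequence")
    start = max(0, site - flank)
    end = min(seq_length, site + flank + 1)
    return start, end
```

For a mutation site at index 1000 in a 5000 bp sequence, `flanking_window(1000, 100, 5000)` gives the window from 900 to 1101.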

Method for detecting presence or absence of mutation in target region

In one aspect, the invention provides a method for detecting the presence or absence of a mutation in a target nucleic acid using a Cas12j effector protein, the method comprising contacting a sample with a type V CRISPR/Cas effector protein, a gRNA (guide RNA) comprising a region that binds to the CRISPR/Cas effector protein and a guide sequence that hybridizes to a wild-type target nucleic acid, and a single-stranded nucleic acid detector; and detecting the detectable signal generated when the CRISPR/Cas effector protein cleaves the single-stranded nucleic acid detector.

In one embodiment, the target region is the part of the target nucleic acid targeted by positions 1-20 of the gRNA targeting sequence, counted from the 5' end, and the presence of a mutation means a mutation of a single base or of two, three, four, five, six, seven, eight, nine, ten, eleven, twelve, thirteen, fourteen, fifteen, sixteen, seventeen, eighteen, nineteen or twenty bases within the target region; the mutation may be a continuous base mutation or a discontinuous base mutation.

In one embodiment, the target region is the part of the target nucleic acid targeted by positions 7-16 of the gRNA targeting sequence, counted from the 5' end; the presence of a mutation means a mutation of a single base or of two, three, four, five, six, seven, eight, nine or ten bases within the target region, and the mutation may be a continuous base mutation or a discontinuous base mutation.

In one embodiment, the target region is the part of the target nucleic acid targeted by positions 9-10 of the gRNA targeting sequence, counted from the 5' end; the presence of a mutation means a single base mutation or a two base mutation within the target region, that is, a mutation at position 9 and/or position 10 counted from the 5' end of the gRNA guide sequence.
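The three target regions can be made concrete with a short Python sketch that lists where a sample sequence differs from a wild-type guide and reports which of the regions each difference falls in. It assumes the sample is given 5' to 3' in the same sense as the guide and with the same length; the names `mismatch_positions`, `regions_hit` and `REGIONS` are hypothetical.

```python
# Target regions named in the text, as 1-based inclusive position ranges
# counted from the 5' end of the guide sequence.
REGIONS = {"1-20": (1, 20), "7-16": (7, 16), "9-10": (9, 10)}

def mismatch_positions(guide_rna: str, sample_sense_dna: str) -> list[int]:
    """Return 1-based guide positions at which the sample differs from the guide."""
    guide_dna = guide_rna.upper().replace("U", "T")
    sample = sample_sense_dna.upper()
    if len(guide_dna) != len(sample):
        raise ValueError("guide and sample window must have the same length")
    return [i + 1 for i, (g, s) in enumerate(zip(guide_dna, sample)) if g != s]

def regions_hit(positions: list[int]) -> dict[str, list[int]]:
    """Group mismatch positions by the target region(s) that contain them."""
    return {name: [p for p in positions if lo <= p <= hi]
            for name, (lo, hi) in REGIONS.items()}
```

A single difference at position 9 is reported in all three regions, whereas a difference at position 3 is reported only in the 1-20 region.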

In the above methods, the wild-type target nucleic acid and the mutant target nucleic acid produce significantly different detectable signals; specifically, the detectable signal for a mutant target nucleic acid is significantly weaker than that for a wild-type target nucleic acid, so the presence or absence of a mutation in the target region of the target nucleic acid can be determined from the strength of the detectable signal.
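The two readout directions described in this disclosure (a stronger signal for the mutant when the guide matches the mutant, a weaker signal for the mutant when the guide matches the wild type) can be sketched as a simple comparison against a wild-type control. The function name, the `guide_matches` argument and the twofold threshold are assumptions made for illustration; the text does not specify a numerical cutoff.

```python
def call_mutation(sample_signal: float, wild_type_signal: float,
                  guide_matches: str = "mutant",
                  fold_threshold: float = 2.0) -> bool:
    """Return True if the sample is called as carrying a mutation.

    guide_matches="mutant":    the guide hybridizes to the mutant sequence, so a
                               mutant sample gives a stronger signal than wild type.
    guide_matches="wild_type": the guide hybridizes to the wild-type sequence, so a
                               mutant sample gives a weaker signal than wild type.
    fold_threshold is a hypothetical cutoff, not a value from the source.
    """
    if sample_signal < 0 or wild_type_signal <= 0:
        raise ValueError("signals must be non-negative and the control positive")
    ratio = sample_signal / wild_type_signal
    if guide_matches == "mutant":
        return ratio >= fold_threshold
    if guide_matches == "wild_type":
        return ratio <= 1.0 / fold_threshold
    raise ValueError("guide_matches must be 'mutant' or 'wild_type'")
```

In practice the cutoff would have to be established empirically from standard wild-type and standard mutant target nucleic acids run alongside the sample.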

In other embodiments, the invention also provides the use of the type V CRISPR/Cas effector protein, gRNA (guide RNA) and single-stranded nucleic acid detector described above in the preparation of a reagent, composition or kit for detecting the presence or absence of a mutation in a target region of a target nucleic acid; preferably, the target region is the part of the target nucleic acid targeted by positions 1-20 of the gRNA guide sequence, counted from the 5' end, more preferably by positions 7-16, and still more preferably by positions 9-10.

In one embodiment, the target nucleic acid is amplified in order to confirm that it contains a fragment spanning from 500 bp upstream (5' end) to 500 bp downstream (3' end) of the target region, preferably from 300 bp upstream to 300 bp downstream, and more preferably from 200 bp upstream to 200 bp downstream. Amplification is carried out by PCR, NASBA, RPA, SDA, LAMP, HDA, NEAR, MDA, RCA, LCR, RAM or another common amplification method, preferably PCR, and the presence of the fragment is confirmed by conventional means such as electrophoresis or qPCR, which can confirm that an amplification product is present but cannot confirm its specific sequence, and in particular cannot confirm whether the target mutation site is present.

Reagents, kits and compositions

In one aspect, the invention provides a reagent, kit or composition for detecting the presence or absence of a mutation site of interest in a target nucleic acid using a Cas12j effector protein, the reagent, kit or composition comprising a type V CRISPR/Cas effector protein as described above, a gRNA (guide RNA) comprising a region that binds to the CRISPR/Cas effector protein and a guide sequence that hybridizes to a mutant target nucleic acid containing the mutation of interest, the guide sequence comprising a base that pairs with the mutation site of interest, and a single stranded nucleic acid detector.

In one aspect, the invention provides a reagent, kit or composition for detecting the presence or absence of a mutation in a target region of a target nucleic acid using a Cas12j effector protein, wherein the reagent, kit or composition comprises the type V CRISPR/Cas effector protein, a gRNA (guide RNA) and a single-stranded nucleic acid detector; the gRNA comprises a region that binds to the CRISPR/Cas effector protein and a guide sequence that hybridizes to a wild-type target nucleic acid; and the target region is the part of the target nucleic acid targeted by positions 10-13 of the gRNA guide sequence, counted from the 5' end, preferably by positions 7-16, and more preferably by positions 9-10.

Single-stranded nucleic acid detector

In some embodiments, the single stranded nucleic acid detector does not hybridize to the gRNA.

In one embodiment, the single-stranded nucleic acid detector comprises different reporter groups or marker molecules at both ends, which when in an initial state (i.e., in an uncleaved state) do not exhibit a reporter signal, and when the single-stranded nucleic acid detector is cleaved exhibit a detectable signal, i.e., exhibit a detectable distinction after cleavage from before cleavage.

In some embodiments, the 5' end and the 3' end of the single-stranded nucleic acid detector each carry a different reporter group, and a detectable reporter signal appears when the single-stranded nucleic acid detector is cleaved; for example, a fluorescent group and a quenching group are provided at the two ends of the single-stranded nucleic acid detector, respectively; or a first molecule (e.g., FAM or FITC) is attached to the 5' end and a second molecule (e.g., biotin) is attached to the 3' end.

When a fluorescent group and a quenching group are provided at the two ends of the single-stranded nucleic acid detector, respectively, a detectable fluorescent signal appears when the single-stranded nucleic acid detector is cleaved. The fluorescent group is selected from one or more of FAM, FITC, VIC, JOE, TET, CY3, CY5, ROX, Texas Red and LC RED 460. The quenching group is selected from one or more of BHQ1, BHQ2, BHQ3, Dabcyl and TAMRA.

When a first molecule (such as FAM or FITC) and a second molecule (such as biotin) are provided at the two ends of the single-stranded nucleic acid detector, respectively, the reaction system containing the single-stranded nucleic acid detector is used together with a lateral flow strip to detect the characteristic sequence (preferably by colloidal gold detection). The strip is designed with two capture lines: an antibody that binds the first molecule (the first-molecule antibody) is provided at the sample contact end (colloidal gold); an antibody that binds the first-molecule antibody is provided at the first line (control line); and a binder of the second molecule (the second-molecule antibody, such as avidin) is provided at the second line (test line). As the reaction flows along the strip, the first-molecule antibody binds the first molecule and carries the cleaved or uncleaved oligonucleotide toward the capture lines; cleaved reporter binds the antibody against the first-molecule antibody at the first capture line, whereas uncleaved reporter binds the second-molecule antibody at the second capture line. Binding of the reporter at each line produces a strong readout signal (e.g., color). As more reporter is cleaved, more signal accumulates at the first capture line and less signal appears at the second line.
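The behaviour of the two capture lines can be summarized in a deliberately simplified Python sketch. It assumes that line intensity is proportional to the amount of reporter captured, which the text does not state, and it ignores flow kinetics and line saturation; the function name is hypothetical.

```python
def strip_readout(fraction_cleaved: float) -> dict[str, float]:
    """Relative signal at the two capture lines for a given cleaved fraction.

    Follows the description in the text: cleaved reporter accumulates at the
    first capture line and uncleaved reporter at the second capture line.
    The linear relationship is an illustrative assumption.
    """
    if not 0.0 <= fraction_cleaved <= 1.0:
        raise ValueError("fraction_cleaved must lie between 0 and 1")
    return {
        "first_line": fraction_cleaved,         # grows as more reporter is cut
        "second_line": 1.0 - fraction_cleaved,  # fades as more reporter is cut
    }
```

With no cleavage all signal sits at the second line, and the balance shifts toward the first line as trans-cleavage proceeds.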

In one embodiment, the single stranded nucleic acid detector comprises one or more of: 1) a base modified nucleotide, 2) a glycosyl modified nucleotide, 3) an altered chemical bond, 4) a modified backbone.

In one embodiment, the nucleotide is one or more of ribonucleotide, deoxyribonucleotide, nucleic acid analog; the base of the ribonucleotide is one or more of adenine A, uracil U, cytosine C, guanine G, thymine T and hypoxanthine I; the base of the deoxyribonucleotide is selected from one or any several of A, T, C, G, U, I.

In one embodiment, the base modification is a chemical modification of an adenine, cytosine, guanine, uracil, or thymine component of a nucleotide. Other similar base modifications will be readily apparent to those skilled in the art, and thus, such other methods should also fall within the scope of the present invention.

In one embodiment, the base-modified nucleotides further comprise abasic spacers (single-stranded nucleic acid detectors comprising locked nucleic acids are also described in Chinese application CN 2020108880363); the abasic spacer is selected from one or more of dSpacer, Spacer C3, Spacer C6, Spacer C12, Spacer 9, Spacer 12, Spacer 18, Inverted Abasic Site (dSpacer abasic furan) and rAbasic Site (rSpacer abasic furan).

In one embodiment, the glycosyl-modified nucleotides include 2'-fluoro modifications, 2'-O-methyl modifications, locked nucleic acids (single-stranded nucleic acid detectors comprising locked nucleic acids are also described in Chinese application CN 2020105609327), bridged nucleic acids, morpholino nucleic acids, glycol nucleic acids, hexitol nucleic acids, threose nucleic acids, arabinose nucleic acids, 2'-methoxyacetyl modifications, 2'-amino modifications, 4'-thio RNAs, peptide nucleic acids (PNAs), cyclohexenyl nucleic acids (CeNAs), and combinations thereof; the base of the glycosyl-modified nucleotide is selected from one or more of A, U, C, G, T and I.

In one embodiment, the altered chemical bonds include modified nucleic acid backbones and unnatural internucleoside linkages, and nucleic acids having modified backbones include those that retain phosphorus atoms in the backbone and those that do not have phosphorus atoms in the backbone.

In one embodiment, the single stranded nucleic acid detector may be linear or circular.

In one embodiment, the detection method can be used for quantitative detection of the feature sequence to be detected. The quantitative detection index can be quantified according to the signal intensity of the reporter group, such as the luminous intensity of the fluorescent group, the width of the color-developing strip, and the like.

CRISPR/CAS effector protein

Further, the V-type CRISPR/CAS effector protein is selected from the group consisting of:

(1) A protein shown in SEQ ID No. 1;

(2) A derivative protein that is formed by substituting, deleting or adding one or more (such as 2, 3, 4, 5, 6, 7, 8, 9 or 10) amino acid residues of the amino acid sequence shown in SEQ ID No. 1 or an active fragment thereof and that has substantially the same function;

(3) A protein having 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% identity to the sequence shown in SEQ ID No. 1.

In one embodiment, the Cas protein mutant comprises an amino acid substitution, deletion, or addition, and the mutant retains at least its trans-cleavage activity. Preferably, the mutant has both cis- and trans-cleavage activity.

Target nucleic acid

In the present invention, the target nucleic acid includes ribonucleotides or deoxyribonucleotides, including single-stranded nucleic acids, double-stranded nucleic acids, such as single-stranded DNA, double-stranded DNA, single-stranded RNA, double-stranded RNA, DNA-RNA hybrids, or nucleic acid modifications.

In one embodiment, the target nucleic acid is derived from a sample of a virus, bacterium, microorganism, soil, water source, human, animal, plant, or the like.

In one embodiment, the target nucleic acid is a product enriched or amplified by PCR, NASBA, RPA, SDA, LAMP, HDA, NEAR, MDA, RCA, LCR, RAM or the like.

In one embodiment, the method further comprises the step of obtaining the target nucleic acid from the sample.

In one embodiment, the target nucleic acid is a viral nucleic acid, a bacterial nucleic acid, a specific nucleic acid associated with a disease, such as a specific mutation site or SNP site, or a nucleic acid that differs from a control; preferably, the virus is a plant virus or an animal virus, for example, a papilloma virus, hepadnavirus, herpes virus, adenovirus, poxvirus, parvovirus or coronavirus; preferably, the virus is a coronavirus, preferably SARS, SARS-CoV2 (COVID-19), HCoV-229E, HCoV-OC43, HCoV-NL63, HCoV-HKU1 or MERS-CoV.

In some embodiments, the target nucleic acid is derived from a cell, e.g., from a cell lysate.

In one embodiment, the target nucleic acid is amplified in order to confirm that it contains a fragment spanning from 500 bp upstream (5' end) to 500 bp downstream (3' end) of the target region, preferably from 300 bp upstream to 300 bp downstream, and more preferably from 200 bp upstream to 200 bp downstream. Amplification is carried out by PCR, NASBA, RPA, SDA, LAMP, HDA, NEAR, MDA, RCA, LCR, RAM or another common amplification method, preferably PCR, and the presence of the fragment is confirmed by conventional means such as electrophoresis or qPCR, which can confirm that an amplification product is present but cannot confirm its specific sequence, and in particular cannot confirm whether the target mutation site is present.

Detectable signal

In some embodiments, the methods of the invention further comprise the step of measuring the detectable signal produced by the CRISPR/Cas effector protein (Cas protein). After the type V CRISPR/Cas effector protein contacts the gRNA and the target nucleic acid, its trans-cleavage activity is activated, so that it cleaves the single-stranded nucleic acid detector efficiently and a detectable signal appears.

In the present invention, the detectable signal may be any signal that is generated when a single-stranded nucleic acid detector is cleaved, for example a signal based on gold nanoparticle detection, fluorescence polarization, fluorescence, colloidal phase change/dispersion, electrochemical detection, or semiconductor-based sensing.

The detectable signal may be read out by any suitable means including, but not limited to: measurement of detectable fluorescent signals, gel electrophoresis detection (by detecting a change in the band on the gel), detection based on the presence or absence of a visual or sensor color, or differences in color (e.g., based on gold nanoparticles), and differences in electrical signals.

In some embodiments, the measurement of the detectable signal may be quantitative, and in other embodiments, the measurement of the detectable signal may be qualitative.

Proportions

In one embodiment, the molar ratio of Cas protein to gRNA is (0.8-1.2):1.

In one embodiment, the Cas protein is used at a final concentration of 20-200 nM, preferably 30-100 nM, more preferably 40-80 nM, and still more preferably 50 nM.

In one embodiment, the gRNA is used at a final concentration of 20-200 nM, preferably 30-100 nM, more preferably 40-80 nM, and still more preferably 50 nM.

In one embodiment, the target nucleic acid is used at a final concentration of 5-100 nM, preferably 10-50 nM.

In one embodiment, the single-stranded nucleic acid detector is used at a final concentration of 100-1000 nM, preferably 150-800 nM, more preferably 200-500 nM, and still more preferably 200-300 nM.
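The concentration ranges above lend themselves to a small planning helper. The Python sketch below converts a desired final concentration into a stock volume using the dilution relation C1 x V1 = C2 x V2 and checks a proposed mix against the broadest ranges stated in the text. The function names, the stock concentrations and the reaction volume used in any call are hypothetical.

```python
def stock_volume_ul(final_nm: float, stock_nm: float, total_volume_ul: float) -> float:
    """Volume of stock (in microlitres) needed to reach `final_nm` in the reaction."""
    if stock_nm <= 0 or total_volume_ul <= 0:
        raise ValueError("stock concentration and total volume must be positive")
    if final_nm > stock_nm:
        raise ValueError("final concentration cannot exceed the stock concentration")
    return final_nm * total_volume_ul / stock_nm

def check_mix(cas_nm: float, grna_nm: float,
              target_nm: float, detector_nm: float) -> list[str]:
    """Return a list of deviations from the broadest ranges given in the text."""
    problems = []
    if not 20 <= cas_nm <= 200:
        problems.append("Cas protein outside 20-200 nM")
    if not 20 <= grna_nm <= 200:
        problems.append("gRNA outside 20-200 nM")
    if grna_nm > 0 and not 0.8 <= cas_nm / grna_nm <= 1.2:
        problems.append("Cas:gRNA molar ratio outside 0.8-1.2")
    if not 5 <= target_nm <= 100:
        problems.append("target nucleic acid outside 5-100 nM")
    if not 100 <= detector_nm <= 1000:
        problems.append("detector outside 100-1000 nM")
    return problems
```

For example, reaching 50 nM Cas protein in a 20 microlitre reaction from an assumed 1000 nM stock requires `stock_volume_ul(50, 1000, 20)` microlitres of stock, and `check_mix(50, 50, 10, 200)` reports no deviations.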

Applications

In another aspect, the invention also provides the use of Cas12j in the preparation of a composition, reagent or kit for detecting whether a target mutation is present in a target nucleic acid.

In another aspect, the invention also provides the use of Cas12j in the preparation of a composition, reagent or kit for detecting whether a mutation is present in a target region of a target nucleic acid.

In another aspect, the invention also provides the use of Cas12j in detecting whether a target mutation is present in a target nucleic acid and in detecting whether a mutation is present in a target region of a target nucleic acid.

In another aspect, the invention also provides the use of the above composition, reagent or kit for detecting the presence or absence of a mutation of interest in a target nucleic acid and for detecting the presence or absence of a mutation in a region of interest in a target nucleic acid.

In another aspect, the invention also provides the use of a mutant base placed in the non-target region for improving the efficiency of detecting whether a mutation is present in the target region of a target nucleic acid.

General definitions:

Unless defined otherwise, technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art.

The term "hybridization" or "complementary" or "substantially complementary" means that a nucleic acid (e.g., RNA, DNA) comprises a nucleotide sequence that enables it to bind non-covalently to another nucleic acid, i.e., to form base pairs and/or G/U base pairs with it, to "anneal" or to "hybridize", in a sequence-specific, antiparallel manner (i.e., the nucleic acid specifically binds to the complementary nucleic acid). Hybridization requires that the two nucleic acids contain complementary sequences, although there may be mismatches between bases. Suitable conditions for hybridization between two nucleic acids depend on the length of the nucleic acids and the degree of complementarity, variables well known in the art. Typically, the hybridizable nucleic acid is 8 nucleotides or more in length (e.g., 10 nucleotides or more, 12 nucleotides or more, 15 nucleotides or more, 20 nucleotides or more, 22 nucleotides or more, 25 nucleotides or more, or 30 nucleotides or more).

It will be appreciated that the sequence of a polynucleotide need not be 100% complementary to the sequence of its target nucleic acid to specifically hybridize. Polynucleotides may comprise 60% or more, 65% or more, 70% or more, 75% or more, 80% or more, 85% or more, 90% or more, 95% or more, 98% or more, 99% or more, 99.5% or more, or 100% sequence complementarity to a target region in a target nucleic acid sequence to which it hybridizes.
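Percent complementarity as used above can be computed directly for two strands of equal length. The Python sketch below pairs a polynucleotide with its target in antiparallel orientation and optionally counts G/U pairs as matches, in line with the definition of hybridization given here. The function name and the equal-length, gap-free comparison are simplifying assumptions.

```python
WATSON_CRICK = {("A", "U"), ("U", "A"), ("A", "T"), ("T", "A"),
                ("G", "C"), ("C", "G")}
GU_PAIRS = {("G", "U"), ("U", "G")}

def percent_complementarity(probe_5to3: str, target_5to3: str,
                            allow_gu: bool = True) -> float:
    """Percentage of probe positions that pair with the antiparallel target."""
    probe = probe_5to3.upper()
    target_3to5 = target_5to3.upper()[::-1]  # read the target 3'->5' for pairing
    if not probe or len(probe) != len(target_3to5):
        raise ValueError("sequences must be non-empty and of equal length")
    allowed = WATSON_CRICK | GU_PAIRS if allow_gu else WATSON_CRICK
    paired = sum((p, t) in allowed for p, t in zip(probe, target_3to5))
    return 100.0 * paired / len(probe)
```

A value of 100 means every position pairs; lower values correspond to the partially complementary cases (for example 90% or 95%) listed in the preceding paragraph.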

The term "amino acid" refers to a carboxylic acid containing an amino group. Various proteins in living bodies are composed of 20 basic amino acids.

The terms "polynucleotide", "nucleotide sequence", "nucleic acid molecule" and "nucleic acid" are used interchangeably and include DNA, RNA or hybrids thereof, which may be double-stranded or single-stranded.

The term "homology" or "identity" refers to the degree of sequence match between two polypeptides or between two nucleic acids. When a position in both of the compared sequences is occupied by the same base or amino acid monomer subunit (e.g., a position in each of two DNA molecules is occupied by adenine, or a position in each of two polypeptides is occupied by lysine), the molecules are identical at that position. Typically, the comparison is made when the two sequences are aligned to produce maximum identity. Such an alignment can be determined by conventional methods, for example by computerized implementations of algorithms (GAP, BESTFIT, FASTA and TFASTA in the Wisconsin Genetics software package, Genetics Computer Group), with reference to, for example, Smith and Waterman, 1981, Adv. Appl. Math. 2:482; Pearson and Lipman, 1988, Proc. Natl. Acad. Sci. USA 85:2444; and Thompson et al., 1994, Nucleic Acids Res. 22:4673-4680. Identity may also be determined using the BLAST algorithm, available from the National Center for Biotechnology Information (NCBI, www.ncbi.nlm.nih.gov), with default parameters.
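Once two sequences have been aligned, the position-by-position comparison described above reduces to a simple count. The Python sketch below assumes the alignment has already been produced by one of the tools named in the text and that gaps are written as "-"; it uses the number of aligned columns as the denominator, which is one of several conventions in use. The function name is hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over the columns of a pre-computed pairwise alignment."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    # Ignore columns in which both sequences carry a gap.
    columns = [(a, b) for a, b in zip(aligned_a.upper(), aligned_b.upper())
               if not (a == gap and b == gap)]
    if not columns:
        raise ValueError("alignment has no comparable columns")
    matches = sum(a == b and a != gap for a, b in columns)
    return 100.0 * matches / len(columns)
```

The same function works for nucleotide and amino acid alignments, since it only compares the characters in each column.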

As used herein, "biotin", also known as vitamin H, is a small-molecule vitamin with a molecular weight of 244 Da. "Avidin" is a basic glycoprotein with four binding sites that have very high affinity for biotin; a commonly used avidin is streptavidin. The extremely strong affinity of biotin for avidin can be used to amplify or enhance the detection signal in a detection system. For example, biotin is readily coupled to proteins (such as antibodies) by covalent bonds; an enzyme-conjugated avidin molecule then binds the biotin attached to a specific antibody, providing multi-stage amplification, and the catalytic action of the enzyme on its substrate allows an unknown antigen (or antibody) molecule to be detected.

Abasic spacer

As used herein, "abasic spacer" refers to a nucleoside that does not contain specific coding information. An abasic spacer may be attached to an oligonucleotide at the 3' or 5' end, or within the nucleotide chain. Common spacers include: dSpacer (abasic furan), Spacer C3, Spacer C6, Spacer C12, Spacer 9, Spacer 12, Spacer 18, Inverted Abasic Site (dSpacer abasic furan) and rAbasic Site (rSpacer abasic furan).

The abasic spacers described above are known in the art; for example, dSpacer, Spacer 9, Spacer 18 and Spacer C3 are disclosed in U.S. Patent No. 8153772B2, and Chinese patent application CN101454451A discloses dSpacer.

The preferred abasic spacer "dSpacer" herein is also referred to as an abasic site, tetrahydrofuran (THF) or an apurinic/apyrimidinic site (apurinic/APYRIMIDINIC (AP) site), or an abasic linker in which the methylene group is located at the 1-position of the 2' -deoxyribose. dSpacer is not only very similar in structure to the natural site, but is also quite stable. The structure is as follows:

When the dSpacer is linked to nucleotides, the following structure can be formed:

Target nucleic acid

As used herein, the term "target nucleic acid" refers to a polynucleotide molecule extracted from a biological sample (the sample to be tested). The biological sample is any solid or fluid sample obtained from, excreted by or secreted by any organism, including but not limited to single-celled organisms (such as bacteria, yeasts, protozoa and amoebas) and multicellular organisms (e.g., plants or animals, including samples from healthy or apparently healthy human subjects, or from human patients affected by a condition or disease to be diagnosed or investigated, such as infection by a pathogenic microorganism, e.g., a pathogenic bacterium or virus). For example, the biological sample may be a biological fluid obtained from, for example, blood, plasma, serum, urine, stool, sputum, mucus, lymph, synovial fluid, bile, ascites, pleural effusion, seroma, saliva, cerebrospinal fluid, aqueous or vitreous humor, or any bodily secretion or exudate (e.g., fluid obtained from an abscess or any other site of infection or inflammation), a fluid obtained from a joint (e.g., a normal joint or a joint affected by a disease such as rheumatoid arthritis, osteoarthritis, gout or septic arthritis), or a swab of a skin or mucosal surface. The sample may also be obtained from any organ or tissue (including a biopsy or autopsy specimen, such as a tumor biopsy), or may comprise cells (primary cells or cultured cells) or a medium conditioned by any cell, tissue or organ. Exemplary samples include, but are not limited to, cells, cell lysates, blood smears, cytocentrifuge preparations, cytological smears, bodily fluids (e.g., blood, plasma, serum, saliva, sputum, urine, bronchoalveolar lavage, semen, etc.), tissue biopsies (e.g., tumor biopsies), fine needle aspirates, and/or tissue sections (e.g., cryostat tissue sections and/or paraffin-embedded tissue sections).

In other embodiments, the biological sample may be a plant cell, a callus, a tissue or organ (e.g., root, stem, leaf, flower, seed, fruit), or the like.

In the present invention, the target nucleic acid also includes DNA molecules formed by reverse transcription of RNA. Further, the target nucleic acid may be amplified using techniques known in the art, such as isothermal and non-isothermal amplification techniques. The isothermal amplification may be nucleic acid sequence-based amplification (NASBA), recombinase polymerase amplification (RPA), loop-mediated isothermal amplification (LAMP), strand displacement amplification (SDA), helicase-dependent amplification (HDA), or nicking enzyme amplification reaction (NEAR). In certain exemplary embodiments, non-isothermal amplification methods may be used, including, but not limited to, PCR, multiple displacement amplification (MDA), rolling circle amplification (RCA), ligase chain reaction (LCR), or ramification amplification method (RAM).

Further, the detection method of the present invention further comprises a step of amplifying the target nucleic acid, and the detection system further comprises reagents for amplifying the target nucleic acid. The amplification reagents include one or more selected from the following group: DNA polymerase, strand-displacing enzyme, helicase, recombinase, single-strand binding protein, and the like.

CRISPR

As used herein, "CRISPR" refers to clustered regularly interspaced short palindromic repeats, which are derived from the immune system of microorganisms.

Cas proteins

"Cas protein" as used herein refers to a CRISPR-associated protein, preferably a type V or type VI CRISPR/Cas protein (CRISPR/Cas effector protein). Upon binding to the feature sequence to be detected (the target sequence), i.e., upon formation of a ternary Cas protein-gRNA-target sequence complex, its trans activity is induced, that is, it non-specifically cleaves non-targeted single-stranded nucleotides (i.e., the single-stranded nucleic acid detector described herein). When the Cas protein binds to the feature sequence, its trans activity can be induced whether or not it cleaves the feature sequence; preferably, its trans activity is induced by cleaving the feature sequence; more preferably, its trans activity is induced by cleaving a single-stranded feature sequence. The Cas protein recognizes the feature sequence by recognizing a PAM (protospacer adjacent motif) adjacent to the feature sequence.

The Cas protein of the present invention is a protein having at least trans-cleavage activity; preferably, the Cas protein has both cis- and trans-cleavage activity. The cis activity refers to the ability of the Cas protein, guided by the gRNA, to recognize the PAM site and specifically cleave the target sequence.

The Cas proteins include type V CRISPR/Cas effector proteins, including protein families such as Cas12 and Cas14. Preferably, the Cas protein is a Cas12 protein, such as Cas12a, Cas12b or Cas12j; more preferably, the Cas protein is Cas12j.

In embodiments, the Cas proteins referred to herein, such as Cas12, also encompass functional variants of the Cas protein or homologs or orthologs thereof. A "functional variant" of a protein as used herein refers to a variant of such a protein that retains, at least in part, the activity of the protein. Functional variants may include mutants (which may be insertion, deletion or substitution mutants), polymorphic forms, and the like. Functional variants also include fusion products of such proteins with another nucleic acid, protein, polypeptide or peptide with which they are not normally associated. Functional variants may be naturally occurring or may be artificial. Advantageous embodiments may relate to engineered or non-naturally occurring type V DNA-targeting effector proteins.

In one embodiment, one or more nucleic acid molecules encoding a Cas protein, such as Cas12, or an ortholog or homolog thereof, may be codon optimized for expression in eukaryotic cells. Eukaryotes may be as described herein. One or more nucleic acid molecules may be engineered or non-naturally occurring.

In one embodiment, the Cas12 protein or an ortholog or homolog thereof may comprise one or more mutations (and thus the nucleic acid molecule encoding the same may have one or more mutations).

In one embodiment, the Cas protein may be from: ciliates, Listeria, Corynebacterium, sarium, Legionella, Treponema, actinomycetes, Eubacterium, Streptococcus, Lactobacillus, Mycoplasma, Bacteroides, Flaviivola, Flavobacterium, azoospira, Sphaerochaeta, Gluconacetobacter, Neisseria, rochanteria, Parvibaculum, Staphylococcus, Nitratifractor, Campylobacter and chaetomium.

In one embodiment, the Cas protein is selected from the group consisting of:

(1) A protein shown in SEQ ID No. 1;

(2) A derivative protein which is formed by substituting, deleting or adding one or more (such as 2, 3, 4, 5, 6, 7, 8, 9 or 10) amino acid residues in the amino acid sequence shown in SEQ ID No. 1 or an active fragment thereof, and which has substantially the same function.

In one embodiment, the Cas protein further includes proteins having 50%, preferably 55%, preferably 60%, preferably 65%, preferably 70%, preferably 75%, preferably 80%, preferably 85%, preferably 90%, preferably 99%, sequence identity (homology) to the above-described sequences and having trans activity.

The Cas protein can be obtained by recombinant expression vector technology: a nucleic acid molecule encoding the protein is constructed on a suitable vector and then transformed into a host cell, so that the encoding nucleic acid molecule is expressed in the cell and the corresponding protein is obtained. The protein can be secreted by the cell, or the cell can be disrupted and the protein obtained by conventional extraction techniques. The encoding nucleic acid molecule may or may not be integrated into the genome of the host cell for expression. The vector may further comprise regulatory elements that facilitate sequence integration or self-replication. The vector may be, for example, a plasmid, virus, cosmid or phage, which are well known to those skilled in the art; preferably, the expression vector in the present invention is a plasmid. The vector further comprises one or more regulatory elements selected from the group consisting of promoters, enhancers, ribosome binding sites for translation initiation, terminators, polyadenylation sequences, and selectable marker genes.

The host cell may be a prokaryotic cell, such as E. coli, Streptomyces or Agrobacterium; a lower eukaryotic cell, such as a yeast cell; or a higher eukaryotic cell, such as a plant cell. It will be clear to one of ordinary skill in the art how to select appropriate vectors and host cells.

gRNA

As used herein, "gRNA", also known as guide RNA, has the meaning commonly understood by those of skill in the art. In general, a guide RNA can comprise, consist essentially of, or consist of a direct repeat sequence and a guide sequence (also referred to as a spacer in the context of endogenous CRISPR systems). Depending on the Cas protein on which it depends, the gRNA in different CRISPR systems may include both crRNA and tracrRNA, or may contain only crRNA. The crRNA and tracrRNA may be fused by artificial engineering to form a single guide RNA (sgRNA). In certain instances, the targeting sequence is any polynucleotide sequence that has sufficient complementarity to a target sequence (the feature sequence described herein) to hybridize to the target sequence and direct specific binding of a CRISPR/Cas complex to the target sequence; it typically has a length of 12-25 nt. In preferred embodiments, the targeting sequence has a length of 13-20 nt, e.g., 14 nt, 15 nt, 16 nt, 17 nt, 18 nt, 19 nt, or 20 nt. The direct repeat sequence can fold to form a specific structure (e.g., a stem-loop structure) that is recognized by the Cas protein to form a complex. The targeting sequence need not be 100% complementary to the feature sequence (target sequence). The targeting sequence is not complementary to the single-stranded nucleic acid detector.

In certain embodiments, the degree of complementarity (degree of matching) between a targeting sequence and its corresponding target sequence is at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, at least 95%, or at least 99% when optimally aligned. It is within the ability of one of ordinary skill in the art to determine the optimal alignment. For example, there are published and commercially available alignment algorithms and programs such as, but not limited to, ClustalW, the Smith-Waterman algorithm in MATLAB, Bowtie, Geneious, Biopython, and SeqMan.
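By way of illustration only, the degree of complementarity described above can be sketched in Python for the simplest case, in which the targeting sequence and the target strand have the same length and are compared without gaps. This is an assumption made for the example and is not a substitute for the optimal alignment produced by the programs listed above; the names `RNA_TO_DNA_PAIR` and `complementarity` are hypothetical.

```python
# Watson-Crick partner (DNA base) for each RNA base of the targeting sequence.
RNA_TO_DNA_PAIR = {"A": "T", "U": "A", "G": "C", "C": "G"}


def complementarity(targeting_rna: str, target_strand_dna: str) -> float:
    """Percent of targeting-sequence bases that pair with the target strand.

    Both sequences are written 5'->3'. Hybridization is antiparallel, so
    the DNA strand is reversed before the position-by-position comparison.
    Ungapped, equal-length comparison only (an assumed simplification).
    """
    guide = targeting_rna.upper()
    target = target_strand_dna.upper()[::-1]  # read the target strand 3'->5'
    if not guide or len(guide) != len(target):
        raise ValueError("sequences must be non-empty and of equal length")
    paired = sum(
        1 for g, t in zip(guide, target) if RNA_TO_DNA_PAIR.get(g) == t
    )
    return 100.0 * paired / len(guide)


# 5'-AUGC-3' pairs fully with the DNA strand 5'-GCAT-3'.
full_match = complementarity("AUGC", "GCAT")
# One mismatch out of four positions.
one_mismatch = complementarity("AUGC", "GCAA")
```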

The gRNA of the invention may be natural, artificially modified, or designed and synthesized.

Sequence information

SEQ ID No.	Description	Type
1	Cas12j	Protein

Drawings

FIG. 1 shows the detection results for each position, in turn, of the gRNA-targeted region of the target nucleic acid when the target gene is OsTGW6 and the gRNA length is 16 bp.

FIG. 2 shows the detection results for each position, in turn, of the gRNA-targeted region of the target nucleic acid when the target gene is OsTGW6 and the gRNA length is 18 bp.

FIG. 3 shows the detection results for each position, in turn, of the gRNA-targeted region of the mutant target nucleic acid when the target gene is OsTGW6 and the gRNA length is 20 bp.

FIG. 4 shows the detection results for each position, in turn, of the gRNA-targeted region of the target nucleic acid when the target gene is CV19 and the gRNA length is 16 bp.

FIG. 5 shows the detection results for each position, in turn, of the gRNA-targeted region of the mutant target nucleic acid when the target gene is OsTGW6 and the gRNA length is 16 bp.

FIG. 6 shows the detection results for each position, in turn, of the gRNA-targeted region of the mutant target nucleic acid when the target gene is OsTGW6 and the gRNA length is 18 bp.

FIG. 7 shows the detection results for each position, in turn, of the gRNA-targeted region of the mutant target nucleic acid when the target gene is OsTGW6 and the gRNA length is 20 bp.

FIG. 8 shows the detection results for each position, in turn, of the gRNA-targeted region of the target nucleic acid when the target gene is CV19 and the gRNA length is 16 bp.

FIG. 9 shows the detection results for each position, in turn, of the gRNA-targeted region of the target nucleic acid when the target gene is Ngene and the gRNA length is 16 bp.

FIG. 10 shows that, when OsTGW6 is detected with a 14 bp or 16 bp gRNA, the gRNA length has little effect on the detection result; when CV19 is detected with a 14 bp or 16 bp gRNA, the 14 bp gRNA gives the better result, with a more pronounced difference between the wild type and the mutant type.

FIG. 11 shows that a single-base substitution mutation is detected regardless of which base the position is mutated to.

FIG. 12 shows the results of detecting deletion mutations set in turn at each position of the gRNA-targeted region of the target nucleic acid when the target gene is Ngene and the gRNA length is 14 bp.

FIG. 13 shows the results of detecting insertion mutations set in turn at each position of the gRNA-targeted region of the target nucleic acid when the target gene is Ngene and the gRNA length is 14 bp.

FIG. 14 shows the results of detecting deletion mutations set in turn at each position of the gRNA-targeted region of the target nucleic acid when the target gene is OsTGW6 and the gRNA length is 14 bp.

FIG. 15 shows the results of detecting insertion mutations set in turn at each position of the gRNA-targeted region of the target nucleic acid when the target gene is OsTGW6 and the gRNA length is 14 bp.

FIG. 16 shows the results of detecting deletion mutations set in turn at each position of the gRNA-targeted region of the target nucleic acid when the target gene is CV19 and the gRNA length is 14 bp.

FIG. 17 shows the results of detecting insertion mutations set in turn at each position of the gRNA-targeted region of the target nucleic acid when the target gene is CV19 and the gRNA length is 14 bp.

FIG. 18 shows the detection efficiency when the target gene is OsTGW6, the gRNA length is 14 bp, and a mutated base is set at a base of the gRNA guide sequence (position 2 or 6) that does not pair with the target mutation site.

FIG. 19 shows the detection efficiency when the target gene is Ngene, the gRNA length is 14 bp, and a mutated base is set at a base of the gRNA guide sequence (position 2 or 6) that does not pair with the target mutation site.

FIG. 20 shows the detection efficiency when the target gene is CV19, the gRNA length is 14 bp, and a mutated base is set at a base of the gRNA guide sequence (position 2 or 6) that does not pair with the target mutation site.

Description of the embodiments

The present invention is further described by the following examples, which are given by way of illustration only and do not limit the invention. A person skilled in the art may use the teachings disclosed above to modify them into equivalent examples. Any simple modification or equivalent variation of the following examples made according to the technical substance of the present invention falls within the scope of the present invention.

The technical scheme of the invention is based on the following principle. Nucleic acid is obtained from the sample to be detected; for example, the target nucleic acid can be obtained by an amplification method. A gRNA that can pair with the target nucleic acid guides the Cas protein to recognize and bind the target nucleic acid. The Cas protein is thereby activated to cleave the single-stranded nucleic acid detector in the system. A fluorescent group and a quenching group are arranged at the two ends of the single-stranded nucleic acid detector: if the detector is cleaved, fluorescence is emitted; if it is not cleaved, no fluorescence is emitted. In other embodiments, the two ends of the single-stranded nucleic acid detector may carry labels that can be detected with colloidal gold.

Example 1 Testing detection efficiency by setting different mutations in the region targeted by the gRNA, using double-stranded DNA as the target nucleic acid

OT-ssDNA (primers) containing different OsTGW6 gene mutation sites and their complementary strands were synthesized and annealed, followed by T-Blunt ligation; correct single clones were picked and confirmed by sequencing, the plasmids were extracted, and plasmid was added to the reaction according to its concentration so that the final concentration was 5 nM. The gene corresponding to CV19-Lamb-j19g1-16bp is Orflab-A (the gene was constructed on a vector); plasmid Orflab-A was amplified with OT-ssDNA primers containing different mutation sites together with a T7 primer, the PCR products were recovered, and PCR product was added to the reaction according to its concentration so that the final concentration was 10 nM. With Cas12j19 at a final concentration of 50 nM, gRNA at a final concentration of 50 nM and reporter-FB-T at 200 nM, different target sequences were detected to verify how, for gRNAs of different lengths, mutation sites located at different positions of the gRNA-targeted region of the target nucleic acid affect the detection result, i.e., affect the in vitro trans activity of Cas12j19.
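By way of illustration only, the volumes needed to reach the final concentrations stated above can be worked out from the dilution relation C1 × V1 = C2 × V2. In the following Python sketch the final concentrations are those of Example 1, whereas the stock concentrations, the reaction volume and the function name `stock_volume_ul` are hypothetical values chosen for the example and are not taken from the source.

```python
def stock_volume_ul(final_nm: float, stock_nm: float, reaction_ul: float) -> float:
    """Stock volume (in µL) giving `final_nm` in a reaction of `reaction_ul`.

    Derived from C1 * V1 = C2 * V2, with concentrations in nM.
    """
    if stock_nm <= 0 or reaction_ul <= 0:
        raise ValueError("stock concentration and reaction volume must be positive")
    if final_nm > stock_nm:
        raise ValueError("stock is more dilute than the requested final concentration")
    return final_nm * reaction_ul / stock_nm


# Final concentrations (nM) as stated in Example 1.
FINAL_NM = {"Cas12j19": 50, "gRNA": 50, "reporter-FB-T": 200, "plasmid target": 5}

# Hypothetical stock concentrations (nM) and reaction volume (µL); not from the source.
STOCK_NM = {"Cas12j19": 1000, "gRNA": 1000, "reporter-FB-T": 10000, "plasmid target": 100}
REACTION_UL = 20.0

volumes = {
    name: stock_volume_ul(FINAL_NM[name], STOCK_NM[name], REACTION_UL)
    for name in FINAL_NM
}
```

The remaining volume of the reaction would be made up with buffer and any other components, which are not specified here.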

TABLE 1 Experimental arrangement of target nucleic acid as double-stranded DNA

The sequence of the region of the gRNA that binds the Cas protein is GUGCUGCUGUCUCCCAGACGGGAGGCAGAACUGCAC, and the guide sequence is located at the 3' end of this sequence. Difference sites are counted from the PAM-proximal end (i.e., the 5' end of the guide sequence, or spacer, of the gRNA), and single-base mutations are introduced position by position. "Effective" means that there is a significant difference between the detection result obtained with the difference and that obtained without it.
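By way of illustration only, the position-by-position single-base mutation scheme described above can be sketched in Python as follows. Positions are numbered from 1 at the PAM-proximal end, corresponding to the 5' end of the guide sequence. The substitution rule used here (each base is replaced by its complement) and the names `SUBSTITUTE` and `single_base_scan` are hypothetical; the substitutions actually used in the experiments are those of Table 1. The mutated base is written in lower case, mirroring the notation of Table 7.

```python
from typing import Dict

# Hypothetical substitution rule for this sketch: swap each base for its complement.
SUBSTITUTE = {"A": "T", "T": "A", "C": "G", "G": "C"}


def single_base_scan(targeted_region: str) -> Dict[int, str]:
    """Return one variant per position, each carrying a single substitution.

    `targeted_region` is the DNA sequence of the gRNA-targeted region written
    5'->3' in the same orientation as the guide sequence, so that position 1
    is the PAM-proximal base. Keys are 1-based positions.
    """
    seq = targeted_region.upper()
    variants = {}
    for position in range(1, len(seq) + 1):
        i = position - 1
        mutated = SUBSTITUTE[seq[i]].lower()  # lower case marks the changed base
        variants[position] = seq[:i] + mutated + seq[i + 1:]
    return variants


# Toy 4-base region used only to show the numbering.
scan = single_base_scan("ACGT")
```

A base other than A, C, G or T raises a `KeyError` in this sketch.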

As shown in FIG. 1, when the target gene is OsTGW6 and the gRNA length is 16 bp, a difference from the target nucleic acid at any of positions 1 to 16 has a significant influence on the detection result (reduced fluorescence signal). Here, No. 1 means that the 1st base from the 5' end of the gRNA guide sequence differs from the target nucleic acid (i.e., the 1st base at the PAM-proximal end of the gRNA-targeted region of the target nucleic acid), No. 2 means that the 2nd base from the 5' end differs from the target nucleic acid (i.e., the 2nd base at the PAM-proximal end of the gRNA-targeted region), and so on. Such a difference corresponds to a mutation relative to the wild-type gene or, in SNP detection, to a SNP at that position. That is, when mutations in this sequence are detected, a mutation at any position targeted by the gRNA can be clearly observed.

As shown in FIG. 2, when the target gene is OsTGW6 and the gRNA length is 18 bp, a difference (mutation) at any of positions 1-3, 7-14 or 16 has a significant influence on the detection result (reduced fluorescence signal). That is, when mutations in this sequence are detected, a mutation at any of positions 1-3, 7-14 or 16 from the 5' end of the gRNA-targeted region can be clearly observed.

As shown in FIG. 3, when the target gene is CV19 and the gRNA length is 20 bp, a difference at any of positions 1-5 or 7-10 has a significant effect on the detection result (reduced fluorescence signal). That is, when mutations in this sequence are detected, a mutation at any of positions 1 to 3, 7 to 14 or 16 from the 5' end of the gRNA-targeted region can be clearly observed.

As shown in FIG. 4, when the target gene is OsTGW and the gRNA length is 16 bp, a difference from the target nucleic acid at any of positions 5-15 has a significant influence on the detection result (reduced fluorescence signal). That is, when mutations in this sequence are detected, a mutation at any of positions 5-15 from the 5' end of the gRNA-targeted region can be clearly observed.

Example 2 Testing detection efficiency by setting different mutations in the region targeted by the gRNA, using single-stranded DNA as the target nucleic acid

OT-ssDNA primers were synthesized as the target nucleic acid and used at a final concentration of 50 nM. With Cas12j19 at a final concentration of 50 nM, gRNA at a final concentration of 50 nM and reporter-FB-T at 200 nM, different target sequences were detected to verify how gRNAs of different lengths affect the detection of templates (target nucleic acids) containing different mutation sites, i.e., affect the in vitro trans activity of Cas12j19.

TABLE 2 Experimental arrangement of target nucleic acid as single-stranded DNA

The sequence of the region of the gRNA that binds the Cas protein is GUGCUGCUGUCUCCCAGACGGGAGGCAGAACUGCAC, and the guide sequence is located at the 3' end of this sequence. Difference sites are counted from the 5' end of the gRNA guide sequence. "Effective" means that there is a significant difference between the detection result obtained with the difference and that obtained without it.
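By way of illustration only, the "effective" criterion described above can be sketched in Python as a comparison of the fluorescence obtained with a fully matched target and with a target carrying the difference. The source does not state a numerical criterion for a "significant difference"; the threshold `min_drop`, the optional background subtraction and the function names below are therefore hypothetical choices made for this example.

```python
def signal_drop(matched_signal: float, mismatched_signal: float,
                background: float = 0.0) -> float:
    """Fractional reduction in fluorescence caused by the difference.

    Returns 0.0 when both targets give the same background-corrected signal
    and 1.0 when the mismatched target gives no signal above background.
    """
    matched = matched_signal - background
    mismatched = mismatched_signal - background
    if matched <= 0:
        raise ValueError("matched-target signal must exceed the background")
    return 1.0 - mismatched / matched


def is_effective(matched_signal: float, mismatched_signal: float,
                 background: float = 0.0, min_drop: float = 0.5) -> bool:
    """Call a position 'effective' when the drop reaches `min_drop`.

    `min_drop` is a hypothetical threshold, not a value from the source.
    """
    return signal_drop(matched_signal, mismatched_signal, background) >= min_drop


# Arbitrary fluorescence units chosen for the example, not measured data.
drop = signal_drop(1000.0, 250.0)
```

In practice the decision would more naturally rest on replicate measurements and a statistical test; the fixed threshold is used here only to keep the sketch short.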

As shown in FIG. 5, when the target gene is OsTGW6 and the gRNA length is 16 bp, a difference from the target nucleic acid at any of positions 9 to 13 has a significant effect on the detection result (reduced fluorescence signal). That is, when mutations in this sequence are detected, a mutation at any of positions 9-13 from the 5' end of the gRNA-targeted region can be clearly observed.

As shown in FIG. 6, when the target gene is OsTGW6 and the gRNA length is 18 bp, a difference at any of positions 9-10 or 16-17 has a significant effect on the detection result (reduced fluorescence signal). That is, when mutations in this sequence are detected, a mutation at any of positions 9-10 or 16-17 from the 5' end of the gRNA-targeted region can be clearly observed.

As shown in FIG. 7, when the target gene is CV19 and the gRNA length is 20 bp, a difference at any of positions 6, 9-10, 12 or 14-19 has a significant effect on the detection result (reduced fluorescence signal). That is, when mutations in this sequence are detected, a mutation at any of positions 6, 9-10, 12 or 14-19 from the 5' end of the gRNA-targeted region can be clearly observed.

As shown in FIG. 8, when the target gene is CV19 and the gRNA length is 16 bp, a difference from the target nucleic acid at any of positions 7-15 has a significant effect on the detection result (reduced fluorescence signal). That is, when mutations in this sequence are detected, a mutation at any of positions 7-15 from the 5' end of the gRNA-targeted region can be clearly observed.

As shown in FIG. 9, when the target gene is Ngene and the gRNA length is 16 bp, a difference from the target nucleic acid at any of positions 6 to 15 has a significant effect on the detection result (reduced fluorescence signal). That is, when mutations in this sequence are detected, a mutation at any of positions 6-15 from the 5' end of the gRNA-targeted region can be clearly observed.

Example 3 With single-stranded DNA as the target nucleic acid and different mutations set in the region targeted by the gRNA, a gRNA length of 14 bp gives higher detection efficiency

Using the reaction systems of Examples 1 and 2, two genes were selected according to the arrangement in Table 3 to verify how, with a difference set at position 9 from the 5' end, gRNAs of different lengths affect the detection result.

TABLE 3 experimental arrangement of gRNA of different lengths when the target nucleic acid is single-stranded DNA

As shown in FIG. 10, when OsTGW6 is detected with a 14 bp or 16 bp gRNA, the gRNA length has little effect on the detection result; when CV19 is detected with a 14 bp or 16 bp gRNA, the 14 bp gRNA gives the better result, with a more pronounced difference between the wild type and the mutant type.

Example 4 Verifying detection efficiency by mutating position 9 of the region targeted by the gRNA to different bases, using single-stranded DNA as the target nucleic acid

The gRNAs shown in Table 4 were synthesized, the base at position 9 of the gRNA-targeted region was changed to different bases, and it was examined whether the identity of the substituted base affects the detection result.

TABLE 4 gRNA names, sequences and experimental results

The sequence of the region of the gRNA that binds the Cas protein is GUGCUGCUGUCUCCCAGACGGGAGGCAGAACUGCAC, and the guide sequence is located at the 3' end of this sequence. "Effective" in the table means that the mutation can be detected when the position is mutated to that base.

As shown in FIG. 11, the single-base substitution mutation was detected regardless of which base the position was mutated to; the identity of the substituted base did not affect the detection result.

Example 5 Verifying detection efficiency by setting different insertion or deletion mutations in the region targeted by the gRNA, using single-stranded DNA as the target nucleic acid

The three gRNAs shown in Table 5 were synthesized, each targeting a different target nucleic acid, and insertion or deletion mutations were designed in turn at different positions of the gRNA-targeted region, in order to verify whether insertion or deletion mutations at different positions of the gRNA-targeted region can be detected by the method.

TABLE 5 gRNA name, sequence and experimental results

The sequence of the region of the gRNA that binds the Cas protein is GUGCUGCUGUCUCCCAGACGGGAGGCAGAACUGCAC, and the guide sequence is located at the 3' end of this sequence. Insertion or deletion of one base was set in turn at positions 1-14 or 1-16 from the 5' end of the guide sequence. Insertion of a base means that a base is inserted on the 5' side of the given position; for example, an insertion at position 1 means that a base is inserted on the 5' side of position 1. Deletion of a base means that the base at the given position is deleted; for example, a deletion at position 1 means that the base at position 1 is deleted. The same applies to the other positions.
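By way of illustration only, the insertion and deletion conventions defined above can be sketched in Python as follows. Positions are 1-based and counted from the 5' end of the sequence; the function names `insert_before` and `delete_at`, and the use of lower case to mark an inserted base, are choices made for this example.

```python
def insert_before(seq: str, position: int, base: str) -> str:
    """Insert `base` on the 5' side of the 1-based `position`."""
    if not 1 <= position <= len(seq):
        raise ValueError("position out of range")
    i = position - 1
    return seq[:i] + base + seq[i:]


def delete_at(seq: str, position: int) -> str:
    """Delete the base at the 1-based `position`."""
    if not 1 <= position <= len(seq):
        raise ValueError("position out of range")
    i = position - 1
    return seq[:i] + seq[i + 1:]


# Toy 4-base sequence used only to show the conventions.
inserted_at_1 = insert_before("ACGT", 1, "t")  # base added 5' of position 1
deleted_at_1 = delete_at("ACGT", 1)            # base at position 1 removed
```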

Example 6 Verifying detection efficiency by setting a mutated base at a base of the gRNA targeting sequence that does not pair with the target mutation site (position 2 or 6)

Using the gRNAs of Table 6, a mutation was set at position 9 of the gRNA-targeted region, while an artificial mutation was designed at a position other than position 9 (position 2 or 6), and the detection efficiency obtained when a mutation is designed at a base other than the one pairing with the target mutation site was examined.

The experimental results are shown in FIGS. 18-20: designing a mutation at a position of the gRNA that does not correspond to the target mutation does not affect the detection efficiency.

TABLE 6 gRNA name, sequence and experimental results

TABLE 7 target nucleic acid names, sequences

Name	Sequence
12j19-ostgw6-3-ssdna9-insert-2a	CCCCGCCTTTTGGACCAACTCGCtATCAATACCATGTAGGCGTCGGCGATG
12j19-ostgw6-3-ssdna9-insert-6a	CCCCGCCTTTTGGACCAACTCGCtATAAATCCCATGTAGGCGTCGGCGATG
12j19-ostgw6-3-ssdna2a	CCCCGCCTTTTGGACCAACTCGCATCAATACCATGTAGGCGTCGGCGATG
12j19-ostgw6-3-ssdna6a	CCCCGCCTTTTGGACCAACTCGCATAAATCCCATGTAGGCGTCGGCGATG
n-b-12j19g1-ssdna89 10-insert2a	CCCAGCGCTTCAGCGTTCTTCGGAaATGTCGAGCATTGGCATGGAAGTCACAC
n-b-12j19g1-ssdna89 10-insert6a	CCCAGCGCTTCAGCGTTCTTCGGAaATATCGCGCATTGGCATGGAAGTCACAC
n-b-j19g1-ssdna2a	CCCAGCGCTTCAGCGTTCTTCGGAATGTCGAGCATTGGCATGGAAGTCACAC
n-b-j19g1-ssdna6a	CCCAGCGCTTCAGCGTTCTTCGGAATATCGCGCATTGGCATGGAAGTCACAC
cv-j19g1-ssdna0	GGCACCAAATTCCAAAGGTTTACCTTGGTAATCATCTTCAGTACCATACTCATATTGAG
cv-j19g1-ssdna789-insert	GGCACCAAATTCCAAAGGTTTACCTTGGTAATCATCtTTCAGTACCATACTCATATTGAG
cv-j19g1-ssdna789-insert2c	GGCACCAAATTCCAAAGGTTTACCTTGGTAATCATCtTTCAGTcCCATACTCATATTGAG
cv-j19g1-ssdna789-insert6a	GGCACCAAATTCCAAAGGTTTACCTTGGTAATCATCtTTaAGTACCATACTCATATTGAG
cv-j19g1-ssdna2c	GGCACCAAATTCCAAAGGTTTACCTTGGTAATCATCTTCAGTCCCATACTCATATTGAG
cv-j19g1-ssdna6a	GGCACCAAATTCCAAAGGTTTACCTTGGTAATCATCTTAAGTACCATACTCATATTGAG
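By way of illustration only, the relationship between paired entries of Table 7 can be checked with a short Python sketch. The two sequences below are copied from the table (12j19-ostgw6-3-ssdna2a and 12j19-ostgw6-3-ssdna9-insert-2a); reading the lower-case base as the inserted base is an interpretation made for this example, and the function name `locate_insertion` is hypothetical.

```python
from typing import Optional, Tuple


def locate_insertion(reference: str, variant: str) -> Optional[Tuple[int, str]]:
    """Locate a single contiguous insertion that turns `reference` into `variant`.

    Returns (0-based index in the variant, inserted bases as written in the
    variant), or None if the two sequences do not differ by exactly one
    contiguous insertion. The comparison ignores letter case.
    """
    ref, var = reference.upper(), variant.upper()
    extra = len(var) - len(ref)
    if extra <= 0:
        return None
    # Length of the common prefix.
    i = 0
    while i < len(ref) and ref[i] == var[i]:
        i += 1
    # Everything after the inserted bases must match the rest of the reference.
    if ref[i:] == var[i + extra:]:
        return i, variant[i:i + extra]
    return None


# Sequences copied from Table 7.
SSDNA_2A = "CCCCGCCTTTTGGACCAACTCGCATCAATACCATGTAGGCGTCGGCGATG"
SSDNA9_INSERT_2A = "CCCCGCCTTTTGGACCAACTCGCtATCAATACCATGTAGGCGTCGGCGATG"

insertion = locate_insertion(SSDNA_2A, SSDNA9_INSERT_2A)
```

For this pair the sketch reports one inserted base, the lower-case `t`, which is consistent with the two entries differing only by that insertion.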

The results of the above examples demonstrate that two nucleic acid sequences having at least one different base (single base substitution, insertion or deletion) can be rapidly detected using the designed gRNA, and that the wild type and mutant type can be separately sequenced without amplification, thereby providing a faster, more convenient and accurate method for rapid classification of target nucleic acids.

Sequence listing

<110> Shandong Shunfeng Biotechnology Co., Ltd.

<120> Method for detecting target mutation using Cas12j effector protein

<130> JH-CNP202142DJ

<160> 1

<170> PatentIn version 3.5

<210> 1

<211> 908

<212> PRT

<213> Artificial Sequence

<220>

<223> Cas12j

<400> 1

Met Pro Ser Tyr Lys Ser Ser Arg Val Leu Val Arg Asp Val Pro Glu

1 5 10 15

Glu Leu Val Asp His Tyr Glu Arg Ser His Arg Val Ala Ala Phe Phe

20 25 30

Met Arg Leu Leu Leu Ala Met Arg Arg Glu Pro Tyr Ser Leu Arg Met

35 40 45

Arg Asp Gly Thr Glu Arg Glu Val Asp Leu Asp Glu Thr Asp Asp Phe

50 55 60

Leu Arg Ser Ala Gly Cys Glu Glu Pro Asp Ala Val Ser Asp Asp Leu

65 70 75 80

Arg Ser Phe Ala Leu Ala Val Leu His Gln Asp Asn Pro Lys Lys Arg

85 90 95

Ala Phe Leu Glu Ser Glu Asn Cys Val Ser Ile Leu Cys Leu Glu Lys

100 105 110

Ser Ala Ser Gly Thr Arg Tyr Tyr Lys Arg Pro Gly Tyr Gln Leu Leu

115 120 125

Lys Lys Ala Ile Glu Glu Glu Trp Gly Trp Asp Lys Phe Glu Ala Ser

130 135 140

Leu Leu Asp Glu Arg Thr Gly Glu Val Ala Glu Lys Phe Ala Ala Leu

145 150 155 160

Ser Met Glu Asp Trp Arg Arg Phe Phe Ala Ala Arg Asp Pro Asp Asp

165 170 175

Leu Gly Arg Glu Leu Leu Lys Thr Asp Thr Arg Glu Gly Met Ala Ala

180 185 190

Ala Leu Arg Leu Arg Glu Arg Gly Val Phe Pro Val Ser Val Pro Glu

195 200 205

His Leu Asp Leu Asp Ser Leu Lys Ala Ala Met Ala Ser Ala Ala Glu

210 215 220

Arg Leu Lys Ser Trp Leu Ala Cys Asn Gln Arg Ala Val Asp Glu Lys

225 230 235 240

Ser Glu Leu Arg Lys Arg Phe Glu Glu Ala Leu Asp Gly Val Asp Pro

245 250 255

Glu Lys Tyr Ala Leu Phe Glu Lys Phe Ala Ala Glu Leu Gln Gln Ala

260 265 270

Asp Tyr Asn Val Thr Lys Lys Leu Val Leu Ala Val Ser Ala Lys Phe

275 280 285

Pro Ala Thr Glu Pro Ser Glu Phe Lys Arg Gly Val Glu Ile Leu Lys

290 295 300

Glu Asp Gly Tyr Lys Pro Leu Trp Glu Asp Phe Arg Glu Leu Gly Phe

305 310 315 320

Val Tyr Leu Ala Glu Arg Lys Trp Glu Arg Arg Arg Gly Gly Ala Ala

325 330 335

Val Thr Leu Cys Asp Ala Asp Asp Ser Pro Ile Lys Val Arg Phe Gly

340 345 350

Leu Thr Gly Arg Gly Arg Lys Phe Val Leu Ser Ala Ala Gly Ser Arg

355 360 365

Phe Leu Ile Thr Val Lys Leu Pro Cys Gly Asp Val Gly Leu Thr Ala

370 375 380

Val Pro Ser Arg Tyr Phe Trp Asn Pro Ser Val Gly Arg Thr Thr Ser

385 390 395 400

Asn Ser Phe Arg Ile Glu Phe Thr Lys Arg Thr Thr Glu Asn Arg Arg

405 410 415

Tyr Val Gly Glu Val Lys Glu Ile Gly Leu Val Arg Gln Arg Gly Arg

420 425 430

Tyr Tyr Phe Phe Ile Asp Tyr Asn Phe Asp Pro Glu Glu Val Ser Asp

435 440 445

Glu Thr Lys Val Gly Arg Ala Phe Phe Arg Ala Pro Leu Asn Glu Ser

450 455 460

Arg Pro Lys Pro Lys Asp Lys Leu Thr Val Met Gly Ile Asp Leu Gly

465 470 475 480

Ile Asn Pro Ala Phe Ala Phe Ala Val Cys Thr Leu Gly Glu Cys Gln

485 490 495

Asp Gly Ile Arg Ser Pro Val Ala Lys Met Glu Asp Val Ser Phe Asp

500 505 510

Ser Thr Gly Leu Arg Gly Gly Ile Gly Ser Gln Lys Leu His Arg Glu

515 520 525

Met His Asn Leu Ser Asp Arg Cys Phe Tyr Gly Ala Arg Tyr Ile Arg

530 535 540

Leu Ser Lys Lys Leu Arg Asp Arg Gly Ala Leu Asn Asp Ile Glu Ala

545 550 555 560

Arg Leu Leu Glu Glu Lys Tyr Ile Pro Gly Phe Arg Ile Val His Ile

565 570 575

Glu Asp Ala Asp Glu Arg Arg Arg Thr Val Gly Arg Thr Val Lys Glu

580 585 590

Ile Lys Gln Glu Tyr Lys Arg Ile Arg His Gln Phe Tyr Leu Arg Tyr

595 600 605

His Thr Ser Lys Arg Asp Arg Thr Glu Leu Ile Ser Ala Glu Tyr Phe

610 615 620

Arg Met Leu Phe Leu Val Lys Asn Leu Arg Asn Leu Leu Lys Ser Trp

625 630 635 640

Asn Arg Tyr His Trp Thr Thr Gly Asp Arg Glu Arg Arg Gly Gly Asn

645 650 655

Pro Asp Glu Leu Lys Ser Tyr Val Arg Tyr Tyr Asn Asn Leu Arg Met

660 665 670

Asp Thr Leu Lys Lys Leu Thr Cys Ala Ile Val Arg Thr Ala Lys Glu

675 680 685

His Gly Ala Thr Leu Val Ala Met Glu Asn Ile Gln Arg Val Asp Arg

690 695 700

Asp Asp Glu Val Lys Arg Arg Lys Glu Asn Ser Leu Leu Ser Leu Trp

705 710 715 720

Ala Pro Gly Met Val Leu Glu Arg Val Glu Gln Glu Leu Lys Asn Glu

725 730 735

Gly Ile Leu Ala Trp Glu Val Asp Pro Arg His Thr Ser Gln Thr Ser

740 745 750

Cys Ile Thr Asp Glu Phe Gly Tyr Arg Ser Leu Val Ala Lys Asp Thr

755 760 765

Phe Tyr Phe Glu Gln Asp Arg Lys Ile His Arg Ile Asp Ala Asp Val

770 775 780

Asn Ala Ala Ile Asn Ile Ala Arg Arg Phe Leu Thr Arg Tyr Arg Ser

785 790 795 800

Leu Thr Gln Leu Trp Ala Ser Leu Leu Asp Asp Gly Arg Tyr Leu Val

805 810 815

Asn Val Thr Arg Gln His Glu Arg Ala Tyr Leu Glu Leu Gln Thr Gly

820 825 830

Ala Pro Ala Ala Thr Leu Asn Pro Thr Ala Glu Ala Ser Tyr Glu Leu

835 840 845

Val Gly Leu Ser Pro Glu Glu Glu Glu Leu Ala Gln Thr Arg Ile Lys

850 855 860

Arg Lys Lys Arg Glu Pro Phe Tyr Arg His Glu Gly Val Trp Leu Thr

865 870 875 880

Arg Glu Lys His Arg Glu Gln Val His Glu Leu Arg Asn Gln Val Leu

885 890 895

Ala Leu Gly Asn Ala Lys Ile Pro Glu Ile Arg Thr

900 905
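
The listing above alternates rows of three-letter residue codes with rows of position markers placed at every fifth residue. As a minimal sketch, assuming exactly that layout, the rows can be collapsed into a one-letter sequence for use in alignment or database tools. The function name `rows_to_one_letter` is hypothetical and is not part of the patent; the sample text is the final two residue rows of the listing.

```python
# Standard IUPAC three-letter to one-letter amino acid codes.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def rows_to_one_letter(text):
    """Convert listing rows to a one-letter string (hypothetical helper)."""
    residues = []
    for line in text.splitlines():
        tokens = line.split()
        # Skip blank rows and rows that hold only position markers.
        if not tokens or all(t.isdigit() for t in tokens):
            continue
        # An unknown code raises KeyError rather than being silently dropped.
        residues.extend(THREE_TO_ONE[t] for t in tokens)
    return "".join(residues)


# Last two residue rows of the listing, with their marker rows.
tail = """
Arg Glu Lys His Arg Glu Gln Val His Glu Leu Arg Asn Gln Val Leu

885 890 895

Ala Leu Gly Asn Ala Lys Ile Pro Glu Ile Arg Thr

900 905
"""

tail_one_letter = rows_to_one_letter(tail)
print(tail_one_letter)
```

Applying the same function to the complete listing should give a single string that can be checked against the residue count implied by the final position markers.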

Claims

1. A method of detecting the presence or absence of a target mutation site in a target region of a target nucleic acid using a Cas12j effector protein, the method being for purposes other than the diagnosis or treatment of disease, the method comprising contacting a sample with a Cas12j effector protein, a gRNA comprising a region that binds to the Cas12j effector protein and a guide sequence that hybridizes to a wild-type target nucleic acid, and a single-stranded nucleic acid detector; the sequence of the region that binds to the Cas12j effector protein is GUGCUGCUGUCUCCCAGACGGGAGGCAGAACUGCAC;

the guide sequence of the gRNA is 14-20 bases in length;

the target mutation site is located at the 9th or the 10th position from the 5' end of the gRNA guide sequence; detecting a detectable signal generated by cleavage of the single-stranded nucleic acid detector by the Cas12j effector protein;

the detectable signal obtained with a mutant target nucleic acid is significantly weaker than the signal obtained with the wild-type target nucleic acid.

2. The method of claim 1, wherein the target nucleic acid comprises ribonucleotides or deoxyribonucleotides.

3. The method of claim 1, wherein the target nucleic acid comprises single-stranded DNA, double-stranded DNA, or single-stranded RNA.

4. The method of claim 1, wherein the amino acid sequence of the Cas12j effector protein is set forth in SEQ ID No. 1.

5. Use of the Cas12j effector protein, gRNA, and single-stranded nucleic acid detector of any one of claims 1 to 4 in the manufacture of a kit for detecting the presence or absence of a target mutation site in a target nucleic acid.
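
The sequence constraints recited in claim 1 (the fixed protein-binding region, a guide of 14-20 bases, and a mutation at position 9 or 10 counted from the 5' end of the guide) can be illustrated with a minimal Python sketch. Several points are assumptions rather than statements from the claims: the protein-binding region is placed 5' of the guide, the mutant is represented by the guide-equivalent sequence a perfectly matched guide would have, and the function names and example sequences are arbitrary placeholders.

```python
# Protein-binding region of the gRNA as recited in claim 1.
DIRECT_REPEAT = "GUGCUGCUGUCUCCCAGACGGGAGGCAGAACUGCAC"


def _to_rna(seq):
    """Upper-case a sequence and write T as U (hypothetical helper)."""
    seq = seq.upper().replace("T", "U")
    if set(seq) - set("ACGU"):
        raise ValueError("sequence contains characters other than A, C, G, U/T")
    return seq


def build_grna(guide):
    """Join the binding region and a 14-20 base guide.

    Assumption: the binding region sits 5' of the guide; the claim
    does not state the orientation.
    """
    guide = _to_rna(guide)
    if not 14 <= len(guide) <= 20:
        raise ValueError("guide sequence must be 14-20 bases long")
    return DIRECT_REPEAT + guide


def mismatch_positions(wild_type_guide, mutant_equivalent):
    """Return 1-based positions, counted from the 5' end, that differ."""
    wt, mut = _to_rna(wild_type_guide), _to_rna(mutant_equivalent)
    if len(wt) != len(mut):
        raise ValueError("sequences must have the same length")
    return [i + 1 for i, (a, b) in enumerate(zip(wt, mut)) if a != b]


def mutation_in_claimed_window(wild_type_guide, mutant_equivalent):
    """True when exactly one difference exists and it is at position 9 or 10."""
    positions = mismatch_positions(wild_type_guide, mutant_equivalent)
    return len(positions) == 1 and positions[0] in (9, 10)


# Arbitrary placeholder sequences, not taken from the patent.
wild_type = "ACGUACGUACGUACGUAC"    # 18-base guide
mutant_pos9 = "ACGUACGUGCGUACGUAC"  # differs at position 9
mutant_pos3 = "ACCUACGUACGUACGUAC"  # differs at position 3

print(build_grna(wild_type))
print(mutation_in_claimed_window(wild_type, mutant_pos9))
print(mutation_in_claimed_window(wild_type, mutant_pos3))
```

The sketch only checks sequence geometry. Whether a given mismatch actually weakens the detector signal has to be established experimentally, as the claim itself requires.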