CN112746332A - Nucleic acid coding compound library composed of non-natural nucleotides - Google Patents
Nucleic acid coding compound library composed of non-natural nucleotides
- Publication number
- CN112746332A
- Authority
- CN
- China
- Prior art keywords
- nucleic acid
- compound
- base
- nucleotides
- natural
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- C—CHEMISTRY; METALLURGY
- C40—COMBINATORIAL TECHNOLOGY
- C40B—COMBINATORIAL CHEMISTRY; LIBRARIES, e.g. CHEMICAL LIBRARIES
- C40B40/00—Libraries per se, e.g. arrays, mixtures
- C40B40/04—Libraries containing only organic compounds
- C40B40/06—Libraries containing nucleotides or polynucleotides, or derivatives thereof
- C40B40/08—Libraries containing RNA or DNA which encodes proteins, e.g. gene libraries
-
- C—CHEMISTRY; METALLURGY
- C40—COMBINATORIAL TECHNOLOGY
- C40B—COMBINATORIAL CHEMISTRY; LIBRARIES, e.g. CHEMICAL LIBRARIES
- C40B30/00—Methods of screening libraries
- C40B30/04—Methods of screening libraries by measuring the ability to specifically bind a target molecule, e.g. antibody-antigen binding, receptor-ligand binding
Landscapes
- Chemical & Material Sciences (AREA)
- Organic Chemistry (AREA)
- Health & Medical Sciences (AREA)
- Molecular Biology (AREA)
- Life Sciences & Earth Sciences (AREA)
- Biochemistry (AREA)
- Chemical Kinetics & Catalysis (AREA)
- General Chemical & Material Sciences (AREA)
- Medicinal Chemistry (AREA)
- General Health & Medical Sciences (AREA)
- Immunology (AREA)
- Genetics & Genomics (AREA)
- Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)
Abstract
The invention discloses a nucleic acid encoding compound, and a compound library, composed of non-natural nucleotides. By introducing five artificial bases (Z, P, S, B and S'), the nucleic acid encoding compounds and compound libraries of the invention overcome the limitations of nucleic acid encoded compound libraries that use natural bases.
Description
Technical Field
The invention relates in particular to a nucleic acid encoding compound and/or compound library composed of non-natural nucleotides.
Background
In new drug development, high-throughput screening against biological targets is one of the main ways to obtain lead compounds quickly. However, traditional high-throughput screening of individual molecules takes a long time, requires heavy investment in equipment, and is limited to libraries on the order of millions of compounds; building such libraries takes decades of accumulation, which limits both the efficiency and the likelihood of discovering lead compounds. The recently developed DNA-encoded compound library technologies (WO2005058479, WO2018166532, CN103882532) combine combinatorial chemistry with molecular biology. Each compound is labeled with a DNA tag at the molecular level, so a library of up to hundreds of millions of compounds can be synthesized in a very short time and each compound can be identified by gene sequencing. This greatly increases library size and synthesis efficiency, and the technology has become the trend for next-generation compound library screening. DNA-encoded compound library technology is beginning to be widely used in the pharmaceutical industry and has produced many positive results (Accounts of Chemical Research,2014,47, 1247-.
As the technology has been applied more widely, the DNA coding tag has shown limitations in screening certain biological targets: 1) When the disease-regulating target is a protein that interacts with DNA sequences (such as a transcription factor) or a ribonucleic acid (RNA), conventional screening of DNA-encoded compound libraries generates more background binding signals. These signals may arise from the affinity of the compound's DNA tag for the transcription factor protein, or from hybridization of the DNA tag to the RNA target, rather than from binding of the compound structure itself to the biological target. 2) When conventional screening of DNA-encoded compound libraries is applied to certain biological samples (for example, screening against membrane protein targets in situ on living cells), endogenous genomic DNA from the sample tends to interfere with amplification and detection of the DNA tags of the enriched molecules, which reduces amplification efficiency and produces some false-positive amplification signals from mismatches.
Shuichi Hoshika et al. disclose a nucleic acid coding system based on non-natural bases (Science, 2019, 363: 884-887). The present invention applies these non-natural bases to DNA-encoded compound library technology, using a coding system different from that of natural coding nucleotides, thereby overcoming the above limitations and broadening the range of application of encoded compound library technology.
Disclosure of Invention
The invention discloses a nucleic acid encoding compound comprising a functional moiety and a nucleic acid moiety, wherein the bases of the nucleic acid moiety are selected from the non-natural bases Z, P, S, B and S'.
Further, the nucleic acid encoding compound further comprises a linking group, whereby the functional moiety and the nucleic acid moiety are linked by the linking group.
Further, the nucleic acid portion includes single-stranded nucleic acid and/or double-stranded nucleic acid.
Further, in the double-stranded nucleic acid, the base Z corresponds to the base P, the base S corresponds to the base B, and the base S' corresponds to the base B.
Further, the nucleic acid portion is composed of ribonucleotides and/or deoxyribonucleotides.
Further, the bases of the ribonucleotides are Z, P, S' and B, and the bases of the deoxyribonucleotides are Z, P, S and B.
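The pairing and alphabet rules above can be sketched in a few lines of Python. This is only an illustrative sketch: the function names `tokenize` and `complement`, the `partner` parameter, and the choice to write S' as a two-character token are assumptions made for the example and do not come from the patent.

```python
# Pairing rules stated in the text: Z pairs with P, S pairs with B, S' pairs with B.
# Alphabets stated in the text: deoxyribonucleotides use Z, P, S, B;
# ribonucleotides use Z, P, S', B.
ALL_BASES = {"Z", "P", "S", "S'", "B"}


def tokenize(seq):
    """Split a string such as "ZPS'B" into base tokens, keeping S' as one token."""
    tokens = []
    for ch in seq:
        if ch == "'":
            if not tokens or tokens[-1] != "S":
                raise ValueError("a prime mark may only follow S")
            tokens[-1] = "S'"
        else:
            tokens.append(ch)
    return tokens


def complement(seq, partner="dna"):
    """Return the position-by-position partner strand (not reversed).

    partner="dna" writes the partner of B as S; partner="rna" writes it as S'.
    """
    if partner not in ("dna", "rna"):
        raise ValueError("partner must be 'dna' or 'rna'")
    pairs = {
        "Z": "P",
        "P": "Z",
        "S": "B",
        "S'": "B",
        "B": "S" if partner == "dna" else "S'",
    }
    out = []
    for base in tokenize(seq):
        if base not in ALL_BASES:
            raise ValueError(f"not a non-natural base: {base}")
        out.append(pairs[base])
    return "".join(out)
```

For example, `complement("ZPSB")` gives the deoxyribonucleotide partner of a four-base tag, and passing `partner="rna"` writes the partner of B as S' instead of S.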
Further, the nucleic acid portion is more than 10 bp in length.
Further, nucleotides bearing natural bases may be inserted into the nucleic acid portion, provided that no run of 3 or more consecutive natural-base nucleotides is inserted and that natural-base nucleotides make up less than 30% of the total number of nucleotides in the nucleic acid portion.
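The length and natural-base constraints above lend themselves to a simple check. The following is a minimal sketch, assuming a tag is given as a list of base tokens; the function name `check_tag` and its parameters are hypothetical, and the natural bases are taken to be A, C, G, T and U.

```python
NON_NATURAL = {"Z", "P", "S", "B", "S'"}
NATURAL = {"A", "C", "G", "T", "U"}


def check_tag(tokens, min_len=10, max_natural_fraction=0.30, max_run=3):
    """Return True if a tag meets the three constraints stated in the text.

    tokens: list of base tokens, e.g. ["Z", "P", "A", "S'"].
    Constraints: length greater than min_len, no run of max_run or more
    consecutive natural bases, natural bases below max_natural_fraction.
    """
    for base in tokens:
        if base not in NON_NATURAL and base not in NATURAL:
            raise ValueError(f"unknown base: {base}")

    if len(tokens) <= min_len:
        return False

    run = 0       # length of the current stretch of natural bases
    natural = 0   # total number of natural bases seen
    for base in tokens:
        if base in NATURAL:
            run += 1
            natural += 1
            if run >= max_run:
                return False
        else:
            run = 0

    return natural / len(tokens) < max_natural_fraction
```

A tag of twelve non-natural bases passes, while the same tag with three adjacent natural bases, or with natural bases at 30% or more of its length, is rejected.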
Further, the nucleic acid encoding compound has the structure shown in formula I:
wherein:
X is an atom or molecular framework having a valence of at least 3;
L1 is a linking group to which the 5' end of a nucleic acid can be linked;
L2 is a linking group to which the 3' end of a nucleic acid can be linked;
Z1 is a first nucleic acid moiety;
Z2 is a second nucleic acid moiety, wherein the bases of the second nucleic acid moiety at least partially correspond to the bases of the first nucleic acid moiety;
M is a linking group to which a functional moiety may be attached;
Y is a functional moiety consisting of one or more synthons.
Further, X is a carbon atom.
Still further, M is selected from an alkylene chain or a poly(ethylene glycol) chain.
Further, L1 and L2 are selected from alkylene chains or poly(ethylene glycol) chains.
More specifically, the alkylene chain or poly(ethylene glycol) chain bears a phosphate linker group.
Further, Z1 and Z2 each further comprise a PCR primer binding site sequence.
The invention also discloses a library of nucleic acid encoding compounds comprising at least 10^2 different nucleic acid encoding compounds as described above.
Obviously, many modifications, substitutions, and variations are possible in light of the above teachings of the invention, without departing from the basic technical spirit of the invention, as defined by the following claims.
The present invention will be described in further detail with reference to the following examples. This should not be understood as limiting the scope of the above-described subject matter of the present invention to the following examples. All the technologies realized based on the above contents of the present invention belong to the scope of the present invention.
Drawings
FIG. 1 shows compound 550, known to bind to TAR RNA, having two sites to which nucleic acid codes can be ligated.
Detailed Description
Example 1 Construction of nucleic acid encoding compounds
1) Compound 550 (FIG. 1, Ki = 0.039 μM), which binds TAR RNA, was used to construct nucleic acid encoding compounds 1-3, each carrying a nucleic acid tag, and control compounds 4-6, following the method described in WO2005058479 or WO2018166532. The compounds have the following structures:
Number | Compound moiety | Nucleic acid tag |
---|---|---|
1 | Compound 550 | Natural DNA sequence with TAR RNA binding function |
2 | Compound 550 | Non-natural nucleic acid sequence of the invention |
3 | Compound 550 | Natural DNA sequence with no binding to TAR RNA |
4 | None | Natural DNA sequence with TAR RNA binding function |
5 | None | Non-natural nucleic acid sequence of the invention |
6 | None | Natural DNA sequence with no binding to TAR RNA |
Example 2 Verification of the screening method for nucleic acid encoding compounds of the invention
The 3' end of the TAR RNA sequence was modified with biotin for immobilization. The end of the Tat small peptide was labeled with FAM (FAM-tat). TAR RNA was immobilized on neutravidin magnetic beads and incubated with FAM-tat. After one wash, the TAR RNA and magnetic beads were heated and the FAM fluorescence in the supernatant was measured, confirming that FAM-tat bound the immobilized TAR RNA and that the TAR RNA was active.
Compounds 1-6 were incubated with TAR RNA in screening buffer (50 mM Tris, 80 mM KCl, 0.3 mg/mL ssDNA, 0.01% Tween 20, pH 7.5) for 1 h. Neutravidin magnetic beads were then added and incubated at room temperature for 30 min to immobilize the TAR RNA. The beads were washed with screening buffer, transferred to elution buffer (50 mM Tris, 160 mM KCl, pH 7.5) and heated at 95 °C for 10 min, and the supernatant was collected. The nucleic acid content of the eluate was quantified by qPCR, and the degree of nucleic acid enrichment was compared between groups.
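The text does not say how the qPCR readings are turned into a between-group comparison. One common approach, shown here only as a hedged sketch, is to convert a difference in threshold cycle (Ct) into a fold difference, assuming the amplicon roughly doubles each cycle. The function name `fold_enrichment`, the `efficiency` parameter and the Ct values in the usage note are hypothetical and are not data from the patent.

```python
def fold_enrichment(ct_sample, ct_reference, efficiency=2.0):
    """Relative amount of tag in a sample versus a reference group.

    Assumes each PCR cycle multiplies the amplicon by `efficiency`
    (2.0 means perfect doubling). A lower Ct means more starting template,
    so a sample whose Ct is 3 cycles below the reference holds about
    efficiency ** 3 times as much tag.
    """
    if efficiency <= 1.0:
        raise ValueError("efficiency must be greater than 1")
    return efficiency ** (ct_reference - ct_sample)
```

For instance, `fold_enrichment(20.0, 23.0)` compares a hypothetical eluate with Ct 20 against a control eluate with Ct 23. Whether non-natural base tags amplify with the same efficiency as natural DNA tags would need to be checked before comparing groups this way.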
Compounds 1-6 were spiked into a conventional DNA-encoded compound library at 10^5 to 10^9 molecules and screened against TAR RNA using the procedure described above. The encoded compounds recovered from the first round were subjected to a second round of screening against TAR RNA, and this was repeated until the total number of eluted molecules was about 10^8. The recovered encoded compounds were amplified by PCR and sequenced, the sequencing results were decoded, and the final enriched copy numbers of compounds 1-6 were compared.
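The final comparison of enriched copy numbers can be illustrated with a short calculation. This is a sketch under assumptions: the patent does not specify a formula, so the ratio of output fraction to input fraction used here is one common convention, and the names `enrichment_factor` and `rank_by_enrichment` are hypothetical.

```python
def enrichment_factor(count_out, total_out, count_in, total_in):
    """Fraction of a compound after screening divided by its fraction before.

    Computed as (count_out / total_out) / (count_in / total_in), rearranged
    into a single division.
    """
    if count_in <= 0 or total_out <= 0:
        raise ValueError("count_in and total_out must be positive")
    return (count_out * total_in) / (count_in * total_out)


def rank_by_enrichment(counts_out, counts_in):
    """Sort compounds from most to least enriched.

    counts_out, counts_in: dicts mapping a compound label to its decoded
    copy number after and before screening. Totals are taken over the
    labels supplied, which is a simplification of a full library.
    """
    total_out = sum(counts_out.values())
    total_in = sum(counts_in.values())
    factors = {
        name: enrichment_factor(counts_out.get(name, 0), total_out,
                                counts_in[name], total_in)
        for name in counts_in
    }
    return sorted(factors.items(), key=lambda item: item[1], reverse=True)
```

A factor above 1 means the compound became more abundant relative to the rest of the pool during screening. Comparing the factors for compounds 1-6 would indicate how much of the enrichment tracks the tag rather than compound 550.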
Claims (15)
2. The compound of claim 1, wherein: the nucleic acid encoding compound further comprises a linking group by which the functional moiety and the nucleic acid moiety are linked.
3. The compound of claim 1, wherein: the nucleic acid portion includes single-stranded nucleic acid and/or double-stranded nucleic acid.
4. A compound according to claim 3, characterized in that: in the double-stranded nucleic acid, the base Z corresponds to the base P, the base S corresponds to the base B, and the base S' corresponds to the base B.
5. The compound of claim 1, wherein: the nucleic acid portion is composed of ribonucleotides and/or deoxyribonucleotides.
6. The compound of claim 5, wherein: the bases of the ribonucleotides are Z, P, S' and B, and the bases of the deoxyribonucleotides are Z, P, S and B.
7. The compound of claim 1, wherein: the nucleic acid part has a nucleic acid length of more than 10 bp.
8. The compound of claim 5, wherein: the nucleic acid portion may be inserted with nucleotides having a natural base, but not with 3 or more consecutive natural base nucleotides, and the number of nucleotides of the natural base is less than 30% of the total number of nucleotides of the nucleic acid portion.
9. The compound of claim 1, wherein: the structure of the nucleic acid coding compound is shown as the formula I:
wherein:
X is an atom or molecular framework having a valence of at least 3;
L1 is a linking group to which the 5' end of a nucleic acid can be linked;
L2 is a linking group to which the 3' end of a nucleic acid can be linked;
Z1 is a first nucleic acid moiety;
Z2 is a second nucleic acid moiety, wherein the bases of the second nucleic acid moiety at least partially correspond to the bases of the first nucleic acid moiety;
M is a linking group to which a functional moiety may be attached;
Y is a functional moiety consisting of one or more synthons.
10. The compound of claim 9, wherein: X is a carbon atom.
11. The compound of claim 9, wherein: M is selected from an alkylene chain or a poly(ethylene glycol) chain.
12. The compound of claim 9, wherein: L1 and L2 are selected from alkylene chains or poly(ethylene glycol) chains.
13. The compound of claim 12, wherein: the alkylene chain or poly(ethylene glycol) chain bears a phosphate linker group.
14. The compound of claim 9, wherein: Z1 and Z2 each further comprise a PCR primer binding site sequence.
15. A library of nucleic acid encoding compounds comprising at least 10^2 different nucleic acid encoding compounds according to any one of claims 1-14.
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201911030067 | 2019-10-29 | ||
CN2019110300679 | 2019-10-29 |
Publications (1)
Publication Number | Publication Date |
---|---|
CN112746332A (en) | 2021-05-04 |
Family
ID=75648794
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202011177047.7A (Pending; published as CN112746332A) | Nucleic acid coding compound library composed of non-natural nucleotides | 2019-10-29 | 2020-10-29 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112746332A (en) |
-
2020
- 2020-10-29 CN CN202011177047.7A patent/CN112746332A/en active Pending
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||