CN112746332A - Nucleic acid coding compound library composed of non-natural nucleotides - Google Patents


Info

Publication number
CN112746332A
Authority
CN
China
Prior art keywords
nucleic acid
compound
base
nucleotides
natural
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011177047.7A
Other languages
Chinese (zh)
Inventor
李进
巩晓明
窦登峰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hitgen Inc
Original Assignee
Hitgen Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hitgen Inc
Publication of CN112746332A
Legal status: Pending

Classifications

    • C CHEMISTRY; METALLURGY
    • C40 COMBINATORIAL TECHNOLOGY
    • C40B COMBINATORIAL CHEMISTRY; LIBRARIES, e.g. CHEMICAL LIBRARIES
    • C40B40/00 Libraries per se, e.g. arrays, mixtures
    • C40B40/04 Libraries containing only organic compounds
    • C40B40/06 Libraries containing nucleotides or polynucleotides, or derivatives thereof
    • C40B40/08 Libraries containing RNA or DNA which encodes proteins, e.g. gene libraries
    • C CHEMISTRY; METALLURGY
    • C40 COMBINATORIAL TECHNOLOGY
    • C40B COMBINATORIAL CHEMISTRY; LIBRARIES, e.g. CHEMICAL LIBRARIES
    • C40B30/00 Methods of screening libraries
    • C40B30/04 Methods of screening libraries by measuring the ability to specifically bind a target molecule, e.g. antibody-antigen binding, receptor-ligand binding

Landscapes

  • Chemical & Material Sciences (AREA)
  • Organic Chemistry (AREA)
  • Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biochemistry (AREA)
  • Chemical Kinetics & Catalysis (AREA)
  • General Chemical & Material Sciences (AREA)
  • Medicinal Chemistry (AREA)
  • General Health & Medical Sciences (AREA)
  • Immunology (AREA)
  • Genetics & Genomics (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)

Abstract

The invention discloses a nucleic acid encoding compound and a compound library composed of non-natural nucleotides. By introducing five artificial bases, Z, P, S, B and S', the nucleic acid encoding compounds and compound libraries of the invention overcome the limitations of nucleic acid encoding compound libraries that use natural bases.

Description

Nucleic acid coding compound library composed of non-natural nucleotides
Technical Field
The invention relates in particular to nucleic acid encoding compounds and/or compound libraries composed of non-natural nucleotides.
Background
In new drug development, high-throughput screening against biological targets is one of the main ways to obtain lead compounds quickly. However, traditional high-throughput screening of individual molecules is slow and requires a large investment in equipment; the libraries it screens are limited in size (millions of compounds) and take decades to accumulate, which limits the efficiency and likelihood of discovering lead compounds. The recently developed DNA-encoded compound library technologies (WO2005058479, WO2018166532, CN103882532) combine combinatorial chemistry with molecular biology: each compound is labeled with a DNA tag at the molecular level, so a library of hundreds of millions of compounds can be synthesized in a very short time, and each compound can be identified by gene sequencing. This greatly increases library size and synthesis efficiency, making the technology the trend for next-generation compound library screening. DNA-encoded compound library technology is beginning to be widely used in the pharmaceutical industry and has produced many positive results (Accounts of Chemical Research,2014,47, 1247-.
As the technology has been applied more widely, the DNA coding tag has also shown limitations in screening certain biological targets: 1) When the disease-regulating target is a protein that interacts with DNA sequences (such as a transcription factor) or a ribonucleic acid (RNA), conventional screening of a DNA-encoded compound library produces more background binding signal. This signal may arise from the affinity of the compound's DNA tag for the transcription factor protein, or from hybridization of the DNA tag to the RNA target, rather than from binding of the compound structure itself to the biological target. 2) When conventional DNA-encoded compound library screening is applied to certain biological samples (for example, screening against membrane protein targets in situ on living cells), endogenous genomic DNA from the sample tends to interfere with amplification and detection of the DNA tags of the enriched molecules, which reduces amplification efficiency and produces false-positive amplification signals from mismatches.
Shuichi Hoshika et al. disclosed a nucleic acid coding system based on non-natural bases (Science, 2019, 363: 884-887). The present invention applies these non-natural bases to DNA-encoded compound library technology, using a coding system distinct from natural coding nucleotides, thereby overcoming the above limitations and broadening the range of application of encoded compound library technology.
Disclosure of Invention
The invention discloses a nucleic acid encoding compound comprising a functional part and a nucleic acid part, wherein the bases of the nucleic acid part are selected from the non-natural bases Z, P, S, B and S';
wherein the base Z is
Figure BDA0002749223140000021
Base P is
Figure BDA0002749223140000022
Base S is
Figure BDA0002749223140000023
Base B is
Figure BDA0002749223140000024
The base S' is
Figure BDA0002749223140000025
Further, the nucleic acid encoding compound also comprises a linking group, through which the functional moiety and the nucleic acid moiety are linked.
Further, the nucleic acid portion includes single-stranded nucleic acid and/or double-stranded nucleic acid.
Further, in the double-stranded nucleic acid, the base Z corresponds to the base P, the base S corresponds to the base B, and the base S' corresponds to the base B.
Further, the nucleic acid portion is composed of ribonucleotides and/or deoxyribonucleotides.
Further, the bases of the ribonucleotides are Z, P, S' and B, and the bases of the deoxyribonucleotides are Z, P, S and B.
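The pairing rules and the two nucleotide alphabets stated above can be illustrated with a short Python sketch. It is not part of the disclosure: the function names `tokenize` and `complement`, the one-letter string encoding, and the reading of "corresponds to" as base pairing are assumptions made for illustration. Because B pairs with both S and S', the sketch returns S for deoxyribonucleotide strands and S' for ribonucleotide strands.

```python
# Illustrative sketch only; names and string encoding are hypothetical.
# Pairing taken from the text: Z-P, S-B, S'-B.
DNA_BASES = {"Z", "P", "S", "B"}    # deoxyribonucleotide bases per the text
RNA_BASES = {"Z", "P", "S'", "B"}   # ribonucleotide bases per the text


def tokenize(seq):
    """Split a string such as "ZPS'B" into base tokens, keeping S' as one token."""
    tokens = []
    i = 0
    while i < len(seq):
        if seq[i] == "S" and i + 1 < len(seq) and seq[i + 1] == "'":
            tokens.append("S'")
            i += 2
        else:
            tokens.append(seq[i])
            i += 1
    return tokens


def complement(seq, kind="dna"):
    """Return the position-by-position pairing partner of each base in seq."""
    if kind not in ("dna", "rna"):
        raise ValueError("kind must be 'dna' or 'rna'")
    alphabet = DNA_BASES if kind == "dna" else RNA_BASES
    # B pairs with S in a DNA strand and with S' in an RNA strand (assumption).
    partner_of_b = "S" if kind == "dna" else "S'"
    pair = {"Z": "P", "P": "Z", "S": "B", "S'": "B", "B": partner_of_b}
    out = []
    for base in tokenize(seq):
        if base not in alphabet:
            raise ValueError(f"base {base!r} is not allowed in a {kind} strand")
        out.append(pair[base])
    return "".join(out)


print(complement("ZPSB"))               # DNA strand
print(complement("ZPS'B", kind="rna"))  # RNA strand
```

The function pairs bases position by position and does not reverse the strand; a reverse complement would additionally reverse the token list.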
Further, the nucleic acid part is more than 10 bp long.
Further, nucleotides bearing natural bases may be inserted into the nucleic acid part, provided that no run of 3 or more consecutive natural-base nucleotides is inserted and that natural-base nucleotides make up less than 30% of the total number of nucleotides in the nucleic acid part.
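The two limits on natural-base insertions (no run of 3 or more, and less than 30% of all nucleotides) can be checked mechanically. The following Python sketch is an illustration under assumptions, not the applicant's method: the function name `natural_insertions_ok`, the token-list input format, and the choice of A, C, G, T and U as the natural bases are hypothetical.

```python
# Illustrative sketch only; names and input format are hypothetical.
NATURAL_BASES = {"A", "C", "G", "T", "U"}  # assumed set of natural bases


def natural_insertions_ok(bases, max_run=2, max_fraction=0.30):
    """Check a list of base tokens, e.g. ["Z", "P", "A", "S'", "B"].

    Returns True only if no run of natural bases is longer than max_run
    and the natural bases make up strictly less than max_fraction of all bases.
    """
    if not bases:
        return False
    run = 0
    natural_count = 0
    for base in bases:
        if base in NATURAL_BASES:
            run += 1
            natural_count += 1
            if run > max_run:  # 3 or more consecutive natural bases
                return False
        else:
            run = 0
    return natural_count / len(bases) < max_fraction


print(natural_insertions_ok(list("ZPAZPBSZPB")))      # 1 natural base in 10
print(natural_insertions_ok(list("ZPAAAZPBSZPBZZ")))  # contains a run of 3
```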
Further, the nucleic acid encoding compound has the structure shown in formula I:
Figure BDA0002749223140000026
wherein:
X is an atom or molecular framework having a valence of at least 3;
L1 is a linking group to which the 5' end of a nucleic acid can be linked;
L2 is a linking group to which the 3' end of a nucleic acid can be linked;
Z1 is a first nucleic acid moiety;
Z2 is a second nucleic acid moiety, wherein the bases of the second nucleic acid moiety at least partially correspond to the bases of the first nucleic acid moiety;
M is a linking group to which a functional moiety may be attached;
Y is a functional moiety consisting of one or more synthons.
Further, X is a carbon atom.
Still further, M is selected from an alkylene chain or a poly(ethylene glycol) chain.
Further, L1 and L2 are selected from alkylene chains or poly(ethylene glycol) chains.
Further specifically, the alkylene chain or poly(ethylene glycol) chain bears a phosphate linker group.
Further, Z1 and Z2 each further comprise a PCR primer binding site sequence.
The invention also discloses a library of nucleic acid encoding compounds comprising at least 10^2 different nucleic acid encoding compounds as described above.
Obviously, many modifications, substitutions, and variations are possible in light of the above teachings of the invention, without departing from the basic technical spirit of the invention, as defined by the following claims.
The present invention will be described in further detail with reference to the following examples. This should not be understood as limiting the scope of the above-described subject matter of the present invention to the following examples. All the technologies realized based on the above contents of the present invention belong to the scope of the present invention.
Drawings
FIG. 1 shows compound 550, known to bind to TAR RNA, having two sites to which nucleic acid codes can be ligated.
Detailed Description
Example 1 Construction of nucleic acid encoding compounds
1) Starting from compound 550 (FIG. 1, Ki = 0.039 μM), which binds TAR RNA, nucleic acid encoding compounds 1-3 bearing a nucleic acid tag and control compounds 4-6 were constructed by the method described in WO2005058479 or WO2018166532. The compounds are as follows:
No.   Compound moiety   Nucleic acid tag
1     Compound 550      Natural DNA sequence that binds TAR RNA
2     Compound 550      Non-natural nucleic acid sequence of the invention
3     Compound 550      Natural DNA sequence that does not bind TAR RNA
4     None              Natural DNA sequence that binds TAR RNA
5     None              Non-natural nucleic acid sequence of the invention
6     None              Natural DNA sequence that does not bind TAR RNA
Example 2 Verification of the screening method for nucleic acid encoding compounds of the invention
The 3' end of the TAR RNA sequence was modified with biotin for immobilization. The end of the Tat small peptide was labeled with FAM (FAM-tat). TAR RNA was immobilized on neutravidin magnetic beads and incubated with FAM-tat; after one wash, the TAR RNA and beads were heated and the FAM fluorescence in the supernatant was measured, confirming that FAM-tat had bound the immobilized TAR RNA and that the TAR RNA was active.
Compounds 1-6 were incubated with TAR RNA in screening buffer (50 mM Tris, 80 mM KCl, 0.3 mg/mL ssDNA, 0.01% Tween 20, pH 7.5) for 1 h. Neutravidin magnetic beads were then added and incubated at room temperature for 30 min to immobilize the TAR RNA. The beads were washed with screening buffer, transferred to elution buffer (50 mM Tris, 160 mM KCl, pH 7.5) and heated at 95 °C for 10 min, and the supernatant was collected. The nucleic acid content of the eluate was quantified by qPCR, and the degree of nucleic acid enrichment was compared between groups.
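The text does not say how the qPCR readouts were compared. One common convention, shown here only as a hedged sketch, is the relative quantity E^(Ct_reference - Ct_sample), where E is the per-cycle amplification factor (2 for ideal doubling). The function name `relative_quantity` and the Ct values in the example are hypothetical and are not data from this document.

```python
# Illustrative sketch only; the formula is the generic delta-Ct convention,
# not a calculation stated in the source. Ct values below are made up.
def relative_quantity(ct_sample, ct_reference, efficiency=2.0):
    """Relative template amount of a sample versus a reference group.

    efficiency is the fold increase per PCR cycle (2.0 means perfect doubling).
    A lower Ct means more starting template, so the result is greater than 1
    when the sample crosses the threshold earlier than the reference.
    """
    return efficiency ** (ct_reference - ct_sample)


# Hypothetical example: the sample crosses threshold 3 cycles before the control.
print(relative_quantity(ct_sample=20.0, ct_reference=23.0))
```

Under this convention, a ratio well above 1 for a compound relative to its tag-only control would point to enrichment driven by the compound moiety rather than by the tag.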
Compounds 1-6 were spiked into a conventional DNA-encoded compound library at 10^5 to 10^9 molecules and screened against TAR RNA following the steps described above. The encoded compounds recovered from the first round were subjected to a second round of screening against TAR RNA, and rounds were repeated until the total number of eluted molecules was about 10^8. The recovered encoded compounds were amplified by PCR and sequenced, the sequencing results were decoded, and the final enriched copy numbers of compounds 1-6 were compared.
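A simple way to compare final copy numbers across compounds is to normalize each compound's share of the output by its share of the input. The Python sketch below assumes this normalization, which the source does not specify. The function name `enrichment_folds` and the copy numbers are hypothetical, and for brevity the shares are computed over the listed compounds only rather than over the whole library.

```python
# Illustrative sketch only; the normalization and all numbers are assumptions.
def enrichment_folds(input_copies, output_copies):
    """Fold enrichment = (share of output) / (share of input) per compound.

    input_copies:  dict mapping compound id -> molecules added before screening
    output_copies: dict mapping compound id -> decoded sequence copies after screening
    Compounds missing from output_copies are treated as zero copies.
    """
    total_in = sum(input_copies.values())
    total_out = sum(output_copies.values())
    if total_in <= 0 or total_out <= 0:
        raise ValueError("input and output totals must be positive")
    folds = {}
    for name, n_in in input_copies.items():
        if n_in <= 0:
            raise ValueError(f"input copies for {name!r} must be positive")
        share_in = n_in / total_in
        share_out = output_copies.get(name, 0) / total_out
        folds[name] = share_out / share_in
    return folds


# Hypothetical counts for two compounds added in equal amounts.
print(enrichment_folds({"2": 100, "5": 100}, {"2": 30, "5": 10}))
```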

Claims (15)

1. A nucleic acid encoding compound comprising a functional portion and a nucleic acid portion, wherein the bases of the nucleic acid portion are selected from the group consisting of the non-natural bases Z, P, S, B and S';
wherein the base Z is
Figure RE-FDA0002939332440000011
Base P is
Figure RE-FDA0002939332440000012
Base S is
Figure RE-FDA0002939332440000013
Base B is
Figure RE-FDA0002939332440000014
The base S' is
Figure RE-FDA0002939332440000015
2. The compound of claim 1, wherein: the nucleic acid encoding compound further comprises a linking group by which the functional moiety and the nucleic acid moiety are linked.
3. The compound of claim 1, wherein: the nucleic acid portion includes single-stranded nucleic acid and/or double-stranded nucleic acid.
4. A compound according to claim 3, characterized in that: in the double-stranded nucleic acid, the base Z corresponds to the base P, the base S corresponds to the base B, and the base S' corresponds to the base B.
5. The compound of claim 1, wherein: the nucleic acid portion is composed of ribonucleotides and/or deoxyribonucleotides.
6. The compound of claim 5, wherein: the bases of the ribonucleotides are Z, P, S' and B, and the bases of the deoxyribonucleotides are Z, P, S and B.
7. The compound of claim 1, wherein: the nucleic acid part has a nucleic acid length of more than 10 bp.
8. The compound of claim 5, wherein: the nucleic acid portion may be inserted with nucleotides having a natural base, but not with 3 or more consecutive natural base nucleotides, and the number of nucleotides of the natural base is less than 30% of the total number of nucleotides of the nucleic acid portion.
9. The compound of claim 1, wherein: the structure of the nucleic acid coding compound is shown as the formula I:
Figure RE-FDA0002939332440000016
wherein:
X is an atom or molecular framework having a valence of at least 3;
L1 is a linking group to which the 5' end of a nucleic acid can be linked;
L2 is a linking group to which the 3' end of a nucleic acid can be linked;
Z1 is a first nucleic acid moiety;
Z2 is a second nucleic acid moiety, wherein the bases of the second nucleic acid moiety at least partially correspond to the bases of the first nucleic acid moiety;
M is a linking group to which a functional moiety may be attached;
Y is a functional moiety consisting of one or more synthons.
10. The compound of claim 9, wherein: X is a carbon atom.
11. The compound of claim 9, wherein: M is selected from an alkylene chain or a poly(ethylene glycol) chain.
12. The compound of claim 9, wherein: L1 and L2 are selected from alkylene chains or poly(ethylene glycol) chains.
13. The compound of claim 12, wherein: the alkylene chain or poly(ethylene glycol) chain bears a phosphate linker group.
14. The compound of claim 9, wherein: Z1 and Z2 each further comprise a PCR primer binding site sequence.
15. A library of nucleic acid encoding compounds comprising at least 10^2 different nucleic acid encoding compounds according to any one of claims 1-14.
CN202011177047.7A 2019-10-29 2020-10-29 Nucleic acid coding compound library composed of non-natural nucleotides Pending CN112746332A (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201911030067 2019-10-29
CN2019110300679 2019-10-29

Publications (1)

Publication Number Publication Date
CN112746332A true CN112746332A (en) 2021-05-04

Family

ID=75648794

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011177047.7A Pending CN112746332A (en) 2019-10-29 2020-10-29 Nucleic acid coding compound library composed of non-natural nucleotides

Country Status (1)

Country Link
CN (1) CN112746332A (en)


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination