CN115506036A - DNA coding compound library initial fragment and preparation and application thereof - Google Patents

Info

Publication number
CN115506036A
Authority
CN
China
Prior art keywords
chemical
group
ssdna
compound
multifunctional
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210708699.1A
Other languages
Chinese (zh)
Inventor
王璇
索延瑞
张雨晴
刘小红
郑明月
蒋华良
陆晓杰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Institute of Materia Medica of CAS
Original Assignee
Shanghai Institute of Materia Medica of CAS
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Institute of Materia Medica of CAS
Publication of CN115506036A
Legal status: Pending

Classifications

    • CCHEMISTRY; METALLURGY
    • C40COMBINATORIAL TECHNOLOGY
    • C40BCOMBINATORIAL CHEMISTRY; LIBRARIES, e.g. CHEMICAL LIBRARIES
    • C40B50/00Methods of creating libraries, e.g. combinatorial synthesis
    • C40B50/08Liquid phase synthesis, i.e. wherein all library building blocks are in liquid phase or in solution during library creation; Particular methods of cleavage from the liquid support
    • C40B50/10Liquid phase synthesis, i.e. wherein all library building blocks are in liquid phase or in solution during library creation; Particular methods of cleavage from the liquid support involving encoding steps
    • CCHEMISTRY; METALLURGY
    • C40COMBINATORIAL TECHNOLOGY
    • C40BCOMBINATORIAL CHEMISTRY; LIBRARIES, e.g. CHEMICAL LIBRARIES
    • C40B40/00Libraries per se, e.g. arrays, mixtures
    • C40B40/04Libraries containing only organic compounds
    • C40B40/06Libraries containing nucleotides or polynucleotides, or derivatives thereof
    • C40B40/08Libraries containing RNA or DNA which encodes proteins, e.g. gene libraries

Abstract

The invention provides a method for constructing a novel starting fragment of a DNA-encoded compound library. The construction method comprises the following step: providing a starting fragment having a coding region and a functional region, wherein the functional region has at least one reaction site and the coding region consists of two complementary single strands of deoxyribonucleotides. The method provided by the invention is simple to operate, uses mild conditions and gives a high yield, and it helps advance development in the field of medicine.

Description

DNA coding compound library initial fragment and preparation and application thereof
Technical Field
The invention relates to the technical field of biology, in particular to a DNA coding compound library initial fragment and preparation and application thereof.
Background
In the early 1990s, spurred by the Human Genome Project, medicinal chemists deepened their understanding of biological targets and developed various methods for uncovering potential druggable targets. With a large number of targets in hand, they hoped to use screening to identify lead compounds with physiological activity against a given target; such a lead compound then undergoes structural optimization, pharmacological and pathological studies, and clinical trials before finally becoming a therapeutic drug.
To increase the success rate of screening, a high-quality, large-scale compound library is particularly important. In theory, the larger the library and the more uniform its distribution in chemical space (that is, the greater its diversity), the higher the hit rate for active molecules. At present, high-throughput screening, which plays a major role in new drug research and development, relies on a huge compound library and screens a target in an automated manner. However, building a molecular library suitable for high-throughput screening is time-consuming, labor-intensive and costly, and requires decades of accumulated chemical synthesis and maintenance. Even so, such libraries contain a limited number of compounds, up to about a million. In addition, instrument maintenance and the amount of target protein required for screening are significant expenses. As a result, only a few large pharmaceutical companies can carry out new drug research and development by means of high-throughput screening.
In 1988, Furka et al. proposed the concept of "combinatorial chemistry", which developed rapidly in the early 1990s. Combinatorial chemistry differs from traditional chemical synthesis in that it is based on the principle of combination: different building blocks are covalently linked in a systematic, iterative manner within a short time, producing a large array of structurally related yet ordered compounds. Combinatorial techniques were soon applied to the design, synthesis and screening of new drugs. To improve screening and identification efficiency and reduce labor and capital costs, researchers incorporated a synchronous encoding strategy into combinatorial synthesis. This strategy requires a starting fragment bearing multiple functional groups to serve as a hub connecting the target molecule and the coding molecule; during split-and-pool synthesis, each time a building block of the target molecule is installed, a coding block representing that building block is attached to the corresponding coding region. After library construction and target screening are complete, the structure of an active molecule can be determined by reading the sequence of its coding region. DNA-encoded compound library technology, in which sequenceable DNA is used as the coding region, was proposed in 1992 by Professors Brenner and Lerner of the Scripps Research Institute and has advanced with the rapid development of high-throughput nucleic acid sequencing.
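The encoded split-and-pool idea described above can be illustrated with a minimal Python sketch. It is not the procedure of the invention: the building-block names (`BB1a`, `BB2c`, ...) and the three-letter tags are hypothetical placeholders, and the sketch simply enumerates every combination of one building block per cycle together with the concatenated code that would identify it.

```python
from itertools import product


def split_and_pool(cycles):
    """Enumerate (compound, code) pairs for an encoded split-and-pool synthesis.

    `cycles` is a list with one dict per synthesis cycle; each dict maps a
    building-block name to the tag that encodes it. All names and tags here
    are hypothetical and serve only to illustrate the bookkeeping.
    """
    library = []
    for choice in product(*(cycle.items() for cycle in cycles)):
        blocks = [name for name, _ in choice]
        tags = [tag for _, tag in choice]
        library.append(("-".join(blocks), "".join(tags)))
    return library


# Two cycles with 2 and 3 building blocks give 2 x 3 = 6 encoded compounds.
cycles = [
    {"BB1a": "ACG", "BB1b": "TGC"},
    {"BB2a": "GAT", "BB2b": "CTA", "BB2c": "AGT"},
]
library = split_and_pool(cycles)

# Reading a code back recovers the synthetic history of the compound.
decode = {code: compound for compound, code in library}
```

The library size is the product of the number of building blocks used in each cycle, which is why a few cycles suffice to reach very large libraries.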
DNA-encoded compound library technology combines features of combinatorial chemistry and molecular biology, allowing large-scale libraries to be constructed in a short time and to be screened and characterized at the 10⁻¹² scale. Moreover, maintaining and operating such a library is very simple, which greatly shortens the research and development cycle and reduces its cost.
DNA-encoded compound library technology also requires a multifunctional starting fragment to link the target molecule and the coding region. The starting fragment currently in use is a tertiary carbon structure containing one amino group and two phosphate groups. The amino group is used for synthesis of the target molecule, while the phosphate groups link double-stranded DNA that serves as the coding building block. Compared with single-stranded DNA, double-stranded DNA is compatible with more types of chemical reaction, so a more diverse compound library can be synthesized. Its disadvantages are nevertheless obvious: first, its PCR amplification efficiency is lower than that of single-stranded DNA, which makes sequencing more difficult; second, certain targeted screening modes, such as cross-linking screening, cannot be carried out for some special targets. In addition, the synthetic route to the tertiary carbon starting fragment is complex, and flexible spatial and structural docking of the target-molecule region with the coding region cannot be achieved.
To overcome these disadvantages of the prior art, the field needs an optimized starting fragment for DNA-encoded compound libraries that is simple to prepare, uses mild reaction conditions, gives a high yield, and allows flexible docking of the target region with the coding region as well as interconversion between single-stranded and double-stranded DNA, so as to promote wide application of the technology in the field of medicine.
Disclosure of Invention
In view of the above, the present invention aims to provide a novel method for constructing an initial fragment structure of a DNA-encoding compound library, which has the advantages of simple operation steps, mild reaction conditions and high yield, and can promote the wide development and application of DNA-encoding compound library technology in the field of medicine.
In a first aspect, the present invention provides a method of constructing a starting fragment of a DNA-encoded compound library, said method comprising the steps of:
(1) providing a multifunctional skeleton compound L;
(2) reacting the multifunctional skeleton compound L with compound A₂-C₂-ssDNA₁, compound A₃-C₃-ssDNA₂ and compound R-C₁-A₁ to form a compound of formula I, i.e., the starting fragment of the DNA-encoded compound library;
Figure BDA0003706324340000031
wherein,
R is a reaction site of the functional region of the starting fragment;
L is a multifunctional skeleton;
C₁ is a first chemical chain linking the multifunctional skeleton L to the functional region;
C₂ is a second chemical chain linking the multifunctional skeleton L to the coding region ssDNA₁;
C₃ is a third chemical chain linking the multifunctional skeleton L to the coding region ssDNA₂;
A₁ is a first linking chemical bond connecting C₁ and L;
A₂ is a second linking chemical bond connecting C₂ and L;
A₃ is a third linking chemical bond connecting C₃ and L;
each X is independently absent or a chemical cleavage site, and X is located at any position on C₂ or C₃;
ssDNA₁ and ssDNA₂ are each independently a single-stranded DNA, and the sequences of ssDNA₁ and ssDNA₂ are complementary.
In another preferred embodiment, at least one X is absent (i.e., is not a chemical cleavage site).
In another preferred embodiment, neither chemical chain C₂ nor C₃ has an X.
In another preferred embodiment, X is located at any position on the second chemical chain C₂ or the third chemical chain C₃.
In another preferred embodiment, "sequence complementary" means that the sequence of ssDNA₁ and the sequence of ssDNA₂ are completely or substantially completely complementary, so that a DNA double-stranded structure can be formed (i.e., the degree of complementarity is 80% or more, preferably 90% or more, more preferably 95% or more, or 100%).
In another preferred embodiment, the DNA double-stranded structure formed by ssDNA₁ and ssDNA₂ has blunt ends or cohesive ends (i.e., containing 1, 2 or 3 overhanging nucleotides).
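The degree of complementarity mentioned above can be made concrete with a small Python sketch. It is only an illustration under simplifying assumptions that are not taken from the specification: both strands have the same length, they are aligned antiparallel without overhangs, and complementarity is counted as the fraction of Watson-Crick paired positions. The function names are hypothetical.

```python
# Watson-Crick pairing table for DNA bases.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def complementarity(ssdna1, ssdna2):
    """Fraction of positions at which ssdna2 pairs with ssdna1.

    Assumes equal-length strands in a blunt-ended, antiparallel alignment
    (a simplification; overhangs are not modelled).
    """
    if len(ssdna1) != len(ssdna2):
        raise ValueError("this sketch only handles strands of equal length")
    ideal_partner = reverse_complement(ssdna1)
    matches = sum(a == b for a, b in zip(ideal_partner, ssdna2.upper()))
    return matches / len(ssdna1)


def can_form_duplex(ssdna1, ssdna2, threshold=0.8):
    """Apply the 80% lower bound on complementarity quoted in the text."""
    return complementarity(ssdna1, ssdna2) >= threshold
```

For instance, `complementarity("TGACTCCC", "GGGAGTCA")` evaluates to 1.0, while a single mismatch in an 8-nt strand gives 7/8 = 0.875, which still satisfies the 80% lower bound but not the 90% preferred bound.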
In another preferred embodiment, the chemical cleavage site is reversible or irreversible.
In another preferred example, the method comprises the steps of:
(a) in an inert solvent, reacting compound A₂-C₂-ssDNA₁ with a multifunctional skeleton compound L to form a compound of formula Ia;
Figure BDA0003706324340000032
(b) in an inert solvent, reacting said compound of formula Ia with compound A₃-C₃-ssDNA₂ to form a compound of formula Ib;
Figure BDA0003706324340000041
(c) in an inert solvent, reacting said compound of formula Ib with compound R-C₁-A₁ to form a compound of formula I, i.e., the starting fragment of the DNA-encoded compound library;
Figure BDA0003706324340000042
wherein, in formula I, formula Ia and formula Ib,
R is a reaction site of the functional region of the starting fragment;
L is a multifunctional skeleton;
C₁ is a first chemical chain linking the multifunctional skeleton L to the functional region;
C₂ is a second chemical chain linking the multifunctional skeleton L to the coding region ssDNA₁;
C₃ is a third chemical chain linking the multifunctional skeleton L to the coding region ssDNA₂;
A₁ is a first linking chemical bond connecting C₁ and L;
A₂ is a second linking chemical bond connecting C₂ and L;
A₃ is a third linking chemical bond connecting C₃ and L;
each X is independently absent or a chemical cleavage site, and X is located at any position on C₂ or C₃;
ssDNA₁ and ssDNA₂ are each independently a single-stranded DNA, and the sequences of ssDNA₁ and ssDNA₂ are complementary.
In another preferred embodiment, the method further comprises, between steps (b) and (c): isolating or purifying said compound of formula Ib from the reaction mixture of step (b).
In another preferred embodiment, the main-chain atoms of the first chemical chain C₁, the second chemical chain C₂ and the third chemical chain C₃ are selected from the group consisting of: C, O, N, P, or a combination thereof, and the first chemical chain C₁, the second chemical chain C₂ and the third chemical chain C₃ are saturated or unsaturated chemical chains.
In another preferred embodiment, the number of main-chain atoms of the first chemical chain C₁, the second chemical chain C₂ and the third chemical chain C₃ is each independently 1 to 20, preferably 3 to 15, more preferably 4 to 10.
In another preferred embodiment, C₁ is substituted or unsubstituted -(L₁)ₙ₁-, wherein each L₁ is independently selected from the group consisting of: CH₂, CH, C, NH, C(O), N, an oxygen atom, a sulfur atom, and a phosphorus atom; n1 is a positive integer from 3 to 30, preferably from 5 to 25;
C₂ is substituted or unsubstituted -(L₂)ₙ₂-, wherein each L₂ is independently selected from the group consisting of: CH₂, CH, C, NH, C(O), N, an oxygen atom, a sulfur atom, a phosphorus atom, or a cleavage site X; n2 is a positive integer from 3 to 30, preferably from 5 to 25;
C₃ is substituted or unsubstituted -(L₃)ₙ₃-, wherein each L₃ is independently selected from the group consisting of: CH₂, CH, C, NH, C(O), N, an oxygen atom, a sulfur atom, a phosphorus atom, or a cleavage site X; n3 is a positive integer from 3 to 30, preferably from 5 to 25;
with the proviso that at least one of L₂ and L₃ is a cleavage site X;
and said substitution means that one or more hydrogen atoms on the group are replaced with a substituent selected from the group consisting of: O, C₁-C₄ alkyl, -OH, C₁-C₄ alkyl-OH, or combinations thereof.
In another preferred embodiment, the first linking chemical bond A₁, the second linking chemical bond A₂ and the third linking chemical bond A₃ are each independently selected from the group consisting of: a chemical bond, an amide bond, an ester bond, a phosphate bond, or a combination thereof.
In another preferred embodiment, the chemical bond is selected from the group consisting of: carbon-carbon bonds, carbon-nitrogen bonds, carbon-oxygen bonds.
In another preferred embodiment, the multifunctional skeleton L contains 3 or 4 active reaction sites for reacting with the first linking chemical bond A₁, the second linking chemical bond A₂ and the third linking chemical bond A₃.
In another preferred embodiment, the multifunctional skeleton L is not a tertiary carbon skeleton.
In another preferred embodiment, the multifunctional skeleton L is selected from the group consisting of:
Figure BDA0003706324340000051
In another preferred embodiment, L is a multifunctional skeleton derived from the skeleton compound cyanuric chloride:
Figure BDA0003706324340000052
In another preferred embodiment, neither C₂ nor C₃ has a chemical cleavage site X; or C₂ or C₃ has a chemical cleavage site X.
In another preferred embodiment, the chemical cleavage site X is selected from the group consisting of: chemical bonds, DNA sequences that can be cleaved by enzymes, or combinations thereof.
In another preferred embodiment, the chemical cleavage site X is a DNA sequence that can be cleaved by an enzyme, such as GAATTC.
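As a small illustration of an enzyme-cleavable DNA sequence serving as the cleavage site, the following Python sketch locates occurrences of a recognition sequence within a strand. GAATTC, the example given in the text, is the recognition sequence of the restriction enzyme EcoRI; the function name and the test sequence are hypothetical, and the sketch only finds the site and does not model the cut itself.

```python
def find_sites(seq, site="GAATTC"):
    """Return the 0-based start positions of `site` within `seq`.

    Overlapping occurrences are reported as well, because the search
    restarts one position after each hit.
    """
    seq = seq.upper()
    site = site.upper()
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start)
        start = seq.find(site, start + 1)
    return positions


# A made-up strand carrying the site twice.
example = "AAGAATTCTTGAATTC"
hits = find_sites(example)
```

A check of this kind is also useful in the opposite direction: a coding sequence that must survive enzymatic cleavage of X should return an empty list.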
In another preferred embodiment, the chemical cleavage site X is selected from the group consisting of:
(i) A chemical bond that can be cleaved under acidic conditions;
preferably, the chemical bond cleavable under acidic conditions is selected from the group consisting of:
Figure BDA0003706324340000061
(ii) A chemical bond that can be cleaved under alkaline conditions;
preferably, the chemical bond cleavable under alkaline conditions is selected from the group consisting of:
Figure BDA0003706324340000062
(iii) A chemical bond that can be cut by light,
preferably, the photo-cleavable chemical bond is selected from the group consisting of:
Figure BDA0003706324340000063
(iv) Chemical bonds that can be cleaved under reducing conditions,
preferably, the chemical bond cleavable under reducing conditions comprises a disulfide bond;
(v) any combination of (i), (ii), (iii), and (iv) above.
In another preferred embodiment, X is a disulfide bond.
In another preferred embodiment, the reactive group of the reaction site R of the functional region of the starting fragment is selected from the group consisting of: amino, hydroxyl (-OH), sulfhydryl (-SH), aldehyde carbonyl (-CHO), ketone carbonyl (-C(O)-), carboxyl (-COOH), azide (-N₃), alkynyl, halogen, or a combination thereof, more preferably amino.
In another preferred embodiment, the first chemical chain C₁ is covalently linked to the reaction site R of the functional region of the starting fragment, more preferably selected from the group consisting of: amino, carboxyl, sulfonyl, halogenated aromatic rings, aromatic heterocycles, isocyanate groups, aldehydes, ketones, α,β-unsaturated ketones or esters, or combinations thereof.
In another preferred embodiment, the compound R-C₁-A₁ is an oxa-C₄-C₁₀ alkane having at least two terminal amino groups and containing 1, 2 or 3 oxygen atoms, preferably 1,8-diamino-3,6-dioxaoctane.
In another preferred embodiment, ssDNA₁ and ssDNA₂ are fully or partially complementary so as to form a double-stranded DNA structure.
In another preferred embodiment, the lengths of ssDNA₁ and ssDNA₂ are each independently 3 to 30 nt (or bp), more preferably 5 to 20 nt (or bp).
In another preferred embodiment, the oligonucleotide sequence of ssDNA₁ is: TGACTCCC.
In a second aspect, the present invention provides an initiator fragment for use in constructing a library of DNA encoding compounds, said initiator fragment having the structure shown in formula I below:
Figure BDA0003706324340000071
wherein,
R is a reaction site of the functional region of the starting fragment;
L is a multifunctional skeleton;
C₁ is a first chemical chain linking the multifunctional skeleton L to the functional region;
C₂ is a second chemical chain linking the multifunctional skeleton L to the coding region ssDNA₁;
C₃ is a third chemical chain linking the multifunctional skeleton L to the coding region ssDNA₂;
A₁ is a first linking chemical bond connecting C₁ and L;
A₂ is a second linking chemical bond connecting C₂ and L;
A₃ is a third linking chemical bond connecting C₃ and L;
each X is independently absent or a chemical cleavage site, and X is located at any position on C₂ or C₃;
ssDNA₁ and ssDNA₂ are each independently a single-stranded DNA, and the sequences of ssDNA₁ and ssDNA₂ are complementary.
In another preferred embodiment, L is a multifunctional skeleton selected from the group consisting of:
Figure BDA0003706324340000081
and/or R is selected from the group consisting of: amino, hydroxyl (-OH), sulfhydryl (-SH), aldehyde carbonyl (-CHO), ketone carbonyl (-C(O)-), carboxyl (-COOH), azide (-N₃), alkynyl, halogen.
In another preferred embodiment, C₁ is substituted or unsubstituted -(L₁)ₙ₁-, wherein each L₁ is independently selected from the group consisting of: CH₂, CH, C, NH, C(O), N, an oxygen atom, a sulfur atom, and a phosphorus atom; n1 is a positive integer from 3 to 30, preferably from 5 to 25;
C₂ is substituted or unsubstituted -(L₂)ₙ₂-, wherein each L₂ is independently selected from the group consisting of: CH₂, CH, C, NH, C(O), N, an oxygen atom, a sulfur atom, a phosphorus atom, or a cleavage site X; n2 is a positive integer from 3 to 30, preferably from 5 to 25;
C₃ is substituted or unsubstituted -(L₃)ₙ₃-, wherein each L₃ is independently selected from the group consisting of: CH₂, CH, C, NH, C(O), N, an oxygen atom, a sulfur atom, a phosphorus atom, or a cleavage site X; n3 is a positive integer from 3 to 30, preferably from 5 to 25;
with the proviso that at least one of L₂ and L₃ is a cleavage site X;
and said substitution means that one or more hydrogen atoms on the group are replaced with a substituent selected from the group consisting of: O, C₁-C₄ alkyl, -OH, C₁-C₄ alkyl-OH, or combinations thereof.
In another preferred embodiment, the starting fragment is selected from the group consisting of:
Figure BDA0003706324340000082
In a third aspect, the invention provides use of the starting fragment for constructing a DNA-encoded compound library according to the second aspect in the preparation of reagents for constructing a DNA-encoded compound library.
In a fourth aspect the invention provides a method of constructing a library of DNA encoding compounds comprising the steps of:
(S1) ligating the starting fragment of a library of DNA encoding compounds of the structure of formula I as defined in claim 11 with a predetermined compound to form a library of DNA encoding compounds:
Figure BDA0003706324340000091
wherein,
R, L, C₁, C₂, C₃, A₁, A₂, A₃, X, ssDNA₁ and ssDNA₂ are as defined in claim 11.
In another preferred embodiment, step (S1) further comprises: before or after linking the starting fragment to the predetermined compound, ligating a DNA-encoding oligonucleotide (or barcode fragment) to the DNA double-stranded structure formed by ssDNA₁ and ssDNA₂, thereby forming the DNA-encoded compound library.
In another preferred embodiment, the starting fragment is linked to the predetermined compound via R.
In another preferred embodiment, the compound comprises a small molecule compound.
In another preferred embodiment, the method further comprises the steps of:
(S2) incubating the DNA-encoded compound library with a target of interest, and collecting library members having affinity for the target of interest; and
(S3) performing PCR amplification and high-throughput sequencing on the encoding oligonucleotides corresponding to the library members collected in step (S2), thereby determining the structures of the library members capable of binding the target.
In another preferred embodiment, the targets include: protein, nucleic acid, cell, virus.
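The decoding part of steps (S2) and (S3) can be sketched in Python as a simple read-counting exercise. This is only an illustration under stated assumptions, not the analysis used in the examples: reads are assumed to be already trimmed to the bare code, the codebook (code to compound name) and the cut-off `min_count` are hypothetical, and reads that match no known code are discarded.

```python
from collections import Counter


def rank_hits(reads, codebook, min_count=2):
    """Count sequencing reads per known code and list enriched compounds.

    `reads` is an iterable of code strings recovered after selection,
    `codebook` maps each code to the compound it encodes, and `min_count`
    is an arbitrary illustrative threshold. Results are ordered from the
    most to the least frequently observed code.
    """
    counts = Counter(read for read in reads if read in codebook)
    return [
        (codebook[code], n)
        for code, n in counts.most_common()
        if n >= min_count
    ]


# Hypothetical codebook and reads for a two-member library.
codebook = {"ACGGAT": "BB1a-BB2a", "TGCAGT": "BB1b-BB2c"}
reads = ["TGCAGT", "ACGGAT", "TGCAGT", "NNNNNN", "TGCAGT"]
enriched = rank_hits(reads, codebook)
```

Here `enriched` contains only `("BB1b-BB2c", 3)`, since the other known code was read once and the unrecognized read was ignored.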
It is to be understood that, within the scope of the present invention, the above-described features of the present invention and those specifically described below (e.g., in the examples) may be combined with each other to form new or preferred embodiments. For reasons of space, they are not described one by one here.
Drawings
FIG. 1 shows a mass spectrum of the starting fragment Linker-1 prepared in example 1.
FIG. 2 shows a mass spectrum of BB1-S-Linker-1-P-T1 prepared in step (3) of example 2.
FIG. 3 shows a mass spectrum of BB2-BB1-S-Linker-1-P-T1-T2 prepared in step (4) of example 2.
FIG. 4 shows a mass spectrum of the DNA-encoded compound library BB2-BB1-S-Linker-1-P-T1-T2-cloning prepared in step (5) of example 2.
FIG. 5 shows a mass spectrum of the starting fragment Linker-2 prepared in example 3.
FIG. 6 shows the mass spectrum of the fragment Linker-2-P prepared in example 4.
FIG. 7 shows the mass spectrum of the fragment Linker-2-P-T1 prepared in example 5.
FIG. 8 shows the mass spectrum of the fragment Linker-2-P-T1-T2 prepared in example 6.
FIG. 9 shows the mass spectra of the two chains resulting from cleavage of the disulfide bonds in example 7.
FIG. 10 shows the mass spectrum of the DNA precipitate obtained by cleaving the disulfide bond of the DNA-encoded compound library BB2-BB1-S-Linker-1-P-T1-T2-cloning in example 8.
Detailed Description
The present inventors have extensively and intensively studied and, through a large number of screenings, have for the first time developed an initial fragment of a library of DNA-encoding compounds of novel structure and a method for constructing the same. The initial fragment of the invention has unique structure, simple construction method route, simple and convenient operation, mild reaction condition and higher yield. The present inventors also provide methods of constructing libraries of DNA encoding compounds using the starting fragments of the present invention and uses thereof. On the basis of this, the present invention has been completed.
Terms
Herein, unless otherwise specified, the term "substituted" means that one or more hydrogen atoms on a group are replaced with a substituent selected from the group consisting of: C₁~C₆ alkyl, C₃~C₈ cycloalkyl, C₁~C₆ alkoxy, halogen, hydroxy, carboxy (-COOH), amino, phenyl; the phenyl group includes an unsubstituted phenyl group or a substituted phenyl group having 1 to 3 substituents selected from: halogen, C₁-C₁₀ alkyl, cyano, OH, nitro, amino.
Unless otherwise specified, each chiral carbon atom in all compounds of the invention may optionally be in the R configuration or the S configuration, or a mixture of the R configuration and the S configuration.
The term "C₁~C₆ alkyl" means a straight or branched chain alkyl group having 1 to 6 carbon atoms, such as methyl, ethyl, propyl, isopropyl, butyl, isobutyl, sec-butyl, tert-butyl, or the like.
The term "C₃~C₈ cycloalkyl" means a cycloalkyl group having 3 to 8 carbon atoms, such as cyclopropyl, cyclobutyl, cyclopentyl, cyclohexyl, or the like.
The term "3-8 membered heterocyclic group" means a group formed by losing one hydrogen atom from a 3-8 membered saturated ring having 1-3 hetero atoms selected from the group consisting of: n, S, O; such as pyrrolidinyl, piperidinyl, piperazinyl, morpholinyl, or the like.
The term "6-10 membered aryl" refers to a group formed by 6-10 membered aryl having one hydrogen atom removed; such as phenyl, naphthyl, or the like.
The term "5-to 10-membered heteroaryl" refers to a group formed by a 5-to 8-membered aryl group having 1 to 3 heteroatoms selected from the group consisting of: n, S, O, wherein each cyclic system of heteroaryl groups can be monocyclic or polycyclic; such as pyrrolyl, pyridyl, thienyl, furyl, imidazolyl, pyrimidinyl, benzothienyl, indolyl, imidazopyridinyl, quinolinyl, or the like.
The term "C₁~C₆ alkoxy" means a straight or branched chain alkoxy group having 1 to 6 carbon atoms, such as methoxy, ethoxy, propoxy, isopropoxy, butoxy, isobutoxy, sec-butoxy, tert-butoxy, or the like.
The term "halogen" refers to F, Cl, Br and I.
Unless otherwise specified, the structural formulae depicted herein are intended to include all isomeric forms (e.g., enantiomers, diastereomers and geometric isomers (or conformers)): for example, the R and S configurations of an asymmetric center, and the (Z) and (E) isomers and (Z) and (E) conformers of a double bond.
In this context, a group written in the form "C₁~C₆" means that the group may have 1 to 6 carbon atoms, for example 1, 2, 3, 4, 5 or 6.
Starting fragment
As used herein, the terms "starting fragment of the invention" and "starting fragment of a DNA-encoded compound library of the invention" are used interchangeably and refer to the starting fragment described in the second aspect of the invention. The starting fragment of the invention has the structure shown in formula I.
It will be understood that in formulas I, Ia and Ib of the present invention, C1 and C₁ have the same meaning and both denote the first chemical chain; similarly, C2 and C₂ have the same meaning, and C3 and C₃ have the same meaning.
The starting fragment of the present invention has the structure shown in formula I below:
Figure BDA0003706324340000111
wherein,
R is a reaction site of the functional region of the starting fragment;
L is a multifunctional skeleton;
C₁ is a first chemical chain linking the multifunctional skeleton L to the functional region;
C₂ is a second chemical chain linking the multifunctional skeleton L to the coding region ssDNA₁;
C₃ is a third chemical chain linking the multifunctional skeleton L to the coding region ssDNA₂;
A₁ is a first linking chemical bond connecting C₁ and L;
A₂ is a second linking chemical bond connecting C₂ and L;
A₃ is a third linking chemical bond connecting C₃ and L;
each X is independently absent or a chemical cleavage site, and X is located at any position on C₂ or C₃;
ssDNA₁ and ssDNA₂ are each independently a single-stranded DNA, and the sequences of ssDNA₁ and ssDNA₂ are complementary.
In other embodiments, L is a multifunctional backbone selected from the group consisting of:
Figure BDA0003706324340000121
and/or R is selected from the group consisting of: amino, hydroxyl (-OH), sulfhydryl (-SH), aldehyde carbonyl (-CHO), ketone carbonyl (-C(O)-), carboxyl (-COOH), azide (-N₃), alkynyl, halogen.
In other embodiments, C₁ is substituted or unsubstituted -(L₁)ₙ₁-, wherein each L₁ is independently selected from the group consisting of: CH₂, CH, C, NH, C(O), N, an oxygen atom, a sulfur atom, and a phosphorus atom; n1 is a positive integer from 3 to 30, preferably from 5 to 25;
C₂ is substituted or unsubstituted -(L₂)ₙ₂-, wherein each L₂ is independently selected from the group consisting of: CH₂, CH, C, NH, C(O), N, an oxygen atom, a sulfur atom, a phosphorus atom, or a cleavage site X; n2 is a positive integer from 3 to 30, preferably from 5 to 25;
C₃ is substituted or unsubstituted -(L₃)ₙ₃-, wherein each L₃ is independently selected from the group consisting of: CH₂, CH, C, NH, C(O), N, an oxygen atom, a sulfur atom, a phosphorus atom, or a cleavage site X; n3 is a positive integer from 3 to 30, preferably from 5 to 25;
with the proviso that at least one of L₂ and L₃ is a cleavage site X;
and said substitution means that one or more hydrogen atoms on the group are replaced with a substituent selected from the group consisting of: O, C₁-C₄ alkyl, -OH, C₁-C₄ alkyl-OH, or combinations thereof.
In other embodiments, the starting fragment is selected from the group consisting of:
Figure BDA0003706324340000131
Preparation of starting fragments of a DNA-encoded compound library
The invention also provides a preparation method of the initial fragment of the DNA coding compound library.
One representative construction method is shown in scheme I, and includes the following steps:
(a) in an inert solvent, reacting compound A₂-C₂-ssDNA₁ with a multifunctional skeleton compound L to form a compound of formula Ia;
Figure BDA0003706324340000132
(b) in an inert solvent, reacting said compound of formula Ia with compound A₃-C₃-ssDNA₂ to form a compound of formula Ib;
Figure BDA0003706324340000133
(c) reacting the compound of formula Ib with compound R-C₁-A₁ in an inert solvent to form a compound of formula I, i.e., the starting fragment of the DNA-encoded compound library;
Figure BDA0003706324340000141
wherein, in formula I, formula Ia and formula Ib,
R is a reaction site of the functional region of the starting fragment;
L is a multifunctional backbone;
C₁ is a first chemical chain linking the multifunctional backbone L to the functional region;
C₂ is a second chemical chain linking the multifunctional backbone L to the coding region ssDNA₁;
C₃ is a third chemical chain linking the multifunctional backbone L to the coding region ssDNA₂;
A₁ is a first linking chemical bond connecting C₁ and L;
A₂ is a second linking chemical bond connecting C₂ and L;
A₃ is a third linking chemical bond connecting C₃ and L;
each X is independently absent or a chemical cleavage site, and X is located at any position on C₂ or C₃;
ssDNA₁ and ssDNA₂ are each independently a single-stranded DNA, and ssDNA₁ and ssDNA₂ are complementary in sequence.
Wherein,
the multifunctional backbone L is selected from the group consisting of:
Figure BDA0003706324340000142
R is selected from the group consisting of: amino, hydroxyl, sulfhydryl, aldehyde carbonyl, ketone carbonyl, carboxylic acid, azide, alkynyl, halogen;
C₁, C₂ and C₃ are the chemical chains that link the multifunctional backbone L to the functional region and the coding regions, and are each independently selected from the group consisting of: saturated and unsaturated chains containing carbon, oxygen, nitrogen and phosphorus atoms;
A₁, A₂ and A₃ are each independently selected from the group consisting of: carbon-carbon bonds, carbon-nitrogen bonds, carbon-oxygen bonds, amide bonds, ester bonds, phosphate bonds, and triazole.
The multifunctional backbone L contains a plurality of active reaction sites.
In step (1), the multifunctional backbone L connects the functional region of the starting fragment to the coding regions through the linking chemical bonds A₁, A₂ and A₃.
The chemical cleavage site X in the chemical chains C₂ and C₃ allows the bond to be cleaved under mild conditions.
The single-stranded deoxyribonucleotide bearing the cleavage site X in the coding region is joined to further coding oligonucleotides by enzymatic ligation.
The chemical reaction site R of the functional region of the starting fragment structure I is linked to the synthetic building block containing at least one reaction site by a covalent bond.
In some embodiments, the chemical cleavage site X is selected from a deoxynucleotide sequence capable of being cleaved by a cleaving enzyme, such as GAATTC.
In other embodiments, the chemical cleavage site X is selected from a chemical bond, such as a disulfide bond, that can be cleaved by a reduction method.
In other embodiments, the chemical cleavage site X is a chemical bond that can be cleaved under acidic conditions, selected from the group consisting of:
Figure BDA0003706324340000151
In other embodiments, the chemical cleavage site X is a chemical bond that can be cleaved under alkaline conditions, selected from the group consisting of:
Figure BDA0003706324340000152
In other embodiments, the chemical cleavage site X is selected from the group of chemical bonds that can be cleaved under light:
Figure BDA0003706324340000153
Furthermore, ssDNA₁ and ssDNA₂, or the double-stranded DNA structure they form, may also contain an additional enzymatic cleavage site Z. Preferably, the single-stranded (or double-stranded) deoxyribonucleotide bearing the cleavage site Z can be joined by an enzymatic reaction to the deoxyribonucleotide strand that encodes the reaction building block.
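As a rough illustration of the enzymatic-cleavage embodiment, the sketch below scans a strand for a recognition sequence such as GAATTC. GAATTC (the EcoRI recognition sequence) is its own reverse complement, so in a duplex the site occurs on both strands at the same place. The example strand and the function names are hypothetical and are not taken from the patent.

```python
# Illustrative sketch only: the example strand below is hypothetical.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def find_sites(seq: str, site: str = "GAATTC") -> list[int]:
    """Return the 0-based start index of every occurrence of `site` in `seq`."""
    seq, site = seq.upper(), site.upper()
    hits = []
    start = seq.find(site)
    while start != -1:
        hits.append(start)
        start = seq.find(site, start + 1)  # step by 1 so overlapping hits are found
    return hits


if __name__ == "__main__":
    strand = "ACGTGAATTCTTAGC"  # hypothetical strand carrying one site
    print(find_sites(strand))
    print(reverse_complement("GAATTC") == "GAATTC")
```

A scan of this kind can also be used to confirm that coding tags do not accidentally contain the chosen recognition sequence.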
Applications
The invention also provides the use of the starting fragment, in particular for constructing DNA-encoded compound libraries for large-scale, high-throughput screening of active compounds.
Typically, the method of the invention comprises:
(S1) linking the starting fragment of a DNA-encoded compound library, having the structure shown in formula I, to a predetermined compound to form a DNA-encoded compound library:
Figure BDA0003706324340000161
wherein,
R, L, C₁, C₂, C₃, A₁, A₂, A₃, X, ssDNA₁ and ssDNA₂ are as defined above.
In another preferred embodiment, step (S1) further includes: before or after the starting fragment is linked to the predetermined compound, ligating DNA-encoding oligonucleotides (or barcode fragments) to the double-stranded DNA structure formed by ssDNA₁ and ssDNA₂, thereby forming the DNA-encoded compound library.
In another preferred embodiment, the starting fragment is linked to the predetermined compound via R.
In another preferred embodiment, the compound comprises a small molecule compound.
In another preferred embodiment, the method of the present invention further comprises the steps of:
(S2) incubating the DNA-encoded compound library constructed from the starting fragment of formula I with a target protein, and collecting the DNA-encoded molecules that show affinity for the target;
(S3) performing PCR amplification and high-throughput sequencing of the deoxynucleotide sequences carried by the DNA-encoded molecules collected in step (S2), and determining the specific structures of the collected encoded molecules;
(S4) chemically synthesizing the molecules identified in step (S3), and verifying their activity by pharmacological experiments.
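The decoding part of step (S3) can be pictured with a minimal sketch. It assumes, purely for illustration, that each sequencing read has already been trimmed so that an 11-nt first-cycle tag is immediately followed by an 11-nt second-cycle tag; the codebooks, building-block names and read layout are hypothetical and are not specified by the patent.

```python
from collections import Counter

# Hypothetical codebooks: tag sequence -> building-block name.
CYCLE1 = {"AGACGCAAGAG": "amine_01", "AGTTGCCATAG": "amine_02"}
CYCLE2 = {"TCCTTGTCAGT": "aldehyde_01", "TCGAACTGAGT": "aldehyde_02"}
TAG_LEN = 11  # assumed tag length


def decode_read(read: str, offset: int = 0):
    """Map one trimmed read to a (cycle 1, cycle 2) building-block pair, or None."""
    tag1 = read[offset:offset + TAG_LEN]
    tag2 = read[offset + TAG_LEN:offset + 2 * TAG_LEN]
    if tag1 in CYCLE1 and tag2 in CYCLE2:
        return CYCLE1[tag1], CYCLE2[tag2]
    return None  # unknown or truncated tags are ignored


def count_hits(reads, offset: int = 0) -> Counter:
    """Count how often each building-block pair occurs among the reads."""
    counts = Counter()
    for read in reads:
        pair = decode_read(read, offset)
        if pair is not None:
            counts[pair] += 1
    return counts


if __name__ == "__main__":
    reads = [
        "AGACGCAAGAGTCCTTGTCAGT",
        "AGACGCAAGAGTCCTTGTCAGT",
        "AGTTGCCATAGTCCTTGTCAGT",
        "NNNN",  # unreadable read, skipped
    ]
    for pair, n in count_hits(reads).most_common():
        print(pair, n)
```

Pairs with the highest counts after selection would be the candidates carried forward to resynthesis in step (S4).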
The main advantages of the invention include:
1) The complex synthetic routes of existing starting fragments are avoided, so that the target molecule chain and the coding chain can be connected in a simpler and more flexible way.
2) When constructing a DNA-encoded compound library, the more stable double-stranded DNA can serve as the code during the chemical reactions of library construction. When screening against a target, a more suitable screening scheme can be chosen flexibly: a) screening with the double-stranded DNA as the code; b) cleaving under mild conditions and screening in single-stranded DNA form; c) cross-linking the single-stranded DNA with an auxiliary strand to carry out the screening.
3) After screening, converting the double-stranded DNA into single-stranded DNA can greatly improve PCR efficiency, reduce the risk of base-pair mismatches and improve the accuracy of DNA sequencing, making the whole encoding technology more reliable.
4) The reagents required for synthesizing and cleaving the novel starting fragment are inexpensive and readily available, the reaction conditions are mild, the yields are high, and the novel starting fragment is stable.
The invention will be further illustrated with reference to the following specific examples. It should be understood that these examples are for illustrative purposes only and are not intended to limit the scope of the present invention. The experimental procedures, in which specific conditions are not noted in the following examples, are generally carried out under conventional conditions or conditions recommended by the manufacturers. Unless otherwise indicated, percentages and parts are percentages and parts by weight.
Example 1: Preparation of starting fragment Linker-1
In this example, a 5'-phosphorylated DNA strand having the following nucleic acid sequence was synthesized as a starting fragment.
Figure BDA0003706324340000171
1.1 ssDNA structures
Single-stranded DNA ssDNA-1 Structure:
Figure BDA0003706324340000172
Single-stranded DNA ssDNA-2 structure:
Figure BDA0003706324340000173
1.2 Method
An aqueous solution of ssDNA-1 (100 nmol, 100 μL) was diluted with 100 μL of sodium borate buffer (pH 9.4, 500 mM), and 7.5 μL of a solution of cyanuric chloride in acetonitrile (200 mM) was added. The mixture was reacted at 4 ℃ for 10 minutes. An aqueous solution of ssDNA-2 (100 nmol, 100 μL) was then added, and the mixture was reacted at room temperature for 1 hour. After the reaction was complete, 30 μL of 5 M sodium chloride solution and 1000 μL of cold ethanol were added; the mixture was kept at -78 ℃ for 0.5 hour and centrifuged at 4 ℃, the supernatant was removed, and the residual solvent was removed by lyophilization. The resulting solid was dissolved in sodium borate buffer (pH 9.4) to give a 0.5 mM aqueous solution, 150 μL of a solution of 1,8-diamino-3,6-dioxaoctane in DMA (200 mM) was added, and the mixture was reacted at 80 ℃ for 2 hours. After the reaction, 30 μL of 5 M sodium chloride solution and 1000 μL of cold ethanol were added; the mixture was kept at -78 ℃ for 0.5 hour and centrifuged at 4 ℃, the supernatant was removed, and the residual solvent was removed by lyophilization to obtain Linker-1. The identification results are shown in FIG. 1.
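The reagent amounts in this procedure can be converted into molar equivalents with a few lines of arithmetic. The sketch below is only a convenience calculation; the function names are hypothetical, and the second calculation assumes that all 100 nmol of DNA is recovered after the first precipitation, which the patent does not state.

```python
def equivalents(volume_ul: float, conc_mM: float, substrate_nmol: float) -> float:
    """Molar equivalents of a reagent relative to the DNA substrate.

    Volume in uL multiplied by concentration in mM gives the amount in nmol.
    """
    reagent_nmol = volume_ul * conc_mM
    return reagent_nmol / substrate_nmol


def solution_volume_ul(amount_nmol: float, conc_mM: float) -> float:
    """Volume in uL needed to hold `amount_nmol` at concentration `conc_mM`."""
    return amount_nmol / conc_mM


if __name__ == "__main__":
    dna_nmol = 100.0  # ssDNA-1 used in Example 1
    # 7.5 uL of 200 mM cyanuric chloride
    print(equivalents(7.5, 200.0, dna_nmol))
    # 150 uL of 200 mM diamine; assumes full recovery of the 100 nmol of DNA
    print(equivalents(150.0, 200.0, dna_nmol))
    # volume of the 0.5 mM solution under the same assumption
    print(solution_volume_ul(dna_nmol, 0.5))
```

Under these assumptions the cyanuric chloride charge works out to 15 equivalents and the diamine charge to 300 equivalents.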
Example 2: Preparation of a 96 × 96 DNA-encoded compound library
1. Step (1): Coupling 4-fluoro-3-nitrobenzoic acid to the fragment Linker-1 to obtain the fragment S-Linker-1 bearing a reaction scaffold.
600 nmol of Linker-1 was dissolved in 600 μL of sodium borate buffer (pH 9.4, 250 mM), and 45 μL of 200 mM Fmoc-AOP (in DMA, 15 equivalents, freshly prepared), 45 μL of 200 mM 2-(7-azabenzotriazole)-N,N,N',N'-tetramethyluronium hexafluorophosphate (HATU) (in DMA, 15 equivalents, freshly prepared) and 45 μL of 200 mM N,N-diisopropylethylamine (DIPEA) (in DMA, 15 equivalents, freshly prepared) were added; the mixture was reacted at room temperature. After the reaction, 74 μL of 5 M sodium chloride solution and 2100 μL of cold ethanol were added; the mixture was kept at -78 ℃ for 0.5 hour and centrifuged at 4 ℃, and the supernatant was removed to give a DNA precipitate, which was dissolved in 5% aqueous piperidine. After the reaction was complete, ethanol precipitation was performed as described above, and the product was desalted and purified using a 3K ultrafiltration tube (Amicon Ultra Centrifugal, 500 μL). The resulting filtrate was added to 600 μL of sodium borate buffer (pH 9.4, 250 mM), followed by 180 μL of 200 mM 4-fluoro-3-nitrobenzoic acid (in DMA, 60 equivalents, freshly prepared), 180 μL of 200 mM 2-(7-azabenzotriazole)-N,N,N',N'-tetramethyluronium hexafluorophosphate (HATU) (in DMA, 60 equivalents, freshly prepared) and 180 μL of 200 mM N,N-diisopropylethylamine (DIPEA) (in DMA, 60 equivalents, freshly prepared); the mixture was reacted at room temperature. After the reaction, 11.4 μL of 5 M sodium chloride solution and 3200 μL of cold ethanol were added; the mixture was kept at -78 ℃ for 0.5 hour and centrifuged at 4 ℃, and the supernatant was removed to obtain the fragment S-Linker-1 bearing the reaction scaffold.
2. Step (2): Ligating the starting primer (Primer) to the fragment S-Linker-1 to obtain the primer-bearing fragment S-Linker-1-P
In this exemplary DNA-encoded compound library, a fixed coding nucleotide duplex (abbreviated as the starting primer; custom-made by Suzhou Jin Weizhi Biotechnology Co., Ltd., HPLC-purified) was ligated to S-Linker-1 to serve as the universal PCR primer fragment for all DNA-encoded compounds in subsequent screening:
Top strand of the starting primer: 5'-PO₄²⁻-AGGCTAACTTGCGTACACAG-3' (SEQ ID NO: 01)
Bottom strand of the starting primer: 5'-PO₄²⁻-ACGCAAGTTAGCCTTCGGGA-3' (SEQ ID NO: 02)
600 μL of a 1 mM aqueous solution of fragment S-Linker-1 and 368 μL of an aqueous solution containing 660 nmol of annealed primer were mixed thoroughly; T4 buffer and T4 ligase were added at 0 ℃, and the mixture was reacted at 16 ℃ for 16 hours. After the reaction was complete, 300 μL of 5 M aqueous sodium chloride and 8500 μL of chilled ethanol were added; the mixture was kept at -78 ℃ for 0.5 hour and then centrifuged at 4 ℃, the supernatant was removed, and the resulting DNA precipitate was dissolved in 585 μL of distilled water to obtain the primer-bearing fragment S-Linker-1-P.
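To see how the two starting-primer strands of this step can anneal, the sketch below looks for the longest stretch in which the 5' portion of the top strand is the reverse complement of the 5' portion of the bottom strand. This geometry (pairing through the 5' portions, leaving a 3' overhang on each strand) is an assumption made for illustration; the patent does not describe the annealed structure, and the function names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def paired_length(top: str, bottom: str) -> int:
    """Length of the longest 5' prefix of `top` that pairs with the 5' prefix of `bottom`.

    Both strands are written 5'->3'. Returns 0 if no prefix pairs.
    """
    for k in range(min(len(top), len(bottom)), 0, -1):
        if top[:k] == reverse_complement(bottom[:k]):
            return k
    return 0


if __name__ == "__main__":
    top = "AGGCTAACTTGCGTACACAG"     # SEQ ID NO: 01
    bottom = "ACGCAAGTTAGCCTTCGGGA"  # SEQ ID NO: 02
    k = paired_length(top, bottom)
    print("paired bases:", k)
    print("top 3' overhang:", top[k:])
    print("bottom 3' overhang:", bottom[k:])
```

Under this assumed geometry the two strands pair over 14 bases, leaving ACACAG and TCGGGA unpaired at the 3' ends of the top and bottom strands respectively.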
3. Step (3): First-cycle synthesis of the DNA-encoded compound library
As in the ligation reaction of step (2), 96 ligation reactions were set up: a 1 mM solution of the primer-bearing fragment S-Linker-1-P was dispensed into the 96 wells of a 96-well plate at 5.5 μL per well, and 3.1 μL each of the top and bottom strands of the first-cycle labeled nucleotide duplex (1.8 mM; custom-made by Suzhou Jin Weizhi Biotechnology Co., Ltd., HPLC-purified) were then added to each well. After the reaction was complete, ethanol precipitation was performed as described above, the resulting DNA precipitate was dissolved in 8 μL of sodium borate buffer (pH 9.4, 250 mM), the corresponding building block, a small-molecule primary amine (100 equivalents, 200 mM solution in DMA), was added, and the mixture was heated in a PCR instrument at 60 ℃ for 16 hours (lid temperature 105 ℃). After the reaction was complete, ethanol precipitation was performed as described above, the resulting DNA precipitate was dissolved in 200 μL of distilled water, and the product was desalted and purified using a 10K ultrafiltration tube (Amicon Ultra Centrifugal, 500 μL). The purified DNA was dissolved in 500 μL of distilled water (0.8 mM), 160 μL of 200 mM FeSO₄ (in distilled water, 80 equivalents, freshly prepared) was added, and the mixture was heated in a PCR instrument at 60 ℃ for 16 hours (lid temperature 105 ℃). After the reaction was complete, the mixture was centrifuged at 4 ℃, the precipitate was discarded, and the supernatant was subjected to ethanol precipitation as described above to obtain the first-cycle product BB1-S-Linker-1-P-T1. The identification results are shown in FIG. 2.
The small molecule and corresponding nucleotide duplex information for the first cycle is shown in table a below:
TABLE A
Figure BDA0003706324340000191
Figure BDA0003706324340000201
Figure BDA0003706324340000211
Figure BDA0003706324340000221
4. Step (4): Second-cycle synthesis of the DNA-encoded compound library
The second-cycle labeled nucleotide duplex (custom-made by Suzhou Jin Weizhi Biotechnology Co., Ltd., HPLC-purified) was ligated to the first-cycle product BB1-S-Linker-1-P-T1 in the same way as in the ligation reaction of step (2). The purified DNA precipitate was dissolved in 3 μL of phosphate buffer (pH 5.5, 500 mM), the corresponding building block, a small-molecule aldehyde (200 equivalents, 200 mM solution in DMA), was added, and the mixture was reacted overnight at room temperature. After the reaction was complete, all reaction solutions were pooled, ethanol precipitation was performed as described above, the resulting DNA precipitate was dissolved in 200 μL of distilled water, and the product was desalted and purified using a 10K ultrafiltration tube (Amicon Ultra Centrifugal, 500 μL) to obtain the second-cycle product BB2-BB1-S-Linker-1-P-T1-T2. The identification results are shown in FIG. 3.
The small molecule and corresponding nucleotide duplex information for the second cycle are specifically shown in table B below:
TABLE B
Figure BDA0003706324340000231
Figure BDA0003706324340000241
Figure BDA0003706324340000251
Figure BDA0003706324340000261
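The size of the two-cycle library can be made explicit with a short enumeration. The sketch assumes that every first-cycle building block is combined with every second-cycle building block, as the 96 × 96 title of Example 2 suggests; the building-block names are placeholders, not the compounds of Tables A and B.

```python
from itertools import product


def enumerate_library(cycle1, cycle2):
    """Return every (cycle 1, cycle 2) building-block combination."""
    return list(product(cycle1, cycle2))


if __name__ == "__main__":
    # Placeholder names standing in for the 96 amines and 96 aldehydes.
    amines = [f"amine_{i:02d}" for i in range(1, 97)]
    aldehydes = [f"aldehyde_{i:02d}" for i in range(1, 97)]
    library = enumerate_library(amines, aldehydes)
    print(len(library))  # 96 * 96 members
    print(library[0], library[-1])
```

Each member corresponds to one pair of cycle tags (T1, T2), which is what allows a sequencing read to be traced back to a single compound.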
5. Step (5): Ligating the tail primer to the DNA-encoded compound library
In this exemplary DNA-encoded compound library, a fixed coding nucleotide duplex (abbreviated as the tail primer; custom-made by Suzhou Jin Weizhi Biotechnology Co., Ltd., HPLC-purified) was ligated to the DNA-encoded compound library as the other universal primer fragment for PCR:
Top strand of the tail primer:
5'-PO₄²⁻-ACATAACGCTBBBBBBBBBBBBBBCTGATGGCGCGAGGGAGGC-3' (SEQ ID NO: 03)
Bottom strand of the tail primer: 5'-PO₄²⁻-TTTATGTAC-3' (SEQ ID NO: 04)
30 μL of a 1 mM aqueous solution of the DNA-encoded compound library and 24 μL of a 1 mM aqueous solution of the annealed tail primer were mixed thoroughly; T4 buffer and T4 ligase were added at 0 ℃, and the mixture was reacted at 16 ℃ for 16 hours. After the reaction was complete, 12 μL of 5 M aqueous sodium chloride and 350 μL of chilled ethanol were added; the mixture was kept at -78 ℃ for 0.5 hour and centrifuged at 4 ℃, the supernatant was removed, the resulting DNA precipitate was dissolved in 50 μL of distilled water, and the product was desalted and purified using a 10K ultrafiltration tube (Amicon Ultra Centrifugal, 500 μL). The resulting DNA-encoded compound library BB2-BB1-S-Linker-1-P-T1-T2-cloning (identification results shown in FIG. 4) was used for the subsequent protein affinity screening.
Example 3: Preparation of starting fragment Linker-2
A 5'-phosphorylated DNA strand having the following nucleic acid sequence was synthesized as a starting fragment.
Figure BDA0003706324340000271
3.1 ssDNA structures
Single-stranded DNA ssDNA-3 Structure:
Figure BDA0003706324340000272
Single-stranded DNA ssDNA-4 structure:
Figure BDA0003706324340000273
3.2 Method
An aqueous solution of ssDNA-3 (100 nmol, 100 μL) was diluted with 100 μL of sodium borate buffer (pH 9.4, 500 mM), and 7.5 μL of a solution of cyanuric chloride in acetonitrile (200 mM) was added. The mixture was reacted at 4 ℃ for 10 minutes, and an aqueous solution of ssDNA-4 (100 nmol, 100 μL) was then added. The mixture was reacted at room temperature for 1 hour. After the reaction, 30 μL of 5 M sodium chloride solution and 1000 μL of cold ethanol were added; the mixture was kept at -78 ℃ for 0.5 hour and centrifuged at 4 ℃, the supernatant was removed, and the residual solvent was removed by lyophilization. The resulting solid was dissolved in sodium borate buffer (pH 9.4) to give a 0.5 mM aqueous solution, 150 μL of a solution of 1,8-diamino-3,6-dioxaoctane in DMA (200 mM) was added, and the mixture was reacted at 80 ℃ for 2 hours. After the reaction, 30 μL of 5 M sodium chloride solution and 1000 μL of cold ethanol were added; the mixture was kept at -78 ℃ for 0.5 hour and centrifuged at 4 ℃, the supernatant was removed, and the residual solvent was removed by lyophilization to obtain Linker-2. The identification results are shown in FIG. 5.
Example 4: Ligation of the starting primer
Top strand of the starting primer: 5'-PO₄²⁻-AAATCGATGTG-3' (SEQ ID NO: 05)
Bottom strand of the starting primer: 5'-PO₄²⁻-CATCGATTTGG-3' (SEQ ID NO: 06)
5 μL of an aqueous solution containing 5 nmol of starting fragment Linker-2 and 3 μL of an aqueous solution containing 5.5 nmol of annealed starting primer were mixed thoroughly; T4 buffer and T4 ligase were added at 0 ℃, and the mixture was reacted at 16 ℃ for 16 hours. After the reaction was complete, 2 μL of 5 M aqueous sodium chloride and 100 μL of chilled ethanol were added; the mixture was kept at -78 ℃ for 0.5 hour and centrifuged at 4 ℃, the supernatant was removed, and the resulting DNA precipitate was dissolved in 5 μL of distilled water to obtain the primer-bearing fragment Linker-2-P. The identification results are shown in FIG. 6.
Example 5: ligation of first round DNA coding sequences
Top strand of the coding sequence: 5'-PO₄²⁻-AGACGCAAGAG-3' (SEQ ID NO: 07)
Bottom strand of the coding sequence: 5'-PO₄²⁻-CTTGCGTCTCA-3' (SEQ ID NO: 08)
5 μL of an aqueous solution containing 5 nmol of Linker-2-P and 3 μL of an aqueous solution containing 5.5 nmol of annealed first-round DNA coding sequence were mixed thoroughly; T4 buffer and T4 ligase were added at 0 ℃, and the mixture was reacted at 16 ℃ for 16 hours. After the reaction was complete, 2 μL of 5 M aqueous sodium chloride and 100 μL of chilled ethanol were added; the mixture was kept at -78 ℃ for 0.5 hour and centrifuged at 4 ℃, the supernatant was removed, and the resulting DNA precipitate was dissolved in 5 μL of distilled water to obtain the fragment Linker-2-P-T1 bearing the first-round code. The identification results are shown in FIG. 7.
Example 6: ligation of second round DNA coding sequences
Top strand of the coding sequence: 5'-PO₄²⁻-TCCTTGTCAGT-3' (SEQ ID NO: 09)
Bottom strand of the coding sequence: 5'-PO₄²⁻-TGACAAGGACT-3' (SEQ ID NO: 10)
5 μL of an aqueous solution containing 5 nmol of Linker-2-P-T1 and 3 μL of an aqueous solution containing 5.5 nmol of annealed second-round DNA coding sequence were mixed thoroughly; T4 buffer and T4 ligase were added at 0 ℃, and the mixture was reacted at 16 ℃ for 16 hours. After the reaction was complete, 2 μL of 5 M aqueous sodium chloride and 100 μL of chilled ethanol were added; the mixture was kept at -78 ℃ for 0.5 hour and centrifuged at 4 ℃, the supernatant was removed, and the resulting DNA precipitate was dissolved in 5 μL of distilled water to obtain the fragment Linker-2-P-T1-T2 bearing the second-round code. The identification results are shown in FIG. 8.
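The duplexes ligated in Examples 4 to 6 can be checked for sticky-end compatibility in code. The sketch assumes that each top and bottom strand pair through their 5' portions and leave a 3' overhang on each strand, and that a duplex is ligated downstream of the previous one when the previous top-strand overhang is the reverse complement of its bottom-strand overhang. This geometry and all function names are assumptions for illustration and are not stated in the patent.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def overhangs(top: str, bottom: str):
    """Return (top 3' overhang, bottom 3' overhang) for the longest 5' pairing."""
    for k in range(min(len(top), len(bottom)), 0, -1):
        if top[:k] == revcomp(bottom[:k]):
            return top[k:], bottom[k:]
    raise ValueError("strands do not anneal under the assumed geometry")


# (label, top strand, bottom strand), sequences from SEQ ID NO: 05 to 10
DUPLEXES = [
    ("starting primer", "AAATCGATGTG", "CATCGATTTGG"),
    ("round 1 code", "AGACGCAAGAG", "CTTGCGTCTCA"),
    ("round 2 code", "TCCTTGTCAGT", "TGACAAGGACT"),
]


def chain_is_compatible(duplexes) -> bool:
    """True if each duplex's top overhang pairs with the next duplex's bottom overhang."""
    ends = [overhangs(top, bottom) for _, top, bottom in duplexes]
    return all(
        upstream_top == revcomp(downstream_bottom)
        for (upstream_top, _), (_, downstream_bottom) in zip(ends, ends[1:])
    )


if __name__ == "__main__":
    for label, top, bottom in DUPLEXES:
        print(label, overhangs(top, bottom))
    print(chain_is_compatible(DUPLEXES))
```

With the listed sequences each duplex pairs over 9 bases with 2-nt overhangs, and the overhangs match in the order starting primer, round 1, round 2, so a duplex added out of order would fail the check.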
Example 7: Cleaving the disulfide bond
An aqueous solution containing 5 nmol of Linker-2-P-T1-T2 was mixed with 5 μL of pH 9.4 buffer, 0.5 μL of 1 M aqueous DTT was added, and the mixture was reacted at room temperature for 1 hour. After the reaction was complete, 1 μL of 5 M aqueous sodium chloride and 50 μL of chilled ethanol were added; the mixture was kept at -78 ℃ for 0.5 hour and centrifuged at 4 ℃, the supernatant was removed, and the resulting DNA precipitate was dissolved in 5 μL of distilled water. The molecular weights of the two strands generated by cleavage were 12328 and 13361, respectively; the results are shown in FIG. 9.
Example 8: Cleaving the disulfide bond in a DNA-encoded compound library
An aqueous solution containing 2 nmol of the DNA-encoded compound library BB2-BB1-S-Linker-1-P-T1-T2-cloning was mixed with 2 μL of pH 9.4 buffer, 0.2 μL of 1 M aqueous DTT was added, and the mixture was reacted at room temperature for 1 hour. After the reaction was complete, 0.2 μL of 5 M aqueous sodium chloride and 20 μL of chilled ethanol were added; the mixture was kept at -78 ℃ for 0.5 hour and centrifuged at 4 ℃, the supernatant was removed, and the resulting DNA precipitate was dissolved in 2 μL of distilled water and identified by LCMS. The results are shown in FIG. 10.
Example 9: separation of two nucleic acid Single strands
One of the single strands was pulled out using 500. Mu.L of Streptavidin MagPoly beads. Leaving the other nucleic acid strand to be screened.
Example 10: photocrosslinking screening
Protein (μg quantity; no protein in the control group), the single-stranded nucleic acid and a complementary nucleic acid strand bearing a photocrosslinking group were incubated in 50 μL of buffer (PBS pH 7.4, 0.3 mg/mL sssDNA, 0.1 M NaCl) for 1 h and transferred to a 96-well plate. The plate was irradiated with UV light at 365 nm on ice for 20 min, and 50 μL of 10% SDS was added. The mixture was transferred to 12 μL of magnetic beads and incubated for 1 h. The beads were washed 5 times with washing buffer (PBS pH 7.4, 0.2% SDS, 0.01% Tween-20) and eluted with 200 mM imidazole; the eluate was analyzed by next-generation sequencing.
All documents referred to herein are incorporated by reference into this application as if each were individually incorporated by reference. Furthermore, it should be understood that various changes and modifications of the present invention can be made by those skilled in the art after reading the above teachings of the present invention, and these equivalents also fall within the scope of the present invention as defined by the appended claims.

Claims (16)

1. A method of constructing a starting fragment of a DNA-encoded compound library, said method comprising the steps of:
(1) providing a multifunctional backbone compound L;
(2) reacting the multifunctional backbone compound L with compound A₂-C₂-ssDNA₁, compound A₃-C₃-ssDNA₂ and compound R-C₁-A₁ to form a compound of formula I, i.e., the starting fragment of the DNA-encoded compound library;
Figure FDA0003706324330000011
wherein,
R is a reaction site of the functional region of the starting fragment;
L is a multifunctional backbone;
C₁ is a first chemical chain linking the multifunctional backbone L to the functional region;
C₂ is a second chemical chain linking the multifunctional backbone L to the coding region ssDNA₁;
C₃ is a third chemical chain linking the multifunctional backbone L to the coding region ssDNA₂;
A₁ is a first linking chemical bond connecting C₁ and L;
A₂ is a second linking chemical bond connecting C₂ and L;
A₃ is a third linking chemical bond connecting C₃ and L;
each X is independently absent or a chemical cleavage site, and X is located at any position on C₂ or C₃;
ssDNA₁ and ssDNA₂ are each independently a single-stranded DNA, and ssDNA₁ and ssDNA₂ are complementary in sequence.
2. A method of constructing a starting fragment of a DNA-encoded compound library, said method comprising the steps of:
(a) reacting compound A₂-C₂-ssDNA₁ with a multifunctional backbone compound L in an inert solvent to form a compound of formula Ia;
Figure FDA0003706324330000012
(b) reacting the compound of formula Ia with compound A₃-C₃-ssDNA₂ in an inert solvent to form a compound of formula Ib;
Figure FDA0003706324330000021
(c) reacting the compound of formula Ib with compound R-C₁-A₁ in an inert solvent to form a compound of formula I, i.e., the starting fragment of the DNA-encoded compound library;
Figure FDA0003706324330000022
wherein, in formula I, formula Ia and formula Ib,
R is a reaction site of the functional region of the starting fragment;
L is a multifunctional backbone;
C₁ is a first chemical chain linking the multifunctional backbone L to the functional region;
C₂ is a second chemical chain linking the multifunctional backbone L to the coding region ssDNA₁;
C₃ is a third chemical chain linking the multifunctional backbone L to the coding region ssDNA₂;
A₁ is a first linking chemical bond connecting C₁ and L;
A₂ is a second linking chemical bond connecting C₂ and L;
A₃ is a third linking chemical bond connecting C₃ and L;
each X is independently absent or a chemical cleavage site, and X is located at any position on C₂ or C₃;
ssDNA₁ and ssDNA₂ are each independently a single-stranded DNA, and ssDNA₁ and ssDNA₂ are complementary in sequence.
3. The method of claim 1, wherein the main-chain atoms of the first chemical chain C₁, the second chemical chain C₂ and the third chemical chain C₃ are selected from the group consisting of: C, O, N, P, or a combination thereof, and the first chemical chain C₁, the second chemical chain C₂ and the third chemical chain C₃ are saturated or unsaturated chemical chains.
4. The method of claim 1, wherein C₁ is substituted or unsubstituted -(L₁)ₙ₁-, wherein each L₁ is independently selected from the group consisting of: CH₂, CH, C, NH, C(O), N, an oxygen atom, a sulfur atom, a phosphorus atom; n1 is a positive integer from 3 to 30, preferably from 5 to 25;
C₂ is substituted or unsubstituted -(L₂)ₙ₂-, wherein each L₂ is independently selected from the group consisting of: CH₂, CH, C, NH, C(O), N, an oxygen atom, a sulfur atom, a phosphorus atom, or a cleavage site X; n2 is a positive integer from 3 to 30, preferably from 5 to 25;
C₃ is substituted or unsubstituted -(L₃)ₙ₃-, wherein each L₃ is independently selected from the group consisting of: CH₂, CH, C, NH, C(O), N, an oxygen atom, a sulfur atom, a phosphorus atom, or a cleavage site X; n3 is a positive integer from 3 to 30, preferably from 5 to 25;
with the proviso that at least one of L₂ and L₃ is a cleavage site X;
and "substituted" means that one or more hydrogen atoms on the group are replaced by a substituent selected from the group consisting of: =O, C1-C4 alkyl, -OH, C1-C4 alkyl-OH, or combinations thereof.
5. The method of claim 1, wherein the first linking chemical bond A₁, the second linking chemical bond A₂ and the third linking chemical bond A₃ are each independently selected from the group consisting of: a chemical bond, an amide bond, an ester bond, a phosphate bond, or a combination thereof.
6. The method of claim 1, wherein the multifunctional skeleton L is selected from the group consisting of:
Figure FDA0003706324330000031
7. The method of claim 1, wherein neither C₂ nor C₃ has a chemical cleavage site X; or C₂ or C₃ has a chemical cleavage site X.
8. The method of claim 1, wherein the chemical cleavage site X is selected from the group consisting of: chemical bonds, DNA sequences that can be cleaved by enzymes, or combinations thereof.
9. The method of claim 8, wherein the chemical cleavage site X is selected from the group consisting of:
(i) A chemical bond that can be cleaved under acidic conditions;
preferably, the chemical bond cleavable under acidic conditions is selected from the group consisting of:
Figure FDA0003706324330000032
(ii) A chemical bond that can be cleaved under alkaline conditions;
preferably, the chemical bond cleavable under alkaline conditions is selected from the group consisting of:
Figure FDA0003706324330000041
(iii) A chemical bond that can be cut by light,
preferably, the photo-cleavable chemical bond is selected from the group consisting of:
Figure FDA0003706324330000042
(iv) Chemical bonds that can be cleaved under reducing conditions,
preferably, the chemical bond cleavable under reducing conditions comprises a disulfide bond;
(v) any combination of (i), (ii), (iii) and (iv) above.
10. The method of claim 1, wherein the reactive group of the reaction site R of the functional region of the starting fragment is selected from the group consisting of: amino, hydroxyl, sulfhydryl (-SH), aldehyde carbonyl (-CHO), ketone carbonyl (-C(O)-), carboxyl (-COOH), azide (-N₃), alkynyl, halogen, or a combination thereof, more preferably amino.
11. A starting fragment for use in constructing a DNA-encoded compound library, said starting fragment having the structure of formula I:
Figure FDA0003706324330000043
wherein,
R is a reaction site of the functional region of the starting fragment;
L is a multifunctional backbone;
C₁ is a first chemical chain linking the multifunctional backbone L to the functional region;
C₂ is a second chemical chain linking the multifunctional backbone L to the coding region ssDNA₁;
C₃ is a third chemical chain linking the multifunctional backbone L to the coding region ssDNA₂;
A₁ is a first linking chemical bond connecting C₁ and L;
A₂ is a second linking chemical bond connecting C₂ and L;
A₃ is a third linking chemical bond connecting C₃ and L;
each X is independently absent or a chemical cleavage site, and X is located at any position on C₂ or C₃;
ssDNA₁ and ssDNA₂ are each independently a single-stranded DNA, and ssDNA₁ and ssDNA₂ are complementary in sequence.
12. The starting fragment of claim 11, wherein L is a multifunctional backbone selected from the group consisting of:
Figure FDA0003706324330000051
and/or R is selected from the group consisting of: amino, sulfhydryl (-SH), aldehyde carbonyl (-CHO), ketone carbonyl (-C(O)-), carboxyl (-COOH), azide (-N=N-), alkynyl, and halogen.
13. The starting fragment of claim 11, wherein C1 is substituted or unsubstituted -(L1)n1-, wherein each L1 is independently selected from the group consisting of: CH2, CH, C, NH, C(O), N, an oxygen atom, a sulfur atom, and a phosphorus atom; n1 is a positive integer from 3 to 30, preferably from 5 to 25;
C2 is substituted or unsubstituted -(L2)n2-, wherein each L2 is independently selected from the group consisting of: CH2, CH, C, NH, C(O), N, an oxygen atom, a sulfur atom, a phosphorus atom, and a cleavage site X; n2 is a positive integer from 3 to 30, preferably from 5 to 25;
C3 is substituted or unsubstituted -(L3)n3-, wherein each L3 is independently selected from the group consisting of: CH2, CH, C, NH, C(O), N, an oxygen atom, a sulfur atom, a phosphorus atom, and a cleavage site X; n3 is a positive integer from 3 to 30, preferably from 5 to 25;
with the proviso that at least one of L2 and L3 is a cleavage site X;
and "substituted" means that one or more hydrogen atoms on the group are replaced with a substituent selected from the group consisting of: O, C1-C4 alkyl, -OH, C1-C4 alkyl-OH, or combinations thereof.
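The length and cleavage-site conditions of claim 13 can be sketched as a simple check. This is a minimal illustration under stated assumptions, not the claimed method: each chain is modelled as a list of unit labels; the labels O, S, P and X stand for the oxygen, sulfur and phosphorus atoms and the cleavage site; substitution of hydrogen atoms is ignored; and all names are hypothetical.

```python
from dataclasses import dataclass

# Unit labels allowed in every chain (hypothetical labels for the listed units).
BASE_UNITS = {"CH2", "CH", "C", "NH", "C(O)", "N", "O", "S", "P"}
CLEAVAGE = "X"  # cleavage site, allowed only in the second and third chains


@dataclass
class Chains:
    c1: list[str]  # chain towards the functional region
    c2: list[str]  # chain towards the first coding strand
    c3: list[str]  # chain towards the second coding strand


def check_chain(units: list[str], allow_cleavage: bool) -> bool:
    """Check that a chain has 3 to 30 units and uses only allowed labels."""
    allowed = BASE_UNITS | ({CLEAVAGE} if allow_cleavage else set())
    return 3 <= len(units) <= 30 and all(u in allowed for u in units)


def meets_chain_constraints(ch: Chains) -> bool:
    """Check lengths, allowed units, and the proviso that at least one
    cleavage site X occurs in the second or third chain."""
    chains_ok = (
        check_chain(ch.c1, allow_cleavage=False)
        and check_chain(ch.c2, allow_cleavage=True)
        and check_chain(ch.c3, allow_cleavage=True)
    )
    has_cleavage = CLEAVAGE in ch.c2 or CLEAVAGE in ch.c3
    return chains_ok and has_cleavage


# Hypothetical example chains (not structures from the patent).
example = Chains(
    c1=["CH2"] * 5,
    c2=["CH2", "CH2", "X", "NH", "C(O)"],
    c3=["CH2", "O", "CH2"],
)
print(meets_chain_constraints(example))
```

The preferred range of 5 to 25 units could be checked the same way by tightening the bounds in `check_chain`.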
14. The starting fragment of claim 11, wherein the starting fragment is selected from the group consisting of:
Figure FDA0003706324330000052
Figure FDA0003706324330000061
15. Use of the starting fragment of claim 11 for constructing a DNA-encoded compound library, in the preparation of a reagent for constructing a DNA-encoded compound library.
16. A method of constructing a DNA-encoded compound library, comprising the steps of:
(S1) linking the starting fragment having the structure of formula I as defined in claim 11 with a predetermined compound to form a DNA-encoded compound library:
Figure FDA0003706324330000062
wherein
R, L, C1, C2, C3, A1, A2, A3, X, ssDNA1 and ssDNA2 are as defined in claim 11.
CN202210708699.1A 2021-06-22 2022-06-21 DNA coding compound library initial fragment and preparation and application thereof Pending CN115506036A (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110693698 2021-06-22
CN202110693698X 2021-06-22

Publications (1)

Publication Number Publication Date
CN115506036A (en) 2022-12-23

Family

ID=84500636

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210708699.1A Pending CN115506036A (en) 2021-06-22 2022-06-21 DNA coding compound library initial fragment and preparation and application thereof

Country Status (1)

Country Link
CN (1) CN115506036A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination