AU2001237811A1

AU2001237811A1 - A biological molecule based computing method based on a blocking principle

Info

Publication number: AU2001237811A1
Application number: AU2001237811A
Authority: AU
Inventors: Grzegorz Rozenberg; Herman Pieter Spaink
Original assignee: Universiteit Leiden
Current assignee: Universiteit Leiden
Priority date: 2000-02-11
Filing date: 2001-02-12
Publication date: 2001-08-20
Also published as: EP1254428A1; JP2003531575A; CA2399694A1; EP1124198A1; MXPA02007743A; US20030073114A1; WO2001059704A1

Description

Title: A biological molecule based computing method based on a blocking principle.

The present invention relates to the field of biology and computer science, in particular it relates to the use of biological molecules for computational purposes.

Biological molecules such as nucleic acid and protein are complex polymers of rather simple molecules. DNA (deoxyribonucleic acid) is an unbranched polymer used by organisms to store their genetic information. A DNA polymer is usually referred to as a DNA strand and is composed of monomer molecules, which are called nucleotides. Each nucleotide is connected to the next in a polymerization process. Nucleotides differ in their bases, of which some typical representatives are: adenine (A), guanine (G), thymine (T) or cytosine (C). Considering that at least four bases are possible for each position in a DNA strand, it is not difficult to imagine that many different sequences can be generated. In fact, with each added nucleotide the number of possible combinations increases by a factor of four.

So, a DNA strand comprises a sequence; this is the sequence of the nucleotides from one end of the DNA strand to the other. The ends of a DNA strand are chemically different: one end is called the 5' end while the other end is the 3' end. So single-stranded DNA has a sequence and an orientation.

There are several features of DNA molecules which make them in principle attractive for computing purposes. We name three of them here: (1) Watson-Crick complementarity, (2) the availability of natural enzymes which can recognize DNA sequences, and (3) the potential for massive parallelism.

The use of computing with biological molecules provides a potential for solving problems which presently do not have feasible (in time) solutions within the available silicon based technology. Computing with biological molecules, and in particular DNA molecules, can solve many such problems, because the massive parallelism of DNA strands allows trillions of operations to take place simultaneously.

A natural target class of problems which require massive parallelism are the so-called NP-complete problems (for a description see: M. R. Garey and D. S. Johnson, Computers and Intractability, A Guide to the Theory of NP-Completeness, W.H. Freeman and Co., San Francisco, 1979). Perhaps the most famous of the problems from this class is the satisfiability (SAT) problem for Boolean formulas (explained below). Adleman (L. M. Adleman, Science, 266: 1021-1024, 1994) was the first to conduct an experiment that constituted a "proof of principle" for the use of DNA computing for solving an NP-complete problem. After this pioneering work a number of DNA-based computing methods have been investigated. The method used by Adleman is the filtering method. It starts with a set of DNA molecules which represent all possible assignments to all variables of a given problem (called the combinatorial library) and then filters out the molecules corresponding to good solutions. Lipton (R. J. Lipton, Science, 268: 542-545, 1995) has outlined a solution for the SAT problem using the filtering method. The current methods for computing with biological molecules have, as outlined above, in principle an enormous advantage over the silicon based technologies. Technologically, however, the current methods for computing with biological molecules are cumbersome. Most methods require some knowledge of characteristics of the desired solution in order to filter out the desired solution.
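The filtering method summarized above can be illustrated in software. The following is a minimal sketch only, not the procedure of Adleman or Lipton: the formula, the encoding of a literal as a (variable index, required value) pair, and all function names are assumptions made for illustration. The library is modelled as the list of all assignments, and each clause acts as one filtering step.

```python
from itertools import product

# Hypothetical formula (x0 OR NOT x1) AND (x1 OR x2).
# Each literal is a pair (variable index, value that makes the literal true).
FORMULA = [[(0, True), (1, False)], [(1, True), (2, True)]]


def combinatorial_library(n_vars):
    """All 2**n_vars assignments, standing in for the DNA library."""
    return list(product([False, True], repeat=n_vars))


def satisfies(assignment, clause):
    """A clause is satisfied when at least one of its literals is true."""
    return any(assignment[i] == value for i, value in clause)


def filter_library(library, formula):
    """One filtering step per clause: keep only molecules satisfying it."""
    for clause in formula:
        library = [a for a in library if satisfies(a, clause)]
    return library


survivors = filter_library(combinatorial_library(3), FORMULA)
```

In this model the number of filtering steps grows with the number of clauses, which is the step-wise character the text attributes to filtering methods.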

The present invention provides a method for finding a potential solution to a computational problem through a computing method. In this aspect of the invention, the method utilizes a library of biological molecules. The library comprises a number of biological molecules that represent a set of combinations of values for variables of the computational problem. Upon generation of the library, it is not known whether a specific instance of the computational problem is solvable, i.e. has a true solution. To determine whether the problem instance is solvable one can determine whether the library comprises a biological molecule that represents a true solution to the problem instance. In the art, many methods have been proposed to find such a solution, all with their particular advantages and disadvantages. WO 97/07440 describes molecular computing using the so-called filtering method. Starting from a library comprising many different candidate solutions to a computational problem, false solutions are removed in a stepwise and partly iterative fashion such that when all steps have been completed the library only contains true solutions to the computational problem, if they are present. Murphy et al describe a modification of the filtering method described by Adleman in which enzymatic methods are used to destroy false solutions in the library such that the library is enriched for true solutions, if present (Murphy et al, 1997). Liu et al follow a similar strategy to Murphy et al in that false solutions are removed by enzymatic digestion (Liu et al, 2000). All these methods rely on the filtering method, meaning that first a library of molecules is generated, whereupon molecules from the library that do not represent a true solution to the problem are removed (destroyed) from the library. Filtering methods need information on a good solution to find it. For instance, in Liu et al information on true solutions is used to perform mark operations.
The mark operations protect true solutions in order to destroy false solutions. Each of the marking events needs information on what the true solution is to be able to prevent its removal from the library. In the present invention we provide a method that is based on preventing detection of combinations of values for variables to said computational problem that do not represent a true solution. Prevention of detection is achieved by blocking at least one biological molecule that represents a combination of values for variables to said computational problem that do not represent a true solution. Blocking is achieved by providing the library with a blocking agent.

The method of the invention is very versatile in that many different computational problems can be solved. It is often much easier to find a combination of values that does not represent a solution (i.e. represents a false solution) than it is to find a combination of values that represents a solution to the computational problem (also referred to as a true solution). In fact, for many computational problems it is possible to easily determine all possible false solutions to the computational problem. Many computational problems can be expressed as Boolean formulas. Boolean formulas can be written in a conjunctive normal form (CNF). The conjunctive normal form representation of a computational problem allows a person skilled in the art to determine rapidly all combinations of values for variables that represent false solutions to the problem. This process can also easily be automated using a computer. In the present invention it is possible to generate one or more blocking agents that are capable of blocking essentially all biological molecules that represent false solutions to the computational problem, leaving essentially only true solutions unblocked (if any exist). In addition, if one true solution is found in the library, a blocking agent capable of blocking this solution may be provided to the library, thereby leaving only other true solutions to the problem unblocked (if they exist). This process can be repeated until essentially all true solutions to the problem have been found. A particular advantage of the present invention over the current filtering methods is that it can be performed in essentially one step. By blocking detection of essentially all biological molecules representing combinations of values that represent a false solution to the problem one can in one step detect whether the problem comprises a true solution.
Repeated incubations with one or more blocking agents and selections of part of the biological molecules are possible in the present invention, but not required. In the present invention, true solutions that have not yet been detected are not blocked. This is preferably achieved by adding blocking agents for false solutions. These blocking agents do not associate with a true solution. Once a true solution is detected a specific blocking agent for said detected true solution may be used to block further detection of said true solution. This feature is particularly useful for determining whether a computational problem comprises further true solutions apart from the ones already detected. In this embodiment a blocking agent for a false solution is added to the library, whereupon it associates with a molecule representing a solution of which detection is not wanted, typically a false solution. In this preferred embodiment said blocking agent is not capable of associating with a solution of which detection is wanted, typically a true solution.
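The one-step character of blocking can be sketched as follows. This is a purely illustrative model under stated assumptions: molecules are tuples of truth values, the example formula is hypothetical, and "association with a blocking agent" is reduced to the test of whether a molecule falsifies at least one clause of the CNF representation. No knowledge of a true solution is used when deciding what to block.

```python
from itertools import product

# Hypothetical formula (x0 OR x1) AND (NOT x0 OR x2); a literal is
# (variable index, value that makes the literal true).
FORMULA = [[(0, True), (1, True)], [(0, False), (2, True)]]
N_VARS = 3


def falsifies(assignment, clause):
    """True when every literal of the clause is false under the assignment."""
    return all(assignment[i] != value for i, value in clause)


def is_blocked(assignment, formula):
    """A molecule is blocked if a blocker for any clause associates with it."""
    return any(falsifies(assignment, clause) for clause in formula)


library = set(product([False, True], repeat=N_VARS))
# All blockers are applied together; nothing is removed from the library.
unblocked = {a for a in library if not is_blocked(a, FORMULA)}
solvable = bool(unblocked)  # detection of any unblocked molecule
```

Detecting any unblocked molecule answers the question of solvability; identifying which molecule it is gives a true solution.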

Although, preferably all biological molecules representing false solutions are blocked by a blocking agent, the present invention is already useful when a limited number of false solutions are blocked. Partial blocking at least in part limits the search for a true solution using a method of the art. A method of the invention can therefore easily be combined with a method in the art, for instance a filtering method, to simplify the search for a true solution. A DNA library comprising candidate solutions of a computational problem can be subjected to size fractionation before a blocking agent is added to the library.

Thus, in one aspect the invention provides the use of a method for determining whether a specific biological molecule is present in a library of biological molecules, wherein said library represents a set of combinations of values for variables of a computational problem, the method comprising providing said library with a molecule capable of associating with at least one biological molecule representing a combination of values for variables to said problem, wherein said association marks said at least one biological molecule as a combination of values representing a false solution to said problem, and determining whether said specific biological molecule is present in said library. The marking of a combination of values that represents a false solution (if it exists) leaves a combination that represents a true solution unmarked. By providing the library with sufficient molecules to associate with essentially all false solutions to said computational problem, only those biological molecules that represent true solutions to the problem are left unmarked. This, of course, under the assumption that the considered problem instance is solvable, i.e. comprises a true solution. When a biological molecule representing a true solution is detected, it may be identified. Identification entails the determination of the specific combination of values represented by said biological molecule. Upon identification of a true solution a blocking agent can be devised capable of blocking detection of biological molecules representing this solution in the library. The library can be provided with this blocking agent. A method of the invention can then be used to determine whether said library comprises another true solution to said problem.
Thus, in one embodiment the invention provides a method of the invention, further comprising blocking an identified biological molecule representing a true solution of said problem and identifying in said library a possibly present biological molecule representing a combination of values for variables, which combination is another true solution to said problem.
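The repeated identification of further true solutions can be sketched as a loop. The function name `enumerate_true_solutions` and the data representation are hypothetical, chosen for illustration: after each detected molecule is identified, a blocker specific for it is added, and detection is repeated until nothing remains detectable.

```python
from itertools import product


def enumerate_true_solutions(formula, n_vars):
    """Find all true solutions by blocking each one once it is identified.

    `formula` is a CNF given as a list of clauses; a literal is
    (variable index, value that makes the literal true).
    """
    library = list(product([False, True], repeat=n_vars))

    def blocked_as_false(assignment):
        # Blockers for false solutions: the molecule falsifies some clause.
        return any(all(assignment[i] != v for i, v in clause)
                   for clause in formula)

    blocked_true = set()  # blockers added for already identified solutions
    found = []
    while True:
        detectable = [a for a in library
                      if not blocked_as_false(a) and a not in blocked_true]
        if not detectable:
            return found
        solution = detectable[0]    # identify one detected molecule
        found.append(solution)
        blocked_true.add(solution)  # block it before the next round
```

For example, `enumerate_true_solutions([[(0, True)], [(1, False)]], 2)` models the formula x0 AND NOT x1 and yields its single true solution.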

The marking of biological molecules can be used to discriminate between marked and/or unmarked biological molecules and thereby between biological molecules representing respectively false or true solutions to said computational problem. In a preferred embodiment blocking of detection of a marked biological molecule is achieved using a blocking agent capable of blocking detection of said at least one biological molecule. A blocking agent of the invention can also be made such that it is capable of blocking two or more biological molecules, said molecules representing different combinations of values for variables of said computational problem. Preferably, said two or more biological molecules represent false solutions of said problem.

Blocking of detection of a biological molecule representing a false solution to said computational problem at least limits the search for a biological molecule representing a true solution to said problem. In a preferred embodiment the detection of essentially all biological molecules representing a false solution to said computational problem is blocked. Thus, if a biological molecule is detected this means that the problem comprises at least one true solution. A blocking agent is a physical and/or any other means for enabling elimination of detection of a biological molecule.

Non-limiting examples of biological molecules that can be used to generate a library of molecules that represent a set of combinations of values for variables of a computational problem comprise nucleic acid, protein and/or lipid, or a functional equivalent of these biological molecules. Preferably, said biological molecule comprises nucleic acid. A functional equivalent of nucleic acid comprises the same base-pairing capabilities as nucleic acid in kind not necessarily in amount.

In a preferred embodiment said at least one combination of values for variables of said computational problem comprises a false solution to said computational problem. Detection of unblocked biological molecules is preferably performed with an amplification step. In this way the number of unblocked molecules is increased relatively to the number of the blocked molecules thereby making the detection a lot easier. Thus, in one embodiment a use of the invention further comprises subjecting said library to an amplification step, wherein said blocking agent is capable of at least in part preventing amplification of said at least one biological molecule representing at least one combination of values for variables of said computational problem.

A blocking agent can be designed to be specific for one clause. However, in a preferred embodiment a blocking agent is specific for more than one clause. For instance if a formula Φ is expressed in conjunctive normal form with 3 literals per clause, a blocking agent can be designed to make a statement about these 3 literals, but be completely neutral towards all other variables. Alternatively, a separate blocking agent can be designed for every possible assignment. In this case the number of blocking agents increases exponentially with the dimensions of the problem (n variables require 2^(n-3) blockers). Exponential increase of the number of blocking agents can be drastically reduced by introducing some level of redundancy in the blocking agents. At neutral positions, which are non-discriminative for a true or a false solution (i.e. in which a blocking agent can specify either a 1 or a 0), a blocking agent can comprise redundancy such that it accommodates both possibilities. Thus a blocking agent can be generated such that it is capable of blocking detection of two or more biological molecules, said molecules representing different combinations of values for variables of said computational problem. Preferably, said blocking agent is capable of blocking biological molecules that represent the same (preferably false) combination of values for variables for at least one clause of a conjunctive normal form representation of said computational problem. In this way exponential increase of the number of blocking agents can at least in part be prevented. The increase in the number of blocking agents can be linear instead of exponential, for instance by generating all blocking agents in this way. Preferably, said two or more biological molecules represent false solutions of said problem.
Blocking agents capable of blocking detection of two or more biological molecules representing different solutions to said problem can be generated by utilizing so-called degenerate or universal bases in the blocking agent. To this end a blocking agent may comprise at least one degenerate base or analogue thereof. Such a degenerate base (also referred to as universal base) is capable of associating with two or more bases. This feature of the invention can be used to limit the number of blocking agents required to block biological molecules in the library. With one or more degenerate bases at a certain location (or locations) of a blocking agent, said blocking agent is less discriminative at said location(s) and therefore capable of associating with biological molecules comprising only a difference in said location(s). Said association in this embodiment leads to blocking of detection.
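The saving obtained from redundancy can be made concrete with a short sketch. It assumes, for illustration only, that a clause contains three distinct variables and that a blocker is written as a string over '0', '1' and a wildcard symbol 'N' standing for a degenerate position; these conventions and the function names are not taken from the text. Without redundancy one clause needs one blocker per falsifying assignment; with redundancy it needs a single pattern.

```python
from itertools import product


def explicit_blockers(clause, n_vars):
    """All full assignments falsifying `clause`: one blocker per molecule."""
    fixed = {i: (not value) for i, value in clause}  # every literal made false
    free = [i for i in range(n_vars) if i not in fixed]
    blockers = []
    for values in product([False, True], repeat=len(free)):
        assignment = dict(fixed)
        assignment.update(zip(free, values))
        blockers.append(tuple(assignment[i] for i in range(n_vars)))
    return blockers


def degenerate_blocker(clause, n_vars):
    """One pattern per clause: fixed bits at its variables, 'N' elsewhere."""
    pattern = ["N"] * n_vars
    for i, value in clause:
        pattern[i] = "0" if value else "1"
    return "".join(pattern)


# Hypothetical clause (x0 OR NOT x1 OR x2) in a problem with n = 5 variables.
clause = [(0, True), (1, False), (2, True)]
n = 5
explicit = explicit_blockers(clause, n)   # 2**(n - 3) blockers
pattern = degenerate_blocker(clause, n)   # a single redundant blocker
```

With one pattern per clause, the number of blockers in this model equals the number of clauses rather than growing exponentially with n.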

In a preferred embodiment, wherein a computational problem is encoded in nucleic acid, this can be achieved through the incorporation of a universal nucleotide at positions of ambiguity. Several functional analogues of the natural DNA bases G, C, T and A exist. Some of these have altered base-pairing specificities. Nucleotides exist, and more are under development, which efficiently pair to more than one natural base. Depending on the encoding, this allows for the specification of both a 1 and a 0 in one blocking agent. In living systems, such a redundancy is used in the translation from nucleic acid to protein. Proteins are combinations of about 20 amino acids. Which amino acid is to be incorporated in a protein is specified by the nucleotide sequence of mRNA (messenger ribonucleic acid, a single-stranded DNA analogue). Three nucleotides (a "codon") are used to encode one amino acid. Two nucleotides do not allow enough combinations (4² = 16), but three allow too many (4³ = 64). Therefore, for some amino acids, the third position of the codon can be ignored. This so-called "wobble" basepairing is often the result of the presence of the base hypoxanthine in the complementary anticodon of some tRNA species (transfer RNA, the molecules that recognize mRNA codons).

The DNA nucleotide form of hypoxanthine, deoxyinosine (base I), is frequently used in recombinant DNA technology as a universal nucleotide. However, it is not truly universal, since some hybridization interactions are more stable than others. Deoxyinosine pairs efficiently with C, less efficiently with A, and badly with G and T. So deoxyinosine is preferably used in encodings in which a 1 or a 0 is determined by a C or an A in a certain position. Artificial universal nucleotides have been developed which have other pairing preferences:

- 6H,8H-3,4-dihydropyrimido[4,5-c][1,2]oxazin-7-one, base P, pairs to A and G;
- N6-methoxy-2,6-diaminopurine, base K, pairs to C and T.

A universal nucleotide that does not form hydrogen bonds is base M, 1-(2'-deoxy-β-D-ribofuranosyl)-3-nitropyrrole. It does not pair to any natural base, but can be placed opposite either G, C, T or A in a DNA duplex without seriously affecting hybridization. Some artificial nucleotides hardly pair to natural bases at all, but only to other artificial ones: difluorotoluene deoxynucleoside (base F) pairs to 4-methylbenzimidazole deoxynucleoside (base Z) with reasonable efficiency. A:F and T:Z basepairs are possible, but not preferred. Such nucleotides effectively extend the genetic code to 6 bases instead of 4. To manage the number of blocking agents to be generated one or more approaches may be combined. For instance, redundancy may be combined with mixed base synthesis.
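The pairing preferences listed above can be collected in a lookup table to check, in a simplified way, whether a blocker containing universal bases could associate with a given target. The sketch below is an assumption-laden illustration: pairing is treated as all-or-nothing (relative stabilities are ignored), deoxyinosine is taken to accept only C and A, and both strands are written so that position k of the blocker faces position k of the target.

```python
# Which target bases each blocker base is taken to accept (from the text above).
PAIRS_WITH = {
    "A": {"T"}, "T": {"A"}, "G": {"C"}, "C": {"G"},
    "I": {"C", "A"},            # deoxyinosine: efficient with C, less so with A
    "P": {"A", "G"},
    "K": {"C", "T"},
    "M": {"A", "C", "G", "T"},  # 3-nitropyrrole: tolerated opposite any base
}


def blocker_matches(blocker: str, target: str) -> bool:
    """True if every blocker base can sit opposite the facing target base.

    Antiparallel orientation is assumed to be handled by the caller, so the
    two strings are compared position by position.
    """
    return len(blocker) == len(target) and all(
        t in PAIRS_WITH[b] for b, t in zip(blocker, target)
    )
```

Under an encoding in which a C or an A at one position distinguishes a 1 from a 0, a blocker carrying I at that position matches both variants, which is the redundancy described in the text.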

Detection of unblocked molecules does not have to be performed in an amplification step. It is very well possible to devise other methods of detecting unblocked molecules. For instance, one can combine the biological molecules with a fluorochrome that upon blocking with a blocking agent becomes quenched. In this way detection of unblocked molecules can be done by simply determining whether fluorescence can still be detected.

Another way of detecting an unblocked biological molecule comprises providing a digestion signal to the biological molecule with the blocking agent. The presence of undigested biological molecules in the library then reflects the presence of unblocked molecules. Unblocked molecules are not marked or associated with blocking agent and are therefore not digested.

Although unblocked molecules can be detected in many ways, biological molecules are preferably detected using an amplification step. The amplification step does not have to be performed for detection of unblocked molecules. It can also be performed to increase the amount of molecules to facilitate further handling of the material. Preferably, the amplification step comprises a nucleic acid amplification reaction such as polymerase chain reaction and/or a nucleic acid amplification in a cell. Of course it is not necessary to use only one detection method. It is very well possible to combine two or more detection methods.

A computational problem often can be represented through two or more subproblems. This feature can be used to design and/or choose the blocking agent. Defining subproblems of a computational problem is advantageous for the present invention. Particularly when for a desired solution the outcome of all subproblems must be true, it may be easier to design the blocking agent. In this case, finding an agent capable of blocking a part in a biological molecule representing a false solution of a subproblem identifies the biological molecule as not having the desired solution of the complete problem. In a preferred embodiment of the invention said at least one blocking agent is capable of blocking at least a part of said at least one biological molecule wherein said part represents at least one combination of values for variables of a subproblem of said computational problem. An additional advantage of defining subproblems is that all biological molecules comprising a representation of a particular false solution of a subproblem can be blocked by the same kind of blocking agent, irrespective of the particular representations of solutions of other subproblems in the biological molecules. Considering that in a preferred embodiment of the invention all false solutions of the computational problem are blocked by a blocking agent, it is in this case entirely possible that more than one blocking agent is blocking a particular biological molecule.

Many computational problems can be represented through Boolean formulas. Boolean formulas are particularly suitable for encoding candidate solutions of a computational problem into a library of biological molecules. A Boolean formula can be given in Conjunctive Normal Form (CNF). It consists of a set of clauses linked together by the conjunction ("and") operator. Therefore, if an assignment for all variables is a true solution of a CNF formula, then each and every clause in the formula must be true. Thus a clause can be seen as a subproblem of the complete problem. The computational problem has a true solution only when all of the clauses of the CNF representation of the Boolean formula are true. Therefore, in a preferred embodiment a blocking agent is capable of blocking a part of a biological molecule in said library that represents a false solution of at least one clause of said CNF representation of the mathematical problem. Preferably, said CNF representation is a 3-CNF representation.
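Treating each clause as a subproblem can be illustrated by listing, for one candidate molecule, the clauses it falsifies. The 3-CNF formula and the function name below are hypothetical. The sketch also shows that a single molecule may be the target of more than one clause-specific blocker.

```python
def blocking_clauses(assignment, formula):
    """Indices of the clauses (subproblems) that `assignment` falsifies.

    A literal is (variable index, value that makes the literal true); a
    clause is falsified when all of its literals are false.
    """
    return [k for k, clause in enumerate(formula)
            if all(assignment[i] != value for i, value in clause)]


# Hypothetical 3-CNF over x0..x3:
# (x0 OR x1 OR x2) AND (NOT x0 OR NOT x1 OR x3) AND (x0 OR x1 OR NOT x3)
FORMULA_3CNF = [
    [(0, True), (1, True), (2, True)],
    [(0, False), (1, False), (3, True)],
    [(0, True), (1, True), (3, False)],
]

# x0 = x1 = x2 = False, x3 = True falsifies the first and the third clause.
hits = blocking_clauses((False, False, False, True), FORMULA_3CNF)
# x0 = True with the rest False falsifies no clause: a true solution.
none = blocking_clauses((True, False, False, False), FORMULA_3CNF)
```

An empty result means that no clause-specific blocker would associate with the molecule, i.e. it represents a true solution in this model.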

A library of biological molecules can be generated from a number of different biological molecules provided that it comprises sufficient information storage capacity. Examples of suitable biological molecules are nucleic acid derived molecules, protein and lipids. Preferably, said biological molecule and/or said blocking agent comprises nucleic acid or a functional analogue thereof.

In one aspect of the invention the basic principle of blocking is based on the fundamental Watson-Crick complementarity property of DNA. This means that, due to their chemical nature, two DNA strands can become bonded resulting in a helical double-stranded DNA molecule, the famous double helix (Watson and Crick, 1953). Bonding of DNA strands arises from the specific pairing (formation of hydrogen bonds) of the bases: adenine (A) always pairs with thymine (T) and guanine (G) always with cytosine (C). The complementary strands of a double-stranded molecule are arranged in an anti-parallel fashion. This means that the two strands are in a 'head-to-tail' arrangement: the 5' to 3' orientation of one strand corresponds to the 3' to 5' orientation of the complementary strand. Raising the temperature can lead to separation of strands of a double-stranded DNA molecule resulting in two single-stranded DNA molecules. This process is called melting. If after melting the temperature is slowly lowered, the complementary strands will anneal to form the original double-stranded helical molecule again. When a short oligonucleotide (called primer) is annealed to a single-stranded DNA molecule, this oligonucleotide can serve as a primer for an enzyme, called DNA polymerase, to produce a second strand of complementary DNA.
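Watson-Crick complementarity and the anti-parallel arrangement can be expressed compactly in code. In this minimal sketch strands are written 5' to 3' as Python strings; the function names are chosen for illustration, and annealing is modelled as exact complementarity only, ignoring temperature and partial matches.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(strand: str) -> str:
    """The strand that pairs with `strand`, also written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(strand))


def can_anneal(strand_a: str, strand_b: str) -> bool:
    """True if the two 5'->3' strands are perfectly complementary."""
    return strand_b == reverse_complement(strand_a)
```

The reversal in `reverse_complement` reflects the 'head-to-tail' arrangement: the 5' end of one strand faces the 3' end of the other.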

There are several technical approaches that can be followed for the inactivation of DNA molecules and subsequent detection. One method which we describe in detail is inactivation for replication using peptide nucleic acid (PNA) blocking, and PCR (Polymerase Chain Reaction) detection. As an alternative to PCR, of course, any method for nucleic acid amplification can be used. PCR is a technique used to amplify specific DNA strands in vitro. For PCR the nucleotide sequence of the ends of the DNA strands to be amplified has to be known. This is necessary because short oligonucleotide primers complementary to the ends of the DNA strands to be amplified have to be synthesized. A PCR 'cycle' consists of: (1) melting of the double-stranded target DNA resulting in single-stranded target DNA molecules, (2) cooling to allow annealing of specific primers to the target DNA, and (3) extension of the primers by the enzymatic activity of DNA polymerase. It is very important to realize that the extension products of one primer can serve as a template for the other primer in the next cycle, so each cycle (theoretically) doubles the content of target DNA. The DNA polymerases commonly used for PCR are thermostable, so they retain activity despite the high temperatures during the melting periods. In this example, PCR amplification of faulty solutions is blocked, so that only the desired solutions are amplified. This blocking can be achieved through the addition of specific small peptide nucleic acid (PNA) molecules to the PCR reaction mixture. PNA molecules are single-stranded DNA mimics with a pseudopeptide backbone (for a detailed description see M. Egholm et al., Nature 365:566-568, 1993). PNAs are functional equivalents of nucleic acid. PNAs have been shown to hybridize sequence-selectively to complementary sequences of DNA, forming Watson-Crick double helices. Moreover, PNAs do so with higher affinity than comparable DNA molecules. So, by adding PNA molecules that anneal specifically to the target DNA molecules in the same region as the DNA primers, the latter cannot anneal.
If the target molecules represent faulty solutions, they will have PNAs annealed to them instead of DNA primers. Because a PNA cannot serve as a primer for DNA polymerase, polymerization, and hence amplification of the target DNA, is prevented. Therefore in one embodiment of the use of the invention said blocking agent comprises peptide nucleic acid.
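The effect of PNA blocking on amplification can be sketched with an idealised model. The assumptions are stated plainly: every unblocked template doubles exactly once per cycle, blocking is complete, and the function name `pcr` and the molecule labels are hypothetical. Real reactions deviate from both idealisations.

```python
def pcr(copies, blocked, cycles):
    """Idealised PCR: unblocked templates double each cycle, blocked ones do not.

    `copies` maps a molecule label to its starting copy number;
    `blocked` is the set of labels that have a PNA annealed at the primer site.
    """
    result = dict(copies)
    for _ in range(cycles):
        for molecule in result:
            if molecule not in blocked:
                result[molecule] *= 2
    return result


after = pcr({"good": 1, "faulty": 1}, blocked={"faulty"}, cycles=10)
```

After ten cycles in this model the unblocked molecule outnumbers the blocked one by a factor of 2^10, which is what makes detection of the unblocked molecule easier.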

After PCR, the amplified DNA molecules representing the good solutions can be separated and visualized using many known methods including DNA-chip technology. We also want to mention some alternatives for blocking/detection of DNA molecules. One can also block DNA molecules by making them inaccessible to restriction enzymes. This can be done by adding PNA oligonucleotides which anneal to the target DNA molecule at the position of a recognition site of a restriction enzyme. The detection reaction is then based on cloning the restricted molecules in a plasmid vector which is replicated in vivo. Alternatively, the detection can be performed on the basis of the size of a DNA molecule, which is given in terms of the number of nucleotide base-pairs per molecule. In this embodiment of the invention true solutions exist if molecules smaller than the standard size exist in the reaction buffer.

The rapidly developing technique of DNA-chip technology provides a helpful tool for the readout of the candidate solutions. It is preferred that the library is present on the chip and the blocking agent is added to the chip. Preferably the biological molecules are arranged such that individual molecules are in physically separated positions of the chip, for instance in an array format. Unblocked biological molecules can then be identified by finding the positions not comprising the blocking agent. On the other hand, the chip may comprise (an array of) blocking agents capable of blocking a variety of biological molecules in the library. The library can then be added to the array. Unblocked molecules can then be detected in the fraction not associated with the blocking agent. A person skilled in the art is able to generate other arrangements of the biological molecules and/or blocking agents on the chip such that the number of different biological molecules and/or blocking agents per position is more than one, while still being able to detect unblocked molecules. Reference is made to a previously mentioned example using a fluorochrome.

Considering the chip implementation of the use of the invention, a preferred embodiment of the invention comprises the use of the invention wherein said library of biological molecules and/or said blocking agent is physically linked to a solid surface. A solid surface is advantageous for many purposes not in the least for detection purposes and handling purposes. The mentioned chip format is desirable when either the library or the blocking agent is present in a multiplicity of compartments and wherein each of compartment comprises one type of biological molecule and/or one type of blocking agent. The compartments may also comprise more than one type of biological molecule or blocking agent . Preferably, the blocking agent and/or the biological molecule comprises a label, preferably a fluorophore such as a fluorescent label . The fluorophore allows discrimination between marked and unmarked biological molecules. A fluorophore preferably comprises a fluorescent label. In a non-limiting example of this embodiment of the invention we describe the use of a blocking agent comprising a label in combination with a library of biological molecules present in an array on a chip. Providing the library with one labeled blocking agent will identify a position in the array comprising a biological molecule representing a false solution the problem. When each position comprises essentially only a biological molecule representing one combination of values then one can easily find biological molecules representing a true solution to the problem by providing the library with labeled blocking agents capable of associating with essentially all biological molecules representing false solutions to said problem. In this case positions that are left unlabeled comprise a true solution to the problem. Detection of positions not containing a label thus identifies such true solutions. 
In another preferred embodiment of the invention, quenching of fluorescence is used to discriminate between marked and/or unmarked biological molecules. When all elements of the combinatorial library have been labeled by a fluorescent dye, the elements which do not represent a true solution can, in a preferred embodiment, be blocked for fluorescence by oligonucleotides (or PNAs) which, through annealing, inactivate, or at least decrease in a detectable way, the fluorescence of the target DNA molecule. This inactivation can be achieved by the well-known principle of "quenching". Quenching of fluorescence can result from the presence of another fluorescent dye at the blocking oligonucleotide (or PNA), which by annealing to the target nucleotide, comes into close vicinity of the fluorescent dye attached to the biological molecule. As a non-limiting practical methodology to apply this principle we propose to link the fluorescently labeled combinatorial library to the surface of a chip. After annealing with the fluorescently labeled blockers, one can read out the solutions by detection of the positions of the chip which remain fluorescent. Such a chip, based on the blocking of fluorescence, can be described as a "quenching chip readout". We also want to mention that this methodology can be easily combined with other DNA-based computing methods such as, for example, filtering.

In another aspect the invention provides the use of a blocking agent for disabling detection of a biological molecule, representing a combination of values for variables of a computational problem, in a library of biological molecules, wherein said library represents a set of combinations of values for variables of said computational problem. In another embodiment the invention provides the use of a blocking agent for enabling elimination of detection of a biological molecule in a library of biological molecules representing a set of combinations of values for variables of a computational problem.

In a preferred embodiment the invention provides a method or a use of the invention, wherein said computational problem comprises a SAT problem and/or a SAT related problem.

Examples

We illustrate the use of the blocking method for the satisfiability (SAT) problem using both a PCR method and a fluorescence quenching assay. Let V = {p_1, ..., p_n} be a set of Boolean variables; their values may be only 0 and 1 (0 stands for "false" and 1 stands for "true"). A literal is either a variable p_i or its negation ¬p_i; we say that p_i and ¬p_i are literals for p_i.

We consider two logical operations: ∨ ("or") and ∧ ("and"). A clause E is an expression of the form t_1 ∨ ... ∨ t_m where each t_i is a literal; for the purpose of this example we may assume that for each variable p_i there is at most one literal for p_i in E. A Boolean formula (in conjunctive normal form, CNF) is an expression of the form E_1 ∧ ... ∧ E_m where each E_i is a clause.

An assignment is a function φ on V which for each p_i has the value either 0 or 1. To compute the value of a literal (for a given assignment φ) we use the rule: ¬0 = 1 and ¬1 = 0. To compute the value of a clause we use the rule: 0 ∨ 0 = 0 and 0 ∨ 1 = 1 ∨ 0 = 1 ∨ 1 = 1. To compute the value of a formula we use the rule: 0 ∧ 0 = 0 ∧ 1 = 1 ∧ 0 = 0 and 1 ∧ 1 = 1. We say that an assignment φ satisfies a formula Φ if the value φ(Φ) of Φ under φ is 1. Otherwise φ falsifies Φ (and φ is a falsifier of Φ). We say that Φ is satisfiable if there is an assignment satisfying Φ.

Example: Let V = {p_1, p_2, p_3} be a set of variables and let Φ = E_1 ∧ E_2 ∧ E_3 be the Boolean formula over V such that E_1 = p_1 ∨ ¬p_2, E_2 = p_1 ∨ p_2 ∨ ¬p_3 and E_3 = ¬p_1 ∨ ¬p_3.

Let φ_1 be the following assignment: p_1 = 0, p_2 = 1, p_3 = 0. Then φ_1(E_1) = 0 ∨ 0 = 0, φ_1(E_2) = 0 ∨ 1 ∨ 1 = 1 and φ_1(E_3) = 1 ∨ 1 = 1. Thus φ_1(Φ) = 0 ∧ 1 ∧ 1 = 0. Let φ_2 be the assignment: p_1 = 0, p_2 = 0, p_3 = 0. Then φ_2(E_1) = 0 ∨ 1 = 1, φ_2(E_2) = 0 ∨ 0 ∨ 1 = 1, and φ_2(E_3) = 1 ∨ 1 = 1. Thus φ_2(Φ) = 1 ∧ 1 ∧ 1 = 1. Hence φ_1 falsifies Φ, φ_2 satisfies Φ, and Φ is satisfiable.
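The evaluation rules above can be checked mechanically. The following is a minimal sketch, not part of the described method: it assumes the example formula reads E_1 = p_1 ∨ ¬p_2, E_2 = p_1 ∨ p_2 ∨ ¬p_3, E_3 = ¬p_1 ∨ ¬p_3 (the reading that agrees with the clause values computed above), and it represents a literal as a signed integer (+i for p_i, -i for ¬p_i), which is a convention chosen here for illustration only.

```python
# Literal convention (an assumption of this sketch): +i means p_i, -i means ¬p_i.

def eval_clause(clause, phi):
    """Value (0 or 1) of a clause under assignment phi (dict: index -> 0/1)."""
    return int(any(phi[abs(t)] == (1 if t > 0 else 0) for t in clause))

def eval_formula(formula, phi):
    """Value (0 or 1) of a CNF formula, given as a list of clauses."""
    return int(all(eval_clause(c, phi) for c in formula))

PHI = [[1, -2], [1, 2, -3], [-1, -3]]   # E_1, E_2, E_3 of the example
phi_1 = {1: 0, 2: 1, 3: 0}
phi_2 = {1: 0, 2: 0, 3: 0}

print([eval_clause(c, phi_1) for c in PHI], eval_formula(PHI, phi_1))  # [0, 1, 1] 0
print([eval_clause(c, phi_2) for c in PHI], eval_formula(PHI, phi_2))  # [1, 1, 1] 1
```

The first line reproduces that φ_1 falsifies Φ through E_1, the second that φ_2 satisfies every clause.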

The Satisfiability Problem (SAT) is to determine whether or not an arbitrary Boolean formula Φ is satisfiable. Note that SAT does not require that one finds a satisfying assignment in the case that Φ is satisfiable.

The Find Satisfiability Problem (FIND SAT) is to determine whether or not an arbitrary Boolean formula Φ is satisfiable, and if it is then to give an assignment satisfying Φ. In the sequel we assume that we have an infinite sequence of (available) variables p_1, p_2, p_3, ... and whenever we consider the case of n variables, the variables are p_1, p_2, ..., p_n.

Blockers

To start with, we need to code for each (Boolean) variable p_i its two possible values p_i = 1 and p_i = 0. Let q_i^(1) be a single stranded sequence coding the value p_i = 1 and q_i^(0) be a single stranded sequence coding the value p_i = 0.

Example 1: The coding q_i^(1) = A and q_i^(0) = C for all 1 ≤ i ≤ n is independent of the variable: the value 1 is always coded by A and the value 0 is always coded by C.

Now we can code all possible assignments of variables by single strands. To this aim, for a given number of variables n, an n-strand is a strand of the form f_1 f_2 ... f_n where each f_i is either q_i^(1) or q_i^(0). The set of all n-strands is denoted by S_n.

For an n-strand s, we use asg(s) to denote the corresponding assignment φ, and for an assignment φ we use str(φ) to denote the corresponding n-strand. A blocker of an n-strand s is its complement; it is denoted by b(s). Now, given a Boolean formula Φ = E_1 ∧ E_2 ∧ ... ∧ E_m over n variables, for each clause E_i a blocker of E_i is a blocker of an n-strand s such that asg(s) is a falsifier of E_i. The set of all blockers for E_i is denoted by B(E_i). Then the set of all blockers for Φ, B(Φ), is the union of B(E_i) for all clauses E_i of Φ; thus B(Φ) = B(E_1) ∪ ... ∪ B(E_m).

For example, let n = 3 and let Φ = E_1 ∧ E_2 where E_1 = ¬p_1 ∨ p_3 and E_2 = ¬p_1 ∨ ¬p_2 ∨ ¬p_3. The falsifiers for E_1 are φ_1 and φ_2 where φ_1(p_1) = 1, φ_1(p_2) = φ_1(p_3) = 0, and φ_2(p_1) = φ_2(p_2) = 1, φ_2(p_3) = 0. The falsifier for E_2 is φ_3 such that φ_3(p_1) = φ_3(p_2) = φ_3(p_3) = 1.

Hence if we use the coding from Example 1 then the blockers for E_1 are GGT and GTT, because str(φ_1) = ACC and str(φ_2) = AAC. Unless clear otherwise, DNA molecules are read in the 5' to 3' orientation. The blocker for E_2 is TTT because str(φ_3) = AAA. Hence the set of blockers for Φ is B(Φ) = { GGT, GTT, TTT }.
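The construction of B(Φ) for this example can be reproduced in a few lines. This is a sketch under stated assumptions: the coding is that of Example 1 (1 coded by A, 0 coded by C), a blocker is taken as the Watson-Crick complement written 5' to 3' (the reverse complement), and literals are written as signed integers (+i for p_i, -i for ¬p_i), a convention introduced here for illustration.

```python
from itertools import product

CODE = {1: "A", 0: "C"}                      # Example 1 coding
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

def strand(bits):
    """str(phi): the n-strand coding an assignment given as a tuple of bits."""
    return "".join(CODE[b] for b in bits)

def blocker(s):
    """b(s): complement of s, written in the 5' to 3' orientation."""
    return "".join(COMPLEMENT[x] for x in reversed(s))

def falsifiers(clause, n):
    """All assignments to p_1..p_n under which every literal of clause is 0."""
    return [
        bits
        for bits in product((0, 1), repeat=n)
        if not any(bits[abs(t) - 1] == (1 if t > 0 else 0) for t in clause)
    ]

def blockers(formula, n):
    """B(Phi): union of the blocker sets of all clauses."""
    return {blocker(strand(f)) for c in formula for f in falsifiers(c, n)}

PHI = [[-1, 3], [-1, -2, -3]]                # E_1 = ¬p_1 ∨ p_3, E_2 = ¬p_1 ∨ ¬p_2 ∨ ¬p_3
print(sorted(blockers(PHI, 3)))              # ['GGT', 'GTT', 'TTT']
```

Enumerating all 2^n assignments is only practical for small n; it is used here because it mirrors the definition of a falsifier directly.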

An Algorithm for SAT

We begin with an initial solution Z_0 that contains the set S_n of all n-strands. To know S_n, we need to know only the number of variables n (without knowing Φ). Thus we assume that such a solution is prepared in advance; it is a "ready product on a shelf". This idea is common to filtering methods. Here are two algorithms (A_1, a PCR-based algorithm, and A_1f, a fluorescence-based algorithm) for solving SAT.

ALGORITHM A_1

Input: A Boolean formula Φ of n variables.

1. Add B(Φ).

2. PCR.

3. PCR successful? If so, go to 5. If not, go to 4.

4. Output "NO" and Stop.

5. Output "YES".

6. Stop.

ALGORITHM A_1f

Input: A Boolean formula Φ of n variables.

1. Add B(Φ).

2. Detect fluorescence.

3. Fluorescence from unblocked molecules? If so, go to 5. If not, go to 4.

4. Output "NO" and Stop.

5. Output "YES".

6. Stop.

As will be explained below, algorithms A_1 and A_1f are equivalent and differ only in the detection methods used. In both algorithms, once we know the input formula Φ, we proceed to Step 1 and add B(Φ) to Z_0, obtaining Z_1. The intention of this step is to "block" (by annealing) all the n-strands which represent assignments that falsify Φ.

In Step 2 of algorithm A_1, Z_1 is PCR'ed and Z_2 is obtained. Here the only n-strands that can be successfully multiplied by PCR are the n-strands that have not been blocked in Step 1 (after the blockers from B(Φ) have been added). But these are precisely the n-strands s such that the assignment asg(s) satisfies Φ. Thus the PCR here is successful if and only if there exists an assignment satisfying Φ.

In Step 2 of algorithm A_1f, fluorescence is detected in Z_1. Fluorescence from n-strands that have been blocked in Step 1 (after blockers from B(Φ) have been added) is quenched. Only fluorescence from n-strands that have not been blocked in Step 1 is detected. These are precisely the n-strands s such that the assignment asg(s) satisfies Φ. Thus fluorescence is detected if and only if there exists an assignment satisfying Φ.

In Step 3 we check whether or not the PCR or fluorescence detection from Step 2 was successful. In algorithm A_1 this is the case if Z_2 contains "clearly" more molecules than Z_1. In algorithm A_1f this is the case if fluorescence is detected against background noise.

If the PCR was not successful or fluorescence was not detected, then we proceed to Step 4, print "NO", and stop. If the PCR was successful or fluorescence was detected, then we proceed to Step 5, print "YES", and stop in Step 6.

It must be clear by now that these algorithms print "YES" (and stop) if and only if Φ is satisfiable.

We continue our example: here n = 3 and S_3 = { CCC, CCA, CAC, CAA, ACC, ACA, AAC, AAA }. Since the blockers in B(Φ) will anneal to their complements in S_3 (in Z_0), the set of single strands in Z_1, denoted by ss(Z_1), equals S_3 minus the strands complementary to the blockers in B(Φ). Hence ss(Z_1) = { CCC, CCA, CAC, CAA, ACA }. It is easily seen that indeed asg(ss(Z_1)) is the set of all assignments satisfying Φ. Since this set is not empty, the PCR or the fluorescence detection from Step 2 will be successful, and so the algorithm will output "YES" and stop.
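The decision procedure can be mimicked in silico to make the bookkeeping explicit. The sketch below is an idealized model, not the laboratory procedure: it assumes perfect and specific annealing, so that a strand is either fully blocked or fully free, and it treats "PCR successful" or "fluorescence detected" simply as "at least one strand is left unblocked". The function names are illustrative.

```python
from itertools import product

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

def revcomp(s):
    """Watson-Crick complement of s, written 5' to 3'."""
    return "".join(COMPLEMENT[x] for x in reversed(s))

def library(n):
    """S_n under the Example 1 coding (0 coded by C, 1 coded by A)."""
    return ["".join(p) for p in product("CA", repeat=n)]

def unblocked(z0, blockers):
    """ss(Z_1): the strands of Z_0 that no blocker anneals to."""
    blocked = {revcomp(b) for b in blockers}
    return [s for s in z0 if s not in blocked]

def decide_sat(n, blockers):
    """Steps 1 to 6 of A_1 / A_1f under the idealized detection model."""
    return "YES" if unblocked(library(n), blockers) else "NO"

B_PHI = {"GGT", "GTT", "TTT"}
print(unblocked(library(3), B_PHI))   # ['CCC', 'CCA', 'CAC', 'CAA', 'ACA']
print(decide_sat(3, B_PHI))           # YES
```

In this model the two algorithms are indistinguishable, which reflects the statement that they differ only in the detection method used.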

We feel that the following comments concerning the above algorithm are needed here, even before we discuss later the "laboratory implementation" of the algorithm.

(1) We may construct B(Φ) by reading Φ from left to right, clause by clause, as follows. Let E be a clause of Φ, and we assume that literals in E are ordered according to the order p_1, ..., p_n of variables. Let, e.g., E = p_1 ∨ p_2 ∨ ¬p_4, where n = 4. Reading E from left to right we can spell out the falsifiers of E: p_1 = 0, p_2 = 0, p_3 = "any value", p_4 = 1. Thus if a variable p_i is present in E, then we set p_i = 0, and if ¬p_i is present in E, then we set p_i = 1. If neither p_i nor ¬p_i is present in E, then we set "any value", which means that p_i can be either 0 or 1. The set of blockers of E is then the set of complements of the n-strands that code the falsifiers. Thus, reading E from left to right we can spell out the blockers of E: "first G", "then G", "then either G or T", "then T" (recall that 0 is coded by C and 1 is coded by A).

Hence we have 2 blockers here: GGGT and GGTT (written in the order in which they are spelled out, which is also the 3' to 5' order of the synthesis described next). Spelling out the blockers while reading E from left to right may be considered as giving instructions (for a "robot") for synthesising the set of blockers for the clause considered. Typically, the chemical synthesis of DNA strands proceeds in the 3' to 5' direction. Hence for E as above the synthesis would go as follows.

- "first G": take a solution R_1 with "enough G" (each G nucleotide is hooked to a solid support at its 3'-end).

- "then G": attach G to all the free 5'-ends of molecules in R_1, getting in this way R_2.

- "then either G or T": divide R_2 into two solutions R_2,1 and R_2,2 of equal volume, attach G to all the free 5'-ends of molecules in R_2,1 getting R_2,1,G, attach T to all the free 5'-ends of molecules in R_2,2 getting R_2,2,T, then mix R_2,1,G with R_2,2,T getting R_3.

- "then T": attach T to all the free 5'-ends of molecules in R_3, getting in this way R_4.

Clearly R_4 contains "enough" (and "equal amounts") of all the blockers of E. If neither p_1 nor ¬p_1 is present in a clause, then the initial mixture R_1 will have "enough G" and "enough T" hooked to a solid support. Then the synthesis proceeds as outlined above.

An alternative for the mixed base synthesis ("either G or T") is the use of (artificial) universal or degenerate bases. Depending on the encoding, several options are available. Deoxyinosine, for example, efficiently hybridizes to C, slightly less efficiently to A, and badly to G and T. More specific artificial degenerate nucleotides include 6H,8H-3,4-dihydropyrimido[4,5-c][1,2]oxazin-7-one (abbreviated dP) and N6-methoxy-2,6-diaminopurine (dK), which hybridize to G or A and C or T, respectively (Hill et al., Proc. Natl. Acad. Sci. USA 95, pp. 4258-4263, 1998). All these alternative nucleotides are available for incorporation into oligonucleotides by commercial synthesis services. In fact, any nucleotide replacement or backbone modification affecting local hybridization behavior can be accommodated in a blocker/library scheme.
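The "instructions for a robot" reading of a clause can be sketched as follows. This is an illustration only: literals are written as signed integers (+i for p_i, -i for ¬p_i), the Example 1 coding is assumed (0 coded by C, 1 coded by A, so the blocker bases are G and T), and the resulting strings list the bases in the order in which they are coupled, i.e. from the 3' end to the 5' end.

```python
def synthesis_steps(clause, n):
    """For p_1..p_n in order, the base(s) to couple at that step.

    p_i present   -> falsifier has p_i = 0 (coded C) -> couple G
    ¬p_i present  -> falsifier has p_i = 1 (coded A) -> couple T
    neither       -> "any value"                     -> couple G or T
    """
    steps = []
    for i in range(1, n + 1):
        if i in clause:
            steps.append("G")
        elif -i in clause:
            steps.append("T")
        else:
            steps.append("GT")
    return steps

def split_and_mix(steps):
    """Products of the synthesis; each mixed step doubles the pool."""
    pool = [""]
    for bases in steps:
        pool = [chain + b for chain in pool for b in bases]
    return pool

steps = synthesis_steps([1, 2, -4], 4)       # E = p_1 ∨ p_2 ∨ ¬p_4, n = 4
print(steps)                                 # ['G', 'G', 'GT', 'T']
print(split_and_mix(steps))                  # ['GGGT', 'GGTT']
```

A clause with k absent variables yields 2^k blockers from only n coupling steps, which is the practical advantage of mixed-base (or degenerate-base) synthesis over synthesizing each blocker separately.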

(2) The innocent phrase "add B(Φ) to Z_0" requires some computation to ensure that all strands from Z_0 that are to be blocked will indeed be blocked. This is a part of the laboratory procedure.

(3) Since PCR is performed in Step 2 of algorithm A_1, it is clear that our representation of n-strands and blockers is very simplified. Clearly, one needs to prime strands to be amplified, and so all the n-strands will have special prefixes and suffixes that are needed for a PCR. The blockers are then modified accordingly.

An Algorithm for FIND SAT (and for FULL SAT) .

Here is an algorithm (A_2) for solving FIND SAT.

ALGORITHM A_2

Input: A Boolean formula Φ of n variables.

Steps 1 through 5 are as in A_1.

6. Take a sample n-strand s.

7. Sequence s.

8. Output asg(s).

9. Stop.

If the algorithm A_1 outputs "YES", then A_2 continues in Step 6 by taking a random sample strand from the solution Z_2, which is the "end solution" of A_1. This sample strand is sequenced in Step 7, the assignment read from the resulting sequence is output in Step 8, and A_2 stops in Step 9. It is easy to see that this algorithm A_2 either (1) stops and outputs "NO", or (2) stops and outputs "YES" and φ. Case (1) holds if and only if Φ is not satisfiable, and case (2) holds if and only if Φ is satisfiable and φ is an assignment satisfying Φ.

It should be clear by now, that by iterating PCR we can find out all the assignments that satisfy Φ.

Of course, algorithm A_2 can also be executed as a FIND SAT extension to algorithm A_1f, with the following modification to Step 6:

6. Take a sample non-quenched n-strand s.

This is because blocked (non-satisfying) assignment strands are still present in the "end solution" of algorithm A_1f. A convenient way to pick non-quenched strands is by attaching assignment strands to a solid support in an addressed manner. Alternatively, strands may be sorted as shown in figure 7.

The Full Satisfiability Problem (FULL SAT) is to determine whether or not an arbitrary Boolean formula Φ is satisfiable, and if it is, to give all assignments satisfying Φ. Here is an algorithm A₃ for solving FULL SAT.

ALGORITHM A_3

Input: A Boolean formula Φ of n variables.

Steps 1 through 8 are as in algorithm A_2.

9. Add b(s)

10. PCR

11. PCR Successful? If so, go to 6. If not, go to 12.

12. Output "END"

13. Stop.

In Step 9 we add b(s), blocking in this way the strands representing the last successful assignment asg(s) that we found. The resulting solution Z_3 is then PCR'ed in Step 10, yielding solution Z_4; the blocked strands s are then not multiplied by the PCR.

In Step 11 we check whether or not the PCR from Step 10 was successful. This is the case if Z_4 "clearly" contains more molecules than Z_3.

If this PCR was not successful, then we proceed to Step 12 printing "END", and then the algorithm stops.

If this PCR was successful, then we go back to Step 6 and repeat the cycle of discovering a new assignment satisfying Φ and checking whether there are more assignments that satisfy Φ. Again, algorithm A₃ can be easily modified to yield an algorithm for FULL SAT in a fluorescence based approach.
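The sample, block and re-amplify cycle of the FULL SAT algorithm can be sketched in the same idealized way. The assumptions are the same as for the decision procedure: perfect blocking, the Example 1 coding, and "PCR successful" modeled as "some strand is still unblocked". Sampling a strand from the solution is modeled by a pseudo-random choice; the names used are illustrative.

```python
import random
from itertools import product

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

def revcomp(s):
    return "".join(COMPLEMENT[x] for x in reversed(s))

def asg(s):
    """Assignment coded by an n-strand (A codes 1, C codes 0)."""
    return tuple(1 if x == "A" else 0 for x in s)

def full_sat(n, blockers, rng=random):
    """Idealized run of the FULL SAT algorithm; returns (answer, assignments)."""
    blocked = {revcomp(b) for b in blockers}                  # Step 1: add B(Phi)
    free = [s for s in ("".join(p) for p in product("CA", repeat=n))
            if s not in blocked]
    if not free:                                              # Steps 2-4: PCR fails
        return "NO", []
    found = []
    while free:                                               # Step 11: PCR successful?
        s = rng.choice(free)                                  # Step 6: sample a strand
        found.append(asg(s))                                  # Steps 7-8: sequence, output
        free.remove(s)                                        # Step 9: add b(s)
    return "YES", found                                       # Step 12: "END"

answer, assignments = full_sat(3, {"GGT", "GTT", "TTT"})
print(answer, sorted(assignments))
# YES [(0, 0, 0), (0, 0, 1), (0, 1, 0), (0, 1, 1), (1, 0, 1)]
```

The loop runs once per satisfying assignment, so the number of PCR rounds grows with the number of solutions rather than with the size of the library.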

In order to facilitate a better understanding of experimental implementation of the present invention we give below two descriptions of blocking procedures verified by our laboratory experiments.

PROCEDURE FOR BLOCKING BY PCR

To experimentally test the principle of blocking, a single-stranded DNA molecule 75 nucleotides in length was synthesized (ISOGEN Bioscience BV Maarsen, The Netherlands). This molecule represents a potential solution to a computing problem and functions as the template molecule, which can be amplified by PCR. Specific primers (ISOGEN Bioscience BV Maarsen, The Netherlands) were obtained so that the template could be multiplied by PCR (fig. 5). The five nucleotides at the 5' end of the PNA blocking molecules are the same as the five nucleotides at the 3' end of one of the primers used (fig. 5). This region of overlap of five nucleotides results in a situation where either a PNA blocker or a primer can bind to the template. By hybridizing with their complementary template sequence the PNA molecules prevent hybridization of one of the primers with the template. Because PNAs cannot be extended by DNA polymerases, hybridization of PNAs results in "blocking" of the polymerization.

Two different 13-mer PNA blocker molecules were synthesized (ISOGEN Bioscience BV, Maarsen, The Netherlands). PNAs were chosen as blocking molecules because they hybridize sequence-selectively with DNA and do so with a higher affinity, and thus at a higher temperature, than comparable DNA molecules. This is incorporated in the PCR by lowering the temperature after melting to a temperature at which the PNA blockers can anneal to the template DNA. After this step the temperature is lowered further to the annealing temperature of the DNA primers.

One PNA blocker molecule, called B2, is perfectly complementary to a region of the template whereas blocker B3 has a mismatch one nucleotide from the 3' terminus. This mismatch should result in a very small difference in hybridization stability between B2 and B3, and so in a decrease in blocking efficiency of B3 relative to B2. If it were possible to detect this decrease in blocking efficiency, one could use blocking to discriminate between solution molecules differing from one another by only one nucleotide. The positive control is the PCR mixture with template DNA but without PNA blocker. With this mixture normal amplification by PCR should be observed. As a negative control a PCR mixture containing all components, except the DNA template, is taken along. After the PCR the resulting DNA is analyzed on an ethidium bromide-stained polyacrylamide gel (fig. 5).

As can be seen from the gel in figure 5, addition of either of the two blockers B2 or B3 clearly reduces the amount of product formed in the PCR. In case of B2, the amount of PCR product is below the detection limit of the gel. If B3 is used, a faint band can be seen. When the perfectly complementary blocker B2 is used, relatively more reduction is achieved than with the one-mismatch blocker B3, so even one mismatch can indeed yield a detectable reduction of blocking efficiency. Quantification of the double-stranded DNA (dsDNA) concentration after PCR (fig. 5) indicates that addition of B2 reduces the concentration of dsDNA from 2.3 ng/μL to 0.2 ng/μL, a reduction of 91%. Addition of B3 also results in less dsDNA after PCR, though the reduction of 78% is significantly less than the 91% achieved by B2.

From the results of the described experiment it can be concluded that blocking of PCR amplification using specific PNA blockers is possible. Using a fully complementary PNA blocker, a 91% reduction in the amount of dsDNA after PCR can be achieved. If, however, a one-mismatch blocker is used, the blocking efficiency is reduced significantly to 78%. These results show that the method of blocking by PNA is possible. It is clear that the percentages of blocking can be further improved. Further evaluation of the optimal conditions for blocking could easily be performed using the LightCycler™ (Roche Diagnostics Nederland B.V., Almere, The Netherlands).
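The blocking percentages follow from the dsDNA concentrations by simple arithmetic, which the short sketch below makes explicit. Only the values 2.3 ng/μL, 0.2 ng/μL, 91% and 78% are taken from the text; the dsDNA concentration for B3 is not reported, so the value printed for it is a back-calculation from the stated 78% and should be read as an estimate.

```python
def percent_reduction(control, treated):
    """Reduction of dsDNA after PCR relative to the unblocked control, in percent."""
    return 100.0 * (control - treated) / control

CONTROL = 2.3        # ng/uL, template without blocker
WITH_B2 = 0.2        # ng/uL, template with fully complementary blocker B2

print(round(percent_reduction(CONTROL, WITH_B2)))       # 91

# Back-calculated (not reported) concentration implied by a 78% reduction for B3.
with_b3_estimate = CONTROL * (1 - 0.78)
print(round(with_b3_estimate, 2))                        # 0.51

# Difference in blocking efficiency between B2 and B3, in percentage points.
print(91 - 78)                                           # 13
```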

Another objective was to prove that a significant reduction in blocking efficiency could be detected if a PNA blocker with one mismatch was used. If it were possible to detect this decrease in blocking efficiency, one could use blocking to discriminate between DNA molecules differing from one another by only one nucleotide. For this purpose the blocker B2, which had a fully complementary sequence to part of the template, was used and compared with B3, which had one mismatch at the second nucleotide from the 3' terminus. Theoretically one would expect the association of B2 to the template to be just somewhat stronger than that of B3. From the results of the experiment described it can be concluded that under the experimental conditions tested B2 blocks better than B3, though the difference is relatively small. This relatively small decrease in blocking efficiency (13 percentage points) can be explained by the problem of the relatively slow cooling of the PCR apparatus to annealing temperature, during which the one-mismatch blocker can anneal to the template. Another reason is that the mismatch in B3 is one nucleotide from the 3' terminus. According to recent literature (Igloi, 1998) this results in a very small difference in hybridization stability between B2 and B3, so only a very small decrease in blocking efficiency of B3. If the mismatch had been in the third or fourth position from the 3' terminus the difference in hybridization stability between B2 and B3 would have been considerably larger, resulting in a bigger difference in blocking efficiency. So, the comparison between B2 and B3 is in fact the one at which one would expect to see a very small difference in blocking efficiency. This difference could be detected, indicating that the experimental setup, despite the limitations described above, is quite effective.

PROCEDURE FOR BLOCKING BY FLUORESCENCE QUENCHING

Fluorescently labeled molecules may be blocked by means of fluorescence quenching or fluorescence resonance energy transfer (FRET). If a quenching molecule is brought in close proximity to a fluorescently labeled molecule, its fluorescence will be quenched, i.e. the fluorescence signal will decrease (be blocked). This technique can be applied to a library/blocking system by attaching a fluorophore to all library molecules and a quenching moiety to the blocking agents. If both library and blockers are DNA molecules, these can be brought in close proximity by hybridization. Thus, blocking of an assignment is equivalent to hybridization of a DNA strand representing this assignment to a blocker oligonucleotide. Several configurations of library/blocker molecules with fluorophores are shown in figure 1. The feasibility of this approach was confirmed by two different experimental approaches.

In a first experiment, two library and two blocker DNA oligonucleotides were synthesized (Eurogentec, Herstal, Belgium). Sequences of the library molecules were 5' GGGG AAGT GAAT AAGT AAGT T and 5' GGGG AAGT GAAT GAAT GAAT T. These oligonucleotides represent 4-bit binary assignments, with sequence 5' AAGT encoding "0" and sequence 5' GAAT encoding "1". Four guanine residues were added at the 5' end, and one thymine residue at the 3' end, to provide extra hybridization strength. So the library molecules can be thought of as encoding assignments 0100 and 0111, respectively.

Sequences of the blocking molecules were 5' A AYTY ATTC ACTT ATTC CCCC and 5' A ACTT AYTY ATTC ACTT CCCC, in which Y indicates incorporation of either C or T (mixed base synthesis). Since the encoding for the blockers is the Watson-Crick inverse of the library encodings, these blockers will block molecules encoding assignments 101- and 01-0, respectively, in which "-" indicates either "1" or "0". Thus, in the four possible combinations of blocker with library molecule, three combinations will not result in blocking. Only the combination of the library oligonucleotide with assignment 0100 and blocker to assignment 01-0 will result in blocking.
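The claim that only one of the four library/blocker combinations blocks can be checked from the sequences alone. The sketch below assumes idealized all-or-nothing hybridization (a blocker blocks only if some variant of it is fully complementary to the library molecule), and it assumes that 5' AAGT reads as bit 0 and 5' GAAT as bit 1, which is the reading under which the two library sequences decode to the stated assignments 0100 and 0111. Y is expanded as C or T, as stated in the text.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}
ALLOWED = {"A": "A", "C": "C", "G": "G", "T": "T", "Y": "CT"}   # Y = C or T
BIT = {"AAGT": "0", "GAAT": "1"}     # assumed reading, see the paragraph above

def decode(lib):
    """Assignment coded by a library oligo (drops the 5' GGGG and the 3' T)."""
    core = lib[4:-1]
    return "".join(BIT[core[i:i + 4]] for i in range(0, len(core), 4))

def blocks(blocker, lib):
    """True if some variant of the (degenerate) blocker is fully complementary."""
    if len(blocker) != len(lib):
        return False
    # Base i of the blocker (5' to 3') pairs with base -1-i of the library strand.
    return all(COMPLEMENT[x] in ALLOWED[b] for b, x in zip(blocker, reversed(lib)))

LIBRARY = ["GGGGAAGTGAATAAGTAAGTT", "GGGGAAGTGAATGAATGAATT"]
BLOCKERS = {"101-": "AAYTYATTCACTTATTCCCCC", "01-0": "AACTTAYTYATTCACTTCCCC"}

for name, blocker in BLOCKERS.items():
    for lib in LIBRARY:
        print(name, decode(lib), blocks(blocker, lib))
# 101- 0100 False
# 101- 0111 False
# 01-0 0100 True
# 01-0 0111 False
```

Real hybridization is of course graded rather than all-or-nothing, which is why the measurements described next depend on temperature.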

Library molecules were labeled at the 5' terminus with the fluorophore Alexa 488 (Molecular Probes, Leiden, Netherlands), blocker molecules at the 3' terminus with the fluorophore TAMRA (N,N,N',N'-tetramethyl-6-carboxyrhodamine). In case of hybridization, these labels would come together closely, as depicted in figure 1a, and fluorescence should be quenched by TAMRA, thus resulting in a decrease of the Alexa 488 fluorescence signal detected for the complex. In addition, an increase of the fluorescence emission from TAMRA (signal at 576 nm) should be observed. Hybridization is highly dependent on environmental conditions, especially temperature. At low (permissive) temperatures, many oligonucleotides will form double-stranded DNA, even if the sequences are not completely complementary. At higher (restrictive) temperature, only perfect complements will form a stable duplex. At even higher temperatures, these most stable duplexes will also dissociate.

Fluorescence emission spectra of all four possible library/blocker combinations were measured (Perkin-Elmer LS 50B luminescence spectrometer with water circulation temperature control). Samples were excited at a wavelength of 460 nm, which induces fluorescence emission at 520 nm from Alexa 488, whereas no emission is induced from TAMRA. Therefore, the fluorescence signal at 520 nm can be used for monitoring the quenching of Alexa 488 fluorescence, i.e. the amount of hybridization in a mixture of library and blocker molecules. However, a clear dependence of the fluorescence signal on the amount of blocking was observed only for one of the four library/blocker combinations. A control experiment using a library molecule containing no fluorophore revealed that an additional quenching effect occurred due to interaction of TAMRA with the four guanine residues at the 5' terminus of the library molecules. Similar effects have been observed for other rhodamine-like fluorophores (Knemeyer et al., Anal. Chem. 72, pp. 3717-3724, 2000). This shows that guanine residues can be used as quenching moieties to detect blocking.

In order to achieve a more sensitive detection of library/blocker interactions, other DNA oligonucleotides were designed and synthesized. To eliminate the additional quenching, the distance between the two fluorophores was increased, Alexa 488 was replaced with fluorescein (FAM: 5'-carboxyfluorescein, which has spectral properties almost identical to those of Alexa 488) and the number of guanines in the encoding was minimized.

Two library oligonucleotides were synthesized: 5' T CTT CAT CTT CTT C (Lb04, assignment 0100) and 5' T CTT CAT CAT CAT C (Lb07, assignment 0111). 5' CTT represents a bit set to "0", 5' CAT represents a bit set to "1". At both ends an extra constant nucleotide is added to prevent terminal mismatches (which would result in less predictable hybridization thermodynamics). One blocker oligonucleotide was synthesized: 5' G AAG AAG ATG AAG A (B1B0, blocker to assignment 0100). Fluorescein was attached to the 5' terminus of the library molecules, and TAMRA to the 5' terminus of the blocker molecule. In case of hybridization, the configuration would be as shown in figure 1b.
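That B1B0 is the exact complement of Lb04 and an imperfect match for Lb07 can be verified by counting mismatches against the reverse complement of the blocker. The following is a sketch for illustration; it only compares sequences position by position and makes no statement about duplex stability.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}
BIT = {"CTT": "0", "CAT": "1"}

def revcomp(s):
    """Watson-Crick complement of s, written 5' to 3'."""
    return "".join(COMPLEMENT[x] for x in reversed(s))

def decode(lib):
    """Assignment coded by a library oligo (drops the constant end nucleotides)."""
    core = lib[1:-1]
    return "".join(BIT[core[i:i + 3]] for i in range(0, len(core), 3))

def mismatches(lib, blocker):
    """Number of positions at which lib differs from the blocker's perfect target."""
    target = revcomp(blocker)
    return sum(a != b for a, b in zip(lib, target))

LB04 = "TCTTCATCTTCTTC"
LB07 = "TCTTCATCATCATC"
B1B0 = "GAAGAAGATGAAGA"

print(decode(LB04), mismatches(LB04, B1B0))   # 0100 0
print(decode(LB07), mismatches(LB07, B1B0))   # 0111 2
```

The two mismatches in the Lb07 combination correspond to the two bits in which 0111 differs from 0100, one mismatch per differing bit in this encoding.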

The two possible library/blocker combinations were again excited at 460 nm and fluorescence emission spectra were measured (shown in figure 6). At 37 °C (restrictive temperature), the fluorescence emission from fluorescein is quenched in the blocking combination (Lb04 + B1B0), but not in the non-blocking combination (Lb07 + B1B0). Some fluorescein emission is still observed in the blocking combination, indicating that complete quenching is not achieved. This is most likely due to the rather large distance between the two fluorophores, resulting in a background level even for complete hybridization. At high temperature (52 °C), even the blocking combination melts (the DNA duplex dissociates), and fluorescein emission is restored. Figure 6b clearly shows that the fluorescence signal is significantly (125%) increased compared to the background level (where blocker and library oligonucleotides are not associated through hybridization). No fluorescence spectra were measured at permissive temperature, as this temperature was too low for the equipment used. However, measurements in a temperature range of 22 °C to 62 °C show that some hybridization still occurs in non-blocking complexes at temperatures below 37 °C. The melting temperature (Tm, equilibrium at which 50% of DNA exists in single-stranded form) for the blocking combination was measured to be around 45 °C. The shape of the partial melting curve of the non-blocking combination leads to an estimated Tm of 10 °C. Both these values agree with thermodynamic estimates of Tm.

In conclusion, blocking using fluorescent labels is feasible and, if conditions and encoding are carefully chosen, completely predictable. Fluorescence measurements at one or two temperatures are sufficient for discrimination between blocking and non-blocking.

In the experiments described above, the library/blocker combinations were prepared physically separated in different test tubes. For large problems, requiring large libraries, this approach can be miniaturized (and automated) by adding blockers to library molecules spatially separated in either micro well plates or on solid supports (DNA array technology). For even larger combinatorial libraries, a high-throughput capillary system as depicted in figure 7 may be used.

Brief description of the figures

Figure 1. Schematic representation of a method in which detection of a biological molecule of a combinatorial library may be blocked by a blocking agent. In this example the blocker comprises a sequence that is complementary to a part of the nucleic acid sequence of the biological molecule thereby allowing hybridization. In figure 1a, both the biological molecule and the blocker comprise a fluorescent label (indicated with an open circle). The fluorescent label attached to the biological molecule is excited by light of wavelength λ_1 and emits light of wavelength λ_2. The fluorescent label attached to the blocker is excited by light of wavelength λ_2 and emits light of wavelength λ_3. Through the close proximity of the labels of the blocker and the biological molecule, fluorescence of at least the label associated with the biological molecule is quenched (FRET; Fluorescence Resonance Energy Transfer) thereby not allowing detection of the light of wavelength λ_2 emitted by the biological molecule that is associated with the blocker. Through the close proximity of the labels of the blocker and the biological molecule, FRET will often cause the fluorescent label attached to the blocker to emit light at wavelength λ_3, thus providing an extra hybridization signal. Quenching of course does not take place when no blocker is associated, thus allowing detection of the label associated with the biological molecule when the biological molecule is not associated with the blocker. FRET is a phenomenon that is dependent on the distance between two fluorescent labels. Depending on the fluorescent labels used, attachment of the labels to the biological molecule and the blocker at opposite ends of the molecules as shown in figure 1b may give a better discrimination between hybridized and non-hybridized molecules. It is also possible to attach fluorescent labels to any place on the blocker or on the biological molecule.
In a third possible configuration, shown in figure 1c, two fluorescent labels are attached to opposite ends of the blocker. Both ends of the blocker contain self-complementary sequences, but the internal sequence is complementary to the biological molecule. In the absence of such a biological molecule, the blocker will hybridize intramolecularly, thus bringing the two fluorescent labels in close proximity and through FRET preventing emission of light of wavelength λ_2. In the presence of a complementary biological molecule, the internal sequence in the blocker will hybridize to this biological molecule, and the two fluorescent labels on the blocker will be separated by a distance which decreases the FRET efficiency. Such a configuration is called a molecular beacon (Tyagi and Kramer, Nature Biotechnology 14, pp. 303-308, 1996); molecular beacons are extremely sensitive sequence recognition tools.

Figure 2. Schematic representation of a method in which detection of a biological molecule of a combinatorial library may be blocked by a blocking agent. In this example the blocker comprises a sequence that is complementary to a part of the nucleic acid sequence of the biological molecule, thereby allowing hybridization. Hybridization of the blocker prevents at least in part hybridization of a PCR primer, thereby disabling at least in part the amplification of biological molecule nucleic acid and thereby subsequent detection. Amplification of biological molecules not comprising a blocker is not impaired, thereby allowing amplification and subsequent detection of unblocked biological molecules.

Figure 3. Schematic representation of a method in which detection of a biological molecule of a combinatorial library may be blocked by a blocking agent. In this example the blocker comprises a peptide nucleic acid (PNA) sequence that is complementary to a part of the nucleic acid sequence of the biological molecule, thereby allowing hybridization. Hybridization of the blocker in this example inhibits a selectable quality such as a restriction site, thereby, in the case of a restriction site, disallowing cloning of the blocked biological molecule. When the detection system comprises detection of cloned biological molecules, inhibition of the selectable quality results in blocking of detection of blocked biological molecules.

Figure 4. Schematic representation of an example wherein the blocker is associated with a solid support. The solid support may be a bead as depicted and/or a two-dimensional surface. In this schematic representation the blocker is associated with a biological molecule of the library. In an alternative configuration, the biological molecule of the library can be attached to a solid support and the blocker can be in solution.

Figure 5. Photograph of an ethidium bromide stained 12% polyacrylamide gel containing separated DNA after PCR. On top is indicated whether template DNA and/or PNA was added to the PCR mixture. Each PCR mixture also contained 2.5 × 10⁻⁷ M of both primers OMP351 and P1A, 0.25 mM of each dNTP and 5 U SuperTaq polymerase (HT Biotechnologies Ltd., Cambridge, UK) in a total volume of 100 μl SuperTaq buffer. Before starting the PCR, 70 μl of mineral oil was put on top to avoid evaporation. The PCR was performed on a Hybaid™ thermal reactor (HYBAID, Middlesex, UK), using the following program: 10' 95°C, 17 cycles {1' 95°C, 1' 56°C, 1' 48°C, 1' 72°C}, 10' 72°C. Of each sample 20 μl was mixed with 5 μl loading buffer and separated on a 12% polyacrylamide gel at 10 V/cm for 2 hours (detailed procedure: J. Sambrook, E.F. Fritsch, T. Maniatis, Molecular Cloning: A Laboratory Manual. Cold Spring Harbor Laboratory Press, 1989). DNA was visualized with UV after ethidium bromide staining. DNA concentration after PCR was determined using the PicoGreen dsDNA Quantitation Kit (Molecular Probes, Eugene, Oregon, USA) and a LS50B luminescence spectrometer with well-plate reader (Perkin-Elmer Corp., Analytical Instruments, Norwalk CT, USA).
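
The quantities stated in the legend of figure 5 can be converted into absolute amounts per reaction with simple arithmetic. The Python sketch below does this for the primers and dNTPs and sums the programmed hold times of the thermal program. It is a bookkeeping illustration only: the variable names are chosen here, and the time total ignores temperature ramping between steps, which the legend does not specify.

```python
# Values taken from the legend of figure 5.
PRIMER_CONC_M = 2.5e-7       # concentration of each primer (mol/l)
DNTP_CONC_M = 0.25e-3        # concentration of each dNTP (mol/l)
REACTION_VOLUME_L = 100e-6   # total reaction volume (l)

# Amount of substance = concentration x volume.
primer_pmol = PRIMER_CONC_M * REACTION_VOLUME_L * 1e12  # mol -> pmol
dntp_nmol = DNTP_CONC_M * REACTION_VOLUME_L * 1e9       # mol -> nmol

# Thermal program: 10' 95°C, 17 cycles {1' 95°C, 1' 56°C, 1' 48°C, 1' 72°C}, 10' 72°C.
INITIAL_HOLD_MIN = 10
CYCLES = 17
CYCLE_STEPS_MIN = [1, 1, 1, 1]  # hold times at 95, 56, 48 and 72 °C
FINAL_HOLD_MIN = 10

# Programmed hold time only; ramp times between temperatures are not included.
total_minutes = INITIAL_HOLD_MIN + CYCLES * sum(CYCLE_STEPS_MIN) + FINAL_HOLD_MIN

print(f"each primer: {primer_pmol:.1f} pmol per reaction")
print(f"each dNTP:   {dntp_nmol:.1f} nmol per reaction")
print(f"programmed hold time: {total_minutes} min")
```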

Figure 6. Detection of blocking using fluorescent labels attached to blockers and library molecules. Library molecules are DNA 14-mers labeled at the 5' terminus with fluorescein (excitation and emission maxima: 492 nm and 520 nm, respectively). Blocker molecules are DNA 14-mers labeled at the 5' terminus with TAMRA (excitation and emission maxima: 565 nm and 580 nm, respectively). Emission spectra of mixtures of blocker and library molecules were measured in an ultra-micro Suprasil quartz glass cuvette with a 3 mm light path (Hellma, Rijswijk, The Netherlands) using a Perkin-Elmer LS50B luminescence spectrometer and water circulation temperature control. Figure 6A shows spectra of the mixture: 1.4 μM library molecule LB07, 1.6 μM blocker molecule BLB0, 1x SSC (150 mM NaCl, 15 mM NaCitrate, pH 7.0), a non-blocking combination. Figure 6B shows spectra of the mixture: 1.4 μM library molecule LB04, 1.6 μM BLB0, 1x SSC, a blocking combination. The dotted line indicates values measured at 37 °C, the solid line values measured at 52 °C. Quenching of fluorescein by TAMRA is present at 37 °C in the blocking combination, but not in the non-blocking combination. Quenching in the blocking mixture is relieved by raising the temperature, melting the DNA duplex.

Figure 7. Schematic representation of a high-throughput device for discriminating between blocked and unblocked molecules by fluorescence energy transfer. A mixture of library and blocker molecules which may or may not contain a solution to a certain problem is pumped through a glass capillary (inner diameter 20 micrometer). An autosampler may serve for feeding a large number of samples (e.g. from 96 well-plates) into the capillary system. Via a microscope objective a laser beam (continuous argon laser, wavelength = 488 nm) is focused onto the capillary (illuminated volume approximately 1-2 picoliter). [...] fluorescent photons emitted from the molecules passing through the detection volume. By means of a dichroic mirror the laser light is blocked whereas the fluorescence light can pass to the detectors. For simultaneous detection of fluorescence photons emitted at wavelength λ₂ (as in figure 1) and at wavelength λ₃, two avalanche photodiodes, equipped with suitable filters, are used. The electronic signals from these photodiodes are fed into a switching device, which serves for directing the sample flow coming out of the capillary. In case of a low signal measured at wavelength λ₂ and a high one at λ₃ (which indicates blocking), the sample is directed to the waste container; otherwise it is directed to the solution container.
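
The routing rule of the switching device in figure 7 (low signal at λ₂ together with high signal at λ₃ means blocked, hence waste; anything else goes to the solution container) can be written as a small decision function. The Python sketch below is an illustration under stated assumptions: the `Thresholds` type, the function name `route_sample` and the numeric threshold values are hypothetical, and the signals are in arbitrary units, since the document gives no numerical criteria for "low" and "high".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Thresholds:
    """Hypothetical decision levels, in arbitrary photodiode signal units."""
    lambda2_low: float   # a λ₂ signal below this counts as "low"
    lambda3_high: float  # a λ₃ signal above this counts as "high"


def route_sample(signal_lambda2: float, signal_lambda3: float,
                 thresholds: Thresholds) -> str:
    """Return the container a sample is directed to.

    A sample counts as blocked only when the λ₂ signal is low AND the
    λ₃ signal is high; every other combination is kept as a candidate.
    """
    blocked = (signal_lambda2 < thresholds.lambda2_low
               and signal_lambda3 > thresholds.lambda3_high)
    return "waste" if blocked else "solution"


# Assumed example thresholds and readings, for illustration only.
limits = Thresholds(lambda2_low=100.0, lambda3_high=400.0)
print(route_sample(20.0, 900.0, limits))   # quenched λ₂, strong λ₃
print(route_sample(800.0, 50.0, limits))   # strong λ₂, weak λ₃
```

Requiring both conditions before discarding a sample follows the legend: a sample with an ambiguous signal pattern is kept rather than sent to waste.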

Claims

1. A method for detecting in a library of biological molecules representing a set of combinations of values for variables of a computational problem a possibly present biological molecule representing a combination of values for said variables, which combination is a true solution for said problem, characterized in that at least one biological molecule representing a false solution of said problem is blocked.

2. A method according to claim 1, wherein said library represents essentially all combinations of values for variables of said computational problem.

3. A method according to claim 1 or claim 2, wherein said biological molecule representing a true solution is not blocked.

4. A method according to any one of claims 1-3, wherein essentially all biological molecules representing false solutions of said computational problem are blocked.

5. A method according to any one of claims 1-4, further comprising blocking an identified biological molecule representing a true solution of said problem and identifying in said library a possibly present biological molecule representing a combination of values for variables, which combination is another true solution to said problem.

6. A method according to any one of claims 1-5, wherein said blocking prevents detection of a blocked biological molecule as a true solution.

7. A method according to any one of claims 1-6, wherein a part of a biological molecule is blocked.

8. A method according to claim 7, wherein said part represents a combination of values for variables of a subproblem of said computational problem.

9. A method according to claim 8, wherein said part represents a false solution for said subproblem.

10. A method according to any one of claims 7-9, wherein said part represents a clause of a conjunctive normal form representation of said computational problem.

11. A method according to any one of claims 1-10, wherein said library of biological molecules comprises nucleic acid or a functional analogue thereof.

12. A method according to claim 11, wherein a biological molecule is blocked by the hybridization thereto of a nucleic acid or a functional equivalent thereof, comprising complementarity to at least part of said biological molecule.

13. A method according to any one of claims 1-12, wherein a blocking agent is capable of blocking detection of two or more biological molecules, said molecules representing different combinations of values for variables of said computational problem.

14. A method according to claim 13, wherein said two or more biological molecules represent false solutions of said problem.

15. A method according to any one of claims 11-14, wherein said blocking agent comprises at least one universal nucleotide or analogue thereof.

16. A method according to any one of claims 1-15, wherein said blocking agent is capable of blocking biological molecules that represent the same combination of values for variables for at least one clause of a conjunctive normal form representation of said computational problem.

17. A method according to any one of claims 11-16, wherein said nucleic acid comprises peptide nucleic acid.

18. A method according to any one of claims 1-17, further comprising subjecting said library to an amplification step.

19. A method according to any one of claims 1-18, wherein said blocking at least in part prevents amplification of a blocked biological molecule.

20. A method according to claim 18 or claim 19, wherein said amplification step comprises a nucleic acid amplification reaction such as polymerase chain reaction and/or a nucleic acid amplification in a cell.

21. A method according to any one of claims 1-20, wherein said blocking results in quenching of fluorescence of a blocked biological molecule.

22. A method according to any one of claims 1-21, wherein a biological molecule and/or a blocked biological molecule is linked to a solid surface.

23. A method according to claim 22, wherein said solid surface comprises a multiplicity of compartments wherein each of said compartments comprises at least one biological molecule of said library.

24. Use of a blocking agent for enabling elimination of detection of a biological molecule in a library of biological molecules representing a set of combinations of values for variables of a computational problem.

25. A method according to any one of claims 1-23 or a use according to claim 24, wherein said computational problem comprises a SAT problem and/or a SAT related problem.
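
The logic of the blocking principle for a SAT problem in conjunctive normal form can be illustrated in software. The Python sketch below is an in-silico analogy only and is not the laboratory procedure of the claims: library molecules are modelled as tuples of truth values, and each blocker as a pattern that fixes the variables of one clause to the values falsifying that clause, with `None` standing in for a position that matches any value (comparable to the universal nucleotide of claim 15). All names and the example formula are chosen here for illustration.

```python
from itertools import product

# A literal is (variable_index, is_positive); a clause is a list of literals.
# Example formula (assumed): (x0 OR x1) AND (NOT x0 OR x2) AND (NOT x1 OR NOT x2)
formula = [
    [(0, True), (1, True)],
    [(0, False), (2, True)],
    [(1, False), (2, False)],
]


def blocker_for(clause, n_vars):
    """Pattern matching exactly the assignments that falsify the clause.

    Returns None for a clause containing both x and NOT x, because such
    a clause cannot be falsified and needs no blocker.
    """
    pattern = [None] * n_vars  # None = wildcard position
    for var, positive in clause:
        falsifying_value = not positive
        if pattern[var] is not None and pattern[var] != falsifying_value:
            return None
        pattern[var] = falsifying_value
    return tuple(pattern)


def is_blocked(molecule, blocker):
    """A blocker binds when every non-wildcard position matches."""
    return all(b is None or b == m for m, b in zip(molecule, blocker))


def unblocked_molecules(clauses, n_vars):
    """Build the full library, apply all blockers, return what stays unblocked."""
    blockers = []
    for clause in clauses:
        blocker = blocker_for(clause, n_vars)
        if blocker is not None:
            blockers.append(blocker)
    library = product([False, True], repeat=n_vars)  # all 2**n_vars combinations
    return [m for m in library if not any(is_blocked(m, b) for b in blockers)]


solutions = unblocked_molecules(formula, 3)
print(solutions)
```

One blocker per clause suffices in this model because a single pattern with wildcards covers every library molecule that falsifies that clause, which corresponds to a blocking agent that blocks two or more molecules (claims 13 and 16). The molecules that remain unblocked are exactly the satisfying assignments.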