WO2021194635A1 - Combinatorial affinity-based analysis assemblies and methods - Google Patents

Combinatorial affinity-based analysis assemblies and methods

Info

Publication number
WO2021194635A1
WO2021194635A1 PCT/US2021/016075 US2021016075W WO2021194635A1 WO 2021194635 A1 WO2021194635 A1 WO 2021194635A1 US 2021016075 W US2021016075 W US 2021016075W WO 2021194635 A1 WO2021194635 A1 WO 2021194635A1
Authority
WO
WIPO (PCT)
Prior art keywords
datum
sample
analytes
vector
low
Application number
PCT/US2021/016075
Other languages
French (fr)
Inventor
Stephen P. Mcgrew
Original Assignee
Mcgrew Stephen P
Application filed by Mcgrew Stephen P

Classifications

    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N33/00 Investigating or analysing materials by specific methods not covered by groups G01N1/00 - G01N31/00
    • G01N33/48 Biological material, e.g. blood, urine; Haemocytometers
    • G01N33/50 Chemical analysis of biological material, e.g. blood, urine; Testing involving biospecific ligand binding methods; Immunological testing
    • G01N33/53 Immunoassay; Biospecific binding assay; Materials therefor
    • G01N33/543 Immunoassay; Biospecific binding assay; Materials therefor with an insoluble carrier for immobilising immunochemicals
    • G01N33/54313 Immunoassay; Biospecific binding assay; Materials therefor with an insoluble carrier for immobilising immunochemicals the carrier being characterised by its particulate form
    • G01N33/54326 Magnetic particles
    • G01N33/54333 Modification of conditions of immunological binding reaction, e.g. use of more than one type of particle, use of chemical agents to improve binding, choice of incubation time or application of magnetic field during binding reaction

Definitions

  • the present disclosure relates to the identification of macro and/or micro molecules. Particular embodiments relate to analysis methods including combinatorial affinity-based analysis methods and/or assemblies
  • the goal is typically to determine qualitative data, quantitative data, and/or some combination of both.
  • the analysis can provide a general identity of molecules or portions of molecules, or it can provide a very specific identity.
  • Assays and methods may identify specific molecules, proteins, moieties, complexes or elements.
  • the “lateral flow assay” has been used as a method for detecting the presence or absence of specific substances (“targets” or “antigens” or “pathogens”) in a sample.
  • in such assays, the antibodies are “designed” to bind to one, and only one, specific antigen.
  • An entire industry has built up around developing those highly specific antibodies and incorporating them into various antibody-based tests including tests for drugs of abuse, histocompatibility tests, tests for particular viruses, bacteria and other pathogens, and so on.
  • ELISA sandwich assays in which a single type of highly specific antibody is printed on the bottom of each well in an array of wells, and the amount of antigen from a sample that binds to each well is interpreted as indicative of the concentration of the corresponding antigen in the sample.
  • PCT/AU2016/000371 by MacDonald describes a lateral flow assay that employs multiple multi-specific or mono-specific antibodies to produce a “digital” display, such as a 7-segment alphanumeric display. MacDonald explicitly does not use low-specificity antibodies, stating that the assay “preferably has no cross-reactivity”.
  • the present disclosure provides heretofore unknown combinatorial affinity-based analysis assemblies and methods.
  • a binding affinity-based assay method comprising: providing a first set of known low-specificity binding elements; presenting a known sample to the first set to define a first datum vector; providing a second set of known low-specificity binding elements, the first and second sets including the same known low-specificity binding elements; presenting an unknown sample to the second set to acquire a second datum vector; and comparing the first and second datum vectors to determine similarity or difference between the known and unknown samples.
  • the “difference” between two datum vectors may, depending on context, mean the vector difference, or the root-mean-square difference, or any other useful difference measure.
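The difference measures named above can be illustrated with a minimal sketch. The function names and the example readouts are hypothetical, and cosine similarity is added here only as one instance of "any other useful difference measure"; the disclosure does not prescribe a particular implementation.

```python
import math


def vector_difference(a, b):
    """Element-wise difference a - b between two datum vectors."""
    if len(a) != len(b):
        raise ValueError("datum vectors must have the same length")
    return [x - y for x, y in zip(a, b)]


def rms_difference(a, b):
    """Root-mean-square of the element-wise difference."""
    diff = vector_difference(a, b)
    return math.sqrt(sum(d * d for d in diff) / len(diff))


def cosine_similarity(a, b):
    """Cosine of the angle between two datum vectors (1.0 means same direction)."""
    if len(a) != len(b):
        raise ValueError("datum vectors must have the same length")
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)


# Hypothetical readouts from the same four detectors for two samples.
known = [0.9, 0.1, 0.5, 0.0]
unknown = [0.8, 0.2, 0.5, 0.1]
print(vector_difference(known, unknown))
print(rms_difference(known, unknown))
print(cosine_similarity(known, unknown))
```

The RMS difference is sensitive to overall signal level, while cosine similarity compares only the shape of the binding pattern, so the two can disagree when samples differ mainly in concentration.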
  • a binding affinity-based assay method comprising: presenting a known sample to a set of known low-specificity binding elements to define an individual datum vector to develop a library of datum vectors associated with known samples; presenting an unknown sample to the set of known low-specificity binding elements to acquire a test sample datum vector; and comparing the test sample datum vector to those of the library to determine the most likely mixture of antigens to have produced the observed datum vector.
  • Fig. 1 is a general assay method according to an embodiment of the disclosure.
  • Fig. 2 illustrates constructing a library of datum vectors according to an embodiment of the disclosure.
  • Fig. 3 is a graphic representation of an example system having entities bound thereto according to an embodiment of the disclosure.
  • Fig. 4 is an example method for analyzing systems with bound entities according to an embodiment of the disclosure.
  • Fig. 5 is an example method for analyzing systems with bound entities according to an embodiment of the disclosure.
  • Fig. 6 is an example method for a graphical representation of a DS matrix.
  • the DS matrix is a matrix representing the set of binding affinities between a set of antigens and a set of antibodies.
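As a rough illustration of how such a matrix could be used, the sketch below predicts a datum vector from a DS matrix and a sample vector. It assumes a purely linear response (no saturation or competition) and a row-per-antibody, column-per-antigen layout; both are assumptions made for this example, and all names and numbers are hypothetical.

```python
def predict_datum_vector(ds_matrix, sample_vector):
    """Predict detector readouts under an assumed linear response.

    ds_matrix[j][k] is the assumed affinity of antibody (detector) j for
    antigen k; sample_vector[k] is the concentration of antigen k.
    """
    for row in ds_matrix:
        if len(row) != len(sample_vector):
            raise ValueError("each matrix row needs one entry per antigen")
    return [sum(a * c for a, c in zip(row, sample_vector)) for row in ds_matrix]


# Hypothetical 3-antibody x 2-antigen DS matrix.
DS = [
    [0.8, 0.1],
    [0.3, 0.6],
    [0.0, 0.9],
]
sample = [2.0, 1.0]  # hypothetical concentrations of antigen 0 and antigen 1
predicted = predict_datum_vector(DS, sample)
print(predicted)
```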
  • Fig. 7 is an example method for calculating a correction to a trial sample vector according to an embodiment of the disclosure.
  • the term “detector” is used to mean any device, component, moiety, element, or sensor that binds preferentially to a particular category of molecules or to particular aspects of molecules.
  • the term “reader” is used to mean a device or set of devices that reports and/or records the results of allowing a sample containing one or more types of analytes to come into contact with a set of detectors. Hence, for example, a particular type of antibody spotted onto a surface or onto the bottoms of wells in a plate constitutes a “detector”. If that antibody binds with low specificity to a range of different analytes, it is a “low specificity detector”, or a “detector with low binding specificity”.
  • the low-specificity binding elements can include selected molecular types.
  • the low-specificity binding elements may be defined by at least one region spatially distinct from other individual regions. Each of the individual regions corresponds to a distinct low-specificity binding element.
  • a set or array of such “low specificity binding elements” may be used to determine a datum vector that describes the response of the set or array to a sample containing one or more analytes.
  • “molecular type” is used to refer to a set of molecules that are all the same, or that all have a specific property.
  • matrix model is used herein with a generalized meaning that includes “algorithmic model”, neural net model, and so on: any computational model or analog model that may be used to predict the behavior of a system.
  • a general assay method that first includes determining a datum vector for a known sample.
  • This known sample can include single or multiple known entities, one or more of which can be complexed by an assay that includes low-specificity complexing entities. From this binding a datum vector can be established that can be associated with both the known low-specificity complexing entities and the known sample. This datum vector can provide either or both of qualitative or quantitative information. Datum vectors can be established for many known samples.
  • a first set of known low-specificity binding elements can be provided and a known sample can be presented to the first set to define a first datum vector.
  • a datum vector for an unknown sample can be determined.
  • This unknown sample may or may not be provided in the same matrix as the known sample. From the complexing of the unknown sample with the low-specificity complexing entities a datum vector for the unknown can be determined.
  • a second set of known low-specificity binding elements can be provided.
  • the first and second sets can include the same known low-specificity binding elements in unbound form, then an unknown sample can be provided to the second set to acquire a second datum vector.
  • the datum vectors of the known and unknown samples can be compared to determine the likelihood of similarity or differences between the unknown and known samples. Comparing the first and second datum vectors can include using an algorithm to determine a degree of difference or similarity between the first and second datum vectors. Additionally or alternatively, an algorithm can be used to determine analyte concentrations in the unknown sample.
  • a general method is shown for determining datum vectors for a plurality of unknowns, and a library can be created for the same. Accordingly, the unknowns can be associated with a particular trait of the sample, such as toxicity, infection, disease, etc.
  • an unknown datum vector can be determined and compared with one or more entries in the library for similarity or dissimilarity.
  • the library of the datum vectors can correspond to known samples to determine relative analyte concentrations of the unknown sample.
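A library comparison of this kind might be sketched as a simple ranking of library entries by their distance from the unknown datum vector. The library contents, the entry names, and the choice of RMS difference as the score are assumptions made for illustration only.

```python
import math


def rms_difference(a, b):
    """Root-mean-square difference between two equal-length datum vectors."""
    if len(a) != len(b):
        raise ValueError("datum vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))


def rank_library(library, test_vector):
    """Return (name, score) pairs sorted from most to least similar."""
    scored = [(name, rms_difference(vec, test_vector)) for name, vec in library.items()]
    return sorted(scored, key=lambda item: item[1])


# Hypothetical library of datum vectors measured from known samples.
library = {
    "sample_A": [0.9, 0.1, 0.4],
    "sample_B": [0.2, 0.8, 0.5],
    "sample_C": [0.5, 0.5, 0.5],
}
ranking = rank_library(library, [0.85, 0.15, 0.45])
best_name, best_score = ranking[0]
print(best_name, best_score)
```

A nearest-entry ranking only finds the single closest known sample; it does not by itself resolve mixtures of several library entries.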
  • mathematical models of a detector response to a mix of analyte concentrations can be created from detector responses to varying concentrations of pure samples of known analytes.
  • the binding equilibria of analytes can be used to determine analytes and/or concentration of analytes.
  • the kinetic association/dissociation of analytes can be used to determine analytes and/or concentration of analytes.
  • the similarity or difference between the known and unknown samples can be communicated to a remote device.
  • the two datum vectors compared can be unknowns to knowns, knowns to a library, and/or unknowns to a library.
  • the presenting of the known samples to the low-specificity binding elements can be performed at a first location
  • the presenting of the unknown samples to the low-specificity binding elements can be performed at a second location.
  • the first and second locations can be the same or different.
  • a known sample can be presented to a set of known low-specificity binding elements to define an individual datum vector to develop a library of datum vectors associated with known samples.
  • An unknown sample can be presented to the set of known low-specificity binding elements to acquire a test sample datum vector. Then, the test sample datum vector can be compared to those of the library to determine the most likely mixture of antigens to have produced the observed datum vector.
  • the assemblies and methods of the present disclosure can utilize a Combinatorial Affinity-Based Assay (CABA) that can incorporate a new approach to recognizing known pathogens and detecting/fingerprinting new pathogens.
  • the assemblies and/or methods of the present disclosure can utilize an antibody-based test, but rather than using maximally specific antibodies (which are expensive and time-consuming to develop), the assemblies and methods can utilize a set of low-specificity antibodies to develop a “profile” or “fingerprint” (later herein called a “datum vector”) of a sample which may contain numerous different pathogens. This datum vector can be analyzed to identify the particular pathogens present in the sample, or to reveal the presence of an unknown pathogen or pathogens, or to reveal the non-presence of a known pathogen.
  • instead of relying on antibodies that are highly specific to particular target antigens, CABA relies on antibodies that have low specificity and bind to multiple antigens. Rather than identifying an antigen by the particular antibody to which it binds exclusively, in CABA it is the combination of antibodies that bind to a target antigen that characterizes or identifies the target antigen.
  • the assemblies and/or methods of the present disclosure can be used to analyze any substances to which antibodies will bind with low specificity, such as proteins, DNA, RNA, pollens, other antibodies, and even fats and sugars. Further, the assemblies and methods of the present disclosure are not limited to any particular method (e.g., ELISA or LFA) for detecting binding affinity of a sample to members of a set of antibodies.
  • multiplexed antibody based assays are referred to as “multiplexed antibody based assays” or “MAbs” when using the assemblies and/or methods of the present disclosure.
  • the assemblies and/or methods of the present disclosure can employ “aptamers”, which are nucleic acid molecules that distinguish between protein isoforms and conformations, and possess target recognition features.
  • there are various kinds of aptamers, including DNA or RNA or XNA aptamers that are constructed of (usually short) strands of oligonucleotides, and peptide aptamers that are constructed of one (or more) short variable peptide domains, attached at both ends to a protein scaffold.
  • Arrays that incorporate multiple types of low-specificity detectors can include some detector elements with high specificity and still fall within the scope of this disclosure. Sets of detector elements can be referred to herein as “low-specificity detector arrays”, “detector arrays”, “detector element arrays”, and/or “MAbs”.
  • Antigen may refer to a sample component that is detected by a detector element in an antibody array or aptamer array or the equivalent.
  • Antibody can encompass antibodies, aptamers, and any other kind of molecular type or surface property that binds to an antigen with relatively low specificity.
  • the present disclosure provides a new way to use antibodies in antibody, aptamer, or other ligand arrays to provide a highly reliable analyte profile for any sample, based on the sample's analyte composition.
  • This can be accomplished by using a “panel” or array of detectors that essentially classify each sample according to a set (e.g., 20 to 2000) of approximately orthogonal criteria or “affinity groupings”.
  • an “orthogonal” criterion is largely independent of the other criteria, in the sense that a set of vectors can be mutually orthogonal, and therefore linearly independent, in a mathematical sense.
  • criteria should provide detector array activation patterns that are highly distinguishable in regions of interest in analyte space. This does not require highly specific antibody/antigen or aptamer/analyte or ligand/ligand binding affinities. Instead, it benefits from selecting multiple antibodies, aptamers, or ligands that reliably bind to different large subsets of the set of all possible antigens or analytes of interest (the “UAS” or “Universal Antigen Set”).
  • a “large subset” can encompass usually 20 to 80 percent of the UAS, but may also encompass a much larger or much smaller percentage of the UAS. Note that a typical high-specificity antibody will only bind strongly to less than 0.00001 percent of antigens in the UAS, for example.
  • An example method of selecting antibodies for inclusion in the detector array of the present disclosure can be as follows:
  • a single analyte may be selected as a reference analyte.
  • Samples containing the reference analyte in the same concentration and each of a wide range of different known single analytes in a range of different concentrations may be presented to a detector array containing a large assortment of single antibody-coated zones.
  • Data received from the detectors may be analyzed to obtain approximate values of the equilibrium constants of competitive binding between the reference analyte and the known analytes, at each detector.
  • antibodies may be selected that produce optimally distinguishable binding affinity patterns on the detector array for all the analytes, while producing optimal distinguishability between binding affinity patterns for regions of analyte space that are deemed particularly important for any reason.
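The estimation of competitive-binding equilibrium constants in the steps above can be sketched with one common textbook model, single-site competitive (Langmuir-type) binding. The disclosure does not state which model is used, so the model, the assumption that the reference constant is already known, and all names and numbers below are illustrative assumptions.

```python
def reference_occupancy(k_ref, c_ref, k_analyte, c_analyte):
    """Fraction of detector sites held by the reference analyte.

    Assumed model: theta_ref = K_r*C_r / (1 + K_r*C_r + K_a*C_a).
    """
    return k_ref * c_ref / (1.0 + k_ref * c_ref + k_analyte * c_analyte)


def estimate_analyte_constant(k_ref, c_ref, c_analyte, occupancy):
    """Invert the assumed model to recover K_a from one measured occupancy."""
    if c_analyte <= 0 or not 0.0 < occupancy < 1.0:
        raise ValueError("need a positive concentration and 0 < occupancy < 1")
    x = k_ref * c_ref
    return (x / occupancy - 1.0 - x) / c_analyte


# Hypothetical values: fixed reference concentration, varied analyte concentration.
K_REF, C_REF, K_TRUE = 2.0, 1.0, 5.0
concentrations = [0.1, 0.5, 2.0]
estimates = [
    estimate_analyte_constant(
        K_REF, C_REF, c, reference_occupancy(K_REF, C_REF, K_TRUE, c)
    )
    for c in concentrations
]
k_estimate = sum(estimates) / len(estimates)  # average over the concentration series
print(k_estimate)
```

Here the "measurements" are simulated from the same model, so the recovered constant matches the assumed one; with real detector data the per-concentration estimates would scatter and a fit over the whole series would be more appropriate.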
  • “competitive binding” assays, non-competitive binding assays, and other such assays may be adapted for use in the present disclosure, with the requirement that they provide a quantitative readout of the amount of analyte bound to an antibody, or of the binding affinity or other such related quantity, regarding the interaction between an antibody and an analyte.
  • “epitope binning” methods may be employed to provide a basis for selecting antibodies that may produce distinguishable binding affinity patterns on the detector array for all the analytes, while producing optimal distinguishability between binding affinity patterns in regions of analyte space that are deemed particularly important.
  • Epitope binning emphasizes the binding interactions between different antibodies that target epitopes located near each other on an antigen. The proximity of the epitopes on an antigen causes the presence of a bound antibody of one type at its epitope to inhibit the binding of the other antibody at its epitope.
  • SPR refers to Surface Plasmon Resonance.
  • the present disclosure is not dependent on standard antibody-based detector arrays, or on particular methods for detecting binding affinity of each member of a set of antibodies or aptamers to analytes in a sample.
  • Portions of the present disclosure herein refer to “detectors” rather than any particular kind of antibody-based or aptamer-based binding affinity indicator.
  • a “detector” can be an LFA (with the human eye or an automated reader to obtain a numerical reading from the LFA), a well in an ELISA assay (with a suitable reader), or any other device that provides an indication of the binding affinity of a particular molecular type or group of types of molecules to components of a sample, at each of multiple detectors.
  • the present disclosure can encompass other kinds of affinities such as DNA hybridization, along with detectors capable of indicating binding affinity to components of a sample.
  • LFAs, ELISA assays, antibodies, aptamers, LFA readers and ELISA readers are examples of ways to implement the disclosure and to provide a clear exposition.
  • a “detector” can be an ELISA well, a lateral flow indicator line, or the like, coupled with means for obtaining a quantitative readout of a value relating to the binding affinity of components of the sample to a low-selectivity antibody or aptamer; and a “detector array” can be a spatially separated set of such detectors.
  • each antibody or aptamer in the array may be attached to any region on the corresponding detector, or on multiple detectors.
  • Binding affinity can refer to “relative binding affinity”, “specific binding affinity”, or any other quantity that correlates to the amount of an analyte that can bind to a given detector, possibly in competition with other analytes with their own binding affinities to the detector.
  • Quantitative readouts can be obtained from an LFA, ELISA array, or other antibody array with a camera, or magnetic, fluorescent, mass-sensitive, radioactive, SPR, or other detection technology.
  • the present disclosure may also be used as a “negative screening assay”. That is, it may be used to identify a subset of a large set of samples, such that there is high confidence that the subset does not contain a specific analyte.
  • Such an assay can be useful, if sufficiently fast and inexpensive, for screening individuals whose work requires them to be in contact with others who are vulnerable to a disease. If the screened individual is, say, 99.9% sure not to have the disease, it may be deemed safe to have them work with the vulnerable other individuals. Of course, it is best similarly to screen all individuals with whom the screened worker must interact closely.
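The "99.9% sure" figure in the example above corresponds to what is usually called the negative predictive value of a test. The sketch below computes it from Bayes' rule; the sensitivity, specificity, and prevalence values are hypothetical and are not taken from the disclosure.

```python
def negative_predictive_value(sensitivity, specificity, prevalence):
    """Probability that an individual with a negative result is truly disease-free."""
    for value in (sensitivity, specificity, prevalence):
        if not 0.0 <= value <= 1.0:
            raise ValueError("all inputs must be probabilities in [0, 1]")
    true_negatives = specificity * (1.0 - prevalence)
    false_negatives = (1.0 - sensitivity) * prevalence
    if true_negatives + false_negatives == 0.0:
        raise ValueError("no negative results are possible with these inputs")
    return true_negatives / (true_negatives + false_negatives)


# Hypothetical screening assay applied to a population with 1% prevalence.
npv = negative_predictive_value(sensitivity=0.95, specificity=0.90, prevalence=0.01)
print(npv)
```

Because prevalence enters the formula, the same assay gives a lower negative predictive value in a population where the disease is more common.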
  • the assemblies and methods of the present disclosure can employ a reader to provide a quantitative readout of the response of each detector (the detector response) to a sample.
  • An ordered set of readouts for a sample can constitute a “datum vector” that correlates to the composition of the sample.
  • the readout for each LFA in an LFA panel can be considered to be approximately proportional to the concentration of the antigens in the sample that bind to the LFA. Note that any given antigen may contribute to the readout values for a large subset of the LFAs in the panel. However, in practice, the readout value may not be directly proportional to antigen concentration in a sample even when the sample contains only that one antigen.
  • the readout can be subject to various nonlinearities that can complicate interpretation of a datum vector.
  • Such nonlinearities include saturation effects, but also depend on the relative concentrations of different antigens in the sample, due to competition between the antigens for binding sites on the antibodies on a detector.
  • Often nonlinearities are much easier to handle by increasing dynamic range of the assay. For example, if the assay is performed six times on a sample, with the sample diluted by a further factor of 10 in each step, the dynamic range of the assay is potentially increased by as much as a million: six powers of ten.
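The dilution-series idea can be sketched as follows. The simulated reader, its saturation level, and the "linear window" thresholds are assumptions invented for this example; the point is only that readings taken inside the usable window can be scaled back by their dilution factor.

```python
def dilution_factors(steps=6, factor=10):
    """Cumulative dilution after each step: 10, 100, ..., 10**steps."""
    return [factor ** i for i in range(1, steps + 1)]


def simulated_reading(concentration, saturation=100.0):
    """Hypothetical reader: linear up to a saturation level, then flat."""
    return min(concentration, saturation)


def estimate_concentration(readings, factors, low, high):
    """Average the back-scaled readings that fall inside the linear window."""
    usable = [r * f for r, f in zip(readings, factors) if low <= r <= high]
    if not usable:
        raise ValueError("no reading fell inside the linear window")
    return sum(usable) / len(usable)


factors = dilution_factors()
true_concentration = 50000.0  # hypothetical, far above the reader's saturation level
readings = [simulated_reading(true_concentration / f) for f in factors]
estimate = estimate_concentration(readings, factors, low=1.0, high=90.0)
print(factors[-1], estimate)
```

In this toy run the first two dilutions saturate the reader and the last two fall below the assumed noise floor, so only the middle two readings contribute to the estimate.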
  • An algorithm that can identify and compensate for nonlinearities in the readout system, to enable reliable identification of not only specific antigens, but the relative proportions of antigens in samples containing mixtures of antigens, can be utilized.
  • the algorithm may be structured as follows:
  • Ri(6j) may be represented as a matrix of equilibrium constants $K_j(k)$, where $K_j(k)$ relates to the binding affinity of analyte k to detector j.
  • if the sample vector is not likely to be a linear sum of known sample vectors, label it as “unknown, possibly novel”. If it is a very close match to such a linear sum, label it as such. If there are multiple distinct linear sums with a reasonably close match, label them as possible matches and assign a relative likelihood related to the quality of their matches to corresponding linear sums.
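The labeling rule described here might look like the following sketch. It takes as input a relative fit residual for each candidate linear sum (0 meaning a perfect fit); the two thresholds and the Gaussian-style weighting used for the relative likelihoods are assumptions chosen for illustration, since the text only says the likelihood is "related to the quality" of the match.

```python
import math

NOVEL = "unknown, possibly novel"


def label_matches(residuals, close=0.05, plausible=0.20):
    """Label a sample from the residuals of candidate linear-sum fits.

    residuals maps a candidate name to its relative fit residual.
    close and plausible are hypothetical thresholds.
    """
    candidates = {name: r for name, r in residuals.items() if r <= plausible}
    if not candidates:
        return {"label": NOVEL, "likelihoods": {}}
    best = min(candidates, key=candidates.get)
    best_sq = candidates[best] ** 2
    # Weights are relative to the best fit, so the best candidate has weight 1.
    weights = {
        name: math.exp(-0.5 * (r ** 2 - best_sq) / close ** 2)
        for name, r in candidates.items()
    }
    total = sum(weights.values())
    likelihoods = {name: w / total for name, w in weights.items()}
    if len(candidates) == 1 and candidates[best] <= close:
        label = "close match: " + best
    else:
        label = "possible matches"
    return {"label": label, "likelihoods": likelihoods}


print(label_matches({"A+B": 0.5}))
print(label_matches({"A+B": 0.01}))
print(label_matches({"A+B": 0.06, "A+C": 0.10}))
```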
  • artificial intelligence (AI) methods such as artificial neural networks, neuromorphic computing, evolutionary programming, or deep learning can be used to analyze a database of datum vectors (along with attached data), essentially finding regions in sample space that correspond to regions in datum vector space.
  • Such methods require vast amounts of computing power.
  • the algorithm outlined above requires relatively little computing power and can be executed on a desktop or laptop computer.
  • a complete service employing the assemblies and methods of the present disclosure can include mass-producible readers, mass producible LFA panels, and an online database analogous to the DNA databases of 23andMe.
  • the service may receive and interpret datum vectors along with their associated sample-related information (e.g., time, location, sample type, patient identifier, demographics, sample number, known antigens in the sample, etc.).
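One way to picture a datum vector travelling with its sample-related information is a small record type that can be serialized for transmission to such a service. The field names, the JSON encoding, and the example values are all hypothetical; the disclosure does not define a data format.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class DatumRecord:
    """A datum vector plus the sample-related information listed above."""

    readouts: list          # ordered detector readouts (the datum vector)
    sample_type: str
    collected_at: str       # timestamp, here an ISO 8601 string
    location: str
    sample_number: str
    patient_id: str = ""
    demographics: dict = field(default_factory=dict)
    known_antigens: list = field(default_factory=list)

    def to_json(self):
        """Serialize the record for upload."""
        return json.dumps(asdict(self), sort_keys=True)

    @staticmethod
    def from_json(text):
        """Rebuild a record from its JSON form."""
        return DatumRecord(**json.loads(text))


# Hypothetical record.
record = DatumRecord(
    readouts=[0.82, 0.10, 0.47],
    sample_type="plasma",
    collected_at="2021-02-01T09:30:00Z",
    location="clinic-7",
    sample_number="S-0001",
    known_antigens=["antigen_X"],
)
restored = DatumRecord.from_json(record.to_json())
print(restored == record)
```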
  • the service may also track the trajectories of identified pathogens demographically or geographically, and report to agencies who need that information, at very low cost.
  • Example implementations of the disclosure can include the following steps:
  • Universal set is the set of proteins that are expected to be the target of the assay.
  • a “spanning subset” of the “master library” is a set of antibodies which in combination will produce a unique binding pattern for each protein in the representative subset. Generally, it is not a unique subset; there are a large number of different such “spanning subsets” for a given “master library” and given representative subset of proteins. In fact, typically a relatively small (100 to 2000 element) random subset of the “master library” is extremely likely to contain a suitable “spanning subset”.
  • the goal of refinement is an array consisting of a minimal number of antibodies such that the binding patterns of the antibody array to the various proteins are maximally distinguishable from each other.
  • Steps 1 and 2 in the above procedure can use well-known methods.
  • the “universal set” of proteins may be the set of roughly 1500 distinct surface coat proteins of the Chagas parasite, T. cruzi.
  • DNA sequences corresponding to the amino acid sequences of many of those proteins are easily obtained via online databases such as GenBank.
  • the DNA sequences may be synthesized by any of the commonly used “gene synthesis” methods.
  • the resulting DNA sequences are inserted into the genomes of a host organism (e.g., yeast, bacteria, or phages) using methods well-known in the molecular biology art.
  • the host organisms are cultured, and they express the proteins corresponding to the DNA sequences.
  • Step 3 in the above procedure can be done using well-known methods such as inoculating camelids (typically llamas or alpacas) or mice with a mixture of the proteins in the “universal set” to elicit an immune response, then isolating and cloning the resulting antibodies to produce a “master library”, then “panning” the “master library” to capture the subset of antibodies capable of binding to the proteins in the “universal set”.
  • Steps 4 and 5 may be accomplished in practice by first selecting at random 100 to 2000 different unique antibodies from the “master library” and cloning them individually.
  • the resulting monoclonal antibodies may be used to prepare one or multiple antibody arrays, and the equilibrium datum vectors of the proteins to the arrays or the individual antibody/protein association/dissociation curves may be determined using methods well known in the art such as ELISA assays or surface polariton resonance measurements. Any of many well-known optimization techniques may be used to select a subset of the 100 to 2000 antibodies that produce unique, easily distinguishable datum vectors of the proteins on the arrays.
  • Step 6 may be accomplished by well-known methods of cloning and microprinting antibodies onto substrates in, e.g., a rectangular array pattern.
  • Step 7 may be accomplished by presenting each of the proteins in the “representative subset” individually to the arrays, then measuring and storing the resulting datum vectors. Typically the number of elements in the “representative subset” is much larger than necessary. Standard optimization methods may then be used to select antibodies to form a minimally-sized “spanning subset” which results in arrays with maximally distinguishable datum vectors among the array/protein datum vectors of the “representative subset” of proteins.
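The selection of a small, well-separating antibody subset can be illustrated with a greedy heuristic: repeatedly add the antibody that most increases the smallest pairwise distance between the proteins' datum vectors. This is only one of many possible optimization methods and is not guaranteed to find the best subset; the response table and function names are hypothetical.

```python
from itertools import combinations


def min_pairwise_distance(responses, chosen, n_proteins):
    """Smallest squared distance between any two proteins' datum vectors,
    using only the chosen antibodies."""
    return min(
        sum((responses[a][p] - responses[a][q]) ** 2 for a in chosen)
        for p, q in combinations(range(n_proteins), 2)
    )


def greedy_spanning_subset(responses, size):
    """responses[a][p] is the measured binding of antibody a to protein p."""
    n_antibodies = len(responses)
    n_proteins = len(responses[0])
    if n_proteins < 2 or not 0 < size <= n_antibodies:
        raise ValueError("need at least two proteins and a valid subset size")
    chosen = []
    remaining = list(range(n_antibodies))
    while len(chosen) < size:
        best = max(
            remaining,
            key=lambda a: min_pairwise_distance(responses, chosen + [a], n_proteins),
        )
        chosen.append(best)
        remaining.remove(best)
    return chosen


# Hypothetical responses of 4 antibodies to 3 proteins.
responses = [
    [1.0, 1.0, 1.0],  # binds everything equally: uninformative
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.9, 0.1, 0.0],
]
subset = greedy_spanning_subset(responses, size=2)
print(subset, min_pairwise_distance(responses, subset, 3))
```

In this toy table the uninformative antibody is never selected, because adding it does not change the distance between any pair of proteins.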
  • in Step 8, the optimized arrays produced in Step 7 are used to determine the datum vectors of each of the proteins in the “universal set” to the arrays. Those datum vectors are stored in a reference database.
  • the sample may be blood, plasma, tissue, a cell or virus culture, urine, milk, etc.
  • the sample may contain a pathogen to be identified.
  • Pre-processing may include such procedures as sonication, digestion, filtration, dilution, centrifugation, Western blot separation, culturing, plating, concentration, and so on, with the objectives of removing irrelevant proteins while retaining the unidentified pathogen and producing a solution with standard concentrations of background proteins and standard properties.
  • the pre-processed sample is presented to the CABA, and the resulting datum vector is recorded.
  • the datum vector is compared to datum vectors in the database, and well-known optimization methods are employed to determine the most likely mixture of reference proteins (i.e., the proteins in the reference database) to have produced the observed datum vector.
  • the “anomalous component” of the observed datum vector is the datum vector of a new protein which, if added to the reference database, would allow a close match between the observed datum vector and the expected datum vector of the most likely mixture of reference proteins plus the new protein.
  • the new protein may correspond to a previously unknown pathogen. This can be confirmed if the same anomalous datum vector is observed in further samples from a population of patients displaying similar symptoms.
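The mixture fit and the anomalous component can be sketched as a non-negative least-squares problem whose leftover residual plays the role of the anomalous component. Projected gradient descent is used here only because it is short and dependency-free; the disclosure does not name a specific optimization method, and the reference vectors, step size, and iteration count are assumptions tuned to this toy example.

```python
def predict(reference, weights, length):
    """Datum vector expected from a weighted mixture of reference proteins."""
    return [sum(weights[n] * reference[n][i] for n in reference) for i in range(length)]


def fit_mixture(reference, observed, steps=2000, rate=0.05):
    """Fit non-negative mixture weights and return them with the residual.

    The residual (observed minus predicted) is treated here as the
    anomalous component of the observed datum vector.
    """
    weights = {name: 0.0 for name in reference}
    for _ in range(steps):
        predicted = predict(reference, weights, len(observed))
        error = [p - o for p, o in zip(predicted, observed)]
        for name in reference:
            gradient = sum(e * r for e, r in zip(error, reference[name]))
            # Gradient step, then projection back onto non-negative weights.
            weights[name] = max(0.0, weights[name] - rate * gradient)
    predicted = predict(reference, weights, len(observed))
    anomalous = [o - p for o, p in zip(observed, predicted)]
    return weights, anomalous


# Hypothetical reference database with two proteins on a four-detector array.
reference = {
    "protein_1": [1.0, 0.0, 0.5, 0.0],
    "protein_2": [0.0, 1.0, 0.5, 0.0],
}
# Observed vector: 2 x protein_1 + 1 x protein_2 + an unexplained signal on detector 4.
observed = [2.0, 1.0, 1.5, 0.7]
weights, anomalous = fit_mixture(reference, observed)
print(weights, anomalous)
```

A larger reference database would call for a tuned step size or an established non-negative least-squares routine rather than this hand-rolled loop.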
  • the assay of the present disclosure employs a single “generic” antibody array to test for the presence of any of a very large number of different potential antigens in a sample. That is, a single relatively small, standardized, CABA array can identify any of the antigens in its “reference database” of antigens. As a result, a small set of small CABA arrays may be used to identify all possible types of potential known antigens. Moreover, the same small set of CABA arrays can be used to detect the presence of new antigens, and to identify new antigens as soon as the new antigens are added to the reference database.
  • Figure 3 is a graphical representation of an example system having entities bound thereto according to an embodiment of the disclosure.
  • the right-hand image represents the binding pattern of a first entity onto the elements of a first detector array.
  • the middle image represents the binding pattern of a second entity onto the elements of a second detector array identical to the first detector array.
  • the left-hand image represents the binding pattern formed by both first and second entities when the entities are presented simultaneously to a third detector array identical to the first two detector arrays.
  • the left-hand binding pattern is a linear superposition of the middle and right-hand binding patterns.
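The superposition property can be checked numerically, as in the sketch below. Binding patterns are flattened into lists of per-element readouts; the example values and the tolerance are hypothetical.

```python
def superpose(pattern_a, pattern_b):
    """Element-wise sum of two binding patterns from identical arrays."""
    if len(pattern_a) != len(pattern_b):
        raise ValueError("patterns must come from identical arrays")
    return [a + b for a, b in zip(pattern_a, pattern_b)]


def is_linear_superposition(combined, pattern_a, pattern_b, tolerance=0.05):
    """True if the combined pattern matches the sum of the two single patterns."""
    expected = superpose(pattern_a, pattern_b)
    if len(combined) != len(expected):
        raise ValueError("patterns must come from identical arrays")
    return all(abs(c - e) <= tolerance for c, e in zip(combined, expected))


first_entity = [0.2, 0.0, 0.5, 0.1]
second_entity = [0.1, 0.4, 0.0, 0.3]
both_linear = [0.3, 0.4, 0.5, 0.4]
both_saturated = [0.3, 0.4, 0.5, 0.2]  # last element falls short of the sum

print(is_linear_superposition(both_linear, first_entity, second_entity))
print(is_linear_superposition(both_saturated, first_entity, second_entity))
```

A failed check of this kind is one simple way to notice that saturation or competition between entities has pushed the array out of its linear regime.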
  • Figure 4 illustrates a way that the invention may be implemented, to provide a useful service to medical professionals, clinics, hospitals, health agencies, and so on.
  • a sample is collected (402), which may be a patient's tissues or body fluids. Alternatively, the sample may be from food, waste water, sewage, air, soil, dust, plants, or any other source that may contain viral or bacterial fragments, products, or living organisms.
  • the sample is then prepared (402) for analysis. Depending on the purpose of the analysis, the sample may be filtered, diluted, pH buffered, cultured, amplified, dissolved, sonicated, heated, lysed, centrifuged, passed through a chromatography column, or submitted to a one-dimensional or two-dimensional gel for separation.
  • the prepared sample is presented (404) to the array of this invention, such that different components of the prepared sample each bind to one or more elements of the array.
  • a reader which may be a camera, an SPR device, or any other device that can quantitatively measure the amount of material bound to each element of the array, is used to read (406) the results of presenting the sample to the array.
  • the results, represented as a “datum vector”, are then sent (408) via the internet or other communication system to a processor that uses artificial intelligence (AI) to analyze the datum vector.
  • the datum vector is stored (414) along with supplementary information and analysis results in a database. As information is added to the database, the information is used (416) to continually or periodically upgrade the AI.
  • Results of the reading are sent (410) to users via the internet or other communication system.
  • a report (412) is compiled for sending to users or to health agencies or government agencies, including a selection of relevant information that can be extracted from the analysis and/or from comparison or correlation of the results with other results stored in the database. Information in the database may be used to continually upgrade the AI.
  • Figure 5 illustrates aspects of how a datum vector may be processed by the AI to obtain a sample vector.
  • the datum vector represents the results of presenting a sample to the array of this invention, while the sample vector represents the most likely composition of the sample, based on the stored information in the database and the observed datum vector (500).
  • the observed datum vector (500) is received over a suitable communication system from a reader.
  • a trial sample vector (506) can be a randomly generated sample vector, or it can be a sample vector obtained from a sample that is similar to the sample from which the observed datum vector (500) was obtained, or it can be any sample vector at all.
  • the trial sample vector (which in the first iteration is also the New Trial Sample Vector 508) is operated upon by a detector array model (SD model) 510 to obtain a trial datum vector 512.
  • the comparator 502 compares the trial datum vector 512 to the observed datum vector 500 to produce an error datum vector 504. If the error datum vector 504 meets conditions specified by the “difference operator” (δD?) 520, it is presented as the best-guess sample vector 518. Otherwise, the error datum vector 504 is passed to the sample vector correction calculator 516 to obtain a correction 514. The correction is subtracted from the trial sample vector 506 by operator 522 to obtain a new trial sample vector 508.
  • the sample vector correction calculator 516 makes use of both the observed datum vector 500 and the error datum vector 504 to calculate the correction 514, using a method like that illustrated in Figure 7. The process described above continues cyclically until the error datum vector 504 meets the conditions specified by the difference operator 520.
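The cyclic procedure of Figure 5 can be sketched in Python as follows. This is only an illustration under stated assumptions: the detector array model is taken to be linear (a small hypothetical matrix), the correction is a simple gradient-style step scaled by a constant `k`, and the stopping condition is a maximum absolute error below `tol`. None of these specific choices is prescribed by the disclosure.

```python
def detector_model(sample, ds_matrix):
    """Assumed linear detector array model: sample vector -> trial datum vector."""
    return [sum(a * s for a, s in zip(row, sample)) for row in ds_matrix]


def fit_sample(observed, ds_matrix, trial, k=0.5, tol=1e-8, max_iter=1000):
    """Iterate trial sample vectors until the error datum vector is small."""
    for _ in range(max_iter):
        trial_datum = detector_model(trial, ds_matrix)
        # Comparator: error datum vector = trial datum - observed datum.
        error = [t - o for t, o in zip(trial_datum, observed)]
        # Difference test: stop when every error component is below tol.
        if max(abs(e) for e in error) < tol:
            break
        # Assumed correction: k times the transposed model matrix applied to the error.
        correction = [
            k * sum(ds_matrix[i][j] * error[i] for i in range(len(error)))
            for j in range(len(trial))
        ]
        # The correction is subtracted from the trial sample vector.
        trial = [s - c for s, c in zip(trial, correction)]
    return trial


# Hypothetical 3-detector, 2-analyte model and an observed datum vector.
ds_matrix = [[0.8, 0.1], [0.3, 0.6], [0.1, 0.9]]
observed = [1.7, 1.2, 1.1]
best_guess = fit_sample(observed, ds_matrix, trial=[0.0, 0.0])
```

With this hypothetical matrix, the observed datum vector corresponds to a sample vector of about (2, 1), and the loop should converge toward that composition. A real detector response would generally be nonlinear, so both the model and the correction rule would need to be replaced accordingly.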
  • Figure 6 illustrates a graphical method for representing a DS matrix.
  • the DS matrix may have rows with elements whose values correspond to the binding strength between all antigens in the sample and one antibody in the array. Each row in the DS matrix corresponds to a different antibody, while each column corresponds to a different antigen.
  • the red lines from the square representing antigen 602 go to antibodies 610 and others, to which the antigens bind.
  • the blue lines from the circle representing antibody 608 go to the antigens in 600 that bind to the antibody 608.
  • the lines representing binding between the other antibodies and other antigens are omitted for clarity.
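The row and column convention of the DS matrix can be sketched in Python. The matrix values and the helper names `antigens_bound_by` and `antibodies_binding` are hypothetical and only illustrate how the lines drawn in Figure 6 correspond to nonzero matrix entries.

```python
# Hypothetical DS matrix: rows are antibodies (detectors), columns are antigens.
# Entry [i][j] is an illustrative binding strength between antibody i and antigen j.
ds_matrix = [
    [0.8, 0.0, 0.3, 0.0],  # antibody 0
    [0.0, 0.6, 0.5, 0.0],  # antibody 1
    [0.2, 0.0, 0.0, 0.9],  # antibody 2
]


def antigens_bound_by(antibody, matrix):
    """Column indices with nonzero binding in one row (lines from one antibody)."""
    return [j for j, strength in enumerate(matrix[antibody]) if strength > 0]


def antibodies_binding(antigen, matrix):
    """Row indices with nonzero binding in one column (lines from one antigen)."""
    return [i for i, row in enumerate(matrix) if row[antigen] > 0]
```

For example, `antigens_bound_by(0, ds_matrix)` lists the antigens connected to antibody 0, and `antibodies_binding(2, ds_matrix)` lists the antibodies connected to antigen 2.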
  • Figure 7 illustrates how the sample vector correction calculator may be structured, although it may be structured in other ways.
  • An observed datum vector 700, 500, a trial datum vector 702, 512 and a trial sample vector 704, 506 are given.
  • an approximate value for derivative ∂S/∂D is calculated at the location of the trial datum vector 700.
  • From δD and ∂S/∂D calculate 714 a correction term K *
  • K is a constant that is typically user-adjustable.
  • ∂S/∂D is a matrix whose terms are the partial derivatives of the elements of the sample vector S with respect to the elements of the datum vector D.
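One plausible way to write the correction step of Figure 7 is shown below. This is an assumed form, not a formula taken from the disclosure: the sign convention for the error datum vector, written here as δD, and the product with the derivative matrix are assumptions, and the subscripts are introduced only for illustration.

```latex
\delta D = D_{\text{trial}} - D_{\text{obs}}, \qquad
\Delta S = K \, \frac{\partial S}{\partial D} \, \delta D, \qquad
S_{\text{new}} = S_{\text{trial}} - \Delta S
```

Here K is the user-adjustable constant, ∂S/∂D is the matrix of partial derivatives evaluated near the trial datum vector, and the subtraction mirrors the operator that subtracts the correction from the trial sample vector.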
  • an ELISA plate with N wells is prepared, with a different one of N different primary antibodies adhered to the bottom of each well.
  • the primary antibodies are selected from a library of antibodies to bind with varying affinity to members of different partially overlapping subsets of the set of antigens that are expected to be in a sample.
  • the sample is poured over the wells, time is allowed for antigens in the sample to bind with the antibodies in the wells, and then the wells are rinsed to remove un-bound antigen.
  • a fluorescent tag which binds to antibody/antigen complexes is added to the wells, and then the wells are rinsed again to remove any unbound tag.
  • the plate with its wells is placed in a reader which illuminates the wells with light that stimulates fluorescence, and which measures the amount of fluorescence emitted by each well. That amount of fluorescence is proportional to the amount of antigen bound to the primary antibody in each well.
  • the fluorescence intensities of the wells as read by the reader comprise a datum vector (e.g., a listing of the detector values read by the reader).
  • the datum vector is submitted to an artificial intelligence (AI) system which includes a mathematical model of the ELISA array and the binding equilibria for known antigens, and a database of known samples and corresponding datum vectors.
  • the AI system seeks a sample composition that, according to the model and the database, optimally matches the measured datum vector from the ELISA reader.
  • That sample composition is output by the AI system as the most probable composition of the sample.
  • the AI system may also output alternative sample compositions along with their relative probabilities of being the correct composition.
  • the AI system may indicate that an unknown or novel antigen is present in the sample.
  • the difference between the observed datum vector and the best-match datum vector that can result from a sample containing known antigens provides a “difference fingerprint” of the unknown or novel antigen; and the difference can also provide information about the unknown or novel antigen that can be helpful in isolating, analyzing, or identifying that antigen.
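The “difference fingerprint” can be sketched in Python as a residual between two datum vectors. The numeric values, the function names, and the RMS threshold test against a `noise_floor` are hypothetical. The disclosure does not specify how a residual is judged significant.

```python
import math


def difference_fingerprint(observed, best_match):
    """Element-wise residual between observed and best-match datum vectors."""
    return [o - b for o, b in zip(observed, best_match)]


def suggests_novel_antigen(fingerprint, noise_floor):
    """Flag a residual whose RMS magnitude exceeds an assumed noise floor."""
    rms = math.sqrt(sum(r * r for r in fingerprint) / len(fingerprint))
    return rms > noise_floor


observed = [1.70, 1.20, 1.65]    # hypothetical reader values
best_match = [1.70, 1.20, 1.10]  # hypothetical best fit using known antigens only
fingerprint = difference_fingerprint(observed, best_match)
```

In this illustration only the third detector shows an unexplained excess, so the fingerprint points to an analyte that binds mainly to that detector.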
  • the datum vector and any other available information provided by the user about the sample are placed in the database, and a description of the most-probable antigen composition of the sample may be sent to the user.
  • the assemblies and/or methods can include a set of 20 to 2000 lateral flow elements of standard design with a different antibody type in each lateral flow element as follows:
  • each antibody corresponding to a flow element can bind with low selectivity to some subset comprising anywhere from a small fraction to roughly half of the universal set of antigens;
  • each antibody corresponding to a flow element can bind to a different subset of the universal set of antigens, with respect to the antibodies in the other flow elements;
  • the binding affinities of the antibodies in the set of flow elements may form an effectively complete, redundant, vector space which can be mapped to the universal set of antigens.
  • the assemblies and/or methods of the disclosure can include a reader that provides a quantitative readout of strip color and intensity resulting from a sample submitted to the lateral flow elements.
  • This reader may provide a real-valued vector of values that depends on the composition of the sample. This vector may be referred to herein as a “datum vector”, and is identified with the sample being tested.
  • LFA panel readers that can be used in accordance with the present disclosure can include a color photo scanner such as the Epson® Perfection® V39 Flatbed Desktop Photo Scanner, Model #: B11B232201 (cost in March 2020: only $100). This scanner can provide colorimetry and spatial resolution more than sufficient for use as an LFA panel reader.
  • Software to derive a datum vector from a scanned LFA panel can include one or more of:
  • the antibody array is placed in an SPR instrument such as the Carterra Biosystems LSA(c), and a sample is flowed through the instrument.
  • the binding affinities between the antibodies and the analytes that may be in the sample may be measured by the instrument if the analytes are flowed sequentially through the instrument.
  • an antigen array may be placed in an SPR instrument and the antibodies may be flowed sequentially through the instrument to obtain essentially the same information. This information is used to construct and refine a model of the detector array.
  • a complex unknown sample may be flowed through the instrument containing an antibody array, thereby revealing the amount of antigen bound to each antibody type, and thereby producing a datum vector for the unknown sample.
  • a way to increase the amount of information obtained from an assay performed in accordance with the present disclosure is to obtain a first datum vector using the unknown sample, and then to obtain an additional datum vector by using the unknown sample with a known added quantity of a known mixture of selected analytes.
  • This will provide further information directly relevant to the competitive binding between different analytes at each antibody element of the array; and the information can further be utilized by a machine learning system or AI system or by the preferred algorithm, to provide improved reliability of analysis of the sample and interpretation of the corresponding datum vector.
  • the further information can be concatenated to the datum vector to form an extended datum vector which can be handled by an AI or machine learning system in essentially the same way as the datum vector itself.
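Forming an extended datum vector by concatenation can be sketched in Python. The readout values and the names `unspiked` and `spiked` are hypothetical labels for the datum vectors obtained without and with the added known mixture.

```python
def extended_datum_vector(unspiked, spiked):
    """Concatenate the datum vectors read without and with the added mixture."""
    return list(unspiked) + list(spiked)


unspiked = [0.42, 0.10, 0.77]  # hypothetical readout, unknown sample alone
spiked = [0.61, 0.35, 0.80]    # hypothetical readout, sample plus known mixture
extended = extended_datum_vector(unspiked, spiked)
```

The extended vector has twice as many elements as the array has detectors, and a downstream model would need to be built or trained for that longer input.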
  • algorithms can compare an unknown datum vector to other datum vectors stored in a database, to determine the degree of similarity or difference between the datum vectors and therefrom to obtain a maximum-likelihood estimate of the composition of the sample identified with the unknown datum vector.
  • One or more of the algorithms can include the following:
  • “nearest neighbors” within datum space may not be “nearest neighbors” in a Cartesian sense (i.e., the Cartesian distance measure of the square root of the sum of the squares of the coordinate differences, or the RMS differences of coordinate values, where the coordinates are the detector readout values).
  • a useful distance measure may involve taking the nonlinear relationship between sample space and datum vector space into account.
  • the assemblies and/or methods can include a service that allows local testing of samples at a first location and, via a suitable communication medium, remote centralized analysis (at a second location) of datum vectors resulting from the readings. Analysis of datum vectors may include comparing datum vectors corresponding to unknown samples to datum vectors corresponding to known samples and drawing conclusions from the comparison. And, of course, if standardized arrays of low-specificity binding elements are available, presentation of unknown samples to the arrays can be done at one location while presentation of known samples can be done at a different location.
  • results of the analysis will be communicated to a user at a location remote from the location where analysis is done, but in some cases the first and second locations will be the same.
  • This “global embodiment” may include regular updating of the database using the new datum vectors and associated data that are submitted with them, and providing geographical, temporal, demographic, and other data to health agencies, to assist in tracking the emergence and trajectories of contagious diseases.
  • the assemblies and/or methods can employ an “ELISA sandwich assay” as an array of wells. At the bottom of each well a distinct corresponding type of very broad-specificity antibody (or selected set of such antibodies) is attached. A microarray reader can be used to detect the amount of antigen that binds to the antibody type in each well.
  • the assemblies and/or methods can employ an “ELISA sandwich assay”, but in each well of the assay a corresponding selected mixture of antibody types is printed and/or attached.
  • Particular antibody types can be components of multiple selected mixtures in wells, or each well can contain a corresponding non-overlapping subset of a large set of antibodies, or some wells can contain mixtures, some unique antibodies, and so on.
  • Steps 2 and 3 can be performed using, for example, the Carterra LSA platform.
  • a relatively small set of detectors e.g., 20 to 2000
  • a much larger number e.g., 100 to 20,000
  • the “psychic” shows a victim a set of six sheets of paper, each containing 4,000 names in different arrangements. Each sheet is divided into quadrants. The victim is asked to indicate which quadrant his surname is in, on each sheet. With just six such indications, the “psychic” is able to know the victim's surname.
  • the “psychic” trick is relatively easy to understand. Less easy to understand is the fact that it is, under the right circumstances, possible not only to detect the presence of a single analyte, but also possible to determine the analyte concentrations in a sample containing multiple analytes. This is an important consequence of the present disclosure.
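The arithmetic behind the “psychic” trick can be sketched in Python: six quadrant answers give 4^6 = 4,096 distinct combinations, which is enough to distinguish 4,000 names. The base-4 assignment of names to quadrants below is one assumed arrangement. Any arrangement giving each name a unique six-quadrant combination would work.

```python
def quadrant_code(index, sheets=6):
    """Base-4 digits of a name's index: one quadrant (0-3) per sheet."""
    digits = []
    for _ in range(sheets):
        digits.append(index % 4)
        index //= 4
    return digits


def decode(quadrants):
    """Recover the name index from the quadrant indicated on each sheet."""
    return sum(q * 4 ** i for i, q in enumerate(quadrants))


# Every one of 4,000 names gets a distinct combination of six quadrants.
codes = {tuple(quadrant_code(i)) for i in range(4000)}
```

The analogy to the detector array is that each low-specificity detector, like each sheet, responds to many candidates, yet the combination of responses is unique to one of them.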
  • the assemblies and/or methods can separate a set of samples into two subsets, such that one subset almost certainly does not contain a given analyte, and such that the other subset may or may not contain the given analyte. This can be termed a “negative screening assay”.
  • if a pre-made existing detector array is exposed to a novel virus in several known samples (which may contain mixtures of a few analytes), and if the detector values for those samples are distinctly different from the detector values for other samples that do not contain the antigen, it can be determined with high confidence that those samples that do not produce the distinctly different detector values do not contain the antigen.
  • if the negative screening assay deems a worker to be in the “does not contain COVID-19” subset, that worker should be safe to continue working. If the worker is deemed to be in the other subset, it is definitely not a positive result: it cannot be said that the worker has the virus. However, the assay can give a sound reason for further testing of that worker, or diverting the worker to a non-public-contact role. Of course, it also makes sense to screen individuals with whom that worker must make contact, whenever it is possible to do so, in order to prevent transmission to the worker from those individuals.
  • detectors in an array similar to that of the positive assay would be used in parallel as in the positive assay, and the results used to pare down the possibilities, both positive and negative with respect to the presence of specific analytes, for each sample, as far as is reasonable given the apparent (sample-dependent) noise levels at each detector.
  • Some of the detectors may be designed to be more useful for assessing sample-dependent noise level in particular portions of analyte mixture space, than for classifying a particular sample.
  • a modified or alternative method is to first present the sample to the array in several dilutions, then to add a known mix of antigens - a reference solution which may be selected according to the source of the sample, such as blood, semen, sputum, or urine - to the sample and re-present the resulting sample to the array.
  • This procedure can of course be done using varying concentrations of the reference solution.
  • the result is that any similarities or differences between the reference solution and the actual sample will be more easily discernable in the data obtained from the detector array.
  • the same AI methods can be used to separate the effects of the added reference solution and the actual components of the sample; and in the process a more exact understanding of the similarities or differences between the two will be extracted from the resulting data.
  • the present disclosure is primarily aimed at assaying the protein contents of a liquid sample such as blood, urine, sputum, nasal secretions, blood plasma, tears, perspiration, drinking water, milk, runoff from agricultural or industrial sites, sewage, rain, river water, and so on.
  • a liquid sample may also be prepared from solid precursors by dissolving the solids in a suitable solvent like water, hydrocarbon solvents, oils, etc. Samples that originate in a mixed state, like sputum, usually should be converted to liquid samples by dilution in a suitable pure or mixed solvent such as water or water plus alcohol. Proteins in a liquid sample may be in their original form, or may be broken apart into protein fragments by a suitable pre-processing procedure, desirably in an easily repeatable way.
  • the present disclosure is not limited to assaying the protein content of a sample.
  • Many non-protein molecules such as specific types of metal ions can have a binding affinity to antibodies or other such “recognition molecules”, and may therefore be suitable target antigens for this type of assay.
  • molecules recognized by the assemblies and/or methods of the present disclosure need not be free in solution.
  • Virus coat proteins, cell surface proteins and the like, attached to particles much larger than the target molecules, can be recognized by CABA.
  • a detector array of the present disclosure may be used to separate and distinguish particles in a liquid suspension according to molecules on the surface of the particles.
  • Tumor cells can be presented to the detector array, then normal cells presented to an identical detector array, and the differences between the two resulting binding patterns can then be used to select antibodies whose presence and/or absence together can provide high specificity of targeting for logic-gated CAR-T cells.
  • the assemblies and/or methods of the present disclosure can be used as a clinical instrument to aid in establishing a current-health profile of each patient. Because every reader and every detector array can be identical regardless of the application, the system can be used to detect and identify influenza, pregnancy, metabolic diseases, infections, blood types, serotypes, histocompatibilities, and a vast array of other factors important to public and personal health. It can be applied in similar ways in veterinary medicine and other areas of biology that, today, use specific antibody tests. On the other hand, in some cases it will be helpful to provide different detector arrays that are specific to testing particular categories of samples (e.g., blood, plasma, urine, sera, water, milk, saliva, or samples that have been pre-processed in various ways to reduce background interference or increase concentration of analytes).
  • the assemblies and/or methods of the present disclosure can be applied, in accordance with example implementations, as one test (or a few tests), extremely inexpensively, with a shelf life of a couple of years, and can be used to detect the presence of practically any pathogen, protein, microbe, or virus in a sample.
  • “binding elements” or “detecting elements” can mean any of the following, inclusively:
  • Low specificity binding elements means binding elements that bind preferentially to a limited class of target analytes, but to more than one target analyte.
  • a “binding element” may, for example, be a region, coated with antibodies of a specific type, on a surface.
  • a “region” on a detector may be simply connected or multiply connected in the topological sense. That is, it may consist of one or more separate (that is, spatially distinct) non-connected sub-regions, or of a region from which are excluded one or more contained sub-regions.

Abstract

A binding affinity-based assay method is provided, the method comprising: providing a first set of known low-specificity binding elements; presenting a known sample to the first set to define a first datum vector; providing a second set of known low-specificity binding elements, the first and second sets including the same known low-specificity binding elements; presenting an unknown sample to the second set to acquire a second datum vector; and comparing the first and second datum vectors to determine similarity or difference between the known and unknown samples.

Description

Combinatorial Affinity-Based Analysis Assemblies and Methods
CROSS REFERENCE TO RELATED APPLICATION
This application claims priority to and the benefit of U.S. Provisional Patent Application Serial No. 62/993,973 filed March 24, 2020, entitled “Pathogen Analysis Assemblies and Methods”, the entirety of which is incorporated by reference herein.
TECHNICAL FIELD
The present disclosure relates to the identification of macro and/or micro molecules. Particular embodiments relate to analysis methods including combinatorial affinity-based analysis methods and/or assemblies.
BACKGROUND
When performing an assay, the goal is typically to determine qualitative data, quantitative data, and/or some combination of both. The analysis can provide a general identity of molecules or portions of molecules, or it can provide a very specific identity. Assays and methods may identify specific molecules, proteins, moieties, complexes or elements.
As one example, the “lateral flow assay” (LFA) has been used as a method for detecting the presence or absence of specific substances (“targets” or “antigens” or “pathogens”) in a sample. Essentially all of the past and current applications of LFAs employ antibodies carefully selected or developed to be as highly specific as possible. That is, the antibodies are “designed” to bind to one, and only one, specific antigen. An entire industry has built up around developing those highly specific antibodies and incorporating them into various antibody-based tests including tests for drugs of abuse, histocompatibility tests, tests for particular viruses, bacteria and other pathogens, and so on. As another example, in ELISA sandwich assays a single type of highly specific antibody is printed on the bottom of each well in an array of wells, and the amount of antigen from a sample that binds to each well is interpreted as indicative of the concentration of the corresponding antigen in the sample.
In the Ph.D. Dissertation, “A Species Independent Universal Bio detection Microarray for Pathogen Forensics” by Shallom, an extremely large array of every possible nonamer (over 300,000 9-nucleotide sequences in all) is described, which can be used to identify the genetic materials of viruses or other organisms. It has a very limited ability to identify the components of mixtures of genetic materials from other organisms, and needs nearly as many elements in its array as there are organisms to describe.
In U.S. Patent Application Serial No. 16/015,379, Gehrke et al. describe an assay that employs spectrally encoded taggants on non-specific antibodies, to provide a method for detecting and diagnosing, e.g., an emergent disease for which no specific antibodies have yet been developed. Gehrke's preferred design is a paper-based assay like a lateral flow strip, except with spectrally coded antibodies - e.g., with nanoparticles of assorted colors attached to the antibodies. In order to read Gehrke's assay, it is necessary to use a colorimetric reader capable of distinguishing reliably between the different taggants.
PCT/AU2016/000371 by MacDonald describes a lateral flow assay that employs multiple multi-specific or mono-specific antibodies to produce a “digital” display, such as a 7-segment alphanumeric display. MacDonald explicitly does not use low-specificity antibodies, stating that the assay “preferably has no cross-reactivity”.
Few assays have been described which employ low-specificity DNA probes. For example, Gehrke's assay can employ low-specificity DNA probes; or it can employ low-specificity antibodies or other ligands. However, Gehrke's assay is limited to using spectrally coded ligands, along with the complication of using a highly accurate colorimetric reader.
The present disclosure provides heretofore unknown combinatorial affinity-based analysis assemblies and methods.
SUMMARY
A binding affinity-based assay method is provided, the method comprising: providing a first set of known low-specificity binding elements; presenting a known sample to the first set to define a first datum vector; providing a second set of known low-specificity binding elements, the first and second sets including the same known low-specificity binding elements; presenting an unknown sample to the second set to acquire a second datum vector; and comparing the first and second datum vectors to determine similarity or difference between the known and unknown samples. The “difference” between two datum vectors may, depending on context, mean the vector difference, or the root-mean-square difference, or any other useful difference measure.
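Two of the difference measures named above, the vector difference and the root-mean-square difference, can be sketched in Python. The datum vectors `known` and `unknown` are hypothetical values chosen only for illustration.

```python
import math


def vector_difference(a, b):
    """Element-wise difference between two datum vectors of equal length."""
    return [x - y for x, y in zip(a, b)]


def rms_difference(a, b):
    """Root-mean-square of the element-wise differences."""
    diff = vector_difference(a, b)
    return math.sqrt(sum(d * d for d in diff) / len(diff))


known = [1.0, 2.0, 3.0]    # hypothetical first datum vector (known sample)
unknown = [1.0, 4.0, 3.0]  # hypothetical second datum vector (unknown sample)
```

The vector difference preserves which detectors disagree, while the RMS difference reduces the comparison to a single number that can be ranked or thresholded.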
A binding affinity-based assay method is provided, the method comprising: presenting a known sample to a set of known low-specificity binding elements to define an individual datum vector to develop a library of datum vectors associated with known samples; presenting an unknown sample to the set of known low-specificity binding elements to acquire a test sample datum vector; and comparing the test sample datum vector to those of the library to determine the most likely mixture of antigens to have produced the observed datum vector.
DRAWINGS
Embodiments of the disclosure are described below with reference to the following accompanying drawings.
Fig. 1 is a general assay method according to an embodiment of the disclosure.
Fig. 2 illustrates constructing a library of datum vectors according to an embodiment of the disclosure.
Fig. 3 is a graphic representation of an example system having entities bound thereto according to an embodiment of the disclosure.
Fig. 4 is an example method for analyzing systems with bound entities according to an embodiment of the disclosure.
Fig. 5 is an example method for analyzing systems with bound entities according to an embodiment of the disclosure.
Fig. 6 is an example method for a graphical representation of a DS matrix. The DS matrix is a matrix representing the set of binding affinities between a set of antigens and a set of antibodies.
Fig. 7 is an example method for calculating a correction to a trial sample vector according to an embodiment of the disclosure.
DESCRIPTION
In the following description, the term “detector” is used to mean any device, component, moiety, element, or sensor that binds preferentially to a particular category of molecules or to particular aspects of molecules. The term “reader” is used to mean a device or set of devices that reports and/or records the results of allowing a sample containing one or more types of analytes to come into contact with a set of detectors. Hence, for example, a particular type of antibody spotted onto a surface or onto the bottoms of wells in a plate constitutes a “detector”. If that antibody binds with low specificity to a range of different analytes, it is a “low specificity detector”, or a “detector with low binding specificity”. Similarly, it may be called a “low specificity binding element”. The low-specificity binding elements can include selected molecular types. The low-specificity binding elements may be defined by at least one region spatially distinct from other individual regions. Each of the individual regions corresponds to a distinct low-specificity binding element.
A set or array of such “low specificity binding elements” may be used to determine a datum vector that describes the response of the set or array to a sample containing one or more analytes.
The term “molecular type” is used to refer to a set of molecules that are all the same, or that all have a specific property.
The terms “library” and “database” are used interchangeably in this specification.
The term “mathematical model” is used herein with a generalized meaning that includes “algorithmic model”, neural net model, and so on: any computational model or analog model that may be used to predict the behavior of a system.
The present disclosure will be described with reference to Figs. 1-7. Referring first to Fig. 1, a general assay method is provided that first includes determining a datum vector for a known sample. This known sample can include single or multiple known entities, one or more of which can be complexed by an assay that includes low-specificity complexing entities. From this binding a datum vector can be established that can be associated with both the known low-specificity complexing entities and the known sample. This datum vector can provide either or both of qualitative or quantitative information. Datum vectors can be established for many known samples.
Accordingly, a first set of known low-specificity binding elements can be provided and a known sample can be presented to the first set to define a first datum vector. Continuing with Fig. 1, a datum vector for an unknown sample can be determined. This unknown sample may or may not be provided in the same matrix as the known sample. From the complexing of the unknown sample with the low-specificity complexing entities a datum vector for the unknown can be determined.
For example, a second set of known low-specificity binding elements can be provided. The first and second sets can include the same known low-specificity binding elements in unbound form, then an unknown sample can be provided to the second set to acquire a second datum vector.
Again, with regard to Fig. 1, the datum vectors of the known and unknown samples can be compared to determine the likelihood of similarity or differences between the unknown and known samples. Comparing the first and second datum vectors can include using an algorithm to determine a degree of difference or similarity between the first and second datum vectors. Additionally or alternatively, an algorithm can be used to determine analyte concentrations in the unknown sample.
Referring next to Fig. 2, a general method is shown for determining datum vectors for a plurality of unknowns, and a library can be created for same. Accordingly, the unknowns can be associated with a particular trait of the sample, toxicity, infection, disease, etc. In accordance with example implementations, an unknown datum vector can be determined and compared with one or more entries in the library for similarity or dissimilarity.
Additionally, the library of datum vectors corresponding to known samples can be used to determine relative analyte concentrations of the unknown sample. In accordance with example implementations, mathematical models of a detector response to a mix of analyte concentrations, created from detector responses to varying concentrations of pure samples of known analytes, can be used. In accordance with at least one example, the binding equilibria of analytes can be used to determine analytes and/or concentration of analytes. Additionally, the kinetic association/dissociation of analytes can be used to determine analytes and/or concentration of analytes.
In accordance with example implementations, the similarity or difference between the known and unknown samples can be communicated to a remote device. For example, the two datum vectors (unknowns to knowns, knowns to a library, and/or unknowns to a library) can be determined and/or compared at a remote system via a communication medium. Accordingly, the presenting of the known samples to the low-specificity binding elements can be performed at a first location, and the presenting of the unknown samples to the low-specificity binding elements can be performed at a second location. The first and second locations can be the same or different.
In accordance with example implementations, a known sample can be presented to a set of known low-specificity binding elements to define an individual datum vector to develop a library of datum vectors associated with known samples. An unknown sample can be presented to the set of known low-specificity binding elements to acquire a test sample datum vector. Then, the test sample datum vector can be compared to those of the library to determine the most likely mixture of antigens to have produced the observed datum vector.
The assemblies and methods of the present disclosure can utilize a Combinatorial Affinity-Based Assay (CABA) that can incorporate a new approach to recognizing known pathogens and detecting/fingerprinting new pathogens. The assemblies and/or methods of the present disclosure can utilize an antibody-based test, but rather than using maximally specific antibodies (which are expensive and time-consuming to develop), the assemblies and methods can utilize a set of low-specificity antibodies to develop a “profile” or “fingerprint” (later herein called a “datum vector”) of a sample which may contain numerous different pathogens. This datum vector can be analyzed to identify the particular pathogens present in the sample, or to reveal the presence of an unknown pathogen or pathogens, or to reveal the non-presence of a known pathogen.
In at least one aspect of the present disclosure, instead of relying on antibodies that are highly specific to particular target antigens, CABA relies on antibodies that have low specificity and bind to multiple antigens. Rather than identifying an antigen by the particular antibody to which it binds exclusively, CABA characterizes or identifies the target antigen by the combination of antibodies that bind to it.
Use of the CABA is not limited to detection and recognition of pathogens; the assemblies and/or methods of the present disclosure can be used to analyze any substances to which antibodies will bind with low specificity, such as proteins, DNA, RNA, pollens, other antibodies, and even fats and sugars. Further, the assemblies and methods of the present disclosure are not limited to any particular method (e.g., ELISA or LFA) for detecting binding affinity of a sample to members of a set of antibodies.
In accordance with example implementations, all such antibody-based assays that incorporate multiple ELISA wells, LFA strips, etc., are referred to as “multiplexed antibody based assays” or “MAbs” when using the assemblies and/or methods of the present disclosure. As an alternative to antibodies, the assemblies and/or methods of the present disclosure can employ “aptamers”, which are nucleic acid molecules that distinguish between protein isoforms and conformations, and possess target recognition features.
There are various kinds of aptamers, including DNA or RNA or XNA aptamers that are constructed of (usually short) strands of oligonucleotides, and peptide aptamers that are constructed of one (or more) short variable peptide domains, attached at both ends to a protein scaffold. Arrays that incorporate multiple types of low-specificity detectors can include some detector elements with high specificity and still fall within the scope of this disclosure. Sets of detector elements can be referred to herein as “low-specificity detector arrays”, “detector arrays”, “detector element arrays”, and/or “MAbs”. “Antigen”, “target”, “pathogen”, or “analyte” may refer to a sample component that is detected by a detector element in an antibody array or aptamer array or the equivalent. “Antibody” can encompass antibodies, aptamers, and any other kind of molecular type or surface property that binds to an antigen with relatively low specificity.
The present disclosure provides a new way to use antibodies in antibody, aptamer, or other ligand arrays to provide a highly reliable analyte profile for any sample, based on the sample's analyte composition. This can be accomplished by using a “panel” or array of detectors that essentially classify each sample according to a set (e.g., 20 to 2000) of approximately orthogonal criteria or “affinity groupings”. In this context, an “orthogonal” criterion is largely independent of the other criteria in the sense that a set of vectors can be (linearly) independent of each other and therefore orthogonal in a mathematical sense.
However, not all of the criteria need to be orthogonal to each other. Instead, the full set of criteria can encompass at least one subset of criteria or combinations of criteria that span the space of possible analytes. In practice, criteria should provide detector array activation patterns that are highly distinguishable in regions of interest in analyte space. This does not require highly specific antibody/antigen or aptamer/analyte or ligand/ligand binding affinities. Instead, it benefits from selecting multiple antibodies, aptamers, or ligands that reliably bind to different large subsets of the set of all possible antigens or analytes of interest (the “UAS” or “Universal Antigen Set”). A “large subset” can encompass usually 20 to 80 percent of the UAS, but may also encompass a much larger or much smaller percentage of the UAS. Note that a typical high-specificity antibody will only bind strongly to less than 0.00001 percent of antigens in the UAS, for example.
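To illustrate why detectors that each bind a large subset of the UAS can still distinguish many antigens, the following sketch builds a random binary binding table and counts how many antigens end up with a unique pattern. The binary bind/no-bind simplification, the 50 percent binding fraction, and the array and antigen counts are assumptions made for illustration; real readouts are graded rather than binary.

```python
import random


def random_binding_matrix(n_detectors, n_antigens, bind_fraction, seed=0):
    """Rows are detectors, columns are antigens; 1 means the detector binds the antigen."""
    rng = random.Random(seed)
    return [
        [1 if rng.random() < bind_fraction else 0 for _ in range(n_antigens)]
        for _ in range(n_detectors)
    ]


def count_distinct_patterns(matrix):
    """Each antigen's pattern is its column; count how many columns are unique."""
    columns = list(zip(*matrix))
    return len(set(columns))


# 30 low-specificity detectors, each binding roughly half of 1000 antigens.
matrix = random_binding_matrix(n_detectors=30, n_antigens=1000, bind_fraction=0.5)
print(count_distinct_patterns(matrix), "distinct patterns among 1000 antigens")
```

With 30 binary detectors there are 2^30 (about one billion) possible patterns, so collisions among 1000 randomly assigned antigens are expected to be rare.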
An example method of selecting antibodies for inclusion in the detector array of the present disclosure can be as follows:
• A single analyte may be selected as a reference analyte.
• Samples containing the reference analyte in the same concentration and each of a wide range of different known single analytes in a range of different concentrations may be presented to a detector array containing a large assortment of single antibody-coated zones.
• Data received from the detectors may be analyzed to obtain approximate values of the equilibrium constants of competitive binding between the reference analyte and the known analytes, at each detector.
• The preceding steps may be repeated for a few hundred or more (potentially many thousands) different such reference analytes.
• From the analysis results, antibodies may be selected that produce optimally distinguishable binding affinity patterns on the detector array for all the analytes, while producing optimal distinguishability between binding affinity patterns for regions of analyte space that are deemed particularly important for any reason.
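The final selection step above could be approached in many ways. As a hedged sketch, the example below uses greedy forward selection, which is an assumption and not a method named in the disclosure: at each step it adds the antibody that most increases the smallest pairwise distance between analyte binding patterns. The affinity values are invented.

```python
from itertools import combinations


def min_pairwise_distance(affinity, chosen):
    """Smallest squared distance between any two analyte patterns,
    using only the chosen antibodies (rows of `affinity`)."""
    n_analytes = len(affinity[0])
    best = float("inf")
    for a, b in combinations(range(n_analytes), 2):
        d = sum((affinity[i][a] - affinity[i][b]) ** 2 for i in chosen)
        best = min(best, d)
    return best


def greedy_select(affinity, k):
    """Pick k antibody indices, one at a time, maximizing the worst-case separation."""
    chosen = []
    candidates = set(range(len(affinity)))
    while len(chosen) < k and candidates:
        pick = max(
            sorted(candidates),
            key=lambda i: min_pairwise_distance(affinity, chosen + [i]),
        )
        chosen.append(pick)
        candidates.remove(pick)
    return chosen


# Hypothetical affinities: rows are candidate antibodies, columns are analytes.
affinity = [
    [0.9, 0.9, 0.1],  # separates analyte 2 from analytes 0 and 1
    [0.5, 0.5, 0.5],  # binds everything equally, so it separates nothing
    [0.1, 0.9, 0.9],  # separates analyte 0 from analytes 1 and 2
]
print(greedy_select(affinity, 2))  # [0, 2]
```

Greedy selection is not guaranteed to find the best possible subset, but it is cheap and shows the objective: the uninformative antibody in row 1 is never chosen.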
So-called “competitive binding” assays, non-competitive binding assays, and other such assays may be adapted for use in the present disclosure, with the requirement that they provide a quantitative readout of the amount of analyte bound to an antibody, or of the binding affinity or other such related quantity, regarding the interaction between an antibody and an analyte. Alternatively, “epitope binning” methods may be employed to provide a basis for selecting antibodies that may produce distinguishable binding affinity patterns on the detector array for all the analytes, while producing optimal distinguishability between binding affinity patterns in regions of analyte space that are deemed particularly important. Epitope binning emphasizes the binding interactions between different antibodies that target epitopes located near each other on an antigen. The proximity of the epitopes on an antigen causes the presence of a bound antibody of one type at its epitope to inhibit the binding of the other antibody at its epitope.
Surface Plasmon Resonance (SPR), for example, is often used in epitope binning, allowing dynamic/kinetic measurements of binding affinities and equilibrium constants at relatively high speed and high throughput.
The present disclosure is not dependent on standard antibody-based detector arrays, or on particular methods for detecting binding affinity of each member of a set of antibodies or aptamers to analytes in a sample. Portions of the present disclosure herein refer to “detectors” rather than any particular kind of antibody-based or aptamer-based binding affinity indicator. A “detector” can be an LFA (with the human eye or an automated reader to obtain a numerical reading from the LFA), a well in an ELISA assay (with a suitable reader), or any other device that provides an indication of the binding affinity of a particular molecular type or group of types of molecules to components of a sample, at each of multiple detectors. Moreover, the present disclosure can encompass other kinds of affinities such as DNA hybridization, along with detectors capable of indicating binding affinity to components of a sample.
LFAs, ELISA assays, antibodies, aptamers, LFA readers and ELISA readers are examples of ways to implement the disclosure and to provide a clear exposition. A “detector” can be an ELISA well, a lateral flow indicator line, or the like, coupled with means for obtaining a quantitative readout of a value relating to the binding affinity of components of the sample to a low-selectivity antibody or aptamer; and a “detector array” can be a spatially separated set of such detectors. Moreover, each antibody or aptamer in the array may be attached to any region on the corresponding detector, or on multiple detectors. “Binding affinity” as used herein can refer to “relative binding affinity”, “specific binding affinity”, or any other quantity that correlates to the amount of an analyte that can bind to a given detector, possibly in competition with other analytes with their own binding affinities to the detector. Quantitative readouts can be obtained from an LFA, ELISA array, or other antibody array with a camera, or magnetic, fluorescent, mass-sensitive, radioactive, SPR, or other detection technology.
The present disclosure may also be used as a “negative screening assay”. That is, it may be used to identify a subset of a large set of samples, such that there is high confidence that the subset does not contain a specific analyte. Such an assay can be useful, if sufficiently fast and inexpensive, for screening individuals whose work requires them to be in contact with others who are vulnerable to a disease. If the screened individual is, say, 99.9% sure not to have the disease, it may be deemed safe to have them work with the vulnerable other individuals. Of course, it is best similarly to screen all individuals with whom the screened worker must interact closely.
In practice, the assemblies and methods of the present disclosure can employ a reader to provide a quantitative readout of the response of each detector (the detector response) to a sample. An ordered set of readouts for a sample can constitute a “datum vector” that correlates to the composition of the sample. The readout for each LFA in an LFA panel can be considered to be approximately proportional to the concentration of the antigens in the sample that bind to the LFA. Note that any given antigen may contribute to the readout values for a large subset of the LFAs in the panel. However, in practice, the readout value may not be directly proportional to antigen concentration in a sample even when the sample contains only that one antigen. The readout can be subject to various nonlinearities that can complicate interpretation of a datum vector. Such nonlinearities include saturation effects, and they also depend on the relative concentrations of different antigens in the sample, due to competition between the antigens for binding sites on the antibodies on a detector. Nonlinearities are often much easier to handle when the dynamic range of the assay is increased. For example, if the assay is performed six times on a sample, with the sample diluted by a further factor of 10 in each step, the dynamic range of the assay is potentially increased by as much as a million: six powers of ten. The equations of equilibrium kinetics (e.g., the Hill equations), in which the binding affinity of each antibody-analyte pair is represented by a corresponding constant, are sufficient for representing the first-order nonlinearities. A greater amount of information can be provided if the detectors and the reader are configured to produce a continuous readout of binding, to reveal binding kinetics (rates of association and dissociation, which may vary temporally in characteristic or informative ways under non-equilibrium conditions).
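The saturation effect and the benefit of a dilution series can be illustrated with a single-analyte Hill-type curve. The sketch below is a simplification under stated assumptions: it models one analyte on one detector, ignores competition between antigens, and uses invented values for the half-saturation constant and the starting concentration.

```python
def hill_response(concentration, k_half, n=1.0, r_max=1.0):
    """Hill-type saturation curve: readout as a function of analyte concentration.

    k_half is the concentration giving half of the maximum readout r_max;
    n is the Hill coefficient (1.0 means simple, non-cooperative binding).
    """
    if concentration <= 0.0:
        return 0.0
    c_n = concentration ** n
    return r_max * c_n / (k_half ** n + c_n)


def dilution_series(concentration, k_half, steps=6, factor=10.0):
    """Return (dilution, readout) pairs, diluting by a further `factor` at each step."""
    readouts = []
    for step in range(1, steps + 1):
        dilution = factor ** step
        readouts.append((dilution, hill_response(concentration / dilution, k_half)))
    return readouts


# Hypothetical sample far above the half-saturation point (arbitrary units).
print("undiluted:", hill_response(5000.0, k_half=1.0))
for dilution, readout in dilution_series(5000.0, k_half=1.0):
    print(f"1:{dilution:.0f} dilution -> readout {readout:.4f}")
```

The undiluted readout sits near saturation and carries little concentration information, while some of the diluted readouts fall in the steep part of the curve where the response is most informative.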
When association/dissociation kinetics of analytes are included, it is possible to more precisely determine the presence and/or concentrations of analytes.
An algorithm that can identify and compensate for nonlinearities in the readout system, to enable reliable identification of not only specific antigens, but the relative proportions of antigens in samples containing mixtures of antigens, can be utilized.
The algorithm may be structured as follows:
• From a database, or from knowledge of detector properties and the Hill equations, obtain an estimate of the functions $R_i(\delta_j)$, where $R_i$ is the readout value of the $i$th LFA when the panel of LFAs is presented with a sample containing various antigens, indexed by the subscript $j$, in the relative amounts $\delta_j$. Usually this will be approximately a sigmoid function of the sum of all $\delta_j$ of the antigens in the sample.
• To first order, $R_i(\delta_j)$ may be represented as a matrix of equilibrium constants $K_j(k)$, where $K_j(k)$ relates to the binding affinity of analyte $k$ to detector $j$.
• Based on the $R_i(\delta_j)$ and the datum vector $U_i$, find a sample vector $D_j$ that optimally reproduces the datum vector: $R_i(D_j) \to U_i$. There are several alternative approaches to finding that optimal sample vector; some of the approaches are described below. Determine the likelihood that the sample vector is a linear sum of sample vectors represented by datum vectors stored in the database. In the event that there are more severe nonlinearities, the distance measure between an actual datum vector and a calculated datum vector will require cross-terms that, in the context of differential geometry, would correspond to terms in the metric function of a curved space.
• If the sample vector is not likely to be a linear sum of known sample vectors, label it as “unknown, possibly novel”. If it is a very close match to such a linear sum, label it as such. If there are multiple distinct linear sums with a reasonably close match, label them as possible matches and assign a relative likelihood related to the quality of their matches to corresponding linear sums.
• Periodically or continuously review the database as datum vectors are added to it, to refine the functions $R_i(\delta_j)$ and to improve the distance measure used to define closeness of match.
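The fitting and labeling steps above could be sketched as follows, under the first-order (linear) approximation in which each readout is a weighted sum of analyte amounts. The solver used here, projected gradient descent for non-negative least squares, is a substitute chosen for brevity and is not taken from the disclosure; the affinity matrix, step size, and match tolerance are hypothetical.

```python
def predict(K, s):
    """Linear first-order model: readout_i = sum over k of K[i][k] * s[k]."""
    return [sum(k_ik * s_k for k_ik, s_k in zip(row, s)) for row in K]


def fit_mixture(K, u, steps=5000, lr=0.05):
    """Projected gradient descent for min ||K s - u||^2 subject to s >= 0."""
    n = len(K[0])
    s = [0.0] * n
    for _ in range(steps):
        r = [p - o for p, o in zip(predict(K, s), u)]  # residual K s - u
        grad = [sum(K[i][k] * r[i] for i in range(len(K))) for k in range(n)]
        s = [max(0.0, s_k - lr * g) for s_k, g in zip(s, grad)]  # clip at zero
    return s


def residual_norm(K, s, u):
    """Euclidean norm of the part of u that the mixture s does not explain."""
    return sum((p - o) ** 2 for p, o in zip(predict(K, s), u)) ** 0.5


def classify(K, u, tolerance=0.05):
    """Fit a mixture of known analytes and label the datum vector by fit quality."""
    s = fit_mixture(K, u)
    err = residual_norm(K, s, u)
    label = "match" if err <= tolerance else "unknown, possibly novel"
    return label, s, err


# Hypothetical affinities: rows are detectors, columns are known analytes.
K = [
    [0.9, 0.1, 0.5],
    [0.2, 0.8, 0.5],
    [0.4, 0.4, 0.9],
    [0.7, 0.3, 0.1],
]
true_mixture = [0.6, 0.2, 0.1]
print(classify(K, predict(K, true_mixture)))
print(classify(K, [0.0, 0.0, 0.0, 1.0]))
```

The first datum vector is generated from a known mixture and is recovered as a match; the second cannot be produced by any non-negative mixture of the three known analytes and is flagged instead.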
Alternatively, artificial intelligence (AI) methods such as artificial neural networks, neuromorphic computing, evolutionary programming, or deep learning can be used to analyze a database of datum vectors (along with attached data), essentially finding regions in sample space that correspond to regions in datum vector space. However, such methods require vast amounts of computing power. The algorithm outlined above requires relatively little computing power and can be executed on a desktop or laptop computer.
A complete service employing the assemblies and methods of the present disclosure can include mass-producible readers, mass-producible LFA panels, and an online database analogous to the DNA databases of 23andMe. The service may receive and interpret datum vectors along with their associated sample-related information (e.g., time, location, sample type, patient identifier, demographics, sample number, known antigens in the sample, etc.). The service may also track the trajectories of identified pathogens demographically or geographically, and report to agencies who need that information, at very low cost.
Example implementations of the disclosure can include the following steps:
1. Select a representative subset of the “universal set” of proteins. “Universal set” is the set of proteins that are expected to be the target of the assay.
2. Produce pure samples of each member of the representative subset of proteins.
3. Produce a “master library” or database of antibodies that bind with medium affinity to proteins in the “universal set”.
4. Select a “spanning subset” of the “master library”. This is a set of antibodies which in combination will produce a unique binding pattern for each protein in the representative subset. Generally, it is not a unique subset; there are a large number of different such “spanning subsets” for a given “master library” and given representative subset of proteins. In fact, typically a relatively small (100 to 2000 element) random subset of the “master library” is extremely likely to contain a suitable “spanning subset”.
5. Produce pure samples of each of the antibodies in the “spanning subset”. Typically these will be monoclonal antibodies.
6. Prepare a set of identical antibody arrays from the “spanning set” of antibodies.
7. Determine the binding patterns of the individual proteins in the representative subset to the antibody arrays, and refine the antibody array design according to the observed binding patterns. The goal of refinement is an array consisting of a minimal number of antibodies such that the binding patterns of the antibody array to the various proteins are maximally distinguishable from each other.
8. Determine the binding patterns of the individual proteins in the “universal set” of proteins to the refined-design antibody arrays, and enter the patterns and accompanying data into a “reference database”.
Steps 1 and 2 in the above procedure can use well-known methods. For example, the “universal set” of proteins may be the set of roughly 1500 distinct surface coat proteins of the Chagas parasite, T. cruzi. DNA sequences corresponding to the amino acid sequences of many of those proteins are easily obtained via online databases such as GenBank. The DNA sequences may be synthesized by any of the commonly used “gene synthesis” methods. The resulting DNA sequences are inserted into the genomes of a host organism (e.g., yeast, bacteria, or phages) using methods well-known in the molecular biology art. The host organisms are cultured, and they express the proteins corresponding to the DNA sequences. Monocultures of the host organisms are thus used to produce pure samples of each of the subject proteins, again using methods well-known in the art. Step 3 in the above procedure can be done using well-known methods such as inoculating camelids (typically llamas or alpacas) or mice with a mixture of the proteins in the “universal set” to elicit an immune response, then isolating and cloning the resulting antibodies to produce a “master library”, then “panning” the “master library” to capture the subset of antibodies capable of binding to the proteins in the “universal set”.
In selecting antibodies that bind with medium affinity as opposed to high affinity, it is possible to discard those antibodies that do not bind or are very easily dislodged from their binding sites, retain those antibodies (the medium-affinity antibodies) that are less easy to dislodge from their binding sites, and discard those that are very difficult to dislodge from their binding sites.
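The two-sided cut described above can be expressed as a simple band filter. In the sketch below, the use of a dissociation rate as the measure of how easily an antibody is dislodged, the antibody names, the rates, and both cutoffs are hypothetical values chosen only to show the shape of the selection.

```python
def select_medium_affinity(off_rates, slowest_allowed, fastest_allowed):
    """Keep antibodies whose dissociation rate lies between the two cutoffs.

    A rate above fastest_allowed means the antibody is too easily dislodged;
    a rate below slowest_allowed means it is too difficult to dislodge.
    """
    return sorted(
        name
        for name, rate in off_rates.items()
        if slowest_allowed <= rate <= fastest_allowed
    )


# Hypothetical dissociation rates (per second) for four candidate antibodies.
off_rates = {"Ab1": 5e-1, "Ab2": 3e-3, "Ab3": 8e-6, "Ab4": 1e-2}
kept = select_medium_affinity(off_rates, slowest_allowed=1e-4, fastest_allowed=5e-2)
print(kept)  # ['Ab2', 'Ab4']
```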
Steps 4 and 5 may be accomplished in practice by first selecting at random 100 to 2000 different unique antibodies from the “master library” and cloning them individually. The resulting monoclonal antibodies may be used to prepare one or multiple antibody arrays, and the equilibrium datum vectors of the proteins to the arrays or the individual antibody/protein association/dissociation curves may be determined using methods well known in the art such as ELISA assays or surface plasmon resonance (SPR) measurements. Any of many well-known optimization techniques may be used to select a subset of the 100 to 2000 antibodies that produce unique, easily distinguishable datum vectors of the proteins on the arrays.
Step 6 may be accomplished by well-known methods of cloning and microprinting antibodies onto substrates in, e.g., a rectangular array pattern.
Step 7 may be accomplished by presenting each of the proteins in the “representative subset” individually to the arrays, then measuring and storing the resulting datum vectors. Typically the number of elements in the “representative subset” is much larger than necessary. Standard optimization methods may then be used to select antibodies to form a minimally-sized “spanning subset” which results in arrays with maximally distinguishable datum vectors among the array/protein datum vectors of the “representative subset” of proteins.
Finally, in Step 8 the optimized arrays produced in Step 7 are used to determine the datum vectors of each of the proteins in the “universal set” to the arrays. Those datum vectors are stored in a reference database.
In order to use the CABA, the following steps can be performed:
1. An unknown sample is obtained and pre-processed. E.g., the sample may be blood, plasma, tissue, a cell or virus culture, urine, milk, etc. The sample may contain a pathogen to be identified. Pre-processing may include such procedures as sonication, digestion, filtration, dilution, centrifugation, Western blot separation, culturing, plating, concentration, and so on, with the objectives of removing irrelevant proteins while retaining the unidentified pathogen and producing a solution with standard concentrations of background proteins and standard properties.
2. The pre-processed sample is presented to the CABA, and the resulting datum vector is recorded.
3. The datum vector is compared to datum vectors in the database, and well-known optimization methods are employed to determine the most likely mixture of reference proteins (i.e., the proteins in the reference database) to have produced the observed datum vector.
4. If there is no mixture of reference proteins that could have produced the observed datum vector, well-known optimization methods are employed to determine the “anomalous component” of the observed datum vector, which is the datum vector of a new protein which, if added to the reference database, would allow a close match between the observed datum vector and the expected datum vector of the most likely mixture of reference proteins plus the new protein. In this case, the new protein may correspond to a previously unknown pathogen. This can be confirmed if the same anomalous datum vector is observed in further samples from a population of patients displaying similar symptoms.
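Step 4 can be illustrated with a crude proxy for the anomalous component: the part of the observed datum vector left over after subtracting what the best reference mixture explains. The sketch assumes a linear response model and a mixture that has already been fitted by some other means; the matrix, the mixture, and the cosine threshold used to decide whether two samples show the same anomaly are hypothetical.

```python
import math


def predicted_datum(K, mixture):
    """Expected readouts for a mixture under a linear response model."""
    return [sum(k * m for k, m in zip(row, mixture)) for row in K]


def anomalous_component(observed, K, mixture):
    """Observed readouts minus what the best reference mixture explains."""
    return [o - p for o, p in zip(observed, predicted_datum(K, mixture))]


def same_fingerprint(a, b, min_cosine=0.95):
    """True if two anomalous components point in nearly the same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return False
    return dot / (norm_a * norm_b) >= min_cosine


# Hypothetical reference affinities (rows: detectors, columns: reference proteins).
K = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
mixture = [0.4, 0.2]               # assumed to be the best-fitting reference mixture
observed = [0.4, 0.3, 0.6]         # readouts from a patient sample
anomaly = anomalous_component(observed, K, mixture)
print(anomaly)
print(same_fingerprint(anomaly, [0.0, 0.2, 0.6]))  # anomaly from a second patient
```

Seeing the same direction of anomaly in several samples, possibly at different strengths, would be consistent with the confirmation step described in item 4.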
Although each of the steps in the above procedures is well understood in the respective arts of molecular biology and optimization theory, the steps in combination result in an entirely new and highly advantageous approach to identifying antigens (i.e., proteins and other ligands) via binding affinities.
Existing assays depend on highly specific binding, wherein one antibody type binds exclusively to one antigen type. This requires a unique antibody type to be developed for each antigen type, resulting in a highly expensive and time-consuming development process to make an assay for each antigen type. In turn, this results in a very large number of different assays to test for the presence of each possible pathogen: also an expensive, time- and space-consuming aspect of current assays.
The assay of the present disclosure, herein referred to as CABA, employs a single “generic” antibody array to test for the presence of any of a very large number of different potential antigens in a sample. That is, a single relatively small, standardized, CABA array can identify any of the antigens in its “reference database” of antigens. As a result, a small set of small CABA arrays may be used to identify all possible types of potential known antigens. Moreover, the same small set of CABA arrays can be used to detect the presence of new antigens, and to identify new antigens as soon as the new antigens are added to the reference database. The vastly reduced number of required types of antibodies, the vastly reduced number of antibody array types needed, the vastly reduced development time and cost required to enable the CABA to identify new antigens, and the vastly simplified logistics of maintaining a supply of the assays, together give CABA a very substantial advantage over current affinity-based assays.
Figure 3 is a graphical representation of an example system having entities bound thereto according to an embodiment of the disclosure. The right-hand image represents the binding pattern of a first entity onto the elements of a first detector array. The middle image represents the binding pattern of a second entity onto the elements of a second detector array identical to the first detector array. The left-hand image represents the binding pattern formed by both first and second entities when the entities are presented simultaneously to a third detector array identical to the first two detector arrays. To first order, the left-hand binding pattern is a linear superposition of the middle and right-hand binding patterns.
Figure 4 illustrates a way that the invention may be implemented, to provide a useful service to medical professionals, clinics, hospitals, health agencies, and so on. A sample is collected (402), which may be a patient's tissues or body fluids. Alternatively, the sample may be from food, waste water, sewage, air, soil, dust, plants, or any other source that may contain viral or bacterial fragments, products, or living organisms. The sample is then prepared (402) for analysis. Depending on the purpose of the analysis, the sample may be filtered, diluted, pH buffered, cultured, amplified, dissolved, sonicated, heated, lysed, centrifuged, passed through a chromatography column, or submitted to a one-dimensional or two-dimensional gel for separation.
The prepared sample is presented (404) to the array of this invention, such that different components of the prepared sample each bind to one or more elements of the array. A reader, which may be a camera, an SPR device, or any other device that can quantitatively measure the amount of material bound to each element of the array, is used to read (406) the results of presenting the sample to the array. The results, represented as a “datum vector”, are then sent (408) via the internet or other communication system to a processor that uses artificial intelligence (AI) to analyze the datum vector. The datum vector is stored (414) along with supplementary information and analysis results in a database. As information is added to the database, the information is used (416) to continually or periodically upgrade the AI. Results of the reading (406) are sent (410) to users via the internet or other communication system. A report (412) is compiled for sending to users or to health agencies or government agencies, including a selection of relevant information that can be extracted from the analysis and/or from comparison or correlation of the results with other results stored in the database. Information in the database may be used to continually upgrade the AI.
Figure 5 illustrates aspects of how a datum vector may be processed by the AI to obtain a sample vector. The datum vector represents the results of presenting a sample to the array of this invention, while the sample vector represents the most likely composition of the sample, based on the stored information in the database and the observed datum vector (500). The observed datum vector (500) is received over a suitable communication system from a reader. A trial sample vector (506) can be a randomly generated sample vector, or it can be a sample vector obtained from a sample that is similar to the sample from which the observed datum vector (500) was obtained, or it can be any sample vector at all. The trial sample vector (which in the first iteration is also the New Trial Sample Vector 508) is operated upon by a detector array model (SD model) 510 to obtain a trial datum vector 512. The comparator 502 compares the trial datum vector 512 to the observed datum vector 500 to produce an error datum vector 504. If the error datum vector 504 meets conditions specified by the “difference operator” (< Δ?) 520, it is presented as the best-guess sample vector 518. Otherwise, the error datum vector 504 is passed to the sample vector correction calculator 516 to obtain a correction 514. The correction is subtracted from the trial sample vector 506 by operator 522 to obtain a new trial sample vector 508. The sample vector correction calculator 516 makes use of both the observed datum vector 500 and the error datum vector 504 to calculate the correction 514, using a method like that illustrated in Figure 7. The process described above continues cyclically until the error datum vector 504 meets the conditions specified by the difference operator 520.
Figure 6 illustrates a graphical method for representing a DS matrix. The DS matrix may have rows with elements whose values correspond to the binding strength between all antigens in the sample and one antibody in the array. Each row in the DS matrix corresponds to a different antibody, while each column corresponds to a different antigen. The red lines from the square representing antigen 602 go to antibodies 610 and others, to which the antigens bind. The blue lines from the circle representing antibody 608 go to the antigens in 600 that bind to the antibody 608. The lines representing binding between the other antibodies and other antigens are omitted for clarity.
Figure 7 illustrates how the sample vector correction calculator may be structured, although it may be structured in other ways. An observed datum vector 700, 500, a trial datum vector 702, 512 and a trial sample vector 704, 506 are given. From the DS matrix 708, the SD model 706, and the trial datum vector 702, an approximate value for the derivative $\partial S/\partial D$ is calculated at the location of the trial datum vector 702. From the trial datum vector 702 and the observed datum vector 700, calculate 712 the difference $\Delta D$. From $\Delta D$ and $\partial S/\partial D$, calculate 714 a correction term $K \cdot |\Delta D| \cdot (\partial S/\partial D)$ to subtract 716 from the trial sample vector to produce a new trial sample vector 718. $K$ is a constant that is typically user-adjustable. Note that $\partial S/\partial D$ is a matrix whose terms are the partial derivatives of the elements of the sample vector $S$ with respect to the elements of the datum vector $D$.
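The iterative loop of Figures 5 and 7 can be sketched in a few lines. This is an illustration under stated assumptions rather than the disclosed calculator: the correction here is a plain gradient-descent step that multiplies the error datum vector by the transpose of a finite-difference estimate of the model's Jacobian, which stands in for the correction term described for Figure 7. The saturating detector model, its binding matrix, the gain, and the stopping threshold are all hypothetical.

```python
import math


def finite_difference_jacobian(model, sample, h=1e-6):
    """Approximate jac[i][k] = d(datum_i) / d(sample_k) by forward differences."""
    base = model(sample)
    jac = [[0.0] * len(sample) for _ in base]
    for k in range(len(sample)):
        bumped = list(sample)
        bumped[k] += h
        shifted = model(bumped)
        for i in range(len(base)):
            jac[i][k] = (shifted[i] - base[i]) / h
    return jac


def refine_sample_vector(observed, model, n_analytes, gain=0.5, delta=1e-6, max_iter=2000):
    """Cycle trial sample -> trial datum -> error -> correction until the error is small."""
    trial = [0.0] * n_analytes                                  # trial sample vector
    for iteration in range(max_iter + 1):
        trial_datum = model(trial)                              # detector array model
        error = [t - o for t, o in zip(trial_datum, observed)]  # comparator output
        err_norm = math.sqrt(sum(e * e for e in error))
        if err_norm < delta or iteration == max_iter:           # difference test
            return trial, err_norm, iteration
        jac = finite_difference_jacobian(model, trial)
        correction = [
            gain * sum(jac[i][k] * error[i] for i in range(len(error)))
            for k in range(n_analytes)
        ]
        trial = [s - c for s, c in zip(trial, correction)]      # subtract correction


# Hypothetical detector model: linear binding followed by simple saturation.
BINDING = [[0.9, 0.1], [0.2, 0.8], [0.5, 0.5]]


def saturating_model(sample):
    readouts = []
    for row in BINDING:
        x = sum(b * s for b, s in zip(row, sample))
        readouts.append(x / (1.0 + x))
    return readouts


observed = saturating_model([0.5, 0.3])
best, err, iters = refine_sample_vector(observed, saturating_model, n_analytes=2)
print("best-guess sample vector:", best)
print("final error norm:", err, "after", iters, "iterations")
```

The gain plays the role of the user-adjustable constant: too small and the loop converges slowly, too large and the trial sample vector can overshoot and oscillate.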
In at least one embodiment of the disclosure, an ELISA plate with N wells is prepared, with a different one of N different primary antibodies adhered to the bottom of each well. In accordance with the current invention, the primary antibodies are selected from a library of antibodies to bind with varying affinity to members of different partially overlapping subsets of the set of antigens that are expected to be in a sample.
The sample is poured over the wells, time is allowed for antigens in the sample to bind with the antibodies in the wells, and then the wells are rinsed to remove un-bound antigen. A fluorescent tag which binds to antibody/antigen complexes is added to the wells, and then the wells are rinsed again to remove any unbound tag. The plate with its wells is placed in a reader which illuminates the wells with light that stimulates fluorescence, and which measures the amount of fluorescence emitted by each well. That amount of fluorescence is proportional to the amount of antigen bound to the primary antibody in each well.
The fluorescence intensities of the wells as read by the reader comprise a datum vector (e.g., a listing of the detector values read by the reader). The datum vector is submitted to an artificial intelligence (AI) system which includes a mathematical model of the ELISA array and the binding equilibria for known antigens, and a database of known samples and corresponding datum vectors. The AI system then seeks a sample composition that, according to the model and the database, optimally matches the measured datum vector from the ELISA reader.
That sample composition is output by the AI system as the most probable composition of the sample. The AI system may also output alternative sample compositions along with their relative probabilities of being the correct composition. In addition, if the AI system is unable to find one or more sample compositions whose output datum vectors would correspond sufficiently closely to the observed datum vector, the AI system may indicate that an unknown or novel antigen is present in the sample. The difference between the observed datum vector and the best-match datum vector that can result from a sample containing known antigens provides a “difference fingerprint” of the unknown or novel antigen; and the difference can also provide information about the unknown or novel antigen that can be helpful in isolating, analyzing, or identifying that antigen.
Finally, the datum vector and any other available information provided by the user about the sample is placed in the database, and a description of the most-probable antigen composition of the sample may be sent to the user.
In accordance with at least one embodiment of the disclosure, the assemblies and/or methods can include a set of 20 to 2000 lateral flow elements of standard design with a different antibody type in each lateral flow element as follows:
1. each antibody corresponding to a flow element can bind with low selectivity to some subset comprising anywhere from a small fraction to roughly half of the universal set of antigens;
2. each antibody corresponding to a flow element can bind to a different subset of the universal set of antigens, with respect to the antibodies in the other flow elements; and
3. the binding affinities of the antibodies in the set of flow elements may form an effectively complete, redundant, vector space which can be mapped to the universal set of antigens.
In accordance with another embodiment of the disclosure, the assemblies and/or methods of the disclosure can include a reader that provides a quantitative readout of strip color and intensity resulting from a sample submitted to the lateral flow elements. This reader may provide a real-valued vector of values that depends on the composition of the sample. This vector may be referred to herein as a “datum vector”, and is identified with the sample being tested. LFA panel readers that can be used in accordance with the present disclosure can include a color photo scanner such as the Epson® Perfection® V39 Flatbed Desktop Photo Scanner, Model #: B11B232201 (cost in March 2020: only $100). This scanner can provide colorimetry and spatial resolution more than sufficient for use as an LFA panel reader.
Software to derive a datum vector from a scanned LFA panel can include one or more of:
• image re-registration and scaling
• feature identification
• colorimetry of a reference patch on the LFA panel
• colorimetry of identified features in the image
• a way to read or key in information that should be associated with the datum vector such as date, time, sample number, and information regarding the source and nature of the sample.
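The colorimetry items in the list above can be sketched as follows. This is a minimal sketch that assumes the scanned image has already been registered and scaled, that it is available as rows of (R, G, B) tuples, and that the reference patch and the features are given as rectangles; the function names, the box convention, and the "fraction darker than the reference" signal are illustrative assumptions rather than part of the disclosure.

```python
def mean_rgb(image, box):
    """Average (R, G, B) over a box given as (top, left, bottom, right)."""
    top, left, bottom, right = box
    pixels = [image[y][x]
              for y in range(top, bottom)
              for x in range(left, right)]
    count = len(pixels)
    return tuple(sum(p[c] for p in pixels) / count for c in range(3))

def datum_vector(image, reference_box, feature_boxes):
    """One value per feature: mean darkening relative to the reference patch."""
    ref = mean_rgb(image, reference_box)
    vector = []
    for box in feature_boxes:
        rgb = mean_rgb(image, box)
        # Fraction by which each channel is darker than the reference,
        # averaged over the three channels.
        vector.append(sum(1.0 - rgb[c] / ref[c] for c in range(3)) / 3.0)
    return vector

# Tiny synthetic "scan": 4 rows x 6 columns, light background,
# with one darker band in columns 2 and 3.
image = [[(200, 200, 200)] * 6 for _ in range(4)]
for y in range(4):
    for x in (2, 3):
        image[y][x] = (100, 100, 100)

vector = datum_vector(image,
                      reference_box=(0, 0, 4, 2),
                      feature_boxes=[(0, 2, 4, 4), (0, 4, 4, 6)])
```

Normalizing each feature against the reference patch is intended to reduce the effect of scanner-to-scanner and lighting differences; a production reader would also need the registration and feature-identification steps that this sketch takes as given.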
In accordance with another embodiment of the disclosure, the antibody array is placed in an SPR instrument such as the Carterra Biosystems LSA(c), and a sample is flowed through the instrument. The binding affinities between the antibodies and the analytes that may be in the sample may be measured by the instrument if the analytes are flowed sequentially through the instrument. Similarly, an antigen array may be placed in an SPR instrument and the antibodies may be flowed sequentially through the instrument to obtain essentially the same information. This information is used to construct and refine a model of the detector array. In the present disclosure, a complex unknown sample may be flowed through the instrument containing an antibody array, thereby revealing the amount of antigen bound to each antibody type, and thereby producing a datum vector for the unknown sample.
A way to increase the amount of information obtained from an assay performed in accordance with the present disclosure is to obtain a first datum vector using the unknown sample, and then to obtain an additional datum vector by using the unknown sample with a known added quantity of a known mixture of selected analytes. This will provide further information directly relevant to the competitive binding between different analytes at each antibody element of the array; and the information can further be utilized by a machine learning system or AI system or by the preferred algorithm, to provide improved reliability of analysis of the sample and interpretation of the corresponding datum vector. In fact, the further information can be concatenated to the datum vector to form an extended datum vector which can be handled by an AI or machine learning system in essentially the same way as the datum vector itself.
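As a small illustration of the extended datum vector, the sketch below concatenates the reading of the unknown sample with the reading of the same sample after the known mixture is added. The function names and the example readings are hypothetical.

```python
def extended_datum_vector(base, spiked):
    """Concatenate the plain reading with the reading after the known addition."""
    if len(base) != len(spiked):
        raise ValueError("both readings must come from the same detector array")
    return list(base) + list(spiked)

def spike_response(base, spiked):
    """Per-detector change caused by the known added mixture."""
    return [s - b for b, s in zip(base, spiked)]

base = [0.20, 0.55, 0.10]    # illustrative reading: unknown sample alone
spiked = [0.35, 0.60, 0.40]  # illustrative reading: sample plus known mixture
extended = extended_datum_vector(base, spiked)
change = spike_response(base, spiked)
```

The extended vector has twice as many elements as the original and can be stored and compared in the same way; the per-detector change is one simple view of how strongly the added analytes compete at each element.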
In accordance with another embodiment of the disclosure algorithms are provided that can compare an unknown datum vector to other datum vectors stored in a database, to determine the degree of similarity or difference between the datum vectors and therefrom to obtain a maximum-likelihood estimate of the composition of the sample identified with the unknown datum vector. One or more of the algorithms can include the following:
1. find nearest-neighbors to the unknown datum vector within the datum space represented by entries in the database. If there is a close enough match to a specific stored datum vector (that is, if the unknown datum vector is similar enough to a specific stored datum vector), the unknown is flagged as “probably the same as” the specific stored datum vector. Note: “nearest neighbors” within datum space may not be “nearest neighbors” in a Cartesian sense (i.e., the Cartesian distance measure of the square root of the sum of the squares of the coordinate differences, or the RMS differences of coordinate values, where the coordinates are the detector readout values). A useful distance measure may involve taking the nonlinear relationship between sample space and datum vector space into account.
2. If there is not a close match, a search is made for possible linear combinations of known antigens (that is, for “sample vectors”) that would produce a closely matched datum vector to the unknown datum vector. This search may be intentionally limited to include only reasonably likely such combinations. If close matches are found, the corresponding linear combinations are flagged as “possibly the same” antigen combination as in the sample.
3. If no such close match can easily be found, the unknown datum vector may be flagged as “potentially containing one or more new antigens”.
In accordance with another embodiment of the disclosure, the assemblies and/or methods can include a service that allows local testing of samples at a first location and, via a suitable communication medium, remote centralized analysis (at a second location) of datum vectors resulting from the readings. Analysis of datum vectors may include comparing datum vectors corresponding to unknown samples to datum vectors corresponding to known samples and drawing conclusions from the comparison. And, of course, if standardized arrays of low-specificity binding elements are available, presentation of unknown samples to the arrays can be done at one location while presentation of known samples can be done at a different location. Typically, the results of the analysis will be communicated to a user at a location remote from the location where analysis is done, but in some cases the first and second locations will be the same. This “global embodiment” may include regular updating of the database using the new datum vectors and associated data that are submitted with them, and providing geographical, temporal, demographic, and other data to health agencies, to assist in tracking the emergence and trajectories of contagious diseases.
Note that “similarity” and “difference”, with respect to two vectors, are reciprocal relations. A smaller difference between two vectors generally corresponds to a greater similarity between the two vectors.
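The three-step comparison described above (nearest stored match, then combinations of known antigens, then a "new antigen" flag) can be sketched as follows. This is a simplified illustration under several assumptions: it uses the plain Euclidean distance even though, as noted above, a more useful distance measure may account for the nonlinear relationship between sample space and datum space; it predicts the datum vector of a mixture by adding the signatures of its components, which is a linear approximation; and it limits the search to pairs of antigens on a coarse grid of amounts. The function names, the tolerance, and the grid are hypothetical.

```python
import math
from itertools import combinations, product

def distance(u, v):
    """Euclidean distance between two datum vectors (an assumed measure)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def classify(unknown, library, signatures, tol=0.05,
             amounts=(0.25, 0.5, 0.75, 1.0)):
    """library: sample name -> stored datum vector.
    signatures: antigen name -> datum vector for a unit amount of that antigen."""
    # Step 1: nearest stored datum vector.
    name, vec = min(library.items(), key=lambda kv: distance(unknown, kv[1]))
    if distance(unknown, vec) <= tol:
        return "probably the same as", name

    # Step 2: limited search over pairs of known antigens (linear approximation).
    best = None
    for (n1, s1), (n2, s2) in combinations(sorted(signatures.items()), 2):
        for a, b in product(amounts, repeat=2):
            predicted = [a * x + b * y for x, y in zip(s1, s2)]
            d = distance(unknown, predicted)
            if d <= tol and (best is None or d < best[0]):
                best = (d, {n1: a, n2: b})
    if best is not None:
        return "possibly the same", best[1]

    # Step 3: nothing known explains the reading.
    return "potentially containing one or more new antigens", None

signatures = {"A": [1.0, 0.0, 0.5], "B": [0.0, 1.0, 0.5]}
library = dict(signatures)  # here the library holds only the pure samples

result_known = classify([1.0, 0.0, 0.5], library, signatures)
result_mix = classify([0.5, 0.5, 0.5], library, signatures)
result_new = classify([0.0, 0.0, 3.0], library, signatures)
```

Restricting step 2 to a small grid and to pairs mirrors the remark that the search may be intentionally limited to reasonably likely combinations; a fuller implementation would replace the additive prediction with the nonlinear array model.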
In accordance with another embodiment of the disclosure, the assemblies and/or methods can employ an “ELISA sandwich assay” as an array of wells. At the bottom of each well a distinct corresponding type of very broad-specificity antibody (or selected set of such antibodies) is attached. A microarray reader can be used to detect the amount of antigen that binds to the antibody type in each well.
In accordance with another embodiment of the disclosure, the assemblies and/or methods can employ an “ELISA sandwich assay”, but in each well of the assay a corresponding selected mixture of antibody types is printed and/or attached. Particular antibody types can be components of multiple selected mixtures in wells, or each well can contain a corresponding non-overlapping subset of a large set of antibodies, or some wells can contain mixtures, some unique antibodies, and so on.
A way to select antibodies for an antibody array to be used in the present disclosure in order to distinguish between mutant viruses within a family is to follow these steps:
1. Prepare an array of antigens, consisting of a large number of variants of the viral coat proteins of a family of viruses (e.g., coronaviruses).
2. Present a large library of antibodies to the antigen array, and separate out those antibodies that bind with intermediate strength to the antigens in the array.
3. Perform a binding kinetics study on the intermediate strength antibodies against the antigens, to identify those antibodies that fall into “bins”, i.e., antibodies that compete with one another for binding to a specific epitope on the antigen. Each such competing set of antibodies is a “bin”. Note that an antibody can sometimes be a member of multiple bins.
4. Eliminate from each bin those antibodies whose dissociation constants are extremely small or extremely large.
5. From the remaining antibodies in each bin, select a subset of antibodies that comprise approximately 2% to 40% of the antibodies in the bin, such that the antibodies in the subset are as different from each other as possible.
6. Prepare an array from the antibodies selected in Step 5, and test it against the full set of viral coat proteins in the family of interest.
Steps 2 and 3 can be performed using, for example, the Carterra LSA platform.
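Step 5 asks for a subset of each bin whose members are as different from each other as possible. One common heuristic for that kind of selection is greedy farthest-point sampling, sketched below. The disclosure does not name a selection algorithm, so this choice, the representation of each antibody by a binding profile across the antigen variants, the Euclidean distance, and all names and values are assumptions made for illustration.

```python
import math

def distance(u, v):
    """Euclidean distance between two binding profiles."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def select_diverse(profiles, fraction=0.4):
    """profiles: antibody name -> binding profile across antigen variants.
    Returns roughly `fraction` of the bin, chosen greedily for diversity."""
    names = sorted(profiles)
    target = max(1, round(fraction * len(names)))
    chosen = [names[0]]  # arbitrary but reproducible starting point
    while len(chosen) < target:
        # Add the antibody whose nearest already-chosen neighbor is farthest away.
        candidate = max(
            (n for n in names if n not in chosen),
            key=lambda n: min(distance(profiles[n], profiles[c]) for c in chosen),
        )
        chosen.append(candidate)
    return chosen

bin_profiles = {
    "ab1": [1.0, 0.0, 0.0],
    "ab2": [0.9, 0.1, 0.0],  # nearly redundant with ab1
    "ab3": [0.0, 1.0, 0.0],
    "ab4": [0.0, 0.9, 0.1],  # nearly redundant with ab3
    "ab5": [0.0, 0.0, 1.0],
}
selected = select_diverse(bin_profiles, fraction=0.4)
```

With these illustrative profiles the near-duplicate "ab2" is passed over in favor of an antibody with a distinct profile. The greedy rule does not guarantee the most diverse subset possible, but it is simple and tends to avoid redundant members.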
In accordance with example implementations, typically there will be enough information derivable from readings from a relatively small set of detectors (e.g., 20 to 2000) to detect and identify a much larger number (e.g., 100 to 20,000) of different antigens than the number of detectors. This fact may be illustrated by reference to a mathematical trick used by some people claiming to be psychics. The “psychic” shows a victim a set of six sheets of paper, each containing 4,000 names in different arrangements. Each sheet is divided into quadrants. The victim is asked to indicate which quadrant his surname is in, on each sheet. With just six such indications, the “psychic” is able to know the victim's surname. A typical victim does not know enough mathematics to be aware that the six indications amount to a unique code for each surname, and that there are 4^6 = 4096 different such codes. In other words, only six measurements, each with four possible outcomes, are sufficient to distinguish 4096 different items. More directly germane to the present disclosure, an array of N detectors, each with a binary readout, can ideally distinguish as many as 2^N different analytes. If N is merely 20, that amounts to 1,048,576 different analytes. For N = 40, it amounts to over a trillion different analytes.
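The counting argument can be checked with a few lines of code. The sketch below assumes one particular arrangement of names on the sheets, namely that the quadrant of a name on sheet k is the k-th base-4 digit of the name's position in a master list; the source does not say how the sheets are arranged, so this arrangement and the function names are illustrative.

```python
def quadrant_code(index, sheets=6):
    """Quadrant (0 to 3) holding name number `index` on each of the sheets."""
    return [(index // 4 ** k) % 4 for k in range(sheets)]

def decode(quadrants):
    """Recover the position in the master list from the six indications."""
    return sum(q * 4 ** k for k, q in enumerate(quadrants))

# Every one of the 4,000 names gets its own six-digit code.
codes = {tuple(quadrant_code(i)) for i in range(4000)}

# Capacity of six four-way indications, and of N binary detectors.
capacity_six_sheets = 4 ** 6
capacity_20_detectors = 2 ** 20
capacity_40_detectors = 2 ** 40
```

The same reasoning gives the ideal capacity of an array of binary detectors: each added detector doubles the number of distinguishable readout patterns.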
The “psychic” trick is relatively easy to understand. Less easy to understand is the fact that it is, under the right circumstances, possible not only to detect the presence of a single analyte, but also possible to determine the analyte concentrations in a sample containing multiple analytes. This is an important consequence of the present disclosure.
If the relationship between sample space and datum vector space were linear, the approach to determining concentrations of multiple analytes in a sample would be fairly straightforward linear algebra. In the case of the present disclosure, however, the relationship is almost always nonlinear.
In accordance with another embodiment of the disclosure, the assemblies and/or methods can separate a set of samples into two subsets, such that one subset almost certainly does not contain a given analyte, and such that the other subset may or may not contain the given analyte. This can be termed a “negative screening assay”.
For example, if a pre-made existing detector array is exposed to a novel virus in several known samples (which may contain mixtures of a few analytes), and if the detector values for those samples are distinctly different from the detector values for other samples that do not contain the antigen, it can be determined with high confidence that those samples that do not produce the distinctly different detector values do not contain the antigen.
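A negative screen of this kind can be sketched as follows. The approach shown, which scores each sample along the direction separating known positive readings from known negative readings and applies a conservative threshold, is an illustrative assumption and not a method stated in the disclosure; the function names, the `margin` parameter, and the example readings are hypothetical.

```python
def mean_vector(vectors):
    """Element-wise mean of a list of datum vectors."""
    count = len(vectors)
    return [sum(column) / count for column in zip(*vectors)]

def build_screen(positives, negatives, margin=0.5):
    """Return a function that sorts a datum vector into one of two subsets.
    Assumes the positive readings are distinctly different from the negatives."""
    neg_mean = mean_vector(negatives)
    direction = [p - q for p, q in zip(mean_vector(positives), neg_mean)]

    def score(vector):
        # Projection of the sample onto the positive-minus-negative direction.
        return sum((x - q) * d for x, q, d in zip(vector, neg_mean, direction))

    # Conservative cut: well below the weakest known positive.
    threshold = margin * min(score(v) for v in positives)

    def screen(vector):
        if score(vector) < threshold:
            return "does not contain"
        return "may or may not contain"

    return screen

positives = [[0.9, 0.1, 0.8], [0.8, 0.2, 0.9]]  # samples known to hold the antigen
negatives = [[0.1, 0.1, 0.1], [0.2, 0.2, 0.1]]  # samples known not to
screen = build_screen(positives, negatives)

clear = screen([0.15, 0.10, 0.12])
uncertain = screen([0.85, 0.15, 0.80])
```

Only the "does not contain" result is treated as informative here; a reading on the other side of the threshold is not a positive result and simply marks the sample for further testing.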
For example, in response to an epidemic such as COVID-19, health care workers and others who must be in contact with other people can be screened. If the negative screening assay deems a worker to be in the “does not contain COVID-19” subset, that worker should be safe to continue working. If the worker is deemed to be in the other subset, it is definitely not a positive result: it cannot be said that the worker has the virus. However, the assay can give a sound reason for further testing of that worker, or diverting the worker to a non-public-contact role. Of course, it also makes sense to screen individuals with whom that worker must make contact, whenever it is possible to do so, in order to prevent transmission to the worker from those individuals. In a community under disease lockdown, it can be very important for some workers to be available to deliver needed commodities to the community members. If those workers are selected from a “does not have the disease” subpopulation, it will minimize the likelihood that the workers will carry and transmit the disease.
For a negative screening assay, detectors in an array similar to that of the positive assay would be used in parallel as in the positive assay, and the results used to pare down the possibilities, both positive and negative with respect to the presence of specific analytes, for each sample, as far as is reasonable given the apparent (sample-dependent) noise levels at each detector. Some of the detectors may be designed to be more useful for assessing sample-dependent noise level in particular portions of analyte mixture space, than for classifying a particular sample.
If the noise is very low and the set of detectors is optimally chosen, and the database contains the right breadth of data, it should be possible to say with confidence how much of each analyte is in a given sample.
If the noise is higher, and the database is less complete, and the detector antibody array hasn't been optimized, it should usually be possible to determine with confidence that a sample should be classified A) as not containing a particular subset of particular analytes, or B) might or might not contain analytes in the subset of particular analytes.
In some cases it may be beneficial to adopt a modified method for presenting a sample to the detector array. Whereas normally a single sample may be presented to the array in different dilutions, an alternative method can provide more information particularly if there are unknown antigens in the sample. A modified or alternative method is to first present the sample to the array in several dilutions, then to add a known mix of antigens - a reference solution which may be selected according to the source of the sample, such as blood, semen, sputum, or urine - to the sample and re-present the resulting sample to the array. This procedure can of course be done using varying concentrations of the reference solution. The result is that any similarities or differences between the reference solution and the actual sample will be more easily discernable in the data obtained from the detector array. Precisely the same AI methods can be used to separate the effects of the added reference solution and the actual components of the sample; and in the process a more exact understanding of the similarities or differences between the two will be extracted from the resulting data.
The present disclosure is primarily aimed at assaying the protein contents of a liquid sample such as blood, urine, sputum, nasal secretions, blood plasma, tears, perspiration, drinking water, milk, runoff from agricultural or industrial sites, sewage, rain, river water, and so on. A liquid sample may also be prepared from solid precursors by dissolving the solids in a suitable solvent like water, hydrocarbon solvents, oils, etc. Samples that originate in a mixed state, like sputum, usually should be converted to liquid samples by dilution in a suitable pure or mixed solvent such as water or water plus alcohol. Proteins in a liquid sample may be in their original form, or may be broken apart into protein fragments by a suitable pre-processing procedure, desirably in an easily repeatable way.
It should also be noted that the present disclosure is not limited to assaying the protein content of a sample. Many non-protein molecules such as specific types of metal ions can have a binding affinity to antibodies or other such “recognition molecules”, and may therefore be suitable target antigens for this type of assay. Moreover, molecules recognized by the assemblies and/or methods of the present disclosure need not be free in solution. Virus coat proteins, cell surface proteins and the like, attached to particles much larger than the target molecules, can be recognized by CABA. Thus, a detector array of the present disclosure may be used to separate and distinguish particles in a liquid suspension according to molecules on the surface of the particles. An example of the valuable utility of this application of CABA is in discovering optimal combinations of surface molecules to target using “Logic-gated CAR-T cells” as described in Srivastava S, Salter AI, Liggitt D, Yechan-Gunja S, Sarvothama M, Cooper K, Smythe KS, Dudakov JA, Pierce RH, Rader C, Riddell SR. 2019. “Logic-Gated ROR1 Chimeric Antigen Receptor Expression Rescues T Cell-Mediated Toxicity to Normal Tissues and Enables Selective Tumor Targeting”, Cancer Cell. 2019 Mar 18;35(3):489-503.e8. doi: 10.1016/j.ccell.2019.02.003. Tumor cells can be presented to the detector array, then normal cells presented to an identical detector array, and the differences between the two resulting binding patterns can then be used to select antibodies whose presence and/or absence together can provide high specificity of targeting for Logic-gated CAR-T cells.
The assemblies and/or methods of the present disclosure can be used as a clinical instrument to aid in establishing a current-health profile of each patient. Because every reader and every detector array can be identical regardless of the application, the system can be used to detect and identify influenza, pregnancy, metabolic diseases, infections, blood types, serotypes, histocompatibilities, and a vast array of other factors important to public and personal health. It can be applied in similar ways in veterinary medicine and other areas of biology that, today, use specific antibody tests. On the other hand, in some cases it will be helpful to provide different detector arrays that are specific to testing particular categories of samples (e.g., blood, plasma, urine, sera, water, milk, saliva, or samples that have been pre-processed in various ways to reduce background interference or increase concentration of analytes).
The assemblies and/or methods of the present disclosure can be applied, in accordance with example implementations, as one test (or a few tests), extremely inexpensively, with a shelf life of a couple of years, and can be used to detect the presence of practically any pathogen, protein, microbe, or virus in a sample.
In the following claims, “binding elements” or “detecting elements” can mean any of the following, inclusively:
• antibodies or aptamers spotted onto a substrate or into wells on a plate
• antibodies spotted onto, e.g., microfabricated mass detectors
• any kind of molecular target recognition devices
• detectors, as defined in the Description.
“Low specificity binding elements” means binding elements that bind preferentially to a limited class of target analytes, but to more than one target analyte.
A “binding element” may, for example, be a region, coated with antibodies of a specific type, on a surface. A “region” on a detector may be simply connected or multiply connected in the topological sense. That is, it may consist of one or more separate (that is, spatially distinct) non-connected sub-regions, or of a region from which are excluded one or more contained sub-regions.

Claims

1. A binding affinity-based assay method, the method comprising: providing a first set of known low-specificity binding elements; presenting a known sample to the first set to define a first datum vector; providing a second set of known low-specificity binding elements, the first and second sets including the same known low-specificity binding elements; presenting an unknown sample to the second set to acquire a second datum vector; and comparing the first and second datum vectors to determine similarity or difference between the known and unknown samples.
2. The method of claim 1 wherein the low-specificity binding elements comprise selected molecular types.
3. The method of claim 1 wherein the low-specificity binding elements are defined by at least one region spatially distinct from other individual regions.
4. The method of claim 3 wherein each of the individual regions corresponds to a distinct low-specificity binding element.
5. The method of claim 1 wherein the comparing the first and second datum vectors comprises using an algorithm to determine a degree of difference or similarity between the first and second datum vectors.
6. The method of claim 5 further comprising using an algorithm to determine analyte concentrations in the unknown sample.
7. The method of claim 6 further comprising using a library of the datum vectors corresponding to known samples to determine relative analyte concentrations of the unknown sample.
8. The method of claim 5 further comprising using a mathematical model of a detector response to a mix of analyte concentrations created from detector responses to varying concentrations of pure samples of known analytes.
9. The method of claim 5 further comprising using the binding equilibria of analytes to determine analytes and/or concentration of analytes.
10. The method of claim 5 further comprising using the kinetic association/dissociation of analytes to determine analytes and/or concentration of analytes.
11. The method of claim 1 further comprising communicating the similarity or difference between the known and unknown samples to a remote device.
12. The method of claim 1 further comprising comparing the two datum vectors at a remote system via a communication medium.
13. The method of claim 1 wherein the presenting the known samples to the low-specificity binding elements is performed at a first location, and the presenting the unknown samples to the low-specificity binding elements is performed at a second location.
14. The method of claim 13 wherein the first and second locations are the same.
15. A binding affinity-based assay method, the method comprising: presenting a known sample to a set of known low-specificity binding elements to define an individual datum vector to develop a library of datum vectors associated with known samples; presenting an unknown sample to the set of known low-specificity binding elements to acquire a test sample datum vector; and comparing the test sample datum vector to those of the library to determine the most likely mixture of antigens to have produced the observed datum vector.
16. The method of claim 15 wherein the low-specificity binding elements comprise selected molecular types.
17. The method of claim 15 wherein the low-specificity binding elements are defined by at least one region spatially distinct from other individual regions.
18. The method of claim 17 wherein each of the individual regions correspond to distinct low-specificity binding elements.
19. The method of claim 15 wherein the comparing the first and second datum vectors comprises using an algorithm to determine a degree of difference or similarity between the first and second datum vectors.
20. The method of claim 18 further comprising using an algorithm to determine relative analyte concentrations in the unknown sample.
21. The method of claim 15 further comprising using a mathematical model of a detector response to a mix of analyte concentrations created from detector responses to pure concentrations of known analytes.
22. The method of claim 18 further comprising using the binding equilibria of analytes to determine analytes and/or concentrations of analytes.
23. The method of claim 18 further comprising using the association/dissociation kinetics of analytes to determine analytes and/or concentration of analytes.
24. The method of claim 18 further comprising communicating the similarity or difference between the known and unknown samples to a remote device.
25. The method of claim 18 further comprising comparing the two datum vectors at a remote system via a communication medium.
26. The method of claim 18 wherein the presenting the known samples to the low-specificity binding elements is performed at a first location, and the presenting the unknown samples to the low-specificity binding elements is performed at a second location.
27. The method of claim 26 wherein the first and second locations are the same.
PCT/US2021/016075 2020-03-24 2021-02-01 Combinatorial affinity-based analysis assemblies and methods WO2021194635A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US202062993973P 2020-03-24 2020-03-24
US62/993,973 2020-03-24

Publications (1)

Publication Number Publication Date
WO2021194635A1 true WO2021194635A1 (en) 2021-09-30

Family

ID=77892560

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2021/016075 WO2021194635A1 (en) 2020-03-24 2021-02-01 Combinatorial affinity-based analysis assemblies and methods

Country Status (1)

Country Link
WO (1) WO2021194635A1 (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100029496A1 (en) * 2008-07-31 2010-02-04 Cary R Bruce Multiplexed lateral flow microarray assay for detection of citrus pathogens Xylella fastidiosa and Xanthomonas axonopodis PV citri
WO2012099897A1 (en) * 2011-01-18 2012-07-26 Symbolics, Llc Lateral flow assays using two dimensional features
US20180319657A1 (en) * 2015-11-04 2018-11-08 Biocifer Pty Ltd Multiplex Lateral Flow Devices and Assays
US20180372755A1 (en) * 2017-06-22 2018-12-27 Massachusetts Institute Of Technology Multiplexed Immunoassay for Detecting Biomarkers of Disease
WO2019068802A1 (en) * 2017-10-04 2019-04-11 Unisensor Optical reading device with controlled light intensity of a removable solid substrate for detecting and/or quantifying analytes present in a sample
WO2019081361A1 (en) * 2017-10-21 2019-05-02 Expedeon Ltd Universal lateral flow immunoassay

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21776371

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21776371

Country of ref document: EP

Kind code of ref document: A1