WO2024065069A1 - Method for determining cellular identity from a complex biological sample using PCR-HRM and mathematical analysis

Publication number
WO2024065069A1
Authority
WO
WIPO (PCT)
Prior art keywords
pcr
data
sample
microorganisms
biological sample
Prior art date
Application number
PCT/CL2023/050090
Other languages
Spanish (es)
French (fr)
Inventor
Rodrigo Fernando MALIG FUENTES
Mauricio Alejandro NIKLISTCHEK OYARZUN
Denis Gustavo BERNDT BRICEÑO
Leandro Anthony Emmanuel FARIAS AGUILERA
Original Assignee
TAAG Genetics S.A
Priority date
Filing date
Publication date
Application filed by TAAG Genetics S.A

Classifications

    • C12N15/113 Non-coding nucleic acids modulating the expression of genes, e.g. antisense oligonucleotides; Antisense DNA or RNA; Triplex-forming oligonucleotides; Catalytic nucleic acids, e.g. ribozymes; Nucleic acids used in co-suppression or gene silencing
    • C12Q1/68 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6809 Methods for determination or identification of nucleic acids involving differential detection
    • C12Q1/6853 Nucleic acid amplification reactions using modified primers or templates
    • C12Q1/686 Polymerase chain reaction [PCR]
    • C12Q1/6888 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for detection or identification of organisms
    • C12Q1/689 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for detection or identification of organisms for bacteria
    • C12Q1/6893 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for detection or identification of organisms for protozoa
    • C12Q1/6895 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for detection or identification of organisms for plants, fungi or algae
    • G16B30/00 ICT specially adapted for sequence analysis involving nucleotides or amino acids
    • G16B40/00 ICT specially adapted for biostatistics; ICT specially adapted for bioinformatics-related machine learning or data mining, e.g. knowledge discovery or pattern finding

Definitions

  • A long-standing challenge in molecular biology has been the identification of cells in different biological and non-biological matrices.
  • Microbial identification is understood as the set of techniques and procedures that are applied to establish the identity of a microorganism. These techniques are used in different areas, such as:
  • Molecular biology has allowed typing, meaning the identification and characterization of different organisms, from eukaryotes to viruses, including microorganisms found in different matrices, whether inert or not.
  • The sequencing of complete genomes of bacteria, viruses or pathogens is used to detect specific cells in specific contexts.
  • PCR polymerase chain reaction
  • a fluorochrome and the measurement of the fluorescence emitted by the probe.
  • The advantage of this technique is its speed: results are obtained in 30 minutes to two hours.
  • HRM high-resolution dissociation curve
  • The HRM equipment has a device capable of measuring the fluorescence of the sample and software that represents the detected data in a graph known as a melting curve, which shows the level of fluorescence versus temperature.
  • The resulting melting profile reflects the mixture of amplicons present. Aspects such as GC content, length, sequence and heterozygosity contribute to the melting curve characteristics of each amplicon.
  • The resulting profiles can provide valuable information for mutation screening, genotyping, methylation and other research applications, including the identification of different cell lines, as shown in the prior art.
  • The high-resolution dissociation curve (HRM or HRMT) is a powerful method for scanning sequences in DNA samples.
  • HRM technology characterizes nucleic acid samples based on their dissociation behavior and detects small differences in PCR-amplified sequences simply by direct dissociation.
  • The dissociation behavior of the samples depends on the GC content, sequence and length of the amplicons generated in the PCR step.
  • Double-stranded DNA (dsDNA) is very stable at room temperature. However, with increasing temperature, the two individual strands begin to dissociate until they separate completely.
  • The temperature at which 50% of the DNA is single-stranded is called the melting temperature (Tm). This depends on both the length and the guanine-cytosine (GC) content of the DNA fragment.
  • Because guanine-cytosine (GC) base pairs are linked by three hydrogen bonds, they are more stable than adenine-thymine (AT) base pairs, which are linked by only two hydrogen bonds. Therefore, DNA sequences with a high GC content have a higher Tm than DNA sequences containing a low number of GC base pairs.
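The relationship between GC content and duplex stability can be illustrated with a minimal Python sketch. The function names and the example sequences are hypothetical and are not taken from the patent; the hydrogen-bond count is only a rough proxy for stability and is not a Tm prediction, since Tm also depends on length, sequence context and buffer conditions.

```python
def _validate(seq: str) -> str:
    """Upper-case the sequence and reject anything other than A, C, G, T."""
    seq = seq.upper()
    if not seq or set(seq) - set("ACGT"):
        raise ValueError("expected a non-empty sequence of A, C, G, T")
    return seq


def gc_fraction(seq: str) -> float:
    """Fraction of bases that are G or C (0.0 to 1.0)."""
    seq = _validate(seq)
    return sum(base in "GC" for base in seq) / len(seq)


def hydrogen_bonds(seq: str) -> int:
    """Watson-Crick hydrogen bonds in the duplex: 3 per GC pair, 2 per AT pair."""
    seq = _validate(seq)
    return sum(3 if base in "GC" else 2 for base in seq)


# Hypothetical amplicon fragments of equal length (illustration only).
gc_rich = "GGCCGCGC"
at_rich = "AATTATAT"
print(gc_fraction(gc_rich), hydrogen_bonds(gc_rich))
print(gc_fraction(at_rich), hydrogen_bonds(at_rich))
```

For two fragments of equal length, the one with the higher GC fraction carries more hydrogen bonds, which is consistent with the higher Tm described above.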
  • Melting curve analysis is based on gradually increasing the temperature after the last PCR cycle. Initially, a high fluorescence signal is obtained due to the large number of double-stranded amplicons present in the PCR tube with the fluorescent probe bound or intercalated to the DNA. At higher temperatures, however, the dsDNA dissociates, the fluorescent probe is released, and a decrease in the fluorescence signal is observed.
  • The Tm of the amplicon can be determined from the inflection point of the melting curve, or from the melting peak obtained by plotting the negative first derivative of fluorescence (F) with respect to temperature (T), -dF/dT, against temperature (T).
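As an illustration of reading Tm from the melting peak, the following Python sketch estimates -dF/dT by central differences and takes the temperature at its maximum. The melting curve is synthetic (a sigmoid centered at an assumed 82.0 °C), and the function names are hypothetical; real instrument data would normally be smoothed before differentiation.

```python
import math


def negative_derivative(temps, fluor):
    """Central-difference estimate of -dF/dT at the interior points."""
    mid_temps, neg_deriv = [], []
    for i in range(1, len(temps) - 1):
        slope = (fluor[i + 1] - fluor[i - 1]) / (temps[i + 1] - temps[i - 1])
        mid_temps.append(temps[i])
        neg_deriv.append(-slope)
    return mid_temps, neg_deriv


def estimate_tm(temps, fluor):
    """Temperature at which -dF/dT is largest (the melting peak)."""
    mid_temps, neg_deriv = negative_derivative(temps, fluor)
    peak_index = max(range(len(neg_deriv)), key=neg_deriv.__getitem__)
    return mid_temps[peak_index]


# Synthetic curve: fluorescence falls from 1 to 0 around an assumed Tm of 82.0 °C.
temps = [60 + 0.1 * i for i in range(381)]  # 60 °C to 98 °C in 0.1 °C steps
fluor = [1 / (1 + math.exp((t - 82.0) / 0.8)) for t in temps]

tm = estimate_tm(temps, fluor)
print(f"estimated Tm: {tm:.1f} °C")
```

The peak of -dF/dT coincides with the inflection point of the fluorescence curve, so both readings described above give the same Tm for a single, clean transition.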
  • Tm melting temperature
  • For cell identification using this technique, specific amplicons must be generated that yield characteristic dissociation curves for the cell line to be identified. To do this, amplicons must be generated from genomic regions that allow discrimination between the different species and classes present in a given sample.
  • The primers yield a plurality of amplicons that can be identified by dissociation curves characterized by their shapes, displacements and Tm, among other characteristics.
  • Where dissociation curves with similar characteristics exist, the data generated can be treated mathematically to increase the degree of discrimination.
  • Data processing methods include the calculation of derivatives, fitting to specific mathematical models, or cluster generation using statistical or artificial intelligence methods.
  • The negative first derivative curve can discriminate genotypes by comparing the relative position and shape of the melting curves.
  • The main use of this methodology is oriented, on the one hand, to the certification of identity by comparison against a known standard and, on the other hand, to discrimination within a limited group, such as the identification and serotyping of species. The methods described above have in common that the identity of the query curve is contrasted against a limited group of possibilities.
  • Curves obtained from the negative first derivative of HRM have an intrinsic degree of variability. These curves primarily reflect the sequence contained in the PCR amplicons, but also include other non-intrinsic variables, such as the source of the DNA, the method of its preparation, procedural variations, and even the equipment used, among others.
  • Marcin Slomka et al. (2017) show the profound impact that slight changes in test matrices or procedures have on HRM curves; these effects increase in impact and frequency when the sample size is large, as in high-throughput tests.
  • The resolution needs are in direct proportion to the size of the sample base used to determine the identity of an amplicon through HRM.
  • PCA Principal Component Analysis
  • The use of the first derivative, whether positive or negative, as well as the second derivative of the HRM data makes it possible to increase the discriminatory capacity of the curves obtained; advantageously, this results in an improved identification process when a problem sample is compared against a database with a broad sample base.
  • The present invention seeks to protect a method for determining cell identity from a biological sample that comprises: providing a biological sample suspected of containing cells that can be identified from a known database, which are selected from eukaryotes, bacteria, viruses, fungi and protozoa and can be obtained from biological or non-biological matrices; performing a cell separation step on the sample; extracting the DNA from the sample; and performing a polymerase chain reaction (PCR) to amplify at least one genomic region using at least one, alone or in combination, of the primers chosen from SEQ ID N°1 to SEQ ID N°85.
  • It also includes performing a high resolution dissociation curve (HRM) of the amplicons generated by the PCR of the previous step and subsequently converting the resulting data through a mathematical transformation comprising a first step of standardizing the equipment data, a second step of generating a first positive or negative derivative, and a third step of generating a positive derivative of the data set obtained in the second step.
  • Also claimed in this invention is a system for determining cellular identity comprising: a storage component for generating a data matrix of a high resolution dissociation curve (HRM) of known organisms and suspected samples; a computational processor that performs a mathematical transformation and/or a computational processor that executes artificial intelligence algorithms for the grouping and/or discrimination and/or cell identification step; and a monitor that displays information regarding cell identification.
  • Microorganisms can also be genotyped, as for example in patent WO2022068785, which describes a method to quickly identify Bacillus cereus and Bacillus thuringiensis that comprises selecting different SNP sites in a sequence of the ispD gene and generating an HRM curve using said species as templates to acquire the characteristic curve information of the corresponding strains, and subsequently performing a precise identification of the Bacillus in a sample.
  • US2017321257 refers to a method for identifying bacteria in a biological sample that includes the following steps: a) providing a biological sample; b) isolating bacterial DNA; c) amplifying at least a portion of the ITS region, using at least one set of primers capable of hybridizing to this region; d) performing HRM analysis of the amplicons; and e) identifying the bacterial species by comparing the curves with a database of known species.
  • This patent indicates that for the analysis of the curves, the Hilbert transform is used as a mathematical method to fit the data of the curves and the database.
  • The method described involves the extraction of DNA from a forensic sample, the quantification and evaluation of the purity of the DNA, a PCR using primers for the mitochondrial 16S rRNA gene of each species, and an HRM curve from 60 to 95°C. Finally, the derivative of the results is computed (T versus -dF/dT) and a classification is performed using random forest.
  • The main technical difference of this invention with respect to the closest documents lies in the development of specific primers (SEQ ID No. 1 to SEQ ID No. 85) for conserved regions of different cell lines, such as rpoB, ITS2, rpB2, EF1a, TEF-1a, NL1 23S, LS 23S, dnaA, gyrB, Reg U1, Reg U2, CandUn, FilamUn and FungUn. The use of said primers allows cell identification by taking advantage of the differences in the high resolution dissociation curves (HRM) of the different amplicons generated using different PCR techniques.
  • first data matrix the data generated by the HRM curve
  • second data matrix the generation of a first positive or negative derivative
  • third data matrix the transformation of this previous data again by a positive derivative
  • The present invention seeks to protect a method for determining cell identity from a biological sample that comprises providing a biological sample suspected of containing cells that can be identified from a known database, which are selected from eukaryotes, bacteria, viruses, fungi and protozoa, and that can be obtained from biological or non-biological matrices.
  • A cell separation step is performed on the sample, in which eukaryotic cells and/or microorganisms and/or viruses can be discriminated; it can be carried out by physical and/or biological methods. The DNA is then extracted from the sample and a polymerase chain reaction (PCR) is carried out to amplify at least one genomic region using at least one, alone or in combination, of the primers chosen from SEQ ID N°1 to SEQ ID N°85.
  • The method also includes performing a high resolution dissociation curve (HRM) of the amplicons generated by the PCR of the previous step, where the melting temperature (Tm) of the amplicons is in the range of 60°C to 98°C; subsequently converting the resulting data through a mathematical transformation, where said transformation has a first step of standardizing the equipment data, a second step of generating a first positive or negative derivative, and a third step of generating a positive derivative of the data set obtained in the second step; then using the mathematically transformed data, from step two and/or step three, as inputs to the machine learning algorithms to determine cell identity, where the artificial intelligence algorithms generate a first classifier that allows discrimination based on the data generated by the profiles; and finally comparing the data of the suspected biological sample with a database of known organisms for cellular identification.
  • A system for determining cellular identity comprising: a storage component for generating a data matrix of a high resolution dissociation curve (HRM) of known organisms and suspected samples; a computational processor that performs a mathematical transformation and/or a computational processor that executes artificial intelligence algorithms for the step of grouping and/or discrimination and/or identification of microorganisms; and a monitor that displays information regarding the identification of microorganisms.
  • A method for determining cell identity from a biological sample that comprises providing a biological sample suspected of containing cells that are selected from eukaryotes, bacteria, viruses, fungi and protozoa, and that can be obtained from biological or non-biological matrices.
  • The biological sample can come from clinical procedures.
  • Blood samples can be used, including, without limitation, whole blood, serum and plasma. Cells can also be detected in other fluid or tissue samples, including saliva, cerebrospinal fluid, mucus, lymph fluid or lavage fluid, and tissue samples obtained from the skin and soft tissues.
  • Fluid or tissue samples can be obtained from organs of the respiratory system, reproductive system, nervous system, muscular system, integumentary system, lymphatic system, excretory system, endocrine system, digestive system, cardiovascular system and skeletal system.
  • Cells can be detected and specifically identified from samples taken from the site of a localized infection, for example, at the site of a wound caused by traumatic injury or surgery.
  • The samples can also come from the food industry and, without the intention of limiting the invention, can include fresh foods and foods processed by different industry methods such as cooking, drying and freeze-drying, among others.
  • Dairy foods such as natural, cultured, fermented milk and their derivatives.
  • Samples can also be obtained from contact surfaces made of inorganic material that are regularly used in the industries described above, such as work benches, medical instruments, prostheses, kitchen instruments, and permanent and temporary installations used for production procedures in the food industry or for invasive and non-invasive clinical procedures.
  • The surface materials may be those common in the food industry as well as in clinical procedures. Without limiting the present invention, they include wood and its derivatives, steel and its derivatives, iron and its derivatives, plastics in all their formats, glass in all its formats, alloys, etc.
  • The sample can be subjected to a cell enrichment process, using specific culture media for the growth of the cells to be identified. This step seeks to obtain a sufficient number of cells for PCR amplification.
  • The enrichment techniques used are those known in microbiology.
  • A cell separation step can be carried out on the sample to discriminate between eukaryotic cells and/or microorganisms and/or viruses; it can be carried out by physical and/or biological methods.
  • The separation methods are those known in biotechnology and, without limiting the present invention, can be selected according to the property exploited, such as cell size or density, affinity for antibodies, light scattering, fluorescence emission and other physical properties.
  • The next step is to extract the DNA from the sample by means known in the art and perform a polymerase chain reaction (PCR) to amplify at least one genomic region using any one, or a combination, of the primers chosen from SEQ ID N°1 to SEQ ID N°85.
  • The PCR can be chosen from nested PCR, multiplex PCR, reverse transcriptase PCR and real-time PCR (qPCR). As a preferred, but not limiting, embodiment, qPCR is the method used in this patent.
  • The genomic regions for which the sequences described above are specific primers are selected from rpoB, ITS2, rpB2, EF1a, TEF-1a, NL1 23S, LS 23S, dnaA, gyrB, Reg U1, Reg U2, CandUn, FilamUn and FungUn, without limiting the invention with respect to other genomic regions that are used in the art for the selection and identification of cells or cell lines.
  • The primer oligonucleotides include variants comprising a sequence that has at least about 80 to 100% identity with the sequences identified above, including any percentage identity within this range, such as 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98 or 99% sequence identity. Changes can be introduced in the nucleotide sequences corresponding to particular genetic variations of interest.
  • Nucleotide changes can be made in a sequence selected from the group consisting of SEQ ID NO: 1 to SEQ ID NO: 85, provided that the oligonucleotide primer is capable of hybridizing to and amplifying a target DNA sequence.
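The 80 to 100% identity range for primer variants can be illustrated with a short Python sketch. It assumes the simplest possible definition, position-by-position comparison of two equal-length sequences without gaps; identity is more commonly computed from a sequence alignment, and the text does not state which definition applies. The function names and the sequences are hypothetical and are not any of SEQ ID N°1 to N°85.

```python
def percent_identity(reference: str, candidate: str) -> float:
    """Percentage of matching positions between two equal-length, ungapped sequences."""
    if not reference or len(reference) != len(candidate):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(reference.upper(), candidate.upper()))
    return 100.0 * matches / len(reference)


def is_within_identity_range(reference: str, candidate: str, minimum: float = 80.0) -> bool:
    """True when the candidate has at least `minimum` percent identity to the reference."""
    return percent_identity(reference, candidate) >= minimum


# Made-up 10-mer and two variants (illustration only).
reference = "ACGTACGTAC"
one_change = "ACGTACGTAA"     # 1 mismatch in 10 positions
three_changes = "ACGTTTTTAC"  # 3 mismatches in 10 positions

print(percent_identity(reference, one_change))
print(is_within_identity_range(reference, three_changes))
```

Sequence identity alone does not guarantee that a variant still hybridizes to and amplifies the target, which is the functional condition stated above; that has to be verified experimentally.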
  • The method also includes performing a high resolution dissociation curve (HRM) of the amplicons generated by the PCR of the previous step, where the melting temperature (Tm) of the amplicons is in the range of 60°C to 98°C.
  • The data from said curve are stored in a database and can be incorporated into a database of known cells to later be used for identification.
  • The data generated by the dissociation curve can be transformed mathematically. The transformation has a first step that consists of standardizing the data generated by the PCR equipment using normalization methods known in the data processing art. As a preferred, but not limiting, embodiment, said standardization is carried out by z-score. Once this standardization has been carried out, step two is to generate a first positive or negative derivative of the normalized data, producing a second database of the derived data.
  • A third treatment step can be carried out, which consists of generating a positive derivative of the data set obtained in the previous step, forming a third database to work with.
  • This database is called the double derivative.
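The three transformation steps (z-score standardization, first derivative, double derivative) can be sketched in Python as follows. This is a minimal illustration under stated assumptions: derivatives are approximated by finite differences, the population standard deviation is used for the z-score, and the function names and toy data are hypothetical. The text does not specify the numerical method used.

```python
import statistics


def z_score(values):
    """Step 1: standardize the signal to zero mean and unit standard deviation."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    if sd == 0:
        raise ValueError("a constant signal cannot be standardized")
    return [(v - mean) / sd for v in values]


def derivative(temps, values, sign=1.0):
    """Finite-difference derivative: central inside, one-sided at the two ends."""
    n = len(values)
    out = []
    for i in range(n):
        lo, hi = max(i - 1, 0), min(i + 1, n - 1)
        out.append(sign * (values[hi] - values[lo]) / (temps[hi] - temps[lo]))
    return out


def transform(temps, fluor):
    """Return the normalized data, the first derivative and the double derivative."""
    normalized = z_score(fluor)                        # step 1
    first = derivative(temps, normalized, sign=-1.0)   # step 2: negative first derivative
    second = derivative(temps, first, sign=1.0)        # step 3: positive derivative of step 2
    return normalized, first, second


# Toy data: a linearly decreasing signal, so the double derivative is zero.
temps = [0.0, 1.0, 2.0, 3.0, 4.0]
fluor = [10.0, 8.0, 6.0, 4.0, 2.0]
normalized, first, second = transform(temps, fluor)
print(first)
print(second)
```

Here `sign=-1.0` selects the negative first derivative in step two; passing `sign=1.0` instead gives the positive variant that the text also allows.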
  • Data sets generated from HRM, the normalized data and those derived from the second and/or third step can be used in this method to generate additional discrimination parameters.
  • These discrimination parameters may be selected from, but are not limited to, the relative position of peaks and valleys, the height of peaks, and relationships between the generated data such as the width of and/or distance between peaks and/or valleys.
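As an illustration of such discrimination parameters, the Python sketch below locates strict local maxima (peaks) and minima (valleys) in a derivative curve and reports their positions, heights and the distances between peaks. The function names, dictionary keys and sample values are hypothetical; a practical implementation would typically also filter out noise-induced extrema.

```python
def local_extrema(values):
    """Indices of strict local maxima (peaks) and strict local minima (valleys)."""
    peaks, valleys = [], []
    for i in range(1, len(values) - 1):
        if values[i - 1] < values[i] > values[i + 1]:
            peaks.append(i)
        elif values[i - 1] > values[i] < values[i + 1]:
            valleys.append(i)
    return peaks, valleys


def curve_features(temps, deriv):
    """Summarize a derivative curve by peak/valley positions, heights and spacing."""
    peaks, valleys = local_extrema(deriv)
    return {
        "peak_temps": [temps[i] for i in peaks],
        "peak_heights": [deriv[i] for i in peaks],
        "valley_temps": [temps[i] for i in valleys],
        "peak_distances": [temps[b] - temps[a] for a, b in zip(peaks, peaks[1:])],
    }


# Made-up derivative curve with two melting peaks separated by a valley.
temps = [70, 71, 72, 73, 74, 75, 76]
deriv = [0.0, 1.0, 3.0, 1.0, 0.5, 2.0, 0.0]
features = curve_features(temps, deriv)
print(features)
```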
  • The previously described data sets can be used, either individually or in combination, as inputs to the machine learning algorithms, which generate classifiers that allow discrimination within the generated data sets; finally, the data of the suspected biological sample are compared with a database of known organisms for cellular identification.
  • The machine learning algorithm may comprise a supervised learning algorithm such as ordinal classification, regression analysis, information fuzzy networks (IFN), statistical classification (such as AODE), linear classifiers (e.g., Fisher linear discriminant, logistic regression, Naive Bayes classifier, perceptron and support vector machine), quadratic classifiers, k-nearest neighbor, boosting, decision trees (for example, random forests), Bayesian networks and hidden Markov models, among others.
  • Machine learning algorithms may also comprise an unsupervised learning algorithm.
  • Unsupervised learning algorithms may include artificial neural networks, data clustering, expectation maximization algorithm, self-organizing map, radial basis function network, vector quantization, generative topographic map, information bottleneck method, and IBSEAD.
  • Unsupervised learning can also comprise association rule learning algorithms such as the Apriori algorithm, the Eclat algorithm, and the FP-growth algorithm.
  • Hierarchical clustering can also be used, such as single-linkage clustering and conceptual clustering.
  • Unsupervised learning may comprise partitional clustering, such as the K-means algorithm and fuzzy clustering.
  • The machine learning algorithms may comprise a reinforcement learning algorithm. Examples of reinforcement learning algorithms include, but are not limited to, temporal difference learning, Q-learning, and learning automata.
  • The artificial intelligence algorithms for the clustering and/or discrimination and/or identification step can be selected from random forest, support vector machine, k nearest neighbors (kNN), partitioning around medoids (PAM), Naive Bayes, principal component analysis (PCA) or linear discriminant analysis (LDA). These artificial intelligence algorithms allow the generation of a new classifier that discriminates among the databases generated in the previous step.
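As a minimal sketch of comparing a query sample against a database of known organisms, the following Python code implements k nearest neighbors (one of the algorithms listed above) with Euclidean distance and majority voting. It assumes all curves are sampled on the same temperature grid; the labels, curve values and function names are hypothetical, and the choice of kNN here is only illustrative and is not presented as the patent's preferred classifier.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Euclidean distance between two curves sampled on the same temperature grid."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_identify(query, database, k=3):
    """Return the majority label among the k database curves closest to the query.

    `database` is a list of (label, curve) pairs for known organisms.
    """
    ranked = sorted(database, key=lambda item: euclidean(query, item[1]))
    votes = Counter(label for label, _ in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical database of transformed (derivative) curves for two known organisms.
database = [
    ("organism_A", [0.0, 1.0, 3.0, 1.0, 0.0]),
    ("organism_A", [0.0, 1.1, 2.9, 1.0, 0.0]),
    ("organism_B", [0.0, 3.0, 1.0, 0.0, 0.0]),
    ("organism_B", [0.0, 2.9, 1.1, 0.0, 0.0]),
]
query = [0.0, 1.0, 3.1, 0.9, 0.0]

identity = knn_identify(query, database, k=3)
print(identity)
```

The same interface accepts either the first-derivative or the double-derivative curves as input, or a concatenation of both, in line with the use of the data sets individually or in combination.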
  • a system for determining cellular identity comprising a storage component for generating a data matrix of a high resolution dissociation curve (HRM) of known organisms and suspected samples.
  • a computational processor to process data that carries out a mathematical transformation, where said transformation has a first step that consists of standardizing the data matrix generated for the high resolution dissociation curve, using normalization methods known in the art of data processing.
  • step two is to generate a first positive or negative derivative of the normalized data and generate a second database already derived.
  • a third treatment step can be carried out, which consists of generating a positive derivative to the set of data obtained in the previous point, forming a third database to work with.
  • a computational processor that allows the execution of artificial intelligence algorithms for the grouping and/or discrimination step of the data matrices already individualized previously.
  • the artificial intelligence algorithms are the same as mentioned above in the method both in its general selection and in its preferred embodiment.
  • Example 1 Location of the primers within the reference gene.
  • the location of SEQ No. 1 to SEQ No. 12 within the sequence of the highly conserved rpoB gene is presented as a non-limiting example of the embodiments indicated above in the specification.
  • Figure 1A shows the complete gene and where the sequences indicated above are located.
  • Figures 1B and 1C show the sequences SEQ N°1 to 5 within the gene, while Figures 1D and 1E show the sequences SEQ N°6 to 12.
  • sequences are variations of highly conserved sectors of the genes indicated in the table presented in the detailed description of the invention; this property allows their use in different next generation sequencing (NGS) technologies and is also a crucial part of the identification method presented in this specification.
  • Example 2 Protocol for generating PCR-HRM.
  • thermoblock is heated to 99°C. At the same time, the samples are centrifuged and homogenized after incubation.
  • b) Transfer 200 µL of each of the enriched samples and mix in a sterile 1.5 mL tube.
  • d) Add 350 µL of wash solution and pipette until it becomes homogeneous.
  • e) Transfer the 350 µL of sample to tube A and shake at 2500 rpm for 10 minutes.
  • PCR and HRM program is configured as indicated in the following table:
  • the wells must be centrifuged (spin-down) prior to loading. Then load 2 µL of previously extracted DNA into each sample well, and load 2 µL of the positive and negative controls into their respective wells.
  • Example 3 Mathematical treatment of dissociation curves (HRM) of nearby amplicons.
  • the inventors selected 3 strains of microorganisms that are usually difficult to discriminate by HRM: Paenibacillus macerans (2), Pseudomonas putida (5) and Acetobacter aceti (7).
  • a real-time PCR was performed, taking 50 µL of sample culture and mixing it with 450 µL of lysis buffer solution. The mixture was incubated for 20 minutes at 95 °C and the extracted DNA went to the amplification stage.
  • the PCR step was developed against a genomic region using a set of amplification primers with a wide taxonomic range, such as rpoB, using a mix containing the primers corresponding to SEQ ID N°1 to SEQ ID N°19.
  • the results of the qPCR equipment are then normalized with a z-score normalization, which makes results comparable regardless of the device and the time at which the analyses are performed. Subsequently, the temperature and the derivative of the relative fluorescence (dRFU) are plotted as indicated in Figure 2.
  • Example 4 Microorganism database generation by HRM and increase in discrimination capacity in complex samples.
  • the microorganisms used to build the comparison database are the following, without limiting the incorporation of new strains or species:
  • Acetobacter aceti, Acidovorax sp, Acinetobacter sp, Aeromonas hydrophila, Alicyclobacillus acidoterrestris, Asaia bogorensis, Asaia lannensis, Asaia sp, Bacillus albus, Bacillus altitudinis, Bacillus amyloliquefaciens, Bacillus cereus, Bacillus coagulans, Bacillus licheniformis, Bacillus megaterium, Bacillus pumilus, Bacillus subtilis, Blastomonas natatoria, Brochothrix thermosphacta, Caulobacter vibrioides, Clostridium perfringens, Delftia acidovorans, Delftia sp, Enterobacter cloacae, Enterococcus faecium, Escherichia coli O157:H7, Hydrogenophaga pseudof
  • microorganisms are presented in a reference panel, where the microorganisms were obtained from collection banks of ATCC, DSMZ and from environmental or product isolates.
  • This panel contains the strains of Acinetobacter lwoffii, Aeromonas hydrophila, Bacillus cereus, Bacillus circulans, Bacillus clausii, Bacillus licheniformis, Enterobacter aerogenes, Enterobacter cloacae, Enterococcus faecalis, Klebsiella, Klebsiella aerogenes, Klebsiella oxytoca, Kocuria kristinae, Lactococcus lactis, Leuconostoc lactis, Macrococcus carouselicus, Propionibacterium freudenreichii, Proteus vulgaris, Pseudomonas fluorescens, Pseudomonas putida, Serratia grimesii, Serratia lique
  • matrices of different types of dairy products were selected: natural milk (M8), skim milk (M9), calcium-fortified milk (M7), flavored milk (M11) and milk cream (M10).
  • the matrices were enriched without inoculum, with 9 mL of fungal growth medium for 48 hours at 25°C, confirming their sterility.
  • each sample was plated on Potato Dextrose Agar for quantification and incubated for 5 days at 25°C. Colonies were then counted and recorded for data analysis.
  • DNA extraction and PCR were carried out according to the protocol of the previous examples. In this case, a mixture of primers corresponding to SEQ ID N°20 and SEQ ID N°69 to 83 was used. 2 µL of each DNA sample was added to each tube in the PCR plate. The PCR plate was loaded into an AriaMx Realtime PCR machine (Agilent Technologies), and the PCR protocol and dissociation curve were set up.
  • the beginning of the procedure is to inoculate 1 mL of the matrices with the particular microorganism in the stationary phase at a low concentration, leaving the matrices with equal to or less than 100 CFU/mL.
  • inoculated matrix was taken to 9 mL of fungal growth medium for the enrichment phase, incubated at 28°C for 48 hours, allowing the good development of the microorganisms tested in the different matrices.
  • the results of the PCR can be seen in Figure 7, which shows the presence of the microorganisms in the different matrices evaluated.
  • the assay controls indicate that there are no problems with the PCR, and also that the matrices evaluated are sterile.
  • the microorganism controls show the profiles of each of the molds and yeasts.
  • Each of the matrices evaluated presents the profiles corresponding to each of the inoculated microorganisms, indicating the correct detection of the respective molds and yeasts.
  • Example 6 Identification of bacteria in drinkable products.
  • the reference panel contains the strains of Acetobacter aceti, Aeromonas hydrophila, Alicyclobacillus acidoterrestris, Asaia bogorensis, Asaia lannensis, Bacillus albus, Bacillus altitudinis, Bacillus amyloliquefaciens, Bacillus cereus, Bacillus coagulans, Bacillus licheniformis, Bacillus megaterium, Bacillus pumilus, Bacillus subtilis, Blastomonas natatoria, Brochothrix thermosphacta, Caulobacter vibrioides, Clostridium perfringens, Delftia acidovorans, Enterobacter cloacae, Enterococcus faecium, Enterococcus hirae, Escherichia coli,
  • strains were activated in non-selective growth media and then isolated on agar in order to work with pure cultures, which were confirmed by sequencing.
  • matrices of different types of final beverage products were selected: black carbonated drink, colorless carbonated drink, carbonated water, non-carbonated water and orange juice.
  • the matrices were enriched without inoculum, with bacterial growth medium for 24 hours at 35°C, confirming their sterility.
  • Example 7 Discrimination of nearby microorganisms using the method of the invention.
  • a discrimination test is presented between microorganisms that are normally found as contaminants in the food industry such as Lactobacillus plantarum, Pseudomonas aeruginosa and Pseudomonas alcaligenes.
  • the first is used as an internal control of the experiment, while the resolution capacity of the method with respect to Pseudomonas will be evaluated.
  • the general protocol is the same as indicated in Example 2 of this patent, and the matrix evaluated is soft drinks.
  • Example 8 Evaluation of the identification method for Zygosaccharomyces.
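
The mathematical treatment listed above (standardizing the HRM data, taking a first positive or negative derivative, and optionally a further positive derivative) can be sketched as follows. This is a minimal illustration and not the applicant's implementation: the function names, the finite-difference scheme and the synthetic sigmoid melt curve are all assumptions made for the example.

```python
import math
from statistics import mean, pstdev


def zscore(values):
    """Standardize one melt curve to zero mean and unit (population) variance."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        raise ValueError("flat curve: z-score is undefined")
    return [(v - mu) / sigma for v in values]


def derivative(temps, values, sign=-1.0):
    """Finite-difference slope between consecutive points, multiplied by `sign`.

    Returns the interval midpoints and the slopes; sign=-1.0 gives the
    negative derivative, sign=1.0 the positive one.
    """
    mids, slopes = [], []
    for i in range(len(values) - 1):
        dt = temps[i + 1] - temps[i]
        mids.append((temps[i] + temps[i + 1]) / 2)
        slopes.append(sign * (values[i + 1] - values[i]) / dt)
    return mids, slopes


def transform(temps, fluorescence):
    """Step 1: normalize. Step 2: negative first derivative. Step 3: positive derivative of step 2."""
    normalized = zscore(fluorescence)
    t1, d1 = derivative(temps, normalized, sign=-1.0)
    t2, d2 = derivative(t1, d1, sign=1.0)
    return {"normalized": normalized, "first": (t1, d1), "second": (t2, d2)}


# Synthetic melt curve (assumption): a sigmoid that drops around 82 degrees C.
temps = [70 + 0.5 * i for i in range(41)]  # 70.0 to 90.0 in 0.5 degree steps
fluor = [100 / (1 + math.exp((t - 82.0) / 0.8)) for t in temps]

result = transform(temps, fluor)
t1, d1 = result["first"]
peak_temp = t1[d1.index(max(d1))]  # temperature of the melting peak
```

Each of the three outputs corresponds to one of the databases mentioned in the list: the normalized curves, the first-derivative curves and the second-derivative curves.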

Abstract

The invention relates to a method for determining cellular identity from a biological sample in a matrix, which comprises providing a biological sample suspected of containing cells or microorganisms; carrying out an optional step of cellular separation of the sample; extracting DNA from the sample and carrying out a polymerase chain reaction (PCR) to amplify at least one genomic region, using a set of broad-taxonomic-range amplification primers; producing a high-resolution melt (HRM) curve of the amplicons generated by the PCR of the previous step; converting the generated data by means of mathematical transformation; using the mathematically transformed data as inputs for machine learning algorithms to determine cellular identity; and comparing the data of the suspected biological sample with a database of known microorganisms to identify cells.

Description

METHOD TO DETERMINE CELLULAR IDENTITY FROM A COMPLEX BIOLOGICAL SAMPLE BY PCR-HRM AND MATHEMATICAL ANALYSIS
BACKGROUND AND PRIOR ART
A challenge of molecular biology has been the identification of cells in different matrices, both biological and non-biological.
Microbial identification is understood as the set of techniques and procedures applied to establish the identity of a microorganism. These techniques are used in different areas, such as:
In the clinical area, where it is important to know the causal agent of a patient's infection in order to treat it with therapeutic agents.
In the pharmaceutical, cosmetic and food industries, where quality control standards require the absence of certain microorganisms.
In basic research, where an isolated microorganism must be identified to check whether it is a known microorganism or a new one, so that it can be classified.
The recognition of cell lines, especially pathogenic microorganisms, has been a recurring topic in microbiology and biology. Methods for microbial identification can be classified according to the properties of the organism they rely on: morphological criteria, differential staining, biochemical reactions (tests), phage typing, serological tests and molecular detection.
The difficulty of detecting organisms using classical microbiology has led to the development of new methods. Classical methods are associated with drawbacks such as long growth periods, very special culture conditions, and poorly collected samples that complicate the interpretation of the results or even make it impossible to obtain a result. All of this has given way to new methods, including those based on molecular biology, which have advanced greatly in recent decades and produced very sensitive and specific techniques capable of detecting and quantifying small amounts of deoxyribonucleic acid (DNA) or ribonucleic acid (RNA) and proteins from microorganisms, whether they can be isolated from a culture or cannot be cultured in vitro (Farfán J, 2015).
Molecular biology has made typing possible, that is, the identification and characterization of different organisms, from eukaryotes to viruses, including microorganisms found in different matrices, whether inert or not. Whole-genome sequencing of bacteria, viruses or pathogens is used to detect specific cells in specific contexts.
Regarding these techniques, there are many variants for amplifying, detecting and sequencing nucleic acids. The most widely used technique is the polymerase chain reaction (PCR). This technique is subsequently accompanied by detection of the amplification on an agarose gel with a nonspecific fluorescent intercalating agent. Real-time PCR is accompanied by the immediate identification of the amplification through hybridization with probes labeled with a fluorochrome and the measurement of the fluorescence emitted by the probe. The advantage of this technique is that it is fast: results are obtained in 30 minutes to two hours.
The introduction of new technologies into daily practice has advantages, notably better sensitivity, specificity, and positive and negative predictive values; faster results; and greater automation and, therefore, the capacity to take on a greater workload and better manage laboratory activity. As for the associated disadvantages, the analyzers and instruments required for this new way of working are very expensive and represent a large initial investment for the laboratory. Nevertheless, various cost-effectiveness analyses have shown this disadvantage to be minimal compared with the advantages associated with this technology (Cantón R et al., 2015).
One of the most widely used PCR-associated techniques for cell identification is the high-resolution dissociation curve (HRM). The analysis itself consists of precisely heating the DNA in the sample from about 50°C to about 95°C. At some point in this process, the melting temperature of each of the amplicons present in the sample is reached and the two DNA strands separate. The key to HRM analysis is to monitor the separation of the two DNA strands in real time. For this purpose, various fluorescent substances are available that intercalate into double-stranded DNA and emit high fluorescence when bound to it. When in solution (not bound to DNA) they emit low fluorescence, so fluorescence decreases as the strands separate. This decrease is usually greatest near the melting temperature (Tm) of the amplicons. The Tm is defined as the point on the melting curve at which 50% of the DNA is double-stranded and 50% is single-stranded.
The HRM instrument has a device capable of measuring the fluorescence of the sample and software that plots the detected data in a graph known as a melting curve, which shows the level of fluorescence versus temperature.
The resulting melting profile reflects the mixture of amplicons present. Aspects such as GC content, length, sequence and heterozygosity contribute to the melting curve characteristics of each amplicon. The resulting profiles can provide valuable information for mutation screening, genotyping, methylation analysis and other research applications, including the identification of different cell lines as shown in the prior art.
One of the processes that has made results more reliable is the use of PCR in its different formats for the identification of cell lines, mutants and specific strains of microorganisms.
The high-resolution dissociation curve (HRM or HRMT) is a powerful method for scanning sequences in DNA samples. HRM technology characterizes nucleic acid samples based on their dissociation behavior and detects small differences in PCR-amplified sequences simply by direct dissociation. The dissociation behavior of the samples depends on the GC content, the sequence and the length of the amplicons generated in the PCR step. Double-stranded DNA (dsDNA) is very stable at room temperature. However, as the temperature increases, the two strands begin to dissociate until they separate completely. The temperature at which 50% of the DNA is single-stranded is called the melting temperature (Tm). It depends on both the length and the guanine-cytosine (GC) content of the DNA fragment. Since guanine-cytosine (GC) base pairs are held together by three hydrogen bonds, they are more stable than adenine-thymine (AT) base pairs, which are held together by only two. Therefore, DNA sequences with a high GC content have a higher Tm than DNA sequences with few GC base pairs. Melting curve analysis is based on gradually increasing the temperature after the last PCR cycle. Initially, a high fluorescence signal is obtained because the dye is bound to or intercalated in the large number of double-stranded amplicons present in the PCR tube. At higher temperatures, however, the dsDNA dissociates, the fluorescent dye is released, and the fluorescence signal decreases.
The Tm of the amplicon can be determined from the inflection point of the melting curve or from the melting peak obtained by plotting the negative first derivative of fluorescence (F) with respect to temperature (T), -dF/dT, against temperature (T).
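
The two ways of reading Tm described above can be compared on a synthetic curve. The sketch below is illustrative only: the sigmoid melt curve, the function names and the 0.1°C sampling step are assumptions, and the half-signal estimate additionally assumes that fluorescence is proportional to the double-stranded fraction with flat baselines before and after the transition.

```python
import math


def melting_peak_tm(temps, fluor):
    """Tm as the temperature where -dF/dT is largest (central differences)."""
    best_t, best_slope = None, float("-inf")
    for i in range(1, len(temps) - 1):
        slope = -(fluor[i + 1] - fluor[i - 1]) / (temps[i + 1] - temps[i - 1])
        if slope > best_slope:
            best_t, best_slope = temps[i], slope
    return best_t


def half_signal_tm(temps, fluor):
    """Tm as the temperature where fluorescence crosses the midpoint between
    its maximum and minimum, located by linear interpolation."""
    half = (max(fluor) + min(fluor)) / 2
    for i in range(len(temps) - 1):
        f0, f1 = fluor[i], fluor[i + 1]
        if f0 >= half >= f1 and f0 != f1:
            frac = (f0 - half) / (f0 - f1)
            return temps[i] + frac * (temps[i + 1] - temps[i])
    raise ValueError("curve never crosses its half-signal level")


# Synthetic amplicon (assumption): sigmoid melt centred on 84.3 degrees C.
temps = [60 + 0.1 * i for i in range(351)]  # 60.0 to 95.0
true_tm = 84.3
fluor = [1.0 / (1.0 + math.exp((t - true_tm) / 0.6)) for t in temps]

tm_from_peak = melting_peak_tm(temps, fluor)
tm_from_half = half_signal_tm(temps, fluor)
```

On a clean, symmetric transition both estimates agree to within the sampling step; on real data with sloping baselines or several amplicons they can differ, which is one reason the full curve shape carries more information than Tm alone.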
Originally, the melting temperature (Tm) was used as a parameter to establish suspected identity between sequences and, consequently, between the organisms containing them. However, those skilled in the art know that a plurality of curves defining different organisms can share the same Tm; that parameter alone is therefore insufficient to establish identity in different settings.
For cell identification with this technique, specific amplicons must be generated that produce typical dissociation curves characterizing the cell line to be identified. To do this, amplicons from genomic regions that allow discrimination between the different species and classes present in a given sample must be used.
What is claimed in this patent is the design of new and inventive primers that generate amplicons identifying cellular or viral lines. These primers are related to gene regions that allow cells to be identified down to the subclass level.
These primers give a plurality of amplicons that can be identified by dissociation curves, which can be characterized by their shapes, shifts and Tm, among other characteristics. However, the prior art also indicates that when dissociation curves have similar characteristics, the data generated can be treated mathematically to increase the degree of discrimination. Data treatment methods for this purpose include the calculation of derivatives, fitting to specific mathematical models, and clustering using statistical or artificial intelligence methods.
Regarding methods that use derivatives, the negative first derivative curve can discriminate genotypes by comparing the relative position and shape of the melting curves. The main use of this methodology is, on the one hand, the certification of identity by comparison against a known standard and, on the other hand, discrimination within a bounded group, for example the identification and serotyping of species. What the methods described above have in common is that the identity of the query curve is compared against a limited group of possibilities.
In practice, curves obtained from the negative first derivative of HRM have an intrinsic degree of variability. These curves primarily reflect the sequence contained in the PCR amplicons, but they also capture other, non-intrinsic variables, such as the source of the DNA, the method of its preparation, procedural variations and even the equipment used, among others. Marcin Slomka et al. (2017) show the profound impact that slight changes in assay matrices or procedures have on HRM curves; these changes increase in impact and frequency when the sample size is large, as in high-throughput assays.
On the other hand, it is intuitive that small databases are less likely to contain similar HRM curves and therefore require less power to discriminate unknown samples. This is the typical case when HRM is used to determine variants of a strain within a discrete set of possibilities, or a likewise discrete number of possible polymorphic sites in a given amplicon, since assays are generally optimized to improve discrimination and accentuate differences. What the methods described above have in common is that the identity of the query curve is compared against a limited group of possibilities. However, as databases grow and samples are compared against a broad sample base, the probability that a curve is erroneously assigned to a family of similar curves increases. That is, the required resolving power is directly proportional to the size of the sample base used to determine the identity of an amplicon by HRM. Thus, using HRM to establish the identity of a sequence, that is, its membership in a group, against a database with a large sample size requires moving from rudimentary comparative procedures to advanced statistical clustering methods. In addition to procedures that improve statistical certainty, methods are needed that allow these strategies to be evaluated.
Among the artificial intelligence algorithms commonly used in the art for the clustering and/or discrimination and/or identification step are random forest, support vector machines, k-nearest neighbors (kNN), partitioning around medoids (PAM), Naive Bayes, principal component analysis (PCA) and linear discriminant analysis (LDA), among many others.
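As an illustration of one of the algorithms listed above, the following is a minimal sketch of k-nearest neighbors (kNN) classification of a query curve against labelled reference curves. The function names, the labels and the numeric values are hypothetical and chosen only for the example; they are not taken from the patent or from real HRM data.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Euclidean distance between two equal-length numeric vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_classify(query, references, k=3):
    """Return the majority label among the k references nearest to query.

    references is a list of (label, vector) pairs (hypothetical format).
    """
    if not 1 <= k <= len(references):
        raise ValueError("k must be between 1 and the number of references")
    # Rank every reference curve by its distance to the query curve.
    ranked = sorted(references, key=lambda item: euclidean(query, item[1]))
    # Count the labels of the k closest references and take the most frequent.
    votes = Counter(label for label, _ in ranked[:k])
    return votes.most_common(1)[0][0]


# Toy reference database: made-up values, not real HRM measurements.
REFERENCES = [
    ("organism_A", [0.10, 0.80, 0.30]),
    ("organism_A", [0.12, 0.78, 0.31]),
    ("organism_A", [0.09, 0.83, 0.28]),
    ("organism_B", [0.40, 0.20, 0.90]),
    ("organism_B", [0.42, 0.22, 0.88]),
    ("organism_B", [0.38, 0.18, 0.91]),
]
```

In this sketch a query is assigned to whichever labelled group dominates its k closest reference curves; other listed algorithms, such as random forest, would replace only the `knn_classify` step.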
In this regard, principal component analysis (PCA) is a statistical method that reduces the complexity of high-dimensional sample spaces while preserving their information. This technique belongs to the family known as unsupervised learning. In such methods, the response variable “Y” is not taken into account, since the objective is not to predict “Y” but to extract information from the predictors, for example to identify subgroups. PCA therefore makes it possible to “condense” the information provided by multiple variables into just a few components (Amat, J 2017). The use of unsupervised learning methods to improve the resolution of clustering in the genotyping of polymorphic amplicons is known in the art and has already proven useful (see Kubista M et al, 2006; Harrison L, Hanson N, 2017).
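The condensation performed by PCA can be sketched as follows, assuming each curve is stored as a row of numbers sampled at the same temperatures. This is a plain-Python illustration that finds the leading components by power iteration with deflation; the function names and the toy values are assumptions for the example, and a production analysis would normally rely on an established numerical library instead.

```python
import math
import random


def center(rows):
    """Subtract the per-column mean from every row."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[x - m for x, m in zip(row, means)] for row in rows]


def covariance(centered):
    """Sample covariance matrix of already-centered rows."""
    n, d = len(centered), len(centered[0])
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
            for i in range(d)]


def mat_vec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def leading_eigenpair(matrix, iters=1000, seed=0):
    """Power iteration: dominant eigenvalue and unit eigenvector."""
    rng = random.Random(seed)
    v = [rng.uniform(-1.0, 1.0) for _ in matrix]
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]
    for _ in range(iters):
        w = mat_vec(matrix, v)
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:  # matrix has no variance left in this direction
            break
        v = [x / norm for x in w]
    eigenvalue = sum(a * b for a, b in zip(v, mat_vec(matrix, v)))
    return eigenvalue, v


def pca(rows, n_components=2):
    """Return (components, scores) for the first n_components."""
    centered = center(rows)
    cov = covariance(centered)
    components = []
    for _ in range(n_components):
        eigenvalue, v = leading_eigenpair(cov)
        components.append(v)
        # Deflation: remove the variance explained by the component just found.
        cov = [[cov[i][j] - eigenvalue * v[i] * v[j] for j in range(len(v))]
               for i in range(len(v))]
    # Project each centered curve onto the components.
    scores = [mat_vec(components, row) for row in centered]
    return components, scores


# Toy curves (made-up values): two visibly different groups of two curves.
CURVES = [
    [1.0, 0.80, 0.10],
    [1.0, 0.82, 0.12],
    [1.0, 0.30, 0.05],
    [1.0, 0.28, 0.04],
]
```

Calling `pca(CURVES)` returns two unit-length components and, for each curve, its two coordinates in the reduced space. The sign of a component is arbitrary, so only relative positions of the scores are meaningful.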
PCA of HRM curves makes it possible to determine at least two components in them and, conversely, each sample can be positioned in a two-dimensional space whose axes are those components. This representation makes it possible to recognize, intuitively and visually, domains or territorial segments in which the curves show a similar contribution from each component, forming clusters. Our patent demonstrates that incorporating the additional step of differentiating the HRM curve again, that is, obtaining the second derivative, generates a new data matrix that enriches discrimination in the statistical clustering, as evidenced by the fact that the Euclidean distance from the mean point of each cluster to that of another increases when the additional second-derivative step is incorporated. Surprisingly, the use of the first derivative, whether positive or negative, together with the second derivative of the HRM data increased the discriminatory capacity of the curves obtained, and advantageously this results in an improved identification process when an unknown sample is compared against a database with a broad sample base.
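The separation measure mentioned above, the Euclidean distance between the mean points of two clusters, can be computed as in the following sketch. It assumes each cluster is a list of points in the reduced component space; the function names are hypothetical. Evaluating it once on data without the second-derivative step and once with it would give the comparison the text refers to; no result is asserted here.

```python
import math


def centroid(points):
    """Mean point of a cluster given as a list of equal-length coordinates."""
    if not points:
        raise ValueError("cluster must contain at least one point")
    n = len(points)
    return [sum(coord) / n for coord in zip(*points)]


def centroid_distance(cluster_a, cluster_b):
    """Euclidean distance between the mean points of two clusters."""
    ca, cb = centroid(cluster_a), centroid(cluster_b)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(ca, cb)))
```

A larger value of `centroid_distance` for the same pair of groups indicates that their mean points lie further apart in the chosen representation.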
TECHNICAL DIFFERENCE
The present invention seeks to protect a method for determining cell identity from a biological sample, comprising: providing a biological sample suspected of containing cells that can be identified from a known database, the cells being selected from eukaryotes, bacteria, viruses, fungi and protozoa and obtainable from biological or non-biological matrices; performing a cell separation step on the sample; extracting the DNA from the sample; and performing a polymerase chain reaction (PCR) to amplify at least one genomic region using at least one, and/or a combination, of any of the primers chosen from SEQ ID N°1 to SEQ ID N°85. The method also includes performing a high-resolution melting (HRM) curve of the amplicons generated by the PCR of the previous step and subsequently converting the resulting data by a mathematical transformation, which consists of a first step of standardizing the instrument data, a second step of generating a positive or negative first derivative, and a third step of generating a positive derivative of the data set obtained in the second step. Finally, the mathematically transformed data from step two and/or step three are used as inputs to machine learning algorithms to determine cell identity by comparing the data from the suspected biological sample with a database of known organisms for cell identification.
Also claimed in this invention is a system for determining cell identity comprising a storage component for generating a data matrix from the high-resolution melting (HRM) curves of known organisms and suspected samples; a computer processor for processing data that performs the mathematical transformation and/or a computer processor for processing data that executes artificial intelligence algorithms for the clustering and/or discrimination and/or cell identification step; and a monitor that displays information regarding the cell identification.
The literature shows wide use of the PCR-HRM technique, mainly in the genotyping of single nucleotide polymorphisms (SNPs), chiefly for detecting and anticipating mutations in potentially cancerous cells (Ghalamkari S et al. doi:10.1007/s12010-018-2859-3) and for detecting type 2 diabetes (CN104059991), among other diseases in the clinical field. Microorganisms can also be genotyped, as in patent WO2022068785, which discloses a method for rapidly identifying Bacillus cereus and Bacillus thuringiensis that comprises selecting different SNP sites in a sequence of the ispD gene and running an HRM curve using those species as templates to acquire the characteristic curve information of the corresponding strains, in order to subsequently identify the Bacillus in a sample accurately.
Among the documents closest to this application is US2017321257, which relates to a method for identifying bacteria in a biological sample comprising the following steps: a) providing a biological sample; b) isolating the bacterial DNA; c) amplifying at least a portion of the ITS region using at least one set of primers capable of hybridizing to that region; d) performing an HRM analysis of the amplicons; and e) identifying the bacterial species by comparing the curves with a database of known species. That patent states that the Hilbert transform is used as the mathematical method for fitting the curve data and the database data.
Also close is the paper by Bowman et al., “Species identification using high resolution melting (HRM) analysis with random forest classification” (doi:10.1080/00450618.2017.1315835). That paper seeks to discriminate cells at a forensic scene among 11 commonly encountered species: Gallus gallus domesticus (chicken), Felis catus (cat), Bos taurus (cattle), Canis lupus familiaris (dog), Vulpes vulpes (red fox), Homo sapiens (human), Macropus giganteus (eastern grey kangaroo), Sus scrofa domesticus (domestic pig), Oryctolagus cuniculus (European rabbit), Ovis aries aries (domestic sheep) and Vombatus ursinus (common wombat). The method described involves extracting DNA from a forensic sample, quantifying the DNA and assessing its purity, performing a PCR using primers for the mitochondrial 16S rRNA gene of each species, and running an HRM curve from 60 to 95°C. Finally, the results are differentiated (T versus -dF/dT) and classified by random forest.
The main technical difference between this invention and the closest documents lies in the development of specific primers (SEQ ID N°1 to SEQ ID N°85) for regions conserved across different cell lines, such as rpoB, ITS2, rpB2, EF1a, TEF-1a, NL1 23S, LS 23S, dnaA, gyrB, Reg U1, Reg U2, CandUn, FilamUn and FungUn. The use of these primers allows cell identification by exploiting the differences in the high-resolution melting (HRM) curves that result from generating different amplicons with different PCR techniques.
Another important technical difference is the treatment of the data generated by the HRM curve (first data matrix): a positive or negative first derivative is generated (second data matrix), these data are then transformed again by a positive derivative (third data matrix), and finally these data matrices are used by an artificial intelligence algorithm that allows a cell to be identified down to the level of cell line or species.
It is important to note that both technical differences, incorporated in the same method, provide not only novelty over the prior art but also an inventive step, since using the primers for the different regions individually and/or together improves cell identification. Furthermore, the combined use of different mathematical transformations, together with classification and/or discrimination algorithms, improves the resolution of the groups to be identified, as will be shown in the description that follows and in the examples of this patent.
GENERAL DESCRIPTION OF THE INVENTION
The present invention seeks to protect a method for determining cell identity from a biological sample, comprising providing a biological sample suspected of containing cells that can be identified from a known database, the cells being selected from eukaryotes, bacteria, viruses, fungi and protozoa and obtainable from biological or non-biological matrices. A cell separation step is then performed on the sample, in which eukaryotic cells and/or microorganisms and/or viruses can be discriminated and which can be carried out by physical and/or biological methods; the DNA is then extracted from the sample and a polymerase chain reaction (PCR) is performed to amplify at least one genomic region using at least one, and/or a combination, of any of the primers chosen from SEQ ID N°1 to SEQ ID N°85.
The method also includes performing a high-resolution melting (HRM) curve of the amplicons generated by the PCR of the previous step, where the melting temperature (Tm) of the amplicons lies in the range of 60°C to 98°C; subsequently converting the resulting data by a mathematical transformation that consists of a first step of standardizing the instrument data, a second step of generating a positive or negative first derivative, and a third step of generating a positive derivative of the data set obtained in the second step; then using the mathematically transformed data from step two and/or step three as inputs to machine learning algorithms to determine cell identity, where the artificial intelligence algorithms generate a first classifier that discriminates on the basis of the data generated by the profiles; and finally comparing the data from the suspected biological sample with a database of known organisms for cell identification.
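The three-step mathematical transformation described above can be sketched as follows. The patent does not specify here how the instrument data are standardized or how the derivatives are computed, so this example assumes min-max normalization for step one and finite differences for steps two and three, and it picks the negative sign for the first derivative. All function names are hypothetical.

```python
def min_max_normalize(fluorescence):
    """Step one (assumed form): rescale raw fluorescence to the range 0..1."""
    lo, hi = min(fluorescence), max(fluorescence)
    if hi == lo:
        raise ValueError("a flat curve cannot be normalized")
    return [(f - lo) / (hi - lo) for f in fluorescence]


def derivative(temps, values):
    """Finite-difference d(values)/d(temps).

    Central differences at interior points, one-sided at both ends.
    Assumes strictly increasing, roughly evenly spaced temperatures.
    """
    n = len(values)
    if n != len(temps) or n < 2:
        raise ValueError("need two equal-length sequences of at least 2 points")
    result = []
    for i in range(n):
        lo, hi = max(i - 1, 0), min(i + 1, n - 1)
        result.append((values[hi] - values[lo]) / (temps[hi] - temps[lo]))
    return result


def transform(temps, fluorescence):
    """Return the three data sets named in the text for a single curve."""
    standardized = min_max_normalize(fluorescence)                  # step one
    first_deriv = [-d for d in derivative(temps, standardized)]     # step two: -dF/dT
    second_deriv = derivative(temps, first_deriv)                   # step three: +d/dT of step two
    return standardized, first_deriv, second_deriv
```

Applying `transform` to every curve of the reference database and of the unknown sample yields rows for the second and third data sets, either of which (or both concatenated) could then be passed to the chosen classifier.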
Also claimed in this invention is a system for determining cell identity comprising a storage component for generating a data matrix from the high-resolution melting (HRM) curves of known organisms and suspected samples; a computer processor for processing data that performs the mathematical transformation and/or a computer processor for processing data that executes artificial intelligence algorithms for the step of clustering and/or discriminating and/or identifying the microorganisms; and a monitor that displays information regarding the identification of the microorganisms.
DETAILED DESCRIPTION OF THE INVENTION
As indicated in the general description, the invention seeks to protect a method for determining cell identity from a biological sample, comprising providing a biological sample suspected of containing cells selected from eukaryotes, bacteria, viruses, fungi and protozoa, which can be obtained from biological or non-biological matrices.
Preferably, the biological sample may come from clinical procedures. In this case, blood samples can be used, including without limitation whole blood, serum and plasma. Cells can also be detected in other fluid or tissue samples, including saliva, cerebrospinal fluid, mucus, lymphatic fluid or lavage fluid, and tissue samples obtained from the skin and soft tissues. In particular, fluid or tissue samples can be obtained from organs of the respiratory, reproductive, nervous, muscular, integumentary, lymphatic, excretory, endocrine, digestive, cardiovascular and skeletal systems. In addition, cells can be specifically detected and identified in samples taken from the site of a localized infection, for example the site of a wound caused by traumatic injury or surgery.
The samples can also come from the food industry, including, without intending to limit the invention, fresh foods and foods processed by various industrial methods such as cooking, drying and freeze-drying, among others; dairy foods such as natural, cultured and fermented milks and their derivatives; carbonated drinks with and without sugar; natural juices with and without preservatives; condiments; and dry and wet foods for pets and farm animals, among others.
Samples can also be obtained from contact surfaces made of inorganic material that are regularly used in the industries described above, such as work benches, medical instruments, prostheses, kitchen utensils and the permanent or temporary installations used to carry out production procedures, in the case of the food industry, or clinical procedures, whether invasive or non-invasive.
The surface materials can be those common to the food industry and to clinical procedures. Without limiting the present invention, they include wood and its derivatives, steel and its derivatives, iron and its derivatives, plastics in all their forms, glass in all its forms, alloys, etc. Optionally, the sample can also be subjected to a cell enrichment process by adding culture media specific for the growth of the cells to be identified. The purpose of this step is to obtain enough cells for PCR amplification. The enrichment techniques used are those known in microbiology.
Subsequently, a cell separation step can be performed on the sample, in which eukaryotic cells and/or microorganisms and/or viruses can be discriminated and which can be carried out by physical and/or biological methods. The separation methods are those known in biotechnology and, without limiting the present invention, can be selected according to the property exploited, such as cell size or density, affinity for antibodies, light scattering, fluorescence emission or other physical properties, among others.
The next step is to extract the DNA from the sample by means known in the art and to perform a polymerase chain reaction (PCR) to amplify at least one genomic region using any one, or any combination, of the primers chosen from SEQ ID N°1 to SEQ ID N°85. The PCR can be chosen from nested PCR, multiplex PCR, reverse transcriptase PCR and real-time PCR (qPCR). As a preferred, but not limiting, embodiment, qPCR is indicated as the method of this patent.
The genomic regions for which the sequences described above are specific primers are selected from rpoB, ITS2, rpB2, EF1a, TEF-1a, NL1 23S, LS 23S, dnaA, gyrB, Reg U1, Reg U2, CandUn, FilamUn and FungUn, without excluding from the invention other genomic regions that are used in the art for the selection and identification of cells or cell lines.
The primer sequences identified above are presented in the following table, together with the genomic regions they target.
Figure imgf000010_0001
Figure imgf000011_0001
Figure imgf000012_0001
To aid in understanding the sequences presented, the following table gives the meaning, according to the IUPAC code, of each letter used in the SEQ entries.
Figure imgf000012_0002
Figure imgf000013_0001
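As an illustration of how the IUPAC letters are read, the following minimal sketch expands a degenerate primer into every concrete sequence it represents. The mapping is the standard IUPAC nucleotide ambiguity code; the function name `expand_degenerate` and the example primer are hypothetical and are not taken from SEQ ID N°1 to SEQ ID N°85.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes (DNA alphabet).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG",
    "N": "ACGT",
}


def expand_degenerate(primer):
    """Return every concrete sequence encoded by a degenerate primer."""
    choices = [IUPAC[base] for base in primer.upper()]
    return ["".join(combo) for combo in product(*choices)]


# Hypothetical primer used only for illustration.
print(expand_degenerate("ACRTY"))
# ['ACATC', 'ACATT', 'ACGTC', 'ACGTT']
```

The number of expanded sequences grows multiplicatively with each ambiguous position, which is why highly degenerate primers are usually handled symbolically rather than enumerated.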
In certain embodiments, the primer oligonucleotides comprise a variant comprising a sequence that has at least about 80 to 100% identity to the sequences identified above, including any percentage identity within this range, such as 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98 or 99% sequence identity. Changes corresponding to particular genetic variations of interest can be introduced into the nucleotide sequences. In certain embodiments, nucleotide changes can be made in a sequence selected from the group consisting of SEQ ID N°1 to SEQ ID N°85, provided that the oligonucleotide primer remains capable of hybridizing to and amplifying a target DNA sequence.
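The identity threshold can be illustrated with a minimal sketch. It assumes the simplest possible definition, an ungapped position-by-position comparison of two sequences of equal length; the specification does not state how identity is computed, and alignment-based definitions would give different values. The function names and example sequences are hypothetical.

```python
def percent_identity(seq_a, seq_b):
    """Ungapped percent identity between two sequences of equal length."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(seq_a.upper(), seq_b.upper()))
    return 100.0 * matches / len(seq_a)


def is_variant(candidate, reference, minimum=80.0):
    """True if the candidate meets the minimum identity to the reference."""
    return percent_identity(candidate, reference) >= minimum


# Hypothetical sequences used only for illustration.
print(percent_identity("ACGTACGTAC", "ACGTACGTAA"))
print(is_variant("ACGTACGTAC", "ACGTACGTAA"))
```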
The method also includes generating a high-resolution dissociation (melting) curve (HRM) of the amplicons produced by the PCR of the previous step, where the melting temperature (Tm) of the amplicons lies in the range of 60°C to 98°C. The data from this curve are stored in a database and can be incorporated into a database of known cells for later use in identification.
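As a rough sketch of how a Tm value can be read from a dissociation curve, the code below takes the temperature at which fluorescence falls most steeply, that is, the maximum of -dF/dT, and checks it against the 60°C to 98°C range stated above. This is a common convention rather than a procedure given in the specification; the function names and the data points are hypothetical.

```python
def estimate_tm(temps, fluorescence):
    """Temperature at the steepest fluorescence drop (maximum of -dF/dT)."""
    best_tm, best_slope = None, float("-inf")
    for i in range(len(temps) - 1):
        slope = -(fluorescence[i + 1] - fluorescence[i]) / (temps[i + 1] - temps[i])
        if slope > best_slope:
            best_slope = slope
            # Report the midpoint of the interval with the steepest drop.
            best_tm = (temps[i] + temps[i + 1]) / 2
    return best_tm


def tm_in_range(tm, low=60.0, high=98.0):
    """Check that a Tm lies within the range stated in the description."""
    return low <= tm <= high


# Hypothetical curve used only for illustration.
temps = [80, 81, 82, 83, 84]
fluorescence = [10.0, 9.5, 5.0, 1.0, 0.8]
tm = estimate_tm(temps, fluorescence)
print(tm, tm_in_range(tm))
```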
The data generated by the dissociation curve can be transformed mathematically. The first step of this transformation consists of standardizing the data generated by the PCR instruments using normalization methods known in the art of data processing. As a preferred, but not limiting, embodiment, this standardization is carried out by z-score. Once the data have been standardized, the second step is to compute a positive or negative first derivative of the normalized data and to generate a second database from the derived data.
Finally, a third processing step can be carried out, which consists of computing a positive derivative of the data set obtained in the previous step, forming a third database to work with. This database is called the double derivative.
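The three steps (z-score standardization, first derivative, second derivative) can be sketched as follows. This is a minimal illustration under stated assumptions, not the inventors' implementation: derivatives are approximated by finite differences between consecutive readings, the negative sign is chosen for the first derivative, and all function names and data values are hypothetical.

```python
from statistics import mean, pstdev


def z_score(values):
    """Standardize values to zero mean and unit (population) standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        raise ValueError("constant signal cannot be standardized")
    return [(v - mu) / sigma for v in values]


def derivative(temps, values, sign=1.0):
    """Finite-difference derivative, reported at interval midpoints."""
    mids = [(t0 + t1) / 2 for t0, t1 in zip(temps, temps[1:])]
    slopes = [
        sign * (v1 - v0) / (t1 - t0)
        for t0, t1, v0, v1 in zip(temps, temps[1:], values, values[1:])
    ]
    return mids, slopes


def transform(temps, fluorescence):
    """Build the three data sets: normalized, first and second derivative."""
    norm = z_score(fluorescence)                    # step 1
    t1, d1 = derivative(temps, norm, sign=-1.0)     # step 2 (negative derivative assumed)
    t2, d2 = derivative(t1, d1, sign=1.0)           # step 3 (positive derivative)
    return {"normalized": (temps, norm), "first": (t1, d1), "second": (t2, d2)}


# Hypothetical readings used only for illustration.
result = transform([60, 61, 62, 63, 64], [5.0, 4.8, 3.0, 1.0, 0.9])
print(result["second"])
```

Each differentiation shortens the series by one point, so a curve with n readings yields n - 1 first-derivative values and n - 2 second-derivative values.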
The data sets generated from the HRM, from the normalized data and from the derivatives of the second and/or third step can be used in this method to generate additional discrimination parameters. These discrimination parameters may be selected from, but are not limited to, the relative position of peaks and valleys, peak height, and relationships within the generated data such as the width of, and/or the distance between, peaks and/or valleys.
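A minimal sketch of how such discrimination parameters might be extracted is given below. It assumes a simple definition of peaks and valleys as strict local maxima and minima of a sampled curve; real data would normally need smoothing and a prominence threshold first. The function names and the sample curve are hypothetical.

```python
def local_extrema(temps, values):
    """Return (peaks, valleys) as lists of (temperature, value) pairs."""
    peaks, valleys = [], []
    for i in range(1, len(values) - 1):
        if values[i - 1] < values[i] > values[i + 1]:
            peaks.append((temps[i], values[i]))
        elif values[i - 1] > values[i] < values[i + 1]:
            valleys.append((temps[i], values[i]))
    return peaks, valleys


def discrimination_features(temps, values):
    """Collect peak/valley positions, peak heights and peak-to-peak distances."""
    peaks, valleys = local_extrema(temps, values)
    peak_temps = [t for t, _ in peaks]
    return {
        "peak_positions": peak_temps,
        "peak_heights": [h for _, h in peaks],
        "valley_positions": [t for t, _ in valleys],
        "peak_spacing": [b - a for a, b in zip(peak_temps, peak_temps[1:])],
    }


# Hypothetical derivative curve used only for illustration.
temps = [80, 81, 82, 83, 84, 85, 86]
curve = [0.0, 2.0, 1.0, 0.5, 3.0, 1.0, 0.0]
print(discrimination_features(temps, curve))
```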
As a next step, the data sets identified above can be used, either individually or in combination, as inputs to machine learning algorithms. These algorithms generate classifiers that discriminate within the generated data sets and, finally, compare the data of the suspect biological sample with a database of known organisms for cellular identification. The machine learning algorithm may comprise a supervised learning algorithm such as ordinal classification, regression analysis and information fuzzy networks (IFN), statistical classification (such as AODE), linear classifiers (for example, Fisher's linear discriminant, logistic regression, the Naive Bayes classifier, the perceptron and support vector machines), quadratic classifiers, k-nearest neighbors, boosting, decision trees (for example, random forests), Bayesian networks and hidden Markov models, among others.
The machine learning algorithms may also comprise an unsupervised learning algorithm. Examples of unsupervised learning algorithms may include artificial neural networks, data clustering, the expectation-maximization algorithm, self-organizing maps, radial basis function networks, vector quantization, generative topographic maps, the information bottleneck method, and IBSEAD. Unsupervised learning can also comprise association rule learning algorithms such as the Apriori algorithm, the Eclat algorithm, and the FP-growth algorithm. Hierarchical clustering, such as single-linkage clustering and conceptual clustering, can also be used. Alternatively, unsupervised learning may comprise partitional clustering, such as the K-means algorithm and fuzzy clustering. In some cases, the machine learning algorithms comprise a reinforcement learning algorithm. Examples of reinforcement learning algorithms include, but are not limited to, temporal difference learning, Q-learning, and learning automata.
In a preferred embodiment, the artificial intelligence algorithms for the clustering and/or discrimination and/or identification step can be selected from random forest, vector machine learning (support vector machines), k-nearest neighbors (kNN), partitioning around medoids (PAM), Naive Bayes, principal component analysis (PCA) or linear discriminant analysis (LDA). These artificial intelligence algorithms generate a new classifier that discriminates with respect to the databases generated in the previous step.
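To illustrate the comparison of an unknown sample against a database of known organisms, the sketch below implements one of the listed options, k-nearest neighbors, with Euclidean distance and majority voting. It is a toy illustration only: the feature vectors, the labels `organism_A` and `organism_B`, and the function name `knn_identify` are hypothetical, and the specification does not prescribe this implementation.

```python
from collections import Counter
from math import dist


def knn_identify(sample, database, k=3):
    """Label a sample by majority vote among its k nearest reference curves.

    database is a list of (label, feature_vector) pairs; every vector must
    have the same length as the sample.
    """
    ranked = sorted(database, key=lambda entry: dist(sample, entry[1]))
    votes = Counter(label for label, _ in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical reference database used only for illustration.
database = [
    ("organism_A", [0.0, 1.0]),
    ("organism_A", [0.1, 0.9]),
    ("organism_B", [1.0, 0.0]),
    ("organism_B", [0.9, 0.2]),
]
print(knn_identify([0.2, 0.8], database, k=3))
```

In practice the feature vectors could be the normalized curve, its first or second derivative, or the peak and valley parameters, used alone or concatenated.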
In addition, this invention claims a system for determining cellular identity, comprising a storage component for generating a data matrix of a high-resolution dissociation curve (HRM) of known organisms and suspect samples.
The system also has a computational processor for processing data, which performs a mathematical transformation. The first step of this transformation consists of standardizing the data matrix generated for the high-resolution dissociation curve using normalization methods known in the art of data processing. Once the data have been standardized, the second step is to compute a positive or negative first derivative of the normalized data and to generate a second database from the derived data. Finally, a third processing step can be carried out, which consists of computing a positive derivative of the data set obtained in the previous step, forming a third database to work with. The system further includes a computational processor that executes artificial intelligence algorithms for the clustering and/or discrimination of the data matrices identified above. The artificial intelligence algorithms are the same as those mentioned above for the method, both in their general selection and in their preferred embodiment.
Finally, as a component of this system, a monitor displays information regarding the cell identification. This is achieved by comparing the data processed by both the mathematical transformation and the artificial intelligence algorithm with a database of known cells that contains the aforementioned mathematical transformation parameters.
EXAMPLES
Examples of specific embodiments for carrying out the present invention are presented below. The examples are offered for illustrative purposes only and are not intended to limit the scope of the present invention in any way.
Example 1: Location of the primers within the reference gene.
As a non-limiting example of the embodiments indicated above in the specification, the location of SEQ N°1 to SEQ N°12 within the sequence of the highly conserved rpoB gene is presented. Figure 1A shows the complete gene and where the sequences indicated above are located. Figures 1B and 1C show sequences SEQ N°1 to 5 within the gene, while Figures 1D and 1E show sequences SEQ N°6 to 12.
It can be seen that the sequences are variations of highly conserved regions of the genes indicated in the table presented in the detailed description of the invention. This property allows their use in different next-generation sequencing (NGS) technologies and is also a crucial part of the identification method presented in this application.
Example 2: Protocol for PCR-HRM.
To identify bacteria that cause spoilage in soft drinks, three processes must be carried out: enrichment, extraction and RT-PCR.
1) Enrichment
Transfer 1 mL of the sample to a tube with enrichment medium for lactic acid bacteria and incubate at 35°C for 24 h. Transfer another 1 mL of the sample to an enrichment tube for thermoacidophilic bacteria and incubate at 44°C for 48 h.
Transfer another 1 mL of the sample to a tube with enrichment medium for bacteria and incubate at 35°C for 24 h.
Transfer 1 mL of the sample to a tube with enrichment medium for anaerobic bacteria and incubate at 35°C for 24 h under anaerobic conditions.
Finally, transfer 1 mL of the sample to a tube with enrichment medium for thermophilic bacteria and incubate at 44°C for 24 h.
Once the incubation times have elapsed, transfer 1.2 mL of the enriched sample to a sterile tube. Likewise, prepare a counter-sample and store it at 4°C.
2) DNA extraction
To extract DNA from the food samples, the following protocol is executed:
a) Heat a thermoblock to 99°C. In parallel, centrifuge and homogenize the samples after incubation.
b) Transfer 200 µL of each enriched sample and mix them in a sterile 1.5 mL tube.
c) Centrifuge at 9000 g for 4 minutes and remove the supernatant with a micropipette.
d) Add 350 µL of wash solution and pipette until the mixture is homogeneous.
e) Transfer the 350 µL of sample to tube A and shake at 2500 rpm for 10 minutes.
f) Then transfer 250 µL of sample to a tube containing a separation resin and phosphate solution, and incubate at 99°C for 20 minutes.
g) Let the tube stand at room temperature for 5 minutes; once the resin has settled, take 100 µL of supernatant and transfer it to a sterile 1.5 mL tube.
3) RT-PCR
For this stage, a ChaiBio or AriaMX thermal cycler is used. The PCR and HRM program is configured as indicated in the following table:
Figure imgf000016_0001
Figure imgf000017_0001
The wells must be spun down before loading. Then load 2 µL of the previously extracted DNA into the sample wells, and load 2 µL of the positive and negative controls into their respective wells.
Once the wells have been loaded, spin them down again to ensure that the sample and controls come into contact with the mix. Close the wells with their respective caps.
Take the plate or strips to the PCR instrument.
Example 3: Mathematical treatment of dissociation curves (HRM) of closely related amplicons.
Without being limited to the preferred embodiments, the inventors selected three strains of microorganisms that are usually difficult to discriminate by HRM: Paenibacillus macerans (2), Pseudomonas putida (5) and Acetobacter aceti (7).
For this purpose, an RT-PCR was performed: 50 µL of sample culture was mixed with 450 µL of lysis buffer. The mixture was incubated for 20 minutes at 95°C and the extracted DNA proceeded to the amplification stage. The PCR stage targeted a genomic region, such as rpoB, using a set of amplification primers of broad taxonomic range, with a mix containing the primers corresponding to SEQ ID N°1 to SEQ ID N°19.
The results from the qPCR instruments are then normalized by z-score, which yields results that are comparable regardless of the device and the time at which the analyses are performed. Subsequently, temperature is plotted against the derivative of the relative fluorescence (dRFU), as shown in Figure 2.
With these data, the curves obtained without (Figure 3A) and with (Figure 3B) the second derivative were then clustered in R by applying principal component analysis (PCA).
The Euclidean distance, which is a dissimilarity criterion, between the groups (clusters) obtained by the process that incorporates only the first derivative of the fluorescence (standard method) and by the process that incorporates the second derivative is decisive. As seen in Figure 4, the distances between the groups of organisms analyzed surprisingly and advantageously increased when the claimed method was used, which is characterized in an essential aspect by the application of the second derivative to the HRM curves. This means that the power to discriminate between curves of similar behavior improved. Both the cluster arrangement presented in Figure 3 and the increase in Euclidean distance shown in Figure 4 reveal a clear identification of Paenibacillus macerans (2) and a substantial discrimination between Pseudomonas putida (5) and Acetobacter aceti (7).
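The distance criterion can be sketched as the Euclidean distance between cluster centroids in the reduced (for example, PCA) space. This is an assumption made for illustration: the specification does not state whether distances were measured between centroids or between individual curves, and the analysis in the example was performed in R. The coordinates below are invented; only the group labels 2 and 5 follow the numbering used in the text.

```python
from math import dist


def centroid(points):
    """Coordinate-wise mean of a list of equal-length points."""
    n = len(points)
    return [sum(coord) / n for coord in zip(*points)]


def centroid_distances(clusters):
    """Euclidean distance between the centroids of every pair of clusters."""
    labels = sorted(clusters)
    centers = {label: centroid(clusters[label]) for label in labels}
    return {
        (a, b): dist(centers[a], centers[b])
        for i, a in enumerate(labels)
        for b in labels[i + 1:]
    }


# Invented coordinates in a two-component space, for illustration only.
clusters = {
    "2": [[0.0, 0.0], [0.0, 2.0]],
    "5": [[3.0, 1.0], [5.0, 1.0]],
}
print(centroid_distances(clusters))
```

Computing these distances once for the first-derivative data and once for the second-derivative data gives a direct numerical comparison of how far apart the groups lie under each treatment.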
Example 4: Generation of a microorganism database by HRM and increased discrimination capacity in complex samples.
This example tests whether the increased ability to discriminate between nearby curves also translates into an improved ability to identify an unknown sample against a database with a broad sample base.
For the microorganisms, all strains were prepared fresh in non-selective culture media and then isolated on agar medium. From each plate, the DNA of 10 colonies was sequenced to confirm the strain and the purity of the cultures.
The result of the usual process (dissociation curve) was contrasted with the method of the invention (application of the double derivative) using heat maps. The improvement is visible as an evident increase in heat along the identity line, the diagonal that runs from the origin to the upper right corner, as shown in Figure 5.
The microorganisms used to build the comparison database are listed below, without excluding the incorporation of new strains or species:
Bacteria:
Acetobacter aceti, Acidovorax sp, Acinetobacter sp, Aeromonas hydrophila, Alicyclobacillus acidoterrestris, Asaia bogorensis, Asaia lannensis, Asaia sp, Bacillus albus, Bacillus altitudinis, Bacillus amyloliquefaciens, Bacillus cereus, Bacillus coagulans, Bacillus licheniformis, Bacillus megaterium, Bacillus pumilus, Bacillus subtilis, Blastomonas natatoria, Brochothrix thermosphacta, Caulobacter vibrioides, Clostridium perfringens, Delftia acidovorans, Delftia sp, Enterobacter cloacae, Enterococcus faecium, Escherichia coli O157:H7, Hydrogenophaga pseudoflava, Klebsiella oxytoca, Lactobacillus alimentarius, Lactobacillus brevis, Lactobacillus fermentum, Lactobacillus helveticus, Lactobacillus parabuchneri, Lactobacillus paracasei, Lactobacillus plantarum, Lactobacillus rhamnosus, Lactococcus lactis, Leuconostoc citreum, Paenibacillus humicus, Paenibacillus macerans, Paenibacillus motobuensis, Paenibacillus xylanilyticus, Pseudomonas aeruginosa, Pseudomonas alcaligenes, Pseudomonas mendocina, Pseudomonas plecoglossicida, Pseudomonas putida, Salmonella typhimurium, Serratia liquefaciens, Serratia marcescens, Shewanella baltica, Sphingobium yanoikuyae, Sphingomonas sp, Sphingopyxis sp, Sphingopyxis terrae, Staphylococcus epidermidis, Staphylococcus hominis, Streptococcus pyogenes, Weissella cibaria, Aerococcus viridans, Aeromonas salmonicida, Alcaligenes viscolactis, Bacillus albus, Bacillus cereus, Bacillus circulans, Bacillus clausii, Bacillus invictae, Bacillus paranthracis, Bacillus simplex, Bacillus sonorensis, Bacillus thuringiensis, Bacillus velezensis, Brachybacterium nesterenkovii, Brucella neotomae, Campylobacter jejuni, Clostridium butyricum, Clostridium pasteurianum, Cupriavidus pauculus, Enterobacter aerogenes, Enterobacter asburiae, Enterobacter sacchari, Enterobacter soli, Enterococcus casseliflavus, Enterococcus durans, Enterococcus faecalis, Enterococcus gallinarum, Enterococcus hirae, Enterococcus raffinosus, Enterococcus saccharolyticus, Erysipelothrix sp, Escherichia coli, Escherichia coli O157-H7, Escherichia coli Serotype O111:H8, Escherichia coli Serotype O145:NM, Escherichia coli Serotype O157:H7, Escherichia coli Serotype O26:H11, Escherichia coli Serotype O45:H2, Exiguobacterium acetylicum, Flavobacterium, Gluconacetobacter sacchari, Hafnia paralvei, Herbaspirillum seropedicae, Kocuria flava, Kocuria kristinae, Kocuria rhizophila, Kocuria varians, Kocuria palustris, Kocuria rosea, L. bulgaricus, Lactobacillus acidophilus, Lactobacillus buchneri, Lactobacillus fructivorans, Lactococcus helveticus, Lactococcus lactis, Leclercia adecarboxylata, Leuconostoc lactis, Listeria aquatica, Listeria booriae, Listeria cornellensis, Listeria floridensis, Listeria grandensis, Listeria grayi, Listeria innocua, Listeria ivanovii, Listeria marthii, Listeria monocytogenes, Listeria riparia, Listeria rocourtiae, Listeria seeligeri, Listeria welshimeri, Macrococcus carouselicus, Microbacterium aurantiacum, Microbacterium liquefaciens, Microbacterium marinilacus, Micrococcus aloeverae, Paenibacillus cookii, Paenibacillus fonticola, Paenibacillus puldeungensis, Paenibacillus wynnii, Paenibacillus xylanexedens, Pediococcus damnosus, Phytobacter ursingii, Plesiomonas shigelloides, Propianics bacillus, Propionibacterium acidifaciens, Propionibacterium freudenreichii, Proteus vulgaris, Pseudomonas shigelloides, Pseudomonas fluorescens, Pseudomonas fragi, Pseudomonas lundensis, Pseudomonas plecoglossicida, Pseudomonas putrefaciens, Ralstonia mannitolilytica, Salmonella bongori, Salmonella enterica Diarizonae, Serratia grimesii, Sphingomonas melonis, Spirosoma panaciterrae, Staphylococcus aureus, Staphylococcus capitis, Staphylococcus xylosus, Stenotrophomonas maltophilia, Stenotrophomonas rhizophila, Streptococcus agalactiae, Streptococcus cremoris, Streptococcus equi subsp. equi, Streptococcus gallolyticus, Streptococcus group B, Streptococcus pyogenes, Streptococcus thermophilus and Vibrio sp.
Yeasts and Molds:
Alternaria alternata, Aspergillus niger, Aspergillus versicolor, Barnettozyma californica, Brettanomyces anomalus, Brettanomyces bruxellensis, Candida boidinii, Candida davenportii, Candida lactis-condensi, Candida magnoliae, Candida parapsilosis, Candida sojae, Candida sp, Candida temnochilae, Cladosporium cladosporioides, Cladosporium sp, Cutaneotrichosporon dermatis, Dekkera naardenensis, Didymella sp, Exophiala oligosperma, Exophiala sp, Fusarium equiseti, Fusarium oxysporum, Fusarium solani, Geotrichum candidum, Kluyveromyces marxianus, Lachancea dasiensis, Lodderomyces elongisporus, Penicillium citrinum, Penicillium glabrum, Penicillium sp, Pichia cactophila, Pichia kudriavzevii, Rhodosporidiobolus nylandii, Rhodotorula glutinis, Rhodotorula mucilaginosa, Saccharomyces bayanus, Saccharomyces cerevisiae, Saccharomyces ludwigii, Saccharomyces pastorianus, Talaromyces funiculosus, Talaromyces minioluteus, Talaromyces sp, Trichoderma atroviride, Trichoderma reesei, Vishniacozyma sp, Wickerhamomyces anomalus, Zygoascus hellenicus, Zygosaccharomyces bailii, Zygosaccharomyces bisporus, Zygosaccharomyces parabailii and Zygosaccharomyces rouxii.
Example 5: Identification of fungi and yeasts in dairy products.
Reference panel.
Fourteen microorganisms are presented in a reference panel; they were obtained from the ATCC and DSMZ culture collections and from environmental or product isolates. The panel contains the strains Acinetobacter lwoffii, Aeromonas hydrophila, Bacillus cereus, Bacillus circulans, Bacillus clausii, Bacillus licheniformis, Enterobacter aerogenes, Enterobacter cloacae, Enterococcus faecalis, Klebsiella, Klebsiella aerogenes, Klebsiella oxytoca, Kocuria kristinae, Lactococcus lactis, Leuconostoc lactis, Macrococcus carouselicus, Propionibacterium freudenreichii, Proteus vulgaris, Pseudomonas fluorescens, Pseudomonas putida, Serratia grimesii, Serratia liquefaciens, Serratia marcescens, Staphylococcus aureus, Staphylococcus epidermidis, Streptococcus agalactiae, Streptococcus cremoris and Streptococcus thermophilus.
All strains were activated in non-selective growth media and then isolated on agar so as to work with pure cultures; their identity was confirmed by sequencing.
Matrices.
Five matrices of different types of dairy product were selected: natural milk (M8), skim milk (M9), calcium-fortified milk (M7), flavored milk (M11) and cream (M10). The matrices were enriched without inoculum in 9 mL of fungal growth medium for 48 hours at 25°C, which confirmed their sterility.
Stabilization and inoculation of strains.
Before inoculation into the final products, the working inoculum was standardized to a low microbial load of 100 CFU/100 µL or less, confirmed by traditional plating methods. 1 mL of each evaluated matrix was inoculated with the quantities detailed in Table 1. The inoculated products were enriched with 9 mL of fungal growth medium for 48 hours at 25°C.
Quantification of microbial growth.
After enrichment, each sample was plated on Potato Dextrose Agar for quantification and incubated for 5 days at 25°C. The colonies were then counted and recorded for data analysis.
DNA extraction and amplification.
DNA extraction and PCR were carried out according to the protocol of the previous examples. In this case, a mixture of primers corresponding to SEQ ID No. 20 and SEQ ID Nos. 69 to 83 was used. 2 µL of each DNA sample was added to each tube of the PCR plate. The PCR plate was loaded into an AriaMx Real-Time PCR instrument (Agilent Technologies), and the PCR and dissociation-curve protocol was configured.
Results
The procedure begins by inoculating 1 mL of each matrix with the particular microorganism in stationary phase at a low concentration, leaving the matrices at 100 CFU/mL or less.
Subsequently, the aliquot of inoculated matrix was transferred to 9 mL of fungal growth medium for the enrichment phase and incubated at 28°C for 48 hours, allowing good development of the tested microorganisms in the different matrices.
Once the enrichment phase was complete, each sample was quantified by traditional methods (plate count). Growth of the microorganisms was observed in all the matrices evaluated, as shown in Figure 6.
These results demonstrate that the medium allows good growth of all the yeasts and molds evaluated. Aspergillus fumigatus and Aspergillus niger had the lowest counts, 33 CFU/mL and 66 CFU/mL respectively. However, this concentration of microorganism is detectable by PCR.
The PCR results are shown in Figure 7, which shows the presence of the microorganisms in the different matrices evaluated. The assay controls indicate that there are no problems with the PCR and that the matrices evaluated are sterile. The microorganism controls show the profile of each of the molds and yeasts. Each of the matrices evaluated presents the profiles corresponding to the inoculated microorganisms, indicating correct detection of the respective molds and yeasts.
Finally, the data analysis was carried out as indicated in the detailed description. The results showed that the 14 microorganisms were detected and identified correctly. The positive agreement between this test and the expected results was 100% (Table 3). No differences in analytical performance were observed between the microorganisms analyzed. Although some fungi grew significantly more slowly than others, all were identified. Furthermore, the different dairy products had no effect on the sensitivity and precision of the method.
Figure imgf000021_0001
Figure imgf000022_0001
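The positive agreement reported above can be read as the share of inoculated samples whose identified organism matches the expected one. The following is a minimal sketch under that assumption; the function name `positive_agreement` and the example pairs are hypothetical and do not reproduce the contents of Table 3.

```python
def positive_agreement(pairs):
    """Percentage of samples whose identified organism equals the expected one."""
    if not pairs:
        raise ValueError("at least one (expected, identified) pair is required")
    matches = sum(1 for expected, identified in pairs if expected == identified)
    return 100.0 * matches / len(pairs)


# Hypothetical (expected, identified) pairs, for illustration only.
example_pairs = [
    ("Aspergillus niger", "Aspergillus niger"),
    ("Candida parapsilosis", "Candida parapsilosis"),
    ("Zygosaccharomyces bailii", "Zygosaccharomyces bailii"),
]
agreement = positive_agreement(example_pairs)
```

With every pair matching, the sketch returns 100%; any mismatch lowers the figure proportionally.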
Example 6: Identification of bacteria in beverage products.
Reference panel.
Fifty-four microorganisms are presented in the reference panel; they were obtained from the ATCC and DSMZ culture collections and from environmental or product isolates. The reference panel contains the strains Acetobacter aceti, Aeromonas hydrophila, Alicyclobacillus acidoterrestris, Asaia bogorensis, Asaia lannensis, Bacillus albus, Bacillus altitudinis, Bacillus amyloliquefaciens, Bacillus cereus, Bacillus coagulans, Bacillus licheniformis, Bacillus megaterium, Bacillus pumilus, Bacillus subtilis, Blastomonas natatoria, Brochothrix thermosphacta, Caulobacter vibrioides, Clostridium perfringens, Delftia acidovorans, Enterobacter cloacae, Enterococcus faecium, Enterococcus hirae, Escherichia coli, Hydrogenophaga pseudoflava, Klebsiella oxytoca, Lactobacillus alimentarius, Lactobacillus brevis, Lactobacillus fermentum, Lactobacillus helveticus, Lactobacillus parabuchneri, Lactobacillus paracasei, Lactobacillus plantarum, Lactobacillus rhamnosus, Lactococcus lactis, Leuconostoc citreum, Paenibacillus humicus, Paenibacillus macerans, Paenibacillus motobuensis, Paenibacillus xylanilyticus, Pseudomonas aeruginosa, Pseudomonas alcaligenes, Pseudomonas mendocina, Pseudomonas plecoglossicida, Pseudomonas putida, Salmonella typhimurium, Serratia liquefaciens, Serratia marcescens, Shewanella baltica, Sphingobium yanoikuyae, Sphingopyxis terrae, Staphylococcus epidermidis, Staphylococcus hominis, Streptococcus pyogenes and Weissella cibaria.
All strains were activated in non-selective growth media and then isolated on agar so as to work with pure cultures; their identity was confirmed by sequencing.
Stabilization and inoculation of strains.
Before inoculation into the final products, the working inoculum was standardized to a low microbial load of less than 1 × 10^5 CFU/100 µL, added to 10 mL of final product.
Matrices.
Five matrices of different types of finished beverage product were selected: dark carbonated soft drink, colorless carbonated soft drink, carbonated water, non-carbonated water and orange juice. The matrices were enriched without inoculum in bacterial growth medium for 24 hours at 35°C, which confirmed their sterility.
Stabilization and inoculation of strains.
Before inoculation into the final products, all strains were stabilized for 72 hours at 4°C within the matrices. After stabilization, all samples were inoculated into sterile matrices using no more than 8 serial dilutions ranging from 0 to 10,000 CFU/mL. Five replicates of each dilution were quantified by traditional plating methods. In parallel with the traditional quantification, 100 µL of each sample was inoculated into 10 mL of the final product. These samples were enriched for 24 hours at 35°C in the bacterial growth medium.
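The replicate plate counts and the 100 µL into 10 mL inoculation described above amount to simple dilution arithmetic. The sketch below illustrates it under stated assumptions: the function names, the plated volume of 0.1 mL and the colony counts are hypothetical and are not taken from the patent.

```python
from statistics import mean


def cfu_per_ml(colony_counts, plated_volume_ml, dilution_factor=1.0):
    """Mean CFU/mL from replicate plate counts.

    dilution_factor is the fold-dilution of the plated suspension
    relative to the original sample (1.0 means undiluted).
    """
    if plated_volume_ml <= 0 or dilution_factor <= 0:
        raise ValueError("volume and dilution factor must be positive")
    return mean(colony_counts) * dilution_factor / plated_volume_ml


def concentration_after_inoculation(stock_cfu_per_ml, inoculum_ml, product_ml):
    """CFU/mL in the product immediately after adding the inoculum."""
    total_cfu = stock_cfu_per_ml * inoculum_ml
    return total_cfu / (product_ml + inoculum_ml)


# Hypothetical counts from five replicate plates, 0.1 mL plated, undiluted.
stock = cfu_per_ml([9, 11, 10, 12, 8], plated_volume_ml=0.1)
# 100 µL (0.1 mL) of that suspension added to 10 mL of final product.
start = concentration_after_inoculation(stock, inoculum_ml=0.1, product_ml=10.0)
```

In this illustration a suspension of about 100 CFU/mL yields roughly 1 CFU/mL in the product before enrichment, which is why an enrichment step precedes detection.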
Quantification of microbial growth.
After enrichment, 1 mL of each sample was added to a plate count agar plate and incubated for 2 days at 35°C. The colonies were counted and recorded for data analysis.
DNA extraction and amplification.
50 µL of the enriched sample was mixed with 450 µL of lysis buffer, and the mixture was incubated for 20 minutes at 95°C. Then 2 µL of each DNA extract was added to a well of the PCR plate, which was loaded into an AriaMx Real-Time PCR instrument (Agilent Technologies); the PCR and dissociation protocol were set up as indicated in Example 1. In this case, a mixture of primers corresponding to SEQ ID Nos. 1 to 19 and SEQ ID Nos. 21 to 68 was used.
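The lysis step above dilutes the enriched sample before it reaches the PCR well. As a small worked illustration (function names are hypothetical), the fold-dilution and the volume of enriched sample represented in each reaction can be computed as follows:

```python
def dilution_factor(sample_ul, diluent_ul):
    """Fold-dilution obtained by mixing a sample with a diluent."""
    if sample_ul <= 0 or diluent_ul < 0:
        raise ValueError("sample volume must be positive, diluent non-negative")
    return (sample_ul + diluent_ul) / sample_ul


def sample_equivalent_ul(template_ul, fold_dilution):
    """Volume of original sample represented by a template aliquot."""
    return template_ul / fold_dilution


# 50 µL of enriched sample mixed with 450 µL of lysis buffer.
lysis_dilution = dilution_factor(50, 450)
# 2 µL of the lysate added to each PCR well.
per_reaction = sample_equivalent_ul(2, lysis_dilution)
```

Under these volumes the lysate is a 10-fold dilution, so each 2 µL template corresponds to 0.2 µL of enriched sample.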
Data analysis and results.
Finally, the data analysis was carried out as indicated in the detailed description. The results showed that the 54 microorganisms were detected and identified correctly. The positive agreement between this test and the expected results was 100%. No differences in analytical performance were observed between the microorganisms analyzed. Furthermore, the different beverage products had no effect on the sensitivity and precision of the method.
Example 7: Discrimination of closely related microorganisms using the method of the invention.
A discrimination assay is presented for microorganisms commonly found as contaminants in the food industry: Lactobacillus plantarum, Pseudomonas aeruginosa and Pseudomonas alcaligenes. The first is used as an internal control of the experiment, while the resolving power of the method is evaluated on the two Pseudomonas species. The general protocol is the same as in Example 2 of this patent, and the matrix evaluated is soft drinks.
Figure 8 shows that the first derivative of the PCR melting curves does not produce a substantial separation between the Pseudomonas species, owing to their close phylogenetic relationship.
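First-derivative melting curves such as those of Figure 8 are conventionally plotted as -dF/dT against temperature, with the melting temperature (Tm) read at the peak. The sketch below illustrates that computation with central differences on a synthetic sigmoid curve; the function names, the 0.5°C step and the curve itself are assumptions for illustration, not the instrument's export format or the patent's data.

```python
import math


def negative_first_derivative(temps, fluorescence):
    """Return -dF/dT at each interior temperature using central differences."""
    if len(temps) != len(fluorescence) or len(temps) < 3:
        raise ValueError("need matching sequences of at least three points")
    inner_temps, deriv = [], []
    for i in range(1, len(temps) - 1):
        slope = (fluorescence[i + 1] - fluorescence[i - 1]) / (temps[i + 1] - temps[i - 1])
        inner_temps.append(temps[i])
        deriv.append(-slope)
    return inner_temps, deriv


def melting_temperature(temps, fluorescence):
    """Temperature at which -dF/dT is largest."""
    inner_temps, deriv = negative_first_derivative(temps, fluorescence)
    peak = max(range(len(deriv)), key=deriv.__getitem__)
    return inner_temps[peak]


# Synthetic melt curve: sigmoid centred on 85.0 °C, sampled every 0.5 °C
# from 60 °C to 98 °C.
temps = [60.0 + 0.5 * i for i in range(77)]
fluor = [1.0 / (1.0 + math.exp((t - 85.0) / 0.8)) for t in temps]
tm = melting_temperature(temps, fluor)
```

Two amplicons with nearly identical Tm values give almost overlapping peaks in this representation, which is the limitation the paragraph above describes.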
Subsequently, the first-derivative results were standardized by z-score and classified using random forests. To evaluate the classification, a confusion matrix was generated showing how accurately the evaluated groups are separated.
Figure imgf000024_0001
As can be seen in the table above, the resolution between the Pseudomonas species increases significantly. This is surprisingly reinforced by Figure 9, which shows the increase in Euclidean distance between the original data (A) and the data after applying the method of this invention (B).
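The z-score standardization and the Euclidean distance mentioned in this example can be sketched as follows. This is only an illustration: the two derivative profiles are synthetic, the function names are hypothetical, and the random forest classification step is not reproduced here.

```python
import math
from statistics import mean, pstdev


def z_score(values):
    """Standardize a profile to zero mean and unit (population) standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        raise ValueError("profile is constant; z-score is undefined")
    return [(v - mu) / sigma for v in values]


def euclidean_distance(a, b):
    """Euclidean distance between two profiles of equal length."""
    if len(a) != len(b):
        raise ValueError("profiles must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Synthetic low-amplitude derivative profiles whose peaks are shifted by
# one temperature step (illustrative values only).
profile_a = [0.0, 0.1, 0.4, 0.1, 0.0]
profile_b = [0.0, 0.0, 0.1, 0.4, 0.1]

raw_distance = euclidean_distance(profile_a, profile_b)
standardized_distance = euclidean_distance(z_score(profile_a), z_score(profile_b))
```

For these particular synthetic profiles the distance grows after standardization because both signals are rescaled to unit variance; whether real profiles separate further depends on the data.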
Example 8: Evaluation of the identification method for Zygosaccharomyces.
Different subclasses of Zygosaccharomyces were identified in dairy matrices. Z. parabailii, Z. bailii, Z. bisporus and Z. rouxii were chosen. For this evaluation, the protocol and primary analysis indicated in the previous example were used.
To this end, a PCA-based classification was carried out on 50 features, which were grouped into 2 components as shown in Figure 10A and into 3 components (x, y, z) as shown in Figure 10B.
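Principal component analysis reduces many features to a few components in which groups can be inspected. The sketch below is a minimal, dependency-free illustration using power iteration with deflation; it is not the implementation used in the patent, and the four-feature data set stands in for the 50 features purely as a hypothetical example.

```python
import math
import random


def center(rows):
    """Subtract the column mean from every feature."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    return [[r[j] - means[j] for j in range(d)] for r in rows]


def covariance(centered):
    """Sample covariance matrix (features x features)."""
    n, d = len(centered), len(centered[0])
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
            for i in range(d)]


def mat_vec(matrix, v):
    return [sum(a * b for a, b in zip(row, v)) for row in matrix]


def unit(v):
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def top_eigenpair(matrix, iterations=500, seed=0):
    """Largest eigenvalue and its eigenvector, found by power iteration."""
    rng = random.Random(seed)
    v = unit([rng.random() + 0.1 for _ in matrix])
    for _ in range(iterations):
        w = mat_vec(matrix, v)
        if all(abs(x) < 1e-15 for x in w):
            break  # matrix is numerically zero along v
        v = unit(w)
    eigenvalue = sum(a * b for a, b in zip(v, mat_vec(matrix, v)))
    return eigenvalue, v


def pca(rows, n_components=2):
    """Return sample scores, component vectors and component variances."""
    centered = center(rows)
    cov = covariance(centered)
    components, variances = [], []
    for _ in range(n_components):
        lam, vec = top_eigenpair(cov)
        components.append(vec)
        variances.append(lam)
        # Deflation: remove the component just found so that the next
        # power iteration converges to the following one.
        cov = [[cov[i][j] - lam * vec[i] * vec[j] for j in range(len(vec))]
               for i in range(len(vec))]
    scores = [[sum(a * b for a, b in zip(row, comp)) for comp in components]
              for row in centered]
    return scores, components, variances


# Hypothetical feature table: six samples from two groups, four features.
features = [
    [1.0, 2.0, 0.5, 0.1],
    [1.2, 2.1, 0.4, 0.2],
    [0.9, 1.9, 0.6, 0.1],
    [3.0, 0.5, 2.0, 1.0],
    [3.1, 0.4, 2.2, 1.1],
    [2.9, 0.6, 1.9, 0.9],
]
scores, components, variances = pca(features, n_components=2)
```

In this toy data the two groups fall on opposite sides of the first component. Note that PCA itself only projects the data; assigning a sample to a group still requires a classifier or a decision rule on the scores.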
As both figures show, there is a high degree of resolution between the different groups of microorganisms. The highest resolution is achieved for Z. bisporus (yellow) and Z. parabailii (blue); however, the confusion matrix shows a 2.38% error in resolving Z. bailii from Z. rouxii.
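An error percentage of this kind is typically obtained as the off-diagonal share of a confusion matrix. The sketch below shows the calculation; the counts are hypothetical and were chosen only so that one confusion among 42 samples gives about 2.38%, since the text does not state the actual number of samples.

```python
def misclassification_rate(confusion):
    """Fraction of samples lying off the diagonal of a square confusion matrix."""
    total = sum(sum(row) for row in confusion)
    if total == 0:
        raise ValueError("confusion matrix is empty")
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    return (total - correct) / total


labels = ["Z. bailii", "Z. bisporus", "Z. parabailii", "Z. rouxii"]
# Hypothetical counts (rows = true class, columns = predicted class):
# a single Z. bailii sample is assigned to Z. rouxii.
confusion = [
    [10, 0, 0, 1],
    [0, 10, 0, 0],
    [0, 0, 10, 0],
    [0, 0, 0, 11],
]
error_percent = 100.0 * misclassification_rate(confusion)
```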

Claims

FIGURES
Figure 1 shows the location of the primers corresponding to SEQ ID No. 1 to SEQ ID No. 12 within the conserved sequences of the rpoB gene.
Figure 2 shows melting curves of PCR-amplified rpoB from a plurality of samples, normalized by z-score.
Figure 3 presents the clustering of HRM-generated amplicon curves without (A) and with (B) application of the second derivative.
Figure 4 shows the statistical analysis of the average distance between microorganism 5 and the closest microorganism 7.
Figure 5 is a heat map contrasting the usual method (HRM) with the method improved by the second derivative (TAAG method) along the identity line.
Figure 6 presents the growth of microorganisms (fungi) in the dairy matrices.
Figure 7 shows the PCR results for the microorganisms evaluated in the different dairy matrices.
Figure 8 shows the first derivative of the melting curves obtained by PCR for the indicated organisms.
Figure 9 presents the Euclidean distance between the groups of microorganisms evaluated in the soft drink.
Figure 10 presents the resolution of the method evaluated on different subspecies of Zygosaccharomyces.
INDUSTRIAL APPLICATION
The present invention has application in the biotechnology, medical and food industries, since the method can be implemented for the early detection and identification of cells, both eukaryotic and prokaryotic. The material components of the method can also be produced as a kit for commercialization.
CLAIMS
1. A method for determining cellular identity from a biological sample, CHARACTERIZED in that it comprises the steps of:
a) providing said biological sample suspected of containing cells selected from eukaryotes, bacteria, viruses, fungi and protozoa, the sample being obtained from biological or non-biological matrices;
b) performing a cell separation step on the sample, where the separation allows discrimination between eukaryotic cells and/or microorganisms and/or viruses and can be carried out by physical and/or biological methods;
c) extracting the DNA from the sample and performing a polymerase chain reaction (PCR) to amplify at least one genomic region using a set of broad-taxonomic-range amplification primers selected from SEQ ID 1 to SEQ ID 83;
d) performing a high-resolution melting (HRM) curve of the amplicons generated by PCR in the previous step, where the melting temperature (Tm) of the amplicons is in the range of 60°C to 98°C;
e) converting the data generated in the previous point by mathematical transformation, where said transformation has a first step that consists of standardizing the equipment data, then generating a first positive or negative derivative, and finally a third step that consists of generating a positive derivative of the data set obtained in the previous point;
f) using the mathematically transformed data, from step two and/or step three, as inputs to machine learning algorithms to determine cellular identity, where the artificial intelligence algorithms allow the generation of a first classifier that discriminates based on the data generated by the profiles;
g) comparing the data of the suspect biological sample with a database of known organisms for cellular identification.
2. The method according to claim 1, CHARACTERIZED in that said biological sample of point a) is obtained from biological or non-biological matrices.
3. The method according to claim 2, CHARACTERIZED in that said biological sample is obtained from biological matrices originating from clinical procedures and the food industry.
4. The method according to claim 2, CHARACTERIZED in that said biological sample is obtained from non-biological matrices originating from contact surfaces made of inorganic material.
5. The method according to claims 1 to 4, CHARACTERIZED in that said sample can optionally be subjected to a cell or microbiological culture process using a growth medium to enrich the sample.
6. The method according to claim 7, CHARACTERIZED in that the culture process for microorganisms allows growth in the medium to at least 1 x 10^2 CFU/mL.
7. The method according to claim 1, CHARACTERIZED in that the cell separation step can be carried out based on the size and density of the cells, antibody affinity, light scattering, or fluorescence emission.
8. The method according to claim 1, CHARACTERIZED in that the PCR amplification reaction of point c) can be carried out by nested PCR, multiplex PCR, reverse transcriptase PCR and real-time PCR.
9. The method according to claim 9, CHARACTERIZED in that the PCR amplification reaction can be carried out by real-time PCR.
10. The method according to claim 1, CHARACTERIZED in that point d) comprises contacting the amplicon with an intercalating fluorophore before performing the HRM analysis.
11. The method according to claim 13, CHARACTERIZED in that the intercalating fluorophore is selected from the group consisting of EvaGreen and MasterMix 5X.
12. The method according to claim 1, CHARACTERIZED in that the data set generated in point e) by the first and/or second step generates discrimination parameters.
13. The method according to claim 1, CHARACTERIZED in that the data set generated by the second negative derivative of point e) generates additional discrimination parameters, such as the relative position of peaks and valleys, the height of peaks, and relationships between these, such as the width of peaks/valleys and the distance between peaks and/or valleys.
14. The method according to claim 1, CHARACTERIZED in that point f) comprises executing artificial intelligence algorithms for the grouping and/or discrimination and/or identification step, selected from algorithms such as random forest, support vector machines, k-nearest neighbors (kNN), partitioning around medoids (PAM), Naive Bayes, principal component analysis (PCA) or linear discriminant analysis (LDA).
15. The method according to claim 1, CHARACTERIZED in that the artificial intelligence algorithms allow the generation of a second classifier that discriminates based on the data generated by the profiles selected by the first classifier.
16. The method according to any of the preceding claims, CHARACTERIZED in that a database of known cells and microorganisms is generated beforehand.
17. The method according to claim 16, CHARACTERIZED in that it comprises the additional step of determining the inclusion or exclusion of the organism present in the suspect sample as a specific group of organisms present in the database.
18. The method according to claim 17, CHARACTERIZED in that the database contains organisms selected from eukaryotes, bacteria, viruses, fungi and protozoa.
19. A system for determining cellular identity, CHARACTERIZED in that it comprises:
a) a storage component to generate a data matrix of a high-resolution melting (HRM) curve of known organisms and suspect samples;
b) a computational processor for processing data that allows a mathematical transformation to be performed; and/or
c) a computational processor for processing data that allows artificial intelligence algorithms to be executed for the step of grouping and/or discrimination and/or identification of the microorganisms; and
d) a monitor that displays information regarding the identification of the microorganisms.
20. The system according to claim 19, CHARACTERIZED in that in point b) the mathematical transformation of the data obtained has a first step that consists of standardizing the equipment data, then generating a first positive or negative derivative, and finally a third step that consists of generating a positive derivative of the data set obtained in the previous point.
21. The system according to claim 19 or 20, CHARACTERIZED in that point c) uses the mathematically transformed data, from step two and/or step three, as inputs to the machine learning algorithms to determine cellular identity, where the artificial intelligence algorithms allow the generation of a first classifier that discriminates based on the data generated by the profiles.
22. The system according to any of claims 19 or 20, CHARACTERIZED in that the data set generated by the second negative derivative generates additional discrimination parameters, such as the relative position of peaks and valleys, the height of peaks, and relationships between these, such as the width of peaks/valleys and the distance between peaks and/or valleys.
23. The system according to claim 19 or 21, CHARACTERIZED in that it comprises executing artificial intelligence algorithms for the grouping and/or discrimination and/or identification step, selected from algorithms such as random forest, support vector machines, k-nearest neighbors (kNN), partitioning around medoids (PAM), Naive Bayes, principal component analysis (PCA) or linear discriminant analysis (LDA).
24. The system according to claims 19 to 23, CHARACTERIZED in that a database of known cells and microorganisms is generated beforehand.
25. The method according to claim 28, CHARACTERIZED in that it comprises the additional step of determining the inclusion or exclusion of the organism present in the suspect sample as a specific group of organisms present in the database.
26. The method according to claim 22, CHARACTERIZED in that in point d) the information is presented identifying the cell or microorganism down to the species level, together with its proportion, as a percentage, in a suspect biological sample.
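As a rough illustration of the data handling recited in claim 1, steps e) to g), the following Python sketch normalizes a melt curve, takes the negative first derivative and a further derivative, reads off the peak temperature, and assigns a sample to the closest reference profile. It is a minimal sketch under stated assumptions, not the claimed implementation: min-max scaling stands in for "standardizing the equipment data", finite differences stand in for the derivatives, a one-nearest-neighbour rule stands in for the classifiers listed in claim 14, and the sigmoid curves, function names and reference labels are all hypothetical.

```python
import math

def normalize(fluorescence):
    """Min-max scale raw fluorescence readings to [0, 1] (assumed standardization)."""
    lo, hi = min(fluorescence), max(fluorescence)
    if hi == lo:
        raise ValueError("flat curve cannot be normalized")
    return [(f - lo) / (hi - lo) for f in fluorescence]

def derivative(temps, values):
    """Finite-difference derivative: central inside, one-sided at the two ends."""
    n = len(values)
    out = []
    for i in range(n):
        j, k = max(i - 1, 0), min(i + 1, n - 1)
        out.append((values[k] - values[j]) / (temps[k] - temps[j]))
    return out

def melt_profile(temps, fluorescence):
    """Return (-dF/dT, derivative of -dF/dT) for one normalized melt curve."""
    norm = normalize(fluorescence)
    neg_d1 = [-v for v in derivative(temps, norm)]  # peaks near the Tm
    d2 = derivative(temps, neg_d1)                  # changes sign at peaks
    return neg_d1, d2

def peak_temperature(temps, neg_d1):
    """Temperature at which the negative first derivative is largest."""
    i = max(range(len(neg_d1)), key=neg_d1.__getitem__)
    return temps[i]

def nearest_reference(profile, database):
    """Label of the reference profile closest in Euclidean distance (1-NN)."""
    return min(database, key=lambda name: math.dist(profile, database[name]))

def synthetic_curve(temps, tm, width=1.0):
    """Hypothetical sigmoid melt curve, used only to exercise the sketch."""
    return [1.0 / (1.0 + math.exp((t - tm) / width)) for t in temps]

# Temperature grid covering the 60-98 degC range named in claim 1 d).
TEMPS = [60 + 0.5 * i for i in range(77)]

# Hypothetical reference database of negative-first-derivative profiles.
REFERENCE_DB = {
    "reference_A": melt_profile(TEMPS, synthetic_curve(TEMPS, 80.0))[0],
    "reference_B": melt_profile(TEMPS, synthetic_curve(TEMPS, 85.0))[0],
}

# A hypothetical unknown sample melting slightly below reference_B.
sample_neg_d1, sample_d2 = melt_profile(TEMPS, synthetic_curve(TEMPS, 84.5))
best_match = nearest_reference(sample_neg_d1, REFERENCE_DB)
```

In practice the nearest-neighbour rule would be replaced by a trained classifier, and the derivative profiles would typically be smoothed before peak and valley features are extracted.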
PCT/CL2023/050090 2022-09-29 2023-09-29 Method for determining cellular identity from a complex biological sample using pcr-hrm and mathematical analysis WO2024065069A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CL2022002672A CL2022002672A1 (en) 2022-09-29 2022-09-29 Method to determine cell identity from a complex biological sample by pcr-hrm and mathematical analysis
CL202202672 2022-09-29

Publications (1)

Publication Number Publication Date
WO2024065069A1 (en) 2024-04-04

Family

ID=85792759

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CL2023/050090 WO2024065069A1 (en) 2022-09-29 2023-09-29 Method for determining cellular identity from a complex biological sample using pcr-hrm and mathematical analysis

Country Status (2)

Country Link
CL (1) CL2022002672A1 (en)
WO (1) WO2024065069A1 (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170321257A1 (en) * 2016-05-09 2017-11-09 The Board Of Trustees Of The Leland Stanford Junior University Bacterial pathogen identification by high resolution melting analysis
WO2021011943A2 (en) * 2019-07-16 2021-01-21 Meliolabs Inc. Methods and devices for single-cell based digital high resolution melt
WO2021112673A1 (en) * 2019-12-02 2021-06-10 Inbiome B.V. Methods for identifying microbes in a clinical and non-clinical setting.
WO2021262581A1 (en) * 2020-06-22 2021-12-30 Combinati Incorporated Systems and methods for analyzing a biological sample

Non-Patent Citations (9)

* Cited by examiner, † Cited by third party
Title
ATHAMANOLAP PORNPAT, HSIEH KUANGWEN, O'KEEFE CHRISTINE M., ZHANG YE, YANG SAMUEL, WANG JEFF (TZA-HUEI): "Machine Learning-Assisted Digital PCR and Melt Enables Broad Bacteria Identification and Pheno-Molecular Antimicrobial Susceptibility Test", BIORXIV, 24 March 2019 (2019-03-24), XP055867334, Retrieved from the Internet <URL:https://www.biorxiv.org/content/10.1101/587543v1.full.pdf> [retrieved on 20211130], DOI: 10.1101/587543 *
DAS SURAJIT; DASH HIRAK R.; MANGWANI NEELAM; CHAKRABORTY JAYA; KUMARI SUPRIYA: "Understanding molecular identification and polyphasic taxonomic approaches for genetic relatedness and phylogenetic relationships of microorganisms", JOURNAL OF MICROBIOLOGICAL METHODS, ELSEVIER, AMSTERDAM,, NL, vol. 103, 1 January 1900 (1900-01-01), NL , pages 80 - 100, XP028860115, ISSN: 0167-7012, DOI: 10.1016/j.mimet.2014.05.013 *
ERDEM MINE; KESMEN ZüLAL; ÖZBEKAR ESRA; ÇETIN BüLENT; YETIM HASAN: "Application of high-resolution melting analysis for differentiation of spoilage yeasts", THE JOURNAL OF MICROBIOLOGY, THE MICROBIOLOGICAL SOCIETY OF KOREA // HAN-GUG MISAENGMUL HAG-HOE, KR, vol. 54, no. 9, 31 August 2016 (2016-08-31), KR , pages 618 - 625, XP036375057, ISSN: 1225-8873, DOI: 10.1007/s12275-016-6017-8 *
LI MEI; LU SHEN; SUN XU: "Rapid detection and genotyping of ALK fusion variants by adapter multiplex PCR and high-resolution melting analysis", LABORATORY INVESTIGATION, NATURE PUBLISHING GROUP, THE UNITED STATES AND CANADIAN ACADEMY OF PATHOLOGY, INC., vol. 100, no. 1, 22 October 2019 (2019-10-22), The United States and Canadian Academy of Pathology, Inc. , pages 110 - 119, XP036989701, ISSN: 0023-6837, DOI: 10.1038/s41374-019-0330-x *
OLIVER IBARRONDO: "A Statistical Method to Enhance the Analysis of the Differences Among High‐Resolution Melting (HRM) Curves of PCR‐Amplified DNA Fragments", JOURNAL OF FOOD SCIENCE, WILEY-BLACKWELL PUBLISHING, INC, US, vol. 84, no. 10, 1 October 2019 (2019-10-01), US , pages 2719 - 2728, XP093157243, ISSN: 0022-1147, DOI: 10.1111/1750-3841.14814 *
PABLO GOLDSCHMIDT: "New Strategy for Rapid Diagnosis and Characterization of Fungal Infections: The Example of Corneal Scrapings", PLOS ONE, PUBLIC LIBRARY OF SCIENCE, US, vol. 7, no. 7, 2 July 2012 (2012-07-02), US , pages e37660, XP093157246, ISSN: 1932-6203, DOI: 10.1371/journal.pone.0037660 *
RYUSUKE HATAE: "Precise Detection of IDH1/2 and BRAF Hotspot Mutations in Clinical Glioma Tissues by a Differential Calculus Analysis of High-Resolution Melting Data", PLOS ONE, PUBLIC LIBRARY OF SCIENCE, US, vol. 11, no. 8, 16 August 2016 (2016-08-16), US , pages e0160489, XP093157237, ISSN: 1932-6203, DOI: 10.1371/journal.pone.0160489 *
SIMON SCHIWEK: "High-Resolution Melting (HRM) Curve Assay for the Identification of Eight Fusarium Species Causing Ear Rot in Maize", PATHOGENS, MDPI AG, vol. 9, no. 4, pages 270, XP093157233, ISSN: 2076-0817, DOI: 10.3390/pathogens9040270 *
V. DEPERROIS-LAFARGE: "Use of the rpoB gene as an alternative to the V3 gene for the identification of spoilage and pathogenic bacteria species in milk and milk products ", LETTERS IN APPLIED MICROBIOLOGY, WILEY-BLACKWELL PUBLISHING LTD., GB, vol. 55, no. 2, 1 August 2012 (2012-08-01), GB , pages 99 - 108, XP093157249, ISSN: 0266-8254, DOI: 10.1111/j.1472-765X.2012.03261.x *

Also Published As

Publication number Publication date
CL2022002672A1 (en) 2023-03-24

Similar Documents

Publication Publication Date Title
Huang et al. Identification and classification for the Lactobacillus casei group
Reguant et al. Typification of Oenococcus oeni strains by multiplex RAPD‐PCR and study of population dynamics during malolactic fermentation
Rossetti et al. Rapid identification of dairy lactic acid bacteria by M13-generated, RAPD-PCR fingerprint databases
Nalbantoglu et al. Metagenomic analysis of the microbial community in kefir grains
Giannino et al. Study of microbial diversity in raw milk and fresh curd used for Fontina cheese production by culture-independent methods
EP2426220B1 (en) Tagged microorganisms and methods of tagging
Kullen et al. Use of the DNA sequence of variable regions of the 16S rRNA gene for rapid and accurate identification of bacteria in the Lactobacillus acidophilus complex
Sakamoto et al. 16S rRNA pyrosequencing-based investigation of the bacterial community in nukadoko, a pickling bed of fermented rice bran
Porcellato et al. Bacterial dynamics and functional analysis of microbial metagenomes during ripening of Dutch-type cheese
Quintela‐Baluja et al. Characterization of different food‐isolated E nterococcus strains by MALDI‐TOF mass fingerprinting
Vandamme et al. Phylogenetics and systematics
Ammor et al. Identification by fluorescence spectroscopy of lactic acid bacteria isolated from a small-scale facility producing traditional dry sausages
Xu et al. Use of multilocus sequence typing to infer genetic diversity and population structure of Lactobacillus plantarum isolates from different sources
de Boer et al. Amplicon sequencing for the quantification of spoilage microbiota in complex foods including bacterial spores
Mangia et al. Microbiological characterization using combined culture dependent and independent approaches of Casizolu pasta filata cheese
WO2015097006A1 (en) Metagenomic analysis of samples
Chen et al. Genetic relationships among Enterococcus faecalis isolates from different sources as revealed by multilocus sequence typing
Debruyne et al. Comparative performance of different PCR assays for the identification of Campylobacter jejuni and Campylobacter coli
Miteva et al. Differentiation of Lactobacillus delbrueckii subspecies by ribotyping and amplified ribosomal DNA restriction analysis (ARDRA)
Lee et al. Identification of Weissella species by matrix-assisted laser desorption/ionization time-of-flight mass spectrometry
WO2024065069A1 (en) Method for determining cellular identity from a complex biological sample using pcr-hrm and mathematical analysis
Kong et al. Safety and technological characterization of Staphylococcus xylosus and Staphylococcus pseudoxylosus isolates from fermented soybean foods of Korea
Nyanzi et al. Comparison of rpoA and pheS gene sequencing to 16S rRNA gene sequencing in identification and phylogenetic analysis of LAB from probiotic food products and supplements
WO2024065068A1 (en) Set of primers from seq no. 1 to seq no. 70 and use of said primers for cell identity
Kaur et al. DNA profiling of Leuconostoc mesenteroides strains isolated from fermented foods and farm produce in Korea by repetitive-element PCR

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 23869390

Country of ref document: EP

Kind code of ref document: A1