WO2022146103A1 - Construction of and searching method for raman scattering spectrum database through machine learning - Google Patents

Construction of and searching method for raman scattering spectrum database through machine learning

Info

Publication number
WO2022146103A1
Authority
WO
WIPO (PCT)
Prior art keywords
raman
nucleic acid
shift
machine learning
value
Prior art date
Application number
PCT/KR2021/020362
Other languages
French (fr)
Korean (ko)
Inventor
이동우
Original Assignee
모던밸류 주식회사
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 모던밸류 주식회사
Publication of WO2022146103A1

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16C COMPUTATIONAL CHEMISTRY; CHEMOINFORMATICS; COMPUTATIONAL MATERIALS SCIENCE
    • G16C20/00 Chemoinformatics, i.e. ICT specially adapted for the handling of physicochemical or structural data of chemical particles, elements, compounds or mixtures
    • G16C20/70 Machine learning, data mining or chemometrics
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N21/00 Investigating or analysing materials by the use of optical means, i.e. using sub-millimetre waves, infrared, visible or ultraviolet light
    • G01N21/62 Systems in which the material investigated is excited whereby it emits light or causes a change in wavelength of the incident light
    • G01N21/63 Systems in which the material investigated is excited whereby it emits light or causes a change in wavelength of the incident light optically excited
    • G01N21/65 Raman scattering
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N21/00 Investigating or analysing materials by the use of optical means, i.e. using sub-millimetre waves, infrared, visible or ultraviolet light
    • G01N21/62 Systems in which the material investigated is excited whereby it emits light or causes a change in wavelength of the incident light
    • G01N21/63 Systems in which the material investigated is excited whereby it emits light or causes a change in wavelength of the incident light optically excited
    • G01N21/65 Raman scattering
    • G01N21/658 Raman scattering enhancement Raman, e.g. surface plasmons
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B40/00 ICT specially adapted for biostatistics; ICT specially adapted for bioinformatics-related machine learning or data mining, e.g. knowledge discovery or pattern finding
    • G16B40/20 Supervised data analysis
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16C COMPUTATIONAL CHEMISTRY; CHEMOINFORMATICS; COMPUTATIONAL MATERIALS SCIENCE
    • G16C20/00 Chemoinformatics, i.e. ICT specially adapted for the handling of physicochemical or structural data of chemical particles, elements, compounds or mixtures
    • G16C20/20 Identification of molecular entities, parts thereof or of chemical compositions
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16C COMPUTATIONAL CHEMISTRY; CHEMOINFORMATICS; COMPUTATIONAL MATERIALS SCIENCE
    • G16C20/00 Chemoinformatics, i.e. ICT specially adapted for the handling of physicochemical or structural data of chemical particles, elements, compounds or mixtures
    • G16C20/30 Prediction of properties of chemical compounds, compositions or mixtures
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16C COMPUTATIONAL CHEMISTRY; CHEMOINFORMATICS; COMPUTATIONAL MATERIALS SCIENCE
    • G16C20/00 Chemoinformatics, i.e. ICT specially adapted for the handling of physicochemical or structural data of chemical particles, elements, compounds or mixtures
    • G16C20/40 Searching chemical structures or physicochemical data
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16C COMPUTATIONAL CHEMISTRY; CHEMOINFORMATICS; COMPUTATIONAL MATERIALS SCIENCE
    • G16C20/00 Chemoinformatics, i.e. ICT specially adapted for the handling of physicochemical or structural data of chemical particles, elements, compounds or mixtures
    • G16C20/90 Programming languages; Computing architectures; Database systems; Data warehousing

Definitions

  • the present invention provides a method for building and searching a Raman scattering spectrum database through machine learning; and an apparatus for calculating desired biometric prediction information from a Raman spectrum database by performing the method.
  • The Raman effect is a phenomenon in which, when a sample is exposed to short-wavelength incident light such as laser light, part of the incident light changes the polarizability of a molecule, and the frequency of the changed polarization resonates with a frequency within the molecule. Raman spectroscopy is a spectroscopic method that measures the intrinsic scattering frequencies of molecules arising from the Raman effect.
  • Raman spectroscopy is a method of irradiating incident light directly on a sample to be measured. It is easy to perform, it can measure even a very small amount of sample, there is no interference from moisture or carbon dioxide, and it can be used in the visible region. Therefore, Raman spectroscopy mainly uses a visible laser to detect the light that molecules scatter by the Raman effect.
  • Raman scattering occurs only in those vibrational modes of a molecule in which the polarizability changes.
  • For symmetric vibrations, the Raman spectrum is strongly generated.
  • Laser light is in-phase light of a single wavelength.
  • The laser beam is narrow and does not spread.
  • Lasers are mainly used in spectroscopy because of their precisely defined monochromatic wavelengths.
  • A pulsed laser is used to observe phenomena occurring over a short time by using a short pulse width.
  • Raman spectroscopy is known as a technique suitable for single-cell level bacterial detection because it can quickly measure intracellular lipids, nucleic acids, and proteins by using the property of laser scattering by molecular resonance. Because of the high specificity and sensitivity to cellular components, it is possible to analyze bacterial phylogeny at the level of some species using only Raman spectra. In addition, when isotopes such as carbon-13 and hydrogen-2 are used at the same time, it can be used for quantitative evaluation of changes in the physiological activity of single cells.
  • Various functions are required for biosensor development.
  • For functions such as signal transmission, communication, and signal conversion, research is mainly conducted in electrical engineering, computer engineering, and mechanical engineering.
  • For the operating part that drives the sensor, research is in progress mainly in the fields of chemistry, biotechnology, materials engineering, and chemical and biological engineering.
  • the sensitivity and function of the sensor are very important for precise examination for early diagnosis.
  • the ability to recognize an analyte is the most important key factor in determining the sensitivity of a sensor.
  • the factors that determine the sensitivity of the sensor can be divided into two main categories.
  • the first is a receptor site that recognizes the target material
  • the second is a transducer that generates a signal after recognition and converts it into a desired signal form.
  • Antibodies, aptamers, peptides, nucleic acid sequences, etc. capable of recognizing a target are attached to the receptor site, so that the recognition ability is determined according to affinity with the target.
  • When a receptor is introduced onto the sensor surface, its recognition ability is determined by the introduction method (chemical, physical, or biological) and the structural stability of the receptor, so research has been conducted to optimize these factors.
  • Raman spectroscopy has greatly improved measurement sensitivity and specificity due to recent technological developments, and thus has a high utility value in the field of microbial ecology research. Therefore, studies on bacterial identification, functional analysis, and biogeography can be supplemented to analyze the functional role of microorganisms in the environment.
  • The use of Raman spectroscopy enables real-time analysis of Raman signatures unique to bacteria (detection) and analysis of substrate specificity combined with isotopes (functional analysis), thus supplementing the limitations of existing analysis methods.
  • the present invention provides a method for constructing and searching a Raman scattering spectrum database through machine learning in order to derive desired information from data produced in a biological system; and to provide an apparatus for collecting, indexing, and storing desired biometric information from a Raman spectrum database by performing the above method.
  • Another object of the present invention is to provide computer software for deriving desired information from data produced in a biological system using a Raman spectrum database constructed through machine learning.
  • a first aspect of the present invention is a Raman scattering spectrum database construction and search method through machine learning
  • Step 3: based on (a') Raman shift values generated in step 2 and (b') Raman peaks and (c') Raman troughs at each shift value, inferring by machine learning (d) sensitivity, defined as the ratio of spectra whose signal-to-noise ratio within a given repeatability is 50% or more; (e) stability, as defined; and (f) repeatability, defined, based on the number of measurements within a given inspection time, as a spectrum with the above-defined stability of 50% or more and sensitivity of 50% or more having repeatability;
  • It provides a method for building and searching a Raman scattering spectrum database through machine learning, characterized by including.
  • A second aspect of the present invention provides a medium that transmits to a computer, or a computer-readable recording medium that stores, a program for executing at least one of the first to ninth steps, so that the Raman scattering spectrum database construction and search method through machine learning according to the first aspect is performed on the computer.
  • a third aspect of the present invention provides an apparatus for calculating desired bio-prediction information from a Raman spectrum database
  • A spectrum with repeatability, a stability of 80% or more, and a sensitivity of 90% or more is selected as the selection value within the shift, and the spectral selectivity is calculated
  • a Raman spectrum database (B) constructed by inputting: from the Raman spectrum list of the sample, (a) Raman shift values and, at each shift value, (b) Raman peaks and (c) Raman troughs; (a') Raman shift values generated by a machine learning algorithm and, at each shift value, (b') Raman peaks and (c') Raman troughs; (d) sensitivity, (e) stability and (f) repeatability, inferred from these by machine learning; and the spectral selectivity calculated by selecting a spectrum as the selection value within the corresponding shift;
  • It provides an apparatus for calculating biometric prediction information, characterized in that it comprises a.
  • the apparatus for calculating desired biometric prediction information from the Raman spectrum database according to the third aspect may perform the Raman scattering spectrum database construction and search method through machine learning according to the first aspect.
  • The apparatus for calculating desired biometric prediction information from the Raman spectrum database corresponds to a kind of apparatus for constructing and searching a Raman scattering spectrum database through machine learning.
  • The present invention is characterized by building a Raman scattering spectrum database and a pattern matching algorithm based on unsupervised machine learning techniques in order to accurately predict biometric information such as the presence and/or concentration of a specific biological material (e.g., protein, amino acid, lipid, nucleic acid), and the identification and/or concentration of cell-derived (e.g., bacteria, cancer cells, normal cells) chemical bonds, constituents, and/or cell types.
  • The biometric prediction information calculating device uses, as input information to the machine learning algorithm, (a) one or more Raman shift values included in a Raman spectrum list of a biological sample and, at each shift value, (b) the Raman highest point and (c) the Raman lowest point. By learning the feature set of the biological sample using the machine learning algorithm, it calculates or processes specific information for predicting the state of the animal or cell from which the biological sample was extracted, disease diagnosis and/or evaluation of the effectiveness of a therapeutic agent, and/or bacterial infection of the sample.
  • Surface analysis Raman spectroscopy using the localized surface plasmon resonance (LSPR) of nanoparticles may be performed.
  • surface analysis Raman spectroscopy may be performed after concentrating or filtering nanoparticles to which a detection indicator is connected.
  • The sample from which the Raman spectrum database is constructed may be a liquid sample.
  • The present invention is more useful when constructing and searching a Raman spectrum database from a liquid sample in order to predict the above-described biometric information.
  • Optionally, in the case of a liquid sample, an eighth step of separating and storing through spectral intensity indexing may be included.
  • A ninth step of filtering noise spectrum intensity using spectral pattern matching may also be included.
  • A baseline may be set as the average value of the standard deviation, at each Raman shift, of the signal experimentally output through a negative control, and all Raman spectra may be derived after performing baseline correction.
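The baseline described above can be sketched as follows. This is a minimal illustration, not the applicant's implementation: it assumes the negative-control replicates share one Raman shift axis, uses the population standard deviation, and corrects by subtracting the scalar baseline and clipping at zero. The function names and the clipping rule are assumptions introduced here.

```python
from statistics import mean, pstdev

def baseline_from_negative_control(nc_spectra):
    """Average, over all Raman shifts, of the per-shift standard deviation.

    nc_spectra: list of replicate negative-control spectra, each a list of
    intensities measured on the same Raman shift axis (assumed layout).
    """
    per_shift_sd = [pstdev(column) for column in zip(*nc_spectra)]
    return mean(per_shift_sd)

def baseline_correct(spectrum, baseline):
    """Subtract the scalar baseline; clipping at zero is an assumption."""
    return [max(0.0, value - baseline) for value in spectrum]

# Three hypothetical negative-control replicates at three shift positions.
nc = [[10.0, 12.0, 11.0], [12.0, 12.0, 13.0], [14.0, 12.0, 15.0]]
baseline = baseline_from_negative_control(nc)
corrected = baseline_correct([20.0, 1.0, 30.0], baseline)
```

Intensities that fall below the baseline are set to zero here so that later peak and trough extraction never sees negative values; other correction rules are equally compatible with the text.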
  • The first step is a step of generating a Raman spectrum of each sample by deriving, from each sample, (a) one or more Raman shift values and, at each shift value, (b) the Raman peak (highest point), which is the highest value among the Raman intensities on the vertical axis, and (c) the Raman trough (lowest point), which is the lowest value among the Raman intensities on the vertical axis.
  • When light (laser) is incident on a target molecular material and is scattered, scattering in which the energy of the scattered light differs from that of the incident light is called Raman scattering, and the change in the energy level is called the Raman shift.
  • Raman values are expressed not as the wavelength of the Raman-scattered light itself but as the Raman shift in wavenumbers (cm⁻¹).
  • the Raman shift value may be derived by Equation 1 or Equation 2 below.
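Equations 1 and 2 are not reproduced in this text. As a hedged illustration only, the sketch below uses the standard textbook conversion between wavelengths and Raman shift, Δν̃ = (1/λ₀ − 1/λ₁) × 10⁷ with wavelengths in nm and the shift in cm⁻¹; it is not claimed to be the patent's Equation 1 or 2. The function names are assumptions.

```python
def raman_shift_cm1(excitation_nm: float, scattered_nm: float) -> float:
    """Raman shift in cm^-1 from excitation and scattered wavelengths in nm."""
    # 1 nm = 1e-7 cm, so (1 / wavelength_in_nm) * 1e7 is a wavenumber in cm^-1.
    return (1.0 / excitation_nm - 1.0 / scattered_nm) * 1e7

def scattered_wavelength_nm(excitation_nm: float, shift_cm1: float) -> float:
    """Inverse relation: wavelength of the scattered light for a given shift."""
    return 1.0 / (1.0 / excitation_nm - shift_cm1 / 1e7)

# Stokes line at a 1004 cm^-1 shift under 532 nm excitation.
stokes_nm = scattered_wavelength_nm(532.0, 1004.0)
```

A positive shift corresponds to scattered light of longer wavelength than the excitation (Stokes scattering).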
  • one or more Raman shift values may be derived from one sample.
  • (b) Raman highest point and (c) Raman lowest point are obtained from each Raman shift value.
  • Within the Raman shift range of 400 cm⁻¹ to 3200 cm⁻¹, for example within the Raman shift range of the inspection equipment (e.g., 500 cm⁻¹ to 2,000 cm⁻¹), the lowest and highest points can be obtained from each shift value while moving by 1 to 5 shifts, preferably by 1 shift, at a time.
  • Because Raman intensity is generally expressed in arbitrary units (a.u.), it is not an absolute number, and the Raman peak can be defined as the relatively highest intensity within the Raman shift range.
  • The Raman peak may be derived from the relatively highest value among the Raman intensities on the vertical axis at each Raman shift value.
  • Likewise, because Raman intensity (a.u.) is not an absolute number, the Raman lowest point can be defined as the relatively lowest intensity within the Raman shift range.
  • The Raman lowest point may be derived from the relatively lowest value among the Raman intensities on the vertical axis at each Raman shift value.
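One way to read the extraction of highest and lowest points is as a sliding window over the shift axis. The sketch below is an assumption-laden illustration: the text specifies only the step (1 to 5 shifts), so the `window` width, the function name, and the choice to label each row by the window's first shift value are all hypothetical.

```python
def window_extrema(shifts, intensities, step=1, window=3):
    """Return (shift, highest, lowest) for each window along the shift axis.

    step   -- how many shift positions to advance per window (1 to 5 in the text)
    window -- number of positions per window (hypothetical parameter)
    """
    rows = []
    for start in range(0, len(shifts) - window + 1, step):
        segment = intensities[start:start + window]
        rows.append((shifts[start], max(segment), min(segment)))
    return rows

# Hypothetical intensities in arbitrary units on a 1 cm^-1 grid.
shifts = [500, 501, 502, 503, 504]
intensities = [0.2, 0.9, 0.4, 0.1, 0.7]
rows = window_extrema(shifts, intensities, step=1, window=3)
```

Because intensities are in arbitrary units, the extrema are only meaningful relative to other values in the same spectrum, which matches the relative definition of peak and lowest point given above.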
  • The second step is a step of generating (a') Raman shift values and, at each shift value, (b') Raman peaks and (c') Raman troughs, learned according to a machine learning algorithm that performs step 2-1 of section learning the Raman shift values (a) of the first step, step 2-2 of cluster learning the Raman peaks (b) of the first step, and step 2-3 of cluster learning the Raman lowest points (c) of the first step.
  • the second step is performed to select a clustering candidate group through unsupervised machine learning, and is a preprocessing process to reduce over-fitting as much as possible.
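The text does not name the clustering algorithm used in the second step. Purely as an illustration of unsupervised clustering of peak positions, the sketch below uses a one-dimensional k-means with fixed starting centers; the choice of k-means, the starting centers, and the function name are assumptions and not the method of the invention.

```python
def kmeans_1d(values, centers, iterations=10):
    """Plain 1-D k-means: assign each value to its nearest center, then
    move each center to the mean of its assigned values."""
    clusters = [[] for _ in centers]
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for v in values:
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            clusters[nearest].append(v)
        # An empty cluster keeps its previous center.
        centers = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return centers, clusters

# Hypothetical peak positions (cm^-1) pooled from repeated measurements.
peak_shifts = [1002, 1004, 1005, 1448, 1450, 1453]
centers, clusters = kmeans_1d(peak_shifts, centers=[1000.0, 1500.0])
```

Each resulting cluster is a candidate group of peaks from which main peaks could later be chosen.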
  • Machine learning can collect training data through an algorithm and then create a more accurate model based on that data.
  • (a') Raman shift values and, at each shift value, (b') Raman peaks and (c') Raman troughs are outputs generated when a machine learning algorithm is trained on data, and they constitute a machine learning model. After training, input provided to the machine learning model yields an output.
  • Step 3 is a step of inferring, by machine learning, (d) sensitivity, (e) stability, and (f) repeatability based on (a') Raman shift values generated in Step 2 and (b') Raman peaks and (c') Raman troughs at each shift value.
  • The third step selects main peaks capable of substantially identifying a molecule from among the Raman peak clusters formed in the previous step; peaks that satisfy the following conditions are selected as final candidates for the main peak.
  • sensitivity may be defined as a ratio of a spectrum in which a signal to noise ratio within a given repeatability is 50% or more.
  • stability may be defined as a spectrum composition ratio in which the distribution range of a given spectrum value is above the standard deviation (σ) range of the normal distribution mean (μ), that is, (μ-σ, μ+σ).
  • repeatability may be defined, based on the number of measurements within a given inspection time, as the property of a spectrum that has a stability of 50% or more and a sensitivity of 50% or more, as defined above.
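The three definitions above can be sketched numerically as follows. Several points are assumptions made for illustration: signal-to-noise ratios are taken as already normalized to the range 0 to 1 so that "50%" means 0.5, stability is read as the fraction of measured values falling inside (μ-σ, μ+σ), and the function names are hypothetical.

```python
from statistics import mean, pstdev

def sensitivity(snr_values, threshold=0.5):
    """Fraction of spectra whose (normalized) signal-to-noise ratio is >= 50%."""
    return sum(s >= threshold for s in snr_values) / len(snr_values)

def stability(values):
    """Fraction of values inside (mu - sigma, mu + sigma); one possible reading."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return 1.0  # identical measurements are treated as fully stable
    return sum(mu - sigma < v < mu + sigma for v in values) / len(values)

def is_repeatable(sens, stab, threshold=0.5):
    """Repeatability: stability and sensitivity both at least 50%."""
    return sens >= threshold and stab >= threshold

# Hypothetical repeated measurements at one Raman shift.
snr = [0.8, 0.6, 0.4, 0.9]
peak_intensities = [10.0, 10.5, 9.5, 10.2, 14.0]
sens = sensitivity(snr)
stab = stability(peak_intensities)
repeatable = is_repeatable(sens, stab)
```

In this example one outlying intensity (14.0) falls outside the one-sigma band, so stability is 0.8 rather than 1.0.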
  • the fourth step is to calculate the fractional bandwidth.
  • Fractional Bandwidth can be said to be a relative bandwidth, which means the bandwidth with respect to the center frequency.
  • bandwidth is not simply a matter to be considered absolutely, but rather a concept to be considered relative to the center frequency.
  • the fourth step is to select only a single peak within the same fractional bandwidth as a representative among the final selected main peak candidates. If there are several main peak candidates within the same fractional bandwidth, the peak that best meets the condition is selected among them, and if the same score is obtained, the left candidate peak closer to the center frequency is selected.
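The fourth step can be illustrated with a short sketch. It assumes that each band is described by a center value and a fractional bandwidth `fb` (bandwidth divided by center), that each candidate carries a numeric score for how well it meets the conditions, and that ties are broken by distance to the center and then by the lower (left) shift. The scoring, the band layout, and the function names are assumptions; the text does not specify them.

```python
def fractional_bandwidth(low, high):
    """Bandwidth relative to the center frequency of the band."""
    center = (low + high) / 2
    return (high - low) / center

def pick_representatives(candidates, centers, fb):
    """Keep one peak per band.

    candidates -- list of (shift, score) pairs (score is hypothetical)
    centers    -- band centers; each band spans center * (1 -/+ fb / 2)
    """
    chosen = {}
    for center in centers:
        half = center * fb / 2
        in_band = [(s, score) for s, score in candidates
                   if center - half <= s <= center + half]
        if in_band:
            # Best score first; ties go to the peak nearest the center, then the left one.
            best = min(in_band, key=lambda c: (-c[1], abs(c[0] - center), c[0]))
            chosen[center] = best[0]
    return chosen

candidates = [(995, 0.9), (1004, 0.9), (1012, 0.7), (1650, 0.8)]
chosen = pick_representatives(candidates, centers=[1000, 1650], fb=0.04)
```

Here the peaks at 995 and 1004 tie on score, and 1004 is kept because it lies closer to the band center of 1000.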
  • Step 5 is a step of calculating spectral selectivity by selecting, as the selection value within the shift, a spectrum that has repeatability as defined in the third step, a stability of 80% or more, and a sensitivity of 90% or more.
  • the fifth step is to select one representative main peak that best fits the condition among the main peak candidates within each fractional bandwidth. Through this, representative peaks to be used for identification of the final molecule are selected.
  • For selectivity, a spectrum that has repeatability as defined above, a stability of 80% or more, and a sensitivity of 90% or more is selected as the selection value within the shift.
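The selection rule can be written as a simple predicate. The thresholds (stability of 80% or more, sensitivity of 90% or more, plus repeatability) come from the text; how the spectral selectivity value itself is computed is not stated, so the `selection_fraction` below is only a hypothetical summary, and all names and sample values are assumptions.

```python
def is_selected(repeatable, stab, sens, min_stab=0.8, min_sens=0.9):
    """True when a spectrum qualifies as the selection value within its shift."""
    return repeatable and stab >= min_stab and sens >= min_sens

# Hypothetical per-shift metrics, expressed as fractions rather than percentages.
spectra = [
    {"shift": 1004, "repeatable": True, "stability": 0.85, "sensitivity": 0.95},
    {"shift": 1450, "repeatable": True, "stability": 0.75, "sensitivity": 0.95},
    {"shift": 1654, "repeatable": False, "stability": 0.90, "sensitivity": 0.92},
]
selected = [s["shift"] for s in spectra
            if is_selected(s["repeatable"], s["stability"], s["sensitivity"])]
# Hypothetical summary figure: share of candidate spectra that were selected.
selection_fraction = len(selected) / len(spectra)
```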
  • Step 6 is a step of building a Raman spectrum database by inputting (a) Raman shift values of the first step and (b) Raman peaks and (c) Raman lowest points at each shift value; (a') Raman shift values generated by machine learning in the second step and (b') Raman peaks and (c') Raman troughs at each shift value; (d) sensitivity, (e) stability and (f) repeatability, inferred by machine learning in the third step; and the spectral selectivity calculated in step 5.
  • The second data set constructed through machine learning may be (a') Raman shift values generated by learning in the second step and (b') Raman peaks and (c') Raman lowest points at each shift value; this may also be described as the Raman spectra derived through machine learning for the corresponding sample.
  • the third data set inferred by machine learning includes: (d) sensitivity, (e) stability and (f) repeatability, inferred by machine learning in the third step; and the spectral selectivity calculated in the fifth step.
  • the first data set, the second data set, and the third data set may be constructed as the Raman scattering spectrum database of the present invention.
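A database entry holding the three data sets could be laid out as in the sketch below. The field names, the in-memory dictionary standing in for a database, and the sample values are all hypothetical; the text specifies only which quantities are stored, not a schema.

```python
from dataclasses import dataclass
from typing import Dict, List

@dataclass
class SpectrumPoints:
    """Shift values with the peak and lowest point at each shift."""
    shifts: List[float]
    peaks: List[float]
    troughs: List[float]

@dataclass
class InferredMetrics:
    """Quantities inferred in steps 3 and 5."""
    sensitivity: float
    stability: float
    repeatability: bool
    selectivity: float

@dataclass
class RamanRecord:
    sample_id: str
    measured: SpectrumPoints   # first data set: (a), (b), (c)
    learned: SpectrumPoints    # second data set: (a'), (b'), (c')
    inferred: InferredMetrics  # third data set: (d), (e), (f), selectivity

database: Dict[str, RamanRecord] = {}

def add_record(record: RamanRecord) -> None:
    """Store a record under its sample identifier (hypothetical key)."""
    database[record.sample_id] = record

add_record(RamanRecord(
    sample_id="sample-001",
    measured=SpectrumPoints([1004.0], [0.9], [0.1]),
    learned=SpectrumPoints([1003.7], [0.88], [0.12]),
    inferred=InferredMetrics(0.95, 0.85, True, 0.9),
))
```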
  • the sixth step of constructing the Raman spectrum database may store the following items (i) to (iv):
  • Step 7 is a step of calculating desired prediction information from the Raman spectrum database constructed in step 6 when (a) the Raman shift values of the sample and, at each shift value, (b) the Raman highest point and (c) the Raman lowest point are input as in step 1.
  • The prediction information calculated in the seventh step may be one or more output values obtained through a specific function by inputting one or more values selected from the group consisting of: (a') Raman shift values generated by learning in the second step and (b') Raman highest points and (c') Raman lowest points at each shift value; (d) sensitivity, (e) stability and (f) repeatability, inferred by machine learning in the third step; and the spectral selectivity calculated in the fifth step.
  • the function may have a relationship as shown in FIG. 5 .
  • The prediction information (output value) output through the function (hidden layer) may be state information of the animal or cell from which the sample was extracted (the sample being the target from which the Raman shift values (a) are derived in the first step), disease diagnosis and/or evaluation of the effect of a therapeutic agent, and/or bacterial infection information of the sample.
  • The prediction information (output value) output through the function may be the presence and/or concentration of specific biomaterials (e.g., proteins, amino acids, lipids, nucleic acids), or the identification and/or concentration of cell-derived (e.g., bacteria, cancer cells, normal cells) chemical bonds, constituents, and/or cell types.
  • noise spectrum intensity reduction filtering using spectral pattern matching is performed in order to distinguish and build a Raman spectrum database dedicated to a liquid sample in the sixth step of constructing the Raman spectrum database.
  • the ninth step may be to determine the spectral pattern matching based on the coincidence rate with the Raman shift values selected based on the material.
  • Determining the spectral pattern matching by the coincidence rate with the Raman shift values selected based on the material may be by using at least one of the following methods (i) to (iv):
  • Noise is defined as a signal-to-noise ratio of 50% or less at each shift
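Methods (i) to (iv) are not reproduced in this text. The sketch below therefore shows only one hypothetical way to score spectral pattern matching by coincidence rate and to drop noise: a reference shift counts as matched when an observed shift lies within a tolerance, and points with a normalized signal-to-noise ratio of 0.5 or less are treated as noise. The tolerance, the normalization, and the function names are assumptions.

```python
def drop_noise(points, snr_threshold=0.5):
    """Keep (shift, intensity) only where the normalized SNR exceeds 50%."""
    return [(shift, intensity) for shift, intensity, snr in points
            if snr > snr_threshold]

def coincidence_rate(observed_shifts, reference_shifts, tolerance=2.0):
    """Fraction of reference shifts matched by an observed shift within tolerance."""
    matched = sum(
        any(abs(obs - ref) <= tolerance for obs in observed_shifts)
        for ref in reference_shifts
    )
    return matched / len(reference_shifts)

# Hypothetical measured points: (shift in cm^-1, intensity, normalized SNR).
points = [(1004, 5.0, 0.9), (1100, 1.0, 0.3), (1654, 4.0, 0.5)]
kept = drop_noise(points)
rate = coincidence_rate([1003.5, 1449.0, 1700.0], [1004, 1450, 1654])
```

A threshold on `rate` could then decide whether the sample matches the material whose reference shifts were used; the text does not give such a threshold.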
  • the Raman scattering spectrum database construction and search method through machine learning according to the present invention can solve this problem.
  • FIG. 1 is a schematic diagram of an algorithm driving a Raman scattering spectrum database construction and search method through machine learning according to the present invention.
  • FIG. 2 is a schematic diagram for explaining the principle of Raman scattering.
  • FIG. 3 is an exemplary diagram for explaining a method of obtaining (a) one or more Raman shift values, (b) Raman highest point, and (c) Raman lowest point from each shift value, which is Raman spectrum information of a sample.
  • AI artificial intelligence
  • FIG. 5 is a schematic diagram illustrating the architecture of a neural network as a kind of function.
  • FIG. 6 is a schematic diagram of propagating plasmons and localized surface plasmon resonance (LSPR).
  • FIG. 7 is a schematic diagram illustrating components of a biosensor and their relationship.
  • FIG. 8 is a schematic diagram illustrating the operating principle of a biosensor using metal nanoparticles exhibiting localized surface plasmon resonance (LSPR).
  • NAW structure nucleic acid-based self-assembly complex
  • FIG. 10 is a graph showing the results of 100 repeated measurements of the Raman signal obtained using the NEW structure prepared in Example 1.
  • FIG. 11 shows the Raman signal obtained using the NEW construct prepared in Example 1. The left graph is the NC (negative control) state, in which the NEW construct is not dissociated and remains fully intact in the absence of the target nucleic acid, and the right graph is the PC (positive control) state, in which the structure is completely dissociated in the presence of the target nucleic acid.
  • FIG. 13 is a conceptual schematic diagram of an exemplary computer server used to process the systems and methods described herein.
  • Machine learning is a form of AI that enables systems to learn from data rather than through explicit programming (FIG. 4).
  • machine learning is not a simple process. After collecting training data through an algorithm, a more accurate model can be created based on that data.
  • a machine learning model is an output generated when training a machine learning algorithm using data. After training, you provide input to the model and you will receive the output. For example, in a predictive algorithm, a predictive model is generated. Then, you provide data to a predictive model, and you receive predictions based on the data that you trained on that model.
  • Machine learning allows a model to be trained on a data set before it is deployed.
  • Some machine learning models are online and persistent. This iterative process of the online model can improve the types of connections made between data elements. These patterns and associations are easy to overlook by users due to their complexity and size. After training the model, you can use the model in real time to learn from the data. The learning process and automation involved in machine learning can improve accuracy.
  • Machine learning techniques are needed to improve the accuracy of predictive models. Based on data type and capacity, there are various approaches such as:
  • Supervised learning usually begins with a solid understanding of the data set that has been built and how to classify that data.
  • Supervised learning is a way to find patterns in data that can be applied to the analytic process. These data have classified functions that define the meaning of the data.
  • Unsupervised learning is used when a problem requires a huge amount of unclassified data. Understanding the meaning behind these data requires algorithms that classify data based on discovered patterns or clusters. Self-learning performs an iterative process and analyzes data without user intervention.
  • Reinforcement learning is a behavioral learning model. Feedback from data analysis is applied to the algorithm to guide users to optimal results. Reinforcement learning differs from supervised learning in that the system is not trained using a sample data set. Instead, the system learns through trial and error. Thus, a series of successful decisions reinforces the process, because it solves the problem at hand most effectively.
  • Deep learning is a specific machine learning methodology that integrates neural networks into successive layers so that they can iteratively learn from data. Deep learning is especially useful when learning patterns from unstructured data. Thus, computers can be trained to deal with poorly defined abstractions and problems.
  • Big data can increase the reliability of raw data and learning results by pre-processing data suitable for the purpose of use, and the presence of big data can help to increase the accuracy of machine learning models. Big data can be used to virtualize data so that it can be stored in the most efficient and cost-effective way, whether on premises or in the cloud. Additionally, improvements in network speed and reliability may remove other physical limitations associated with managing large amounts of data at acceptable rates.
  • When strong light of a single wavelength is irradiated onto a material, most of it undergoes elastic scattering, but part of the light is used for molecular resonance and is scattered with a different frequency (inelastic scattering). Raman spectroscopy is a method of analyzing the chemical composition and structure of molecules using this Raman scattering phenomenon, the Raman effect. The degree of shift of Raman-scattered light relative to elastic scattering is called the Raman shift, and the characteristics of a medium can be expressed by plotting it as a spectrum.
  • Raman spectroscopy can examine cellular components such as bacterial proteins, lipids, and nucleic acids, and the different signal intensities measured in the Raman shift range of 400 to 3200 cm⁻¹, which depend on the characteristics of the components, can be expressed as Raman spectra. Theoretically, each bacterium has its own Raman spectrum, and some bacteria can be distinguished at the species level. In addition, the expression of genetic traits according to various environmental conditions affects the cell composition, which appears as a change in the Raman spectra, allowing information on the cell status within the same species to be confirmed. In general, when measuring single cells, a 532 nm laser, which has the least background effect from fluorescence, is used. Table 1 shows the Raman shift information of chemical bonds and cellular components exhibiting resonance with a 532 nm laser.
  • The amino acid phenylalanine, found in general Raman spectra of bacteria, can be measured at a Raman shift of 1004 cm⁻¹ and is used as a major factor in determining the accuracy of Raman spectra of bacteria.
  • Raman spectroscopy is used to detect infectious bacteria and diagnose diseases caused by bacterial infection.
  • Urinary tract infection can be diagnosed by detecting Escherichia coli and Enterococcus faecalis, which are the main causes of inflammation, in clinical samples, and the technique can be applied to evaluating the effect of prescribed antibiotics.
  • For E. coli, the treatment effect of the four antibiotics ampicillin, ciprofloxacin, gentamicin, and sulfamethoxazole can be directly confirmed.
  • Tuberculosis can be diagnosed by constructing a DB for Mycobacterium tuberculosis, which is known as a major pathogen of tuberculosis.
  • By applying Raman spectroscopy to real-time bacterial detection technology, it can be utilized for disease diagnosis.
  • Raman spectroscopy is being actively used in the food field.
  • Salmonella spp., Escherichia coli, Pseudomonas aeruginosa, Listeria monocytogenes, Legionella spp., and Staphylococcus aureus, which cause foodborne illness, can be detected quickly.
  • In the case of Salmonella spp. detection technology, the use of Raman spectroscopy is so common in the food field that ISO international standards have been established for various foods.
  • Pseudomonas aeruginosa and Legionella spp. can be detected from tap water and commercially available drinking samples and used for water quality health management.
  • A method of directly irradiating cells in Raman spectroscopy is incorporated into the disease diagnosis process and used to prevent diseases such as cancer.
  • Breast cancer cells had a lower signal intensity than normal cells at a Raman shift of 1003 cm⁻¹.
  • The signal intensity at 740 cm⁻¹ and 1654 cm⁻¹ of platelets extracted from mice transplanted with an Alzheimer's gene is known to be higher than that of the normal control group. With such high measurement sensitivity and specificity, Raman spectroscopy can be applied to disease diagnosis by discriminating minute differences in Raman scattering signals.
  • Physiological activity analysis using SIP-Raman technology is a method of analyzing bacteria using substrates containing stable isotopes such as carbon-13 and nitrogen-15 in order to study their specificity for particular substrates.
  • Cell components such as nucleic acids and amino acids are labeled with stable isotopes; the labeled components show values different from the existing Raman shift measurements and are thereby distinguished from bacteria cultured on a general substrate.
  • The Raman shift of carbon-12 phenylalanine is observed at 1004 cm⁻¹,
  • whereas phenylalanine labeled with carbon-13 shows a Raman signal at 967 cm⁻¹.
  • it is possible to directly measure the physiological activity of bacteria by comparing the Raman shift difference that can be distinguished when cell components are substituted with isotopes of carbon-13, nitrogen-15, and hydrogen-2.
  • bacterial single cells can be measured.
  • The phenylalanine of Acidovorax, a bacterium that decomposes naphthalene, showed a Raman signal at a Raman shift of 967 cm⁻¹, unlike other bacterial single cells, which verifies that Acidovorax is a bacterium that uses naphthalene as a carbon source.
  • The Raman signal of phenylalanine of the labeled cyanobacteria appears at 991 cm⁻¹. The intensity of the Raman shift changes in proportion to the fraction of cell constituents labeled with carbon-13. This means that bacterial activity on a substrate can be analyzed quantitatively using the intensity information of the Raman shift.
  • When the labeling degree of the carbon-13 substrate is less than 10%, it cannot be distinguished from carbon-12 because of the detection limit of Raman spectroscopy.
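  • The proportionality between band intensity and labeling degree described above can be sketched as follows. This is only an illustrative two-band model, assuming baseline-corrected intensities of the unlabeled (1004 cm⁻¹) and labeled (967 cm⁻¹) phenylalanine bands; the function name and the ratio formula are assumptions, and only the 10% detection limit is taken from the text.

```python
from typing import Optional

# Labeling degree below which carbon-13 cannot be told apart from carbon-12
# (value taken from the text above).
DETECTION_LIMIT = 0.10


def labeled_fraction(i_unlabeled: float, i_labeled: float) -> Optional[float]:
    """Estimate the carbon-13 labeled fraction from two band intensities.

    i_unlabeled: intensity of the carbon-12 phenylalanine band (1004 cm^-1).
    i_labeled:   intensity of the carbon-13 phenylalanine band (967 cm^-1).
    Returns None when nothing was measured or when the estimate falls below
    the detection limit and therefore should not be reported.
    """
    if i_unlabeled < 0 or i_labeled < 0:
        raise ValueError("intensities must be non-negative")
    total = i_unlabeled + i_labeled
    if total == 0:
        return None
    fraction = i_labeled / total
    return fraction if fraction >= DETECTION_LIMIT else None
```

  • Returning no value below the limit, rather than a small number, keeps downstream analysis from treating indistinguishable samples as weakly labeled ones.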
  • When a nitrogen isotope is used, the change in Raman shift is relatively smaller than when a carbon isotope is used, and the change occurs mainly in the Raman shift values of nucleic acids, unlike carbon, which showed a difference in amino acids. Even these changes are buried among other Raman shifts, making them very difficult to distinguish in complex samples.
  • When E. coli cultured in media with different ratios of nitrogen-14 ammonium chloride and nitrogen-15 ammonium chloride is measured by Raman spectroscopy, no change in the position of the Raman shift can be confirmed, but the signal intensity increases in proportion to the concentration of the isotope supplied.
  • Because lipids make up an extensive portion of bacteria, the technique can be applied to monitoring lipid metabolism using imaging techniques in addition to simple detection.
  • With a general substrate labeled with hydrogen-2 isotopes, for example in Geobacter metallireducens cultured with hydrogen-2-substituted acetic acid as a substrate, it is difficult to observe a change in the Raman signal caused by the isotope.
  • It is possible to analyze physiological activity by replacing the culture medium with heavy water (D2O). Because hydrogen from the culture medium is incorporated during lipid biosynthesis, bacteria with substrate activity can be analyzed through the Raman shift without using isotope-substituted substrates.
  • the analyte to be detected may include amino acids, peptides, polypeptides, proteins, glycoproteins, lipoproteins, nucleosides, nucleotides, oligonucleotides, nucleic acids, sugars, carbohydrates, oligosaccharides, polysaccharides, fatty acids, lipids, hormones, metabolites, cytokines, chemokines, receptors, neurotransmitters, antigens, allergens, antibodies, substrates, metabolites, cofactors, inhibitors, drugs, pharmaceuticals, nutrients, prions, toxins, poisons, explosives, pesticides, chemicals inorganic agents, biohazardous agents, radioisotopes, vitamins, heterocyclic aromatic compounds, carcinogens, mutagens, anesthetics, amphetamines, barbiturates, hallucinogens, wastes or contaminants, and the like.
  • the analyte is a nucleic acid
  • the nucleic acid is a gene, viral RNA and DNA, bacterial DNA, fungal DNA, mammalian DNA, cDNA, mRNA, RNA and DNA fragments, oligonucleotides, synthetic oligonucleotides, modified oligonucleotides, single stranded and double-stranded nucleic acids, natural and synthetic nucleic acids.
  • biomolecules capable of recognizing the analyte include antibodies, antibody fragments, genetically engineered antibodies, single chain antibodies, receptor proteins, binding proteins, enzymes, inhibitor proteins, lectins, cell adhesion proteins, oligonucleotides, polynucleotides, nucleic acids or aptamers.
  • When an analyte is detected with a Raman signal for a known Raman shift value, and (a) the analyte itself or (b) a biomolecule capable of recognizing the analyte is a molecule or compound in which polarization occurs, the Raman signal for the Raman shift value of (a) the analyte itself or (b) the biomolecule capable of recognizing the analyte may be measured.
  • It is also possible to link a Raman marker, described later, to (a) the analyte itself or (b) a biomolecule capable of recognizing the analyte, and to measure the Raman signal at the Raman shift value of the Raman marker.
  • Au and Ag have a high free-electron density compared with other metals and are very stable because of their relatively low ionization tendency. The high free-electron density also makes the real part of the metal's dielectric constant negative and gives the metal a large polarization, causing strong electric field enhancement. The imaginary part indicates the degree of light absorption, which is an energy loss, so its value must be small for effective enhancement.
  • In the case of Au, the real part of the dielectric constant is relatively low at about 630 nm in the visible region, where the imaginary part is also at its lowest.
  • In the case of Ag, when both the real part and the imaginary part of the permittivity are considered, it has values that allow efficient enhancement at about 530 nm.
  • SPR surface plasmon resonance
  • SERS Surface enhanced Raman spectroscopy
  • The use of metal nanoparticles such as silver or gold causes plasmon resonance on the material surface and amplifies the Raman scattering signal. That is, the technique exploits the phenomenon in which the Raman scattering signal of a target molecule is greatly increased when the molecule is present in the vicinity of a metal nanostructure.
  • One of the advantages of surface-enhanced Raman scattering analysis is that it can provide information that is difficult to obtain with general Raman analysis.
  • In surface-enhanced Raman spectroscopy (SERS) using metal nanoparticles such as silver (Ag) or gold (Au), the Raman signal of the sample adsorbed on the surface of the nanoparticles can be amplified and detected through the interaction between the metal nanoparticles and incident light.
  • the amplification degree of the signal varies depending on the shape and size of the metal nanoparticles and the type of metal, and also depends on the angle, wavelength, and polarization of the incident light.
  • Although SERS overcomes the shortcoming of the small scattering cross-section of Raman spectroscopy, high-resolution Raman images cannot be obtained because of the diffraction limit of the optical system.
  • TERS tip-enhanced Raman spectroscopy
  • Metal nanoparticles are actively used in in vivo and in vitro diagnostic fields due to their excellent durability and unique physical, chemical, and electrochemical properties according to their size.
  • The signal generated based on the material, shape, and size of the metal nanoparticles has the advantage of remaining stable for a long time, because a unique signal can be produced without an additional labeling material.
  • Another advantage of metal nanoparticles is that they can amplify the signal generation of fluorescent substances and small molecule labeling substances based on the material properties of the metal.
  • the plasmon resonance phenomenon of metal nanoparticles can have the effect of amplifying optical properties such as Raman signals and fluorescent molecular signals.
  • the surface-modified metal nanoparticles can improve the function of the sensor, such as amplification of an electrochemical signal, and improving sensitivity and selectivity.
  • metal nanoparticles are being used variously for clinical, pharmaceutical, and cancer treatment delivery along with their use as sensors.
  • Metal nanoparticles exhibiting localized surface plasmon resonance can be synthesized as metal nanoparticles themselves, biofunctionalized metal nanoparticles, metal nanocomposites, or nanohybrids ( FIG. 8 ).
  • With nanoparticles, the size is easy to control and various materials can be used depending on the purpose; because they are often synthesized in aqueous solution, a large amount of material can be synthesized relatively easily.
  • The high surface-area-to-volume ratio of metal nanoparticles increases the efficiency of catalysts or improves the sensitivity of sensors, which can have many advantages for clinical applications. As a result, cells and biomarkers present in small amounts in the human body can be detected with high sensitivity in medical diagnosis and clinical analysis, and local tissue sites can be examined in detail. Metal nanomaterial-based electrochemical sensor and biosensor platforms can be developed and used to detect very small amounts of samples, mainly of clinical and biological origin, and to find biomedically important analytes.
  • a Raman scattering signal of a Raman indicator may be obtained through Raman spectroscopy.
  • Raman indicators may be organic or inorganic molecules, atoms, complexes or synthetic molecules, dyes, naturally occurring dyes (phycoerythrin, etc.), organic nanostructures such as C60, buckyballs, carbon nanotubes, quantum dots, organic fluorescent molecules, or the like.
  • Examples of Raman indicators include FAM, Dabcyl, TRITC (tetramethyl rhodamine-5-isothiocyanate), MGITC (malachite green isothiocyanate), XRITC (X-rhodamine-5-isothiocyanate), DTDC (3,3-diethylthiadicarbocyanine iodide), TRIT (tetramethyl rhodamine isothiol), NBD (7-nitrobenz-2-oxa-1,3-diazole), phthalic acid, terephthalic acid, isophthalic acid, para-aminobenzoic acid, erythrosine, biotin, digoxigenin, 5-carboxy-4',5'-dichloro-2',7'-dimethoxyfluorescein, 5-carboxy-2',4',5',7'-tetrachlorofluorescein, 5-carboxyfluorescein,
  • The Raman indicator should show a clear Raman spectrum; preferably, it is an organic fluorescent molecule such as a cyanine-based fluorescent molecule (Cy3, Cy3.5, or Cy5) or a FAM-, Dabcyl-, or rhodamine-based fluorescent molecule.
  • Organic fluorescent molecules have the advantage of being able to detect higher Raman scattering signals by resonating with the excitation laser wavelength used for Raman analysis.
  • A fluorescent substance is a substance that absorbs light according to its unique structure and undergoes a radiative process in which the energy is emitted again as light when the excited molecule loses energy and returns to the stable ground state.
  • Such a fluorescent material can be used in liquid-based assays or in imaging in a living environment. Because its signal (or color) can be controlled through its molecular structure, multiplexed detection is possible, and because some fluorescent molecules have affinity only for a specific material, selective reactions are possible. However, when these fluorescent materials are exposed to light for a long time, the signal intensity decreases, which makes long-term monitoring difficult, and the detectable fluorescence intensity is weak. To overcome this, a large amount of fluorescent material can be integrated, or the fluorescence intensity can be amplified using metal nanoparticles.
  • A fluorescent material connected to a nanoparticle exhibiting local surface plasmon resonance can have its signal amplified, which may increase the sensitivity.
  • In the case of gold nanoparticles, the Au-thiol reaction makes it easy to modify the surface and attach a large amount of material to the nanoparticle surface, which makes it possible to detect biochemicals.
  • Gold nanoparticles not only have a very high extinction coefficient, but also can act as a quencher because they have a broad absorption wavelength that overlaps most of the emission wavelengths of commonly used energy donors. By utilizing these characteristics, it is possible to develop a sensor with an on/off signal system.
  • DNA with a hairpin structure is designed, and a fluorescent material is attached to the end of gold nanoparticles that exhibit local surface plasmon resonance (LSPR).
  • the Raman intensity of the fluorescent material turns off, and biochemicals can be detected by measuring it.
  • The Raman scattering spectrum database construction and search method through machine learning can be used in a method of detecting a target nucleic acid using a Raman signal derived from a nucleic acid-based self-assembly complex in a liquid, comprising the following steps:
  • Step I of preparing a target nucleic acid detection reagent designed so that the change in the Raman signal can be measured when the nucleic acid-based self-assembly complex is not formed or is disassembled;
  • Step II of carrying out, in a nucleic acid-containing liquid sample, a hybridization reaction with the target nucleic acid detection reagent of step I, which contains (a) the first nanoparticle-based structure to which the first nucleotide is linked and (b) the second nanoparticle-based structure to which the second nucleotide is linked;
  • Step III of measuring the Raman signal derived from the nucleic acid-based self-assembly complex in the liquid sample before, after and/or simultaneously with the hybridization reaction of step II;
  • Step IV of providing detection and/or quantitative data of the target nucleic acid in the sample through an algorithm that analyzes the Raman signal measured in step III or its change value.
  • In the target nucleic acid detection reagent of step I, a nucleic acid-based self-assembly complex is formed from (a) a first nanoparticle-based structure in which a first nucleotide that hybridizes with the target nucleic acid is linked to a first metal nanoparticle and (b) a second nanoparticle-based structure in which a second nucleotide complementary to the first nucleotide is linked to a second metal nanoparticle, by spontaneous molecular-level bonding between the first nucleotide and the second nucleotide, that is, complementary hydrogen bonding of at least 10 base pairs.
  • (i) A nanogap is formed by two adjacent metal nanoparticles, (ii) the nanogap is a space that generates and further strengthens the surface plasmon resonance phenomenon (electromagnetic effect) upon irradiation with light, and (iii) a Raman indicator linked to the second oligonucleotide may be positioned in the nanogap to enhance the Raman scattering signal detected during light irradiation (FIG. 9).
  • The first nucleotide that hybridizes with the target nucleic acid may include an oligonucleotide probe having a nucleic acid sequence that hybridizes, depending on conditions, with a partial sequence of the target nucleic acid, a spacer, and an oligonucleotide adhesive that attaches to the nanoparticle;
  • the second nucleotide, complementary to the first nucleotide by 10 bp or more, may include a 2-1 oligonucleotide adhesive having a nucleic acid sequence complementary to the oligonucleotide probe of the first nucleotide, a spacer that increases the degree of freedom of movement to assist complementary hydrogen bonding of 10 bp or more, and a 2-2 oligonucleotide adhesive attached to the nanoparticles.
  • The target nucleic acid detection reagent of the present invention is characterized in that the change in the Raman signal can be measured when, in the presence of the target nucleic acid, the nucleic acid-based self-assembly complex is not formed or is disassembled by hybridization between the first nucleotide and the target nucleic acid (FIGS. 9 to 12).
  • Steps III and IV may be performed while utilizing the Raman scattering spectrum database construction and search method through machine learning of the present invention ( FIGS. 10 to 12 ).
  • When hybridization with the target nucleic acid detection reagent is performed in a nucleic acid-containing liquid sample, and the Raman signal derived from the nucleic acid-based self-assembly complex in the liquid sample is measured before, after and/or simultaneously with the hybridization reaction, that is, (a) the Raman shift and, for each shift value, (b) the Raman highest point and (c) the Raman lowest point, and these values are input, the desired prediction information can be calculated from the constructed Raman spectrum database.
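  • The lookup described above, in which the Raman shift and the highest and lowest points at each shift are entered to retrieve prediction information, can be sketched as a simple nearest-neighbor search. This is an illustrative stand-in and not the machine learning model of the invention; the record layout, the function names, the shift tolerance, and all numeric values are assumptions.

```python
import math
from typing import Dict, List, Tuple

# One feature per band: (Raman shift in cm^-1, highest point, lowest point).
Feature = Tuple[float, float, float]


def distance(query: List[Feature], entry: List[Feature],
             shift_tol: float = 2.0) -> float:
    """Mean distance over bands whose shifts agree within shift_tol.

    Returns infinity when the two records share no band, so that such an
    entry can never be selected as the best match.
    """
    total, matched = 0.0, 0
    for q_shift, q_high, q_low in query:
        for e_shift, e_high, e_low in entry:
            if abs(q_shift - e_shift) <= shift_tol:
                total += math.hypot(q_high - e_high, q_low - e_low)
                matched += 1
                break
    return total / matched if matched else math.inf


def search(database: Dict[str, List[Feature]], query: List[Feature]) -> str:
    """Return the label of the database entry closest to the query."""
    if not database:
        raise ValueError("database is empty")
    return min(database, key=lambda label: distance(query, database[label]))


if __name__ == "__main__":
    # Made-up example values, for illustration only.
    db = {
        "target absent": [(1004.0, 0.90, 0.10)],
        "target present": [(1004.0, 0.30, 0.10)],
    }
    print(search(db, [(1005.0, 0.35, 0.12)]))
```

  • A trained classifier or regressor could replace `search` without changing the input format, since each record is already reduced to the three quantities named in the text.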
  • the target nucleic acid may be a genome or a fragment thereof, and through detection and/or quantification of the target nucleic acid, various predictive information such as identification of viruses and/or microorganisms or diagnosis of diseases and/or evaluation of the effectiveness of therapeutic agents can be calculated.
  • the nucleic acid-based self-assembly complex functions as a sensor of a turn-off signal method in the presence of a target nucleic acid.
  • the nucleic acid-based self-assembly complex forms a precisely structurally defined nanogap between two metal nanoparticles and can exert localized surface plasmon resonance (LSPR) that amplifies the Raman scattering signal.
  • To secure the enhanced Raman scattering signal reproducibly, an on/off signal system is used in which the formation of the nanogap is inversely linked to the presence or absence of the target nucleic acid to be measured.
  • With the Raman signal captured from the nucleic acid-based self-assembly complex in the liquid, the target nucleic acid detection reagent can confirm whether the nucleic acid-based self-assembly complex is formed and/or the degree of its formation (quantitation), and from this it is possible to detect or quantify a target nucleic acid that hybridizes with the first nucleotide so that the nucleic acid-based self-assembly complex is not formed or is disassembled ( FIG. 12 ).
  • A target nucleic acid detection reagent that forms or contains a nucleic acid-based self-assembly complex at a known concentration shows its maximum Raman marker signal in the absence of the target nucleic acid; as the target nucleic acid increases, the Raman marker signal decreases, reaching its minimum at a known concentration of the target nucleic acid (FIG. 11).
  • the minimum and maximum reference points of the Raman signal for each concentration of the nucleic acid-based self-assembly complex that can be formed in the target nucleic acid detection reagent can be machine-learned through the Raman scattering spectrum database construction and search method through the machine learning of the present invention.
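  • The turn-off behavior between the maximum and minimum reference points can be illustrated with a simple calibration. The sketch below assumes, purely for illustration, that the Raman marker signal falls linearly from its maximum (no target) to its minimum (a known target concentration); the actual relationship may be nonlinear and would be learned from the database. The function and parameter names are hypothetical.

```python
def estimate_target_concentration(signal: float,
                                  signal_max: float,
                                  signal_min: float,
                                  conc_at_min: float) -> float:
    """Estimate target concentration from a turn-off Raman marker signal.

    signal_max:  reference signal with no target nucleic acid present.
    signal_min:  reference signal at the known concentration conc_at_min.
    The measured signal is clamped to the calibrated range, so the estimate
    always lies between 0 and conc_at_min.
    """
    if signal_max <= signal_min:
        raise ValueError("signal_max must be greater than signal_min")
    clamped = min(max(signal, signal_min), signal_max)
    return conc_at_min * (signal_max - clamped) / (signal_max - signal_min)
```

  • Clamping avoids reporting negative or extrapolated concentrations when noise pushes a reading outside the two reference points.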
  • the present invention provides target information from data produced in a biological system.
  • a computer-readable recording medium is provided.
  • a computer is a device having information processing capability.
  • Information processing is the operation or processing of information according to the purpose of use.
  • software is a set of instructions and commands (including audio or image information) that enable commands, input, processing, storage, output, and interaction with equipment such as a computer and its peripheral devices.
  • a computer program is a program installed in a computer to perform a specific function, and is a set of instructions suitable for executing the first to ninth steps with a computer.
  • A data recording medium is a computer-readable medium on which data is recorded with a structure that specifies the processing performed by a computer.
  • server 401 includes a central processing unit (CPU, also “processor”) 405 , which is a single core processor, a multi-core processor, or multiple processors for parallel processing.
  • the processor used as part of the control assembly is a microprocessor.
  • server 401 may also include memory 410 (eg, random access memory, read-only memory, flash memory); electronic storage unit 415 (eg hard disk); a communication interface 420 (eg, a network adapter) for communicating with one or more other systems; and peripheral devices 425 including cache, other memory, data storage, and/or electronic display adapters.
  • the memory 410, the storage unit 415, the interface 420, and the peripheral device 425 communicate with the processor 405 via a communication bus (solid line), such as a motherboard.
  • the storage unit 415 is a data storage unit for storing data.
  • the server 401 is operatively coupled to a computer network (“network”) 430 with the aid of a communication interface 420 .
  • network 430 is the Internet, an intranet and/or extranet, an intranet and/or extranet that communicates with the Internet, or a telecommunications or data network.
  • network 430 assisted by server 401 implements a peer-to-peer network, which enables a device coupled to server 401 to act as a client or server.
  • The server is configured to transmit and receive computer-readable instructions (e.g., device/system operating protocols or parameters) or data (e.g., sensor measurements, raw data obtained from detection of metabolites, analysis of raw data obtained from detection of metabolites, interpretation of raw data obtained from detection of metabolites, etc.) via electronic signals transmitted over the network 430.
  • a network is used, for example, to transmit or receive data across international boundaries.
  • the server 401 communicates with one or more output devices 435 such as a display or printer, and/or one or more input devices 440 such as, for example, a keyboard, mouse, or joystick.
  • the display is a touch screen display, in which case it functions as both a display device and an input device.
  • different and/or additional input devices are present, such as enunciators, speakers, or microphones.
  • the server uses any one of a variety of operating systems, such as, for example, Windows®, or MacOS®, or any one of several versions of Unix®, or Linux®.
  • the storage unit 415 stores files or data related to the operation of an apparatus, system, or method described herein.
  • the server communicates with one or more remote computer systems via a network 430 .
  • the one or more remote computer systems include, for example, personal computers, laptops, tablets, telephones, smartphones, or personal digital terminals.
  • control assembly includes a single server 401 .
  • a system includes multiple servers that communicate with each other via intranets, extranets, and/or the Internet.
  • server 401 is adapted to store device operating parameters, protocols, methods described herein, and other potentially relevant information. In some implementations, such information is stored on storage unit 415 or server 401 and such data is transmitted over a network.
  • a general communication network, communication line, etc. may transmit predetermined information such as a program or data.
  • Non-limiting examples of computer-readable recording media include hard disks, floppy disks, magnetic recording media, and optical recording media.
  • Non-limiting examples of transmission media include communication media, carrier waves, and transmission (communication) mechanisms.
  • Laser light is in-phase light of a single wavelength. In general, a laser beam is thin and does not spread. Lasers are mainly used in spectroscopy because of their precisely defined monochromatic wavelengths.
  • the disadvantage of Raman spectroscopy is that the signal strength is weak, so it is preferable to use a laser capable of providing high-power incident light, that is, high-density photons, as a light source. Accordingly, it is preferable to include a photomultiplier tube (PMT), an avalanche photodiode (APD), a charge coupled device (CCD), or the like, which can effectively amplify the detection signal as the detector.
  • (i) The Raman surface enhancement effect of the metal nanoparticles using localized surface plasmon resonance (LSPR), (ii) the amplification level of the Raman scattering signal intensity of the Raman indicator, which is further amplified by the nanogap, and/or (iii) the Raman shift value of the Raman indicator may vary depending on the wavelength of the laser incident light used in the Raman analysis.
  • The Raman scattering signal may be acquired by any known Raman spectroscopy method; preferably, surface-enhanced Raman scattering (SERS), surface-enhanced resonance Raman spectroscopy (SERRS), hyper-Raman and/or coherent anti-Stokes Raman spectroscopy (CARS) may be used.
  • Raman spectroscopy or related techniques may be used for analyte detection, including normal Raman scattering, resonance Raman scattering, surface enhanced Raman scattering, surface enhanced resonance Raman scattering, coherent anti-Stokes Raman spectroscopy (CARS), stimulated Raman scattering, inverse Raman spectroscopy, excitation gain Raman spectroscopy, hyper-Raman scattering, molecular optical laser examiner (MOLE) or Raman microprobe or Raman microscopy or confocal Raman microspectroscopy, three-dimensional or scanning Raman, Raman saturation spectroscopy, time-resolved resonance Raman, Raman dissociation spectroscopy or UV-Raman microscopy.
  • the Raman detection apparatus may include a computer.
  • An example computer may include a bus for exchanging information and a processor for processing information.
  • A computer may further include random access memory (RAM) or other dynamic storage devices, read-only memory (ROM) or other static storage devices, and data storage devices such as magnetic or optical disks and their corresponding drives.
  • Computers also include peripheral devices known in the art, such as display devices (e.g., cathode ray tubes or liquid crystal displays), alphanumeric input devices (e.g., keyboards), cursor control devices (e.g., mouse, trackball, or cursor arrow keys), and communication devices (e.g., a modem, network interface card, or interface device used to couple with an Ethernet, token ring, or other type of network).
  • the Raman detection apparatus may be operatively coupled with a computer.
  • Data from the detection device may be processed by a processor and stored in main memory. Data on emission profiles for standard analytes may also be stored in main memory or ROM.
  • the processor may compare the emission spectra from the analyte on the Raman active substrate to determine the analyte type of the sample.
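  • One common way to perform such a comparison is to score the measured spectrum against each stored reference profile and report the best match. The sketch below uses cosine similarity as an assumed metric; the source does not specify the comparison method, and the function names and the requirement that all spectra are sampled on the same Raman shift grid are assumptions.

```python
import math
from typing import Dict, List, Tuple


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine similarity of two spectra sampled on the same shift grid."""
    if len(a) != len(b):
        raise ValueError("spectra must have the same number of points")
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def identify(measured: List[float],
             references: Dict[str, List[float]]) -> Tuple[str, float]:
    """Return the name and score of the best-matching reference spectrum."""
    if not references:
        raise ValueError("no reference spectra stored")
    best = max(references,
               key=lambda name: cosine_similarity(measured, references[name]))
    return best, cosine_similarity(measured, references[best])
```

  • Because cosine similarity ignores overall scale, the match depends on the shape of the spectrum rather than on absolute intensity, which varies with laser power and acquisition time.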
  • the processor may analyze data from the detection device to determine the identity and/or concentration of various analytes. Differently equipped computers may be used for specific implementations. Accordingly, the structure of the system may differ in different embodiments of the present invention.
  • After the data collection operation, the data will typically be passed to a data analysis operation. To facilitate the analysis, the data obtained by the detection device will typically be analyzed using a digital computer as described above.
  • the computer will be suitably programmed for receiving and storing data from the detection device, as well as for analysis and reporting of the collected data.
  • a non-limiting example of a Raman detection device is disclosed in US Pat. No. 6,002,471.
  • The excitation beam is generated by a frequency-doubled Nd:YAG laser at a wavelength of 532 nm or a frequency-doubled Ti:sapphire laser at a wavelength of 365 nm.
  • a pulsed laser beam or a continuous laser beam may be used.
  • Another example of a detection device is disclosed in US Pat. No. 5,306,403: a Spex Model 1403 double-grating spectrometer equipped with a gallium-arsenide photomultiplier tube (RCA Model C31034 or Burle Industries Model C3103402) operating in single-photon counting mode.
  • Excitation sources include the 514.5 nm line of an argon-ion laser (SpectraPhysics, Model 166) and the 647.1 nm line of a krypton-ion laser (Innova 70, Coherent).
  • The excitation beam can be spectrally purified with a bandpass filter (Corion) and focused on the Raman-active substrate using a 6X objective lens (Newport, Model L6X).
  • Example 1 Preparation of a target nucleic acid detection reagent containing a nucleic acid-based self-assembly complex
  • The target nucleic acid detection reagent prepared in Example 1 contains a nucleic acid-based self-assembly complex (NEW construct). The NEW construct is self-assembled in a water-based solvent, through complementary hydrogen bonding of the first nucleotide and the second nucleotide, from (a) a first nanoparticle-based structure in which a first nucleotide that hybridizes with the target nucleic acid is linked to spherical gold nanoparticles with a diameter of 20-30 nm and (b) a second nanoparticle-based structure in which a second nucleotide complementary to the first nucleotide is linked to spherical gold nanoparticles with a diameter of 20-30 nm.
  • NSW construct nucleic acid-based self-assembly complex
  • the target nucleic acid is synthesized with a nucleotide sequence (12 mer to 30 mer) that can be identified as the genome of the currently prevalent coronavirus.
  • An oligonucleotide probe having a nucleotide sequence that hybridizes with the target nucleic acid, a C3 spacer of the following formula (1), which increases freedom of movement to facilitate complementary hydrogen bonding of 10 bp or more, and an oligonucleotide adhesive (poly-adenine 10mer) that attaches to the nanoparticles were sequentially linked to prepare a first nucleotide that hybridizes with the target nucleic acid.
  • a second nucleotide complementary to the first nucleotide was prepared by sequentially linking a 2-1 oligonucleotide adhesive having a nucleotide sequence complementary to 20 to 50 bp of the oligonucleotide probe of the first nucleotide, the C3 spacer of formula (1), which increases freedom of movement to facilitate complementary hydrogen bonding of 10 bp or more, and a 2-2 oligonucleotide adhesive (poly-adenine 10mer) that attaches to the nanoparticles.
  • Cy3 as a Raman marker is located between the C3 spacer in the second nucleotide and the oligonucleotide adhesive (poly-adenine 10mer) attached to the nanoparticles.
  • sequence length of the 2-1 oligonucleotide adhesive is shorter than the sequence length of the synthesized target nucleic acid so that the target nucleic acid has the upper hand in competition during mating with the first nucleotide (FIG. 9).
  • 100 mM phosphate buffer and 2M NaCl were sequentially added to a mixed solution of the first nucleotide modified with a -SH group at one end and gold nanoparticles, and reacted at room temperature to synthesize a first nanoparticle-based structure.
  • 100 mM phosphate buffer and 2M NaCl were sequentially added to a mixed solution of the second nucleotide modified with a -SH group at one end and gold nanoparticles, and reacted at room temperature to synthesize a second nanoparticle-based structure.
  • the aqueous solution containing the first nanoparticle-based structure and the aqueous solution containing the second nanoparticle-based structure were mixed to prepare a target nucleic acid detection reagent containing the nucleic acid-based self-assembly complex (NEW structure) formed through complementary hydrogen bonding between the first nucleotide and the second nucleotide.
  • NW structure nucleic acid-based self-assembly complex
  • the Raman signal of the NEW construct that is, the Raman signal of the Raman marker Cy3 linked to the second nucleotide
  • the Raman signal of the NEW construct was measured in the target nucleic acid detection reagent containing the NEW construct in the phosphate buffer prepared in Example 1. The results are shown in FIG. 10 .
  • the above-described NEW construct in the target nucleic acid detection reagent provides a stably enhanced Raman scattering signal within a certain range and reproducibly even when continuously measured 100 times per second.
  • a stable signal could be obtained, but the exact same signal could not be obtained.
  • in the nucleic acid-based self-assembly complex (NEW construct) prepared in Example 1, a Raman indicator that exhibits a known Raman shift value is linked to the second nucleotide, which competes with the target nucleic acid in the hybridization reaction with the first nucleotide. Because the complex acts as a turn-off signal sensor in the presence of the target nucleic acid, the formation, number, and concentration of the complex can be confirmed (quantified) from the signal of the Raman indicator (FIGS. 9 and 12).
  • the synthesized target nucleic acid (12 mer to 30 mer) was added to the target nucleic acid detection reagent containing the nucleic acid-based self-assembly complex (NEW construct) prepared in Example 1 at a known concentration (FIG. 12).
  • A self-built inverted Raman detection device, the same as in Example 1, was used to measure the Raman signal of the Raman marker linked to the second nucleotide.
  • the graph on the left in FIG. 11 is a Raman spectrum measured after heating the target nucleic acid detection reagent without target nucleic acid to 72.0°C and then cooling it to about 5°C, and the graph on the right in FIG. 11 is a Raman spectrum measured after adding the synthesized target nucleic acid in excess relative to the number of first nucleotides in the target nucleic acid detection reagent.
  • in Example 1, the number of nucleic acid-based self-assembly complexes (NEW constructs) formed by spontaneous molecular-level hydrogen bonding between the first nucleotide, which is a probe that hybridizes with the target nucleic acid, and the second nucleotide complementary to it is in a functional relationship inversely linked to the number of target nucleic acids (FIG. 9).
  • when the target nucleic acid in the sample is absent or below the minimum detection sensitivity of the target nucleic acid detection reagent (the lower limit of the detection range of the reagent containing or forming the nucleic acid-based self-assembly complex), the number of nucleic acid-based self-assembly complexes (NEW constructs) formed by self-assembly of the complementary first and second nucleotides is at its maximum
  • the target nucleic acid in the sample is the detection sensitivity of the target nucleic acid detection reagent to the target nucleic acid
  • the number of nucleic acid-based self-assembly complexes (NEW constructs) formed by self-assembly of complementary first nucleotides and second nucleotides is the minimum ( FIG. 11 ).
  • the nucleic acid-based self-assembly complex (NEW construct) in the target nucleic acid detection reagent prepared in Example 1 serves as a turn-off signal sensor in the presence of the target nucleic acid, so a target nucleic acid detection reagent containing the complex at a known concentration has a maximum Raman signal in the absence of the target nucleic acid, and the Raman signal decreases as the amount of target nucleic acid increases. It is therefore possible to secure or predict the reference points (Min, Max) of the optical signal for each concentration of the nucleic acid-based self-assembly complex in the target nucleic acid detection reagent (FIG. 11).
  • FIG. 12 shows Raman spectra measured after adding the synthesized target nucleic acid (12 mer to 30 mer) at known concentrations (0 M, 10^-16 M, 10^-12 M) to the target nucleic acid detection reagent containing the nucleic acid-based self-assembly complex (NEW construct) prepared in Example 1.
  • upon light irradiation, the Raman scattering signal was found to decrease in a consistent pattern, inversely linked to the target nucleic acid concentration (FIG. 12).
  • the number of the aforementioned nucleic acid-based self-assembly complexes formed, that is, the number of nanogaps formed and the intensity of the enhanced Raman scattering signal arising from them, is in a functional relationship with the number of target nucleic acids. It can therefore be inferred that the target nucleic acid detection reagent that forms the complex can act as an on/off signal sensor, and that the target nucleic acid can also be quantitatively analyzed by a computer algorithm from the intensity of the Raman scattering signal measured during light irradiation.
  • with the target nucleic acid detection reagent of Example 1, the change (intensity reduction) in the Raman signal of the Raman marker that occurs when the nucleic acid-based self-assembly complex is not formed, or is disassembled, by hybridization between the first nucleotide and the target nucleic acid can be measured, so the target nucleic acid can be quantitatively analyzed (FIGS. 9 to 12).
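The turn-off behaviour described in these examples can be illustrated with a minimal Python sketch. It assumes, purely for illustration, that a measured Raman intensity is normalized linearly between the two reference points (Max, measured without target nucleic acid, and Min, measured with excess target nucleic acid). The function name `turn_off_response` and the linear scaling are assumptions and do not come from the source.

```python
def turn_off_response(intensity, i_max, i_min):
    """Normalized signal loss of a turn-off sensor (hypothetical helper).

    Returns 0.0 at the Max reference (no target nucleic acid) and 1.0 at
    the Min reference (excess target nucleic acid). Linear scaling between
    the two reference points is an illustrative assumption.
    """
    if i_max <= i_min:
        raise ValueError("i_max must be greater than i_min")
    fraction = (i_max - intensity) / (i_max - i_min)
    # Clamp so that measurement noise outside the references stays in [0, 1].
    return min(1.0, max(0.0, fraction))


if __name__ == "__main__":
    # Example values are made up for illustration only.
    print(turn_off_response(750.0, i_max=1000.0, i_min=500.0))
```

Converting this normalized response into a concentration would still require a calibration curve measured at known target concentrations, such as the series shown in FIG. 12.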

Abstract

In order to accurately predict bio-information such as the presence and/or concentration of specific biological materials (e.g., proteins, amino acids, lipids, and nucleic acids), chemical bonds and constituents derived from cells (e.g., bacteria, cancer cells, and normal cells), and/or the identification and/or concentration of cells, a bio-information predictive calculation device according to the present invention uses, as input information for a machine learning algorithm, (a) one or more Raman shift values included in a Raman spectrum list for biological entity-derived samples, (b) a Raman maximum intensity, which is the highest value among the Raman intensities on the vertical axis at each shift value, and (c) a Raman minimum intensity, which is the lowest value among the Raman intensities on the vertical axis at each shift value. The device uses the machine learning algorithm to learn feature sets of the biological entity-derived samples, and thereby computes or processes characteristic information to predict the state of the animals or cells from which the samples were extracted, disease diagnosis and/or the efficacy of therapeutic agents, and/or bacterial infection of the samples.

Description

Raman scattering spectrum database construction and search method through machine learning
The present invention relates to a method for constructing and searching a Raman scattering spectrum database through machine learning, and to an apparatus that performs the method to calculate desired biometric prediction information from a Raman spectrum database.
Raman spectroscopy is a spectroscopic method that measures the scattering frequencies characteristic of a molecule using the Raman effect: when single-wavelength incident light such as laser light strikes a sample, part of the incident light changes the polarizability of the molecule, and resonance occurs between the frequency of the changed polarizability and the vibrational frequency within the molecule. Because Raman spectroscopy irradiates the sample directly with incident light, measurement is easy, even very small amounts of sample can be measured, there is no interference from water or carbon dioxide, and it can be used in the visible region. Raman spectroscopy therefore mainly uses a visible laser to detect the light that is Raman-scattered by molecules.
As shown in FIG. 2, there are three types of scattering: Rayleigh scattering, in which the frequencies of the incident and scattered light are the same; Stokes scattering, in which the incident light loses energy in a collision with an atom and its frequency decreases; and anti-Stokes scattering, in which the incident light gains energy in a collision with an atom and its frequency increases. Scattering in which the scattered light has less or more energy than the incident light is called Raman scattering. The vibrational energy cannot be measured directly, but whether energy was lost or gained can be observed by comparison with Rayleigh scattering.
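The three scattering types can be made concrete with the standard conversion from wavelengths to a Raman shift in wavenumbers, Δν (cm^-1) = 10^7 × (1/λ_excitation − 1/λ_scattered), with both wavelengths in nm. The Python sketch below is illustrative only; the function names and the 1 cm^-1 tolerance used to call a line Rayleigh are assumptions, not part of the source.

```python
def raman_shift_cm1(excitation_nm, scattered_nm):
    """Raman shift in cm^-1 from excitation and scattered wavelengths in nm."""
    return 1e7 * (1.0 / excitation_nm - 1.0 / scattered_nm)


def scattering_type(excitation_nm, scattered_nm, tol_cm1=1.0):
    """Classify a scattered line relative to the excitation line.

    tol_cm1 is an assumed tolerance below which the line is treated as
    elastic (Rayleigh) scattering.
    """
    shift = raman_shift_cm1(excitation_nm, scattered_nm)
    if abs(shift) <= tol_cm1:
        return "Rayleigh"
    # Positive shift: the scattered photon lost energy (longer wavelength).
    return "Stokes" if shift > 0 else "anti-Stokes"


if __name__ == "__main__":
    print(round(raman_shift_cm1(532.0, 563.0), 1), scattering_type(532.0, 563.0))
```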
Meanwhile, a molecule cannot be excited to every energy state, only to the levels allowed by the selection rules.
In addition, Raman scattering occurs only for those vibrational modes of a molecule in which the polarizability changes. The Raman spectrum is strong for symmetric vibrations.
Laser light is single-wavelength, in-phase light. In general, a laser beam is narrow and does not spread. Lasers are mainly used in spectroscopy because of their precisely defined monochromatic wavelength. Pulsed lasers, with their short pulse width, are used to observe phenomena that occur over a short time.
Raman spectroscopy uses the way laser light is scattered by molecular resonance to rapidly measure intracellular constituents such as lipids, nucleic acids, and proteins, and is known to be suitable even for detecting bacteria at the single-cell level. Because of its high specificity and sensitivity to cellular constituents, phylogenetic analysis of bacteria at the level of some species is possible using Raman spectra alone. In addition, when isotopes such as carbon-13 and hydrogen-2 are used together, it can be applied to the quantitative evaluation of changes in the physiological activity of single cells.
However, because the Raman scattering signal is weak, it has seen only limited use in biology, where trace amounts of substances or biological tissue are measured. Various methods have been developed in biology to amplify the weak Raman scattering signal. The best known is SERS (surface-enhanced Raman spectroscopy), which uses the principle that light irradiated onto the surface of metal nanoparticles such as silver or gold enhances plasmon resonance at the material surface and thereby amplifies the Raman signal. The UVRR (UV resonance Raman) method, which measures the Raman signal with a UV laser (180 to 260 nm), is also used and can amplify the Raman signal by a factor of 10^3 to 10^5.
With the transition to an aging society, interest in disease diagnosis, treatment, and prevention is growing rapidly, along with the goal of a healthy future and a healthy society. To realize such a society, efforts to maintain biological function and, above all, to protect oneself from disease through early screening or prediction have increased. These social demands call for more advanced medical technologies such as continuous health monitoring for disease prevention, early diagnosis of disease, and personalized diagnosis and treatment.
Accordingly, demand for biosensors for health screening, used to treat and prevent disease in a timely manner, has increased, and the biosensor market is expanding rapidly worldwide. Screening based on blood samples is the norm, but with the development of high-sensitivity sensors, patient-friendly non-invasive sensing methods that can screen body fluids such as urine, saliva, and tears are also being actively studied. This shift in the sensor market paradigm and its rapid development stem from advances in micro- and nanomaterial fabrication and analysis technology, which have made it possible to develop miniaturized, high-sensitivity sensors that detect nano-sized biomarkers (proteins, genes, peptides, cytokines) present in body fluids. As a result, many types of sensors have been actively reported in fields such as chemistry, physics, materials, and medicine.
This trend has accelerated with the arrival of the digital healthcare era alongside the fourth industrial revolution, and as the development of advanced biosensors begins, capabilities such as signal transmission over communication networks and real-time health screening are required.
As illustrated in FIG. 7, biosensor development requires a variety of functions. For example, areas such as signal transmission, communication, and signal conversion are studied mainly in electrical, computer, and mechanical engineering, while the working part that drives the sensor is studied mainly in chemistry, biotechnology, materials engineering, and chemical and biological engineering.
For the precise screening needed for early diagnosis, the sensitivity and function of the sensor are very important. Among a sensor's characteristics, the ability to recognize the analyte is the most important factor in determining its sensitivity.
The factors that determine sensor sensitivity fall into two broad categories: first, the receptor site that recognizes the target substance, and second, the transducer site that generates a signal after recognition and converts it into the desired signal form. Antibodies, aptamers, peptides, nucleic acid sequences, and the like that can recognize the target are attached to the receptor site, and recognition ability is determined by their affinity for the target. When a receptor is introduced onto the sensor surface, recognition ability depends on the introduction method (chemical, physical, biological) and the structural stability of the receptor, so research has been carried out to optimize these. However, the structures that can interact with a target substance are limited, and the above methods alone can improve sensor sensitivity only so far, so the need for a new approach began to be emphasized. Research has therefore focused increasingly on the signal output part: stabilizing the sensor recognition unit while improving the emission system that generates a signal after target recognition, so that even trace amounts of a substance can be recognized and registered as a sensor signal.
The development of new materials and advances in measuring instruments have also helped improve sensor sensitivity, but above all, the ability to fabricate and pattern micro- and nano-sized materials has enabled research on miniaturizing sensors while improving sensitivity and shortening signal measurement and assay times. In particular, advances in nanomaterials have not only improved the signal amplification and sensitivity of biosensors but have also made use of the optical, physical, and chemical properties of the materials themselves to establish new sensor signal generation schemes, driving innovative progress in diagnostic technology. For example, in vitro diagnostic sensors that use nanoparticles are a core element of advanced biosensors. Here, the sensor signal detection method differs according to the properties of the nanoparticles.
In particular, recent technical advances have greatly improved the measurement sensitivity and specificity of Raman spectroscopy, making it highly useful in microbial ecology research as well. It can therefore complement studies of bacterial identification, functional analysis, and biogeography aimed at analyzing the functional role of microorganisms in the environment. Raman spectroscopy enables real-time analysis of bacteria-specific Raman signatures (identification) and of substrate specificity in combination with isotopes (functional analysis), and so can compensate for the limitations of existing analysis methods.
To derive desired information from data produced in a biological system, the present invention aims to provide a method for constructing and searching a Raman scattering spectrum database through machine learning, and an apparatus that performs the method to collect, index, and store desired biometric information from a Raman spectrum database.
The present invention also aims to provide computer software that derives desired information from data produced in a biological system, using a Raman spectrum database constructed through machine learning.
A first aspect of the present invention provides a method for constructing and searching a Raman scattering spectrum database through machine learning, the method comprising:
a first step of deriving, from each sample, (a) one or more Raman shift values and, at each shift value, (b) a Raman maximum intensity, which is the relatively highest value among the Raman intensities on the vertical axis, and (c) a Raman minimum intensity, which is the relatively lowest value among the Raman intensities on the vertical axis, to generate a Raman spectrum of each sample;
a second step of generating learned (a') Raman shift values and, at each shift value, a learned (b') Raman maximum intensity and (c') Raman minimum intensity according to a machine learning algorithm that performs
step 2-1 of interval learning of the Raman shift values (a) of the first step,
step 2-2 of cluster learning of the Raman maximum intensities (b) of the first step, and
step 2-3 of cluster learning of the Raman minimum intensities (c) of the first step;
a third step of inferring by machine learning, based on the (a') Raman shift values and, at each shift value, the (b') Raman maximum intensity and (c') Raman minimum intensity generated in the second step, (d) sensitivity, defined as the proportion of spectra whose signal-to-noise ratio is 50% or more within a given set of repetitions, (e) stability, defined as the proportion of spectra whose values fall within one standard deviation (δ) of the normal-distribution mean (μ), that is, within the range (μ-δ, μ+δ), and (f) repeatability, under which, based on the number of measurements within a given test time, a spectrum with a stability of 50% or more and a sensitivity of 50% or more as defined above is defined as repeatable;
a fourth step of calculating the fractional bandwidth;
a fifth step of calculating spectral selectivity by selecting, as the selected value within the corresponding shift, a spectrum that is repeatable as defined in the third step and has a stability of 80% or more and a sensitivity of 90% or more;
a sixth step of constructing a Raman spectrum database by inputting the (a) Raman shift values and, at each shift value, the (b) Raman maximum intensity and (c) Raman minimum intensity of the first step; the (a') Raman shift values and, at each shift value, the (b') Raman maximum intensity and (c') Raman minimum intensity generated by machine learning in the second step; the (d) sensitivity, (e) stability, and (f) repeatability inferred by machine learning in the third step; and the spectral selectivity calculated in the fifth step; and
optionally, a seventh step of calculating desired prediction information from the Raman spectrum database constructed in the sixth step when the (a) Raman shift values and, at each shift value, the (b) Raman maximum intensity and (c) Raman minimum intensity of a sample from the first step are input.
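The first and second steps of the method can be sketched in Python as follows. The source does not name concrete algorithms, so this sketch makes several assumptions that should be read as illustrative only: the maximum and minimum intensity at each shift are taken across repeated measurements of the same sample, "interval learning" is approximated by fixed-width binning of shift values, and "cluster learning" is approximated by a plain one-dimensional k-means. All function names and parameters (`spectrum_list`, `interval_of`, `kmeans_1d`, `width`, `k`) are hypothetical.

```python
from collections import defaultdict


def spectrum_list(repeats):
    """Step 1 (assumed reading): per-shift maximum and minimum intensity.

    repeats: list of spectra, each a list of (shift_cm1, intensity) pairs.
    Returns a list of (shift, max_intensity, min_intensity), sorted by shift.
    """
    by_shift = defaultdict(list)
    for spectrum in repeats:
        for shift, intensity in spectrum:
            by_shift[shift].append(intensity)
    return [(s, max(v), min(v)) for s, v in sorted(by_shift.items())]


def interval_of(shift, width=10.0):
    """Step 2-1 stand-in: map a shift to the half-open bin [lo, lo + width)."""
    k = int(shift // width)
    return (k * width, (k + 1) * width)


def kmeans_1d(values, k=2, iters=20):
    """Steps 2-2 and 2-3 stand-in: 1-D k-means (Lloyd's algorithm).

    Centers start at evenly spaced positions in the sorted values, so the
    result is deterministic. Returns the list of cluster centers.
    """
    vals = sorted(values)
    centers = [vals[(i * (len(vals) - 1)) // max(k - 1, 1)] for i in range(k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for v in vals:
            nearest = min(range(k), key=lambda j: abs(v - centers[j]))
            groups[nearest].append(v)
        # Keep the old center when a cluster ends up empty.
        updated = [sum(g) / len(g) if g else centers[j]
                   for j, g in enumerate(groups)]
        if updated == centers:
            break
        centers = updated
    return centers


if __name__ == "__main__":
    # Two made-up repeated measurements of one sample.
    repeats = [[(1000.0, 5.0), (1500.0, 8.0)], [(1000.0, 7.0), (1500.0, 6.0)]]
    rows = spectrum_list(repeats)
    print(rows)
    print([interval_of(s) for s, _, _ in rows])
    print(kmeans_1d([mx for _, mx, _ in rows], k=2))
```

Fixed-width bins and k-means are used here only because they are the simplest unsupervised stand-ins; any interval or clustering method consistent with the claim could take their place.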
A second aspect of the present invention provides a medium that transmits to a computer, or a computer-readable recording medium that stores, a program for executing at least one of the first to ninth steps so that the method for constructing and searching a Raman scattering spectrum database through machine learning according to the first aspect is performed on a computer.
A third aspect of the present invention provides an apparatus for calculating desired biometric prediction information from a Raman spectrum database, the apparatus comprising:
an information receiving unit (A) that collects, from a liquid sample of biological origin, (a) one or more Raman shift values and, at each shift value, (b) a Raman maximum intensity, which is the relatively highest value among the Raman intensities on the vertical axis, and (c) a Raman minimum intensity, which is the relatively lowest value among the Raman intensities on the vertical axis, as a Raman spectrum list of the sample;
a Raman spectrum database (B) constructed by using the information in the Raman spectrum list of the sample as input to (a-1) an algorithm for interval learning of the Raman shift values, (b-1) an algorithm for cluster learning of the Raman maximum intensities, and (c-1) an algorithm for cluster learning of the Raman minimum intensities,
inferring by machine learning, based on the (a') Raman shift values and, at each shift value, the (b') Raman maximum intensity and (c') Raman minimum intensity generated by the machine learning algorithm, (d) sensitivity, defined as the proportion of spectra whose signal-to-noise ratio is 50% or more within a given set of repetitions, (e) stability, defined as the proportion of spectra whose values fall within one standard deviation (δ) of the normal-distribution mean (μ), that is, within the range (μ-δ, μ+δ), and (f) repeatability, under which, based on the number of measurements within a given test time, a spectrum with a stability of 50% or more and a sensitivity of 50% or more as defined above is defined as repeatable,
calculating spectral selectivity by selecting, as the selected value within the corresponding shift, a spectrum that is repeatable and has a stability of 80% or more and a sensitivity of 90% or more,
and inputting the information in the Raman spectrum list of the sample, namely the (a) Raman shift values and, at each shift value, the (b) Raman maximum intensity and (c) Raman minimum intensity; the (a') Raman shift values and, at each shift value, the (b') Raman maximum intensity and (c') Raman minimum intensity generated by the machine learning algorithm; the (d) sensitivity, (e) stability, and (f) repeatability inferred from them by machine learning; and the spectral selectivity calculated from the selected values within the corresponding shift; and
optionally, a biometric information prediction unit (C) that calculates desired biometric prediction information from the constructed Raman spectrum database (B) when the (a) Raman shift values and, at each shift value, the (b) Raman maximum intensity and (c) Raman minimum intensity of a sample are input.
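The sensitivity, stability, repeatability, and selection criteria recited above can be written down as a short Python sketch. The thresholds (50%, 80%, 90%) are taken from the text. Everything else is an assumption made for illustration: the signal-to-noise ratio is treated as a number compared directly against 0.5, the standard deviation is the population standard deviation of the repeated measurements, and selectivity is computed as the fraction of shift positions that pass the selection criteria, since the source does not give a formula for it. The function names are hypothetical.

```python
from statistics import mean, pstdev


def sensitivity(snr_values, snr_threshold=0.5):
    """(d) Fraction of repeated spectra whose SNR is at least the threshold."""
    return sum(s >= snr_threshold for s in snr_values) / len(snr_values)


def stability(intensities):
    """(e) Fraction of values inside [mu - sigma, mu + sigma]."""
    mu = mean(intensities)
    sigma = pstdev(intensities)  # population standard deviation (assumed)
    inside = sum(mu - sigma <= x <= mu + sigma for x in intensities)
    return inside / len(intensities)


def is_repeatable(sens, stab, floor=0.5):
    """(f) Repeatable when both stability and sensitivity reach 50%."""
    return sens >= floor and stab >= floor


def is_selected(sens, stab, min_stab=0.8, min_sens=0.9):
    """Selection rule: repeatable, stability >= 80%, sensitivity >= 90%."""
    return is_repeatable(sens, stab) and stab >= min_stab and sens >= min_sens


def selectivity(per_shift):
    """Assumed definition: share of shift positions that pass is_selected.

    per_shift: dict mapping shift -> (snr_values, intensities).
    Returns (fraction_selected, list_of_selected_shifts).
    """
    chosen = [shift for shift, (snr, values) in per_shift.items()
              if is_selected(sensitivity(snr), stability(values))]
    return len(chosen) / len(per_shift), chosen


if __name__ == "__main__":
    # Made-up repeated measurements at two shift positions.
    data = {
        1000.0: ([0.9] * 10, [10, 10, 10, 10, 20]),
        1500.0: ([0.2] * 10, [1, 2, 3, 4, 5]),
    }
    print(selectivity(data))
```

Note that for normally distributed values about 68% fall within one standard deviation of the mean, so the 80% stability requirement for selection favours shift positions whose repeated intensities are more tightly grouped than a normal distribution.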
제3양태에 따른 라만 스펙트럼 데이터베이스로부터 원하는 생체 예측 정보 산출 장치는 제1양태에 따른 기계학습을 통한 라만 산란 스펙트럼 데이터베이스 구축 및 검색 방법을 수행할 수 있다.The apparatus for calculating desired biometric prediction information from the Raman spectrum database according to the third aspect may perform the Raman scattering spectrum database construction and search method through machine learning according to the first aspect.
따라서, 본 발명에 따른 라만 스펙트럼 데이터베이스로부터 원하는 생체 예측 정보 산출 장치는 일종의 기계학습을 통한 라만 산란 스펙트럼 데이터베이스 구축 및 검색 장치에 해당한다.Therefore, the apparatus for calculating desired biometric prediction information from the Raman spectrum database according to the present invention corresponds to an apparatus for constructing and searching a Raman scattering spectrum database through a kind of machine learning.
Hereinafter, the present invention will be described.
The present invention is characterized by building a Raman scattering spectrum database and a pattern matching algorithm based on unsupervised machine learning, in order to accurately predict biometric information such as the presence and/or concentration of a specific biological material (e.g., protein, amino acid, lipid, nucleic acid), and the identification and/or concentration of cell-derived (e.g., bacteria, cancer cells, normal cells) chemical bonds, constituents, and/or cell types.
Specifically, to predict biometric information accurately, the apparatus for calculating biometric prediction information according to the present invention takes as input to a machine learning algorithm (a) one or more Raman shift values and, at each shift value, (b) the Raman maximum and (c) the Raman minimum included in the Raman spectrum list of a biological sample. By learning the feature set of the biological sample with the machine learning algorithm, the apparatus carries out the specific computation or processing of information that predicts the state of the animal or cell from which the sample was taken, a disease diagnosis and/or the effect of a therapeutic agent, and/or bacterial infection of the sample.
When the detection indicator from which the Raman shift value (a) is to be derived is attached to nanoparticles dispersed in a liquid sample, surface-analysis Raman spectroscopy using the localized surface plasmon resonance (LSPR) of the nanoparticles may be performed to measure the Raman shift value (a). In this case, the nanoparticles carrying the detection indicator may be concentrated or filtered before the surface-analysis Raman spectroscopy is performed.
Accordingly, the method according to the present invention for constructing and searching a Raman scattering spectrum database through machine learning comprises:
a first step of deriving, from each sample, (a) one or more Raman shift values and, at each shift value, (b) the Raman maximum, which is the relatively highest Raman intensity on the vertical axis, and (c) the Raman minimum, which is the relatively lowest Raman intensity on the vertical axis, thereby generating a Raman spectrum of each sample;
a second step of generating learned (a') Raman shift values and, at each shift value, the (b') Raman maximum and (c') Raman minimum, using a machine learning algorithm that performs
step 2-1 of interval learning on the Raman shift values (a) of the first step,
step 2-2 of cluster learning on the Raman maxima (b) of the first step, and
step 2-3 of cluster learning on the Raman minima (c) of the first step;
a third step of inferring by machine learning, based on the (a') Raman shift values and, at each shift value, the (b') Raman maximum and (c') Raman minimum generated in the second step: (d) sensitivity, defined as the proportion of spectra whose signal-to-noise ratio is 50% or more within the given repetitions; (e) stability, defined as the proportion of spectra whose values fall within one standard deviation (δ) of the normal-distribution mean (μ), that is, within the range (μ-δ, μ+δ); and (f) repeatability, defined such that, based on the number of measurements within a given test time, a spectrum with a stability of 50% or more and a sensitivity of 50% or more as defined above is repeatable;
a fourth step of calculating the fractional bandwidth;
a fifth step of calculating the spectral selectivity by selecting, as the selection value within the corresponding shift, a spectrum that is repeatable as defined in the third step and has a stability of 80% or more and a sensitivity of 90% or more;
a sixth step of constructing a Raman spectrum database by inputting: the (a) Raman shift values and, at each shift value, the (b) Raman maximum and (c) Raman minimum of the first step; the (a') Raman shift values and, at each shift value, the (b') Raman maximum and (c') Raman minimum generated by machine learning in the second step; the (d) sensitivity, (e) stability, and (f) repeatability inferred by machine learning in the third step; and the spectral selectivity calculated in the fifth step; and
optionally, a seventh step of calculating desired prediction information from the Raman spectrum database constructed in the sixth step when (a) the Raman shift values and, at each shift value, (b) the Raman maximum and (c) the Raman minimum of a sample from the first step are input.
The sample for which the Raman spectrum database is constructed may be a liquid sample.
The (a) one or more Raman shift values and, at each shift value, the (b) Raman maximum and (c) Raman minimum included in the Raman spectrum list of a biological sample do not appear as intuitively identifiable fixed values each time they are measured in a liquid sample, unlike when they are measured on a fixed phase. The present invention is therefore especially useful when constructing and searching a Raman spectrum database from liquid samples to predict the biometric information described above.
Accordingly, when a Raman spectrum database dedicated to liquid samples is built separately in the sixth step, the method according to the present invention for constructing and searching a Raman scattering spectrum database through machine learning may further comprise:
optionally, an eighth step of storing liquid-sample spectra separately through spectral intensity indexing; and
optionally, a ninth step of filtering liquid-sample spectra to reduce noise spectral intensity using spectral pattern matching.
In the present invention, it is preferable to set the baseline to the mean of the per-Raman-shift standard deviations of the signal obtained experimentally from a reference material (negative control), and to derive all Raman spectra after baseline correction.
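The baseline rule above can be sketched as follows. This is a minimal illustration under stated assumptions, not the applicant's implementation: the function names are hypothetical, the negative-control spectra are assumed to share one shift grid, the population standard deviation is used, and clipping negative intensities to zero after subtraction is an added assumption.

```python
from statistics import mean, pstdev

def baseline_from_negative_control(nc_spectra):
    """nc_spectra: repeated negative-control spectra, each a list of
    intensities on the same Raman shift grid (hypothetical layout)."""
    # Standard deviation across repeats at each Raman shift.
    per_shift_std = [pstdev(column) for column in zip(*nc_spectra)]
    # Baseline = mean of those per-shift standard deviations.
    return mean(per_shift_std)

def baseline_correct(spectrum, baseline):
    """Subtract the baseline; clipping at zero is an assumption."""
    return [max(value - baseline, 0.0) for value in spectrum]
```

For example, two negative-control repeats `[[1.0, 2.0], [3.0, 2.0]]` give per-shift standard deviations of 1.0 and 0.0, so the baseline is 0.5.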
[Step 1]
The first step derives, from each sample, (a) one or more Raman shift values and, at each shift value, (b) the Raman maximum, which is the relatively highest Raman intensity on the vertical axis, and (c) the Raman minimum, which is the relatively lowest Raman intensity on the vertical axis, and thereby generates a Raman spectrum of each sample.
(a) Raman shift
As shown in FIG. 2, when light (a laser) is incident on a target molecule and scattered, scattering in which the scattered light does not have the same energy as the incident light is called Raman scattering, and the resulting change in energy level is called the Raman shift.
In general, Raman values are expressed not as the wavelength of the Raman-scattered light itself but as the Raman shift in wavenumbers (cm⁻¹).
The Raman shift value may be derived by Equation 1 or Equation 2 below.
[Equation 1]
Figure PCTKR2021020362-appb-I000001
[Equation 2]
Figure PCTKR2021020362-appb-I000002
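The equation images are not reproduced in this text. For orientation only, the conventional textbook relation between the excitation wavelength λ0 and the scattered wavelength λ1 is given below, first with both wavelengths in centimeters and then in nanometers. Whether these forms correspond exactly to Equations 1 and 2 of the source is an assumption.

```latex
\Delta\tilde{\nu}\,[\mathrm{cm^{-1}}]
  = \frac{1}{\lambda_0\,[\mathrm{cm}]} - \frac{1}{\lambda_1\,[\mathrm{cm}]}
\qquad
\Delta\tilde{\nu}\,[\mathrm{cm^{-1}}]
  = \left(\frac{1}{\lambda_0\,[\mathrm{nm}]} - \frac{1}{\lambda_1\,[\mathrm{nm}]}\right)\times 10^{7}
```

As a worked example with hypothetical values, excitation at 532 nm and scattered light at 550 nm give a Raman shift of roughly 615 cm⁻¹.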
As illustrated in FIG. 3, one or more Raman shift values may be derived from a single sample.
As illustrated in FIG. 3, (b) the Raman maximum and (c) the Raman minimum are obtained at each Raman shift value. Within the Raman shift range of 400 cm⁻¹ to 3,200 cm⁻¹, for example within the Raman shift range of the test instrument (e.g., 500 cm⁻¹ to 2,000 cm⁻¹), the minimum and maximum at each shift value can be obtained while moving in steps of 1 to 5 shifts, preferably 1 shift.
(b) Raman maximum
Because Raman intensity is generally expressed in arbitrary units (a.u.), it is not an absolute value; the Raman maximum can be defined as the relatively highest intensity within the Raman shift range.
As illustrated in FIG. 3, the Raman maximum at each Raman shift value can be derived from the relatively highest Raman intensity on the vertical axis.
(c) Raman minimum
Because Raman intensity is generally expressed in arbitrary units (a.u.), it is not an absolute value; the Raman minimum can be defined as the relatively lowest intensity within the Raman shift range.
As illustrated in FIG. 3, the Raman minimum at each Raman shift value can be derived from the relatively lowest Raman intensity on the vertical axis.
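The extraction of the highest and lowest intensity per shift step can be sketched as below. This is one possible reading, offered as an assumption: each step is treated as a half-open window of width `step` (in cm⁻¹), and the function name and argument layout are hypothetical.

```python
def window_extrema(shifts, intensities, start, stop, step=1.0):
    """Walk from `start` to `stop` in windows of width `step` and return
    (window_start, highest_intensity, lowest_intensity) per window.
    Windows that contain no measured point are skipped."""
    results = []
    lo = start
    while lo < stop:
        hi = lo + step
        # Intensities whose Raman shift falls inside [lo, hi).
        in_window = [y for x, y in zip(shifts, intensities) if lo <= x < hi]
        if in_window:
            results.append((lo, max(in_window), min(in_window)))
        lo = hi
    return results
```

Because intensities are in arbitrary units, the values returned are only meaningful relative to one another within the same spectrum.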
[Step 2]
The second step generates learned (a') Raman shift values and, at each shift value, the (b') Raman maximum and (c') Raman minimum, using a machine learning algorithm that performs step 2-1 of interval learning on the Raman shift values (a) of the first step, step 2-2 of cluster learning on the Raman maxima (b) of the first step, and step 2-3 of cluster learning on the Raman minima (c) of the first step.
The second step is performed to select candidate groups for clustering through unsupervised machine learning, and is a preprocessing step intended to reduce over-fitting as much as possible.
In machine learning, an algorithm is trained on collected data and can then produce a more accurate model based on that data. The (a') Raman shift values and, at each shift value, the (b') Raman maximum and (c') Raman minimum are the output generated when the machine learning algorithm is trained on the data, and they constitute the machine learning model. Once training is complete, providing input to the machine learning model returns a result.
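The source does not name the interval-learning or clustering algorithm. Purely as an illustration, the sketch below stands in fixed-width binning for interval learning and a plain one-dimensional k-means for cluster learning; both choices, and all names, are assumptions.

```python
from statistics import mean

def bin_shifts(shifts, width):
    """Map each Raman shift to the lower edge of its fixed-width interval
    (a simple stand-in for 'interval learning')."""
    return [width * int(s // width) for s in shifts]

def kmeans_1d(values, k, iterations=20):
    """Plain 1-D k-means (a stand-in for 'cluster learning').
    Initial centers are evenly spaced order statistics of the data."""
    ordered = sorted(values)
    centers = [ordered[(i * (len(ordered) - 1)) // max(k - 1, 1)] for i in range(k)]
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for v in values:
            # Assign each value to its nearest center.
            nearest = min(range(k), key=lambda i: abs(v - centers[i]))
            groups[nearest].append(v)
        # Recompute centers; an empty cluster keeps its previous center.
        centers = [mean(g) if g else centers[i] for i, g in enumerate(groups)]
    return centers
```

In this reading, the cluster centers of the maxima and minima would play the role of the learned (b') and (c') values for each learned interval (a').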
[Step 3]
The third step infers (d) sensitivity, (e) stability, and (f) repeatability by machine learning, based on the (a') Raman shift values and, at each shift value, the (b') Raman maximum and (c') Raman minimum generated in the second step.
The third step is performed to select, from the Raman peak clusters formed in the previous step, the main peaks that can actually identify a molecule, and to designate the peaks that satisfy the conditions below as the final main-peak candidates.
(d) Sensitivity
In the present invention, sensitivity may be defined as the proportion of spectra whose signal-to-noise ratio is 50% or more within the given repetitions.
Figure PCTKR2021020362-appb-I000003
(e) Stability
In the present invention, stability may be defined as the proportion of spectra whose values fall within one standard deviation (δ) of the normal-distribution mean (μ), that is, within the range (μ-δ, μ+δ).
(f) Repeatability
In the present invention, repeatability may be defined as follows: based on the number of measurements within a given test time, a spectrum with a stability of 50% or more and a sensitivity of 50% or more, as defined above, is repeatable.
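The three definitions can be written out directly for repeated measurements at one Raman shift. The sketch below is illustrative only: how the signal-to-noise ratio is expressed as a percentage is not stated in the source, so it is assumed here to arrive as a fraction between 0 and 1, and the function names are hypothetical.

```python
from statistics import mean, pstdev

def sensitivity(snr_values, threshold=0.5):
    """Share of repeated spectra whose signal-to-noise ratio is at least
    the threshold (50% by default)."""
    return sum(1 for s in snr_values if s >= threshold) / len(snr_values)

def stability(intensities):
    """Share of repeated intensities lying strictly inside (mu - delta, mu + delta),
    where mu is the mean and delta the population standard deviation."""
    mu, delta = mean(intensities), pstdev(intensities)
    return sum(1 for v in intensities if mu - delta < v < mu + delta) / len(intensities)

def is_repeatable(snr_values, intensities, minimum=0.5):
    """Repeatable when both sensitivity and stability reach 50%."""
    return sensitivity(snr_values) >= minimum and stability(intensities) >= minimum
```

Note that with the open interval used here, a set of perfectly identical intensities has δ = 0 and therefore a stability of 0; a closed interval would be an equally defensible reading.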
[Step 4]
The fourth step calculates the fractional bandwidth.
The fractional bandwidth is a relative bandwidth: the bandwidth relative to the center frequency.
Fractional bandwidth = bandwidth / center frequency
This concept is needed because bandwidth should be judged not in absolute terms but relative to the center frequency.
The fourth step is performed so that, among the final main-peak candidates, only a single peak is chosen to represent each fractional bandwidth. If several main-peak candidates fall within the same fractional bandwidth, the peak that best satisfies the conditions is selected; if candidates receive the same score, the left candidate peak closer to the center frequency is selected.
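A minimal sketch of the fractional bandwidth and of the one-peak-per-band rule follows. The scoring of candidates is not specified in the source, so a numeric score is simply assumed to be supplied; the tie-break is read as "nearest to the center, and the lower (left) shift if still tied". All names are hypothetical.

```python
def fractional_bandwidth(bandwidth, center):
    """Bandwidth relative to the center frequency."""
    return bandwidth / center

def pick_representative(candidates, center):
    """candidates: list of (shift, score) pairs inside one fractional bandwidth.
    The highest score wins; ties go to the candidate nearest the center,
    and then to the lower (left) shift."""
    return min(candidates, key=lambda c: (-c[1], abs(c[0] - center), c[0]))
```

For instance, a 50 cm⁻¹ band centered at 1,000 cm⁻¹ has a fractional bandwidth of 0.05, and two equally scored candidates at 995 and 1,005 cm⁻¹ resolve to the one at 995 cm⁻¹.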
[Step 5]
The fifth step calculates the spectral selectivity by selecting, as the selection value within the corresponding shift, a spectrum that is repeatable as defined in the third step and has a stability of 80% or more and a sensitivity of 90% or more.
The fifth step is performed to select, among the main-peak candidates within each fractional bandwidth, the one representative main peak that best satisfies the conditions. This yields the representative peaks that will be used for the final identification of the molecule.
Selectivity
In this specification, selectivity refers to selecting, as the selection value within the corresponding shift, a spectrum that is repeatable as defined above and has a stability of 80% or more and a sensitivity of 90% or more.
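The selection rule can be sketched as a filter over candidate records. The source states the selection criteria but gives no formula for a selectivity number, so the ratio returned by `selectivity` below (selected spectra over all candidates) is an assumption, as are the record keys and function names.

```python
def select_spectra(records, min_stability=0.80, min_sensitivity=0.90):
    """records: list of dicts with keys 'shift', 'stability', 'sensitivity'
    and 'repeatable' (hypothetical layout). Returns those that qualify."""
    return [
        r for r in records
        if r["repeatable"]
        and r["stability"] >= min_stability
        and r["sensitivity"] >= min_sensitivity
    ]

def selectivity(records, **thresholds):
    """Assumed reading: share of candidate spectra that pass the selection."""
    if not records:
        return 0.0
    return len(select_spectra(records, **thresholds)) / len(records)
```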
[Step 6]
The sixth step constructs a Raman spectrum database by inputting: the (a) Raman shift values and, at each shift value, the (b) Raman maximum and (c) Raman minimum of the first step; the (a') Raman shift values and, at each shift value, the (b') Raman maximum and (c') Raman minimum generated by machine learning in the second step; the (d) sensitivity, (e) stability, and (f) repeatability inferred by machine learning in the third step; and the spectral selectivity calculated in the fifth step.
The (a) one or more Raman shift values and, at each shift value, the (b) Raman maximum and (c) Raman minimum derived from each sample form the first data set of that sample, which is the training target of the second step, and can be expressed as the Raman spectra of the sample.
The second data set, built through machine learning, may be the (a') Raman shift values and, at each shift value, the (b') Raman maximum and (c') Raman minimum generated by training in the second step; it too can be expressed as Raman spectra, here derived through machine learning for the sample.
The third data set, inferred by machine learning, may be the (d) sensitivity, (e) stability, and (f) repeatability inferred by machine learning in the third step, together with the spectral selectivity calculated in the fifth step.
The first, second, and third data sets may together be built into the Raman scattering spectrum database of the present invention.
In the sixth step of constructing the Raman spectrum database, the following items (i) to (iv) may be stored:
(i) the code assigned to the substance;
(ii) all selected Raman shift values for the substance;
(iii) the relative intensity value at every selected shift value; and
(iv) the baseline difference value of the negative control used for each shift.
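Items (i) to (iv) suggest a simple per-substance record. The sketch below shows one way such a record could be laid out; the class name, field names, the in-memory dictionary used as the database, and the consistency check are all assumptions rather than the storage scheme of the source.

```python
from dataclasses import dataclass, field

@dataclass
class SubstanceRecord:
    code: str                                                 # (i) code assigned to the substance
    selected_shifts: list = field(default_factory=list)       # (ii) selected Raman shifts, cm^-1
    relative_intensities: list = field(default_factory=list)  # (iii) one value per selected shift
    baseline_differences: list = field(default_factory=list)  # (iv) negative-control baseline difference per shift

    def __post_init__(self):
        # Items (ii) to (iv) are parallel lists indexed by selected shift.
        n = len(self.selected_shifts)
        if len(self.relative_intensities) != n or len(self.baseline_differences) != n:
            raise ValueError("items (ii) to (iv) need one entry per selected shift")

def store(database, record):
    """Insert or overwrite the record under its substance code."""
    database[record.code] = record
    return database
```

Keying the database by substance code keeps lookups trivial; a real deployment would more likely use a persistent store, which is outside what the source describes.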
[Step 7]
In the seventh step, when (a) the Raman shift values and, at each shift value, (b) the Raman maximum and (c) the Raman minimum of a sample from the first step are input, desired prediction information is calculated from the Raman spectrum database constructed in the sixth step.
The prediction information calculated in the seventh step may be one or more output values obtained by passing through a specific function one or more input values selected from the group consisting of: the (a') Raman shift values and, at each shift value, the (b') Raman maximum and (c') Raman minimum generated by training in the second step; the (d) sensitivity, (e) stability, and (f) repeatability inferred by machine learning in the third step; and the spectral selectivity calculated in the fifth step.
Here, the function may have a relationship such as that shown in FIG. 5.
The prediction information (output value) produced by the function (hidden layer) may be information on the state of the animal or cell from which the sample used to derive the Raman shift value (a) in the first step was taken, a disease diagnosis and/or an evaluation of the effect of a therapeutic agent, and/or bacterial infection information for the sample.
The prediction information (output value) produced by the function (hidden layer) may also be the presence and/or concentration of a specific biological material (e.g., protein, amino acid, lipid, nucleic acid), or the identification and/or concentration of cell-derived (e.g., bacteria, cancer cells, normal cells) chemical bonds, constituents, and/or cell types.
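The source describes the function only as a neural network with a hidden layer (FIG. 5). To make the input-to-output mapping concrete, the sketch below shows the forward pass of a network with a single hidden layer. The layer sizes, the tanh and sigmoid activations, the weights, and the function name are all assumptions; a trained model would supply the actual weights.

```python
import math

def forward(inputs, hidden_weights, hidden_biases, output_weights, output_bias):
    """One hidden layer (tanh) followed by a sigmoid output in (0, 1).
    hidden_weights holds one row of input weights per hidden unit."""
    hidden = [
        math.tanh(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(hidden_weights, hidden_biases)
    ]
    z = sum(w * h for w, h in zip(output_weights, hidden)) + output_bias
    return 1.0 / (1.0 + math.exp(-z))
```

The inputs here could be, for example, selected shift values with their sensitivity, stability, and selectivity, and the output a score for one predicted property such as the presence of a target substance.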
[Step 8]
The eighth step applies when a Raman spectrum database dedicated to liquid samples is built separately in the sixth step: spectra from liquid samples are stored separately through spectral intensity indexing in order to build the liquid-sample Raman spectra.
[Step 9]
The ninth step filters liquid-sample spectra to reduce noise spectral intensity using spectral pattern matching, so that a Raman spectrum database dedicated to liquid samples can be built separately in the sixth step.
In the ninth step, spectral pattern matching may be judged by the rate of agreement with the Raman shift values selected for the substance in question.
Judging spectral pattern matching by this rate of agreement may use at least one of the following methods (i) to (iv):
(i) noise is defined as a shift whose signal-to-noise ratio is 50% or less;
(ii) the minimum and maximum of the acquired Raman shift values are aligned with the previously acquired reference minimum and maximum for the substance, and the scale is adjusted accordingly;
(iii) a value within 1% on either side of a selected Raman shift value is judged to match;
(iv) a match is declared when the agreement rate over all selected spectra is 95% or more.
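Rules (ii) to (iv) can be combined into a small matching routine, sketched below under stated assumptions: the rescaling in (ii) is taken to be a linear map of the measured shift range onto the reference range, the 1% tolerance in (iii) is taken relative to each selected shift value, and noise removal per rule (i) is assumed to have happened upstream. All function names are hypothetical.

```python
def rescale(values, ref_min, ref_max):
    """(ii) Linearly map measured shifts so that their minimum and maximum
    coincide with the reference minimum and maximum."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [ref_min for _ in values]
    scale = (ref_max - ref_min) / (hi - lo)
    return [ref_min + (v - lo) * scale for v in values]

def match_rate(measured, selected, tolerance=0.01):
    """(iii) A selected shift counts as matched when some measured shift
    lies within +/- tolerance (1%) of it. Returns the matched share."""
    hits = sum(
        1 for s in selected
        if any(abs(m - s) <= tolerance * s for m in measured)
    )
    return hits / len(selected)

def is_match(measured, selected, ref_min, ref_max, required=0.95):
    """(iv) Declare a match when at least 95% of selected shifts are matched."""
    return match_rate(rescale(measured, ref_min, ref_max), selected) >= required
```

With only a handful of selected shifts per substance, the 95% threshold effectively requires every selected shift to be found, which is worth keeping in mind when choosing how many shifts to select.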
Because biosensors bear on human health, quality of life, and life itself, efforts to raise sensor sensitivity must be accompanied by efforts to raise accuracy. As a sensor becomes more sensitive, it also becomes more sensitive to non-specific signals, so false-positive results can occur. Since these can lead to adverse consequences such as drug overuse and unnecessary treatment, accuracy should be the first consideration when developing a sensor.
The method according to the present invention for constructing and searching a Raman scattering spectrum database through machine learning can solve this problem.
FIG. 1 is a diagram of the algorithm that drives the method according to the present invention for constructing and searching a Raman scattering spectrum database through machine learning.
FIG. 2 is a schematic diagram explaining the principle of Raman scattering.
FIG. 3 is an example diagram explaining how to obtain the Raman spectrum information of a sample, namely (a) one or more Raman shift values and, at each shift value, (b) the Raman maximum and (c) the Raman minimum.
FIG. 4 is a schematic diagram describing artificial intelligence (AI), the overall category that includes machine learning and natural language processing.
FIG. 5 is a schematic diagram illustrating the architecture of a neural network, which is a kind of function.
FIG. 6 is a schematic diagram of propagating plasmons and localized surface plasmon resonance (LSPR).
FIG. 7 is a schematic diagram showing the components of a biosensor and their relationships.
FIG. 8 is a schematic diagram showing the operating principle of a biosensor using metal nanoparticles that exhibit localized surface plasmon resonance (LSPR).
FIG. 9 shows the operating principle by which the nucleic acid-based self-assembled complex for Raman (NEW structure) acts as a turn-off signal sensor in the presence of a target nucleic acid.
FIG. 10 is a graph of 100 repeated measurements of the Raman signal obtained with the NEW structure prepared in Example 1.
FIG. 11 shows the Raman signal obtained with the NEW structure prepared in Example 1. The left graph is the negative control (NC) state, in which no target nucleic acid is present and the NEW structure remains intact without dissociating; the right graph is the positive control (PC) state, in which target nucleic acid is present and the structure has completely dissociated.
FIG. 12 shows the Raman signal obtained with the NEW structure prepared in Example 1; in turn-off fashion, the signal decreases progressively as the amount of target nucleic acid increases.
FIG. 13 is a conceptual schematic of an exemplary computer server used to run the systems and methods described in this specification.
The terms used in this specification are, as far as possible, general terms in wide current use. In certain cases, however, terms have been chosen by the applicant, and their meaning should be understood from how they are described or used in the detailed description of the invention rather than from the name alone.
Hereinafter, the technical configuration of the present invention will be described in detail with reference to the accompanying drawings and preferred embodiments.
1. 기계 학습 또는 머신 러닝1. Machine Learning or Machine Learning
머신 러닝은 명시적인 프로그래밍을 통해서가 아닌 데이터로부터 시스템을 학습할 수 있는 AI의 한 형태이다(도 4). 그러나, 머신 러닝은 단순한 프로세스가 아니다. 알고리즘을 통해 학습 데이터를 수집한 후 해당 데이터를 기반으로 더 정확한 모델을 생성할 수 있다. 머신 러닝 모델은 데이터를 이용하여 머신 러닝 알고리즘을 트레이닝시킬 때 생성되는 출력이다. 학습을 마친 후 모델에 입력 내용을 제공하면 결과물을 받게 된다. 예컨대, 예측 알고리즘에서는 예측 모델이 생성된다. 그런 다음 예측 모델에 데이터를 제공하면 해당 모델을 학습한 데이터를 기반으로 예측 정보를 받게 된다.Machine learning is a form of AI that can learn systems from data rather than through explicit programming (Figure 4). However, machine learning is not a simple process. After collecting training data through an algorithm, a more accurate model can be created based on that data. A machine learning model is an output generated when training a machine learning algorithm using data. After training, you provide input to the model and you will receive the output. For example, in a predictive algorithm, a predictive model is generated. Then, you provide data to a predictive model, and you receive predictions based on the data that you trained on that model.
2. 머신 러닝의 반복 학습2. Iterative learning in machine learning
머신 러닝을 사용하면 모델을 배치하기 전에 데이터 세트를 기반으로 학습할 수 있다. 일부 머신 러닝 모델은 온라인에서 이루어지며 지속적이다. 온라인 모델의 이러한 반복적인 프로세스를 통해 데이터 요소 사이에서 이루어지는 연결 유형이 개선될 수 있다. 이러한 패턴과 연관성은 그 복잡성과 크기로 인해 사용자가 간과하기 쉽다. 모델을 학습한 후에는 모델을 실시간으로 사용하여 데이터로부터 학습할 수 있다. 머신 러닝에 포함되는 학습 과정 및 자동화로 인해 정확도가 향상될 수 있다.Machine learning allows a model to be trained on a data set before it is deployed. Some machine learning models are online and persistent. This iterative process of the online model can improve the types of connections made between data elements. These patterns and associations are easy to overlook by users due to their complexity and size. After training the model, you can use the model in real time to learn from the data. The learning process and automation involved in machine learning can improve accuracy.
3. Approaches to Machine Learning
Machine learning techniques are needed to improve the accuracy of predictive models. Depending on the type and volume of data, various approaches are available, as follows.
3-1. Supervised Learning
Supervised learning generally starts from an established data set and a solid understanding of how that data is classified. It is a way of finding patterns in data so that they can be applied to an analytic process. Such data have labeled features that define the meaning of the data.
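As a minimal sketch of learning from labeled data, the example below uses a nearest-centroid classifier. The classifier choice, the two-element feature vectors (imagined as intensities at two Raman shifts), and the labels are assumptions made for illustration and are not taken from the specification.

```python
from math import dist

def fit_centroids(samples, labels):
    """Average the feature vectors of each label to obtain one centroid per class."""
    sums, counts = {}, {}
    for vec, lab in zip(samples, labels):
        acc = sums.setdefault(lab, [0.0] * len(vec))
        for i, v in enumerate(vec):
            acc[i] += v
        counts[lab] = counts.get(lab, 0) + 1
    return {lab: [s / counts[lab] for s in acc] for lab, acc in sums.items()}

def classify(centroids, vec):
    """Return the label whose centroid is closest (Euclidean distance) to vec."""
    return min(centroids, key=lambda lab: dist(centroids[lab], vec))

# Hypothetical labeled training set: [intensity at shift A, intensity at shift B].
samples = [[1.0, 0.1], [0.9, 0.2], [0.2, 0.9], [0.1, 1.0]]
labels = ["unlabeled", "unlabeled", "labeled", "labeled"]
centroids = fit_centroids(samples, labels)
result = classify(centroids, [0.8, 0.3])
```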
3-2. Unsupervised Learning
Unsupervised learning is used when the problem involves a massive amount of unlabeled data. Understanding the meaning hidden in such data requires algorithms that classify the data based on the patterns or clusters they find. Unsupervised learning runs an iterative process and analyzes data without human intervention.
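A small sketch of clustering without labels follows. It uses one-dimensional k-means with two clusters, which is one common choice and not necessarily the algorithm intended by the specification; the input values (imagined as observed peak positions in cm⁻¹) are hypothetical.

```python
def two_means(values, iterations=10):
    """Split 1-D values into two clusters and return the two cluster centers."""
    lo, hi = min(values), max(values)
    if lo == hi:  # all values identical: nothing to separate
        return lo, hi
    for _ in range(iterations):
        # Assign every value to the nearer of the two current centers.
        group_lo = [v for v in values if abs(v - lo) <= abs(v - hi)]
        group_hi = [v for v in values if abs(v - lo) > abs(v - hi)]
        # Move each center to the mean of its group.
        lo = sum(group_lo) / len(group_lo)
        hi = sum(group_hi) / len(group_hi)
    return lo, hi

# Hypothetical unlabeled observations.
centers = two_means([1003.5, 1004.2, 1004.0, 966.8, 967.3])
```

No label is supplied at any point; the grouping emerges from the repeated assign-and-update loop alone.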
3-3. Reinforcement Learning
Reinforcement learning is a behavioral learning model. The algorithm incorporates feedback obtained from analyzing the data and guides the user to the best outcome. Reinforcement learning differs from supervised learning in that the system is not trained with a sample data set; instead, it learns through trial and error. A sequence of successful decisions therefore reinforces the process, because it best solves the problem at hand.
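Trial-and-error learning from feedback can be sketched with an epsilon-greedy two-armed bandit. This is a textbook toy example chosen for illustration, not something described in the specification; the fixed reward values, the exploration rate, and the name `run_bandit` are assumptions.

```python
import random

def run_bandit(rewards, steps=500, epsilon=0.1, seed=0):
    """Estimate the value of each action by trying actions and averaging the feedback."""
    rng = random.Random(seed)
    estimates = [0.0] * len(rewards)
    counts = [0] * len(rewards)
    for _ in range(steps):
        if rng.random() < epsilon:
            arm = rng.randrange(len(rewards))  # explore: try a random action
        else:
            arm = max(range(len(rewards)), key=lambda a: estimates[a])  # exploit
        counts[arm] += 1
        # Incremental mean of the feedback received for this action.
        estimates[arm] += (rewards[arm] - estimates[arm]) / counts[arm]
    return estimates

# Hypothetical fixed feedback: action 1 is better than action 0.
estimates = run_bandit([0.2, 0.8])
best_action = estimates.index(max(estimates))
```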
3-4. Deep Learning
Deep learning is a specific machine learning method that incorporates neural networks in successive layers so that it can learn from data iteratively. It is especially useful for learning patterns from unstructured data. Computers can thus be trained to deal with poorly defined abstractions and problems.
4. Big Data in the Machine Learning Environment
Machine learning requires that the right data set be applied to the learning process. Preprocessing big data to suit its intended use can increase the reliability of both the raw data and the learning results, and the availability of big data can help improve the accuracy of machine learning models. With big data, data can be virtualized so that it is stored in the most efficient and cost-effective way, whether on premises or in the cloud. In addition, improvements in network speed and reliability can remove other physical limitations associated with managing large volumes of data at an acceptable speed.
5. Comprehensive Operation of Machine Learning
The advantage of machine learning is that algorithms and models can be used to predict outcomes. The key is to use the right algorithm, collect the most suitable data (that is, accurate and clean data), and ensure that the best-performing model is always in use. When all these elements come together, the model can be trained continuously on the data and can learn again from its own results. Automating this cycle of modeling, model training, and testing yields accurate predictions and makes it possible to provide a variety of biological prediction information from the Raman spectrum database.
6. Raman Shift
Raman spectroscopy analyzes the chemical composition and structure of molecules by means of the Raman effect. When intense light of a single wavelength irradiates a material, most of it undergoes elastic scattering, but part of the light interacts with the vibrations of the molecules and is scattered at a different frequency; this inelastic scattering is Raman scattering. The amount by which Raman-scattered light is shifted relative to elastically scattered light is called the Raman shift, and plotting it as a spectrum expresses the characteristics of the medium.
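The Raman shift is conventionally reported in wavenumbers (cm⁻¹) as the difference between the reciprocal excitation wavelength and the reciprocal scattered wavelength. The sketch below applies that standard definition; the function names are illustrative, and the 532 nm and 1004 cm⁻¹ inputs are used only as example values.

```python
def raman_shift_cm1(excitation_nm, scattered_nm):
    """Raman shift in cm^-1 from excitation and scattered wavelengths in nm.

    1 nm = 1e-7 cm, so 1e7 / wavelength_nm gives the wavenumber in cm^-1.
    """
    return 1e7 / excitation_nm - 1e7 / scattered_nm

def scattered_wavelength_nm(excitation_nm, shift_cm1):
    """Inverse relation: wavelength (nm) at which a given Stokes shift appears."""
    return 1e7 / (1e7 / excitation_nm - shift_cm1)

# Example inputs: a 532 nm laser and a band at 1004 cm^-1.
wavelength = scattered_wavelength_nm(532.0, 1004.0)
shift_back = raman_shift_cm1(532.0, wavelength)
```

Because the shift is a difference of wavenumbers, the same band appears at the same cm⁻¹ value regardless of which excitation wavelength is used, which is what makes shift values suitable as database keys.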
Raman spectroscopy can therefore examine cellular constituents such as the proteins, lipids, and nucleic acids of bacteria, and the result is expressed as a Raman spectrum whose signal intensities across the Raman shift range of 400 to 3200 cm⁻¹ differ according to the properties of the constituents. In theory, each bacterium has its own characteristic Raman spectrum, and some bacteria can be distinguished at the species level. Moreover, the expression of genetic traits under different environmental conditions affects cell composition, and this appears as changes in the Raman spectrum, so that information on the state of cells within the same species can be obtained. In general, a 532 nm laser, which has the least fluorescence background, is used when measuring single cells. Table 1 lists Raman shift information for chemical bonds and cellular constituents that show resonance with a 532 nm laser.
Figure PCTKR2021020362-appb-T000001
In addition, the characteristics of chemical bonds such as C-C and C-N and of various constituents such as DNA and amino acids can be identified. In particular, the amino acid phenylalanine, which appears in typical Raman spectra of bacteria, can be measured at a Raman shift of 1004 cm⁻¹ and serves as a key indicator for judging the accuracy of a bacterial Raman spectrum.
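Using the phenylalanine band as a quality indicator could be sketched as a simple check for a peak near 1004 cm⁻¹. The window width, the median baseline, the threshold ratio, and the function name are assumptions chosen for illustration; the specification does not state how the check is performed.

```python
from statistics import median

def has_marker_peak(shifts, intensities, center=1004.0, window=5.0, min_ratio=3.0):
    """Return True if the strongest point within center +/- window stands out
    from the spectrum's median intensity by at least min_ratio."""
    in_window = [i for s, i in zip(shifts, intensities) if abs(s - center) <= window]
    if not in_window:
        return False
    baseline = median(intensities)
    return baseline > 0 and max(in_window) >= min_ratio * baseline

# Hypothetical spectrum fragment: Raman shift (cm^-1) and intensity (arbitrary units).
shifts = [990, 995, 1000, 1004, 1008, 1015, 1020]
with_peak = [1.0, 1.1, 1.2, 5.0, 1.3, 1.0, 0.9]
flat = [1.0, 1.1, 1.0, 1.1, 1.0, 1.1, 1.0]
```

A spectrum that fails such a check might be flagged for re-measurement before it is added to a database.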
In medicine, Raman spectroscopy is used to detect infectious bacteria and thereby diagnose diseases caused by bacterial infection. For example, urinary tract infection can be diagnosed by detecting, in clinical samples, the enteric bacteria Escherichia coli and Enterococcus faecalis, which are the main causes of the inflammation, and the method can also be applied to evaluating the effect of prescribed antibiotics. For E. coli in particular, the effect of treatment with four antibiotics (ampicillin, ciprofloxacin, gentamicin, and sulfamethoxazole) can be confirmed directly. A database for Mycobacterium tuberculosis, known as the principal pathogen of tuberculosis, can be built and used for tuberculosis diagnosis. In this way, Raman spectroscopy can be applied to real-time bacterial detection and used for disease diagnosis.
Beyond medicine, Raman spectroscopy is actively used in the food sector. It can rapidly detect whether food is contaminated with bacteria that cause foodborne illness, such as Salmonella spp., Escherichia coli, Pseudomonas aeruginosa, Listeria monocytogenes, Legionella spp., and Staphylococcus aureus. Contaminating bacteria in milk and meat (chicken, minced meat, etc.) can currently be detected at the single-cell level. The use of Raman spectroscopy in the food sector is widespread, to the extent that ISO international standards for Salmonella spp. detection have been established for various foods. In addition, Pseudomonas aeruginosa and Legionella spp. can be detected in tap water and in commercially sold drinking-water samples, and the results can be used for water-quality and public-health management.
Furthermore, using enhancement techniques such as SERS or TERS, medicine also incorporates the direct examination of cells by Raman spectroscopy into the diagnostic process and uses it for the prevention of diseases such as cancer. Breast cancer cells showed lower signal intensity than normal cells at a Raman shift of 1003 cm⁻¹, and platelets taken from mice carrying an Alzheimer's disease transgene are known to show higher signal intensity at 740 cm⁻¹ and 1654 cm⁻¹ than normal controls. Because this high sensitivity and specificity make it possible to distinguish minute differences in Raman scattering signals, the method can be applied to disease diagnosis.
Even without a separate enhancement technique, measurement sensitivity for biomolecules has improved greatly, and applications are expanding to analysis at the single-cell level. In some cases, separate sample pretreatment steps such as bacterial fixation, culture, and hybridization can be omitted, and the short measurement time (10 to 60 seconds) allows near-real-time analysis.
In addition, because water causes little interference during measurement, liquid samples such as culture media or environmental sample extracts can be measured as they are. Because the risk of cell destruction is low even after measurement, detected single cells can be isolated and used for culture and single-cell genome analysis. These advantages, unique to Raman spectroscopy, make it possible to analyze the characteristics of bacteria in environmental samples at the single-cell level.
Furthermore, physiological activity can be analyzed using SIP-Raman technology. SIP (stable isotope probing) is a method for studying specificity toward a particular substrate by analyzing the bacteria that use a substrate containing stable isotopes such as carbon-13 and nitrogen-15. Cell constituents such as nucleic acids and amino acids become labeled with the stable isotope, and the labeled constituents show Raman shift values different from those normally measured, which distinguishes them from bacteria cultured on an ordinary substrate. For example, the Raman band of carbon-12 phenylalanine is observed at 1004 cm⁻¹, whereas that of carbon-13-labeled phenylalanine appears at 967 cm⁻¹. In addition to phenylalanine, the physiological activity of bacteria can be measured directly by comparing the distinguishable differences in Raman shift that arise when cell constituents are substituted with the isotopes carbon-13, nitrogen-15, and hydrogen-2.
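Why a heavier isotope lowers the Raman shift can be illustrated with the harmonic-oscillator approximation, in which the vibrational wavenumber scales with the inverse square root of the reduced mass. The sketch below treats a vibration as a simple two-body oscillator, which is a rough textbook simplification and not a calculation given in the specification; the 2900 cm⁻¹ C-H stretching value is an assumed typical figure.

```python
from math import sqrt

def reduced_mass(m1, m2):
    """Reduced mass of a two-body oscillator (atomic mass units)."""
    return m1 * m2 / (m1 + m2)

def shifted_wavenumber(nu_cm1, light_pair, heavy_pair):
    """Scale a wavenumber by sqrt(mu_light / mu_heavy), assuming an unchanged force constant."""
    return nu_cm1 * sqrt(reduced_mass(*light_pair) / reduced_mass(*heavy_pair))

# Carbon-12 pair replaced by a carbon-13 pair, starting from the 1004 cm^-1 band.
c13_estimate = shifted_wavenumber(1004.0, (12, 12), (13, 13))  # about 965 cm^-1

# C-H replaced by C-D, starting from an assumed 2900 cm^-1 C-H stretch.
cd_estimate = shifted_wavenumber(2900.0, (12, 1), (12, 2))  # about 2128 cm^-1
```

Even this crude estimate lands close to the 967 cm⁻¹ value quoted for carbon-13-labeled phenylalanine, and the deuterium estimate shows why a much lighter atom such as hydrogen produces a far larger displacement on substitution.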
To identify naphthalene-degrading bacteria in groundwater, the sample can be cultured with carbon-13-labeled naphthalene as the carbon source, after which single bacterial cells are measured. In such a measurement, the phenylalanine of Acidovorax, a naphthalene-degrading bacterium, showed a Raman signal at a Raman shift of 967 cm⁻¹, unlike other single bacterial cells, which verifies that Acidovorax uses naphthalene as a carbon source. As another example, when carbon-13 sodium bicarbonate is supplied in order to detect carotenoid-containing cyanobacteria in an environmental sample, the phenylalanine Raman signal of the labeled cyanobacteria appears at 991 cm⁻¹.
The intensity at the Raman shift changes in proportion to the fraction of cell constituents labeled with carbon-13. This means that bacterial activity toward a substrate can be analyzed quantitatively using the intensity information at the Raman shift. However, when the degree of carbon-13 labeling is below 10%, it cannot be distinguished from carbon-12 because of the detection limit of Raman spectroscopy, so a high concentration of the isotope must be used in the substrate.
When a nitrogen isotope is used, the change in Raman shift is relatively small compared with a carbon isotope, and, unlike carbon, which showed differences in amino acids, the change occurs mainly in the Raman shift values of nucleic acids. Even these changes are buried among other Raman bands and are very difficult to distinguish in complex samples. When E. coli cultured in media containing different ratios of nitrogen-14 ammonium chloride and nitrogen-15 ammonium chloride is measured by Raman spectroscopy, no change in the position of the Raman shift can be confirmed, but the signal intensity increases in proportion to the concentration of isotope supplied. A change in intensity that is not accompanied by a change in position can also result from differences in experimental procedure (focus adjustment, measurement position on the bacterium, etc.), so it is of limited use for distinguishing the physiological activity of bacteria. To see the changes caused by nitrogen isotopes more clearly, the measurement can therefore be combined with an enhancement technique such as SERS.
Hydrogen-2 (D; deuterium), an isotope of hydrogen, is mainly used to study lipid metabolism, and cell constituents labeled with hydrogen-2 (C-D bonds) can be measured at Raman shifts of 2000 to 2300 cm⁻¹. When no isotope is used, there is no notable signal in the 2000 to 2300 cm⁻¹ region, but when C-D bonds are present a new signal is measured. It has also been reported that, when cell constituents other than lipids are sufficiently labeled with hydrogen-2, the Raman shift of phenylalanine moves to 959 cm⁻¹, similar to the carbon isotope result.
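The proportionality between carbon-13 labeling and band intensity described above suggests a simple quantitative estimate. The sketch below takes the labeled fraction as the labeled-band intensity divided by the sum of the labeled and unlabeled band intensities, and reports nothing below the 10% limit mentioned in the text. The ratio formula itself and the function name are assumptions for illustration; the specification does not give a formula.

```python
def labeled_fraction(intensity_labeled, intensity_unlabeled, detection_limit=0.10):
    """Estimate the labeled fraction from two band intensities.

    Returns None when there is no signal or when the estimate falls below
    the detection limit, where labeled and unlabeled cells cannot be told apart.
    """
    total = intensity_labeled + intensity_unlabeled
    if total <= 0:
        return None
    fraction = intensity_labeled / total
    return fraction if fraction >= detection_limit else None

# Hypothetical intensities at the labeled and unlabeled phenylalanine bands.
estimate = labeled_fraction(30.0, 70.0)
below_limit = labeled_fraction(5.0, 95.0)
```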
Because lipid-containing regions are widespread within bacteria, the approach can be applied not only to simple detection but also to monitoring lipid metabolism with imaging techniques. However, even when the hydrogen of an ordinary substrate is replaced with the hydrogen-2 isotope, the amount of hydrogen incorporated into cell constituents is relatively small, so the change in Raman shift is difficult to confirm. For example, in Geobacter metallireducens cultured on acetic acid substituted with hydrogen-2, the change in the Raman signal due to the isotope is difficult to observe. To improve detectability, physiological activity can instead be analyzed by replacing the culture medium with heavy water (D₂O) rather than substituting the hydrogen of the substrate. Because bacteria that are active on a substrate exchange hydrogen with the culture medium during lipid biosynthesis, physiologically active bacteria can be analyzed through the Raman shift without using an isotope-substituted substrate.
Experiments with heavy water may be difficult because environmental samples contain a variety of substrates (including organic matter released from dead organisms), but analysis at the level of the microbial community is also possible if a dual-labeling technique that includes carbon or nitrogen isotopes is used.
7. Analytes to Be Detected by a Raman Signal at a Known Raman Shift Value
For example, the analyte to be detected may be an amino acid, peptide, polypeptide, protein, glycoprotein, lipoprotein, nucleoside, nucleotide, oligonucleotide, nucleic acid, sugar, carbohydrate, oligosaccharide, polysaccharide, fatty acid, lipid, hormone, metabolite, cytokine, chemokine, receptor, neurotransmitter, antigen, allergen, antibody, substrate, cofactor, inhibitor, drug, pharmaceutical, nutrient, prion, toxin, poison, explosive, pesticide, chemical warfare agent, biohazardous agent, radioisotope, vitamin, heterocyclic aromatic compound, carcinogen, mutagen, anesthetic, amphetamine, barbiturate, hallucinogen, waste product, or contaminant.
When the analyte is a nucleic acid, the nucleic acid includes genes, viral RNA and DNA, bacterial DNA, fungal DNA, mammalian DNA, cDNA, mRNA, RNA and DNA fragments, oligonucleotides, synthetic oligonucleotides, modified oligonucleotides, single-stranded and double-stranded nucleic acids, and natural and synthetic nucleic acids.
Non-limiting examples of biomolecules capable of recognizing the analyte include antibodies, antibody fragments, genetically engineered antibodies, single-chain antibodies, receptor proteins, binding proteins, enzymes, inhibitor proteins, lectins, cell adhesion proteins, oligonucleotides, polynucleotides, nucleic acids, and aptamers.
When an analyte is to be detected by a Raman signal at a known Raman shift value, and (a) the analyte itself or (b) a biomolecule capable of recognizing the analyte is a molecule, or a compound thereof, in which polarization occurs, the Raman signal at the Raman shift value of (a) the analyte itself or (b) the recognizing biomolecule may be measured. Alternatively, a Raman indicator, described below, may be attached to (a) the analyte itself or (b) the recognizing biomolecule, and the Raman signal at the Raman shift value of the Raman indicator may be measured.
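Detection at a known Raman shift value amounts to looking up measured peak positions in a reference table. The following is a minimal sketch under stated assumptions: the table `KNOWN_SHIFTS`, its labels, the tolerance, and the function name are hypothetical, and only the two shift values are taken from section 6.

```python
# Hypothetical reference table: Raman shift (cm^-1) -> label.
# The shift values are those quoted in section 6; the labels are illustrative.
KNOWN_SHIFTS = {
    1004.0: "phenylalanine (carbon-12)",
    967.0: "phenylalanine (carbon-13 labeled)",
}

def match_peaks(peak_shifts, known=KNOWN_SHIFTS, tolerance=3.0):
    """Pair each measured peak with the nearest known shift if it lies within tolerance."""
    matches = []
    for peak in peak_shifts:
        nearest = min(known, key=lambda ref: abs(ref - peak))
        if abs(nearest - peak) <= tolerance:
            matches.append((peak, known[nearest]))
    return matches

# Hypothetical measured peak positions; 1500.0 has no entry and is ignored.
found = match_peaks([1003.2, 1500.0, 968.1])
```

The same lookup applies whether the known shift belongs to the analyte, to the recognizing biomolecule, or to an attached Raman indicator; only the contents of the table change.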
8. Surface-Analysis Raman Spectroscopy Using Localized Surface Plasmon Resonance (LSPR)
Au and Ag have a higher free-electron density than other metals and a relatively low ionization tendency, so they are very stable. The high free-electron density also makes the real part of the metal's permittivity negative and gives the metal a large polarizability, which produces strong electric-field enhancement. The imaginary part indicates the degree of light absorption, that is, energy loss, so it must be small for efficient enhancement.
Accordingly, Au has its lowest imaginary part, together with a relatively low real part of the permittivity, at about 630 nm in the visible range. For Ag, when the real and imaginary parts of the permittivity are considered together, the values allow efficient enhancement at about 530 nm.
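The roles of the real and imaginary parts of the permittivity can be illustrated with the textbook quasi-static result for a small sphere, whose polarizability is proportional to (ε − ε_m)/(ε + 2ε_m), where ε is the metal permittivity and ε_m that of the surrounding medium. This formula is a standard approximation and is not stated in the specification; the permittivity values below are hypothetical and are not measured data for Au or Ag.

```python
def polarizability_factor(eps_metal, eps_medium=1.0):
    """Quasi-static factor (eps - eps_m) / (eps + 2*eps_m) for a small sphere.

    Its magnitude grows when Re(eps) approaches -2*eps_m and Im(eps) is small.
    """
    return (eps_metal - eps_medium) / (eps_metal + 2 * eps_medium)

# Same negative real part, different losses (hypothetical values).
low_loss = abs(polarizability_factor(complex(-2.0, 0.2)))
high_loss = abs(polarizability_factor(complex(-2.0, 2.0)))
```

With the real part fixed at the resonant value, the only thing limiting the response is the imaginary part in the denominator, which is why a small imaginary part is needed for efficient enhancement.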
A surface plasmon (FIG. 6) is a collective oscillation of free electrons that propagates along the interface between a metal with a permittivity less than zero, such as Ag or Au, and the surrounding dielectric, whose permittivity is greater than zero. When the frequency of the electromagnetic field incident on the metal (generally visible light) matches the frequency of the surface plasmon, resonance occurs and the field becomes stronger than the incident wave; this phenomenon is called surface plasmon resonance (SPR). The SPR field takes the form of an evanescent wave that decays exponentially with distance from the interface. SPR is divided into propagating plasmons, which occur on thin flat metal films, and localized surface plasmon resonance (LSPR), which occurs in metal nanoparticles (FIG. 6).
Surface-enhanced Raman spectroscopy (SERS) uses the principle that light irradiating the surface of metal nanoparticles such as silver or gold excites plasmon resonance at that surface and amplifies the Raman scattering signal. In other words, it exploits the fact that when a target molecule is located near a metal nanostructure, the Raman scattering signal of that molecule increases greatly. One advantage of surface-enhanced Raman scattering analysis is that it can provide information that is difficult to obtain with ordinary Raman analysis.
Surface-enhanced Raman spectroscopy (SERS) can be introduced to amplify Raman scattering signals that are relatively difficult to detect because of the small Raman scattering cross-section. By using metal nanoparticles such as silver (Ag) or gold (Au), the Raman signal of a sample adsorbed on the nanoparticle surface can be amplified and detected through the interaction between the metal nanoparticles and the incident light.
The degree of signal amplification varies with the shape and size of the metal nanoparticles and the type of metal, and it also depends on the angle, wavelength, and polarization of the incident light. By controlling these factors, SERS can detect the Raman scattering signal of even a single molecule. However, although SERS overcomes the small scattering cross-section of Raman spectroscopy, it cannot produce high-resolution Raman images because of the diffraction limit of the optical system.
A technique that can overcome this limitation in the chemical analysis of samples is tip-enhanced Raman spectroscopy (TERS), which combines scanning probe microscopy (SPM) with Raman spectroscopy. TERS is a Raman spectroscopy technique developed on the principle of surface-enhanced Raman spectroscopy (SERS).
9. Characteristics of Metal Nanoparticles Exhibiting Localized Surface Plasmon Resonance (LSPR)
Metal nanoparticles are widely used in in vivo and in vitro diagnostics because of their excellent durability and their size-dependent physical, chemical, and electrochemical properties. Because the signal arising from the material, shape, and size of a metal nanoparticle is emitted without any additional labeling substance, a stable signal can be generated over a long period. Another advantage of metal nanoparticles is that, owing to the material properties of the metal, they can amplify the signals generated by fluorescent substances, small-molecule labels, and the like. For example, the plasmon resonance of metal nanoparticles can amplify optical signals such as Raman signals and fluorescent molecular signals.
Surface-modified metal nanoparticles can also improve sensor performance, for example by amplifying electrochemical signals and improving sensitivity and selectivity. Besides their use as sensors, metal nanoparticles are used in a variety of clinical and pharmaceutical applications and for the delivery of cancer therapeutics.
Metal nanoparticles exhibiting localized surface plasmon resonance (LSPR) can be synthesized as bare metal nanoparticles, biofunctionalized metal nanoparticles, metal nanocomposites, or nanohybrids (FIG. 8).
Meanwhile, it is important to synthesize high-purity nanomaterials with consistent physicochemical characteristics, surface charge, and shape. Nanoparticles are easy to control in size, can be made from various materials depending on the purpose, and are often synthesized in aqueous solution, so large quantities can be produced by relatively simple methods.
The high surface-area-to-volume ratio of metal nanoparticles increases catalytic efficiency or improves sensor sensitivity, and particles synthesized at high purity acquire more advanced optical and electromagnetic properties, which offers many advantages in biological and clinical applications. As a result, medical diagnosis and clinical analysis can detect sensitive cells (or cells present in small numbers) and biomarkers in the human body and can examine local tissue sites precisely. Electrochemical sensor and biosensor platforms based on metal nanomaterials can be developed and used to detect the very small sample quantities typical of clinical and biologically derived specimens and to identify biomedically important analytes.
10. 나노 입자에 의해 국부적 표면 플라즈몬 공명(LSPR)을 이용한 표면 분석 라만 분광법에서 라만 쉬프트 값(a)을 도출하고자, 나노 입자 상에 연결되어 있는 라만 표지자(Raman indicator)10. Surface analysis using localized surface plasmon resonance (LSPR) by nanoparticles To derive the Raman shift value (a) in Raman spectroscopy, a Raman indicator connected to the nanoparticles
In the present invention, the Raman scattering signal of a Raman indicator may also be acquired through Raman spectroscopy.
The Raman indicator may be an organic or inorganic molecule, an atom, a complex or synthetic molecule, a dye, a naturally occurring dye (such as phycoerythrin), an organic nanostructure such as C60, a buckyball, a carbon nanotube, a quantum dot, an organic fluorescent molecule, or the like. Specifically, non-limiting examples of the Raman indicator include FAM, Dabcyl, TRITC (tetramethyl rhodamine-5-isothiocyanate), MGITC (malachite green isothiocyanate), XRITC (X-rhodamine-5-isothiocyanate), DTDC (3,3-diethylthiadicarbocyanine iodide), TRIT (tetramethyl rhodamine isothiol), NBD (7-nitrobenz-2-oxa-1,3-diazole), phthalic acid, terephthalic acid, isophthalic acid, para-aminobenzoic acid, erythrosine, biotin, digoxigenin, 5-carboxy-4',5'-dichloro-2',7'-dimethoxy fluorescein, 5-carboxy-2',4',5',7'-tetrachlorofluorescein, 5-carboxyfluorescein, 5-carboxyrhodamine, 6-carboxyrhodamine, 6-carboxytetramethyl amino phthalocyanine, azomethine, cyanines (Cy3, Cy3.5, Cy5), xanthine, succinylfluorescein, aminoacridine, quantum dots, carbon allotropes, cyanide, thiol, chlorine, bromine, methyl, phosphorus, and sulfur.
The Raman indicator should show a distinct Raman spectrum. Preferred indicators are organic fluorescent molecules, including the cyanine-based fluorescent organic molecules Cy3, Cy3.5, and Cy5 and fluorescent molecules of the FAM, Dabcyl, and rhodamine families. Organic fluorescent molecules have the advantage that they resonate with the excitation laser wavelength used in Raman analysis, so a stronger Raman scattering signal can be detected.
As one example of the Raman indicator, a fluorescent substance is a material that, owing to its particular structure, undergoes a radiative process: a molecule that has absorbed light and reached an excited state loses energy and, on returning to the stable ground state, emits that energy again as light.
Such fluorescent substances can be used in liquid-phase assays or for imaging in a biological environment. Because their signal (or color) can be tuned through molecular structure, multiplexed detection is possible, and some fluorescent molecules have affinity only for specific substances, which allows selective reactions. However, when these substances are exposed to light for a long time their signal intensity decreases, which makes long-term monitoring difficult, and the detectable fluorescence intensity is weak. To overcome these drawbacks, metal nanoparticles can be used to concentrate a large amount of fluorescent substance or to amplify the fluorescence intensity.
When a fluorescent substance is introduced onto a metal nanoparticle that also carries a receptor capable of recognizing the target substance, the fluorescent substance linked to the LSPR-exhibiting nanoparticle can amplify the signal even if only a single target is recognized, which increases sensitivity.
Gold nanoparticles are easy to surface-modify through the Au-thiol reaction, and a large amount of material can be attached to their surface, so they can be used to detect biochemical substances. Gold nanoparticles not only have a very high extinction coefficient but also have a broad absorption band that overlaps most of the emission wavelengths of commonly used energy donors, so they can act as a quencher. These properties can be used to develop sensors with an on/off signal scheme.
As one example, a hairpin-structured DNA is designed with a fluorescent substance at its end and attached to an LSPR-exhibiting gold nanoparticle. When target DNA is present, it binds complementarily, the hairpin opens, and the fluorescent substance moves away from the surface of the gold nanoparticle, so that the Raman intensity of the fluorescent substance is turned off. Measuring this change allows the biochemical substance to be detected.
11. Method for detecting a target nucleic acid using a Raman signal, and nucleic acid-based self-assembly complex for target nucleic acid detection
According to one embodiment of the present invention, the method for constructing and searching a Raman scattering spectrum database through machine learning can be used in a method for detecting a target nucleic acid using a Raman signal derived from a nucleic acid-based self-assembly complex in a liquid, the detection method comprising the following steps:
Step I: preparing a target nucleic acid detection reagent in which a nucleic acid-based self-assembly complex is formed by self-assembly between (a) a first nanoparticle-based structure, in which one or more first nucleotides that hybridize with the target nucleic acid depending on conditions are linked to a first metal nanoparticle, and (b) a second nanoparticle-based structure, in which one or more second nucleotides complementary to the first nucleotide over 10 base pairs (bp) or more are linked to a second metal nanoparticle, the reagent being designed so that a change in the Raman signal can be measured when, in the presence of the target nucleic acid, the complex is not formed or is disassembled because of hybridization between the first nucleotide and the target nucleic acid;
Step II: carrying out, in a nucleic acid-containing liquid sample, a hybridization reaction with the target nucleic acid detection reagent of step I, which contains (a) the first nanoparticle-based structure to which the first nucleotide is linked and (b) the second nanoparticle-based structure to which the second nucleotide is linked;
Step III: measuring the Raman signal derived from the nucleic acid-based self-assembly complex in the liquid sample before, after, and/or during the hybridization reaction of step II; and
Step IV: providing detection and/or quantification data for the target nucleic acid in the sample through an algorithm that analyzes the Raman signal measured in step III or the change in that signal.
Here, the target nucleic acid detection reagent of step I forms the nucleic acid-based self-assembly complex from (a) a first nanoparticle-based structure, in which a first nucleotide that hybridizes with the target nucleic acid is linked to a first metal nanoparticle, and (b) a second nanoparticle-based structure, in which a second nucleotide complementary to the first nucleotide is linked to a second metal nanoparticle, by spontaneous molecular-level binding between the first and second nucleotides, that is, by complementary hydrogen bonding over at least 10 base pairs.
The reagent may be designed so that, when the nucleic acid-based self-assembly complex forms, (i) a nanogap is formed by two adjacent metal nanoparticles, (ii) the nanogap is a space that generates and further strengthens surface plasmon resonance (an electromagnetic effect) under light irradiation, and (iii) a Raman indicator linked to the second oligonucleotide is positioned in the nanogap, which enhances the Raman scattering signal detected under light irradiation (FIG. 9).
In addition, the first nucleotide that hybridizes with the target nucleic acid may comprise an oligonucleotide probe whose sequence hybridizes, depending on conditions, with part of the target nucleic acid sequence; a spacer that increases conformational freedom and thereby assists complementary hydrogen bonding over 10 bp or more; and an oligonucleotide attacher that attaches to the nanoparticle. The second nucleotide, which is complementary to the first nucleotide over 10 bp or more, may comprise a 2-1 oligonucleotide attacher whose sequence is complementary to the oligonucleotide probe of the first nucleotide; a spacer that increases conformational freedom and thereby assists complementary hydrogen bonding over 10 bp or more; and a 2-2 oligonucleotide attacher that attaches to the nanoparticle.
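The 10 bp complementarity requirement between the first and second nucleotides can be checked computationally when designing candidate sequences. The following is a minimal sketch, assuming plain DNA sequences written 5' to 3' and perfect Watson-Crick pairing only; the function names and the example sequences are hypothetical and are not taken from the specification.

```python
# Illustrative only: helper names and sequences are assumptions, not from the patent.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def longest_complementary_run(first: str, second: str) -> int:
    """Length of the longest contiguous stretch of `first` that can pair
    (antiparallel, perfect match) with a stretch of `second`."""
    first = first.upper()
    target = reverse_complement(second)
    best = 0
    prev = [0] * (len(target) + 1)
    # Longest-common-substring dynamic programming over first vs. target.
    for a in first:
        cur = [0] * (len(target) + 1)
        for j, b in enumerate(target, start=1):
            if a == b:
                cur[j] = prev[j - 1] + 1
                best = max(best, cur[j])
        prev = cur
    return best


def can_self_assemble(first: str, second: str, min_bp: int = 10) -> bool:
    """True if the two strands share at least `min_bp` contiguous base pairs."""
    return longest_complementary_run(first, second) >= min_bp


probe = "ACGTACGTACGG"                 # made-up first-nucleotide probe
partner = reverse_complement(probe)    # fully complementary second strand
```

A real design would also weigh melting temperature, spacer length, and secondary structure, none of which this sketch models.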
The target nucleic acid detection reagent of the present invention is characterized in that it allows a change in the Raman signal to be measured when, in the presence of the target nucleic acid, the nucleic acid-based self-assembly complex is not formed or is disassembled because of hybridization between the first nucleotide and the target nucleic acid (FIGS. 9 to 12).
Steps III and IV may be carried out using the method of the present invention for constructing and searching a Raman scattering spectrum database through machine learning (FIGS. 10 to 12).
For example, a hybridization reaction with the target nucleic acid detection reagent is carried out in a nucleic acid-containing liquid sample, and the Raman signal derived from the nucleic acid-based self-assembly complex in the liquid sample is measured before, after, and/or during the hybridization reaction. When the measured signal, that is, (a) the Raman shift value(s) and, at each shift value, (b) the Raman maximum and (c) the Raman minimum, is entered, the desired prediction information can be produced from the constructed Raman spectrum database.
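The three inputs named above, (a) Raman shift, (b) maximum, and (c) minimum at each shift, map naturally onto a small record type. The sketch below shows one way such records could be matched against stored entries. It uses a simple nearest-match search as a stand-in for the machine-learning model, which this passage does not specify; the class, function names, tolerance, labels, and all numbers are assumptions made for illustration.

```python
from dataclasses import dataclass
from math import hypot


@dataclass(frozen=True)
class PeakFeature:
    shift_cm1: float  # (a) Raman shift value
    peak_max: float   # (b) highest intensity observed at that shift
    peak_min: float   # (c) lowest intensity observed at that shift


def spectrum_distance(query, reference, shift_tol=5.0):
    """Sum of per-peak distances; a query peak with no reference peak
    within `shift_tol` cm^-1 is penalized by its own amplitude."""
    total = 0.0
    for q in query:
        nearest = min(reference,
                      key=lambda r: abs(r.shift_cm1 - q.shift_cm1),
                      default=None)
        if nearest is None or abs(nearest.shift_cm1 - q.shift_cm1) > shift_tol:
            total += q.peak_max - q.peak_min
        else:
            total += hypot(q.peak_max - nearest.peak_max,
                           q.peak_min - nearest.peak_min)
    return total


def search(query, database):
    """Return the label of the stored entry closest to the query."""
    return min(database, key=lambda label: spectrum_distance(query, database[label]))


# Made-up entries; the labels and intensities are not measured data.
EXAMPLE_DB = {
    "target absent": [PeakFeature(1360, 1000, 100), PeakFeature(1580, 800, 90)],
    "target excess": [PeakFeature(1360, 200, 100), PeakFeature(1580, 150, 90)],
}
EXAMPLE_QUERY = [PeakFeature(1362, 950, 110), PeakFeature(1579, 760, 95)]
```

Calling `search(EXAMPLE_QUERY, EXAMPLE_DB)` selects the stored entry whose peaks lie closest to the query. A trained classifier or regressor could replace `search` without changing the record layout.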
For example, the target nucleic acid may be a genome or a fragment thereof, and detecting and/or quantifying the target nucleic acid can yield various kinds of prediction information, such as the identification of viruses and/or microorganisms, the diagnosis of disease, and/or the evaluation of the efficacy of therapeutic agents.
The nucleic acid-based self-assembly complex acts as a turn-off sensor in the presence of the target nucleic acid. Specifically, the complex forms a precisely and structurally defined nanogap between two metal nanoparticles and places a Raman indicator in that nanogap, where localized surface plasmon resonance (LSPR) amplifies the Raman scattering signal. To obtain the enhanced Raman scattering signal reproducibly, formation of the nanogap is coupled inversely to the presence of the target nucleic acid to be measured, so the complex can serve as a sensor with an on/off signal scheme. The reagent forms or contains such a nucleic acid-based self-assembly complex for Raman analysis, which makes quantitative analysis of the target nucleic acid possible (FIG. 9).
Accordingly, with the target nucleic acid detection reagent according to one embodiment of the present invention, the Raman signal captured from the nucleic acid-based self-assembly complex in the liquid indicates whether the complex has formed and/or to what extent it has formed (quantification). From this, the target nucleic acid that hybridizes with the first nucleotide, and thereby prevents the complex from forming or causes it to disassemble, can be detected or quantified (FIG. 12).
For example, in a target nucleic acid detection reagent that forms or contains the nucleic acid-based self-assembly complex at a known concentration, the Raman indicator signal is at its maximum in the absence of the target nucleic acid and decreases as the amount of target nucleic acid increases. When the target nucleic acid is present in excess relative to the known concentration of the complex, the Raman signal reaches its minimum (FIG. 11). Accordingly, minimum and maximum reference points of the Raman signal can be obtained or predicted for each concentration of complex that can form in the reagent (FIG. 12).
These minimum and maximum reference points of the Raman signal, for each concentration of complex that can form in the reagent, can also be machine-learned through the method of the present invention for constructing and searching a Raman scattering spectrum database.
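The turn-off readout between the two reference points can be illustrated with a short calculation. The sketch below assumes a linear signal response between the maximum (no target) and minimum (excess target) reference points and a 1:1 stoichiometry between target and complex; neither assumption is stated in the specification, and the function names are hypothetical.

```python
def disassembled_fraction(signal: float, signal_max: float, signal_min: float) -> float:
    """Fraction of complexes lost to target hybridization, assuming the
    signal falls linearly from signal_max (no target) to signal_min
    (target in excess). The result is clipped to the range [0, 1]."""
    if signal_max <= signal_min:
        raise ValueError("signal_max must exceed signal_min")
    fraction = (signal_max - signal) / (signal_max - signal_min)
    return min(1.0, max(0.0, fraction))


def estimate_target_concentration(signal: float, signal_max: float,
                                  signal_min: float, complex_conc: float) -> float:
    """Target estimate in the same units as complex_conc, assuming one
    target molecule disassembles (or blocks) one complex."""
    return disassembled_fraction(signal, signal_max, signal_min) * complex_conc
```

Once the signal reaches the minimum reference point, the estimate saturates at the complex concentration and should be read only as a lower bound on the amount of target present.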
12. Computer-software-related invention accompanying the biotechnology invention
As computer software that derives desired information from data produced in a biological system, the present invention provides a medium for transmitting to a computer, or a computer-readable recording medium storing, a program for executing at least one of the first to ninth steps, so that the method of the present invention for constructing and searching a Raman scattering spectrum database through machine learning is performed on a computer.
In this specification, a computer is a device capable of information processing. Information processing is the computation or manipulation of information according to its intended use.
In this specification, software is a set of instructions and commands (including audio or video information) that enables commanding, input, processing, storage, output, and interaction with respect to equipment such as a computer and its peripheral devices.
In this specification, a computer program is a program installed in a computer to perform a specific function, and is a set of instructions suitable for executing the first to ninth steps on a computer.
In this specification, a data recording medium is a computer-readable medium on which structured data are recorded, the recorded data structure determining the processing performed by the computer.
In some embodiments, the method of producing the desired prediction information from the constructed Raman spectrum database, given a sample's (a) Raman shift value(s) and, at each shift value, (b) Raman maximum and (c) Raman minimum, is processed on a server or computer server (FIG. 13). In some embodiments, the server 401 includes a central processing unit (CPU, also "processor") 405, which is a single-core processor, a multi-core processor, or a plurality of processors for parallel processing. In some embodiments, the processor used as part of the control assembly is a microprocessor. In some embodiments, the server 401 also includes memory 410 (for example, random access memory, read-only memory, or flash memory); an electronic storage unit 415 (for example, a hard disk); a communication interface 420 (for example, a network adapter) for communicating with one or more other systems; and peripheral devices 425, including cache, other memory, data storage, and/or electronic display adapters. The memory 410, storage unit 415, interface 420, and peripheral devices 425 communicate with the processor 405 through a communication bus (solid lines), such as a motherboard. In some embodiments, the storage unit 415 is a data storage unit for storing data. The server 401 is operatively coupled to a computer network ("network") 430 with the aid of the communication interface 420. In some embodiments, a processor, with the aid of additional hardware, is also operatively coupled to the network. In some embodiments, the network 430 is the Internet, an intranet and/or extranet, or an intranet and/or extranet that communicates with the Internet, a telecommunications network, or a data network.
In some embodiments, the network 430, with the aid of the server 401, implements a peer-to-peer network, which enables devices coupled to the server 401 to act as a client or a server. In some embodiments, the server can transmit and receive computer-readable instructions (for example, device or system operating protocols or parameters) or data (for example, sensor measurements, raw data obtained from the detection of metabolites, analysis of such raw data, interpretation of such raw data, and the like) as electronic signals carried over the network 430. Moreover, in some embodiments, the network is used, for example, to send or receive data across international borders.
In some embodiments, the server 401 communicates with one or more output devices 435, such as a display or printer, and/or with one or more input devices 440, such as a keyboard, mouse, or joystick. In some embodiments, the display is a touch-screen display, in which case it functions as both a display device and an input device. In some embodiments, different and/or additional input devices are present, such as an enunciator, a speaker, or a microphone. In some embodiments, the server uses any one of a variety of operating systems, for example any one of several versions of Windows®, MacOS®, Unix®, or Linux®.
In some embodiments, the storage unit 415 stores files or data related to the operation of the devices, systems, or methods described herein.
In some embodiments, the server communicates with one or more remote computer systems through the network 430. In some embodiments, the one or more remote computer systems include, for example, personal computers, laptops, tablets, telephones, smartphones, or personal digital assistants.
In some embodiments, the control assembly includes a single server 401. In other cases, the system includes multiple servers that communicate with one another through an intranet, an extranet, and/or the Internet.
In some embodiments, the server 401 is adapted to store device operating parameters, protocols, the methods described herein, and other potentially relevant information. In some embodiments, such information is stored on the storage unit 415 or the server 401, and such data are transmitted over a network.
Ordinary communication networks, communication lines, and the like can transmit given information such as programs and data.
Non-limiting examples of computer-readable recording media include hard disks, floppy disks, magnetic recording media, and optical recording media. Non-limiting examples of transmission media include transmission (communication) media, carrier waves, and transmission (communication) mechanisms.
13. Light source and Raman detection device
Laser light is single-wavelength, in-phase light. A laser beam is generally narrow and diverges little. Lasers are widely used in spectroscopy because of their precisely defined monochromatic wavelength.
A fundamental drawback of Raman spectroscopy is its weak signal, so it is preferable to use as the light source a laser that can provide high-power incident light, that is, a high density of photons. For the same reason, the detector preferably includes a photomultiplier tube (PMT), an avalanche photodiode (APD), a charge-coupled device (CCD), or the like, which can effectively amplify the detection signal.
In the present invention, (i) the surface-enhanced Raman effect produced by the metal nanoparticles through localized surface plasmon resonance (LSPR), (ii) the degree to which the Raman scattering signal intensity of the Raman indicator is further amplified by the nanogap, and/or (iii) the Raman shift value of the Raman indicator may vary with the wavelength of the incident laser light used in the Raman analysis.
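Because a Raman shift is reported relative to the excitation line, it helps to see how a shift in wavenumbers relates to absolute wavelengths for a given laser. The sketch below uses the standard definition of Raman shift (difference of reciprocal wavelengths), which is general spectroscopy background rather than something stated in this specification; the function names are hypothetical, and 785 nm is an illustrative excitation wavelength that does not appear in the source.

```python
def raman_shift_cm1(excitation_nm: float, scattered_nm: float) -> float:
    """Raman shift in cm^-1 from excitation and scattered wavelengths in nm.
    The factor 1e7 converts reciprocal nanometers to reciprocal centimeters."""
    return 1e7 * (1.0 / excitation_nm - 1.0 / scattered_nm)


def scattered_wavelength_nm(excitation_nm: float, shift_cm1: float) -> float:
    """Absolute wavelength (nm) at which a Stokes line with the given
    shift appears for the given excitation wavelength."""
    wavenumber = 1e7 / excitation_nm - shift_cm1
    if wavenumber <= 0:
        raise ValueError("shift is too large for this excitation wavelength")
    return 1e7 / wavenumber


# The same 1000 cm^-1 shift falls at different absolute wavelengths
# depending on the laser line used.
line_532 = scattered_wavelength_nm(532.0, 1000.0)
line_785 = scattered_wavelength_nm(785.0, 1000.0)
```

With 532 nm excitation the 1000 cm^-1 line falls near 562 nm, while with 785 nm excitation it falls near 852 nm, so the detector range must be chosen to match the laser.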
The Raman scattering signal may be acquired by any known Raman spectroscopy method, preferably surface-enhanced Raman scattering (SERS), surface-enhanced resonance Raman spectroscopy (SERRS), hyper-Raman spectroscopy, and/or coherent anti-Stokes Raman spectroscopy (CARS).
Any suitable form or configuration of Raman spectroscopy or related technique known in the art may be used for analyte detection, including but not limited to normal Raman scattering, resonance Raman scattering, surface-enhanced Raman scattering, surface-enhanced resonance Raman scattering, coherent anti-Stokes Raman spectroscopy (CARS), stimulated Raman scattering, inverse Raman spectroscopy, stimulated gain Raman spectroscopy, hyper-Raman scattering, molecular optical laser examiner (MOLE), Raman microprobe, Raman microscopy, confocal Raman microspectrometry, three-dimensional or scanning Raman, Raman saturation spectroscopy, time-resolved resonance Raman, Raman dissociation spectroscopy, and UV-Raman microscopy.
In the present invention, the Raman detection device may include a computer. The embodiments place no limitation on the type of computer used. An exemplary computer may include a bus for exchanging information and a processor for processing information. The computer may further include RAM or another dynamic storage device, ROM or another static storage device, and a data storage device such as a magnetic disk or optical disk with its corresponding drive. The computer may also include peripheral devices known in the art, such as a display device (for example, a cathode ray tube or liquid crystal display), an alphanumeric input device (for example, a keyboard), a cursor control device (for example, a mouse, trackball, or cursor direction keys), and a communication device (for example, a modem, a network interface card, or an interface device used to connect to an Ethernet, token ring, or other type of network).
In the present invention, the Raman detection device may be operatively coupled to a computer. Data from the detection device may be processed by the processor and stored in main memory. Data on the emission profiles of standard analytes may also be stored in main memory or ROM. The processor may compare emission spectra from analytes on the Raman-active substrate to identify the type of analyte in the sample. The processor may analyze data from the detection device to determine the identity and/or concentration of various analytes. Differently equipped computers may be used for particular implementations, so the structure of the system may differ between embodiments of the present invention. After data collection, the data are typically passed to a data analysis operation. To facilitate the analysis, the data obtained by the detection device are typically analyzed using a digital computer as described above. Typically, the computer is suitably programmed to receive and store data from the detection device and to analyze and report the collected data.
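The comparison of a measured spectrum against stored standard profiles can be sketched as a similarity search. The example below uses cosine similarity over intensity vectors sampled on a common shift axis; this metric is chosen only for illustration, since the specification does not name a comparison method. The function names are hypothetical, and the "Cy3" and "Cy5" vectors are made-up numbers, not real spectra of those dyes.

```python
from math import sqrt


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length intensity vectors."""
    if len(a) != len(b):
        raise ValueError("spectra must be sampled on the same shift axis")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def identify(measured, standards):
    """Return (name, score) of the stored standard most similar to `measured`."""
    scores = {name: cosine_similarity(measured, ref)
              for name, ref in standards.items()}
    best = max(scores, key=scores.get)
    return best, scores[best]


# Made-up intensity vectors used only to exercise the functions.
STANDARDS = {
    "Cy3": [0.0, 1.0, 5.0, 1.0, 0.0, 0.0],
    "Cy5": [0.0, 0.0, 1.0, 0.0, 4.0, 1.0],
}
MEASURED = [0.0, 1.1, 4.8, 0.9, 0.2, 0.0]
```

Cosine similarity ignores overall intensity scale, which suits identification; estimating concentration would additionally require the absolute intensities.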
A non-limiting example of a Raman detection device is disclosed in U.S. Pat. No. 6,002,471. The excitation beam is generated by a frequency-doubled Nd:YAG laser at a wavelength of 532 nm or a frequency-doubled Ti:sapphire laser at a wavelength of 365 nm. A pulsed or continuous laser beam may be used.
Another example of a detection device is disclosed in U.S. Pat. No. 5,306,403: a Spex Model 1403 double-grating spectrometer equipped with a gallium-arsenide photomultiplier tube (RCA Model C31034 or Burle Industries Model C3103402) operated in single-photon counting mode. The excitation sources include the 514.5 nm line of an argon-ion laser from SpectraPhysics, Model 166, and the 647.1 nm line of a krypton-ion laser (Innova 70, Coherent).
Other excitation sources include a nitrogen laser at 337 nm (Laser Science Inc.) and a helium-cadmium laser at 325 nm (Liconox) (U.S. Pat. No. 6,174,677), light-emitting diodes, Nd:YLF lasers, and/or various ion lasers and/or dye lasers. The excitation beam may be spectrally purified with a bandpass filter (Corion) and focused on the Raman-active substrate using a 6X objective lens (Newport, Model L6X).
이하, 본 발명을 실시예를 통하여 보다 구체적으로 설명한다. 다만, 하기 실시예는 본 발명의 기술적 특징을 명확하게 예시하기 위한 것일 뿐 본 발명의 보호범위를 한정하는 것은 아니다.Hereinafter, the present invention will be described in more detail through examples. However, the following examples are only for clearly illustrating the technical features of the present invention, and do not limit the protection scope of the present invention.
Example 1: Preparation of a target nucleic acid detection reagent containing a nucleic acid-based self-assembled complex
As illustrated in FIG. 9, the target nucleic acid detection reagent prepared in Example 1 contains a nucleic acid-based self-assembled complex (NEW structure). The NEW structure self-assembles in a water-based solvent, through complementary hydrogen bonding between a first nucleotide and a second nucleotide, from (a) a first nanoparticle-based structure in which the first nucleotide, which hybridizes with the target nucleic acid, is linked to a spherical gold nanoparticle 20 to 30 nm in diameter and (b) a second nanoparticle-based structure in which the second nucleotide, which is complementary to the first nucleotide, is linked to a spherical gold nanoparticle 20 to 30 nm in diameter.
Here, the target nucleic acid was synthesized with a base sequence (12-mer to 30-mer) that identifies the genome of the currently circulating coronavirus.
The first nucleotide, which hybridizes with the target nucleic acid, was prepared by sequentially linking an oligonucleotide probe having a base sequence that hybridizes with the target nucleic acid, a C3 spacer of Formula 1 below, which increases conformational freedom and thereby facilitates complementary hydrogen bonding over 10 bp or more, and an oligonucleotide attacher (poly-adenine 10-mer) that binds to the nanoparticle.
Likewise, the second nucleotide, which is complementary to the first nucleotide, was prepared by sequentially linking a 2-1 oligonucleotide attacher having a base sequence complementary over 20 to 50 bp to the oligonucleotide probe of the first nucleotide, a C3 spacer of Formula 1 below, which increases conformational freedom and thereby facilitates complementary hydrogen bonding over 10 bp or more, and a 2-2 oligonucleotide attacher (poly-adenine 10-mer) that binds to the nanoparticle. The Raman indicator, Cy3, is located in the second nucleotide between the C3 spacer and the oligonucleotide attacher (poly-adenine 10-mer) that binds to the nanoparticle.
However, so that the target nucleic acid prevails in the competition for hybridization with the first nucleotide, the sequence of the 2-1 oligonucleotide attacher is shorter than the sequence of the synthesized target nucleic acid (FIG. 9).
[Formula 1]
Figure PCTKR2021020362-appb-I000004
To a mixed solution of gold nanoparticles and the first nucleotide, modified at one end with an -SH group, 100 mM phosphate buffer and 2 M NaCl were added sequentially and allowed to react at room temperature to synthesize the first nanoparticle-based structure. In the same way, 100 mM phosphate buffer and 2 M NaCl were added sequentially to a mixed solution of gold nanoparticles and the second nucleotide, modified at one end with an -SH group, and allowed to react at room temperature to synthesize the second nanoparticle-based structure.
Next, the aqueous solution containing the first nanoparticle-based structure and the aqueous solution containing the second nanoparticle-based structure were mixed to prepare a target nucleic acid detection reagent containing the nucleic acid-based self-assembled complex (NEW structure) formed through complementary hydrogen bonding between the first nucleotide and the second nucleotide.
Using an inverted Raman detection device built in-house, the Raman signal of the NEW structure, that is, the Raman signal of the Raman indicator Cy3 linked to the second nucleotide, was measured in the NEW structure-containing target nucleic acid detection reagent in phosphate buffer prepared in Example 1. The results are shown in FIG. 10.
The scattered Raman spectra were recorded as a single acquisition with a 1 s accumulation at 400 μW over the range 500 to 2000 cm⁻¹. Despite the low intensity, the characteristic Cy3 peaks appeared at 1470 and 1580 cm⁻¹, which are its fingerprint bands under 532 nm incident laser light.
Surprisingly, the NEW structure in the target nucleic acid detection reagent provided a stably enhanced Raman scattering signal, reproducible within a fixed range, even over 100 consecutive measurements taken once per second. The signal was stable, but the measurements were not perfectly identical to one another.
Example 2: Measurement of the Raman signal after hybridization with the target nucleic acid
In the nucleic acid-based self-assembled complex (NEW structure) prepared in Example 1, a Raman indicator with known Raman shift values is linked to the second nucleotide, which competes with the target nucleic acid for hybridization with the first nucleotide. Because the complex acts as a turn-off sensor in the presence of the target nucleic acid, the formation, number, and concentration of the nucleic acid-based self-assembled complexes can be confirmed (quantified) from the Raman signal of the Raman indicator (FIGS. 9 and 12).
The synthesized target nucleic acid (12-mer to 30-mer) was added at known concentrations to the target nucleic acid detection reagent containing the nucleic acid-based self-assembled complex (NEW structure) prepared in Example 1 (FIG. 12).
The reagent was heated to 72.0 °C, the temperature (Tm) at which the nucleic acid-based self-assembled complex (NEW structure) dissociates, that is, the temperature at which the complementary hydrogen bonds between the first nucleotide and the second nucleotide are broken. The complex is the one formed, through complementary hydrogen bonding between the first and second nucleotides, from (a) the first nanoparticle-based structure, in which the first nucleotide that hybridizes with the target nucleic acid is linked to a spherical gold nanoparticle, and (b) the second nanoparticle-based structure, in which the second nucleotide complementary to the first nucleotide is linked to a spherical gold nanoparticle. The temperature was then lowered by about 5 °C to the temperature (Ta) at which the synthesized target nucleic acid hybridizes with the first nucleotide.
The Raman signal of the Raman indicator linked to the second nucleotide was measured with the in-house inverted Raman detection device in the same manner as in Example 1.
In FIG. 11, the left graph is the Raman spectrum of the target nucleic acid detection reagent without target nucleic acid, measured after heating to 72.0 °C and then lowering the temperature by about 5 °C. The right graph is the Raman spectrum measured when the synthesized target nucleic acid was added in excess relative to the number of first nucleotides in the detection reagent.
The number of nucleic acid-based self-assembled complexes (NEW structures) formed according to Example 1, by spontaneous molecular-level hydrogen bonding between the first nucleotide (the probe that hybridizes with the target nucleic acid) and the complementary second nucleotide, is inversely related to the number of target nucleic acid molecules (FIG. 9). That is, (i) when the target nucleic acid is absent from the sample, or present at or below the lower limit of the detection range of the detection reagent, the number of complexes formed by self-assembly of the complementary first and second nucleotides is at its maximum; and (ii) when the target nucleic acid is present at or above the upper limit of the detection range of the detection reagent, the number of complexes formed is at its minimum (FIG. 11).
Accordingly, because the nucleic acid-based self-assembled complex (NEW structure) in the detection reagent of Example 1 acts as a turn-off sensor in the presence of the target nucleic acid, a detection reagent containing the complex at a known concentration gives its maximum Raman signal when the target nucleic acid is absent. The Raman signal falls as the amount of target nucleic acid increases and reaches its minimum when the target nucleic acid is in excess relative to the known concentration of the complex. The reference points (Min, Max) of the optical signal can therefore be obtained or predicted for each concentration of the complex in the detection reagent (FIG. 11).
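The Min and Max reference points described above suggest a simple normalization of a measured intensity. The following is a minimal sketch, assuming a linear scale between the two references; the function name `turn_off_fraction` and the clamping behavior are illustrative assumptions, not part of the disclosed method.

```python
def turn_off_fraction(intensity, i_max, i_min):
    """Map a measured Raman intensity onto the turn-off scale.

    Returns 0.0 at the no-target reference (Max) and 1.0 at the
    excess-target reference (Min). Linear scaling is an assumption.
    """
    if i_max <= i_min:
        raise ValueError("i_max must be greater than i_min")
    fraction = (i_max - intensity) / (i_max - i_min)
    # Clamp so that readings outside the reference range stay in [0, 1].
    return min(1.0, max(0.0, fraction))
```

A value near 0 would indicate that most complexes remain assembled, and a value near 1 that most have been displaced by the target nucleic acid.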
FIG. 12 shows the Raman spectra measured when the synthesized target nucleic acid (12-mer to 30-mer) was added at known concentrations (0 M, 10⁻¹⁶ M, 10⁻¹² M) to the target nucleic acid detection reagent containing the nucleic acid-based self-assembled complex (NEW structure) prepared in Example 1.
Surprisingly, in the target nucleic acid detection reagent that forms the nucleic acid-based self-assembled complex (NEW structure), the Raman scattering signal under light irradiation decreased in a consistent pattern, inversely to the target nucleic acid concentration, both without target nucleic acid and with target nucleic acid at known concentrations (FIG. 12). The number of complexes formed, and therefore the number of nanogaps and the intensity of the enhanced Raman scattering signal they produce, is a function of the number of target nucleic acid molecules. It can thus be inferred that the detection reagent can serve as an on/off sensor, and that the target nucleic acid can also be quantified by a computer algorithm from the intensity of the Raman scattering signal measured under light irradiation.
In short, with the target nucleic acid detection reagent of Example 1, when the nucleic acid-based self-assembled complex fails to form or dissociates because the first nucleotide hybridizes with the target nucleic acid, the resulting change in the Raman signal of the Raman indicator (a decrease in intensity) can be measured, and the target nucleic acid can also be quantified (FIGS. 9 to 12).
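One way such a quantification algorithm could work is interpolation on a calibration curve measured at known target concentrations. The sketch below is only an illustration under stated assumptions: the function name `estimate_log10_conc`, the piecewise-linear interpolation in log10 concentration, and the requirement that calibration intensities be distinct are all assumptions, and any calibration values supplied to it would have to come from actual measurements.

```python
def estimate_log10_conc(intensity, calibration):
    """Estimate log10(target concentration) from a measured intensity.

    calibration: list of (log10_concentration, intensity) pairs measured at
    known concentrations, with distinct intensities. For a turn-off sensor
    the intensity falls as the concentration rises.
    """
    points = sorted(calibration, key=lambda p: p[1])  # ascending intensity
    # Outside the calibrated range, report the nearest calibrated point.
    if intensity <= points[0][1]:
        return points[0][0]
    if intensity >= points[-1][1]:
        return points[-1][0]
    # Linear interpolation between the two bracketing calibration points.
    for (c_lo, i_lo), (c_hi, i_hi) in zip(points, points[1:]):
        if i_lo <= intensity <= i_hi:
            t = (intensity - i_lo) / (i_hi - i_lo)
            return c_lo + t * (c_hi - c_lo)
```

The function reports the logarithm of the concentration because the concentrations used in Example 2 span several orders of magnitude.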

Claims (23)

  1. A method for constructing and searching a Raman scattering spectrum database through machine learning, the method comprising:
    a first step of generating a Raman spectrum of each sample by deriving, from each sample, (a) one or more Raman shift values and, at each shift value, (b) a Raman maximum, which is the relatively highest Raman intensity on the vertical axis, and (c) a Raman minimum, which is the relatively lowest Raman intensity on the vertical axis;
    a second step of generating learned (a') Raman shift values and, at each shift value, a learned (b') Raman maximum and (c') Raman minimum, according to a machine learning algorithm that performs
    a step 2-1 of interval learning of the Raman shift values (a) of the first step,
    a step 2-2 of cluster learning of the Raman maxima (b) of the first step, and
    a step 2-3 of cluster learning of the Raman minima (c) of the first step;
    a third step of inferring by machine learning, based on the (a') Raman shift values and, at each shift value, the (b') Raman maximum and (c') Raman minimum generated in the second step: (d) sensitivity, defined as the proportion of spectra, among the given repetitions, whose signal-to-noise ratio is 50% or more; (e) stability, defined as the proportion of spectra whose values fall within one standard deviation (δ) of the mean (μ) of the normal distribution, that is, within the range (μ-δ, μ+δ); and (f) repeatability, whereby, based on the number of measurements within a given inspection time, a spectrum having a stability of 50% or more and a sensitivity of 50% or more as defined above is defined as repeatable;
    a fourth step of calculating a fractional bandwidth;
    a fifth step of calculating spectral selectivity by selecting, as the selected values within the corresponding shift, spectra that are repeatable as defined in the third step and have a stability of 80% or more and a sensitivity of 90% or more;
    a sixth step of constructing a Raman spectrum database by entering the (a) Raman shift values and, at each shift value, the (b) Raman maximum and (c) Raman minimum of the first step; the (a') Raman shift values and, at each shift value, the (b') Raman maximum and (c') Raman minimum generated by machine learning in the second step; the (d) sensitivity, (e) stability, and (f) repeatability inferred by machine learning in the third step; and the spectral selectivity calculated in the fifth step; and
    optionally, a seventh step of calculating desired prediction information from the Raman spectrum database constructed in the sixth step when the (a) Raman shift values and, at each shift value, the (b) Raman maximum and (c) Raman minimum of a sample from the first step are entered.
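The definitions of sensitivity, stability, and repeatability in the third step, and the selection thresholds of the fifth step, can be illustrated by computing them directly for one shift value. This is a minimal sketch and not the claimed machine-learning inference: it assumes the signal-to-noise ratio of each repeated measurement is already available as a fraction between 0 and 1, and the function names are hypothetical.

```python
from statistics import mean, pstdev

def sensitivity(snr_values, threshold=0.5):
    """Proportion of repeated spectra whose SNR is at or above the threshold."""
    return sum(1 for s in snr_values if s >= threshold) / len(snr_values)

def stability(intensities):
    """Proportion of intensities lying inside the open range (mu - delta, mu + delta).

    Note: if all intensities are identical, delta is 0 and this returns 0.0.
    """
    mu = mean(intensities)
    delta = pstdev(intensities)
    inside = sum(1 for x in intensities if mu - delta < x < mu + delta)
    return inside / len(intensities)

def is_repeatable(sens, stab):
    """Third-step rule: stability and sensitivity both at least 50%."""
    return sens >= 0.5 and stab >= 0.5

def is_selected(sens, stab):
    """Fifth-step rule: repeatable, stability >= 80%, sensitivity >= 90%."""
    return is_repeatable(sens, stab) and stab >= 0.8 and sens >= 0.9
```

In this sketch, a shift value whose repeated measurements pass `is_selected` would be a candidate for the selected values used in the selectivity calculation.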
  2. The method of claim 1, wherein the sample for which the Raman spectrum database is constructed is a liquid sample.
  3. The method of claim 2, wherein the detection indicator from which the Raman shift values (a) are derived is linked to nanoparticles dispersed in the liquid sample, and
    wherein, to measure the Raman shift values (a), surface-analysis Raman spectroscopy using localized surface plasmon resonance of the nanoparticles is performed.
  4. The method of claim 2, wherein the subject of the Raman spectrum database is a Raman scattering signal derived from a nucleic acid-based self-assembled complex undergoing Brownian motion in a liquid.
  5. The method of claim 4, wherein the detection indicator from which the Raman shift values (a) are derived is linked to the nucleic acid-based self-assembled complex undergoing Brownian motion in the liquid.
  6. The method of claim 1, wherein the prediction information calculated in the seventh step is
    one or more output values generated through a specific function (hidden layer) upon input of one or more values selected from the group consisting of: the (a') Raman shift values and, at each shift value, the (b') Raman maximum and (c') Raman minimum generated by learning in the second step; the (d) sensitivity, (e) stability, and (f) repeatability inferred by machine learning in the third step; and the spectral selectivity calculated in the fifth step.
  7. The method of claim 6, wherein the prediction information output through the function is state information of the animal or cell from which the sample used to derive the (a) Raman shift values in the first step was taken, disease diagnosis and/or evaluation of the effect of a therapeutic agent, and/or bacterial infection information of the sample.
  8. The method of claim 6, wherein the prediction information output through the function is the presence and/or concentration of a specific biological substance, or the identification and/or concentration of a cell-derived chemical bond, constituent, and/or cell type.
  9. The method of claim 1, wherein a baseline is set to the mean of the per-shift standard deviations of the signal obtained experimentally from a reference material (negative control), and all Raman spectra are derived after baseline correction.
  10. The method of claim 1, wherein the Raman shift value (Δν) is derived through Equation 1 or Equation 2 below.
    [Equation 1]
    Figure PCTKR2021020362-appb-I000005
    [Equation 2]
    Figure PCTKR2021020362-appb-I000006
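Equations 1 and 2 are given only as figures and are not reproduced here. As an illustration, the sketch below uses the standard textbook definition of the Raman shift in wavenumbers, Δν = 1/λ₀ − 1/λ₁, with wavelengths in nanometres and a factor of 10⁷ to convert to cm⁻¹. Whether this matches the claimed equations term for term is an assumption, and the function names are hypothetical.

```python
NM_PER_CM = 1e7  # 1 cm = 10^7 nm

def raman_shift_cm1(excitation_nm, scattered_nm):
    """Standard Raman shift in cm^-1 from excitation and scattered wavelengths in nm."""
    return NM_PER_CM * (1.0 / excitation_nm - 1.0 / scattered_nm)

def scattered_wavelength_nm(excitation_nm, shift_cm1):
    """Inverse relation: scattered wavelength in nm for a given Raman shift in cm^-1."""
    return 1.0 / (1.0 / excitation_nm - shift_cm1 / NM_PER_CM)
```

For the 532 nm excitation used in Example 1, `scattered_wavelength_nm(532.0, 1580.0)` gives the wavelength at which the 1580 cm⁻¹ Cy3 band would be observed.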
  11. The method of claim 1, wherein, in the first step, the one or more Raman shift values are derived in the range of 400 cm⁻¹ to 3200 cm⁻¹.
  12. The method of claim 1, wherein the first step obtains the minimum and the maximum at each Raman shift value while moving in increments of 1 to 5 shifts across the Raman shift range of the inspection equipment.
  13. The method of claim 1, wherein the Raman maximum is the relatively highest intensity within the Raman shift range, and the Raman minimum is the relatively lowest intensity within the Raman shift range.
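The windowed search for maxima and minima described in claims 12 and 13 can be sketched as follows. This is a minimal sketch under one possible reading, in which "moving by 1 to 5 shifts" means stepping through the sampled spectrum in blocks of 1 to 5 consecutive points; the function name `window_extrema` and the output layout are hypothetical.

```python
def window_extrema(shifts, intensities, step=5):
    """Return the maximum and minimum intensity in consecutive windows.

    shifts: sampled Raman shift values (cm^-1), in ascending order.
    intensities: Raman intensities at those shifts.
    step: number of sampled points per window (assumed reading of "1 to 5 shifts").
    """
    if len(shifts) != len(intensities):
        raise ValueError("shifts and intensities must have the same length")
    if step < 1:
        raise ValueError("step must be at least 1")
    results = []
    for start in range(0, len(shifts), step):
        window = intensities[start:start + step]
        end = min(start + step, len(shifts)) - 1
        results.append({
            "shift_start": shifts[start],
            "shift_end": shifts[end],
            "maximum": max(window),
            "minimum": min(window),
        })
    return results
```

Each returned entry pairs a shift interval with its relative maximum and minimum, which is the kind of record the first step feeds into the later learning steps.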
  14. The method of claim 1, wherein the sixth step of constructing the Raman spectrum database stores the following items (i) to (iv):
    (i) the code assigned to the substance;
    (ii) all selected Raman shift values for the substance;
    (iii) the relative intensity values for all selected shift values;
    (iv) the baseline difference value of the negative control used for the corresponding shift.
  15. The method of claim 1, wherein the sixth step of constructing the Raman spectrum database constructs a separate Raman spectrum database dedicated to liquid samples, and comprises:
    optionally, an eighth step of storing the spectra separately, in the case of a liquid sample, through spectral intensity indexing; and
    optionally, a ninth step of filtering to reduce noise spectral intensity, in the case of a liquid sample, using spectral pattern matching.
  16. The method of claim 15, wherein the ninth step of filtering to reduce noise spectral intensity using spectral pattern matching, in the case of a liquid sample, judges the spectral pattern match by the rate of agreement with the Raman shift values selected for the substance.
  17. The method of claim 15, wherein judging the spectral pattern match by the rate of agreement with the Raman shift values selected for the substance uses at least one of the following methods (i) to (iv):
    (i) noise is defined as a signal-to-noise ratio of 50% or less at each shift;
    (ii) the minimum and maximum of the acquired Raman shift values are aligned with the previously acquired reference minimum and maximum for the substance, and the scale is adjusted accordingly;
    (iii) a value within 1% on either side of each selected Raman shift value is judged to agree;
    (iv) a match is declared when the rate of agreement over all selected spectra is 95% or more.
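Rules (ii) to (iv) of claim 17 can be sketched as a short matching routine. This is an illustrative reading, not the claimed implementation: it assumes rule (ii) is a linear rescaling of the acquired shift values onto the reference range and that rule (iii) uses a relative tolerance of 1% of each selected shift value. All function names are hypothetical.

```python
def rescale(values, ref_min, ref_max):
    """Rule (ii): linearly map acquired shift values onto the reference range."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("acquired values must span a non-zero range")
    scale = (ref_max - ref_min) / (hi - lo)
    return [ref_min + (v - lo) * scale for v in values]

def match_rate(observed, selected, tolerance=0.01):
    """Rule (iii): a selected shift agrees if an observed value lies within 1% of it."""
    hits = sum(
        1 for s in selected
        if any(abs(o - s) <= tolerance * s for o in observed)
    )
    return hits / len(selected)

def is_match(observed, selected, threshold=0.95):
    """Rule (iv): declare a match when the agreement rate is at least 95%."""
    return match_rate(observed, selected) >= threshold
```

With only a few selected shift values per substance, the 95% threshold in effect requires every selected shift to be found.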
  18. 제1항에 있어서, 하기 단계들을 포함하는, 액체내 핵산 기반 자가조립 복합체 유래 라만 신호를 이용하여 타겟 핵산을 검출하는 방법에 사용하는 것이 특징인 기계학습을 통한 라만 산란 스펙트럼 데이터베이스 구축 및 검색 방법:The method of claim 1, wherein the Raman scattering spectrum database construction and search method through machine learning is characterized in that it is used in a method for detecting a target nucleic acid using a Raman signal derived from a nucleic acid-based self-assembly complex in a liquid, comprising the following steps:
    (a) 조건에 따라 타겟 핵산과 교합하는 제1 뉴클레오티드 하나 이상이 제1금속나노입자에 연결된 제1나노입자 기반 구조체 및 (b) 제1 뉴클레오티드와 10 염기쌍(bp) 이상 상보적인 제2 뉴클레오티드 하나 이상이 제2금속나노입자에 연결된 제2나노입자 기반 구조체 사이의 자가조립에 의해 형성되는 핵산 기반 자가조립 복합체가, 타겟 핵산 존재시 제1 뉴클레오티드와 타겟 핵산과의 교합(hybridization)에 의해 형성되지 않거나 해체되는 경우 라만 신호의 변화값을 측정할 수 있게 설계되어 있는 타겟 핵산 검출 시약을 준비하는 제I단계; (a) a first nanoparticle-based structure linked to a first metal nanoparticle at least one first nucleotide that mates with a target nucleic acid according to conditions, and (b) a second nucleotide complementary to the first nucleotide by at least 10 base pairs (bp) The nucleic acid-based self-assembly complex formed by self-assembly between the second nanoparticle-based structure linked to the second metal nanoparticle is not formed by hybridization between the first nucleotide and the target nucleic acid in the presence of the target nucleic acid. A first step of preparing a target nucleic acid detection reagent designed to measure the change value of the Raman signal when it is not or disassembled;
    핵산 함유 액상 시료에서 제I단계의 (a) 제1 뉴클레오티드가 연결된 제1나노입자 기반 구조체 및 (b) 제2 뉴클레오티드가 연결된 제2나노입자 기반 구조체 함유 타겟 핵산 검출 시약과의 교합 반응(hybridization)을 수행하는 제II단계; Hybridization reaction (hybridization) with a target nucleic acid detection reagent containing (a) a first nanoparticle-based construct linked with a first nucleotide and (b) a second nanoparticle-based construct linked with a second nucleotide in Step I in a nucleic acid-containing liquid sample A second step of performing;
    제II단계의 교합 반응 전, 후 및/또는 동시에 액상 시료내 핵산 기반 자가조립 복합체 유래 라만 신호를 측정하는 제III단계; 및 a third step of measuring a Raman signal derived from a nucleic acid-based self-assembly complex in a liquid sample before, after and/or simultaneously with the occlusion reaction of the second step; and
    제III단계에서 측정되는 라만 신호 또는 이의 변화값을 분석하는 알고리즘을 통해 시료 내 타겟 핵산의 검출 및/또는 정량 데이터를 제공하는 제IV단계.A step IV of providing detection and/or quantitative data of a target nucleic acid in a sample through an algorithm for analyzing the Raman signal measured in step III or a change value thereof.
  19. The method of claim 18, wherein the target nucleic acid detection reagent of Step I forms the nucleic acid-based self-assembled complex, through complementary hydrogen bonding between the first nucleotide and the second nucleotide, from (a) the first nanoparticle-based structure in which the first nucleotide that hybridizes with the target nucleic acid is linked to the first metal nanoparticle and (b) the second nanoparticle-based structure in which the second nucleotide complementary to the first nucleotide is linked to the second metal nanoparticle,
    and wherein the reagent is designed such that, when the nucleic acid-based self-assembled complex forms, (i) a nanogap is formed by two adjacent metal nanoparticles, (ii) the nanogap is a space that generates and further enhances surface plasmon resonance upon light irradiation, and (iii) a Raman indicator linked to the second oligonucleotide is positioned in the nanogap, thereby enhancing the Raman scattering signal detected upon light irradiation.
  20. The method of claim 18, wherein a hybridization reaction with the target nucleic acid detection reagent is performed in a nucleic acid-containing liquid sample, and, when (a) the Raman shift value(s) and, at each shift value, (b) the Raman peak and (c) the Raman trough derived from the nucleic acid-based self-assembled complex in the liquid sample are measured before, after, and/or during the hybridization reaction and entered, the desired prediction information is calculated from the constructed Raman spectrum database.
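The input described in claim 20 can be sketched as follows. The sketch assumes that repeated intensity readings are available at each Raman shift and takes the peak and trough to be the highest and lowest reading at that shift. The function name and the dictionary layout are hypothetical.

```python
from collections import defaultdict


def build_spectrum_list(readings):
    """Group (shift, intensity) readings by shift and keep the peak and trough.

    readings: iterable of (raman_shift, intensity) pairs from repeated scans.
    Returns one record per shift, ordered by shift.
    """
    by_shift = defaultdict(list)
    for shift, intensity in readings:
        by_shift[shift].append(intensity)
    return [
        {"shift": shift, "peak": max(values), "trough": min(values)}
        for shift, values in sorted(by_shift.items())
    ]
```

Each record in the returned list carries the three quantities, (a) shift, (b) peak, and (c) trough, that the claim names as the query input.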
  21. A medium for transmitting to a computer, or a computer-readable recording medium storing, a program for executing at least one of the first to ninth steps so that the method for constructing and searching a Raman scattering spectrum database through machine learning according to any one of claims 1 to 20 is performed on a computer.
  22. An apparatus for calculating desired biometric prediction information from a Raman spectrum database, the apparatus comprising:
    an information receiving unit (A) that collects, from a biologically derived liquid sample, (a) one or more Raman shift values and, at each shift value, (b) a Raman peak, which is the relatively highest Raman intensity on the vertical axis, and (c) a Raman trough, which is the relatively lowest Raman intensity on the vertical axis, as the Raman spectrum list of the sample;
    a Raman spectrum database (B) constructed by:
      feeding the information in the sample's Raman spectrum list to (a-1) an algorithm that performs interval learning on the Raman shift values, (b-1) an algorithm that performs cluster learning on the Raman peaks, and (c-1) an algorithm that performs cluster learning on the Raman troughs;
      inferring by machine learning, based on (a') the Raman shift values and, at each shift value, (b') the Raman peak and (c') the Raman trough generated by the machine learning algorithms, (d) sensitivity, defined as the proportion of spectra whose signal-to-noise ratio is 50% or more within a given repeatability, (e) stability, defined as the proportion of spectra whose values fall within the standard deviation (δ) range (μ-δ, μ+δ) around the normal-distribution mean (μ), and (f) repeatability, defined such that, based on the number of measurements within a given test time, a spectrum with a stability of 50% or more and a sensitivity of 50% or more as defined above is repeatable;
      selecting, as the selected value within the corresponding shift, a spectrum that is repeatable and has a stability of 80% or more and a sensitivity of 90% or more, and calculating the spectral selectivity; and
      entering into the database the information in the sample's Raman spectrum list, namely (a) the Raman shift values and, at each shift value, (b) the Raman peak and (c) the Raman trough; (a') the Raman shift values and, at each shift value, (b') the Raman peak and (c') the Raman trough generated by the machine learning algorithms; (d) the sensitivity, (e) the stability, and (f) the repeatability inferred from them by machine learning; and the spectral selectivity calculated for the value selected within the corresponding shift; and
    optionally, a biometric information prediction unit (C) that, when (a) the Raman shift value and, at each shift value, (b) the Raman peak and (c) the Raman trough of a sample are entered, calculates the desired biometric prediction information from the constructed Raman spectrum database (B).
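The threshold definitions in claim 22 can be made concrete with a short sketch. It assumes that each repeated measurement at one shift supplies an intensity and a signal-to-noise ratio expressed as a fraction between 0 and 1, and it uses the population standard deviation for δ. These choices and all function names are assumptions. The claim has the quantities inferred by machine learning, whereas this sketch only evaluates the stated definitions directly.

```python
from statistics import mean, pstdev


def sensitivity(snrs, snr_floor=0.5):
    """Fraction of spectra whose signal-to-noise ratio is at least snr_floor."""
    return sum(1 for s in snrs if s >= snr_floor) / len(snrs)


def stability(intensities):
    """Fraction of intensities strictly inside the open range (mu - delta, mu + delta).

    If every value is identical, delta is 0, the open range is empty,
    and the result is 0.0; callers may want to special-case that.
    """
    mu = mean(intensities)
    delta = pstdev(intensities)  # population standard deviation (assumed)
    inside = sum(1 for x in intensities if mu - delta < x < mu + delta)
    return inside / len(intensities)


def is_repeatable(stab, sens):
    """Repeatable: stability and sensitivity are both 50% or more."""
    return stab >= 0.5 and sens >= 0.5


def is_selected(stab, sens):
    """Selected: repeatable, stability 80% or more, sensitivity 90% or more."""
    return is_repeatable(stab, sens) and stab >= 0.8 and sens >= 0.9
```

For ten readings of which nine equal 10.0 and one equals 20.0, the mean is 11.0 and δ is 3.0, so nine readings fall inside (8.0, 14.0) and the stability is 0.9. Spectral selectivity is left out because its formula is not restated in this claim.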
  23. The apparatus of claim 22, wherein the apparatus performs the method for constructing and searching a Raman scattering spectrum database through machine learning according to any one of claims 1 to 20.
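The search step can be sketched as a lookup over stored entries. The claims do not state how the database is queried, so the sketch substitutes a plain nearest-neighbour search on (shift, peak, trough). The entry layout, the `label` field, and the function name are hypothetical.

```python
import math


def nearest_entry(database, shift, peak, trough):
    """Return the stored entry closest to the query in (shift, peak, trough) space.

    database: list of dicts with "shift", "peak", and "trough" keys.
    Euclidean distance on raw values is an illustrative choice only.
    """
    if not database:
        raise ValueError("database is empty")
    query = (shift, peak, trough)
    return min(
        database,
        key=lambda e: math.dist((e["shift"], e["peak"], e["trough"]), query),
    )
```

Because shift and intensity use different units, a practical search would normalize each axis first. The raw distance is kept here only to keep the sketch short.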
PCT/KR2021/020362 2020-12-31 2021-12-31 Construction of and searching method for raman scattering spectrum database through machine learning WO2022146103A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
KR10-2020-0189631 2020-12-31
KR20200189631 2020-12-31

Publications (1)

Publication Number Publication Date
WO2022146103A1 (en) 2022-07-07

Family

ID=82259807

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/KR2021/020362 WO2022146103A1 (en) 2020-12-31 2021-12-31 Construction of and searching method for raman scattering spectrum database through machine learning

Country Status (2)

Country Link
KR (1) KR20220097351A (en)
WO (1) WO2022146103A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117290669A (en) * 2023-11-24 2023-12-26 之江实验室 Optical fiber temperature sensing signal noise reduction method, device and medium based on deep learning

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7106437B2 (en) * 2003-01-06 2006-09-12 Exxonmobil Chemical Patents Inc. On-line measurement and control of polymer product properties by Raman spectroscopy
WO2017089427A1 (en) * 2015-11-23 2017-06-01 Celltool Gmbh Device and method for analyzing biological objects with raman spectroscopy
KR20200139237A (en) * 2018-04-06 2020-12-11 브라스켐 아메리카, 인크. Raman spectroscopy and machine learning for quality control

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
LUSSIER FéLIX; THIBAULT VINCENT; CHARRON BENJAMIN; WALLACE GREGORY Q.; MASSON JEAN-FRANCOIS: "Deep learning and artificial intelligence methods for Raman and surface-enhanced Raman scattering", TRAC TRENDS IN ANALYTICAL CHEMISTRY, ELSEVIER, AMSTERDAM, NL, vol. 124, 7 January 2020 (2020-01-07), AMSTERDAM, NL , XP086065392, ISSN: 0165-9936, DOI: 10.1016/j.trac.2019.115796 *
UYSAL CILOGLU FATMA, SARIDAG AYSE MINE, KILIC IBRAHIM HALIL, TOKMAKCI MAHMUT, KAHRAMAN MEHMET, AYDIN OMER: "Identification of methicillin-resistant Staphylococcus aureus bacteria using surface-enhanced Raman spectroscopy and machine learning techniques", ANALYST, ROYAL SOCIETY OF CHEMISTRY, UK, vol. 145, no. 23, 23 November 2020 (2020-11-23), UK , pages 7559 - 7570, XP055949288, ISSN: 0003-2654, DOI: 10.1039/d0an00476f *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117290669A (en) * 2023-11-24 2023-12-26 之江实验室 Optical fiber temperature sensing signal noise reduction method, device and medium based on deep learning
CN117290669B (en) * 2023-11-24 2024-02-06 之江实验室 Optical fiber temperature sensing signal noise reduction method, device and medium based on deep learning

Also Published As

Publication number Publication date
KR20220097351A (en) 2022-07-07

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21915900

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21915900

Country of ref document: EP

Kind code of ref document: A1