WO2022139465A1 - Method for early diagnosis of pancreatic cancer using an artificial intelligence Raman analysis technique - Google Patents

Method for early diagnosis of pancreatic cancer using an artificial intelligence Raman analysis technique

Info

Publication number
WO2022139465A1
Authority
WO
WIPO (PCT)
Prior art keywords
pancreatic cancer
information
learning
unit
raman spectrum
Prior art date
Application number
PCT/KR2021/019622
Other languages
English (en)
Korean (ko)
Inventor
남좌민
김송철
이학진
황재호
최용준
김지은
이우형
임경묵
이연희
차승상
Original Assignee
두에이아이(주)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 두에이아이(주)

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/20 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for computer-aided diagnosis, e.g. based on medical expert systems
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N21/00 Investigating or analysing materials by the use of optical means, i.e. using sub-millimetre waves, infrared, visible or ultraviolet light
    • G01N21/62 Systems in which the material investigated is excited whereby it emits light or causes a change in wavelength of the incident light
    • G01N21/63 Systems in which the material investigated is excited whereby it emits light or causes a change in wavelength of the incident light optically excited
    • G01N21/65 Raman scattering
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N21/00 Investigating or analysing materials by the use of optical means, i.e. using sub-millimetre waves, infrared, visible or ultraviolet light
    • G01N21/62 Systems in which the material investigated is excited whereby it emits light or causes a change in wavelength of the incident light
    • G01N21/63 Systems in which the material investigated is excited whereby it emits light or causes a change in wavelength of the incident light optically excited
    • G01N21/65 Raman scattering
    • G01N21/658 Raman scattering enhancement Raman, e.g. surface plasmons
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H10/00 ICT specially adapted for the handling or processing of patient-related medical or healthcare data
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H30/00 ICT specially adapted for the handling or processing of medical images
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H30/00 ICT specially adapted for the handling or processing of medical images
    • G16H30/40 ICT specially adapted for the handling or processing of medical images for processing medical images, e.g. editing
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/50 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for simulation or modelling of medical disorders
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/70 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for mining of medical data, e.g. analysing previous cases of other patients

Definitions

  • the present invention relates to a method for early diagnosis of pancreatic cancer using an artificial intelligence Raman analysis technique.
  • Raman scattering is an optical phenomenon generated by the interaction between light and molecular vibrational motion.
  • Because the Raman scattering signal is intrinsically weak, signal amplification such as surface-enhanced Raman scattering (SERS), which uses the surface plasmon of a plasmonic nanostructure, is essential.
  • the amplified signal contains information about various vibrational movements of molecules, and a specific pattern like a fingerprint appears in the spectrum as Raman shift and intensity. This specific pattern has been proposed to be applied to various fields such as specific molecular detection and quantification, and disease diagnosis through molecular detection.
  • SERS-based Raman spectrum can be used to diagnose diseases of the human body, and in particular, it can be actively used to diagnose pancreatic cancer, which is difficult to diagnose early.
  • pancreatic cancer which is one of the most difficult diseases to diagnose early
  • CT abdominal computed tomography
  • MRI magnetic resonance imaging
  • ERCP endoscopic retrograde cholangiopancreatography
  • EUS endoscopic ultrasound
  • PET positron emission tomography
  • CA19-9 Carbohydrate Antigen 19-9 indicator used for the conventional diagnosis of pancreatic cancer
  • Patent Document 1 Korean Patent Document No. 10-1830314
  • Patent Document 2 Korean Patent Publication No. 10-2021-0100068
  • the present invention has been devised to solve the above problems.
  • The present invention aims to overcome the limitations of existing Raman-based molecular detection methods, which use only limited information, by fusing Raman technology, which yields highly complex data in which composite information overlaps, with artificial intelligence, which readily extracts information from such highly complex data.
  • The present invention is also intended to generate an early diagnosis model of pancreatic cancer from a spectral biomarker obtained through a SERS substrate and a Raman spectrum obtained through a SLISA substrate, and to use this model to accurately diagnose whether a patient has pancreatic cancer or another cancer.
  • An embodiment of the present invention for solving the above problems provides a method of generating a pancreatic cancer diagnostic model, including: (a) collecting, by the learning data collection unit 100, Raman spectrum information from a patient's blood using a preset first method; (b) generating, by the learning data collection unit 100, learning data including the Raman spectrum information; and (c) generating, by the model building unit 400, a pancreatic cancer diagnosis model using the learning data, wherein the pancreatic cancer diagnosis model is a model that learns the learning data and, when Raman spectrum information collected from the blood of an arbitrary individual is input, outputs the presence or absence of pancreatic cancer in that individual.
  • The preset first method may be a method of placing the patient's blood on the nanocube substrate 10, on which a gold nanocube array having nanogaps 11 between the cubes is formed, and acquiring Raman spectrum information through a surface-enhanced Raman spectroscopy (SERS) technique.
  • SERS surface enhanced Raman Spectroscopy
  • the step (a) may further include the step of collecting, by the learning data collection unit 100, biomarker concentration information.
  • The biomarker concentration information may include concentration information about any one or more of CA19-9 (carbohydrate antigen 19-9), CEA (carcinoembryonic antigen), LRG1 (leucine-rich alpha-2-glycoprotein 1), CFB (complement factor B), TTL (tubulin tyrosine ligase), and thrombospondin-2 (THBS2).
  • The method may further include step (x), in which the data preprocessing unit 200 performs preprocessing using the Raman spectrum information.
  • Step (x) may include: (x1) removing, by the data preprocessing unit 200, outliers from the Raman spectrum information; and (x2) removing, by the data preprocessing unit 200, a baseline from the Raman spectrum information from which the outliers have been removed.
  • The method may further include a step in which the variable selection unit 300 selects variables from the preprocessed learning data through any one or more of principal component analysis (PCA) and deep learning, and generates selection learning data so that the selected variables are included in the preprocessed learning data. In step (c), the model building unit 400 may then learn the selection learning data and generate a pancreatic cancer diagnostic model that outputs whether an arbitrary individual has pancreatic cancer when clinical information of the individual and Raman spectrum information collected from the blood of the individual are input.
  • The method may further include step (z), in which the variable selection unit 300 generates selection learning data that includes any one or more of biomarker concentration information and patient clinical information. In step (c), the model building unit 400 learns the selection learning data and generates a pancreatic cancer diagnostic model that outputs whether an arbitrary individual has pancreatic cancer when clinical information of the individual, biomarker concentration information of the individual, and Raman spectrum information collected from the blood of the individual are input. The patient clinical information may include any one or more of gender, age, weight, height, and body mass index.
  • The patient clinical information may further include a pancreatic cancer progression stage. After step (x), the method may further include step (o), in which the variable selection unit 300 generates selection learning data that includes any one or more of a variable selected by principal component analysis (PCA), a variable selected by deep learning, the biomarker concentration information, and the patient clinical information. In step (c), the model building unit 400 learns the selection learning data and generates a pancreatic cancer diagnostic model that outputs whether an arbitrary individual has pancreatic cancer when clinical information of the individual, biomarker concentration information of the individual, and Raman spectrum information collected from the blood of the individual are input. The patient clinical information may include any one or more of the patient's gender, age, weight, height, and body mass index.
  • PCA principal component analysis
  • The method may further include step (d), in which the additional learning unit 600 further trains the pancreatic cancer diagnosis model. Step (d) may include: (d1) generating, by the additional learning unit 600, additional learning data by applying to the original signal of the Raman spectrum information any one or more of noise addition (jittering), signal scaling, signal rotation, interval mixing (permutation), distortion addition (magnitude warping), linear transformation, and shifting.
  • Step (d) may further include: (d2) generating, by the additional learning unit 600, further additional learning data by changing the baseline of the Raman spectrum information; and (d3) further training, by the additional learning unit 600, the pancreatic cancer diagnosis model using the further additional learning data.
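The augmentation operations named in steps (d1) and (d2) can be sketched as below. This is a minimal illustration in pure Python, assuming a spectrum is a plain list of intensity values. The function names, the noise level, the scaling range, the shift range, and the linear form of the added baseline are all illustrative assumptions and are not taken from the patent.

```python
import random


def jitter(spectrum, sigma=0.01, rng=random):
    """Jittering: add zero-mean Gaussian noise to every intensity value."""
    return [v + rng.gauss(0.0, sigma) for v in spectrum]


def scale(spectrum, factor):
    """Scaling: multiply the whole signal by one factor."""
    return [v * factor for v in spectrum]


def shift(spectrum, offset):
    """Shifting: move the signal by `offset` points, repeating the edge value."""
    n = len(spectrum)
    return [spectrum[min(max(i - offset, 0), n - 1)] for i in range(n)]


def permute(spectrum, n_segments, rng=random):
    """Permutation: cut the signal into segments and shuffle their order."""
    n = len(spectrum)
    bounds = [round(k * n / n_segments) for k in range(n_segments + 1)]
    segments = [spectrum[bounds[k]:bounds[k + 1]] for k in range(n_segments)]
    rng.shuffle(segments)
    return [v for seg in segments for v in seg]


def change_baseline(spectrum, slope, intercept):
    """Step (d2), assumed form: add a linear baseline to the signal."""
    return [v + intercept + slope * i for i, v in enumerate(spectrum)]


def augment(spectrum, n_copies, seed=0):
    """Generate `n_copies` augmented spectra from one original spectrum."""
    rng = random.Random(seed)
    copies = []
    for _ in range(n_copies):
        s = jitter(spectrum, 0.01, rng)
        s = scale(s, rng.uniform(0.9, 1.1))
        s = shift(s, rng.randint(-2, 2))
        copies.append(s)
    return copies
```

Each augmented copy keeps the label of the spectrum it was derived from, so the copies can be appended to the learning data before the model is trained further.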
  • The model building unit 400 may generate the pancreatic cancer diagnostic model by learning the learning data using any one or more of an artificial neural network (ANN), a support vector machine, logistic regression, tree-based gradient boosting, and deep learning.
  • a method for diagnosing pancreatic cancer using the generated model after step (c), (e) the patient's clinical information and the patient to the diagnostic unit 500 that has received the pancreatic cancer diagnosis model
  • the stage of pancreatic cancer progression is outputted through the output unit.
  • Also provided is a system for diagnosing pancreatic cancer using the generated model, comprising: an input device that receives clinical information of an arbitrary individual and Raman spectrum information obtained from the blood of the individual; and a computing device that holds the pancreatic cancer diagnosis model and, when the clinical information and Raman spectrum information received by the input device are input to the model, outputs whether the individual has pancreatic cancer.
  • Also provided is a method of collecting Raman spectrum information using a substrate for performing the above method, comprising: (q) diluting the patient's blood by a predetermined factor and placing the diluted blood on the nanocube substrate 10 on which the nanogaps 11 are formed; and (r) generating, by a Raman spectroscopy unit, a Raman signal on the nanocube substrate 10 by surface-enhanced Raman spectroscopy (SERS), and collecting, by the SSFA biomarker collection unit 120, the Raman spectrum information.
  • According to the present invention, the spectrum itself can be used as a biomarker, because the overall Raman spectrum pattern identified in blood carries cancer-related information in addition to quantitative biomarker information. This has the advantage of using material information that could have been missed in existing liquid biopsy.
  • the present invention can generate an early diagnosis model of pancreatic cancer through the spectral biomarker obtained through the SERS substrate, and use this to accurately diagnose whether a patient has pancreatic cancer and other types of cancer.
  • FIG. 1 is a schematic diagram for explaining a method for generating a pancreatic cancer diagnostic model according to the present invention.
  • FIG. 2 shows the spectrum comparison results according to the SERS material and the difference in the Raman spectrum between pancreatic cancer and a normal person.
  • FIG. 3 is a diagram for explaining a method of acquiring a Raman spectrum using SSFA technology.
  • FIG. 5 is a diagram for explaining a method of a data pre-processing unit pre-processing learning data.
  • FIG. 6 is a graph illustrating an original signal of a Raman spectrum and a signal from which a reference line is removed as a result of data preprocessing.
  • FIG. 7 is a view for explaining the accuracy of the pancreatic cancer diagnosis model generated according to the present invention.
  • FIG. 8 is a diagram for explaining that the diagnosis unit according to the present invention outputs the pancreatic cancer progression stage.
  • FIGS. 9 to 12 are diagrams for explaining that the additional learning unit additionally learns the pancreatic cancer diagnosis model.
  • FIG. 13 is a view for explaining a nanocube substrate.
  • FIG. 14 is a diagram for explaining a process of acquiring a Raman spectrum from a nanocube substrate.
  • FIG. 15 is a diagram for explaining a Raman spectrum in the presence and absence of plasma.
  • The present invention is described in terms of diagnosing pancreatic cancer but is not limited thereto; its primary fields of application are hepato-pancreato-biliary surgery and diagnostic laboratory medicine, and it can be expanded further.
  • Early diagnosis of a pancreatic cancer patient can be confirmed easily and quickly by using artificial intelligence to analyze a Raman spectrum related to the chemical components present in the patient's blood.
  • the present invention can perform early diagnosis of pancreatic cancer patients by applying a novel liquid biopsy technology that combines surface enhanced Raman spectroscopy (SERS) technology and artificial intelligence technology.
  • a diagnosis result analyzed by artificial intelligence can be output by collecting plasma from a patient, putting it on a SERS substrate and putting it in a Raman spectrometer.
  • The method according to the present invention involves a learning data collection unit 100, a data preprocessing unit 200, a variable selection unit 300, a model building unit 400, a diagnosis unit 500, and an additional learning unit 600.
  • the learning data collection unit 100 collects data for generating the pancreatic cancer diagnosis model of the present invention, and generates learning data by using it.
  • the learning data collection unit 100 includes an input unit 110 , an SSFA biomarker collection unit 120 , and a biomarker concentration collection unit 130 .
  • Patient clinical information is input to the input unit 110 .
  • Patient clinical information may include any one or more of the patient's gender, age, weight, height, body mass index, and pancreatic cancer progression stage.
  • the present invention is not limited thereto, and may include all information related to a patient's clinical practice, for example, blood pressure.
  • the learning data collection unit 100 collects the patient's clinical information and Raman spectrum information about the patient's blood using a preset first method.
  • biomarker concentration information is further collected.
  • a pancreatic cancer diagnostic model may be generated using any one or more of patient clinical information, Raman spectrum information, and biomarker concentration information.
  • SSFA biomarker collection unit 120 collects Raman spectrum information by surface-enhanced Raman spectroscopy (SERS) of the patient's blood (plasma) on the nano-cube substrate 10.
  • Learning data collection unit 100 may generate a plurality of learning data by repeating the above process.
  • FIG. 3A illustrates collecting Raman spectrum information from the nanocube substrate 10.
  • FIG. 4 is a diagram illustrating a Raman spectrum obtained from the same sample on the same nanocube substrate 10.
  • the SSFA biomarker collecting unit 120 may collect a plurality of Raman spectra for the same target by applying Raman spectroscopy to the blood of a patient placed on the same substrate multiple times.
  • the SSFA biomarker collecting unit 120 collects 50 Raman spectra by applying Raman spectroscopy to the blood of a patient placed on the same substrate multiple times.
  • A nanogap 11 is formed in the nanocube substrate 10. The nanocube substrate 10 is described in more detail later.
  • The spectrum obtained by Raman spectroscopy may be defined as a one-dimensional signal vector holding the Raman scattering intensity at each wavenumber within a certain wavenumber range.
  • FIG. 2 shows spectrum comparison results according to the design (material/form) of the substrate onto which plasma is injected (FIGS. 2(a) and 2(b)) and the difference in Raman spectrum between pancreatic cancer patients and normal individuals (FIG. 2(c)).
  • As shown in FIG. 2(a), comparing spectral changes across various substrate materials (gold film, nanocube, etc.) confirmed that a Raman amplification technique, such as the use of a nanocube-plasma mixture, is necessary. With glass or gold film, which provide no such signal amplification, no Raman signal was observed.
  • the biomarker concentration collecting unit 130 may check biomarker concentration information for diagnosing pancreatic cancer. This is transmitted to the variable selection unit 300 to be described later, and may be used as one variable for diagnosing pancreatic cancer in the variable selection unit 300 .
  • the method of collecting biomarker concentration information in the biomarker concentration collecting unit 130 is not limited to a specific method.
  • The biomarker concentration collecting unit 130 may collect concentration information for known biomarkers used for diagnosing pancreatic cancer.
  • The biomarker concentration information may be collected from the patient's blood using an enzyme-linked immunoassay (ELISA), but collection is not limited to a specific method.
  • ELISA enzyme-linked immunoassay
  • The biomarker concentration information may include concentration information for any one or more of CA19-9 (carbohydrate antigen 19-9), CEA (carcinoembryonic antigen), LRG1 (leucine-rich alpha-2-glycoprotein 1), CFB (complement factor B), TTL (tubulin tyrosine ligase), and thrombospondin-2 (THBS2).
  • The data pre-processing unit 200 receives learning data from the learning data collection unit 100 and pre-processes it.
  • The data preprocessor 200 removes outliers from the Raman spectrum information.
  • A normal range is calculated for each moving window, and an outlier is an intensity value that falls below or above the normal range of its window.
  • the moving windows may have a length of 100, and an interval between the moving windows is 50, but is not limited thereto.
  • the data preprocessor 200 calculates an average value and a standard deviation value of the signal strength for each moving window in order to calculate the normal range.
  • The normal range is taken to run from (mean - a × standard deviation) to (mean + a × standard deviation), where a is set to 5; the normal range can be modified by changing a to a value between 1 and 10.
  • For each Raman spectrum, the data preprocessor 200 takes the average of the values in the moving window, excluding outliers, as a correction value.
  • the data preprocessor 200 may replace intensity values corresponding to the outliers with the correction values.
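The outlier-removal step described above can be sketched as follows. This is a minimal illustration using only the Python standard library, assuming a spectrum is a list of intensity values. The window length of 100, the step of 50, and a = 5 follow the text; the function name, the use of the population standard deviation, and the handling of a shorter final window are assumptions.

```python
from statistics import mean, pstdev


def remove_outliers(spectrum, window=100, step=50, a=5.0):
    """Replace intensities outside mean +/- a*std of their moving window
    with the mean of that window's remaining (non-outlier) values."""
    cleaned = list(spectrum)
    n = len(spectrum)
    for start in range(0, n, step):
        segment = spectrum[start:start + window]  # last windows may be shorter
        mu, sd = mean(segment), pstdev(segment)
        low, high = mu - a * sd, mu + a * sd
        inliers = [v for v in segment if low <= v <= high]
        if not inliers or len(inliers) == len(segment):
            continue  # nothing to correct in this window
        correction = mean(inliers)
        for i, v in enumerate(segment):
            if not (low <= v <= high):
                cleaned[start + i] = correction
    return cleaned
```

Because the windows overlap, a point can be checked twice; statistics are always computed from the original signal so that one window's correction does not influence the next.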
  • the diagnostic model according to the present invention is not limited thereto, and an artificial intelligence model capable of ignoring outliers and analyzing spectra can be created without removing outliers from each of a plurality of spectra.
  • the data preprocessor 200 removes a baseline for each spectrum.
  • A baseline may be estimated using a penalty-weighted least squares method.
  • This method assumes that the original Raman spectrum is composed of a baseline, noise, and a filtered spectrum.
  • the baseline refers to a line continuously output regardless of blood when a spectrum for the patient's blood is extracted.
  • the noise refers to various noises generated by the substrate when the spectrum is extracted from the SERS substrate.
  • the filtered spectrum means a spectrum from which a baseline and noise are removed from the original spectrum.
  • the data preprocessor 200 iteratively estimates the noise level in the signal and adjusts the weight accordingly.
  • The data preprocessor 200 gives weight to a point of the original signal only when it lies below the baseline.
  • The baseline is estimated repeatedly so that the squared residual between the weighted baseline and the original signal is minimized.
  • Finally, the data preprocessor 200 subtracts from the original signal the baseline at which the residual is minimal.
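The iterative, asymmetrically weighted baseline estimate described above can be sketched as follows. This is a simplified illustration, not the patent's exact algorithm: it assumes the baseline can be modeled as a low-degree polynomial, whereas penalized smoothers are also commonly used for this purpose. The function names, the polynomial degree, and the small weight given to points above the baseline (used instead of zero to keep the system solvable) are assumptions.

```python
def solve(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def estimate_baseline(signal, degree=2, low_weight=1e-3, max_iter=50):
    """Weighted least-squares polynomial fit, re-weighted each iteration so
    that points below the current baseline dominate the fit."""
    n, k = len(signal), degree + 1
    xs = [2.0 * i / (n - 1) - 1.0 for i in range(n)]  # scale x to [-1, 1]
    basis = [[x ** p for p in range(k)] for x in xs]
    weights = [1.0] * n
    baseline = list(signal)
    for _ in range(max_iter):
        ata = [[sum(w * r[i] * r[j] for w, r in zip(weights, basis))
                for j in range(k)] for i in range(k)]
        aty = [sum(w * r[i] * y for w, r, y in zip(weights, basis, signal))
               for i in range(k)]
        coeffs = solve(ata, aty)
        baseline = [sum(c * v for c, v in zip(coeffs, r)) for r in basis]
        new_weights = [1.0 if y < z else low_weight
                       for y, z in zip(signal, baseline)]
        if new_weights == weights:
            break  # weights stopped changing: the estimate has converged
        weights = new_weights
    return baseline


def remove_baseline(signal, **kwargs):
    """Subtract the estimated baseline from the original signal."""
    baseline = estimate_baseline(signal, **kwargs)
    return [y - z for y, z in zip(signal, baseline)]
```

Down-weighting points that lie above the current estimate keeps Raman peaks from pulling the baseline upward, which is the role the text assigns to the weights.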
  • In FIG. 6, both the original signal and the signal from which the baseline is removed are shown.
  • The original signal is the upper trace, and the signal with the baseline removed is the lower trace.
  • Alternatively, the baseline and noise contained in original spectra may be extracted and used as training data for an artificial intelligence model, and the pre-trained model may then be used to extract and remove the baseline and noise from an original spectrum.
  • the variable selection unit 300 receives the data pre-processed by the data pre-processing unit 200 described above.
  • The variable selection unit 300 selects variables from the pre-processed data to generate selection learning data.
  • the variable selection unit 300 transmits selection learning data to the model building unit 400 .
  • The variable selection unit 300 may select a variable through any one or more of principal component analysis (PCA) and deep learning.
  • the corresponding process may be performed in the dimension reduction variable selection unit 320 of the variable selection unit 300 .
  • The variable selection unit 300 may receive biomarker concentration information from the learning data collection unit 100 and use it as a variable. The variable selection unit 300 may also receive patient clinical information from the learning data collection unit 100 and select it as a variable.
  • variable input unit 330 of the variable selection unit 300 may receive the biomarker concentration and patient clinical information from the learning data collection unit 100 .
  • The selection learning data may include any one or more of a variable selected by principal component analysis, a variable selected by deep learning, the biomarker concentration information, and patient clinical information.
  • the model building unit 400 to be described later may generate a plurality of pancreatic cancer diagnosis models by using the selection learning data.
  • the variable selector 300 may select a variable by performing a principal component analysis (PCA) on the learning data.
  • the principal component w1 of the data set x is defined as follows.
  • x corresponds to Raman spectrum data obtained by analyzing serum with a Raman spectrometer.
  • the kth principal component can be found by subtracting the previous k-1 principal components:
  • principal component analysis is equivalent to finding a singular value decomposition of a data matrix X and then mapping X into a subspace defined by L singular vectors, WL, to find a partial data set Y.
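For reference, the standard textbook formulation of the quantities described above can be written as follows, assuming a mean-centered data matrix X whose rows x_(i) are individual spectra. The notation is an assumption and may differ from the patent's own equations.

```latex
% First principal component: the unit vector that maximizes projected variance
\mathbf{w}_{(1)}
  = \arg\max_{\|\mathbf{w}\|=1} \sum_i \left(\mathbf{x}_{(i)} \cdot \mathbf{w}\right)^2
  = \arg\max_{\|\mathbf{w}\|=1} \mathbf{w}^{\mathsf T}\mathbf{X}^{\mathsf T}\mathbf{X}\,\mathbf{w}

% k-th component: remove the first k-1 components from X, then repeat
\hat{\mathbf{X}}_k
  = \mathbf{X} - \sum_{s=1}^{k-1} \mathbf{X}\,\mathbf{w}_{(s)}\mathbf{w}_{(s)}^{\mathsf T},
\qquad
\mathbf{w}_{(k)} = \arg\max_{\|\mathbf{w}\|=1} \left\|\hat{\mathbf{X}}_k \mathbf{w}\right\|^2

% Equivalence with the singular value decomposition and the reduced data set Y
\mathbf{X} = \mathbf{U}\boldsymbol{\Sigma}\mathbf{W}^{\mathsf T},
\qquad
\mathbf{Y} = \mathbf{X}\mathbf{W}_L
```

Here W_L holds the first L right singular vectors of X, which are the eigenvectors of XᵀX with the L largest eigenvalues.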
  • the eigenvector with the largest eigenvalue corresponds to the dimension with the strongest correlation in the data set.
  • the variable selector 300 may select the eigenvector as a variable through the above process.
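Selecting the leading eigenvector can be sketched with power iteration, as below. This is a minimal pure-Python illustration, assuming each row is one spectrum and each column one wavenumber; the function name, the starting vector, and the fixed iteration count are assumptions, and a linear algebra library would normally be used for real spectra.

```python
import math


def first_principal_component(rows, n_iter=200):
    """Return (w, scores): the leading eigenvector of X^T X for the
    mean-centered data X, and the projection of every row onto it."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]

    # Deterministic, non-uniform start vector (an arbitrary choice).
    w = [float(j + 1) for j in range(d)]
    norm = math.sqrt(sum(c * c for c in w))
    w = [c / norm for c in w]

    for _ in range(n_iter):
        # One power-iteration step: w <- X^T (X w), then normalize.
        proj = [sum(a * b for a, b in zip(row, w)) for row in centered]
        v = [sum(p * row[j] for p, row in zip(proj, centered))
             for j in range(d)]
        norm = math.sqrt(sum(c * c for c in v))
        if norm == 0.0:
            break  # data has no variance along the current direction
        w = [c / norm for c in v]

    scores = [sum(a * b for a, b in zip(row, w)) for row in centered]
    return w, scores
```

The returned scores are the values that would be passed on as a selected variable; further components can be obtained by subtracting the found component from the data and repeating.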
  • The variable selection unit 300 may select a variable by applying deep learning to the learning data. In this case, the corresponding process may be performed in the deep learning variable selection unit 310 of the variable selection unit 300.
  • Deep learning uses artificial neural networks composed of many hidden layers between the input and the output.
  • One-dimensional convolutional neural networks (1-D CNNs) are specialized in reflecting local characteristics of one-dimensional data.
  • Long short-term memory (LSTM) networks, a type of recurrent neural network (RNN), excel at analyzing sequential data such as speech and character strings.
  • A deep learning model that combines these two models has shown good performance on classification and regression problems involving ECG signals and motion detection signals, so it can be used here.
  • The deep learning used in the present invention may include a deep neural network (DNN), a convolutional neural network (CNN), a recurrent neural network (RNN), a generative adversarial network (GAN), and reinforcement learning (RL), but is not limited thereto, and various deep learning methods can be applied.
  • DNN Deep Neural Network
  • CNN Convolutional Neural Network
  • RNN Recurrent Neural Network
  • GAN Generative Adversarial Network
  • RL Reinforcement Learning
  • the model building unit 400 generates a pancreatic cancer diagnosis model by using the learning data.
  • the model building unit 400 may generate a pancreatic cancer diagnosis model using the selection learning data.
  • The model building unit 400 may generate a pancreatic cancer diagnostic model by learning with any one or more of an artificial neural network (ANN), a support vector machine, logistic regression, tree-based gradient boosting, and deep learning.
  • ANN artificial neural network
  • The method by which the model building unit 400 learns the learning data is not limited to the above, and various types of machine learning and deep learning may of course be applied.
  • the model building unit 400 generates the pancreatic cancer diagnosis model as an artificial neural network model trained on the learning data, but the method of generating a pancreatic cancer diagnosis model in the present invention is not limited to a specific method.
  • the artificial neural network mimics the operation of a biological neuron: each node receives data as input, multiplies the input by a weight, and sends the result of an activation function f to the next neuron.
  • the activation function is the function applied to the product of the weights and the inputs (nodes) when the hidden layer components are computed.
  • the output function is the function applied to the product of the weights and the inputs when the output layer result is computed.
  • the loss function measures the error between the result of the output function and the target value (y), and is used for weight learning.
  • the artificial neural network-based early diagnosis system of the present invention is composed of an input layer that receives the selected learning data, a hidden layer whose values are obtained by multiplying by weights and applying the activation function, and an output layer that multiplies the values of the final hidden layer (or input layer) by weights and generates the result of the output function.
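The layer structure described above can be illustrated with a minimal NumPy sketch. It is not the patent's Equation 6, which is not reproduced in this text. The layer sizes, the tanh activation, the sigmoid output function, the binary cross-entropy loss, and all variable names are assumptions chosen only for illustration.

```python
import numpy as np

def sigmoid(z):
    """Assumed output function: maps a real score to a value in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def forward(x, w_hidden, b_hidden, w_out, b_out):
    """One hidden layer: weights times inputs, then activation, then output function."""
    hidden = np.tanh(x @ w_hidden + b_hidden)   # hidden layer (assumed tanh activation)
    return sigmoid(hidden @ w_out + b_out)      # output layer result

def binary_cross_entropy(prob, y):
    """Assumed loss: error between the output function result and the target y."""
    prob = np.clip(prob, 1e-12, 1.0 - 1e-12)
    return float(-np.mean(y * np.log(prob) + (1 - y) * np.log(1 - prob)))

# Hypothetical data: 4 spectra with 16 features each, and binary labels.
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 16))
y = np.array([1, 0, 1, 0])
w_hidden = rng.normal(scale=0.1, size=(16, 8))
b_hidden = np.zeros(8)
w_out = rng.normal(scale=0.1, size=8)
b_out = 0.0

prob = forward(x, w_hidden, b_hidden, w_out, b_out)
loss = binary_cross_entropy(prob, y)
```

Training would adjust `w_hidden`, `b_hidden`, `w_out`, and `b_out` to reduce `loss`; that step is omitted here.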
  • the artificial neural network model may be implemented by the operation of the following [Equation 6].
  • the model building unit 400 may generate a pancreatic cancer diagnosis model by performing deep learning using the selected learning data.
  • the model building unit 400 trains two or more deep neural networks and then integrates them to output a final diagnosis result.
  • the result values collected from the individual networks are set as the input data of the final integrated model, and the diagnosis result corresponding to the originally input spectrum may be set as the target variable.
  • ensemble learning learns a weight for the output of each deep neural network so that the final integrated model produces integrated diagnosis results with high performance.
  • the trained integrated model receives spectral data, combines the diagnosis result values of the included deep neural networks according to the learned weights, and outputs the final diagnosis result.
  • two or more results may be derived by using each of two or more deep neural networks, and the result indicated by more than half of the derived results may be taken as the final result (hard voting).
  • alternatively, two or more results may be derived using two or more deep neural networks, the average of the derived results may be calculated, and the calculated average may be determined as the final result (soft voting).
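The difference between hard voting and soft voting can be sketched as follows. The probability values, the 0.5 decision threshold, and the function names are illustrative assumptions, not values from the patent.

```python
import numpy as np

def hard_vote(probabilities, threshold=0.5):
    """Each network casts a 0/1 vote; the label chosen by more than half wins."""
    votes = np.asarray(probabilities) >= threshold      # shape (n_models, n_samples)
    return (votes.sum(axis=0) * 2 > votes.shape[0]).astype(int)

def soft_vote(probabilities, threshold=0.5):
    """Average the network outputs, then threshold the average."""
    mean_prob = np.asarray(probabilities).mean(axis=0)
    return (mean_prob >= threshold).astype(int)

# Hypothetical outputs of three networks (rows) for three spectra (columns).
probs = np.array([
    [0.95, 0.40, 0.20],
    [0.45, 0.60, 0.30],
    [0.45, 0.70, 0.10],
])

hard = hard_vote(probs)   # first spectrum: only one of three votes is positive
soft = soft_vote(probs)   # first spectrum: the average (about 0.62) is positive
```

The first spectrum shows how the two rules can disagree: one very confident network outweighs two mildly negative ones under soft voting but not under hard voting.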
  • a result can be derived using any one of a multi-path one-dimensional convolutional neural network, a two-dimensional convolutional neural network, and a model combining a convolutional neural network and a recurrent neural network, and in some cases a result can be derived by combining two or more of these neural network models.
  • the control unit evaluates the performance of each neural network. When the performance of each neural network is less than or equal to a reference level, the control unit may derive a result using an ensemble model of two or more of the plurality of neural networks, and when the performance is above the reference level, a result may be derived using any one of the plurality of neural networks.
  • when the diagnosis unit 500 receives the pancreatic cancer diagnosis model and at least one of the patient's clinical information and the patient's Raman spectrum information is input to the diagnosis unit 500, the presence or absence of pancreatic cancer may be output through the output unit.
  • the pancreatic cancer diagnosis model may also learn the pancreatic cancer progression stage included in the patient's clinical information and output the pancreatic cancer progression stage.
  • Figure 8 shows the results of Raman spectrum measurement of a total of 80 samples (40 patients with pancreatic ductal adenocarcinoma and 40 patients with cholelithiasis; 9 patients in stage 1, 10 patients in stage 2, and 21 patients in late stage 2; surgical cases from Asan Hospital). The pancreatic cancer diagnostic performance was compared with that of the overall CA19-9 and CEA levels.
  • AUROC is 0.969
  • Accuracy is 0.900
  • Sensitivity is 0.825
  • Specificity is 0.975.
  • ANN artificial neural network
  • AUROC is 0.944
  • Accuracy is 0.938
  • Sensitivity is 0.900
  • Specificity is 0.975.
  • AUROC of CA19-9 measured by conventional ELISA is 0.762
  • Accuracy is 0.762
  • Sensitivity is 0.575
  • Specificity is 0.950. Accordingly, it can be seen that the diagnostic model according to the present invention has higher accuracy in diagnosing pancreatic cancer than CA19-9 measured by conventional ELISA.
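The reported accuracy, sensitivity, and specificity are mutually consistent if all 80 samples (40 cancer, 40 control) were scored. The sketch below recomputes them from confusion-matrix counts. The counts are back-calculated under that assumption and are not stated in the source, and the dictionary keys are labels chosen here for illustration.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Return (accuracy, sensitivity, specificity) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)            # true positive rate
    specificity = tn / (tn + fp)            # true negative rate
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return accuracy, sensitivity, specificity

# Counts (tp, fn, tn, fp) inferred from the reported rates, assuming 40 + 40 samples.
counts = {
    "first reported model": (33, 7, 39, 1),
    "ANN": (36, 4, 39, 1),
    "CA19-9 ELISA": (23, 17, 38, 2),
}
results = {name: diagnostic_metrics(*c) for name, c in counts.items()}
```

AUROC cannot be recovered from these counts, because it depends on the ranking of the continuous scores rather than on a single threshold.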
  • the pancreatic cancer diagnosis model according to the present invention may be additionally trained by the additional learning unit 600.
  • with reference to FIGS. 9 to 12, the additional learning unit 600 additionally trains the pancreatic cancer diagnosis model using additional learning data.
  • the additional learning unit 600 may generate additional training data by applying, to the original signal of each Raman spectrum, noise addition (jittering), signal scaling (scaling), signal rotation (rotation), interval mixing (permutation), distortion addition (magnitude warping), linear transformation, and shifting.
  • "Jittering" means adding a noise signal based on a Gaussian distribution to the original signal.
  • "Scaling" means multiplying the original signal by a random real value.
  • "Signal rotation" refers to rotating the Raman spectrum by a preset angle about a randomly selected point in the spectrum.
  • the preset angle may be -10 to 10 degrees.
  • the rotated data are the result of multiplying the original data by the rotation transformation matrix $\begin{pmatrix} \cos X & -\sin X \\ \sin X & \cos X \end{pmatrix}$, where X means the angle of rotation in radians.
  • "Permutation" refers to dividing the original signal into a random number of pieces, then shuffling the pieces and joining them again.
  • "Magnitude warping" means adding a random line segment to the signal.
  • "Shifting" means randomly shifting the spectrum by -2 to 2 along the x-axis, in which case data that falls outside the data range is deleted and the missing part is filled using the closest data.
  • by learning from this additional training data, the pancreatic cancer diagnosis model can perform analysis while ignoring noise in the input original signal, and can thereby derive more accurate diagnosis results without a separate noise-control process.
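The augmentations defined above can be sketched in NumPy as follows. All parameter values (noise level, scaling range, number of pieces, offset range) and function names are assumptions. The shift is taken in units of sample indices, and linear transformation is omitted because the text does not define it.

```python
import numpy as np

def jitter(y, rng, sigma=0.01):
    """Jittering: add zero-mean Gaussian noise (sigma is an assumed value)."""
    return y + rng.normal(0.0, sigma, size=y.shape)

def scale(y, rng, low=0.9, high=1.1):
    """Scaling: multiply the whole signal by one random real value."""
    return y * rng.uniform(low, high)

def rotate(x, y, rng, max_deg=10.0):
    """Rotation: rotate the (x, y) curve about a randomly selected point."""
    theta = np.deg2rad(rng.uniform(-max_deg, max_deg))
    pivot = int(rng.integers(len(x)))
    dx, dy = x - x[pivot], y - y[pivot]
    new_x = x[pivot] + np.cos(theta) * dx - np.sin(theta) * dy
    new_y = y[pivot] + np.sin(theta) * dx + np.cos(theta) * dy
    return new_x, new_y   # resampling onto the original x grid is left out

def permute(y, rng, max_pieces=5):
    """Permutation: split into a random number of pieces, shuffle, and rejoin."""
    n_pieces = int(rng.integers(2, max_pieces + 1))
    pieces = np.array_split(y, n_pieces)
    order = rng.permutation(n_pieces)
    return np.concatenate([pieces[i] for i in order])

def magnitude_warp(y, rng, max_offset=0.1):
    """Magnitude warping: add a random straight line segment to the signal."""
    start, end = rng.uniform(-max_offset, max_offset, size=2)
    return y + np.linspace(start, end, len(y))

def shift(y, rng, max_shift=2):
    """Shifting: move by up to max_shift samples and fill gaps with the nearest value."""
    k = int(rng.integers(-max_shift, max_shift + 1))
    out = np.roll(y, k)
    if k > 0:
        out[:k] = y[0]
    elif k < 0:
        out[k:] = y[-1]
    return out

# Hypothetical spectrum: one Gaussian peak on a flat offset, with both axes normalized.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 50)
y = 1.0 + np.exp(-((x - 0.5) ** 2) / 0.005)

augmented = {
    "jitter": jitter(y, rng),
    "scale": scale(y, rng),
    "permute": permute(y, rng),
    "magnitude_warp": magnitude_warp(y, rng),
    "shift": shift(y, rng),
}
rot_x, rot_y = rotate(x, y, rng)
```

Rotation only makes geometric sense when the two axes are on comparable scales, which is why this sketch normalizes `x` and keeps `y` of order one.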
  • the additional learning unit 600 may also generate further additional training data by changing the baseline (reference line) of the Raman spectrum information.
  • the additional learning unit 600 generates a virtual Raman spectrum by using the baseline estimated from a Raman spectrum, and performs additional training using the generated virtual Raman spectrum.
  • the additional learning unit 600 collects, for each original Raman spectrum, the baseline obtained by the above-described baseline estimation method and the filtered signal from which the baseline has been removed.
  • the additional learning unit 600 randomly shuffles the baselines and adds a baseline obtained from another Raman spectrum to each filtered signal.
  • FIG. 12(a) shows the original spectrum together with its baseline.
  • FIG. 12(b) shows the filtered signal of FIG. 12(a).
  • when the baselines of the left and right spectra in the upper part of FIGS. 12(a) and 12(b) are exchanged, two spectra are newly generated, and the newly generated spectra with the exchanged baselines are shown in the lower part, in FIGS. 12(c) and 12(d).
  • in this way, the additional learning unit 600 may artificially create a new virtual Raman spectrum.
  • as a result, the pancreatic cancer diagnosis model can later receive an original spectrum from which the baseline has not been removed and perform analysis while ignoring the baseline in the spectrum, which can lead to more accurate diagnosis results without a baseline-removal process.
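The baseline-exchange step can be sketched as below, assuming each original spectrum is the sum of a filtered signal and its baseline. The baseline estimation method itself is described elsewhere in the patent and is not reproduced here, so the baselines and filtered signals are synthetic stand-ins and the function name is hypothetical.

```python
import numpy as np

def swap_baselines(filtered, baselines, rng):
    """Add a randomly reassigned baseline to each filtered signal.

    filtered, baselines: arrays of shape (n_spectra, n_points),
    with original spectrum = filtered + baseline (assumed additive model).
    """
    order = rng.permutation(len(baselines))
    virtual = filtered + baselines[order]
    return virtual, order

# Synthetic stand-ins: three narrow peaks and three linear baselines.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 100)
filtered = np.stack([np.exp(-((x - c) ** 2) / 0.001) for c in (0.3, 0.5, 0.7)])
baselines = np.stack([a * x + b for a, b in ((1.0, 0.2), (-0.5, 1.0), (0.3, 0.5))])

virtual, order = swap_baselines(filtered, baselines, rng)
```

A plain random permutation can leave a spectrum paired with its own baseline. A stricter variant would redraw the permutation until no index maps to itself.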
  • the nanocube substrate 10 used to collect learning data from the SSFA biomarker collecting unit 120 will be described in more detail.
  • the optical properties of the nanocube substrate 10 were confirmed through rhodamine molecules (rhodamine 6G).
  • the nanocube substrate 10, whose usefulness as a SERS substrate has been confirmed, is then applied to clinical samples as described below.
  • blood plasma was diluted 10 times, 5 µL of each sample was placed on the nanocube substrate 10, and after waiting 30 minutes for adsorption of biomolecules, the substrate was dried under vacuum. Thereafter, the Raman spectrum of the clinical sample, with the molecules in the sample adsorbed on the substrate, was measured.
  • a Raman spectrum may be obtained through the nanocube substrate 10 on which the nanogap 11 is formed.
  • FIG. 13 shows a nanocube substrate 10 .
  • FIG. 13( a ) shows the nanocube substrate 10 in which nanogaps 11 with predetermined intervals are formed.
  • the nanocube substrate 10 forms the nanogap 11 through a gold nanocube array structure. Since Raman enhancement is known to occur in the nanogap 11, this structure was selected as the SERS substrate.
  • FIG. 13( b ) shows a scanning electron microscope (SEM) image of the nanocube substrate 10 .
  • FIG. 13(c) shows an image, obtained by mapping 0.1 mM rhodamine molecules to confirm the effectiveness of the nanocube substrate 10, in which the Raman intensity at 1509.46 cm⁻¹ is displayed in green and projected onto an optical microscope image.
  • a region (yellow) and a region (black) in which a hotspot was formed were both present, and point mapping was performed on the SERS substrate to confirm that a Raman signal was emitted only from the region in which the hotspot was formed.
  • the peak area in the vicinity of 1509.4 cm⁻¹ in the spectrum at each mapped location is displayed in green and projected onto the optical microscope image to create the image shown in FIG. 13(c). As shown in FIG. 13(c), the green color appeared in the shape of the region where the hotspot was formed, confirming that the Raman signal of the rhodamine molecule was generated only in the region where the hotspot was formed.
  • FIG. 13(d) shows the Raman spectra obtained through Raman mapping to confirm the change in the Raman intensity of the rhodamine molecule before and after exposure to the hotspot.
  • in FIG. 13(d), the spectrum before hotspot exposure is shown in red on the lower side and the spectrum after exposure is shown in blue on the upper side, with the average intensity drawn as a solid line and the standard deviation as a shaded band.
  • the representative peaks of the rhodamine molecule (1311.74, 1363.91, 1509.46, and 1651.76 cm⁻¹) increased by a minimum of 13 times and a maximum of 28 times after hotspot exposure, and the mean relative standard deviation was 38.3%.
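The two quantities reported above, the fold increase of each peak and the relative standard deviation (RSD), can be computed as in the following sketch. The intensity values are hypothetical and are not the measured data, and the exact averaging used in the source to obtain the mean RSD is not stated.

```python
import numpy as np

def fold_increase(before, after):
    """Ratio of mean peak intensity after exposure to that before, per peak."""
    return after.mean(axis=0) / before.mean(axis=0)

def relative_std_percent(intensities):
    """Sample standard deviation divided by the mean, in percent, per peak."""
    return intensities.std(axis=0, ddof=1) / intensities.mean(axis=0) * 100.0

# Hypothetical intensities: rows are mapping points, columns are two peaks.
before = np.array([[1.0, 2.0],
                   [3.0, 2.0]])
after = np.array([[20.0, 50.0],
                  [40.0, 70.0]])

fold = fold_increase(before, after)      # per-peak enhancement
rsd = relative_std_percent(after)        # per-peak spread after exposure
mean_rsd = float(rsd.mean())             # one way to summarize across peaks
```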
  • accordingly, it was confirmed that the nanocube substrate 10 according to the present invention can function as a SERS substrate for diagnosing pancreatic cancer.
  • if plasma is used as is, without dilution, it coagulates before wetting the substrate and no Raman signal is generated on the substrate, so the plasma is diluted before use.
  • here the plasma was diluted 10 times, but the dilution is not limited to this value.
  • after the clinical sample is placed on the substrate, it is dried for a predetermined time while waiting for biomolecule adsorption to occur.
  • the clinical sample is placed on the substrate and dried for 30 minutes, but the time is not limited to this value. However, it is known that the longer plasma is kept at room temperature, the further metabolic reactions proceed and the higher the concentration of molecules such as hypoxanthine becomes. Such molecules can appear in the spectral information and may distort the plasma information. For this reason, the sampling time should not exceed one hour.
  • Raman spectra of the substrate on which plasma was sampled and of a substrate without plasma are measured.
  • the plasma spectrum is measured under fixed conditions and is used both as learning data for generating the aforementioned artificial intelligence model and as input data to the artificial intelligence model, so that it can be used for pancreatic cancer diagnosis.
  • the world's first diagnostic technology combining surface-enhanced Raman spectroscopy and artificial intelligence was developed to supplement the low accuracy that was a problem with the existing liquid biopsy.
  • Information on the unique chemical composition in blood can be acquired at once by using the degree of light scattering, and information invisible to the human eye can be automatically extracted from complex Raman signals using artificial intelligence.

Landscapes

  • Health & Medical Sciences (AREA)
  • Engineering & Computer Science (AREA)
  • Medical Informatics (AREA)
  • Public Health (AREA)
  • General Health & Medical Sciences (AREA)
  • Epidemiology (AREA)
  • Primary Health Care (AREA)
  • Data Mining & Analysis (AREA)
  • Biomedical Technology (AREA)
  • Pathology (AREA)
  • Nuclear Medicine, Radiotherapy & Molecular Imaging (AREA)
  • Databases & Information Systems (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Radiology & Medical Imaging (AREA)
  • Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Chemical & Material Sciences (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Biochemistry (AREA)
  • Immunology (AREA)
  • Analytical Chemistry (AREA)
  • Investigating, Analyzing Materials By Fluorescence Or Luminescence (AREA)

Abstract

The present invention relates to a method for early diagnosis of pancreatic cancer using an artificial intelligence Raman analysis technique, the method comprising the steps in which: (a) a learning data collection unit (100) collects Raman spectrum information from a patient's blood using a preset first method; (b) the learning data collection unit (100) constructs learning data including the Raman spectrum information; and (c) a model building unit (400) generates a pancreatic cancer diagnosis model using the learning data, the pancreatic cancer diagnosis model being a model that, upon receiving as input Raman spectrum information collected from the blood of an arbitrary subject, outputs information indicating whether the arbitrary subject has pancreatic cancer.
PCT/KR2021/019622 2020-12-22 2021-12-22 Method for early diagnosis of pancreatic cancer using an artificial intelligence Raman analysis technique WO2022139465A1 (fr)

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
KR10-2020-0181368 2020-12-22
KR20200181368 2020-12-22
KR20200181367 2020-12-22
KR10-2020-0181367 2020-12-22

Publications (1)

Publication Number Publication Date
WO2022139465A1 true WO2022139465A1 (fr) 2022-06-30

Family

ID=82158483

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/KR2021/019622 WO2022139465A1 (fr) 2020-12-22 2021-12-22 Method for early diagnosis of pancreatic cancer using an artificial intelligence Raman analysis technique

Country Status (2)

Country Link
KR (1) KR20220091408A (fr)
WO (1) WO2022139465A1 (fr)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117783088B (zh) * 2024-02-23 2024-05-14 广州贝拓科学技术有限公司 Control model training method, apparatus, and device for a laser micro-Raman spectrometer

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
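KR20130100096A (ko) * 2010-08-13 2013-09-09 소마로직, 인크. Pancreatic cancer biomarkers and uses thereof
KR20170007774A (ko) * 2014-05-08 2017-01-20 유겐가이샤 마이테크 Plasmonic chip, and method for diagnosing cancer disease by fluorescence imaging and Raman spectroscopy using the same
KR20170039168A (ko) * 2014-07-02 2017-04-10 싱가포르국립대학교 Raman spectroscopy system, apparatus, and method for analyzing the type or characteristics of an abnormally growing specimen or tissue
KR101830314B1 (ko) * 2017-07-26 2018-02-20 재단법인 구미전자정보기술원 Method for providing information necessary for pancreatic cancer diagnosis using an artificial-intelligence-based Bayesian network, computer program, and computer-readable recording medium
WO2019213133A1 (fr) * 2018-04-30 2019-11-07 City Of Hope System and method for cancer detection and ablation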

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR102289278B1 (ko) 2019-07-09 2021-08-13 주식회사 베르티스 Biomarker panel for diagnosing pancreatic cancer and use thereof

Also Published As

Publication number Publication date
KR20220091408A (ko) 2022-06-30

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21911532

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 03/11/2023)