WO2024082581A1 - Method for detecting M protein - Google Patents

Method for detecting M protein

Info

Publication number
WO2024082581A1
Authority
WO
WIPO (PCT)
Prior art keywords
light chain
peak
sample
protein
tested
Prior art date
Application number
PCT/CN2023/087606
Other languages
English (en)
French (fr)
Inventor
周宏伟
曾念宜
黄均达
方臻成
陈慕璇
Original Assignee
南方医科大学珠江医院
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 南方医科大学珠江医院
Publication of WO2024082581A1

Classifications

    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N27/00 Investigating or analysing materials by the use of electric, electrochemical, or magnetic means
    • G01N27/62 Investigating or analysing materials by the use of electric, electrochemical, or magnetic means by investigating the ionisation of gases, e.g. aerosols; by investigating electric discharges, e.g. emission of cathode
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N33/00 Investigating or analysing materials by specific methods not covered by groups G01N1/00 - G01N31/00
    • G01N33/48 Biological material, e.g. blood, urine; Haemocytometers
    • G01N33/50 Chemical analysis of biological material, e.g. blood, urine; Testing involving biospecific ligand binding methods; Immunological testing
    • G01N33/53 Immunoassay; Biospecific binding assay; Materials therefor
    • G01N33/574 Immunoassay; Biospecific binding assay; Materials therefor for cancer
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N33/00 Investigating or analysing materials by specific methods not covered by groups G01N1/00 - G01N31/00
    • G01N33/48 Biological material, e.g. blood, urine; Haemocytometers
    • G01N33/50 Chemical analysis of biological material, e.g. blood, urine; Testing involving biospecific ligand binding methods; Immunological testing
    • G01N33/68 Chemical analysis of biological material, e.g. blood, urine; Testing involving biospecific ligand binding methods; Immunological testing involving proteins, peptides or amino acids
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/243 Classification techniques relating to the number of classes
    • G06F18/2431 Multiple classes
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B40/00 ICT specially adapted for biostatistics; ICT specially adapted for bioinformatics-related machine learning or data mining, e.g. knowledge discovery or pattern finding
    • G16B40/20 Supervised data analysis

Definitions

  • The present invention relates to the biomedical field, and in particular to a method for detecting M protein.
  • Monoclonal gammopathy (MG) is a disease characterized by clonal expansion of plasma cells and is classified into stages such as low tumor burden disease, precancerous lesions, and malignant tumors.
  • Low tumor burden disease does not involve large-scale proliferation of plasma cell clones, but the secreted monoclonal protein can directly cause lesions, as in immunoglobulin light chain amyloidosis. Precancerous lesions include monoclonal gammopathy of undetermined significance (MGUS) and smoldering multiple myeloma (SMM), which do not show symptoms such as damage to related organs and tissues that can be attributed to plasma cell clones or monoclonal proteins.
  • MGUS: monoclonal gammopathy of undetermined significance
  • SMM: smoldering multiple myeloma
  • M protein: monoclonal protein
  • M protein is a serum biomarker directly related to the clonal plasma cell load of MG patients. It can be used as a diagnostic marker for identifying the disease and as a quantitative marker for tracking disease progression and response to treatment. The identification, typing, and quantification of M protein support the initial diagnosis of the disease, risk stratification, and monitoring of response to treatment.
  • Normal human immunoglobulins (Ig) consist of two identical light chains and two identical heavy chains. The two identical light chains are either κ or λ, and no Ig carries both κ and λ light chains at the same time.
  • The monoclonal immunoglobulin light chain in myeloma patients inhibits the production of the other light chain, thereby unbalancing the κ/λ ratio.
  • An increase or decrease in the ratio indicates the possibility of κ or λ type multiple myeloma, respectively. If a large amount of M protein appears in the patient's serum, the serum Ig composition changes, for example the relative content of different Ig isotypes and the light chain κ/λ ratio.
  • The imbalance of the κ/λ ratio is an important indicator for distinguishing monoclonal proliferation of multiple myeloma (MM) plasma cells from other diseases.
  • Most newly diagnosed multiple myeloma patients have a high M protein concentration, but shortly after treatment, the M protein concentration changes significantly, usually decreasing by several orders of magnitude within a few months, indicating that the malignant clone plasma cells are gradually eliminated.
  • SPE: serum protein electrophoresis
  • IFE: immunofixation electrophoresis
  • sFLC: serum free light chain turbidimetry
  • Serum protein electrophoresis mainly screens for the presence of M protein, and immunofixation electrophoresis mainly types it. However, the detection sensitivity of SPE is low, and IFE is a low-throughput electrophoresis technique. Moreover, neither can detect low levels of M protein, which soon becomes undetectable when monitoring multiple myeloma disease activity after effective treatment. These methods therefore cannot be used for early detection of disease recurrence, resulting in a high rate of missed diagnosis. In addition, interpreting the results requires experienced laboratory staff, so interpretations of electrophoresis results differ between staff and between laboratories, which makes the screening methods difficult to standardize.
  • High-throughput turbidimetry is currently considered to be the most sensitive measurement method for indirect proof of the presence of M protein, but it is not applicable to all multiple myeloma cases.
  • Patients with multiple myeloma have abnormal sFLC ratios at diagnosis.
  • Serum free light chain turbidimetry (sFLC) has problems such as antigen excess, nonlinear reaction, and high cost of antibody reagents, so sFLC is still unavailable in a large number of clinical laboratories.
  • Mass spectrometry is an investigational method that uses high-resolution molecular weight detection to accurately identify and classify M proteins in serum.
  • MALDI-TOF mass spectrometry has replaced immunofixation electrophoresis to identify M proteins.
  • The mass spectrometry methods currently under development require antibodies for capture, resulting in high detection costs.
  • An object of the present invention is to provide a method for detecting M protein that solves the problems in the prior art.
  • The present invention provides a method for detecting M protein, which comprises the following steps:
  • If the light chain m/z range contains a mass spectrometry peak with a narrow base that is high and sharp, it is determined that the sample to be tested contains M protein; or, if the peak area ratio of κ light chain:λ light chain is less than 1.8 or greater than 3.5 and the peak shape is a non-Gaussian distribution, it is determined that the sample to be tested contains M protein. The light chain peak that satisfies either condition is the M protein light chain peak.
  • The present invention also provides a device for detecting M protein, the device comprising:
  • Information acquisition module: used to obtain m/z distribution data of the singly charged immunoglobulin light chain and heavy chain ions in the sample to be tested, wherein the light chain includes κ light chain and λ light chain;
  • Peak shape recognition module: used to analyze the peak area ratio of κ light chain:λ light chain and whether the peak shape is a non-Gaussian distribution;
  • Result determination module: used to output results according to the following situations:
  • If the peak area ratio of κ light chain:λ light chain is less than 1.8, and the singly charged ions show a non-Gaussian peak shape within m/z 22400-23100 Da, it is determined that the sample to be tested contains λ light chain type M protein;
  • If the peak area ratio of κ light chain:λ light chain is greater than 3.5, and the singly charged ions show a non-Gaussian peak shape within m/z 23100-24600 Da, it is determined that the sample to be tested contains κ light chain type M protein;
  • If the peak area ratio of κ light chain:λ light chain is ≥ 1.8 and ≤ 3.5, and the peak shape is a Gaussian distribution, it is determined that the sample to be tested does not contain M protein.
  • The method for detecting M protein of the present invention has the following beneficial effects:
  • Based on MALDI-TOF MS, the κ/λ light chain ratio was systematically evaluated and can be measured. Because M protein has a unique molecular weight and high abundance, a κ or λ type M protein peak in the patient's light chain region and/or an abnormal κ/λ ratio indicates the presence of M protein. Compared with SPE or IFE, MALDI-TOF MS can track low levels of M protein sensitively and specifically during the patient's treatment, and can provide more accurate detection for diagnosing disease and monitoring response to treatment. In addition, compared with the mass spectrometry methods currently under development, the present method does not need to enrich the immunoglobulins in the serum and does not use antibodies for capture.
  • The present method therefore provides a more economical solution for the detection of M protein. It is simple to operate, can be automated, and has high analytical sensitivity. It can determine M protein quickly and objectively, and largely avoids the subjective errors of manual visual analysis.
  • The present invention can analyze the various types of M protein encountered in the clinic, and can improve the screening, diagnosis, and monitoring of multiple myeloma.
  • FIG. 1 shows the results of validating the peak shape recognition tool.
  • FIG. 2 shows a schematic diagram of the detection process and principle of the present invention.
  • FIG. 3 shows the overlaid mass spectra of 60 healthy subjects used as normal controls.
  • FIG. 4 shows a comparison of the serum fingerprints of a normal control (left) and a multiple myeloma patient (right) before and after reduction.
  • FIG. 5 shows a comparison of the fingerprints (after reduction) of standard globulin (left) and the serum of a multiple myeloma patient (right).
  • FIG. 6 shows the distribution of the κ light chain:λ light chain peak area ratio of 60 healthy subjects.
  • FIG. 7 shows the peak shape recognition tool marking the region where the M protein is located with vertical lines.
  • FIG. 8 shows a comparison of the sensitivity of MALDI-TOF MS and IFE analyses.
  • FIG. 9 shows the linearity comparison between the MALDI-TOF MS and SPE methods.
  • FIG. 10 shows the process of MALDI-TOF MS analysis of the light chain of M protein.
  • The horizontal axes are all on the order of magnitude of 10^4.
  • The vertical axes are all on the order of magnitude of 10^-3.
  • FIG. 11 is a schematic diagram of the device for qualitative, quantitative and typing of M protein of the present invention.
  • FIG. 12 is a schematic diagram showing a service terminal of the present invention.
  • The present invention detects reduced immunoglobulin (Ig) by MALDI-TOF MS. Its methodological principle, namely calculating the mass spectrometry peak area ratio of the two light chain regions and establishing thresholds that separate normal samples from M protein positive samples and their types, makes it possible to quantify the κ/λ light chain ratio.
  • The M protein can be identified by this method.
  • The present invention provides a method for detecting M protein, the method comprising the following steps:
  • If the light chain m/z range contains a mass spectrometry peak with a narrow base that is high and sharp, it is determined that the sample to be tested contains M protein; or, if the peak area ratio of κ light chain:λ light chain is less than 1.8 or greater than 3.5 and the peak shape is a non-Gaussian distribution, it is determined that the sample to be tested contains M protein. The light chain peak that satisfies either condition is the M protein light chain peak;
  • If the ratio is ≥ 1.8 and ≤ 3.5, and the peak shape is a Gaussian distribution, it is determined that the sample to be tested does not contain M protein.
  • The sample to be tested is a serum or urine sample.
  • The m/z distribution data of the singly charged immunoglobulin light chain and heavy chain ions in the sample to be tested are obtained using the following steps:
  • The sample to be tested is diluted and then reduced with a reducing agent.
  • The dilution factor of the sample to be tested is 5 to 20 times, for example 5 to 15 times.
  • The sample to be tested can be diluted with one or more of water, PBS, and physiological saline.
  • The reducing agent in step I) is selected from any one or more of dithiothreitol (DTT), tris(2-carboxyethyl)phosphine (TCEP), tris(3-hydroxypropyl)phosphine (TPP or THPP), and β-mercaptoethanol.
  • DTT: dithiothreitol
  • TCEP: tris(2-carboxyethyl)phosphine
  • TPP or THPP: tris(3-hydroxypropyl)phosphine
  • The final concentration of the reducing agent is 0.02 to 0.08 mol/L.
  • Preferably, the final concentration of the reducing agent is 0.02 to 0.06 mol/L. More preferably, the final concentration of the reducing agent is 0.04 mol/L.
  • The reducing agent in step I) is a formic acid solution of dithiothreitol.
  • The reduction step is as follows: after mixing the sample to be tested with the reducing agent, incubate at 20-30 °C for 10-30 minutes; preferably at 24-27 °C for 15-25 minutes; more preferably at 25 °C for 20 minutes.
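As a worked example of the concentration ranges above, the following minimal sketch computes the final reducing agent concentration after mixing and checks it against the stated 0.02 to 0.08 mol/L range. The stock concentration and the volumes are illustrative assumptions and do not come from the text.

```python
def final_concentration(stock_molar: float, stock_volume_ul: float,
                        sample_volume_ul: float) -> float:
    """Final molar concentration of the reducing agent after mixing."""
    total_volume_ul = stock_volume_ul + sample_volume_ul
    if total_volume_ul <= 0:
        raise ValueError("total volume must be positive")
    return stock_molar * stock_volume_ul / total_volume_ul


def in_stated_range(conc_molar: float, low: float = 0.02, high: float = 0.08) -> bool:
    """True if the concentration lies in the range given in the text (mol/L)."""
    return low <= conc_molar <= high


# Hypothetical mix: 10 uL of a 0.4 mol/L DTT stock added to 90 uL of diluted serum.
conc = final_concentration(0.4, 10.0, 90.0)
print(f"final concentration: {conc:.3f} mol/L, in range: {in_stated_range(conc)}")
```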
  • The matrix solution in step II) is selected from a sinapinic acid matrix solution, a 2,5-dihydroxybenzoic acid matrix solution, or an α-cyano-4-hydroxycinnamic acid matrix solution.
  • The solvent of the sinapinic acid matrix solution is acetonitrile plus an aqueous solution containing trifluoroacetic acid.
  • The volume ratio of acetonitrile to the aqueous solution containing trifluoroacetic acid is 1:1.
  • The final concentration of the matrix solution is 1 to 5 mg/mL.
  • The mass spectrometry conditions during the analysis and measurement in step II) are: source voltage 20 kV, detector voltage 0.48 kV, laser energy 4.8 μJ, laser frequency 3000 Hz, focusing mass 20 kDa, scanning speed 1 mm/s, and collection mass range 5 kDa to 200 kDa.
  • The light chain m/z range in step 2) refers to a singly charged light chain m/z of 22400-24600 Da.
  • The light chain m/z range, or any other m/z range disclosed in the present invention such as 22400-24600 Da, can be adjusted according to the deviation range of different instruments.
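The instrument-dependent adjustment described above can be sketched as a simple shift of the m/z range. This is only an illustration: the text does not say how the deviation is measured or applied, so the constant offset `offset_da` is a hypothetical calibration value.

```python
from typing import Tuple

# Singly charged light chain m/z range stated in the text (Da).
LIGHT_CHAIN_RANGE: Tuple[float, float] = (22400.0, 24600.0)


def adjust_range(mz_range: Tuple[float, float], offset_da: float) -> Tuple[float, float]:
    """Shift an m/z range by a measured instrument offset (assumed constant)."""
    low, high = mz_range
    return (low + offset_da, high + offset_da)


def in_range(mz: float, mz_range: Tuple[float, float]) -> bool:
    """True if an m/z value falls inside the (closed) range."""
    low, high = mz_range
    return low <= mz <= high


# Hypothetical instrument that reads 15 Da high.
adjusted = adjust_range(LIGHT_CHAIN_RANGE, 15.0)
print(adjusted, in_range(22405.0, adjusted))
```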
  • The fingerprint of the sample to be tested can be superimposed on and compared with the fingerprint of a healthy person to check whether, in the light chain m/z range, it contains a mass spectrum peak with a narrower base that is higher and sharper than the peak of a healthy person. If so, it is determined that the sample to be tested contains M protein; if not, it is determined that the sample does not contain M protein. Specifically, if any one or more of the three conditions (narrower base, higher peak, sharper peak) is not met, it is determined that the sample to be tested does not contain M protein.
  • The spectrum of healthy people shows a Gaussian distribution in the light chain m/z range.
  • The mass spectrum of M protein has a narrow base and a high, sharp peak, similar to a church spire, and does not show a Gaussian distribution.
  • If the peak area ratio of κ light chain:λ light chain is less than 1.8 or greater than 3.5, and the peak shape is a non-Gaussian distribution, it is determined that the sample to be tested contains M protein. If the peak area ratio of κ light chain:λ light chain is between 1.8 and 3.5 (inclusive), and the peak shape is a Gaussian distribution, it is determined that the sample to be tested does not contain M protein. In the absence of M protein, one Gaussian peak generally appears in each of the κ light chain and λ light chain regions.
  • The peak area ratio of κ light chain:λ light chain is calculated by integrating the peak areas with the workstation software supplied with the mass spectrometer. Whether the peak shape is a Gaussian distribution is determined with a peak shape recognition tool.
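The text relies on the instrument workstation for peak integration. As a rough stand-in for illustration only, the sketch below integrates a spectrum with the trapezoidal rule over the two light chain windows and returns the κ:λ area ratio. Assigning λ to 22400-23100 Da and κ to 23100-24600 Da follows the decision rules in the text; the absence of baseline correction and the function names are assumptions.

```python
from typing import Sequence, Tuple

LAMBDA_WINDOW = (22400.0, 23100.0)  # singly charged lambda light chain region (Da)
KAPPA_WINDOW = (23100.0, 24600.0)   # singly charged kappa light chain region (Da)


def window_area(mz: Sequence[float], intensity: Sequence[float],
                window: Tuple[float, float]) -> float:
    """Trapezoidal area over segments lying fully inside the window."""
    low, high = window
    area = 0.0
    for i in range(len(mz) - 1):
        if mz[i] >= low and mz[i + 1] <= high:
            area += 0.5 * (intensity[i] + intensity[i + 1]) * (mz[i + 1] - mz[i])
    return area


def kappa_lambda_ratio(mz: Sequence[float], intensity: Sequence[float]) -> float:
    """Peak area ratio kappa:lambda; no baseline subtraction is applied here."""
    lambda_area = window_area(mz, intensity, LAMBDA_WINDOW)
    if lambda_area == 0.0:
        raise ValueError("lambda window has zero area")
    return window_area(mz, intensity, KAPPA_WINDOW) / lambda_area


# Synthetic flat spectrum sampled every 100 Da, only to exercise the function.
mz_axis = [22400.0 + 100.0 * i for i in range(23)]
flat = [1.0] * len(mz_axis)
print(kappa_lambda_ratio(mz_axis, flat))
```

A flat spectrum is used only because its areas are easy to check by hand (700 and 1500 intensity units times Da); real spectra would need smoothing and baseline handling, which the text leaves to the workstation.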
  • The peak shape recognition tool is obtained by applying a random forest (RF) algorithm to a set of previously measured mass spectrometry samples used as the training sample set.
  • The random forest algorithm is implemented using MATLAB's TreeBagger function.
  • The random forest algorithm integrates multiple decision trees through ensemble learning. Its basic unit is the decision tree, and it belongs to a major branch of machine learning, namely ensemble learning methods.
  • The random forest algorithm is a mature technology in the prior art.
  • Random forest is a commonly used supervised machine learning classification algorithm.
  • A machine learning algorithm can be viewed as a complex function.
  • When the feature values of a sample are input, the function outputs n values that sum to 1. These n values can be viewed as the scores of the input sample in each category. In general, the category with the highest score is taken as the function's judgment on the input sample.
  • Supervised learning means that a large number of samples and their classification information (also called sample labels) are input into the classifier, and the algorithm then adjusts the classifier's function (or decision process) based on the sample labels. The goal of the adjustment is to make the scores output by the function as consistent as possible with the sample labels.
  • This iterative process is called training. After training is completed, samples that did not participate in training are input into the classifier, its judgments are recorded, and its performance is evaluated using different indicators. This process is called testing. A random forest consists of multiple decision trees, each of which is a subclassifier. The random forest lets these decision trees vote on a sample and calculates the sample's score in each category. For each decision tree, the random forest randomly selects a subset of the training data and a subset of the features used to describe the samples, and trains the tree on them. Each decision tree is a binary tree in which each node represents a feature, and the value of the feature determines which branch is followed.
  • The last layer of nodes in the binary tree represents the decision tree's judgment on the sample category.
  • During training, each decision tree selects the optimal features based on the sample labels and generates a binary tree.
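The voting step described above can be illustrated with a minimal sketch in which each tree casts one vote and the per-class scores are the vote fractions, so they sum to 1. This is not the MATLAB TreeBagger implementation named in the text; training (bootstrap sampling and feature subsetting) is omitted, and the three hand-written stumps, their thresholds, and the two-feature input are hypothetical.

```python
from collections import Counter
from typing import Callable, Dict, List, Sequence

Tree = Callable[[Sequence[float]], str]
CLASSES = ("normal", "lambda", "kappa")


def forest_scores(trees: List[Tree], features: Sequence[float],
                  classes: Sequence[str] = CLASSES) -> Dict[str, float]:
    """Fraction of trees voting for each class; the fractions sum to 1."""
    if not trees:
        raise ValueError("the forest needs at least one tree")
    votes = Counter(tree(features) for tree in trees)
    return {c: votes.get(c, 0) / len(trees) for c in classes}


# Hypothetical decision stumps standing in for trained trees.
def stump_a(x: Sequence[float]) -> str:
    return "lambda" if x[0] < 50.0 else "normal"


def stump_b(x: Sequence[float]) -> str:
    return "kappa" if x[1] < 50.0 else "normal"


def stump_c(x: Sequence[float]) -> str:
    return "lambda" if x[0] < 80.0 else "normal"


scores = forest_scores([stump_a, stump_b, stump_c], (40.0, 120.0))
print(scores)
```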
  • A random forest classifier is used to identify the situation in the light chain region.
  • The first case is that the λ peak with m/z values in the interval [22400, 23100] is abnormal, while the κ peak with m/z values in the interval [23100, 24600] is normal;
  • The second case is that the λ peak with m/z values in the interval [22400, 23100] is normal, while the κ peak with m/z values in the interval [23100, 24600] is abnormal;
  • The third case is that both the λ peak with m/z values in the interval [22400, 23100] and the κ peak with m/z values in the interval [23100, 24600] are normal.
  • The abnormal peaks are all the maximum values in the corresponding interval. The recognition problem can therefore be simplified to a 3-class classification problem: first find the maximum point in each interval of the peak graph, and then classify the sample according to the peak characteristics at these two maximum points.
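The first stage of this simplification, locating the maximum point in each interval, might look like the sketch below. The intervals are the closed ones given in the text, so a point at exactly 23100 Da belongs to both; the function name and the toy spectrum are assumptions.

```python
from typing import Sequence, Tuple

LAMBDA_INTERVAL = (22400.0, 23100.0)
KAPPA_INTERVAL = (23100.0, 24600.0)


def max_peak_in_interval(mz: Sequence[float], intensity: Sequence[float],
                         interval: Tuple[float, float]) -> Tuple[float, float]:
    """Return (m/z, intensity) of the most intense point inside the interval."""
    low, high = interval
    best = None
    for m, y in zip(mz, intensity):
        if low <= m <= high and (best is None or y > best[1]):
            best = (m, y)
    if best is None:
        raise ValueError("no data points inside the interval")
    return best


# Toy spectrum for illustration only.
mz_axis = [22500.0, 22800.0, 23100.0, 23500.0, 24000.0]
signal = [1.0, 5.0, 2.0, 9.0, 3.0]
print(max_peak_in_interval(mz_axis, signal, LAMBDA_INTERVAL))
print(max_peak_in_interval(mz_axis, signal, KAPPA_INTERVAL))
```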
  • The process of developing the peak shape recognition tool is as follows:
  • The feature values extracted from each mass spectrum sample are σ_L, C_L, σ_R, C_R, σ_L/σ_R, and C_L/C_R.
  • For the singly charged ions of each sample, the peak with the largest intensity in the m/z interval [23100, 24600] and the peak with the largest intensity in the interval [22400, 23100] are determined, and a Gaussian distribution is fitted to each.
  • σ describes the peak width of the normal distribution;
  • C describes the degree of deformation of the peak in the vertical direction.
  • The goal of the fitting is to find the σ that minimizes the fitting objective [...].
  • The σ that attains this minimum and its corresponding C value are used as the characteristics of the maximum peak in the interval.
  • The mass spectrum data of each sample can thus be represented by the feature vector (σ_L, C_L, σ_R, C_R, σ_L/σ_R, C_L/C_R), that is, by these six feature values.
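A minimal sketch of the fitting and feature construction follows. Because the fitting objective is not reproduced in the text, this example assumes a least-squares fit of the model C·N(x; μ, σ), with μ fixed at the peak apex, σ chosen from a candidate grid, and C solved in closed form for each σ. Reading the subscripts L and R as the left (lower m/z) and right (higher m/z) peaks is also an assumption.

```python
import math
from typing import Sequence, Tuple


def gaussian_pdf(x: float, mu: float, sigma: float) -> float:
    """Normal density with mean mu and standard deviation sigma."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))


def fit_sigma_c(mz: Sequence[float], intensity: Sequence[float], mu: float,
                sigmas: Sequence[float]) -> Tuple[float, float]:
    """Grid-search sigma; for each sigma the least-squares C has a closed form."""
    best = None
    for sigma in sigmas:
        if sigma <= 0:
            continue
        g = [gaussian_pdf(x, mu, sigma) for x in mz]
        denom = sum(v * v for v in g)
        if denom == 0.0:
            continue
        c = sum(y * v for y, v in zip(intensity, g)) / denom
        err = sum((y - c * v) ** 2 for y, v in zip(intensity, g))
        if best is None or err < best[0]:
            best = (err, sigma, c)
    if best is None:
        raise ValueError("no usable sigma candidate")
    return best[1], best[2]


def six_features(sigma_l: float, c_l: float, sigma_r: float, c_r: float) -> tuple:
    """Feature vector (sigma_L, C_L, sigma_R, C_R, sigma_L/sigma_R, C_L/C_R)."""
    return (sigma_l, c_l, sigma_r, c_r, sigma_l / sigma_r, c_l / c_r)


# Synthetic peak generated from the assumed model, so the fit should recover it.
axis = [22600.0 + 10.0 * i for i in range(41)]
peak = [1000.0 * gaussian_pdf(x, 22800.0, 50.0) for x in axis]
print(fit_sigma_c(axis, peak, 22800.0, [10.0 * k for k in range(1, 11)]))
```

A grid search keeps the sketch dependency-free; a continuous optimizer could replace it without changing the features that are produced.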
  • S_normal represents the predicted value that the light chain region peaks follow a Gaussian distribution;
  • S_λ represents the predicted value of a non-Gaussian peak in the λ light chain region;
  • S_κ represents the predicted value of a non-Gaussian peak in the κ light chain region.
  • S_normal + S_λ + S_κ = 1, and the largest of S_normal, S_λ, and S_κ determines the final conclusion. For example, if S_normal is the largest, the light chain region peaks follow a Gaussian distribution.
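The final step, taking the largest of the three scores, can be sketched as below. The function name, the tolerance on the sum, and the tie-breaking order (normal, then lambda, then kappa) are assumptions that the text does not specify.

```python
def final_call(s_normal: float, s_lambda: float, s_kappa: float,
               tol: float = 1e-6) -> str:
    """Return the class with the largest score after checking the scores sum to 1."""
    scores = {"normal": s_normal, "lambda": s_lambda, "kappa": s_kappa}
    if abs(sum(scores.values()) - 1.0) > tol:
        raise ValueError("scores are expected to sum to 1")
    # max() keeps the first key on ties, i.e. normal, then lambda, then kappa.
    return max(scores, key=lambda name: scores[name])


print(final_call(0.1, 0.7, 0.2))
```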
  • A total of 1929 previously measured mass spectrometry samples are collected as the training sample set.
  • Each peak graph is manually annotated into one of three categories: negative samples (marked as normal), 924 samples in total; lambda peak abnormalities (marked as lambda), 433 samples in total; and kappa peak abnormalities (marked as kappa), 572 samples in total.
  • Feature extraction is performed on each peak to obtain a random forest model that serves as the peak shape recognition tool.
  • Each row represents all samples of one annotated category.
  • The first row shows that a fraction of 0.9885 (98.85%) of the 433 lambda-type M protein samples were classified as lambda-type M protein, that is, 428 were predicted correctly and only 5 were misclassified as negative or kappa type. Similarly, the second row shows that 0.9957 (99.57%) of the 924 negative samples were classified as negative (correct) and only 4 were wrong. The third row shows that 0.9895 (98.95%) of the 572 kappa-type M protein samples were classified as kappa-type M protein, and only 7 were classified as negative (wrong). The 10-fold cross-validation shows that the accuracy of the model established with the random forest algorithm is ideal.
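The per-row figures above are fractions of correctly classified samples within each annotated category. The short sketch below reproduces that arithmetic for the lambda and negative rows from the counts given in the text (428 of 433, and 924 minus 4 of 924). The kappa row is left out because its correct count is not stated explicitly; the helper name is an assumption.

```python
def row_accuracy(correct: int, total: int) -> float:
    """Fraction of one category's samples that were classified correctly."""
    if total <= 0:
        raise ValueError("total must be positive")
    return correct / total


rows = {
    "lambda": (428, 433),      # 5 misclassified
    "normal": (924 - 4, 924),  # 4 misclassified
}

for name, (correct, total) in rows.items():
    fraction = row_accuracy(correct, total)
    print(f"{name}: {fraction:.4f} ({fraction * 100:.2f}%)")
```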
  • The specific determination method in step 2) is as follows:
  • If κ/λ < 1.8 and the λ light chain region contains an abnormal M protein light chain peak, the sample is determined to be λ light chain type M protein positive;
  • κ represents the peak area of the M protein κ light chain;
  • λ represents the peak area of the M protein λ light chain.
  • The κ light chain region refers to the region where the m/z of the singly charged ions is 23100-24600 Da; the λ light chain region refers to the region where the m/z of the singly charged ions is 22400-23100 Da.
  • The abnormal M protein peak refers to an M protein peak with a non-Gaussian distribution.
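The determination rules can be summarized in a small function. This is a sketch under stated assumptions: the negative branch reads the stated thresholds as the closed interval 1.8 to 3.5, the boolean flags stand for the output of the peak shape recognition tool, and the "indeterminate" result for combinations the text does not cover (for example an abnormal ratio with Gaussian peaks) is an addition made for illustration.

```python
def classify_sample(kappa_lambda_ratio: float,
                    lambda_peak_non_gaussian: bool,
                    kappa_peak_non_gaussian: bool) -> str:
    """Apply the ratio thresholds (1.8 and 3.5) together with the peak shape flags."""
    if kappa_lambda_ratio < 1.8 and lambda_peak_non_gaussian:
        return "lambda light chain type M protein positive"
    if kappa_lambda_ratio > 3.5 and kappa_peak_non_gaussian:
        return "kappa light chain type M protein positive"
    if (1.8 <= kappa_lambda_ratio <= 3.5
            and not lambda_peak_non_gaussian
            and not kappa_peak_non_gaussian):
        return "M protein negative"
    # Not covered by the stated rules; hypothetical fallback for manual review.
    return "indeterminate"


print(classify_sample(1.2, True, False))
print(classify_sample(2.2, False, False))
```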
  • The present invention also provides a device for detecting M protein, the device comprising:
  • Information acquisition module 101: used to obtain m/z distribution data of the singly charged immunoglobulin light chain ions in the sample to be tested, wherein the light chain includes κ light chain and λ light chain;
  • Peak shape recognition module 102: used to analyze the peak area ratio of κ light chain:λ light chain and whether the peak shape is a non-Gaussian distribution;
  • Result determination module 103: used to output the result according to the following conditions:
  • If the peak area ratio of κ light chain:λ light chain is less than 1.8, and the singly charged ions show a non-Gaussian peak shape within m/z 22400-23100 Da, it is determined that the sample to be tested contains λ light chain type M protein;
  • If the peak area ratio of κ light chain:λ light chain is greater than 3.5, and the singly charged ions show a non-Gaussian peak shape within m/z 23100-24600 Da, it is determined that the sample to be tested contains κ light chain type M protein;
  • If the peak area ratio of κ light chain:λ light chain is ≥ 1.8 and ≤ 3.5, and the peak shape is a Gaussian distribution, it is determined that the sample to be tested does not contain M protein.
  • The peak shape recognition module 102 includes:
  • Training data set generation submodule: used to obtain a data set of mass spectrometry samples that have been measured and manually annotated, and to assign corresponding values to the negative samples, λ peak abnormal samples, and κ peak abnormal samples in the data set to obtain a training data set;
  • Feature extraction submodule: used to extract features from the peak graph of each mass spectrometry sample in the training data set; the extracted feature values are σ_L, C_L, σ_R, C_R, σ_L/σ_R, and C_L/C_R;
  • Model generation submodule: used to apply the random forest algorithm to the feature values extracted by the feature extraction submodule and the corresponding value of each sample, to obtain a random forest model that can analyze the peak area ratio of κ light chain:λ light chain and whether the peak shape is a non-Gaussian distribution.
  • The information source of the information acquisition module of the device is consistent with the description in the M protein detection method, as are the rules and methods of the result determination module, so they are not repeated here.
  • The division of the above device into modules is only a division of logical functions. In actual implementation, the modules can be fully or partially integrated into one physical entity, or they can be physically separated.
  • These modules can all be implemented as software called by processing elements, or all as hardware; alternatively, some modules can be implemented as software called by processing elements and others as hardware.
  • For example, the information acquisition module can be a separately established processing element, or it can be integrated in a chip.
  • It can also be stored in memory in the form of program code and called by a processing element to execute the functions of the above information acquisition module.
  • The implementation of the other modules is similar.
  • Each step of the above method, or each of the above modules, can be completed by an integrated logic circuit of hardware in the processor element or by instructions in the form of software.
  • The above modules may be one or more integrated circuits configured to implement the above methods, such as one or more application specific integrated circuits (ASIC), one or more digital signal processors (DSP), or one or more field programmable gate arrays (FPGA).
  • ASIC: application specific integrated circuit
  • DSP: digital signal processor
  • FPGA: field programmable gate array
  • The processing element may be a general-purpose processor, such as a central processing unit (CPU) or another processor that can call program code.
  • CPU: central processing unit
  • These modules may be integrated together and implemented in the form of a system-on-a-chip (SOC).
  • SOC: system-on-a-chip
  • The present invention also provides a computer-readable storage medium on which a computer program is stored; when the program is executed by a processor, the steps of the above method are implemented.
  • The computer-readable storage medium stores a computer program that, when run, implements the M protein detection method.
  • The computer-readable storage medium may include, but is not limited to, a floppy disk, an optical disk, a CD-ROM (compact disc read-only memory), a magneto-optical disk, a ROM (read-only memory), a RAM (random access memory), an EPROM (erasable programmable read-only memory), an EEPROM (electrically erasable programmable read-only memory), a magnetic card or an optical card, a flash memory, or other types of media or machine-readable media suitable for storing machine-executable instructions.
  • The computer-readable storage medium may be a product that is not connected to a computer device, or a component that is connected to a computer device for use.
  • The computer program is a routine, program, object, component, data structure, or the like that performs specific tasks or implements specific abstract data types.
  • The present invention also provides a computer processing device comprising a processor and the computer-readable storage medium, wherein the processor executes the computer program on the computer-readable storage medium to implement the steps of the aforementioned method.
  • the present invention also provides a service terminal, comprising:
  • Communicator 201 used for communicating with the outside
  • Memory 202 storing computer programs
  • the processor 203 is used to run the computer program to implement the M protein detection method.
  • the service terminal can communicate with a user terminal having network communication capability through its communicator 201, thereby providing M protein detection service.
  • the memory 202 in the embodiment of FIG. 12 may include, but is not limited to, high-speed random access memory and non-volatile memory, such as one or more disk storage devices, flash memory devices, or other non-volatile solid-state storage devices;
  • the processor 203 may include but is not limited to a central processing unit (CPU), a network processor (NP), etc.; it may also be a digital signal processor (DSP), an application specific integrated circuit (ASIC), a field programmable gate array (FPGA), or other programmable logic devices, discrete gate or transistor logic devices, or discrete hardware components.
  • the communicator 201 in the embodiment of FIG. 12 may be a wired or wireless network communication circuit module.
  • Serum samples from patients diagnosed with multiple myeloma were included in the MALDI-TOF MS analysis after testing by serum protein electrophoresis (SPEP), serum immunofixation electrophoresis (IFE), total protein, and different globulin isotypes (the methods and instruments used are described below);
  • SPEP includes the following steps: first, the V8 instrument is switched on and runs a self-test. After the self-test, the serum samples are loaded and the instrument automatically tests all loaded samples. Proteins in the serum carry a positive charge in pH regions below their isoelectric point (pI) and migrate toward the negative electrode. Because different serum proteins have different isoelectric points, they are focused within the pH gradient and thereby separated. After the run, the results are transferred to the Platinum4v software.
  • IFE includes the following steps: the first step is high-resolution gel electrophoresis. The sample is diluted with physiological saline at 1:3 (50 μL serum + 100 μL physiological saline) and at 1:5 (50 μL serum + 100 μL physiological saline); the sample plate is placed on the left side of the electrophoresis tank, 20 μL of the 1:3 dilution is pipetted into the SP position, and 20 ml of the 1:5 dilution is added to the other 5 positions.
  • the proteins in the serum separate under electrophoresis and electroosmosis because they carry different charges; the second step is immunoprecipitation: 40 μL of the corresponding antiserum is added, and soluble antigen forms an antigen-antibody complex with the antibody, producing an insoluble precipitate.
  • unreacted protein is removed by washing, the antigen-antibody complex is stained, and the immunofixation precipitation band appears on the protein pattern.
  • the total protein concentration was determined by the biuret method using a fully automatic biochemical analyzer system (BS2000) from Mindray; the concentrations of IgG, IgA, and IgM were determined by the scattering immunoassay using a fully automatic protein analysis biochemical analyzer system (BNProSpec) from Siemens (SIEMENS).
  • dithiothreitol (DTT), trifluoroacetic acid, and acetonitrile were purchased from Sigma-Aldrich
  • IgG, IgA, and IgM immunoglobulin standards were purchased from Sigma-Aldrich. Each immunoglobulin standard was purified from a multiple myeloma patient;
  • the mass spectrometry conditions for the analysis and measurement were: source voltage 20 kV, detector voltage 0.48 kV, laser energy 4.8 μJ, laser frequency 3000 Hz, focus mass 20 kDa, scanning speed 1 mm/s, and acquisition mass range 5 kDa to 200 kDa.
  • the LC components were ionized into singly charged ions, and the m/z distribution of the singly charged LC ions was measured.
  • the QuanTOF's built-in viewer software was used to visually inspect the serum fingerprints.
  • the fingerprints from different patient samples are compared with the fingerprints of normal controls for visual detection of monoclonal immunoglobulins.
  • the criterion for defining a positive result is the identification of sharp mass spectral peaks resembling church spires within the expected light chain m/z range, which are distinguishable from the Gaussian distributed polyclonal background presented in normal samples. That is, a mass spectral peak with a narrower base, higher peaks, and sharper peaks in the light chain m/z range of the fingerprint of the sample to be tested compared with the peak shape of a healthy person is considered a positive result.
  • the mass spectral peak with a +1 charge state is supportive evidence for the presence of M protein.
  • this patent is based on time-of-flight mass spectrometry technology and peak shape recognition tools for the identification of M protein.
  • the workstation supplied with the time-of-flight mass spectrometer identifies the light-chain type from the m/z data, integrates the peak areas of the immunoglobulin λ and κ light chains, and automatically calculates the peak area ratio of the λ and κ light chains; the peak shape recognition tool determines whether the λ or κ light-chain region contains an abnormal M protein peak according to whether Gaussian-distributed peaks are present, thereby distinguishing normal from abnormal samples.
  • Abnormal M protein peaks refer to M protein peaks with non-Gaussian distribution.
  • κ represents the peak area of the M protein κ light chain
  • λ represents the peak area of the M protein λ light chain
  • the κ light chain region refers to the region where the m/z of the singly charged ions is 23100 to 24600 Da
  • the λ light chain region refers to the region where the m/z of the singly charged ions is 22400 to 23100 Da.
  • Figure 3 shows the mass spectra of 60 healthy subjects as normal controls, showing the region of total light chain TLC ions (m/z 22,400Da to 24,600Da).
  • the mass spectrometry results of the 60 healthy subjects were consistent, showing a Gaussian distribution, with a large number of polyclonal κ-type light chains and λ-type light chains, with a peak height ratio of about 2:1.
  • Figure 4 shows the comparison of serum fingerprints of normal controls and multiple myeloma (IgGλ type) patients before and after reduction.
  • (A) Overlaying the spectra clearly shows the serum samples of healthy people before (blue) and after (green) reduction;
  • (B) The serum fingerprints of IgGλ type multiple myeloma patients before and after reduction. Because a large amount of λ-type light chain is produced, the κ-type light chain is suppressed, and the mass spectrum of the M protein shows a non-Gaussian distribution. Comparing the reduced serum fingerprints of healthy people and MM patients clearly shows that the results for healthy people and for IgGλ-type M protein differ considerably.
  • Figure 5 shows the comparison of serum fingerprints of standard globulin and different types of serum from multiple myeloma patients
  • the left picture shows that different types of standard globulins (IgG, IgA, IgM) are reduced and tested separately, and mass spectra are generated and superimposed for analysis.
  • Polyclonal LC single-charged ions are labeled with isotype and charge respectively.
  • the mass spectra of different Ig isotypes are marked with different colors.
  • the enlarged mass spectrum in the figure is concentrated in the area of LC single-charged ions.
  • the right picture shows the comparison of the serum fingerprints of patients containing IgGλ, IgAκ, and IgMκ M proteins with the serum fingerprint of healthy people (after reduction).
  • the LC mass spectrum peaks of multiple myeloma patients with different Ig isotypes are narrow-based, tall, and sharp, and can be clearly distinguished from the Gaussian peak shape of healthy people (black); in the overlay of the enlarged LC singly charged ion region (m/z: 22400-24600), the patients' spectra show a distinct peak of relatively high intensity against the polyclonal background of healthy people.
  • the present invention defines the m/z ranges of the LC (κ, λ) singly charged ions as κ-TLC (23100-24600 Da, [M+H]+) and λ-TLC (22400-23100 Da, [M+H]+).
  • the present invention calculates the light chain of M protein.
  • Figure 6 shows the peak areas of the λ light chain and κ light chain of 60 healthy people; the light chain κ/λ ratio is calculated by the mass spectrometry workstation. As shown in the figure, the κ/λ ratio of the 60 healthy people is concentrated between 1.8 and 3.5.
  • the peak shape recognition tool identifies whether there is an M protein light chain peak in the light chain m/z range based on whether the peak shape is Gaussian distribution. If the peak shape is non-Gaussian distribution, as shown in Figure 7, the MATLAB software is used to highlight the area where the monoclonal protein component is located in a vertical tangent manner, and the mass spectrometry workstation is used to calculate the proportion of the highlighted area (i.e., M protein) in the TLC.
  • IFE is considered to be the most sensitive method for detecting M protein
  • this example compares MALDI-TOF MS with IFE.
  • the specific operation is as follows: the sera of different M protein-positive multiple myeloma patients are mixed with the sera of normal people, and continuously diluted at the ratios of 0 times, 1:2, 1:10, 1:20, 1:100 and 1:200. All diluted samples are divided into two equal parts and analyzed by IFE and MALDI-TOF MS, and the analysis method is the same as that in Example 1.
  • MALDI-TOF MS can still detect κ-type M protein in M protein serum samples diluted 1:100 (A in Figure 8); with IFE, M protein can no longer be detected when the serum is diluted 1:100 (B in Figure 8); the analytical sensitivity for different M proteins determined by IFE and MALDI at different dilutions is compared in C in Figure 8.
  • the specific operation is as follows: 7 patient serum samples with known M protein concentrations (4 IgG, 2 IgA and 1 IgM; range 0.5-8 g/dL) were diluted with normal human serum at the following ratios: 0x, 1:3, 1:15, 1:75 and 1:375, and quantified by SPEP and MALDI (5 samples per patient; 35 samples in total; the method was the same as in Example 1).
  • For MALDI-TOF MS, the peak area was calculated using the workstation, and the M protein peak was gated and quantified using the peak shape recognition tool. As shown in Figure 9, the M protein concentrations measured by SPEP and MALDI-TOF MS agreed closely with the expected concentrations (R² > 0.98).
  • the present invention is based on MALDI-TOF MS (QuanTOF, Rongzhi Biotechnology (Qingdao) Co., Ltd.) to quickly identify M protein.
  • QuanTOF mass spectrometer (Rongzhi Biotechnology (Qingdao) Co., Ltd.) provides a wider mass range, higher sensitivity and better reproducibility.
  • QuanTOF has high sensitivity and accuracy in identifying and monitoring patient serum M protein.
  • the new automated system based on time-of-flight mass spectrometry of the present invention is mainly achieved by reducing immunoglobulins to break the disulfide bond between the heavy chain (HC) and the light chain (LC) of Ig, and can directly analyze the specific changes of the light chain of M protein, with higher sensitivity and stronger specificity.
  • this method only needs one test for qualitative and typing, the sample pretreatment process is simple and fast, the consumption of reagents and consumables is small, the detection throughput is high, and the sample pretreatment time is greatly shortened compared with the existing electrophoresis method.
  • the test results are not easily affected by laboratory conditions, and it is easier to achieve standard unification.
  • the present invention effectively solves the problems of low accuracy, low detection throughput, and poor specificity and sensitivity in existing methods for screening M protein. The method is expected to be applied to large-scale clinical screening of M protein, improving the screening, diagnosis, and monitoring of MM.

Landscapes

  • Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Chemical & Material Sciences (AREA)
  • Immunology (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Molecular Biology (AREA)
  • Biomedical Technology (AREA)
  • General Health & Medical Sciences (AREA)
  • Urology & Nephrology (AREA)
  • Hematology (AREA)
  • Artificial Intelligence (AREA)
  • Pathology (AREA)
  • Biotechnology (AREA)
  • Evolutionary Computation (AREA)
  • Evolutionary Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Analytical Chemistry (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Biochemistry (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Food Science & Technology (AREA)
  • Cell Biology (AREA)
  • Medical Informatics (AREA)
  • Medicinal Chemistry (AREA)
  • General Engineering & Computer Science (AREA)
  • Microbiology (AREA)
  • Chemical Kinetics & Catalysis (AREA)
  • Oncology (AREA)
  • Hospice & Palliative Care (AREA)
  • Electrochemistry (AREA)
  • Bioethics (AREA)
  • Biophysics (AREA)
  • Databases & Information Systems (AREA)
  • Epidemiology (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Public Health (AREA)

Abstract

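The present invention relates to the field of biomedicine, and in particular to a method for detecting M protein. The method comprises: 1) providing m/z distribution data for singly charged immunoglobulin light-chain ions in a sample to be tested, the light chains including λ light chains and κ light chains; 2) result determination: if a mass spectral peak that is narrow-based, tall, and sharp is present within the light-chain m/z range, the sample is determined to contain M protein; or, if the κ light chain : λ light chain peak area ratio is less than 1.8 or greater than 3.5 and the peak shape is non-Gaussian, the sample is determined to contain M protein.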

Description

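A Method for Detecting M Protein
Technical Field
The present invention relates to the field of biomedicine, and in particular to a method for detecting M protein.
Background
Monoclonal gammopathy (MG) is a group of disorders characterized by clonal expansion of plasma cells, classified into stages such as low-tumor-burden disease, premalignant conditions, and malignancy. Low-tumor-burden disease lacks extensive clonal plasma cell proliferation, but the secreted monoclonal protein directly causes pathology, as in immunoglobulin light-chain amyloidosis. Premalignant conditions include monoclonal gammopathy of undetermined significance (MGUS) and smoldering multiple myeloma (SMM), which show no symptoms such as organ or tissue damage attributable to the plasma cell clone or the monoclonal protein. Multiple myeloma (MM) is one of the most common malignancies among the MGs; it is a malignant hematological tumor that mainly affects middle-aged and elderly people and is characterized by malignant proliferation of monoclonal plasma cells that secrete large amounts of monoclonal protein (M protein). M protein is a serum biomarker directly related to the clonal plasma cell burden of MG patients; it can serve as a diagnostic marker for identifying the disease and as a quantitative marker for tracking disease progression and response to treatment. Identifying, typing, and quantifying M protein supports initial diagnosis, risk stratification, and monitoring of treatment response.
Normal human immunoglobulin (Ig) consists of two identical light chains and two identical heavy chains. The two identical light chains are either κ or λ; no Ig carries both κ and λ light chains. In myeloma patients, the monoclonal immunoglobulin light chain suppresses production of the other light chain, unbalancing the κ/λ ratio; an increased or decreased ratio suggests possible κ-type or λ-type multiple myeloma, respectively. When large amounts of M protein appear in a patient's serum, the serum Ig composition changes, for example the relative content of different Ig isotypes and the κ/λ light-chain ratio. An imbalanced κ/λ ratio is an important indicator for distinguishing the monoclonal plasma cell proliferation of MM from other diseases. Most newly diagnosed multiple myeloma patients have very high M protein concentrations, but shortly after treatment the concentration changes markedly, usually falling by several orders of magnitude within a few months, which suggests that the malignant clonal plasma cells are gradually being eliminated.
Currently, serum protein electrophoresis (PEL), immunofixation electrophoresis (IFE), and the serum free light chain nephelometric assay (sFLC) can be used to detect, monitor, and quantify M protein. However, serum protein electrophoresis mainly screens for the presence of M protein, and immunofixation electrophoresis mainly types it; PEL has low detection sensitivity, IFE is a low-throughput electrophoresis technique, and neither can detect low levels of M protein. When multiple myeloma disease activity is monitored after effective treatment, M protein quickly becomes undetectable, so these methods cannot be used for early detection of relapse, which leads to a high rate of missed diagnoses. In addition, interpreting the results requires experienced laboratory staff, so interpretation varies between individuals; the criteria for judging electrophoresis results are applied inconsistently across laboratories and staff, which makes standardizing the screening method difficult. The high-throughput nephelometric assay is currently regarded as the most analytically sensitive measurement that indirectly demonstrates the presence of M protein, but not all multiple myeloma patients have an abnormal sFLC ratio at diagnosis. Moreover, the sFLC assay suffers from antigen excess, non-linear reactions, and high antibody reagent costs, so it still cannot be run in many clinical laboratories.
As the response of multiple myeloma patients to new chemotherapies and immunotherapies has improved markedly, most patients can now achieve durable remission, and the traditional electrophoresis methods routinely used for M protein diagnosis face new challenges. A highly sensitive method capable of detecting low levels of M protein is therefore important for assessing minimal residual disease (MRD). Mass spectrometry is a method under investigation that uses high-resolution molecular mass measurement to accurately identify and classify M protein in serum. At the Mayo Clinic in the United States, MALDI-TOF mass spectrometry has replaced immunofixation electrophoresis for identifying M protein. However, the mass spectrometry methods currently under investigation require antibody capture, which makes testing costly.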
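Summary of the Invention
In view of the above shortcomings of the prior art, the object of the present invention is to provide a method for detecting M protein that addresses the problems in the prior art.
To achieve the above and other related objects, the present invention provides a method for detecting M protein, the method comprising the following steps:
1) providing m/z distribution data for singly charged immunoglobulin light-chain and heavy-chain ions in a sample to be tested, the light chains including λ light chains and κ light chains;
2) result determination: if a mass spectral peak that is narrow-based, tall, and sharp is present within the light-chain m/z range, the sample is determined to contain M protein; or, if the κ light chain : λ light chain peak area ratio is less than 1.8 or greater than 3.5 and the peak shape is non-Gaussian, the sample is determined to contain M protein; the narrow-based, tall, and sharp mass spectral peak within the light-chain m/z range, or the non-Gaussian light-chain peak with a κ light chain : λ light chain peak area ratio less than 1.8 or greater than 3.5, is the M protein light-chain peak.
The present invention also provides a device for detecting M protein, the device comprising:
an information acquisition module: used to acquire m/z distribution data for singly charged immunoglobulin light-chain and heavy-chain ions in a sample to be tested, the light chains including λ light chains and κ light chains;
a peak shape recognition module: used to analyze the κ light chain : λ light chain peak areas and whether the peak shape is non-Gaussian;
a result determination module: used to output results according to the following cases:
if the κ light chain : λ light chain peak area ratio is less than 1.8 and a non-Gaussian peak shape is present at singly charged ion m/z 22400 to 23100 Da, the sample is determined to contain λ light-chain type M protein;
if the κ light chain : λ light chain peak area ratio is greater than 3.5 and a non-Gaussian peak shape is present at singly charged ion m/z 23100 to 24600 Da, the sample is determined to contain κ light-chain type M protein;
if the κ light chain : λ light chain peak area ratio is ≥1.8 and ≤3.5 and the peak shape is Gaussian, the sample is determined not to contain M protein.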
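As described above, the method for detecting M protein of the present invention has the following beneficial effects:
The κ/λ light-chain ratio has been systematically evaluated on the basis of MALDI-TOF MS, and the light-chain ratio can be measured. Given the unique molecular mass and high abundance of M protein, the appearance of a κ- or λ-type M protein peak in a patient's light-chain region and/or an abnormal κ/λ ratio indicates the presence of M protein. Compared with SPE or IFE, MALDI-TOF MS can track low levels of M protein during treatment in a highly sensitive and specific manner, providing more precise testing for diagnosing disease and monitoring the patient's response to treatment. In addition, unlike the mass spectrometry methods currently under investigation, this patent does not need to enrich immunoglobulins from serum and uses no antibody capture; the M protein in serum is simply reduced with the very inexpensive reducing agent dithiothreitol (DTT), which separates the light chains from the heavy chains. In terms of consumables, the method therefore offers a more economical solution for M protein detection than existing methods. The method is also simple to perform, can be automated, and has high analytical sensitivity; it identifies M protein quickly and objectively and largely avoids the subjective error of manual visual analysis. The present invention can analyze the various types of M protein encountered clinically and can improve how multiple myeloma is screened, diagnosed, and monitored.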
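Brief Description of the Drawings
Figure 1 shows the validation results of the peak shape recognition tool.
Figure 2 is a schematic of the detection workflow and principle of the present invention.
Figure 3 shows the overlaid mass spectra of 60 healthy subjects used as normal controls.
Figure 4 compares the serum fingerprints of a normal control (left) and a multiple myeloma patient (right) before and after reduction.
Figure 5 compares the serum fingerprints (after reduction) of standard globulins (left) and multiple myeloma patients (right).
Figure 6 shows the distribution of κ light chain : λ light chain peak area ratios in 60 healthy subjects.
Figure 7 shows the peak shape recognition tool highlighting the region containing M protein with vertical cut lines.
Figure 8 compares the analytical sensitivity of MALDI-TOF MS and IFE.
Figure 9 compares the linearity of MALDI-TOF MS and SPE.
Figure 10 shows the workflow for MALDI-TOF MS analysis of M protein light chains; in the three mass spectra on the right of the figure, the abscissa is on the order of 10^4 and the ordinate on the order of 10^-3.
Figure 11 is a schematic of the device of the present invention for qualitative detection, quantification, and typing of M protein.
Figure 12 is a schematic of the service terminal of the present invention.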
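Detailed Description of the Embodiments
The present invention uses MALDI-TOF MS to analyze reduced immunoglobulins (Ig). The underlying methodological principle (calculating the ratio of the mass spectral peak areas of the two light-chain regions and establishing thresholds that separate normal samples from M protein-positive samples and their types) makes it possible to quantify the κ/λ light-chain ratio. The method enables identification of M protein.
The present invention provides a method for detecting M protein, the method comprising the following steps:
1) providing m/z distribution data for singly charged immunoglobulin light-chain ions in a sample to be tested, the light chains including λ light chains and κ light chains;
2) result determination:
if a mass spectral peak that is narrow-based, tall, and sharp is present within the light-chain m/z range, the sample is determined to contain M protein; or, if the κ light chain : λ light chain peak area ratio is less than 1.8 or greater than 3.5 and the peak shape is non-Gaussian, the sample is determined to contain M protein; the narrow-based, tall, and sharp mass spectral peak within the light-chain m/z range, or the non-Gaussian light-chain peak with a κ light chain : λ light chain peak area ratio less than 1.8 or greater than 3.5, is the M protein light-chain peak;
if the ratio is ≥1.8 and ≤3.5 and the peak shape is Gaussian, the sample is determined not to contain M protein.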
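The decision rule above can be written as a small function. This is a minimal sketch, not the patent's implementation: the function name `classify`, its boolean inputs, and the `"undetermined"` outcome are assumptions introduced for illustration. The source does not say what to report when the ratio and the peak shape disagree, and the negative case is read here as the closed interval [1.8, 3.5].

```python
# Thresholds and m/z regions are taken from the text; everything else is hypothetical.
KAPPA_REGION = (23100.0, 24600.0)   # singly charged kappa light chains, Da
LAMBDA_REGION = (22400.0, 23100.0)  # singly charged lambda light chains, Da


def classify(ratio, lambda_abnormal, kappa_abnormal, low=1.8, high=3.5):
    """Combine the kappa:lambda peak area ratio with the peak shape flags.

    ratio            -- kappa peak area divided by lambda peak area
    lambda_abnormal  -- True if a non-Gaussian peak was found in LAMBDA_REGION
    kappa_abnormal   -- True if a non-Gaussian peak was found in KAPPA_REGION
    """
    if ratio < low and lambda_abnormal:
        return "lambda M protein"
    if ratio > high and kappa_abnormal:
        return "kappa M protein"
    if low <= ratio <= high and not (lambda_abnormal or kappa_abnormal):
        return "no M protein"
    # Discordant ratio and peak shape: not covered by the source (assumption).
    return "undetermined"
```

For example, `classify(1.2, True, False)` gives the λ-type call, while a ratio of 1.2 with Gaussian peaks in both regions falls into the case the text leaves open.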
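In certain embodiments of the present invention, the sample to be tested is a serum or urine sample.
The m/z distribution data for the singly charged immunoglobulin light-chain and heavy-chain ions in the sample are obtained by the following steps:
I) Sample reduction: the sample to be tested is reduced with a reducing agent to obtain a reduced sample;
II) Detection: the reduced sample is mixed with matrix solution and spotted, and MALDI-TOF MS is used to measure the m/z distribution data of the singly charged immunoglobulin light-chain ions in the sample.
In certain embodiments of the present invention, the sample is diluted before it is reduced. The dilution factor is 5- to 20-fold, for example 5- to 15-fold. The sample may be diluted with one or more of water, PBS, and physiological saline.
In certain embodiments of the present invention, the reducing agent in step I) is selected from any one or more of dithiothreitol (DTT), tris(2-carboxyethyl)phosphine (TCEP), tris(3-hydroxypropyl)phosphine (TPP or THPP), and β-mercaptoethanol.
In certain embodiments of the present invention, after the sample and the reducing agent are mixed, the final concentration of the reducing agent is 0.02 to 0.08 mol/L, preferably 0.02 to 0.06 mol/L, and more preferably 0.04 mol/L.
In one embodiment, the reducing agent in step I) is a solution of dithiothreitol in formic acid.
In certain embodiments of the present invention, the reduction step is: after the sample and the reducing agent are mixed, incubate at 20 to 30 °C for 10 to 30 minutes, preferably at 24 to 27 °C for 15 to 25 minutes, and more preferably at 25 °C for 20 minutes.
In certain embodiments of the present invention, the matrix solution in step II) is selected from sinapinic acid matrix solution, 2,5-dihydroxybenzoic acid matrix solution, and α-cyano-4-hydroxycinnamic acid matrix solution. The solvent of the sinapinic acid matrix solution is acetonitrile plus an aqueous solution containing trifluoroacetic acid, at a volume ratio of 1:1.
In certain embodiments of the present invention, after the reduced sample and the matrix solution are mixed, the final concentration of the matrix is 1 to 5 mg/mL.
In certain embodiments of the present invention, the mass spectrometry conditions for the measurement in step II) are: source voltage 20 kV, detector voltage 0.48 kV, laser energy 4.8 μJ, laser frequency 3000 Hz, focus mass 20 kDa, scanning speed 1 mm/s, and acquisition mass range 5 kDa to 200 kDa.
In certain embodiments of the present invention, the light-chain m/z range in step 2) means a singly charged light-chain m/z of 22400 to 24600 Da. This and the other m/z ranges disclosed herein may be adjusted from the disclosed values, for example 22400 to 24600 Da, according to the deviation of the particular instrument.
In certain embodiments of the present invention, in step 2) the fingerprint obtained for the sample may be overlaid on the fingerprint of a healthy person to check whether the sample's fingerprint shows, within the light-chain m/z range, a mass spectral peak that is narrower-based, taller, and sharper than the healthy peak shape. If so, the sample is determined to contain M protein; if not, it is determined not to contain M protein. Specifically, if any one or more of these three conditions (narrower base, taller peak, sharper peak) is not met, the sample is determined not to contain M protein.
The spectrum of a healthy person shows a Gaussian distribution in the light-chain m/z range. The mass spectral peak of M protein is narrow-based, tall, and sharp, resembling a church spire, and is not Gaussian.
In certain embodiments of the present invention, if the κ light chain : λ light chain peak area ratio is less than 1.8 or greater than 3.5 and the peak shape is non-Gaussian, the sample is determined to contain M protein. If the ratio is between 1.8 and 3.5 (inclusive) and the peak shape is Gaussian, the sample is determined not to contain M protein. When no M protein is present, the κ and λ light-chain regions generally each show one Gaussian peak, two in total. When M protein is present, one of the two light chains is produced in large amounts and the other is suppressed, so only one intense main peak appears in the light-chain region; when judging the peak shape, it is sufficient to judge whether this main peak is Gaussian.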
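In the present invention, the κ light chain : λ light chain peak area ratio is obtained by integrating the peak areas with the workstation supplied with the mass spectrometer. Whether the peak shape is Gaussian is judged with the peak shape recognition tool. The tool is obtained by applying the random forest (RF) algorithm to a training set of previously measured mass spectrometry samples; the algorithm is run with MATLAB's TreeBagger function. Random forest combines multiple decision trees following the idea of ensemble learning: its basic unit is the decision tree, and it belongs to ensemble learning, a major branch of machine learning. It is a very mature technique in the prior art.
Random forest is a commonly used supervised machine learning classification algorithm. A machine learning algorithm can be regarded as a complex function. In an n-class task, when the feature values of a sample are input, the function outputs n values that sum to 1; these can be regarded as the sample's scores for each class, and in general the class with the highest score is taken as the function's decision for the input sample. Supervised learning means that a large number of samples and their class information (also called labels) are fed to the classifier, and the algorithm then uses the labels to adjust the classifier's function (or decision process) so that the output scores agree with the labels as closely as possible. This iterative process is called training. After training, samples that did not take part in training are input to the classifier, its decisions are recorded, and its performance is evaluated with various metrics; this process is called testing.
A random forest consists of multiple decision trees, each a sub-classifier; the forest lets these trees vote on a sample and computes the sample's score for each class. For each decision tree, the random forest randomly selects a subset of the training data and a subset of the features describing the samples for training. Each decision tree is a binary tree in which each node represents a feature, and the value of that feature decides which branch to follow. The nodes in the last layer of the tree represent the tree's decision on the sample's class. During training, each decision tree selects the optimal features according to the sample labels and builds its binary tree.
Specifically, a random forest classifier is used to recognize the state of the light-chain region. In this analysis the light-chain region has three possible states: first, the λ peak in the m/z interval [22400, 23100] is abnormal while the κ peak in [23100, 24600] is normal; second, the λ peak in [22400, 23100] is normal while the κ peak in [23100, 24600] is abnormal; third, the λ peak in [22400, 23100] and the κ peak in [23100, 24600] are both normal. It is assumed here that any abnormal peak is the maximum of its interval. The recognition problem therefore reduces to a three-class problem: first find the maximum point in each of the two intervals of the spectrum, then classify the sample from the peak features at these two maxima.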
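The peak shape recognition tool is developed as follows:
1) A number of previously measured mass spectrometry samples are used as the training set, and the spectrum of each sample is manually labeled with one of three classes: negative (labeled normal), abnormal λ peak (labeled lambda), or abnormal κ peak (labeled kappa).
2) Features are extracted from the spectrum of each sample in the training set;
In some specific embodiments, the extracted feature values are σ_L, C_L, σ_R, C_R, σ_L/σ_R, and C_L/C_R.
Specifically, for each sample, the most intense peak in the singly charged ion m/z interval [23100, 24600] and the most intense peak in the interval [22400, 23100] are identified, and a Gaussian distribution is fitted to each.
Let A(x_m, y_m) be the coordinates of peak apex m, where x_m is the mass-to-charge ratio of the apex and y_m is its abundance. Points whose abscissa lies in [x_m − 0.001·x_m, x_m + 0.001·x_m] are used for the fit. These points are denoted (x_1, y_1), (x_2, y_2), ..., (x_n, y_n). The fitted equation is: [equation not reproduced in the source text]
where x is the mass-to-charge ratio and y is the abundance.
σ describes the peak width of the normal distribution, and C describes the degree of deformation of the peak along the vertical axis. The goal of the fit is to find the σ that minimizes a sum over the fitted points [expression not reproduced in the source text]. The σ at which this sum is smallest, together with its corresponding C, is taken as the feature representation of the peak containing the interval maximum. Let σ_R and C_R denote the features of the m/z interval [23100, 24600], and σ_L and C_L the features of the m/z interval [22400, 23100].
The mass spectrometry data of each sample are then represented by the feature values (σ_L, C_L, σ_R, C_R, σ_L/σ_R, C_L/C_R); that is, each sample is described by these six feature values.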
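The fitting step can be illustrated as follows. Because the fitted equation is not reproduced in the text, this sketch assumes one plausible form, y = C·exp(−(x − x_m)²/(2σ²)), and assumes the minimized quantity is the sum of squared residuals; both are assumptions, as are the function names `window` and `fit_peak` and the grid of candidate σ values. For each candidate σ the best C has a closed-form least-squares solution, which matches the text's description of finding σ first and then taking its corresponding C.

```python
import math


def window(points, x_m, rel=0.001):
    """Keep (x, y) points whose x lies within +/- rel * x_m of the apex x_m."""
    lo, hi = x_m - rel * x_m, x_m + rel * x_m
    return [(x, y) for x, y in points if lo <= x <= hi]


def fit_peak(xs, ys, x_m, sigmas):
    """Grid-search sigma; return (sigma, C, sse) with the smallest squared error.

    Assumed model (not from the source): y = C * exp(-(x - x_m)^2 / (2 sigma^2)).
    """
    best = None
    for sigma in sigmas:
        g = [math.exp(-((x - x_m) ** 2) / (2.0 * sigma ** 2)) for x in xs]
        # Closed-form least-squares amplitude for this sigma.
        c = sum(y * v for y, v in zip(ys, g)) / sum(v * v for v in g)
        sse = sum((y - c * v) ** 2 for y, v in zip(ys, g))
        if best is None or sse < best[2]:
            best = (sigma, c, sse)
    return best
```

Running `fit_peak` once on the strongest peak of [22400, 23100] and once on the strongest peak of [23100, 24600] would give (σ_L, C_L) and (σ_R, C_R), from which the two ratios follow directly.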
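3) The six feature values of each sample in the training set, together with each sample's label, are used to build a random forest model with the random forest algorithm.
For each mass spectrometry sample, the random forest model obtained above takes the six feature values and computes S_normal, S_λ, and S_κ, where S_normal is the predicted score that the light-chain region peak shape is Gaussian, S_λ is the predicted score that a non-Gaussian peak is present in the λ light-chain region, and S_κ is the predicted score that a non-Gaussian peak is present in the κ light-chain region, with S_normal + S_λ + S_κ = 1. The largest of S_normal, S_λ, and S_κ determines the final conclusion; for example, if S_normal is the largest, the light-chain region peak shape is Gaussian.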
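The final step, picking the class with the largest score, is easy to make concrete. The sketch below is illustrative only: it derives the three scores as plain vote fractions over the trees, which is a simplification and not necessarily how MATLAB's TreeBagger computes its scores, and the names `scores_from_votes` and `final_call` are hypothetical. Ties are broken in the order normal, lambda, kappa, which is also an assumption.

```python
from collections import Counter

CLASSES = ("normal", "lambda", "kappa")


def scores_from_votes(votes):
    """Turn per-tree class votes into scores that sum to 1 (vote fractions)."""
    counts = Counter(votes)
    return {label: counts[label] / len(votes) for label in CLASSES}


def final_call(scores, tol=1e-9):
    """Return the class with the largest score; scores must sum to 1."""
    if abs(sum(scores.values()) - 1.0) > tol:
        raise ValueError("scores must sum to 1")
    return max(CLASSES, key=lambda label: scores[label])
```

With four trees voting kappa, kappa, normal, kappa, the scores are S_normal = 0.25, S_λ = 0, S_κ = 0.75, and the call is kappa.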
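In one specific embodiment, 1929 previously measured mass spectrometry samples were collected as the training set. Each spectrum was manually labeled, giving three classes: 924 negative samples (labeled normal), 433 samples with an abnormal λ peak (labeled lambda), and 572 samples with an abnormal κ peak (labeled kappa). Features were then extracted for each peak, and the resulting random forest model served as the peak shape recognition tool.
On this basis, ten-fold cross-validation was carried out with the same 1929 samples. The results are shown in Figure 1, where each row represents all samples of one labeled class. The first row shows that of the 433 lambda-type M protein samples, a fraction of 0.9885 were classified as lambda-type M protein, i.e. 428 were predicted correctly and only 5 were misclassified as negative or kappa-type. Likewise, the second row shows that of the 924 negative samples, a fraction of 0.9957 were classified as negative (correct), with only 4 errors. The third row shows that of the 572 kappa-type M protein samples, a fraction of 0.9895 were classified as kappa-type M protein, with only 7 classified as negative (errors). The ten-fold cross-validation indicates that the model built with the random forest algorithm achieves satisfactory accuracy.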
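The per-class fractions quoted above can be converted back to sample counts as a quick consistency check. This is a worked-arithmetic sketch; the names `REPORTED`, `implied_counts`, and `overall_accuracy` are hypothetical, and the overall accuracy is derived here rather than stated in the source. Note that the stated fraction 0.9895 for the kappa class corresponds to 566 of 572 samples, i.e. 6 misclassified, whereas the text gives 7; the sketch follows the stated fractions.

```python
# (class size, fraction classified correctly), as quoted in the text.
REPORTED = {
    "lambda": (433, 0.9885),
    "normal": (924, 0.9957),
    "kappa": (572, 0.9895),
}


def implied_counts(reported):
    """Number of correctly classified samples implied by each fraction."""
    return {label: round(total * frac) for label, (total, frac) in reported.items()}


def overall_accuracy(reported):
    """Pooled accuracy over all classes, derived from the implied counts."""
    correct = sum(implied_counts(reported).values())
    total = sum(total for total, _ in reported.values())
    return correct / total
```

Under these assumptions the implied correct counts are 428, 920, and 566, which pool to 1914 of 1929 samples.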
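The specific determination in step 2) is as follows:
If κ/λ < 1.8 and the λ light-chain region contains an abnormal M protein light-chain peak, the sample is determined to be positive for λ light-chain type M protein;
If κ/λ > 3.5 and the κ light-chain region contains an abnormal M protein light-chain peak, the sample is determined to be positive for κ light-chain type M protein;
In these expressions, κ is the peak area of the M protein κ light chain and λ is the peak area of the M protein λ light chain. The κ light-chain region is the region where the singly charged ions have m/z 23100 to 24600 Da; the λ light-chain region is the region where the singly charged ions have m/z 22400 to 23100 Da. An abnormal M protein peak is an M protein peak with a non-Gaussian distribution.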
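The present invention also provides a device for detecting M protein, the device comprising:
an information acquisition module 101: used to acquire m/z distribution data for singly charged immunoglobulin light-chain ions in a sample to be tested, the light chains including λ light chains and κ light chains;
a peak shape recognition module 102: used to analyze the κ light chain : λ light chain peak areas and whether the peak shape is non-Gaussian;
a result determination module 103: used to output results according to the following cases:
if the κ light chain : λ light chain peak area ratio is less than 1.8 and a non-Gaussian peak shape is present at singly charged ion m/z 22400 to 23100 Da, the sample is determined to contain λ light-chain type M protein;
if the κ light chain : λ light chain peak area ratio is greater than 3.5 and a non-Gaussian peak shape is present at singly charged ion m/z 23100 to 24600 Da, the sample is determined to contain κ light-chain type M protein;
if the κ light chain : λ light chain peak area ratio is ≥1.8 and ≤3.5 and the peak shape is Gaussian, the sample is determined not to contain M protein.
The peak shape recognition module 102 comprises:
1) a training data set generation submodule: used to obtain a data set of previously measured and manually labeled mass spectrometry samples and to assign corresponding values to the negative, abnormal λ peak, and abnormal κ peak samples in the data set, yielding the training data set;
2) a feature extraction submodule: used to extract features from the spectrum of each sample in the training data set; the extracted feature values are σ_L, C_L, σ_R, C_R, σ_L/σ_R, and C_L/C_R;
3) a model generation submodule: used to take the feature values extracted by the feature extraction submodule and the value assigned to each sample and, using the random forest algorithm, obtain a random forest model able to analyze the κ light chain : λ light chain peak areas and whether the peak shape is non-Gaussian.
The information sources of the device's information acquisition module are the same as described for the M protein detection method, and the rules and methods of the result determination module are likewise the same as described for the method; they are not repeated here.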
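It should be noted that the division of the above system into modules is only a division of logical functions; in an actual implementation the modules may be fully or partly integrated into one physical entity or may be physically separate. The modules may all be implemented as software invoked by a processing element, or all as hardware, or some as software invoked by a processing element and others as hardware. For example, the information acquisition module may be a separately provided processing element or may be integrated in a chip; it may also be stored in memory as program code that a processing element calls to execute the functions of that module. The other modules are implemented similarly. All or some of the modules may be integrated together or implemented independently. The processing element described here may be an integrated circuit with signal processing capability. During implementation, the steps of the above method or the above modules may be carried out by integrated logic circuits in hardware in the processor element or by instructions in software form.
For example, the above modules may be one or more integrated circuits configured to implement the above method, such as one or more application specific integrated circuits (ASIC), one or more digital signal processors (DSP), or one or more field programmable gate arrays (FPGA). As another example, when one of the above modules is implemented as program code scheduled by a processing element, the processing element may be a general-purpose processor, such as a central processing unit (CPU) or another processor that can call program code. As a further example, these modules may be integrated together and implemented as a system-on-a-chip (SOC).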
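The present invention also provides a computer-readable storage medium on which a computer program is stored; when the program is executed by a processor, the steps of the above method are implemented.
The computer-readable storage medium stores a computer program that is run to implement the M protein detection method. The computer-readable storage medium may include, but is not limited to, a floppy disk, an optical disk, a CD-ROM (compact disc read-only memory), a magneto-optical disk, a ROM (read-only memory), a RAM (random access memory), an EPROM (erasable programmable read-only memory), an EEPROM (electrically erasable programmable read-only memory), a magnetic or optical card, flash memory, or other types of media/machine-readable media suitable for storing machine-executable instructions. The computer-readable storage medium may be a product not yet connected to a computer device or a component already connected to a computer device for use.
In a specific implementation, the computer program is a routine, program, object, component, data structure, or the like that performs specific tasks or implements specific abstract data types.
The present invention also provides a computer processing device comprising a processor and the computer-readable storage medium, wherein the processor executes the computer program on the computer-readable storage medium to implement the steps of the above method.
The present invention also provides a service terminal, comprising:
a communicator 201, used to communicate with the outside;
a memory 202, storing a computer program;
a processor 203, used to run the computer program to implement the M protein detection method.
The service terminal can communicate through its communicator 201 with a user terminal that has network communication capability, thereby providing an M protein detection service.
The memory 202 in the embodiment of FIG. 12 may include, but is not limited to, high-speed random access memory and non-volatile memory, for example one or more disk storage devices, flash memory devices, or other non-volatile solid-state storage devices. The processor 203 in the embodiment of FIG. 12 may include, but is not limited to, a central processing unit (CPU), a network processor (NP), and the like; it may also be a digital signal processor (DSP), an application specific integrated circuit (ASIC), a field programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
The communicator 201 in the embodiment of FIG. 12 may be a wired or wireless network communication circuit module.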
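The embodiments of the present invention are described below through specific examples; those skilled in the art can readily understand other advantages and effects of the present invention from the content disclosed in this specification. The present invention may also be implemented or applied through other, different embodiments, and the details in this specification may be modified or changed in various ways, based on different viewpoints and applications, without departing from the spirit of the present invention.
Before the specific embodiments are described further, it should be understood that the scope of protection of the present invention is not limited to the specific embodiments below; that the terms used in the examples are intended to describe those specific embodiments and not to limit the scope of protection; and that, in the specification and claims, the singular forms "a", "an", and "the" include the plural unless the context clearly indicates otherwise.
Where an example gives a numerical range, it should be understood that, unless otherwise stated, both endpoints of the range and any value between them may be used. Unless otherwise defined, all technical and scientific terms used herein have the meanings commonly understood by those skilled in the art. In addition to the specific methods, equipment, and materials used in the examples, any prior-art methods, equipment, and materials similar or equivalent to those described in the examples may be used to carry out the present invention, based on the skilled person's knowledge of the prior art and the description herein.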
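Example 1
1. Samples
Serum samples from patients with a confirmed diagnosis of multiple myeloma, tested by serum protein electrophoresis (SPEP), serum immunofixation electrophoresis (IFE), total protein, and globulin isotype assays (methods and instruments are described below), were included in the MALDI-TOF MS analysis;
(1) SPEP and IFE
All assays were performed according to the existing protocols of the clinical immunology laboratory: SPEP was run on the Helena (USA) V8 capillary electrophoresis system, and IFE on the Helena (USA) SPIFE TOUCH system.
SPEP includes the following steps: first, the V8 instrument is switched on and runs a self-test. After the self-test, the serum samples are loaded and the instrument automatically tests all loaded samples. Proteins in the serum carry a positive charge in pH regions below their isoelectric point (pI) and migrate toward the negative electrode. Because different serum proteins have different isoelectric points, they are focused within the pH gradient and thereby separated. After the run, the results are transferred to the Platinum4v software.
IFE includes the following steps. The first step is high-resolution gel electrophoresis: the sample is diluted with physiological saline at 1:3 (50 μL serum + 100 μL physiological saline) and at 1:5 (50 μL serum + 100 μL physiological saline); the sample plate is placed on the left side of the electrophoresis tank, 20 μL of the 1:3 dilution is pipetted into the SP position, and 20 ml of the 1:5 dilution is added to the other 5 positions; the proteins in the serum separate under electrophoresis and electroosmosis because they carry different charges. The second step is immunoprecipitation: 40 μL of the corresponding antiserum is added, and soluble antigen forms an antigen-antibody complex with the antibody, producing an insoluble precipitate. Unreacted protein is removed by washing, the antigen-antibody complex is stained, and the immunofixation precipitation band appears on the protein pattern.
(2) Determination of total protein and globulin isotypes
Total protein concentration was determined by the biuret method on a Mindray fully automatic biochemical analyzer system (BS2000); IgG, IgA, and IgM concentrations were determined by nephelometric immunoassay on a Siemens fully automatic protein analyzer system (BNProSpec).
2. Reagents
(1) Dithiothreitol (DTT), trifluoroacetic acid, and acetonitrile were purchased from Sigma-Aldrich;
(2) Sinapinic acid (SA) was purchased from Sigma-Aldrich;
(3) IgG, IgA, and IgM immunoglobulin standards were purchased from Sigma-Aldrich; each immunoglobulin standard was purified from a multiple myeloma patient;
(4) Mass calibrant: a mixture containing cytochrome C (mass = 12362 Da [M+H]+), myoglobin (mass = 16952 Da [M+H]+), aldolase (mass = 39212 Da [M+H]+), and bovine serum albumin (mass = 66430 Da [M+H]+), purchased from Rongzhi Biotechnology (Qingdao) Co., Ltd.;
3. MALDI-TOF MS detection: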
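The detection workflow and principle are shown in Figure 2. The specific steps are as follows:
(1) Sample reduction: 20 μL of serum is added to 180 μL of PBS (10-fold dilution) and shaken at 1000 rpm for 30 s. The diluted serum sample is reduced with 0.4 M dithiothreitol (DTT, dissolved in 0.1% formic acid) at a serum:DTT volume ratio of 9:1, then shaken at 1000 rpm and incubated at room temperature for 20 minutes to dissociate the Igs into separate LC and HC;
(2) MALDI-TOF MS detection: a 10 mg/mL sinapinic acid matrix solution is prepared in acetonitrile/water (5:5 by volume, containing 0.1% trifluoroacetic acid). Matrix solution and reduced serum sample are combined at a volume ratio of 4:1, mixed, and spotted directly onto a reusable 96-well target plate (QuanTOF). After the sample dries, mass analysis is performed on a linear QuanTOF mass spectrometer (QuanTOF I, Rongzhi Biotechnology (Qingdao) Co., Ltd.) under the following conditions: source voltage 20 kV, detector voltage 0.48 kV, laser energy 4.8 μJ, laser frequency 3000 Hz, focus mass 20 kDa, scanning speed 1 mm/s, and acquisition mass range 5 kDa to 200 kDa. The LC components are ionized into singly charged ions, and the m/z distribution of the singly charged LC ions is measured.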
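The dilution arithmetic in the reduction step can be checked with a few lines of code. This is a worked-arithmetic sketch; the helper name `diluted` is hypothetical, and it assumes volumes are additive on mixing.

```python
def diluted(stock_conc, stock_parts, other_parts):
    """Concentration after mixing stock_parts of stock with other_parts of diluent."""
    return stock_conc * stock_parts / (stock_parts + other_parts)


# 20 uL serum + 180 uL PBS: serum is present at 1/10 of its original concentration.
serum_fraction = diluted(1.0, 20, 180)

# 0.4 M DTT added at serum:DTT = 9:1 by volume gives the final DTT concentration.
dtt_final_molar = diluted(0.4, 1, 9)
```

Under these assumptions the final DTT concentration is 0.04 mol/L, which agrees with the most preferred final reducing agent concentration given earlier in the description.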
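4. Result determination
(1) The serum fingerprints are inspected visually with the viewer software supplied with the QuanTOF. Fingerprints from different patient samples are compared with normal control fingerprints for visual detection of monoclonal immunoglobulins. A result is defined as positive when sharp, church-spire-like mass spectral peaks are identified within the expected light-chain m/z range and can be distinguished from the Gaussian-distributed polyclonal background seen in normal samples; that is, a peak within the light-chain m/z range of the sample's fingerprint that is narrower-based, taller, and sharper than the healthy peak shape is judged positive. A mass spectral peak in the +1 charge state is supporting evidence when checking for the presence of M protein.
(2) Peak area calculation and peak shape recognition tool:
To quantify the results further, this patent identifies M protein using time-of-flight mass spectrometry together with a peak shape recognition tool. The workstation supplied with the time-of-flight mass spectrometer identifies the light-chain type from the m/z data, integrates the peak areas of the immunoglobulin λ and κ light chains, and automatically calculates the peak area ratio of the λ and κ light chains; the peak shape recognition tool determines whether the λ or κ light-chain region contains an abnormal M protein peak according to whether Gaussian-distributed peaks are present, thereby distinguishing normal from abnormal samples. An abnormal M protein peak is an M protein peak with a non-Gaussian distribution.
If κ/λ is within [1.8, 3.5] and the light-chain region contains no abnormal M protein peak, the sample is identified as normal;
If κ/λ < 1.8 and the λ light-chain region contains an abnormal M protein peak, the sample is determined to be positive for λ light-chain type M protein;
If κ/λ > 3.5 and the κ light-chain region contains an abnormal M protein peak, the sample is determined to be positive for κ light-chain type M protein.
In these expressions, κ is the peak area of the M protein κ light chain and λ is the peak area of the M protein λ light chain; the κ light-chain region is the region where the singly charged ions have m/z 23100 to 24600 Da; the λ light-chain region is the region where the singly charged ions have m/z 22400 to 23100 Da.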
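5. Results
(1) Figure 3 shows the overlaid mass spectra of 60 healthy subjects used as normal controls, covering the region of total light chain (TLC) ions (m/z 22,400 Da to 24,600 Da). The spectra of the 60 healthy subjects were consistent and Gaussian in shape, with large amounts of polyclonal κ-type and λ-type light chains at a peak height ratio of about 2:1.
(2) Figure 4 compares the serum fingerprints of a normal control and a multiple myeloma (IgGλ type) patient before and after reduction. (A) The overlay clearly shows the spectra of a healthy person's serum sample before (blue) and after (green) reduction; (B) serum fingerprints of an IgGλ-type multiple myeloma patient before and after reduction. Because a large amount of λ-type light chain is produced, the κ-type light chain is suppressed, and the mass spectrum of the M protein shows a non-Gaussian distribution. Comparing the reduced serum fingerprints of the healthy person and the MM patient clearly shows that the results for healthy serum and for IgGλ-type M protein differ considerably.
(3) Figure 5 compares the serum fingerprints (after reduction) of standard globulins and of sera from different types of multiple myeloma patients. In the left panel, different standard globulins (IgG, IgA, IgM) were reduced and measured separately, and the resulting spectra were overlaid for analysis. The polyclonal singly charged LC ions are labeled by isotype and charge. The spectra of the different Ig isotypes are shown in different colors. The enlarged spectrum in the figure focuses on the region of the singly charged LC ions. The right panel compares the serum fingerprints (after reduction) of patients with IgGλ, IgAκ, and IgMκ M proteins with that of a healthy person. In the enlarged singly charged LC ion region (m/z 22400 to 24600), the LC peaks of multiple myeloma patients with different Ig isotypes are narrow-based, tall, and sharp, and are clearly distinguishable from the Gaussian peak shape of the healthy person (black); in the overlay of this region, the patients' spectra show a distinct peak of relatively high intensity against the polyclonal background of the healthy person. Based on the results for the standard globulins and healthy subjects, the present invention defines the m/z ranges of the singly charged LC (κ, λ) ions as κ-TLC (23100 to 24600 Da, [M+H]+) and λ-TLC (22400 to 23100 Da, [M+H]+).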
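(4) Peak area calculation
To quantify the qualitatively narrow M protein peak and provide a basis for identifying M protein, the present invention performs calculations on the M protein light chains. Figure 6 shows the calculated λ and κ light-chain peak areas of 60 healthy subjects, with the light-chain κ/λ ratio calculated by the mass spectrometry workstation. As shown in the figure, the κ/λ ratios of the 60 healthy subjects are concentrated between 1.8 and 3.5.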
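The κ/λ peak area ratio can be approximated from a list of (m/z, intensity) points. The sketch below is illustrative only: it uses plain trapezoidal integration with no baseline correction, which is not necessarily what the instrument workstation does, and the function names `trapezoid_area` and `kappa_lambda_ratio` are hypothetical. The region boundaries are those defined in the text.

```python
def trapezoid_area(points, lo, hi):
    """Trapezoidal area under the (m/z, intensity) points with lo <= m/z <= hi."""
    pts = sorted(p for p in points if lo <= p[0] <= hi)
    return sum((x1 - x0) * (y0 + y1) / 2.0 for (x0, y0), (x1, y1) in zip(pts, pts[1:]))


def kappa_lambda_ratio(points):
    """Ratio of the kappa region area (23100-24600 Da) to the lambda region area (22400-23100 Da)."""
    kappa = trapezoid_area(points, 23100.0, 24600.0)
    lam = trapezoid_area(points, 22400.0, 23100.0)
    return kappa / lam
```

A ratio computed this way can then be compared with the 1.8 to 3.5 interval observed for the 60 healthy subjects.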
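(5) Peak shape recognition
The peak shape recognition tool determines whether an M protein light-chain peak is present in the light-chain m/z range according to whether the peak shape is Gaussian. If the peak shape is non-Gaussian, then, as shown in Figure 7, MATLAB is used to highlight the region containing the monoclonal protein component with vertical cut lines, and the mass spectrometry workstation is used to calculate the proportion of the TLC accounted for by the highlighted region (i.e., the M protein).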
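Example 1 Analytical sensitivity
Because IFE is considered the most sensitive method for detecting M protein, this example compares MALDI-TOF MS with IFE. The procedure was as follows: sera from different M protein-positive multiple myeloma patients were mixed with normal human serum and serially diluted at 0x, 1:2, 1:10, 1:20, 1:100, and 1:200. Each diluted sample was split into two equal aliquots and analyzed by IFE and by MALDI-TOF MS, using the same analysis method as in Example 1. As Figure 8 shows, MALDI-TOF MS still detected κ-type M protein in M protein serum samples diluted 1:100 (A in Figure 8), whereas IFE could no longer detect M protein at a 1:100 serum dilution (B in Figure 8); the analytical sensitivity for different M proteins determined by IFE and MALDI at different dilutions is compared in C in Figure 8.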
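Example 2 Linearity
Linearity comparison of SPE and MALDI:
The procedure was as follows: 7 patient serum samples with known M protein concentrations (4 IgG, 2 IgA, and 1 IgM; range 0.5 to 8 g/dL) were diluted with normal human serum at 0x, 1:3, 1:15, 1:75, and 1:375 and quantified by SPEP and MALDI (5 samples per patient, 35 samples in total; method as in Example 1). For MALDI-TOF MS, the peak area was calculated with the workstation, and the M protein peak was gated and quantified with the peak shape recognition tool. As Figure 9 shows, the M protein concentrations measured by SPEP and MALDI-TOF MS agreed closely with the expected concentrations (R² > 0.98).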
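Example 3 Method concordance
4.1 Concordance of MALDI-TOF MS with SPEP and IFE for qualitative M protein detection
To further demonstrate the practical value of the method for detecting M protein, serum samples from 124 multiple myeloma patients before and after treatment (residual serum after SPE and IFE testing) were collected from the biochemistry and immunology laboratory of Zhujiang Hospital, and a blinded concordance study compared the performance of MALDI-TOF MS with that of the routine methods, serum protein electrophoresis (SPEP) and immunofixation electrophoresis (IFE). As shown in Table 1, 107 samples were SPE-positive and 115 were IFE-positive; 94% of the samples (n=117) were detected as positive by MALDI. Among the samples negative by both SPEP and IFE (n=9), MALDI-TOF MS found 2 positive cases. The detection workflow is shown in Figure 10.
Table 1
(SPE: serum protein electrophoresis; IFE: immunofixation electrophoresis)
The present invention identifies M protein rapidly on the basis of MALDI-TOF MS (QuanTOF, Rongzhi Biotechnology (Qingdao) Co., Ltd.). Compared with other linear MALDI-TOF MS instruments of the same type, the QuanTOF mass spectrometer (Rongzhi Biotechnology (Qingdao) Co., Ltd.) provides a wider mass range, higher sensitivity, and better reproducibility. QuanTOF identifies and monitors patient serum M protein with high sensitivity and accuracy. Compared with traditional methods, the new automated time-of-flight mass spectrometry system of the present invention works mainly by reducing immunoglobulins so that the disulfide bonds between the Ig heavy chain (HC) and light chain (LC) are broken and the chains separate; it can directly analyze the specific changes in the M protein light chain, with higher sensitivity and specificity. In addition, the method needs only one test for qualitative detection and typing, the sample pretreatment is simple and fast, reagent and consumable use is low, detection throughput is high, and sample pretreatment time is much shorter than with existing electrophoresis methods. The results are not easily affected by laboratory conditions, which makes standardization easier. The present invention effectively addresses the low accuracy, low throughput, and poor specificity and sensitivity of existing M protein screening methods. The method is expected to be applied to large-scale clinical screening for M protein, improving the screening, diagnosis, and monitoring of MM.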
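The above examples illustrate the disclosed embodiments of the present invention and are not to be construed as limiting it. Various modifications of the invention and variations of the methods described herein will be apparent to those skilled in the art without departing from the scope and spirit of the invention. Although the invention has been described in connection with several specific preferred embodiments, it should be understood that the invention is not limited to those embodiments. Indeed, the various modifications described above that are obvious to those skilled in the art are intended to fall within the scope of the invention.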

Claims (12)

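  1. A method for detecting M protein, characterized in that the method comprises the following steps:
    1) providing m/z distribution data for singly charged immunoglobulin light-chain ions in a sample to be tested, the light chains including λ light chains and κ light chains;
    2) result determination:
    if a mass spectral peak that is narrow-based, tall, and sharp is present within the light-chain m/z range, the sample to be tested is determined to contain M protein;
    or,
    if the κ light chain : λ light chain peak area ratio is less than 1.8 or greater than 3.5 and the peak shape is non-Gaussian, the sample to be tested is determined to contain M protein;
    the narrow-based, tall, and sharp mass spectral peak within the light-chain m/z range, or the non-Gaussian light-chain peak with a κ light chain : λ light chain peak area ratio less than 1.8 or greater than 3.5, is the M protein light-chain peak;
    if the κ light chain : λ light chain peak area ratio is ≥1.8 and ≤3.5 and the peak shape is Gaussian, the sample to be tested is determined not to contain M protein.
  2. The method according to claim 1, characterized in that the sample to be tested is a serum sample or a urine sample, and/or the light-chain m/z range in step 2) means a singly charged light-chain m/z of 22400 to 24600 Da.
  3. The method according to claim 1, characterized in that the m/z distribution data for the singly charged immunoglobulin light-chain ions in the sample to be tested are obtained by the following steps:
    I) sample reduction: the sample to be tested is reduced with a reducing agent to obtain a reduced sample;
    II) detection: the reduced sample is mixed with matrix solution and spotted, and MALDI-TOF MS is used to measure the m/z distribution data of the singly charged immunoglobulin light-chain and heavy-chain ions in the sample to be tested.
  4. The method according to claim 3, characterized in that the sample to be tested is diluted before being reduced with the reducing agent; preferably, the sample to be tested is diluted with one or more of water, PBS, and physiological saline;
    and/or,
    the reducing agent in step I) is selected from any one or more of dithiothreitol, tris(2-carboxyethyl)phosphine, tris(3-hydroxypropyl)phosphine, and β-mercaptoethanol; preferably, the final concentration of the reducing agent is 0.02 to 0.08 mol/L;
    and/or, the reduction in step I) is carried out by mixing the sample to be tested with the reducing agent and incubating at 20 to 30 °C for 10 to 30 minutes;
    and/or, the matrix solution in step II) is selected from sinapinic acid matrix solution, 2,5-dihydroxybenzoic acid matrix solution, and α-cyano-4-hydroxycinnamic acid matrix solution; preferably, the final concentration of the matrix is 1 to 5 mg/mL.
  5. The method according to claim 1, characterized in that in step 2) the fingerprint obtained for the sample to be tested is overlaid on the fingerprint of a healthy person to check whether the sample's fingerprint shows, within the light-chain m/z range, a mass spectral peak that is narrower-based, taller, and sharper than the healthy peak shape; if so, the sample to be tested is determined to contain M protein; if not, it is determined not to contain M protein.
  6. The method according to claim 1, characterized in that the specific determination in step 2) is as follows:
    if κ/λ < 1.8 and a non-Gaussian M protein light-chain peak is present at singly charged ion m/z 22400 to 23100 Da, the sample is determined to be positive for λ light-chain type M protein;
    if κ/λ > 3.5 and a non-Gaussian M protein light-chain peak is present at singly charged ion m/z 23100 to 24600 Da, the sample is determined to be positive for κ light-chain type M protein;
    in these expressions, κ is the peak area of the M protein κ light chain and λ is the peak area of the M protein λ light chain.
  7. The method according to claim 1, characterized in that in step 2) the peak area ratio of the κ and λ light chains is calculated with the mass spectrometry workstation, and/or a peak shape recognition tool is used to judge whether the peak shape is Gaussian.
  8. A device for detecting M protein, characterized in that the device comprises:
    an information acquisition module: used to acquire m/z distribution data for singly charged immunoglobulin light-chain and heavy-chain ions in a sample to be tested, the light chains including λ light chains and κ light chains;
    a peak shape recognition module: used to analyze the κ light chain : λ light chain peak areas and whether the peak shape is non-Gaussian;
    a result determination module: used to output results according to the following cases:
    if the κ light chain : λ light chain peak area ratio is less than 1.8 and a non-Gaussian peak shape is present at singly charged ion m/z 22400 to 23100 Da, the sample to be tested is determined to contain λ light-chain type M protein;
    if the κ light chain : λ light chain peak area ratio is greater than 3.5 and a non-Gaussian peak shape is present at singly charged ion m/z 23100 to 24600 Da, the sample to be tested is determined to contain κ light-chain type M protein;
    if the κ light chain : λ light chain peak area ratio is ≥1.8 and ≤3.5 and the peak shape is Gaussian, the sample to be tested is determined not to contain M protein.
  9. The device according to claim 8, characterized in that the peak shape recognition module comprises:
    1) a training data set generation submodule: used to obtain a data set of previously measured and manually labeled mass spectrometry samples and to assign corresponding values to the negative, abnormal λ peak, and abnormal κ peak samples in the data set, yielding the training data set;
    2) a feature extraction submodule: used to extract features from the spectrum of each sample in the training data set; the extracted feature values are σ_L, C_L, σ_R, C_R, σ_L/σ_R, and C_L/C_R;
    3) a model generation submodule: used to take the feature values extracted by the feature extraction submodule and the value assigned to each sample and, using the random forest algorithm, obtain a random forest model able to analyze the κ light chain : λ light chain peak areas and whether the peak shape is non-Gaussian.
  10. A computer-readable storage medium on which a computer program is stored, characterized in that when the program is executed by a processor, the steps of the method of any one of claims 1-7 are implemented.
  11. A computer processing device comprising a processor and the computer-readable storage medium of claim 10, characterized in that the processor executes the computer program on the computer-readable storage medium to implement the steps of the method of any one of claims 1-7.
  12. A service terminal, comprising:
    a communicator, used to communicate with the outside;
    a memory, storing a computer program;
    a processor, used to run the computer program to implement the method of any one of claims 1-7.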
PCT/CN2023/087606 2022-10-21 2023-04-11 Method for detecting M protein WO2024082581A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202211293501.4A CN115684606B (zh) 2022-10-21 2022-10-21 Method for detecting M protein
CN202211293501.4 2022-10-21

Publications (1)

Publication Number Publication Date
WO2024082581A1 true WO2024082581A1 (zh) 2024-04-25

Family

ID=85066367

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2023/087606 WO2024082581A1 (zh) 2022-10-21 2023-04-11 Method for detecting M protein

Country Status (2)

Country Link
CN (1) CN115684606B (zh)
WO (1) WO2024082581A1 (zh)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115684606B (zh) * 2022-10-21 2023-11-28 南方医科大学珠江医院 Method for detecting M protein
CN117491653A (zh) * 2023-11-06 2024-02-02 上海体育大学 Method for preparing a growth hormone sample
CN117849159A (zh) * 2024-01-09 2024-04-09 融智生物科技(青岛)有限公司 M protein detection method, electronic device, and storage medium

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104272113A (zh) * 2012-03-06 2015-01-07 结合点集团有限公司 Method for characterizing plasma cell associated diseases
CN104598767A (zh) * 2015-01-16 2015-05-06 上海市第一人民医院 Method for identifying M protein components in immunofixation electrophoresis using a computer
US20160041184A1 (en) * 2013-03-15 2016-02-11 David R. Barnidge Identification and monitoring of monoclonal immunoglobulins by molecular mass
CN109753939A (zh) * 2019-01-11 2019-05-14 银丰基因科技有限公司 Method for recognizing HLA sequencing peak profiles
US20210247402A1 (en) * 2018-05-04 2021-08-12 The Binding Site Group Ltd Identification of immunoglobulins using mass spectrometry
CN113720900A (zh) * 2021-09-14 2021-11-30 首都医科大学附属北京朝阳医院 Method for detecting M protein in serum based on MALDI-TOF MS technology
CN113811772A (zh) * 2019-05-10 2021-12-17 结合点集团有限公司 Mass spectrometry calibrator
CN115684606A (zh) * 2022-10-21 2023-02-03 南方医科大学珠江医院 Method for M protein detection

Family Cites Families (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2005062044A1 (ja) * 2003-11-19 2005-07-07 Yamasa Corporation Method for assaying Bence Jones protein
EP3050074B1 (en) * 2013-09-23 2020-08-26 Micromass UK Limited Peak assessment for mass spectrometers
US10267806B2 (en) * 2014-04-04 2019-04-23 Mayo Foundation For Medical Education And Research Isotyping immunoglobulins using accurate molecular mass
GB2530521B (en) * 2014-09-24 2020-06-10 Map Ip Holding Ltd Mass spectral analysis of patient samples for the detection of the human chorionic gonadotropin
GB201708262D0 (en) * 2017-05-23 2017-07-05 Binding Site Group Ltd Assay for plasma cell associated disease
CN109085282A (zh) * 2018-06-22 2018-12-25 东南大学 Method for resolving overlapping chromatographic peaks based on wavelet transform and a random forest model
CN113008860B (zh) * 2021-04-25 2023-04-18 广东工业大学 Blood lipid classification method, system, storage medium, and computer device
CN113341345A (zh) * 2021-06-04 2021-09-03 浙江大学 Open-circuit fault diagnosis method for MMC switching devices based on feature extraction and random forest
CN114023379B (zh) * 2021-12-31 2022-05-13 浙江迪谱诊断技术有限公司 Method and device for determining genotype
CN114755357A (zh) * 2022-04-14 2022-07-15 武汉迈特维尔生物科技有限公司 Automatic integration method, system, device, and medium for chromatography-mass spectrometry

Patent Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104272113A (zh) * 2012-03-06 2015-01-07 结合点集团有限公司 Method for characterizing plasma cell associated diseases
US20160041184A1 (en) * 2013-03-15 2016-02-11 David R. Barnidge Identification and monitoring of monoclonal immunoglobulins by molecular mass
CN104598767A (zh) * 2015-01-16 2015-05-06 上海市第一人民医院 Method for identifying M protein components in immunofixation electrophoresis using a computer
US20210247402A1 (en) * 2018-05-04 2021-08-12 The Binding Site Group Ltd Identification of immunoglobulins using mass spectrometry
CN109753939A (zh) * 2019-01-11 2019-05-14 银丰基因科技有限公司 Method for recognizing HLA sequencing peak profiles
CN113811772A (zh) * 2019-05-10 2021-12-17 结合点集团有限公司 Mass spectrometry calibrator
US20220221469A1 (en) * 2019-05-10 2022-07-14 The Binding Site Group Limited Mass Spectrometry Calibrator
CN113720900A (zh) * 2021-09-14 2021-11-30 首都医科大学附属北京朝阳医院 Method for detecting M protein in serum based on MALDI-TOF MS technology
CN115684606A (zh) * 2022-10-21 2023-02-03 南方医科大学珠江医院 Method for M protein detection

Also Published As

Publication number Publication date
CN115684606B (zh) 2023-11-28
CN115684606A (zh) 2023-02-03

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 23878581

Country of ref document: EP

Kind code of ref document: A1