WO2021180182A1 - Method, device, and storage medium for classifying samples based on immune characterization technology - Google Patents

Method, device, and storage medium for classifying samples based on immune characterization technology

Info

Publication number
WO2021180182A1
Authority
WO
WIPO (PCT)
Prior art keywords
category
target
sample
infected
coronavirus
Prior art date
Application number
PCT/CN2021/080279
Other languages
English (en)
French (fr)
Inventor
王俊
李英睿
王健
郑汉城
刘兵行
沈凌浩
陶一敏
燕鸣琛
李振宇
罗瀚
宋捷
胡晓莹
Original Assignee
珠海碳云智能科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 珠海碳云智能科技有限公司
Publication of WO2021180182A1

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B40/00 ICT specially adapted for biostatistics; ICT specially adapted for bioinformatics-related machine learning or data mining, e.g. knowledge discovery or pattern finding
    • G16B40/10 Signal processing, e.g. from mass spectrometry [MS] or from PCR
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2411 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on the proximity to a decision surface, e.g. support vector machines
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G06N20/10 Machine learning using kernel methods, e.g. support vector machines [SVM]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks

Definitions

  • the embodiments of the present application relate to the field of communications, and in particular, to a method, device, storage medium, and electronic device for classifying samples based on immune characterization technology.
  • Nucleic acid detection kits for the novel coronavirus use nucleic acid testing methods to quickly and effectively confirm the diagnosis of infected patients in China.
  • The principle of nucleic acid detection is to design primers based on the gene sequence of the virus, amplify by PCR with fluorescent probe labeling added during amplification, and detect the fluorescent signal generated after amplification, which indicates whether viral nucleic acid is present in the sample.
  • Nucleic acid detection has the characteristics of high throughput, easy development, and quantification.
  • Although nucleic acid testing is the current diagnostic indicator for novel coronavirus pneumonia, multiple results indicate that it has a high false-negative rate, with a detection rate of only 30%-50%, which is related to the sampling site used for nucleic acid testing.
  • The sampling process of nucleic acid detection poses a high risk to medical staff, and nucleic acid detection can only show that viral nucleic acid is present; it cannot determine whether the virus is live. It is therefore particularly important to develop a detection method that is more specific, more sensitive, and less demanding of sampling (for example, a universal method that works as long as blood is collected). In the related art, there is no established technique for using peptides to detect whether a sample contains the novel coronavirus.
  • The embodiments of this application provide a method, device, storage medium, and electronic device for classifying samples based on immunocharacterization technology, to at least solve the problem in the related art that there is no established technique for using peptides to detect whether a sample is infected.
  • A method for classifying samples based on immunocharacterization technology includes: using the immunocharacterization technology to detect the corresponding differential response signals of the differential peptides in the target sample to be tested and the control sample, so as to obtain a second differential response signal, wherein the differential peptides are peptides, screened in advance using the immunocharacterization technology, that have a first differential response signal between a positive sample infected with the target coronavirus and the control sample.
  • The control sample includes a negative control sample and/or a sample in another state, the sample in the other state includes a sample infected by a pathogen other than the target coronavirus, and the sample is a serum sample or a plasma sample.
  • Each set of data includes: a differential response signal and the category to which the sample to be tested corresponding to that differential response signal belongs.
  • The method further includes: using the immunological characterization technology to screen out peptides that have the first differential response signal between the positive sample infected with the target coronavirus and the control sample, and determining the screened peptides as the differential peptides.
  • The method further includes: training an initial model through machine learning using the multiple sets of data to obtain the target model, wherein the target model includes a first model or a second model. The first model is used to output, for the input signal, a label identifying one of the following results: not infected by the target coronavirus; infected by the target coronavirus. The second model is used to output, for the input signal, a label identifying one of the following results: not infected by the target coronavirus, with the non-infection category being the first category; not infected by the target coronavirus, with the non-infection category being the second category; infected by the target coronavirus, with the infection category being the third category; infected by the target coronavirus, with the infection category being the fourth category; infected by the target coronavirus, with the infection category being the fifth category; infected by the target coronavirus, with the infection category being the sixth category.
  • The method further includes: after the category to which the target sample to be tested belongs has been output and determined to be not infected by the target coronavirus, using a third model to analyze the second differential response signal to determine the non-infection category to which the target sample to be tested belongs, wherein the third model is trained through machine learning using multiple sets of data, and each set of data includes: a differential response signal and the non-infection category to which the sample to be tested corresponding to that differential response signal belongs.
  • The non-infection category includes one of the following: not infected by the target coronavirus, with the non-infection category being the first category; not infected by the target coronavirus, with the non-infection category being the second category. The non-infection category to which the target sample to be tested belongs is then output.
  • The method further includes: when the output category to which the target sample to be tested belongs is determined to be infected by the target coronavirus, using a fourth model to analyze the second differential response signal to determine the infection category to which the target sample to be tested belongs, wherein the fourth model is trained through machine learning using multiple sets of data, and each set of data includes: a differential response signal and the infection category to which the sample to be tested corresponding to that differential response signal belongs.
  • The infection category includes one of the following: infected by the target coronavirus, with the infection category being the third category; infected by the target coronavirus, with the infection category being the fourth category; infected by the target coronavirus, with the infection category being the fifth category; infected by the target coronavirus, with the infection category being the sixth category.
  • The target model includes a first linear-kernel support vector machine (SVM).
  • The third model includes a second linear-kernel support vector machine (SVM).
  • The fourth model includes a third linear-kernel support vector machine (SVM).
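The staged use of the first, third, and fourth models can be sketched as a small cascade. This is only an illustrative sketch: the three stand-in models below are hypothetical threshold rules on the mean signal, not trained SVMs, and the names `make_cascade`, `first_model`, `third_model`, and `fourth_model`, as well as all thresholds and signal values, are assumptions introduced for the example.

```python
from typing import Callable, Sequence

Signal = Sequence[float]
Classifier = Callable[[Signal], str]


def make_cascade(first: Classifier, third: Classifier, fourth: Classifier) -> Classifier:
    """Route a signal through the binary model, then the matching refinement model."""
    def classify(signal: Signal) -> str:
        if first(signal) == "infected":
            return fourth(signal)   # refine into third..sixth category
        return third(signal)        # refine into first/second category
    return classify


def mean(signal: Signal) -> float:
    return sum(signal) / len(signal)


# Hypothetical stand-ins for trained models (thresholds are made up).
def first_model(signal: Signal) -> str:
    return "infected" if mean(signal) >= 5.0 else "not infected"


def third_model(signal: Signal) -> str:
    return "first category" if mean(signal) < 2.0 else "second category"


def fourth_model(signal: Signal) -> str:
    m = mean(signal)
    if m < 6.0:
        return "third category"
    if m < 7.0:
        return "fourth category"
    if m < 8.0:
        return "fifth category"
    return "sixth category"


classify = make_cascade(first_model, third_model, fourth_model)
results = [classify(s) for s in ([1.0, 1.0], [3.0, 4.0], [5.0, 6.0], [9.0, 10.0])]
```

In practice each stand-in would be replaced by a model trained on labeled differential response signals; the cascade structure itself does not depend on how those models are built.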
  • An apparatus for classifying samples based on immunocharacterization technology includes: a detection module configured to use the immunocharacterization technology to detect the corresponding differential response signals of the differential peptides in the target sample to be tested and the control sample, so as to obtain a second differential response signal, wherein the differential peptides are peptides, screened in advance using the immunocharacterization technology, that have a first differential response signal between a positive sample infected with the target coronavirus and the control sample;
  • the control sample includes a negative control sample and/or a sample in another state, the sample in the other state includes a sample infected by a pathogen other than the target coronavirus, and the sample is a serum sample or a plasma sample;
  • a first analysis module configured to use a target model to analyze the second differential response signal to determine the category to which the target sample to be tested belongs, wherein the target model is trained through machine learning using multiple sets of data, and each set of data includes: a differential response signal and the category to which the sample to be tested corresponding to that differential response signal belongs;
  • a first output module configured to output the category to which the target sample to be tested belongs.
  • The device further includes: a screening module configured to, before the immunocharacterization technology is used to detect the corresponding differential response signals of the differential peptides in the target sample to be tested and the control sample to obtain the second differential response signal, use the immunocharacterization technology to screen out peptides that have the first differential response signal between the positive sample infected with the target coronavirus and the control sample, and determine the screened peptides as the differential peptides.
  • The device further includes: a training module configured to, before the target model is used to analyze the second differential response signal to determine the symptom category to which the target sample to be tested belongs, train an initial model through machine learning using the multiple sets of data to obtain the target model, wherein the target model includes a first model or a second model. The first model is used to output, for the input signal, a label identifying one of the following results: not infected by the target coronavirus; infected by the target coronavirus. The second model is used to output, for the input signal, a label identifying one of the following results: not infected by the target coronavirus, with the non-infection category being the first category; not infected by the target coronavirus, with the non-infection category being the second category; infected by the target coronavirus, with the infection category being the third category; infected by the target coronavirus, with the infection category being the fourth category; infected by the target coronavirus, with the infection category being the fifth category; infected by the target coronavirus, with the infection category being the sixth category, where the degree of infection corresponding to the third category, the fourth category, the fifth category, and the sixth category increases in turn.
  • The device further includes: a second analysis module configured to, in a case where the target model includes the first model, after the category to which the target sample to be tested belongs has been output and determined to be not infected by the target coronavirus, use a third model to analyze the second differential response signal to determine the non-infection category to which the target sample to be tested belongs, wherein the third model is trained through machine learning using multiple sets of data, and each set of data includes: a differential response signal and the non-infection category to which the sample to be tested corresponding to that differential response signal belongs.
  • The non-infection category includes one of the following: not infected by the target coronavirus, with the non-infection category being the first category; not infected by the target coronavirus, with the non-infection category being the second category. A second output module is configured to output the non-infection category to which the target sample to be tested belongs.
  • The device further includes: a third analysis module configured to, in a case where the target model includes the second model, after the symptom category to which the target sample to be tested belongs has been output and determined to be infected by the target coronavirus, use a fourth model to analyze the second differential response signal to determine the infection category to which the target sample to be tested belongs, wherein the fourth model is trained through machine learning using multiple sets of data, and each set of data includes: a differential response signal and the infection category to which the sample to be tested corresponding to that differential response signal belongs.
  • The infection category includes one of the following: infected by the target coronavirus, with the infection category being the third category; infected by the target coronavirus, with the infection category being the fourth category; infected by the target coronavirus, with the infection category being the fifth category; infected by the target coronavirus, with the infection category being the sixth category.
  • A detection method for coronavirus infection includes: using immunocharacterization technology to screen out peptides that have a first differential response signal between a positive sample infected with the target coronavirus and a control sample, the screened peptides being recorded as differential peptides, wherein the sample is a serum sample or a plasma sample; taking the first differential response signals of the differential peptides as features, constructing a classification model for the positive sample and the control sample using a support vector machine method to obtain a sample classification model; using the immunocharacterization technology to detect the corresponding differential response signals of the differential peptides in the sample to be tested and the control sample, recorded as the second differential response signal;
  • inputting the second differential response signal into the sample classification model for classification, thereby obtaining the symptom category of the sample to be tested.
  • The control sample includes a negative control sample and control samples of other lung diseases.
  • The other lung diseases are lung diseases not caused by the target coronavirus.
  • A detection device for coronavirus infection includes: a differential peptide screening module configured to use immunocharacterization technology to screen out peptides that have a first differential response signal between a positive sample infected with the target coronavirus and a control sample, the screened peptides being recorded as differential peptides, wherein the sample is a serum sample or a plasma sample;
  • a model building module configured to take the first differential response signals of the differential peptides as features and use a support vector machine method to construct a classification model for the positive sample and the control sample, so as to obtain a sample classification model;
  • a response signal detection module configured to use the immunocharacterization technology to detect the corresponding differential response signals of the differential peptides in the sample to be tested and the control sample, recorded as the second differential response signal;
  • a classification detection module configured to input the second differential response signal into the sample classification model for classification, thereby obtaining the symptom category of the sample to be tested.
  • The control sample includes a negative control sample and control samples of other lung diseases.
  • A computer-readable storage medium in which a computer program is stored, wherein the computer program is configured to execute, when run, the steps in any of the above method embodiments.
  • An electronic device including a memory and a processor, wherein a computer program is stored in the memory, and the processor is configured to run the computer program to execute the steps in any of the above method embodiments.
  • The corresponding differential response signals of the differential peptides in the target sample to be tested and the control sample can be detected based on the neural network, and the category to which the sample belongs can then be determined. This solves the problem in the related art that there is no established technique for using peptides to detect whether a sample is infected, provides such a technique, and improves the accuracy with which the category of a sample is detected.
  • FIG. 1 is a hardware structural block diagram of a mobile terminal based on a method for classifying samples based on immune characterization technology according to an embodiment of the present application;
  • FIG. 2 is a flowchart of a method for classifying samples based on immune characterization technology according to an embodiment of the present application
  • LEO Leave-one-out
  • Fig. 5 is a first diagram of verification results according to an embodiment of the present application.
  • Fig. 6 is a second diagram of verification results according to an embodiment of the present application.
  • Fig. 7 is a third diagram of verification results according to an embodiment of the present application.
  • Fig. 8 is a diagram of an apparatus for classifying samples based on immune characterization technology according to an embodiment of the present application.
  • ImmuneSignature technology: an immunological characterization technology in which a chip of high-density random peptides (for example, 130,000 peptides) is bound to antibodies in the blood; after incubation with a fluorescently labeled secondary antibody, the fluorescence value is read in a microplate reader to reflect the antibodies in the blood. This method can identify antibodies that are differentially expressed between different individuals.
  • Polypeptide: in this application, refers to any peptide that is predicted or screened to specifically bind to an antibody.
  • Antigen: refers to any substance that can induce an immune response in the body. That is, a substance that can be specifically bound by antigen receptors (TCR/BCR) on the surface of T/B lymphocytes, activate the T/B cells so that they proliferate and differentiate and produce immune response products (sensitized lymphocytes or antibodies), and bind specifically to those products in vivo and in vitro. Antigens therefore have two important characteristics: immunogenicity and immunoreactivity.
  • The antigen in this application refers to a complete, immunogenic antigen formed by coupling a polypeptide hapten to a carrier protein. It can be a polypeptide-carrier protein conjugate formed by coupling a polypeptide of a single amino acid sequence to a carrier protein, or a polypeptide-carrier protein conjugate composition formed by coupling polypeptides of multiple different amino acid sequences to a carrier protein.
  • ROC curve: a curve reflecting the relationship between sensitivity and specificity.
  • The abscissa (X-axis) is 1 - specificity, also called the false positive rate.
  • AUC: Area Under the Curve.
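As a worked illustration of these two definitions, the sketch below builds ROC points (false positive rate against sensitivity) from classifier scores and integrates them with the trapezoidal rule. The labels and scores are made-up values, the function names are assumptions, and the code assumes both classes are present in the labels.

```python
from typing import List, Tuple


def roc_points(labels: List[int], scores: List[float]) -> List[Tuple[float, float]]:
    """Return (false positive rate, sensitivity) pairs, one per distinct score threshold."""
    positives = sum(labels)
    negatives = len(labels) - positives
    # Sort by descending score so each step lowers the decision threshold.
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Consume every sample tied at this score before emitting a point.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / negatives, tp / positives))
    return points


def auc(points: List[Tuple[float, float]]) -> float:
    """Area under the ROC curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


labels = [1, 1, 0, 1, 0, 0]               # 1 = positive sample, 0 = control (made up)
scores = [0.9, 0.8, 0.7, 0.6, 0.3, 0.1]   # hypothetical classifier scores
curve = roc_points(labels, scores)
area = auc(curve)                          # 8/9 for this toy data
```

An AUC of 1.0 corresponds to perfect separation of positives and controls, and 0.5 to a classifier no better than chance.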
  • FIG. 1 is a hardware structural block diagram of a mobile terminal based on a method for classifying samples based on immune characterization technology in an embodiment of the present application.
  • the mobile terminal may include one or more (only one is shown in FIG. 1) processor 102 (the processor 102 may include, but is not limited to, a processing device such as a microprocessor MCU or a programmable logic device FPGA)
  • the memory 104 configured to store data, wherein the above-mentioned mobile terminal may also include a transmission device 106 and an input/output device 108 for communication functions.
  • FIG. 1 is only for illustration, and does not limit the structure of the above-mentioned mobile terminal.
  • The mobile terminal may also include more or fewer components than shown in FIG. 1, or have a configuration different from that shown in FIG. 1.
  • The memory 104 may be configured to store computer programs, for example, software programs and modules of application software, such as the computer program corresponding to the method for classifying samples based on immune characterization technology in the embodiment of the present application. The processor 102 runs the computer program stored in the memory 104 to perform various functional applications and data processing, that is, to implement the above-mentioned method.
  • the memory 104 may include a high-speed random access memory, and may also include a non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory.
  • the memory 104 may further include a memory remotely provided with respect to the processor 102, and these remote memories may be connected to the mobile terminal through a network. Examples of the aforementioned networks include, but are not limited to, the Internet, corporate intranets, local area networks, mobile communication networks, and combinations thereof.
  • the transmission device 106 is configured to receive or transmit data via a network.
  • the above-mentioned specific examples of the network may include a wireless network provided by a communication provider of a mobile terminal.
  • the transmission device 106 includes a network adapter (Network Interface Controller, NIC for short), which can be connected to other network devices through a base station to communicate with the Internet.
  • the transmission device 106 may be a radio frequency (Radio Frequency, referred to as RF) module, which is configured to communicate with the Internet in a wireless manner.
  • a method for classifying samples based on immune characterization technology is provided.
  • This embodiment uses antibody feature data that are differentially expressed across a large number of samples (serum samples or plasma samples) from healthy people, patients with other lung diseases, and patients with novel coronavirus pneumonia, and establishes, based on artificial intelligence methods, a classification model for determining the category of a sample. The sensitivity and specificity of the classification model were then tested and verified with known samples. The results show that the classification model has high classification accuracy, and the sensitivity and specificity data obtained when the classification model is used to classify the object to be tested show that the method can effectively and accurately determine the category of a sample.
  • Fig. 2 is a flowchart of a method for classifying samples based on immune characterization technology according to an embodiment of the present application. As shown in Fig. 2, the process includes the following steps:
  • S202: Using the immunocharacterization technology, detect the corresponding differential response signals of the differential peptides in the target sample to be tested and the control sample to obtain a second differential response signal, wherein the differential peptides are peptides, screened in advance using the immunocharacterization technology, that have a first differential response signal between a positive sample infected with the target coronavirus and the control sample.
  • The control sample includes a negative control sample and/or samples in other states, the samples in other states include a sample infected with a pathogen other than the target coronavirus, and the sample is a serum sample or a plasma sample;
  • S204: Use a target model to analyze the second differential response signal to determine the category to which the target sample to be tested belongs, where the target model is trained through machine learning using multiple sets of data, and each set of data includes: a differential response signal and the category to which the sample to be tested corresponding to that differential response signal belongs;
  • The category to which the sample to be tested belongs can be obtained by the above method, where different categories can indicate whether the sample to be tested is infected by the target coronavirus, and/or the specific degree of non-infection, and/or the specific degree of infection.
  • The above-mentioned other pathogens include viruses or bacteria that cause other lung diseases, and other lung diseases refer to lung diseases not caused by infection with the target coronavirus.
  • When the target sample to be tested is a serum sample, the control sample is also a serum sample; when the target sample to be tested is a plasma sample, the control sample is also a plasma sample. That is, the type of the target sample to be tested and the type of the control sample correspond one to one.
  • The corresponding differential response signals of the differential peptides in the target sample to be tested and the control sample can be detected based on the machine learning model, and the category to which the sample belongs can then be determined. This solves the problem in the related art that there is no established technique for using peptides to detect whether a sample is infected, provides such a technique, and improves the accuracy with which the category of a sample is detected.
  • The corresponding differentially expressed antibody features can be screened separately and then used for training, so that classification and screening models for different categories can be obtained and the different categories of samples can be confirmed accurately.
  • The samples consisted of 3 groups, namely healthy (denoted as H) plasma samples, other lung disease (denoted as T, mainly tuberculosis) plasma samples, and novel coronavirus pneumonia (denoted as F) plasma samples.
  • The number of samples mentioned above is only an example. In practical applications, other numbers of samples can be used, for example, 200 samples or 500 samples, and the larger the number of samples, the more accurate the final confirmation result.
  • The method further includes: using the immunological characterization technology to screen out peptides that have the first differential response signal between the positive sample infected with the target coronavirus and the control sample, and determining the screened peptides as the differential peptides.
  • The first step is to compare F and H to screen out the HT polypeptide features that are significantly elevated in plasma samples infected by pathogens that can cause lung diseases. Such features correspond to the rise in antibody concentration caused by infection of the plasma sample by a pathogen, but the antibodies found are not necessarily specific to the novel coronavirus (corresponding to the above-mentioned target coronavirus); they may also result from infection by other pathogens that cause lung disease, or from other factors.
  • The second step is to compare F and T to find antibodies that are specific to the novel coronavirus relative to other lung diseases.
  • Comparing novel coronavirus infection with other lung diseases, however, may easily pick up some non-specific HT peptides by mistake. Therefore, to obtain the novel-coronavirus-specific peptides more accurately, the characteristic peptides found in the first and second steps are intersected, which yields novel-coronavirus-specific peptides with high accuracy.
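The two comparisons and the final intersection can be sketched as follows. The source does not name the statistical criterion for "significantly increased", so the mean-gap rule, the `min_gap` threshold, the peptide identifiers, and all signal values below are assumptions made purely for illustration.

```python
from statistics import mean
from typing import Dict, List, Set

Signals = Dict[str, List[float]]  # peptide id -> fluorescence values, one per sample


def elevated_peptides(case: Signals, control: Signals, min_gap: float) -> Set[str]:
    """Peptides whose mean signal in `case` exceeds the mean in `control` by at least `min_gap`.

    The mean-gap rule is a stand-in assumption, not the source's statistical test.
    """
    return {
        peptide
        for peptide in case
        if peptide in control
        and mean(case[peptide]) - mean(control[peptide]) >= min_gap
    }


# Made-up signals for four hypothetical peptides in groups F, H and T.
F = {"p1": [9.0, 8.0], "p2": [7.0, 9.0], "p3": [8.0, 8.0], "p4": [2.0, 2.0]}
H = {"p1": [1.0, 2.0], "p2": [2.0, 2.0], "p3": [7.5, 8.0], "p4": [2.0, 1.0]}
T = {"p1": [2.0, 1.0], "p2": [8.0, 8.5], "p3": [1.0, 2.0], "p4": [2.0, 2.0]}

step_one = elevated_peptides(F, H, min_gap=3.0)  # first step: F versus H
step_two = elevated_peptides(F, T, min_gap=3.0)  # second step: F versus T
differential_peptides = step_one & step_two      # keep only peptides found in both steps
```

In this toy data, `p2` is elevated against healthy samples but not against other lung diseases, and `p3` is the reverse, so only `p1` survives the intersection.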
  • The method further includes: training an initial model through machine learning using the multiple sets of data to obtain the target model, wherein the target model includes a first model or a second model. The first model is used to output, for the input signal, a label identifying one of the following results: not infected by the target coronavirus; infected by the target coronavirus. The second model is used to output, for the input signal, a label identifying one of the following results: not infected by the target coronavirus, with the non-infection category being the first category; not infected by the target coronavirus, with the non-infection category being the second category; infected by the target coronavirus, with the infection category being the third category; infected by the target coronavirus, with the infection category being the fourth category; infected by the target coronavirus, with the infection category being the fifth category; infected by the target coronavirus, with the infection category being the sixth category.
  • The data contains three categories: data of uninfected plasma samples (denoted as H), data of plasma samples of other lung diseases (mainly tuberculosis, denoted as T), and data of plasma samples infected by the novel coronavirus (denoted as F).
  • A support vector machine classifier is used to construct the classification model.
  • The kernel function of the model is a linear kernel.
  • The class weights of the loss function are inversely proportional to the number of samples of each class in the training set (F is category 1, non-F is category 0). It should be noted that the samples in this embodiment come from Shenzhen Third People's Hospital.
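A minimal sketch of the label encoding and of class weights that are inversely proportional to class size is shown below. The exact normalization is not given in the source; the formula `n_samples / (n_classes * class_count)` is an assumption (it matches the common "balanced" heuristic), and the group list is made up.

```python
from collections import Counter
from typing import Dict, List


def encode_labels(groups: List[str]) -> List[int]:
    """Map group codes to binary labels: F -> 1, non-F (H or T) -> 0."""
    return [1 if group == "F" else 0 for group in groups]


def inverse_frequency_weights(labels: List[int]) -> Dict[int, float]:
    """Weight each class by n_samples / (n_classes * class_count) (assumed normalization)."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


groups = ["H", "H", "H", "T", "T", "T", "F", "F"]  # hypothetical training set
labels = encode_labels(groups)
weights = inverse_frequency_weights(labels)        # rarer class F gets the larger weight
```

With six non-F samples and two F samples, class 1 receives three times the weight of class 0, so errors on the smaller F group are penalized more heavily.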
  • a better way is to construct a model (that is, the above-mentioned target model) to predict whether the sample is infected by the new coronavirus through the input data characteristics.
  • linear kernel support vector machines are used; of course, it is also feasible to choose other models, such as neural network models.
  • the embodiment of this application takes a linear kernel SVM as an example for classification; the error penalty weight used is 1.0, and the class weight of the loss function is inversely proportional to the number of samples of each class in the training set.
  • the classification granularity of the above-mentioned trained model is adjustable.
  • the classification granularity of the above-mentioned trained model can be adjusted to only determine whether the sample is not infected by the target coronavirus or has been infected.
  • the classification granularity of the above-mentioned trained model can also be adjusted more finely.
  • the classification granularity of the above-mentioned trained model can be further refined to distinguish the following categories: not infected by the target coronavirus with the not-infected category being the first category; not infected by the target coronavirus with the not-infected category being the second category; infected by the target coronavirus with the infection category being the third, fourth, fifth or sixth category, where the degree of infection corresponding to the third, fourth, fifth and sixth categories increases in turn (it should be noted that the above division into the third, fourth, fifth and sixth categories is only one optional division method).
  • the above-mentioned first category may be a category that is not currently infected by the new coronavirus but has antibodies to the new coronavirus (that is, the source of the sample was infected with the new coronavirus before), and the above-mentioned second category may be the category that has never been infected by the new coronavirus.
  • if the classification granularity of the above-trained model only distinguishes whether the sample is not infected or has been infected by the target coronavirus, and a more detailed category needs to be determined, other models can be introduced for discrimination, such as:
  • the method further includes: when the output category of the target sample to be tested is determined to be not infected by the target coronavirus, using a third model to analyze the second differential response signal to determine the not-infected category to which the target sample to be tested belongs, wherein the third model is trained by machine learning using multiple sets of data.
  • each set of data in the multiple sets of data includes: a differential response signal and the not-infected category to which the corresponding sample to be tested belongs; the category of not being infected by the target coronavirus includes one of the following: not infected by the target coronavirus with the not-infected category being the first category, not infected by the target coronavirus with the not-infected category being the second category; and the not-infected category of the target sample to be tested is output.
  • the method further includes: when the output symptom category of the target sample to be tested is determined to be infected by the target coronavirus, using a fourth model to analyze the second differential response signal to determine the infected category to which the target sample to be tested belongs, wherein the fourth model is trained by machine learning using multiple sets of data, and each set of data includes: a differential response signal and the infected category to which the corresponding sample to be tested belongs; the category of having been infected by the target coronavirus includes one of the following: infected by the target coronavirus with the infection category being the third category, the fourth category, the fifth category, or the sixth category.
  • the above-mentioned third model may also be a linear kernel SVM
  • the above-mentioned fourth model may also be a linear kernel SVM
  • the above model type is only an exemplary description.
  • Other types of models can be trained to obtain the above-mentioned third model and/or fourth model.
  • one of the data set 1 and data set 2 is used as the training set, and the other data set is used as the test set for performance testing.
  • after the model is trained on the feature data of the above 864 differential peptides, it can be used to predict whether new differential-peptide feature data correspond to a sample infected by the new coronavirus.
  • the specific method of use is: for a new sample, detect the response signal values of the same 864 differential peptides, apply the necessary pre-processing and correction, input the feature data of the selected 864 peptides into the model, and judge whether the sample is infected by the new coronavirus according to the prediction result output by the model.
  • sample data from Hefei and Wuhan were mainly collected.
  • the samples were divided into four categories according to the severity of the new coronavirus infection, namely the suspected type, the mild type (Mild), the ordinary type (Regular) and the severe type (Severe); see Table 1 for details (the Wuhan data may contain a large number of false-positive diagnoses, so an N protein test was carried out subsequently; here 21* is the data with the positive N protein removed):
  • this verification method can verify the reliability and universality of the modeling method under different data and class ratios.
  • this verification uses the leave-one-out method to verify model performance and evaluates model performance based on AUC.
  • the model is used to detect the sample data of Hefei and Wuhan respectively, and the results obtained are shown in Figure 5 and Figure 6. For the Hefei data, the detection model constructed by this method is used with a model prediction threshold of 0.5.
  • the method according to the above embodiment can be implemented by means of software plus the necessary general hardware platform; of course, it can also be implemented by hardware, but in many cases the former is the better implementation.
  • the technical solutions of the embodiments of the present application, in essence or in the part that contributes over the prior art, can be embodied in the form of a software product.
  • the computer software product is stored in a storage medium (such as ROM/RAM, a magnetic disk or an optical disk) and includes several instructions that make a computing device or a processor execute the method described in each embodiment of the present application.
  • This embodiment also provides a device for classifying samples based on immune characterization technology. As shown in FIG. 8, the device includes:
  • the detection module 82 is configured to use immune characterization technology to detect the differential response signals of the differential peptides in the target sample to be tested and in the control sample, to obtain a second differential response signal, wherein the differential peptides are peptides, screened in advance using the immune characterization technology, that show a first differential response signal between positive samples of target coronavirus infection and the control sample; the control sample includes a negative control sample and/or samples in other states; the samples in other states include samples infected by pathogens other than the target coronavirus; and the samples are serum samples or plasma samples;
  • the first analysis module 84 is configured to analyze the second differential response signal using a target model to determine the category to which the target sample to be tested belongs, wherein the target model is trained by machine learning using multiple sets of data, and each of the multiple sets of data includes: a differential response signal and the category to which the corresponding sample to be tested belongs;
  • the first output module 86 is configured to output the category to which the target sample to be tested belongs.
  • the device further includes: a screening module configured to, before immune characterization technology is used to detect the differential response signals of the differential peptides in the target sample to be tested and the control sample to obtain the second differential response signal, use the immune characterization technology to screen out the peptides that show the first differential response signal between positive samples of target coronavirus infection and the control sample, and determine the screened peptides as the differential peptides.
  • the device further includes: a training module configured to, before the second differential response signal is analyzed using a target model to determine the symptom category to which the target sample to be tested belongs, use the multiple sets of data to train an initial model by machine learning to obtain the target model, wherein the target model includes a first model or a second model; the first model is used to output, for an input signal, a label identifying one of the following results: not infected by the target coronavirus, infected by the target coronavirus; the second model is used to output, for an input signal, a label identifying one of the following results: not infected by the target coronavirus with the not-infected category being the first category; not infected by the target coronavirus with the not-infected category being the second category; infected by the target coronavirus with the infection category being the third, fourth, fifth or sixth category, where the degree of infection corresponding to the third, fourth, fifth and sixth categories increases in turn.
  • the device further includes: a second analysis module configured to, when the target model includes the first model, after the category to which the target sample to be tested belongs has been output and that category is determined to be not infected by the target coronavirus, use a third model to analyze the second differential response signal and determine the not-infected category to which the target sample to be tested belongs, where the third model is trained by machine learning using multiple sets of data, and each set of data includes: a differential response signal and the not-infected category to which the corresponding sample to be tested belongs; the category of not being infected by the target coronavirus includes one of the following: not infected by the target coronavirus with the not-infected category being the first category, not infected by the target coronavirus with the not-infected category being the second category; and a second output module configured to output the not-infected category to which the target sample to be tested belongs.
  • the device further includes: a third analysis module configured to, when the target model includes the second model, after the symptom category to which the target sample to be tested belongs has been output and that category is determined to be infected by the target coronavirus, use a fourth model to analyze the second differential response signal and determine the infected category to which the target sample to be tested belongs, wherein the fourth model is trained by machine learning using multiple sets of data, and each set of data includes: a differential response signal and the infected category to which the corresponding sample to be tested belongs; the category of having been infected by the target coronavirus includes one of the following: infected by the target coronavirus with the infection category being the third category, the fourth category, the fifth category, or the sixth category.
  • the target model includes a first linear kernel support vector machine SVM.
  • the third model includes a second linear kernel support vector machine SVM.
  • the fourth model includes a third linear kernel support vector machine SVM.
  • a method for detecting coronavirus infection includes: using immune characterization technology to screen out peptides that show a first differential response signal between positive samples of target coronavirus infection and control samples, recorded as differential peptides, the samples being serum samples or plasma samples; taking the first differential response signals of the differential peptides as features and using the support vector machine method to construct a classification model for the positive samples and the control samples, to obtain a sample classification model; using immune characterization technology to detect the differential response signals of the differential peptides in the sample to be tested and the control sample, recorded as the second differential response signal; and inputting the second differential response signal into the sample classification model for classification, thereby obtaining the symptom category of the sample to be tested.
  • the control samples include negative control samples and control samples of other lung diseases.
  • Other lung diseases refer to lung diseases caused by non-target coronavirus infections.
  • the target coronavirus is SARS-CoV-2.
  • using immune characterization technology to screen out the peptides that show the first differential response signal between positive samples of target coronavirus infection and control samples, recorded as differential peptides, includes: selecting positive samples of target coronavirus infection, negative control samples and control samples of other lung diseases.
  • other lung diseases refer to lung diseases caused by viral infections other than the target coronavirus; immune characterization technology is used to bind the positive samples, negative control samples and other lung disease control samples to a peptide array chip, to obtain the response signal values of the bound peptides; for each bound peptide, the p-value for a difference between the signal values of the positive samples and the negative control samples is calculated and recorded as the first p-value, and the p-value for a difference between the signal values of the positive samples and the other lung disease control samples is calculated and recorded as the second p-value; all bound peptides whose first p-value and second p-value both satisfy the third threshold are retained, so as to obtain the differential peptides; preferably the third threshold is 0.05.
  • a log10 conversion is performed on the signal values of the bound peptides, and the converted log values are used as features; the p-value of each feature for a difference between the positive samples and the negative control samples is calculated through a one-tailed t-test and corrected for multiple hypothesis testing to obtain the first p-value; at the same time, the p-value of the corresponding feature for a difference between the positive samples and the other lung disease control samples is calculated, corrected for multiple hypothesis testing, and recorded as the second p-value; the bound peptides whose first p-value and second p-value are both less than the third threshold are selected, so as to obtain the differential peptides.
  • a detection device for coronavirus infection includes: a differential peptide screening module configured to use immune characterization technology to screen out the peptides that show a first differential response signal between positive samples of target coronavirus infection and control samples, recorded as differential peptides, the samples being serum samples or plasma samples;
  • a model building module configured to take the first differential response signals of the differential peptides as features and use the support vector machine method to construct a classification model for the positive samples and the control samples, to obtain a sample classification model;
  • a response signal detection module configured to use immune characterization technology to detect the differential response signals of the differential peptides in the sample to be tested and the control sample, recorded as the second differential response signal;
  • a classification detection module configured to input the second differential response signal into the sample classification model for classification, thereby obtaining the symptom category of the sample to be tested; wherein the control samples include negative control samples and other lung disease samples, other lung diseases refer to lung diseases caused by infections other than the target coronavirus, and the target coronavirus is preferably SARS-CoV-2.
  • the differential peptide screening module includes: a sample selection unit configured to select positive samples of target coronavirus infection, negative control samples and other lung disease control samples, where other lung diseases are lung diseases caused by virus infections other than the target coronavirus;
  • a signal acquisition unit configured to use immune characterization technology to bind the positive samples, negative control samples and other lung disease control samples to the peptide array chip, to obtain the response signal values of the bound peptides;
  • a differential peptide screening unit configured to calculate, for each bound peptide, the p-value for a difference between the signal values of the positive samples and the negative control samples, recorded as the first p-value, and the p-value for a difference between the signal values of the positive samples and the other lung disease control samples, recorded as the second p-value, and to retain all bound peptides whose first p-value and second p-value both satisfy the third threshold, so as to obtain the differential peptides; preferably the third threshold is 0.05.
  • the differential peptide screening unit includes: a signal conversion subunit configured to perform log10 conversion on the signal values of the bound peptides; and a subunit configured to take the converted log values as features and, through a one-tailed t-test, calculate the p-value of each feature for a difference between the positive samples and the negative control samples and correct it for multiple hypothesis testing to obtain the first p-value, and at the same time calculate the p-value of the corresponding feature for a difference between the positive samples and the other lung disease control samples and correct it for multiple hypothesis testing, recorded as the second p-value; the bound peptides whose first p-value and second p-value are both less than the third threshold are selected, so as to obtain the differential peptides.
  • the data processing part of the technical solution of the present application can be embodied in the form of a software product; the computer software product can be stored in a storage medium, such as ROM/RAM, a magnetic disk or an optical disk, and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, etc.) to execute the methods of the various embodiments of the application or of some parts of the embodiments.
  • This application can be used in many general-purpose or special-purpose computing system environments or configurations, for example: personal computers, server computers, handheld or portable devices, tablet devices, multi-processor systems, microprocessor-based systems, set-top boxes, programmable consumer electronic devices, network PCs, minicomputers, mainframe computers, and distributed computing environments that include any of the above systems or devices.
  • the embodiment of the present application also provides a computer-readable storage medium in which a computer program is stored, wherein the computer program is configured to execute the steps in any one of the foregoing method embodiments when running.
  • the foregoing computer-readable storage medium may include, but is not limited to: a USB flash drive, Read-Only Memory (ROM), Random Access Memory (RAM), a removable hard disk, a magnetic disk, an optical disk, and other media that can store computer programs.
  • An embodiment of the present application also provides an electronic device, including a memory and a processor, the memory stores a computer program, and the processor is configured to run the computer program to execute the steps in any one of the foregoing method embodiments.
  • the aforementioned electronic device may further include a transmission device and an input-output device, wherein the transmission device is connected to the aforementioned processor, and the input-output device is connected to the aforementioned processor.
  • the modules or steps in the above embodiments of the present application can be implemented by a general-purpose computing device; they can be concentrated on a single computing device or distributed over a network composed of multiple computing devices; they can be implemented as program code executable by a computing device, so that they can be stored in a storage device and executed by the computing device; and in some cases the steps shown or described can be executed in a different order from the one given here.
  • alternatively, the modules or steps can each be made into individual integrated circuit modules, or several of them can be made into a single integrated circuit module. In this way, the embodiments of the present application are not limited to any specific combination of hardware and software.
  • the method, device, and storage medium for classifying samples based on immune characterization technology provided by the embodiments of the present application have the following beneficial effects: they solve the problem in the related art that there is no explicit technique for detecting, on the basis of peptides, whether a sample is infected, and thereby realize peptide-based detection of whether a sample is infected and improve the accuracy with which the category of a sample is detected.
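The excerpts above state that the class weight of the loss function is the inverse ratio of the class counts in the training set, with F as class 1 and non-F as class 0. A minimal sketch of one way to compute such weights follows. The function name `inverse_frequency_weights` and the scaling (total samples divided by number of classes times class count, which is also the formula scikit-learn uses for `class_weight="balanced"`) are assumptions; the source does not name a library or a normalisation. The counts are illustrative and follow the 5:5:70 split that the description gives for data set 1.

```python
from collections import Counter


def inverse_frequency_weights(labels):
    """Return one weight per class, inversely proportional to its count.

    Scaling is an assumption: n_samples / (n_classes * class_count),
    so a perfectly balanced training set gives every class weight 1.0.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * cnt) for cls, cnt in counts.items()}


# Illustrative labels: 70 F samples (class 1) and 10 non-F samples (class 0).
labels = [1] * 70 + [0] * 10
weights = inverse_frequency_weights(labels)
print(weights)
```

With these counts the minority (non-F) class is weighted seven times as heavily as the F class, so misclassifying a control sample costs as much as misclassifying seven positives.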

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Medical Informatics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • General Health & Medical Sciences (AREA)
  • Biophysics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Databases & Information Systems (AREA)
  • Epidemiology (AREA)
  • Public Health (AREA)
  • Biotechnology (AREA)
  • Bioethics (AREA)
  • Signal Processing (AREA)
  • Biomedical Technology (AREA)
  • Computational Linguistics (AREA)
  • Investigating Or Analysing Biological Materials (AREA)

Abstract

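The embodiments of this application provide a method, device, storage medium and electronic device for classifying samples based on immune characterization technology. The method includes: using immune characterization technology to detect the differential response signals of differential peptides in a target sample to be tested and in a control sample, so as to obtain a second differential response signal; analyzing the second differential response signal with a target model to determine the category to which the target sample to be tested belongs; and outputting the category to which the target sample to be tested belongs. The embodiments solve the problem in the related art that there is no explicit technique for detecting, on the basis of peptides, whether a sample is infected, and thereby realize such peptide-based detection and improve the accuracy with which the category of a sample is detected.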

Description

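Method, device and storage medium for classifying samples based on immune characterization technology
Technical Field
The embodiments of this application relate to the field of communications, and in particular to a method, device, storage medium and electronic device for classifying samples based on immune characterization technology.
Background
Because COVID-19 patients present mainly with fever, dry cough and fatigue, it is difficult to distinguish the disease from other acquired pneumonias on clinical presentation and chest imaging alone. Subsequently, a number of testing institutions in China developed nucleic acid test kits for the new coronavirus, and nucleic acid testing was used to diagnose infected patients in China quickly and effectively. Nucleic acid testing works by designing primers from the gene sequence of the virus, amplifying by PCR with fluorescent-probe labelling added during amplification, and detecting the fluorescence signal produced after amplification, which indicates whether viral nucleic acid is present in the sample. Nucleic acid testing is high-throughput, easy to develop and quantifiable. Although it is currently the confirmatory indicator for COVID-19, as testing has been rolled out a number of results have shown that it has a high false-negative rate, with a detection rate of only 30%-50%, which is related to its demanding requirements on the sampling site. In addition, the sampling procedure poses a high risk to medical staff, and the test can only detect the presence of viral nucleic acid; it cannot determine whether the virus is live. It is therefore particularly important to develop a detection method that is more specific and more sensitive and places lower demands on sampling (for example, a universal method that can test whenever blood has been collected). In the related art there is no explicit technique for detecting, on the basis of peptides, whether a sample contains the new coronavirus.
Summary of the Invention
The embodiments of this application provide a method, device, storage medium and electronic device for classifying samples based on immune characterization technology, so as to at least solve the problem in the related art that there is no explicit technique for detecting, on the basis of peptides, whether a sample is infected.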
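According to one embodiment of this application, a method for classifying samples based on immune characterization technology is provided. The method includes: using immune characterization technology to detect the differential response signals of differential peptides in a target sample to be tested and in a control sample, so as to obtain a second differential response signal, wherein the differential peptides are peptides, screened in advance using the immune characterization technology, that show a first differential response signal between positive samples of target coronavirus infection and the control sample, the control sample includes a negative control sample and/or samples in other states, the samples in other states include samples infected by pathogens other than the target coronavirus, and the samples are serum samples or plasma samples; analyzing the second differential response signal with a target model to determine the category to which the target sample to be tested belongs, wherein the target model is trained by machine learning using multiple sets of data, each of which includes a differential response signal and the category to which the corresponding sample to be tested belongs; and outputting the category to which the target sample to be tested belongs.
Optionally, before using immune characterization technology to detect the differential response signals of the differential peptides in the target sample to be tested and the control sample to obtain the second differential response signal, the method further includes: using the immune characterization technology to screen out the peptides that show the first differential response signal between positive samples of target coronavirus infection and the control sample, and determining the screened peptides as the differential peptides.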
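Optionally, before analyzing the second differential response signal with the target model to determine the symptom category to which the target sample to be tested belongs, the method further includes: training an initial model by machine learning using the multiple sets of data to obtain the target model, wherein the target model includes a first model or a second model. The first model is used to output, for an input signal, a label identifying one of the following results: not infected by the target coronavirus; infected by the target coronavirus. The second model is used to output, for an input signal, a label identifying one of the following results: not infected by the target coronavirus with the not-infected category being the first category; not infected by the target coronavirus with the not-infected category being the second category; infected by the target coronavirus with the infection category being the third category; infected by the target coronavirus with the infection category being the fourth category; infected by the target coronavirus with the infection category being the fifth category; infected by the target coronavirus with the infection category being the sixth category. The degree of infection corresponding to the third, fourth, fifth and sixth categories increases in turn.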
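Optionally, when the target model includes the first model, after outputting the category to which the target sample to be tested belongs, the method further includes: when the output category is determined to be not infected by the target coronavirus, analyzing the second differential response signal with a third model to determine the not-infected category to which the target sample to be tested belongs, wherein the third model is trained by machine learning using multiple sets of data, each of which includes a differential response signal and the not-infected category to which the corresponding sample to be tested belongs; the category of not being infected by the target coronavirus includes one of the following: not infected by the target coronavirus with the not-infected category being the first category; not infected by the target coronavirus with the not-infected category being the second category; and outputting the not-infected category to which the target sample to be tested belongs.
Optionally, when the target model includes the second model, after outputting the symptom category to which the target sample to be tested belongs, the method further includes: when the output symptom category is determined to be infected by the target coronavirus, analyzing the second differential response signal with a fourth model to determine the infected category to which the target sample to be tested belongs, wherein the fourth model is trained by machine learning using multiple sets of data, each of which includes a differential response signal and the infected category to which the corresponding sample to be tested belongs; the category of having been infected by the target coronavirus includes one of the following: infected by the target coronavirus with the infection category being the third category, the fourth category, the fifth category, or the sixth category; and outputting the infected category to which the target sample to be tested belongs.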
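Optionally, the target model includes a first linear-kernel support vector machine (SVM).
Optionally, the third model includes a second linear-kernel support vector machine (SVM).
Optionally, the fourth model includes a third linear-kernel support vector machine (SVM).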
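The optional clauses above describe a coarse decision (infected or not) followed by a finer one made by a third or fourth model. A minimal sketch of that dispatch logic follows. Combining both refinements behind a single binary first stage is an assumption made for illustration, and the three `toy_*` functions are hypothetical threshold rules standing in for the trained linear-kernel SVMs; they are not the models of this application.

```python
from typing import Callable, Sequence, Tuple

Signal = Sequence[float]


def classify_two_stage(
    signal: Signal,
    is_infected: Callable[[Signal], bool],
    uninfected_category: Callable[[Signal], int],
    infected_category: Callable[[Signal], int],
) -> Tuple[bool, int]:
    """Run the coarse model first, then the matching fine-grained model."""
    if is_infected(signal):
        # Fourth-model role: refine an "infected" result into categories 3..6.
        return True, infected_category(signal)
    # Third-model role: refine a "not infected" result into categories 1..2.
    return False, uninfected_category(signal)


# Hypothetical stand-ins for trained models (simple threshold rules).
def toy_first_model(signal: Signal) -> bool:
    return sum(signal) / len(signal) > 0.5


def toy_third_model(signal: Signal) -> int:
    return 1 if max(signal) > 0.4 else 2


def toy_fourth_model(signal: Signal) -> int:
    mean = sum(signal) / len(signal)
    for category, upper in ((3, 0.625), (4, 0.75), (5, 0.875)):
        if mean <= upper:
            return category
    return 6


result = classify_two_stage(
    [0.9, 0.9], toy_first_model, toy_third_model, toy_fourth_model
)
print(result)
```

Keeping the stages as separate callables means each can be retrained or replaced independently, which matches the statement that other model types may be used for the third and fourth models.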
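According to one embodiment of this application, a device for classifying samples based on immune characterization technology is also provided. The device includes: a detection module configured to use immune characterization technology to detect the differential response signals of differential peptides in a target sample to be tested and in a control sample, so as to obtain a second differential response signal, wherein the differential peptides are peptides, screened in advance using the immune characterization technology, that show a first differential response signal between positive samples of target coronavirus infection and the control sample, the control sample includes a negative control sample and/or samples in other states, the samples in other states include samples infected by pathogens other than the target coronavirus, and the samples are serum samples or plasma samples; a first analysis module configured to analyze the second differential response signal with a target model to determine the category to which the target sample to be tested belongs, wherein the target model is trained by machine learning using multiple sets of data, each of which includes a differential response signal and the category to which the corresponding sample to be tested belongs; and a first output module configured to output the category to which the target sample to be tested belongs.
Optionally, the device further includes: a screening module configured to, before immune characterization technology is used to detect the differential response signals of the differential peptides in the target sample to be tested and the control sample to obtain the second differential response signal, use the immune characterization technology to screen out the peptides that show the first differential response signal between positive samples of target coronavirus infection and the control sample, and determine the screened peptides as the differential peptides.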
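Optionally, the device further includes: a training module configured to, before the second differential response signal is analyzed with the target model to determine the symptom category to which the target sample to be tested belongs, train an initial model by machine learning using the multiple sets of data to obtain the target model, wherein the target model includes a first model or a second model. The first model is used to output, for an input signal, a label identifying one of the following results: not infected by the target coronavirus; infected by the target coronavirus. The second model is used to output, for an input signal, a label identifying one of the following results: not infected by the target coronavirus with the not-infected category being the first category; not infected by the target coronavirus with the not-infected category being the second category; infected by the target coronavirus with the infection category being the third category, the fourth category, the fifth category, or the sixth category. The degree of infection corresponding to the third, fourth, fifth and sixth categories increases in turn.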
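Optionally, the device further includes: a second analysis module configured to, when the target model includes the first model, after the category to which the target sample to be tested belongs has been output and that category is determined to be not infected by the target coronavirus, analyze the second differential response signal with a third model to determine the not-infected category to which the target sample to be tested belongs, wherein the third model is trained by machine learning using multiple sets of data, each of which includes a differential response signal and the not-infected category to which the corresponding sample to be tested belongs; the category of not being infected by the target coronavirus includes one of the following: not infected by the target coronavirus with the not-infected category being the first category; not infected by the target coronavirus with the not-infected category being the second category; and a second output module configured to output the not-infected category to which the target sample to be tested belongs.
Optionally, the device further includes: a third analysis module configured to, when the target model includes the second model, after the symptom category to which the target sample to be tested belongs has been output and that category is determined to be infected by the target coronavirus, analyze the second differential response signal with a fourth model to determine the infected category to which the target sample to be tested belongs, wherein the fourth model is trained by machine learning using multiple sets of data, each of which includes a differential response signal and the infected category to which the corresponding sample to be tested belongs; the category of having been infected by the target coronavirus includes one of the following: infected by the target coronavirus with the infection category being the third category, the fourth category, the fifth category, or the sixth category; and a third output module configured to output the infected category to which the target sample to be tested belongs.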
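According to yet another embodiment of this application, a method for detecting coronavirus infection is also provided. The method includes: using immune characterization technology to screen out peptides that show a first differential response signal between positive samples of target coronavirus infection and control samples, recorded as differential peptides, the samples being serum samples or plasma samples; taking the first differential response signals of the differential peptides as features and using a support vector machine to construct a classification model for the positive samples and the control samples, to obtain a sample classification model; using the immune characterization technology to detect the differential response signals of the differential peptides in a sample to be tested and in the control samples, recorded as a second differential response signal; and inputting the second differential response signal into the sample classification model for classification, thereby obtaining the symptom category of the sample to be tested; wherein the control samples include negative control samples and other lung disease control samples, the other lung diseases are lung diseases not caused by infection with the target coronavirus, and the target coronavirus is preferably SARS-CoV-2.
According to yet another embodiment of this application, a device for detecting coronavirus infection is also provided. The device includes: a differential peptide screening module configured to use immune characterization technology to screen out peptides that show a first differential response signal between positive samples of target coronavirus infection and control samples, recorded as differential peptides, the samples being serum samples or plasma samples; a model building module configured to take the first differential response signals of the differential peptides as features and use a support vector machine to construct a classification model for the positive samples and the control samples, to obtain a sample classification model; a response signal detection module configured to use the immune characterization technology to detect the differential response signals of the differential peptides in a sample to be tested and in the control samples, recorded as a second differential response signal; and a classification detection module configured to input the second differential response signal into the sample classification model for classification, thereby obtaining the symptom category of the sample to be tested; wherein the control samples include negative control samples and other lung disease control samples, the other lung diseases are lung diseases not caused by infection with the target coronavirus, and the target coronavirus is preferably SARS-CoV-2.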
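According to yet another embodiment of this application, a computer-readable storage medium is also provided, in which a computer program is stored, wherein the computer program is configured to execute, when run, the steps of any one of the above method embodiments.
According to yet another embodiment of this application, an electronic device is also provided, including a memory and a processor, wherein a computer program is stored in the memory and the processor is configured to run the computer program to execute the steps of any one of the above method embodiments.
With the embodiments of this application, the differential response signals of the differential peptides in the target sample to be tested and the control sample can be analyzed on the basis of a neural network to determine the category to which the sample belongs. This solves the problem in the related art that there is no explicit technique for detecting, on the basis of peptides, whether a sample is infected, realizes such peptide-based detection, and improves the accuracy with which the category of a sample is detected.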
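Brief Description of the Drawings
FIG. 1 is a block diagram of the hardware structure of a mobile terminal for a method for classifying samples based on immune characterization technology according to an embodiment of this application;
FIG. 2 is a flowchart of a method for classifying samples based on immune characterization technology according to an embodiment of this application;
FIG. 3 is a ROC curve, according to an embodiment of this application, of the classification performance of the classification model verified by the leave-one-out (LOO) method; for data set 1, verification sensitivity = 0.943 and specificity = 0.900; for data set 2, verification sensitivity = 0.958 and specificity = 0.889;
FIG. 4 is a ROC curve, according to an embodiment of this application, of the predictive performance of the classification model on new data to be tested, with data set 1 and data set 2 serving in turn as training set and test set; with data set 1 as the training set and data set 2 as the test set, sensitivity = 0.845 and specificity = 0.889; with data set 2 as the training set and data set 1 as the test set, sensitivity = 0.800 and specificity = 0.900;
FIG. 5 is verification result diagram 1 according to an embodiment of this application;
FIG. 6 is verification result diagram 2 according to an embodiment of this application;
FIG. 7 is verification result diagram 3 according to an embodiment of this application;
FIG. 8 is a diagram of a device for classifying samples based on immune characterization technology according to an embodiment of this application.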
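Figures 3 and 4 report sensitivity and specificity values. As a minimal sketch of how such values are obtained from model scores, the function below counts true and false positives and negatives at a decision threshold. The threshold of 0.5 is the value the document mentions for the Hefei data; treating a score equal to the threshold as positive, and the toy labels and scores, are assumptions made for illustration only.

```python
def sensitivity_specificity(y_true, scores, threshold=0.5):
    """Return (sensitivity, specificity) for binary labels (1 = infected).

    A sample is predicted positive when its score is >= threshold
    (the tie-handling rule is an assumption).
    """
    tp = fn = tn = fp = 0
    for truth, score in zip(y_true, scores):
        predicted = 1 if score >= threshold else 0
        if truth == 1 and predicted == 1:
            tp += 1
        elif truth == 1:
            fn += 1
        elif predicted == 0:
            tn += 1
        else:
            fp += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative sample")
    return tp / (tp + fn), tn / (tn + fp)


# Toy data: four infected samples followed by two controls.
y_true = [1, 1, 1, 1, 0, 0]
scores = [0.9, 0.8, 0.6, 0.3, 0.2, 0.7]
sens, spec = sensitivity_specificity(y_true, scores)
print(sens, spec)
```

Sensitivity is the fraction of infected samples that are detected, and specificity is the fraction of control samples that are correctly left unflagged; with only a handful of controls, a single misclassified control shifts specificity considerably.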
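Detailed Description
The embodiments of this application are described in detail below with reference to the drawings and in combination with the embodiments.
It should be noted that the terms "first", "second" and the like in the specification, the claims and the above drawings of this application are used to distinguish similar objects and are not necessarily used to describe a particular order or sequence.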
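Explanation of terms:
ImmuneSignature technology: immune characterization technology, in which a high-density random peptide chip (for example, 130,000 peptides) binds the antibodies in blood; after incubation with a fluorescently labelled secondary antibody, fluorescence values are read in a microplate reader to reflect the antibodies in the blood. The method can identify antibodies that are differentially expressed between individuals.
Peptide: in this application, any peptide, predicted or screened, that can bind specifically to an antibody.
Antigen: any substance capable of inducing an immune response in the body, that is, a substance that can be specifically bound by the antigen receptors (TCR/BCR) on the surface of T/B lymphocytes, activate the T/B cells so that they proliferate and differentiate and produce immune response products (sensitized lymphocytes or antibodies), and bind specifically to those products in vivo and in vitro. An antigen therefore has two important properties: immunogenicity and immunoreactivity. In this application, an antigen is an immunogenic complete antigen formed by coupling a peptide hapten to a carrier protein; it may be a peptide-carrier protein conjugate formed by coupling a peptide of a single amino acid sequence to a carrier protein, or a composition of peptide-carrier protein conjugates formed by coupling peptides of several different amino acid sequences to carrier proteins.
ROC curve: a curve reflecting the relationship between sensitivity and specificity. The X axis (abscissa) is 1 - specificity, also called the false positive rate; the closer X is to zero, the higher the accuracy. The Y axis (ordinate) is the sensitivity, also called the true positive rate; the larger Y is, the better the sensitivity. The curve divides the plot into two parts, and the area of the part under the curve is called the AUC (Area Under Curve), which expresses prediction accuracy: the higher the AUC, the higher the prediction accuracy. The closer the curve is to the upper-left corner (smaller X, larger Y), the higher the prediction accuracy.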
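The AUC defined above can be computed without drawing the curve: it equals the probability that a randomly chosen positive sample receives a higher score than a randomly chosen negative sample, with ties counted as one half. A minimal sketch of that pairwise formulation follows; the function name `roc_auc` and the toy data are illustrative assumptions and do not reproduce any result reported in this document.

```python
def roc_auc(y_true, scores):
    """Area under the ROC curve via pairwise comparison of scores.

    Counts the fraction of (positive, negative) pairs in which the positive
    sample scores higher; a tie contributes 0.5.
    """
    positives = [s for t, s in zip(y_true, scores) if t == 1]
    negatives = [s for t, s in zip(y_true, scores) if t == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy data: four infected samples followed by two controls.
y_true = [1, 1, 1, 1, 0, 0]
scores = [0.9, 0.8, 0.6, 0.3, 0.2, 0.7]
auc = roc_auc(y_true, scores)
print(auc)
```

An AUC of 1.0 means every positive outranks every negative (the curve passes through the upper-left corner), while 0.5 corresponds to scores that carry no ranking information.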
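The application is described below with reference to the embodiments:
The method embodiments provided in the embodiments of this application can be executed in a mobile terminal, a computer terminal or a similar computing device. Taking execution on a mobile terminal as an example, FIG. 1 is a block diagram of the hardware structure of a mobile terminal for a method for classifying samples based on immune characterization technology according to an embodiment of this application. As shown in FIG. 1, the mobile terminal may include one or more processors 102 (only one is shown in FIG. 1; the processor 102 may include, but is not limited to, a processing device such as a microprocessor (MCU) or a programmable logic device (FPGA)) and a memory 104 configured to store data; the mobile terminal may also include a transmission device 106 for communication functions and an input/output device 108. A person of ordinary skill in the art will understand that the structure shown in FIG. 1 is only illustrative and does not limit the structure of the mobile terminal. For example, the mobile terminal may include more or fewer components than shown in FIG. 1, or have a configuration different from that shown in FIG. 1.
The memory 104 may be configured to store computer programs, for example software programs and modules of application software, such as the computer program corresponding to the method for classifying samples based on immune characterization technology in the embodiments of this application. By running the computer programs stored in the memory 104, the processor 102 executes various functional applications and data processing, that is, implements the above method. The memory 104 may include high-speed random access memory and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory. In some examples, the memory 104 may further include memory located remotely from the processor 102, and such remote memory may be connected to the mobile terminal through a network. Examples of such networks include, but are not limited to, the Internet, corporate intranets, local area networks, mobile communication networks, and combinations thereof.
The transmission device 106 is configured to receive or send data via a network. A specific example of such a network is a wireless network provided by the communication provider of the mobile terminal. In one example, the transmission device 106 includes a network adapter (Network Interface Controller, NIC), which can be connected to other network devices through a base station so as to communicate with the Internet. In one example, the transmission device 106 may be a radio frequency (RF) module configured to communicate with the Internet wirelessly.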
在本实施例中提供了一种基于免疫表征技术对样本分类的方法,该实施例是通过大量的健康人、其他肺部疾病及新冠肺炎患者的样本(血清样本或者血浆样本)中差异表达的抗体特征数据,基于人工智能的方法对用于确定样本的类别建立了分类模型。然后通过已知样本检测和验证了该分类模型的灵敏度和特异性。表明该分类模型具有较高的分类准确性,利用该分类模型对待测对象进行分类的灵敏度和特异性数据表明,利用该方法能够有效地准确地确定样本的类别。下面对本实施例进行具体描述:
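Fig. 2 is a flowchart of the method for classifying samples based on immunosignature technology according to an embodiment of the present application. As shown in Fig. 2, the flow includes the following steps:
S202: use immunosignature technology to detect the corresponding differential response signals of differential peptides in a target sample to be tested and in control samples, to obtain second differential response signals, wherein the differential peptides are peptides, screened in advance using the immunosignature technology, that exhibit a first differential response signal between positive samples infected with a target coronavirus and the control samples; the control samples include negative control samples and/or samples in other states; the samples in other states include samples infected by pathogens other than the target coronavirus; and the samples are serum samples or plasma samples;
S204: analyze the second differential response signals with a target model to determine the category to which the target sample to be tested belongs, wherein the target model is trained by machine learning on multiple sets of data, each set of data including a differential response signal and the category of the sample to be tested that corresponds to that differential response signal;
S206: output the category to which the target sample to be tested belongs.
The above method yields the category to which the sample to be tested belongs. Different categories may indicate whether the sample is infected with the target coronavirus, and/or the specific uninfected status, and/or the specific degree of infection.
Optionally, the other pathogens include viruses or bacteria that cause other lung diseases, the other lung diseases being lung diseases not caused by infection with the target coronavirus. When the target sample to be tested is a serum sample, the control samples are also serum samples; when the target sample to be tested is a plasma sample, the control samples are also plasma samples. That is, the type of the target sample to be tested and the type of the control samples always match. The same applies to the subsequent embodiments and is not repeated below.
With the embodiments of the present application, the differential response signals of the differential peptides in the target sample to be tested and in the control samples can be analyzed with a machine learning model to determine the category to which the sample belongs. This addresses the problem in the related art that there is no well-defined peptide-based technique for detecting whether a sample is infected, provides such a technique, and improves the accuracy with which the category of a sample is determined.
The above method is described below in conjunction with specific embodiments:
In this embodiment, the corresponding differentially expressed antibody features can be screened separately for each sample category and then used for training, yielding classification models for the different categories. Samples of different categories can thereby be assigned to their categories accurately.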
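(1) Clinical design:
The samples comprise 3 groups: plasma samples from healthy subjects (denoted H), plasma samples from subjects with other lung diseases (denoted T, mainly pulmonary tuberculosis), and plasma samples from COVID-19 patients (denoted F). The first batch contains 80 samples, H:T:F=5:5:70, denoted data set 1; the second batch contains 79 samples, H:T:F=5:4:70, denoted data set 2. It should be noted that these sample numbers are only an example. Other numbers of samples, for example 200 or 500, may be used in practice, and in general the more sample data are available, the more accurate the final result.
(2) Differential peptide screening:
Optionally, before using immunosignature technology to detect the corresponding differential response signals of differential peptides in the target sample to be tested and in the control samples to obtain the second differential response signals, the method further includes: using the immunosignature technology to screen out peptides that exhibit the first differential response signal between positive samples infected with the target coronavirus and the control samples, and determining the screened peptides to be the differential peptides. How the differential peptides are determined is described below in conjunction with specific operations:
The design rationale is as follows. In the first step, comparing F with H screens out HT peptide features that are significantly elevated in plasma samples infected by pathogens capable of causing lung disease. Such features correspond to the rise in antibody concentration that follows infection by a pathogen, but the antibodies found are not necessarily specific to the novel coronavirus (corresponding to the target coronavirus above); the elevation may also be caused by other pathogens that cause lung infections, or by other factors. In the second step, comparing F with T identifies antibodies that are specific to the novel coronavirus relative to other lung diseases. However, because antibody expression in disease states is complex and the number of T samples is limited, this comparison may easily pick up some non-specific HT peptides by mistake. Therefore, to obtain novel-coronavirus-specific peptides more precisely, the feature peptides found in the first and second steps are finally intersected, yielding novel-coronavirus-specific peptides with high accuracy.
Specific screening method: the raw data are log10-transformed and no data correction is applied (if peptides are screened from measured data, no data correction is performed; if screening is based on prior knowledge, alignment results or the like, no correction is needed in the first place). It is assumed that the novel coronavirus raises specific antibody signals, so a one-tailed T-test is used (the "T" in "T-test" is unrelated to the group label T used for other lung diseases). For each feature, the p-value for elevation in F relative to T is computed and corrected for multiple hypothesis testing, denoted p_FT_BH; likewise, the p-value for elevation in F relative to H is computed and corrected for multiple hypothesis testing, denoted p_FH_BH. All feature peptides that satisfy both p_FT_BH<0.05 and p_FH_BH<0.05 are selected as target peptides. Based on data set 1, 864 feature peptides were selected. Compared with screening after data correction, this approach yields peptides with more stable signals. It should be noted that in other implementations, any statistical test for a significant difference between group means may replace the one-tailed T-test used here.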
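The multiple-testing correction and the intersection step described above can be sketched as follows. This is a minimal illustration under stated assumptions: the correction is taken to be the Benjamini-Hochberg procedure (suggested by the "_BH" suffix), the raw one-tailed p-values are assumed to have been computed already, and the function names, peptide identifiers and p-values are all hypothetical.

```python
def bh_adjust(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def select_peptides(peptide_ids, p_ft, p_fh, alpha=0.05):
    """Keep peptides whose adjusted p-values pass in BOTH comparisons."""
    p_ft_bh = bh_adjust(p_ft)  # F elevated relative to T
    p_fh_bh = bh_adjust(p_fh)  # F elevated relative to H
    return [pid for pid, a, b in zip(peptide_ids, p_ft_bh, p_fh_bh)
            if a < alpha and b < alpha]


# Hypothetical raw one-tailed p-values for four peptides.
ids = ["pep1", "pep2", "pep3", "pep4"]
raw_ft = [0.01, 0.04, 0.03, 0.005]
raw_fh = [0.001, 0.20, 0.002, 0.003]
selected = select_peptides(ids, raw_ft, raw_fh)
```

In this toy input, pep2 passes the F-versus-T comparison but fails the F-versus-H comparison, so the intersection drops it.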
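(3) Construction of the classification model:
Optionally, before analyzing the second differential response signals with the target model to determine the symptom category to which the target sample to be tested belongs, the method further includes: training an initial model by machine learning on the multiple sets of data to obtain the target model, wherein the target model includes a first model or a second model. The first model outputs, for an input signal, a label identifying one of the following results: not infected with the target coronavirus; infected with the target coronavirus. The second model outputs, for an input signal, a label identifying one of the following results: not infected with the target coronavirus, with the uninfected category being a first category; not infected with the target coronavirus, with the uninfected category being a second category; infected with the target coronavirus, with the infection category being a third category; infected with the target coronavirus, with the infection category being a fourth category; infected with the target coronavirus, with the infection category being a fifth category; infected with the target coronavirus, with the infection category being a sixth category; wherein the degrees of infection corresponding to the third, fourth, fifth and sixth categories increase in that order. How the target model is constructed is described below in conjunction with specific operations:
Optionally, the model is built from data on the differential peptides associated with the novel coronavirus. The data comprise 3 classes: data from uninfected plasma samples (denoted H), data from plasma samples of other lung diseases (mainly pulmonary tuberculosis, denoted T), and data from plasma samples infected with the novel coronavirus (denoted F). The first batch contains 80 samples, H:T:F=5:5:70, used as training data and denoted data set 1; the second batch contains 79 samples, H:T:F=5:4:70, available for testing and denoted data set 2. The differential peptides are used as features (specifically, the signal values of the differential peptides). After data correction (outlier removal, chip median correction, batch mean correction and quantile correction), a support vector machine classifier is used to build the classification model. The kernel function is a linear kernel, and the class weights in the loss function are inversely proportional to the class sizes in the training set. The model distinguishes whether a target plasma sample is a plasma sample infected with the novel coronavirus (F is class 1, non-F is class 0). It should be noted that the samples in this embodiment come from the Third People's Hospital of Shenzhen.
To make better use of the information in all the differential feature peptides, the preferred approach is to build a model (the target model above) that predicts from the input data features whether a sample is infected with the novel coronavirus. Considering predictive performance, robustness and interpretability, a support vector machine with a linear kernel is chosen for classification (other models, such as neural network models, are of course also feasible; the embodiments of the present application use a linear-kernel SVM as the example), with an error penalty weight of 1.0 and class weights in the loss function inversely proportional to the class sizes in the training set.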
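The inverse-proportional class weighting can be made concrete with a short sketch. It assumes the common "balanced" convention, weight = n_samples / (n_classes * class_count), which is one way to realize weights inversely proportional to class size; the function name is hypothetical, and the label counts follow data set 1 (70 F samples versus 5 H + 5 T samples).

```python
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * cnt) for cls, cnt in counts.items()}


# Data set 1 from the text: 70 F samples (class 1), 5 H + 5 T samples (class 0).
train_labels = [1] * 70 + [0] * 10
weights = balanced_class_weights(train_labels)
```

With these weights each class contributes the same total weight to the loss (70 x 0.571 and 10 x 4.0 both equal 40), so the 10 non-F samples are not swamped by the 70 F samples.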
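It should be noted that the classification granularity of the trained model is adjustable. In one specific implementation, the model only determines whether a sample is not infected or is infected with the target coronavirus. In a more preferred implementation, the granularity can be made finer, so that the model distinguishes the following classes: not infected with the target coronavirus, with the uninfected category being a first category; not infected with the target coronavirus, with the uninfected category being a second category; infected with the target coronavirus, with the infection category being a third category; infected with the target coronavirus, with the infection category being a fourth category; infected with the target coronavirus, with the infection category being a fifth category; infected with the target coronavirus, with the infection category being a sixth category; wherein the degrees of infection corresponding to the third, fourth, fifth and sixth categories increase in that order (the division into third, fourth, fifth and sixth categories is only one optional division; in practice, fewer or more classes may be defined according to the degree of infection). The first category may be the category of samples that are not currently infected with the novel coronavirus but contain antibodies against it (that is, samples from subjects who were previously infected), and the second category may be the category of samples that have never been infected with the novel coronavirus.
When the trained model only determines whether a sample is not infected or is infected with the target coronavirus and a finer category is needed, additional models can be introduced for the finer decision, for example:
Where the target model includes the first model, after outputting the category to which the target sample to be tested belongs, the method further includes: when the output category is "not infected with the target coronavirus", analyzing the second differential response signals with a third model to determine the uninfected category to which the target sample to be tested belongs, wherein the third model is trained by machine learning on multiple sets of data, each set of data including a differential response signal and the uninfected category of the corresponding sample to be tested, the uninfected category being one of the following: not infected with the target coronavirus, with the uninfected category being the first category; not infected with the target coronavirus, with the uninfected category being the second category; and outputting the uninfected category to which the target sample to be tested belongs.
Where the target model includes the second model, after outputting the symptom category to which the target sample to be tested belongs, the method further includes: when the output symptom category is "infected with the target coronavirus", analyzing the second differential response signals with a fourth model to determine the infection category to which the target sample to be tested belongs, wherein the fourth model is trained by machine learning on multiple sets of data, each set of data including a differential response signal and the infection category of the corresponding sample to be tested, the infection category being one of the following: infected with the target coronavirus, with the infection category being the third category; infected with the target coronavirus, with the infection category being the fourth category; infected with the target coronavirus, with the infection category being the fifth category; infected with the target coronavirus, with the infection category being the sixth category; and outputting the infection category to which the target sample to be tested belongs.
Optionally, the third model may also be a linear-kernel SVM, and the fourth model may also be a linear-kernel SVM. These model types are only examples; in practice, other types of models may be trained to obtain the third model and/or the fourth model.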
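The staged use of a coarse model followed by a refinement model can be sketched as below. This is only an illustration of the control flow for the case in which a binary first-stage result of "not infected" is refined by a further model; the label strings, function names and the threshold rules standing in for trained models are all hypothetical.

```python
def classify_with_refinement(signal, first_model, third_model):
    """Two-stage classification: binary call first, then refine if uninfected.

    first_model(signal) -> "infected" or "not_infected"
    third_model(signal) -> "first_category" or "second_category"
    """
    label = first_model(signal)
    if label == "not_infected":
        return label, third_model(signal)
    return label, None


# Hypothetical stand-in models: simple threshold rules, not trained SVMs.
def toy_first_model(signal):
    return "infected" if sum(signal) / len(signal) > 1.0 else "not_infected"


def toy_third_model(signal):
    return "first_category" if max(signal) > 0.8 else "second_category"


result_a = classify_with_refinement([1.5, 1.2, 1.4], toy_first_model, toy_third_model)
result_b = classify_with_refinement([0.2, 0.9, 0.1], toy_first_model, toy_third_model)
result_c = classify_with_refinement([0.2, 0.3, 0.1], toy_first_model, toy_third_model)
```

A refinement of the "infected" branch by a model that grades severity would follow the same pattern with a different second-stage model.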
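(4) Validation of the model's classification performance:
To verify how well the classification model (the target model above) models the feature data of the 864 differential peptides, the classification performance of the model was validated on single data sets, that is, on data set 1 and data set 2 separately, using the leave-one-out (LOO) method. The validation ROC curves are shown in Fig. 3. For data set 1, validation sensitivity = 0.943 and specificity = 0.900; for data set 2, validation sensitivity = 0.958 and specificity = 0.889.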
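The leave-one-out procedure itself is simple to sketch: each sample is held out once, a model is trained on the remaining samples, and the held-out sample is predicted. In the sketch below a nearest-centroid rule is swapped in as a stand-in classifier purely to keep the example self-contained; it is not the linear-kernel SVM used in this embodiment, and the toy data and function names are hypothetical.

```python
def nearest_centroid_predict(train_x, train_y, sample):
    """Stand-in classifier: assign the class whose mean vector is closest."""
    best_label, best_dist = None, float("inf")
    for label in sorted(set(train_y)):
        rows = [x for x, y in zip(train_x, train_y) if y == label]
        centroid = [sum(col) / len(rows) for col in zip(*rows)]
        dist = sum((a - b) ** 2 for a, b in zip(sample, centroid))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


def leave_one_out(features, labels, predict):
    """Hold out each sample once, train on the rest, collect predictions."""
    predictions = []
    for i in range(len(features)):
        train_x = features[:i] + features[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        predictions.append(predict(train_x, train_y, features[i]))
    return predictions


# Hypothetical two-feature toy data; class 1 = F, class 0 = non-F.
toy_x = [[2.0, 2.1], [1.9, 2.2], [2.2, 1.8], [0.1, 0.0], [0.0, 0.2], [0.2, 0.1]]
toy_y = [1, 1, 1, 0, 0, 0]
loo_pred = leave_one_out(toy_x, toy_y, nearest_centroid_predict)
```

Comparing the collected predictions with the true labels gives the counts from which sensitivity and specificity are computed.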
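To test the predictive performance of the classification model on new data, one of data set 1 and data set 2 was used as the training set and the other as the test set. The test ROC curves are shown in Fig. 4. With data set 1 as the training set and data set 2 as the test set, sensitivity = 0.845 and specificity = 0.889; with data set 2 as the training set and data set 1 as the test set, sensitivity = 0.800 and specificity = 0.900.
These two sets of test results show that the classification model detects and classifies COVID-19 samples with fairly high sensitivity and specificity (AUC (Area Under the Curve) of the ROC curve > 0.9).
(5) Application of the model to classifying samples to be tested
After the model has been trained on the feature data of the 864 differential peptides, it can be used to predict whether new differential peptide feature data correspond to a sample infected with the novel coronavirus. The procedure is: measure the response signal values of the same 864 differential peptides in the new sample, apply the necessary preprocessing corrections, input the feature data of the 864 selected peptides into the model, and determine from the model's predicted output whether the sample is infected with the novel coronavirus.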
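The application steps can be sketched as a small pipeline. This is a simplified illustration under explicit assumptions: only a log10 transform and a chip median correction are shown (interpreted here as subtracting the chip's median log signal), the other correction steps are omitted, and the scoring function is a hypothetical logistic rule standing in for the trained model. The default threshold of 0.5 follows the prediction threshold mentioned in the validation description.

```python
import math
from statistics import median


def preprocess_chip(raw_signals):
    """log10-transform one chip's (positive) signals, then subtract the chip median."""
    logged = [math.log10(v) for v in raw_signals]
    chip_median = median(logged)
    return [v - chip_median for v in logged]


def predict_sample(raw_signals, score_fn, threshold=0.5):
    """Return 1 (class F) if the model score exceeds the threshold, else 0."""
    features = preprocess_chip(raw_signals)
    return 1 if score_fn(features) > threshold else 0


# Hypothetical scoring function standing in for the trained model.
def toy_score(features):
    weights = [0.5, -0.25, 1.0]
    z = sum(w * f for w, f in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))


toy_features = preprocess_chip([10.0, 100.0, 1000.0])
toy_call = predict_sample([10.0, 100.0, 1000.0], toy_score)
```

The key point is that a new sample must go through the same preprocessing and the same peptide selection as the training data before it is passed to the model.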
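In addition, to verify the accuracy of the classification model, the model was also validated on sample data from specific regions, as follows:
This validation used sample data collected mainly from Hefei and Wuhan. The samples were divided into four types by the severity of novel coronavirus infection: suspected, mild, regular and severe; see Table 1 (the Wuhan data may contain a large number of false-positive diagnoses, so an N protein test was performed subsequently; the value 21* is the count after removing N-protein-positive data):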
Table 1
Type        Hefei data   Wuhan data
Suspected   15           21*
Mild        18           -
Regular     38           40
Severe      23           32
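It should be noted that the Wuhan samples come from the East Campus of the People's Hospital and the General Hospital of the Central Theater Command of the Chinese People's Liberation Army, and the Hefei data come from the Infectious Disease Hospital of Anhui Provincial Hospital. The Hefei and Wuhan samples are all serum samples.
When the degree of infection needs to be classified, ANOVA is used to select peptides that differ across the multiple categories, and the Benjamini & Hochberg method (BH method) is used to compute the corrected P values, denoted p_BH. Based on the Hefei data, 3171 significantly different peptides were selected at p_BH<0.005. This analysis was not performed on the Wuhan data because the results were poor.
When only infection status needs to be determined, a one-tailed t-test is used to select peptides whose expression is specifically (significantly) elevated in COVID-19, and the Benjamini & Hochberg method (BH method) is used to compute the corrected P values, denoted p_BH. Based on the Hefei data, 2730 significantly specific peptides were selected at p_BH<0.005; based on the Wuhan data, 101 significantly specific peptides were selected at p_BH<0.005.
Because batch effects in the data are strong and the proportions of the severity levels differ among the Shenzhen, Wuhan and Hefei samples, the final numbers of peptides differ. This validation can therefore test the reliability and generality of the modelling method across different data and different level proportions.
The model used here is a classification model built with an SVM (its main parameters may be kept the same as those of the model trained on the Shenzhen sample data, e.g. kernel=linear; class_weight=balanced). Model performance in this validation is assessed with the leave-one-out method and evaluated by AUC.
When only infection status needs to be determined, the model was applied separately to the Hefei and Wuhan sample data; the results are shown in Fig. 5 and Fig. 6. For the Hefei data, the detection model built with this method uses a prediction threshold of 0.5.
As shown in Fig. 5 and Fig. 6, both models have AUC>0.9, indicating good model performance, where:
Sensitivity = TP/(TP+FN); specificity = TN/(TN+FP)
Hefei: sensitivity = TP/(TP+FN) = 76/(76+3) = 0.962
specificity = TN/(TN+FP) = 15/(15+0) = 1.000
Wuhan: sensitivity = TP/(TP+FN) = 68/72 = 0.944
specificity = TN/(TN+FP) = 20/21 = 0.952
Thus, with a model prediction threshold of 0.5, both sensitivity and specificity exceed 0.8.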
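The arithmetic above can be checked with a few lines of code. The counts are taken from the fractions stated in the text; for Wuhan, FN = 72 - 68 = 4 and FP = 21 - 20 = 1 are derived from the stated denominators. The function and variable names are illustrative.

```python
def sensitivity(tp, fn):
    """True positive rate: TP / (TP + FN)."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """True negative rate: TN / (TN + FP)."""
    return tn / (tn + fp)


# Counts from the text (Wuhan FN and FP derived from 68/72 and 20/21).
hefei = {"tp": 76, "fn": 3, "tn": 15, "fp": 0}
wuhan = {"tp": 68, "fn": 4, "tn": 20, "fp": 1}

hefei_sens = round(sensitivity(hefei["tp"], hefei["fn"]), 3)
hefei_spec = round(specificity(hefei["tn"], hefei["fp"]), 3)
wuhan_sens = round(sensitivity(wuhan["tp"], wuhan["fn"]), 3)
wuhan_spec = round(specificity(wuhan["tn"], wuhan["fp"]), 3)
```

The positive totals agree with Table 1: 18 + 38 + 23 = 79 confirmed Hefei samples and 40 + 32 = 72 confirmed Wuhan samples, with the suspected samples (15 and 21) serving as negatives.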
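When the degree of infection needs to be classified, the model was applied to the Hefei data; the results are shown in Fig. 7. In the multi-class case AUC>0.9, indicating good model performance.
The confusion matrix shows that most classification errors fall in adjacent categories (for example, 4 mild samples were predicted as regular and only 2 as severe; 5 severe samples were predicted as regular and only 2 as mild), which indicates that the model judges the COVID-19 severity levels fairly accurately. The model is thus reasonably effective across different data sets and different severity classifications, and so is the modelling method.
It should be noted that, for simplicity of description, each of the foregoing method embodiments is expressed as a series of combined actions. Those skilled in the art should understand, however, that the embodiments of the present application are not limited by the described order of actions, because according to the embodiments of the present application some steps may be performed in another order or simultaneously. Those skilled in the art should also understand that the embodiments described in the specification are all preferred embodiments, and the actions involved are not necessarily required by the present application.
From the description of the above implementations, those skilled in the art will clearly understand that the methods of the above embodiments can be implemented by software plus the necessary general-purpose hardware platform, or by hardware, although in many cases the former is the better implementation. On this understanding, the technical solutions of the embodiments of the present application, in essence or in the part that contributes to the prior art, can be embodied as a software product. The computer software product is stored in a storage medium (such as ROM/RAM, a magnetic disk or an optical disc) and includes instructions that cause a computing device, or a processor, to perform the methods described in the embodiments of the present application.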
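This embodiment further provides an apparatus for classifying samples based on immunosignature technology. As shown in Fig. 8, the apparatus includes:
a detection module 82, configured to use immunosignature technology to detect the corresponding differential response signals of differential peptides in a target sample to be tested and in control samples, to obtain second differential response signals, wherein the differential peptides are peptides, screened in advance using the immunosignature technology, that exhibit a first differential response signal between positive samples infected with a target coronavirus and the control samples; the control samples include negative control samples and/or samples in other states; the samples in other states include samples infected by pathogens other than the target coronavirus; and the samples are serum samples or plasma samples;
a first analysis module 84, configured to analyze the second differential response signals with a target model to determine the category to which the target sample to be tested belongs, wherein the target model is trained by machine learning on multiple sets of data, each set of data including a differential response signal and the category of the sample to be tested that corresponds to that differential response signal;
a first output module 86, configured to output the category to which the target sample to be tested belongs.
Optionally, the apparatus further includes: a screening module, configured to, before immunosignature technology is used to detect the corresponding differential response signals of the differential peptides in the target sample to be tested and in the control samples to obtain the second differential response signals, use the immunosignature technology to screen out peptides that exhibit the first differential response signal between positive samples infected with the target coronavirus and the control samples, and determine the screened peptides to be the differential peptides.
Optionally, the apparatus further includes: a training module, configured to, before the second differential response signals are analyzed with the target model to determine the symptom category to which the target sample to be tested belongs, train an initial model by machine learning on the multiple sets of data to obtain the target model, wherein the target model includes a first model or a second model. The first model outputs, for an input signal, a label identifying one of the following results: not infected with the target coronavirus; infected with the target coronavirus. The second model outputs, for an input signal, a label identifying one of the following results: not infected with the target coronavirus, with the uninfected category being a first category; not infected with the target coronavirus, with the uninfected category being a second category; infected with the target coronavirus, with the infection category being a third category; infected with the target coronavirus, with the infection category being a fourth category; infected with the target coronavirus, with the infection category being a fifth category; infected with the target coronavirus, with the infection category being a sixth category; wherein the degrees of infection corresponding to the third, fourth, fifth and sixth categories increase in that order.
Optionally, the apparatus further includes: a second analysis module, configured to, where the target model includes the first model, after the category to which the target sample to be tested belongs has been output and when that category is "not infected with the target coronavirus", analyze the second differential response signals with a third model to determine the uninfected category to which the target sample to be tested belongs, wherein the third model is trained by machine learning on multiple sets of data, each set of data including a differential response signal and the uninfected category of the corresponding sample to be tested, the uninfected category being one of the following: not infected with the target coronavirus, with the uninfected category being the first category; not infected with the target coronavirus, with the uninfected category being the second category; and a second output module, configured to output the uninfected category to which the target sample to be tested belongs.
Optionally, the apparatus further includes: a third analysis module, configured to, where the target model includes the second model, after the symptom category to which the target sample to be tested belongs has been output and when that symptom category is "infected with the target coronavirus", analyze the second differential response signals with a fourth model to determine the infection category to which the target sample to be tested belongs, wherein the fourth model is trained by machine learning on multiple sets of data, each set of data including a differential response signal and the infection category of the corresponding sample to be tested, the infection category being one of the following: infected with the target coronavirus, with the infection category being the third category; infected with the target coronavirus, with the infection category being the fourth category; infected with the target coronavirus, with the infection category being the fifth category; infected with the target coronavirus, with the infection category being the sixth category; and a third output module, configured to output the infection category to which the target sample to be tested belongs.
Optionally, the target model includes a first linear-kernel support vector machine (SVM).
Optionally, the third model includes a second linear-kernel support vector machine (SVM).
Optionally, the fourth model includes a third linear-kernel support vector machine (SVM).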
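According to one embodiment of the present application, a coronavirus infection detection method is further provided. The detection method includes: using immunosignature technology to screen out peptides that exhibit a first differential response signal between positive samples infected with a target coronavirus and control samples, denoted differential peptides, the samples being serum samples or plasma samples; taking the first differential response signals of the differential peptides as features and building a classification model for the positive samples and the control samples using a support vector machine, thereby obtaining a sample classification model; using the immunosignature technology to detect the corresponding differential response signals of the differential peptides in a sample to be tested and in the control samples, denoted second differential response signals; and inputting the second differential response signals into the sample classification model for classification, thereby obtaining the symptom category to which the sample to be tested belongs; wherein the control samples include negative control samples and samples of other lung diseases, the other lung diseases being lung diseases not caused by infection with the target coronavirus, and the target coronavirus is preferably SARS-CoV-2.
Preferably, using immunosignature technology to screen out peptides that exhibit a first differential response signal between positive samples infected with the target coronavirus and control samples, denoted differential peptides, includes: selecting positive samples infected with the target coronavirus, negative control samples and control samples of other lung diseases, the other lung diseases being lung diseases caused by infection with viruses other than the target coronavirus; using immunosignature technology to bind the positive samples, the negative control samples and the control samples of other lung diseases to a peptide array chip and obtain the response signal values of the bound peptides; for each bound peptide, computing the p-value for a difference between the signal values of the positive samples and those of the negative control samples, denoted the first p-value, and computing the p-value for a difference between the signal values of the positive samples and those of the control samples of other lung diseases, denoted the second p-value; and retaining all bound peptides for which the first p-value and the second p-value both satisfy a third threshold, thereby obtaining the differential peptides; the third threshold is preferably <0.05.
Preferably, the signal values of the bound peptides are log10-transformed, and the transformed log values are used as features; a one-tailed T-test is used to compute, for each feature, the p-value for a difference between the positive samples and the negative control samples, and the p-values are corrected for multiple hypothesis testing to obtain the first p-value; the p-value for a difference in the corresponding feature between the positive samples and the control samples of other lung diseases is computed likewise and corrected for multiple hypothesis testing, denoted the second p-value; and the bound peptides for which the first p-value is below the third threshold and the second p-value is below the third threshold are selected, thereby obtaining the differential peptides.
According to another embodiment of the present application, a coronavirus infection detection apparatus is further provided. The detection apparatus includes: a differential peptide screening module, configured to use immunosignature technology to screen out peptides that exhibit a first differential response signal between positive samples infected with a target coronavirus and control samples, denoted differential peptides, the samples being serum samples or plasma samples; a model building module, configured to take the first differential response signals of the differential peptides as features and build a classification model for the positive samples and the control samples using a support vector machine, thereby obtaining a sample classification model; a response signal detection module, configured to use the immunosignature technology to detect the corresponding differential response signals of the differential peptides in a sample to be tested and in the control samples, denoted second differential response signals; and a classification detection module, configured to input the second differential response signals into the sample classification model for classification, thereby obtaining the symptom category to which the sample to be tested belongs; wherein the control samples include negative control samples and samples of other lung diseases, the other lung diseases being lung diseases not caused by infection with the target coronavirus, and the target coronavirus is preferably SARS-CoV-2.
Preferably, the differential peptide screening module includes: a sample selection unit, configured to select positive samples infected with the target coronavirus, negative control samples and control samples of other lung diseases, the other lung diseases being lung diseases caused by infection with viruses other than the target coronavirus; a signal acquisition unit, configured to use immunosignature technology to bind the positive samples, the negative control samples and the control samples of other lung diseases to a peptide array chip and obtain the response signal values of the bound peptides; and a differential peptide screening unit, configured to compute, for each bound peptide, the p-value for a difference between the signal values of the positive samples and those of the negative control samples, denoted the first p-value, and the p-value for a difference between the signal values of the positive samples and those of the control samples of other lung diseases, denoted the second p-value, and to retain all bound peptides for which the first p-value and the second p-value both satisfy a third threshold, thereby obtaining the differential peptides; the third threshold is preferably <0.05.
Preferably, the differential peptide screening unit includes: a signal transformation subunit, configured to log10-transform the signal values of the bound peptides; and a differential peptide screening subunit, configured to use the transformed log values as features, compute with a one-tailed T-test, for each feature, the p-value for a difference between the positive samples and the negative control samples and correct the p-values for multiple hypothesis testing to obtain the first p-value; likewise compute the p-value for a difference in the corresponding feature between the positive samples and the control samples of other lung diseases and correct it for multiple hypothesis testing, denoted the second p-value; and select the bound peptides for which the first p-value is below the third threshold and the second p-value is below the third threshold, thereby obtaining the differential peptides.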
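From the description of the above implementations, those skilled in the art will clearly understand that the present application can be implemented by software plus the necessary hardware such as detection instruments. On this understanding, the data processing part of the technical solution of the present application can be embodied as a software product. The computer software product can be stored in a storage medium, such as ROM/RAM, a magnetic disk or an optical disc, and includes instructions that cause a computer device (which may be a personal computer, a server, a network device or the like) to perform the methods of the embodiments of the present application or of certain parts of the embodiments.
The embodiments in this specification are described progressively. For parts that are the same or similar across embodiments, reference may be made between them, and each embodiment focuses on what distinguishes it from the others. In particular, the system embodiments are described relatively briefly because they are substantially similar to the method embodiments; for related details, refer to the corresponding parts of the method embodiments.
The present application can be used in numerous general-purpose or special-purpose computing system environments or configurations, for example: personal computers, server computers, handheld or portable devices, tablet devices, multiprocessor systems, microprocessor-based systems, set-top boxes, programmable consumer electronics, network PCs, minicomputers, mainframe computers, and distributed computing environments that include any of the above systems or devices.
An embodiment of the present application further provides a computer-readable storage medium storing a computer program, wherein the computer program is configured to perform, when run, the steps in any one of the method embodiments described above.
In an exemplary embodiment, the computer-readable storage medium may include, but is not limited to, a USB flash drive, read-only memory (ROM), random access memory (RAM), a removable hard disk, a magnetic disk, an optical disc or any other medium that can store a computer program.
An embodiment of the present application further provides an electronic apparatus including a memory and a processor, wherein the memory stores a computer program and the processor is configured to run the computer program to perform the steps in any one of the method embodiments described above.
In an exemplary embodiment, the electronic apparatus may further include a transmission device and an input/output device, wherein the transmission device is connected to the processor and the input/output device is connected to the processor.
For specific examples in this embodiment, refer to the examples described in the above embodiments and exemplary implementations; they are not repeated here.
Clearly, those skilled in the art should understand that the modules or steps of the embodiments of the present application described above can be implemented with general-purpose computing devices. They can be concentrated on a single computing device or distributed over a network of multiple computing devices, and they can be implemented as program code executable by a computing device, so that they can be stored in a storage device and executed by a computing device. In some cases, the steps shown or described can be performed in an order different from that given here, or the modules or steps can each be made into a separate integrated circuit module, or several of them can be made into a single integrated circuit module. The embodiments of the present application are thus not limited to any particular combination of hardware and software.
The above are only preferred embodiments of the present application and are not intended to limit it. For those skilled in the art, the present application may be modified and varied in many ways. Any modification, equivalent replacement, improvement or the like made within the principles of the present application shall fall within its scope of protection.
Industrial Applicability
As described above, the method, apparatus and storage medium for classifying samples based on immunosignature technology provided by the embodiments of the present application have the following beneficial effects: they address the problem in the related art that there is no well-defined peptide-based technique for detecting whether a sample is infected, provide such a technique, and improve the accuracy with which the category of a sample is determined.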

Claims (12)

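  1. A method for classifying samples based on immunosignature technology, the method comprising:
    using immunosignature technology to detect the corresponding differential response signals of differential peptides in a target sample to be tested and in control samples, to obtain second differential response signals, wherein the differential peptides are peptides, screened in advance using the immunosignature technology, that exhibit a first differential response signal between positive samples infected with a target coronavirus and the control samples, the control samples comprise negative control samples and/or samples in other states, the samples in other states comprise samples infected by pathogens other than the target coronavirus, and the samples are serum samples or plasma samples;
    analyzing the second differential response signals with a target model to determine the category to which the target sample to be tested belongs, wherein the target model is trained by machine learning on multiple sets of data, each set of data comprising a differential response signal and the category of the sample to be tested that corresponds to that differential response signal;
    outputting the category to which the target sample to be tested belongs.
  2. The method according to claim 1, wherein, before using immunosignature technology to detect the corresponding differential response signals of differential peptides in the target sample to be tested and in the control samples to obtain the second differential response signals, the method further comprises:
    using the immunosignature technology to screen out peptides that exhibit the first differential response signal between positive samples infected with the target coronavirus and the control samples, and determining the screened peptides to be the differential peptides.
  3. The method according to claim 1, wherein, before analyzing the second differential response signals with the target model to determine the symptom category to which the target sample to be tested belongs, the method further comprises:
    training an initial model by machine learning on the multiple sets of data to obtain the target model, wherein the target model comprises a first model or a second model;
    the first model outputs, for an input signal, a label identifying one of the following results: not infected with the target coronavirus; infected with the target coronavirus;
    the second model outputs, for an input signal, a label identifying one of the following results: not infected with the target coronavirus, with the uninfected category being a first category; not infected with the target coronavirus, with the uninfected category being a second category; infected with the target coronavirus, with the infection category being a third category; infected with the target coronavirus, with the infection category being a fourth category; infected with the target coronavirus, with the infection category being a fifth category; infected with the target coronavirus, with the infection category being a sixth category; wherein the degrees of infection corresponding to the third category, the fourth category, the fifth category and the sixth category increase in that order.
  4. The method according to claim 3, wherein
    where the target model comprises the first model, after outputting the category to which the target sample to be tested belongs, the method further comprises: when the output category to which the target sample to be tested belongs is not infected with the target coronavirus, analyzing the second differential response signals with a third model to determine the uninfected category to which the target sample to be tested belongs, wherein the third model is trained by machine learning on multiple sets of data, each set of data comprising a differential response signal and the uninfected category of the sample to be tested that corresponds to that differential response signal, the uninfected category being one of the following: not infected with the target coronavirus, with the uninfected category being the first category; not infected with the target coronavirus, with the uninfected category being the second category; and outputting the uninfected category to which the target sample to be tested belongs;
    where the target model comprises the second model, after outputting the symptom category to which the target sample to be tested belongs, the method further comprises: when the output symptom category to which the target sample to be tested belongs is infected with the target coronavirus, analyzing the second differential response signals with a fourth model to determine the infection category to which the target sample to be tested belongs, wherein the fourth model is trained by machine learning on multiple sets of data, each set of data comprising a differential response signal and the infection category of the sample to be tested that corresponds to that differential response signal, the infection category being one of the following: infected with the target coronavirus, with the infection category being the third category; infected with the target coronavirus, with the infection category being the fourth category; infected with the target coronavirus, with the infection category being the fifth category; infected with the target coronavirus, with the infection category being the sixth category; and outputting the infection category to which the target sample to be tested belongs.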
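  5. An apparatus for classifying samples based on immunosignature technology, the apparatus comprising:
    a detection module, configured to use immunosignature technology to detect the corresponding differential response signals of differential peptides in a target sample to be tested and in control samples, to obtain second differential response signals, wherein the differential peptides are peptides, screened in advance using the immunosignature technology, that exhibit a first differential response signal between positive samples infected with a target coronavirus and the control samples, the control samples comprise negative control samples and/or samples in other states, the samples in other states comprise samples infected by pathogens other than the target coronavirus, and the samples are serum samples or plasma samples;
    a first analysis module, configured to analyze the second differential response signals with a target model to determine the category to which the target sample to be tested belongs, wherein the target model is trained by machine learning on multiple sets of data, each set of data comprising a differential response signal and the category of the sample to be tested that corresponds to that differential response signal;
    a first output module, configured to output the category to which the target sample to be tested belongs.
  6. The apparatus according to claim 5, wherein the apparatus further comprises:
    a screening module, configured to, before immunosignature technology is used to detect the corresponding differential response signals of differential peptides in the target sample to be tested and in the control samples to obtain the second differential response signals, use the immunosignature technology to screen out peptides that exhibit the first differential response signal between positive samples infected with the target coronavirus and the control samples, and determine the screened peptides to be the differential peptides.
  7. The apparatus according to claim 5, wherein the apparatus further comprises:
    a training module, configured to, before the second differential response signals are analyzed with the target model to determine the symptom category to which the target sample to be tested belongs, train an initial model by machine learning on the multiple sets of data to obtain the target model, wherein the target model comprises a first model or a second model;
    the first model outputs, for an input signal, a label identifying one of the following results: not infected with the target coronavirus; infected with the target coronavirus;
    the second model outputs, for an input signal, a label identifying one of the following results: not infected with the target coronavirus, with the uninfected category being a first category; not infected with the target coronavirus, with the uninfected category being a second category; infected with the target coronavirus, with the infection category being a third category; infected with the target coronavirus, with the infection category being a fourth category; infected with the target coronavirus, with the infection category being a fifth category; infected with the target coronavirus, with the infection category being a sixth category; wherein the degrees of infection corresponding to the third category, the fourth category, the fifth category and the sixth category increase in that order.
  8. The apparatus according to claim 7, wherein
    the apparatus further comprises: a second analysis module, configured to, where the target model comprises the first model, after the category to which the target sample to be tested belongs has been output and when that category is not infected with the target coronavirus, analyze the second differential response signals with a third model to determine the uninfected category to which the target sample to be tested belongs, wherein the third model is trained by machine learning on multiple sets of data, each set of data comprising a differential response signal and the uninfected category of the sample to be tested that corresponds to that differential response signal, the uninfected category being one of the following: not infected with the target coronavirus, with the uninfected category being the first category; not infected with the target coronavirus, with the uninfected category being the second category; and a second output module, configured to output the uninfected category to which the target sample to be tested belongs;
    or,
    the apparatus further comprises: a third analysis module, configured to, where the target model comprises the second model, after the symptom category to which the target sample to be tested belongs has been output and when that symptom category is infected with the target coronavirus, analyze the second differential response signals with a fourth model to determine the infection category to which the target sample to be tested belongs, wherein the fourth model is trained by machine learning on multiple sets of data, each set of data comprising a differential response signal and the infection category of the sample to be tested that corresponds to that differential response signal, the infection category being one of the following: infected with the target coronavirus, with the infection category being the third category; infected with the target coronavirus, with the infection category being the fourth category; infected with the target coronavirus, with the infection category being the fifth category; infected with the target coronavirus, with the infection category being the sixth category; and a third output module, configured to output the infection category to which the target sample to be tested belongs.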
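  9. A coronavirus infection detection method, the detection method comprising:
    using immunosignature technology to screen out peptides that exhibit a first differential response signal between positive samples infected with a target coronavirus and control samples, denoted differential peptides, the samples being serum samples or plasma samples;
    taking the first differential response signals of the differential peptides as features and building a classification model for the positive samples and the control samples using a support vector machine, to obtain a sample classification model;
    using the immunosignature technology to detect the corresponding differential response signals of the differential peptides in a sample to be tested and in the control samples, denoted second differential response signals;
    inputting the second differential response signals into the sample classification model for classification, thereby obtaining the symptom category to which the sample to be tested belongs;
    wherein the control samples comprise negative control samples and control samples of other lung diseases, the other lung diseases being lung diseases not caused by infection with the target coronavirus, and the target coronavirus is preferably SARS-CoV-2.
  10. A coronavirus infection detection apparatus, the detection apparatus comprising:
    a differential peptide screening module, configured to use immunosignature technology to screen out peptides that exhibit a first differential response signal between positive samples infected with a target coronavirus and control samples, denoted differential peptides, the samples being serum samples or plasma samples;
    a model building module, configured to take the first differential response signals of the differential peptides as features and build a classification model for the positive samples and the control samples using a support vector machine, to obtain a sample classification model;
    a response signal detection module, configured to use the immunosignature technology to detect the corresponding differential response signals of the differential peptides in a sample to be tested and in the control samples, denoted second differential response signals;
    a classification detection module, configured to input the second differential response signals into the sample classification model for classification, thereby obtaining the symptom category to which the sample to be tested belongs;
    wherein the control samples comprise negative control samples and control samples of other lung diseases, the other lung diseases being lung diseases not caused by infection with the target coronavirus, and the target coronavirus is preferably SARS-CoV-2.
  11. A computer-readable storage medium storing a computer program, wherein the computer program, when executed by a processor, implements the steps of the method according to any one of claims 1 to 4, or implements the steps of the method according to claim 9.
  12. An electronic apparatus comprising a memory, a processor and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the computer program, implements the steps of the method according to any one of claims 1 to 4, or implements the steps of the method according to claim 9.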
PCT/CN2021/080279 2020-03-13 2021-03-11 Method, apparatus and storage medium for classifying samples based on immunosignature technology WO2021180182A1 (zh)

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
CN202010176984 2020-03-13
CN202010176984.4 2020-03-13
CN202010923587.9 2020-09-04
CN202010923587.9A CN113393902A (zh) 2020-03-13 2020-09-04 Method, apparatus and storage medium for classifying samples based on immunosignature technology

Publications (1)

Publication Number Publication Date
WO2021180182A1 true WO2021180182A1 (zh) 2021-09-16

Family

ID=77616460

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/080279 WO2021180182A1 (zh) 2020-03-13 2021-03-11 Method, apparatus and storage medium for classifying samples based on immunosignature technology

Country Status (2)

Country Link
CN (1) CN113393902A (zh)
WO (1) WO2021180182A1 (zh)

Also Published As

Publication number Publication date
CN113393902A (zh) 2021-09-14

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21768970

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21768970

Country of ref document: EP

Kind code of ref document: A1

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 16/02/2023)
