WO2022205775A1 - Method and device for determining immunity index of individual, electronic device, and machine-readable storage medium - Google Patents

Publication number
WO2022205775A1
Authority
WO
WIPO (PCT)
Prior art keywords
immune
index
individual
sequencing
sequence
Prior art date
Application number
PCT/CN2021/117149
Other languages
French (fr)
Chinese (zh)
Inventor
柴相花
袁玉英
王梦杰
强薇
李宁
Original Assignee
深圳华大基因股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 深圳华大基因股份有限公司
Priority to CN202180065823.0A (patent CN116391237A)
Publication of WO2022205775A1

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/30 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for calculating health indices; for individual health risk assessment

Definitions

  • The present invention relates to the field of biomedicine, and in particular to a method, a device, an electronic device and a machine-readable storage medium for determining an individual's immunity index.
  • Immunity is the body's own defense mechanism. It is the body's ability to identify and eliminate foreign invaders (viruses, bacteria, etc.), to deal with its own aging, damaged, dead and degenerated cells, and to identify and deal with mutated cells and virus-infected cells. It is the physiological response by which the body identifies and excludes "non-self".
  • The body's immunity is maintained by the immune system, the best doctor in the world, which the human body is born with.
  • The immune system consists of two cooperating subsystems that provide innate and adaptive immunity.
  • Innate immunity refers to non-specific defense mechanisms that protect the body from toxins or foreign substances (called antigens).
  • The rapid response of the innate immune system also activates the adaptive system, which is the body's own antigen-specific response.
  • The adaptive immune system consists of two main types of lymphocytes, called B cells and T cells. These lymphocytes have unique antigen receptors, each of which recognizes only one antigen, and this range of specificity is encoded by a fixed number of gene segments. Through a mechanism called V(D)J recombination, these genetic regions undergo irreversible somatic DNA recombination during cell development, resulting in mature lymphocytes with a single specificity.
  • The immune repertoire refers to all the unique genetic rearrangements of T cell receptors (TCRs) and B cell receptors (BCRs) within the adaptive immune system.
  • Immune repertoire NGS detection provides technical support for evaluating the body's adaptive immune system in healthy or diseased states.
  • Immunoglobulins and complement are the main effector components of humoral immunity. In certain diseases (such as infections, autoimmune diseases and immunodeficiency diseases), the concentrations of these indicators rise or fall relative to the reference values, which gives them clinical value for evaluating immunity and diagnosing disease.
  • The five-item immunoassay targets humoral immunity and cannot assess cellular immunity well.
  • For humoral immunity, only the overall levels of IgG, IgA, IgM and complement C3 and C4 can be detected; in-depth analysis at the molecular sequence level is not possible.
  • Lymphocyte subset analysis uses flow cytometry and PCR technology to analyze the number and relative proportion of each leukocyte subset in peripheral blood.
  • With flow cytometry or PCR technology, the relative and absolute counts of immune cells in peripheral blood and their changes are monitored to assess the immune status in disease states (such as tumors, infectious diseases and immune diseases), assisting in diagnosis, tracking disease progression and deciding the timing of medication.
  • The most commonly detected subsets include T cells (CD3), B cells (CD19), NK cells (CD16+56), helper T cells (CD3+CD4+), and suppressor T cells (CD3+CD8+).
  • There are many types of lymphocyte subsets, and a comprehensive analysis would require an unacceptable amount of peripheral blood, cost and time, while analyzing only a few subsets makes it difficult to obtain a comprehensive picture of immune system status. In addition, lymphocyte subsets have different normal reference ranges at different ages, and the results are affected by many factors, making clinical interpretation relatively difficult.
  • The present invention aims to solve, at least to some extent, one of the technical problems in the related art.
  • An object of the present invention is to carry out high-sensitivity detection of an individual's adaptive immune system at the molecular sequence level by immune repertoire sequencing, and to use the Immune Age (IA) to assess the individual's health status so as to achieve early prediction of health risks.
  • The present invention proposes a method for determining an individual's immunity index.
  • The method includes: (1) acquiring nucleic acid sequencing data of the individual to be tested; (2) determining the V/J sequence and the CDR sequence contained in the nucleic acid sample by comparing the sequencing result with a reference sequence; (3) determining statistical features based on the V/J sequence and the CDR sequence contained in the nucleic acid sample, the statistical features including at least one selected from the following: V/J gene usage diversity index, immune cell diversity index, number of immune cell types, and immune cell homogeneity index; (4) determining an immune age value of the individual based on the statistical features; and (5) determining an immunity index of the individual based on the immune age value.
  • The method of the present invention can be implemented by sequencing a small amount of sample, enabling high-sensitivity detection of the individual's adaptive immune system at the molecular level, as well as non-invasive early diagnosis, evaluation of curative effect, tracking of disease condition, prediction of relapse and comprehensive immune assessment.
  • PCR technology can be used to amplify the genes contained in lymphocytes in peripheral blood, which requires less blood. Subsequent sample processing is simple, and neither imprecise manual observation of blood cells nor complicated immunolabeling and flow cytometry analysis is required.
  • Immune evaluation by immune repertoire sequencing can not only improve the sensitivity of detection but also support early diagnosis, evaluation of curative effect, tracking of disease condition, prediction of recurrence and comprehensive evaluation of immunity.
  • The present invention provides a device for determining an individual's immunity index.
  • The device includes: a sequencing data acquisition unit for acquiring nucleic acid sequencing data of an individual to be tested; a sequencing result analysis unit for determining the V/J sequence and CDR sequence contained in the nucleic acid sample by comparing the sequencing result with a reference sequence; a statistical unit for determining statistical features based on the V/J sequence and CDR sequence contained in the nucleic acid sample, the statistical features including at least one selected from the following: V/J gene usage diversity index, immune cell diversity index, number of immune cell types, and immune cell homogeneity index; an immune age determination unit for determining an immune age value of the individual based on the statistical features; and an immunity index determination unit for determining an immunity index of the individual based on the immune age value.
  • The present invention provides an electronic device. According to an embodiment of the present invention, it comprises a processor and a memory; the memory stores machine-executable instructions executable by the processor, and the processor executes the machine-executable instructions to implement the aforementioned method of determining an individual's immunity index.
  • The present invention provides a machine-readable storage medium.
  • The machine-readable storage medium stores machine-executable instructions which, when called and executed by a processor, cause the processor to implement the method of determining an individual's immunity index as described in any preceding item.
  • FIG. 1 is a schematic flowchart of a method for determining an individual immunity index according to an embodiment of the present invention.
  • FIG. 2 is a partial schematic flowchart of a method for determining an individual immunity index according to an embodiment of the present invention.
  • FIG. 3 is a schematic structural diagram of a device for determining an individual immunity index according to an embodiment of the present invention.
  • FIG. 4 is a partial schematic structural diagram of a device for determining an individual immunity index according to an embodiment of the present invention.
  • FIG. 5 shows the predicted immunity index for different age groups in Example 2 of the present invention.
  • FIG. 6 is a distribution diagram of the relationship between the immunity index and individual age in Example 2 of the present invention.
  • Embodiments of the present invention are described in detail below.
  • The embodiments described below are exemplary, serve only to explain the present invention, and should not be construed as limiting it. Where no specific technique or condition is indicated in the examples, the techniques or conditions described in the literature of the field or in the product specifications are used.
  • Reagents or instruments for which no manufacturer is indicated are conventional products available on the market.
  • The present invention proposes a method for determining the immunity index of an individual. Referring to FIG. 1, according to an embodiment of the present invention, the method includes:
  • Nucleic acid sequencing data from the individual to be tested are first acquired for subsequent analysis.
  • These nucleic acid sequencing data may contain the genetic information of immune cells. For example, according to embodiments of the present invention, blood samples containing immune cells or tissue samples containing immune cells may be used. Tissue samples as described herein should be understood in a broad sense and can include at least part of an organ, such as the non-encapsulated diffuse lymphoid tissue and lymph nodes found beneath the mucosa of the intestinal tract, respiratory tract, urogenital tract, etc.
  • Nucleic acid sequencing data can be obtained by high-throughput sequencing.
  • Second- or third-generation sequencing platforms can be used, including but not limited to high-throughput sequencing platforms such as MGISEQ-T7, MGISEQ-2000, MGISEQ-200, BGISEQ-500, BGISEQ-50, MGISP-960, and MGISP-100.
  • The sequencing process includes:
  • For blood or tissue samples, DNA or RNA is extracted. For each sample, a starting amount of DNA or RNA is taken, primers for a given chain (TCR or BCR) are added, and multiplex PCR amplification is performed. PCR is carried out in two rounds: the first round is a PCR reaction with VJ-specific primers (carrying part of the sequencing adapter), and the second round adds the sequencing adapters by ordinary PCR for library construction. Multiple samples are then pooled for sequencing, yielding data for each sample. According to an embodiment of the present invention, a tag sequence may also be introduced in the second round of PCR to distinguish sample batches.
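Pooling relies on the tag sequence introduced in the second PCR round to tell samples apart after sequencing. A minimal sketch of that demultiplexing step, assuming (hypothetically) that each read begins with a fixed-length tag and that exact matching is sufficient. The function name, tag sequences and sample names are illustrative and not taken from the patent:

```python
from collections import defaultdict

def demultiplex(reads, tag_to_sample, tag_len):
    """Group reads by the sample whose tag matches the first tag_len bases.

    Assumption: the tag sits at the 5' end of each read and is matched exactly.
    Reads whose tag is unknown are collected under the key None.
    """
    by_sample = defaultdict(list)
    for read in reads:
        tag, insert = read[:tag_len], read[tag_len:]
        by_sample[tag_to_sample.get(tag)].append(insert)
    return dict(by_sample)

# Hypothetical tags and reads, for illustration only.
tags = {"ACGT": "sample_1", "TGCA": "sample_2"}
pooled = ["ACGTGGGCCC", "TGCAAATTTG", "ACGTCCCAAA", "NNNNGGGGGG"]
groups = demultiplex(pooled, tags, tag_len=4)
```

A production pipeline would usually tolerate one or two mismatches in the tag; exact matching is used here only to keep the sketch short.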
  • Acquiring nucleic acid sequencing data may further include:
  • A nucleic acid sample of the individual to be tested is obtained, the nucleic acid sample including at least one of DNA molecules and RNA molecules.
  • Those skilled in the art can use commercially available kits and follow the manufacturer's instructions to extract DNA molecules or RNA molecules. Those skilled in the art will understand that, once RNA molecules are obtained, cDNA molecules can readily be obtained by reverse transcription.
  • VJ-specific primers can be used to perform a first amplification process so as to obtain a first amplification product.
  • The immune cell-specific sequences of the V gene and the J gene contained in the nucleic acid sample obtained in step S110 may be amplified by VJ-specific primers.
  • VJ-specific primers are primers that can specifically amplify V and J genes. It is worth noting that, for most loci, V and J genes are grouped into families according to their degree of homology. These VJ-specific primers can be used to analyze the combinatorial diversity of V-J rearrangements at at least one locus selected from TRA, TRB, TRG, TRD, IgH, IgK, IgL, and the like.
  • The VJ-specific primer used in the present invention has the following nucleotide sequence:
  • The VJ-specific primer contains part of the sequencing adapter sequence, which makes it convenient to introduce sequencing adapters into the amplification products in the second amplification process.
  • A second amplification process is performed on the first amplification product to obtain a second amplification product, wherein the second amplification product carries a sequencing adapter.
  • The second amplification process can make use of the common sequence in the first amplification product, and the primers used can be designed to introduce the sequencing adapters.
  • The resulting second amplification product constitutes a sequencing library that can be used for sequencing.
  • The second amplification product is sequenced to obtain sequencing results.
  • The sequencing library (second amplification product) can be sequenced using a sequencing platform.
  • Paired-end sequencing is preferably used, which can improve the efficiency of subsequent analysis.
  • The V/J sequence and the CDR sequence contained in the nucleic acid sample are determined by aligning the sequencing result with the reference sequence.
  • Software such as SOAPnuke (v1.5.3) can be used to filter adapter-contaminated sequences, low-quality bases and low-quality sequences out of the raw sequencing data.
  • The FASTQ file is then converted into a FASTA file with a self-developed program for sequence assembly; finally, if the sequencing mode is paired-end sequencing, COPE (v1.5.3) and the self-developed program are used to assemble the read pairs.
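The filtering and FASTQ-to-FASTA conversion can be sketched in plain Python. This is not SOAPnuke's implementation or command line; it is a simplified stand-in that drops a read when its mean Phred quality is 20 or lower, or when it contains five or more N bases, assuming Phred+33 quality encoding. The function names and default thresholds are illustrative assumptions:

```python
def mean_phred(quality, offset=33):
    """Mean Phred score of a FASTQ quality string (Phred+33 encoding assumed)."""
    return sum(ord(ch) - offset for ch in quality) / len(quality)

def keep_read(seq, quality, min_mean_q=20, max_n=5):
    """Return True if the read passes both filters.

    A read is dropped when its mean quality is <= min_mean_q
    or when it contains >= max_n 'N' bases.
    """
    if not seq or len(seq) != len(quality):
        return False
    if mean_phred(quality) <= min_mean_q:
        return False
    if seq.upper().count("N") >= max_n:
        return False
    return True

def fastq_to_fasta(records):
    """Render the (name, seq, quality) records that pass the filter as FASTA text."""
    return "".join(
        f">{name}\n{seq}\n" for name, seq, qual in records if keep_read(seq, qual)
    )

# Toy records, for illustration only.
records = [
    ("r1", "ACGTACGT", "IIIIIIII"),  # Phred 40 at every base: kept
    ("r2", "ACGTACGT", "!!!!!!!!"),  # Phred 0 at every base: dropped
    ("r3", "NNNNNCGT", "IIIIIIII"),  # five N bases: dropped
]
fasta = fastq_to_fasta(records)
```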
  • blastall (v2.2.25) can be used to align the preprocessed FASTA sequences to the V(D)J reference gene sequences; the self-developed program then performs re-alignment and selects the best alignment result. That is, the non-CDR3 and CDR3 regions are scored by different methods, the best hit with the highest score is selected, and each sequenced read is assigned by alignment to the CDR, V and J reference sequences, thereby determining its CDR sequence and V/J sequence.
  • The structure of the immune molecules is then analyzed. This part has two main functions: error correction and region determination. First, errors introduced during PCR and sequencing are corrected by self-developed programs; then the CDR regions are determined using the V/J gene reference sequences, the rules of conserved amino acids and established computational methods.
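The patent does not describe its self-developed region-determination program. As a rough illustration of what a conserved-amino-acid rule can look like, the sketch below locates a CDR3 junction in a translated receptor sequence by searching for a conserved cysteine followed, within a bounded distance, by the Phe/Trp-Gly-X-Gly motif of the J region. The regular expression, the 3 to 30 residue length bound and the function name are assumptions made for this example:

```python
import re

# Junction = conserved Cys ... conserved Phe/Trp, immediately followed by G-X-G.
# The 3-30 residue bound between Cys and Phe/Trp is an illustrative assumption.
JUNCTION = re.compile(r"(C[A-Z]{3,30}?[FW])G[A-Z]G")

def find_cdr3(protein):
    """Return the junction (Cys through Phe/Trp inclusive), or None if absent."""
    match = JUNCTION.search(protein.upper())
    return match.group(1) if match else None

# Hypothetical TCR beta chain fragment, for illustration only.
fragment = "DSAVYFCASSLGQGAYEQYFGPGTRLTVT"
cdr3 = find_cdr3(fragment)
```

A motif search of this kind only handles in-frame, productive rearrangements; reads with frameshifts or a missing anchor residue need the alignment-based assignment described above.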
  • The CDR sequence can be determined by a common method.
  • The CDR sequence is at least one of the CDR1, CDR2 and CDR3 sequences, preferably the CDR3 sequence, because CDR3 has the greatest variation and directly determines the antigen-binding specificity of the TCR.
  • The CDR3 of the TCR is encoded by the three genes V, D and J. During lymphocyte maturation, a variety of recombined sequence fragments are formed through rearrangement of the V, D and J genes; together with DNA base SNP and indel mutations, this creates the diversity of T cells.
  • "V/J" refers to at least a portion of the result of a V(D)J rearrangement in a particular cell. It may be a V gene sequence, a J gene sequence, or a combination of a V gene sequence and a J gene sequence, and a D gene sequence may also be sandwiched between the V gene sequence and the J gene sequence.
  • Statistical features are determined based on the V/J sequences and CDR sequences contained in the nucleic acid sample, and the statistical features include at least one selected from the following: V/J gene usage diversity index, immune cell diversity index, number of immune cell types, and immune cell homogeneity index.
  • At least one of the V/J gene usage diversity index and the immune cell diversity index is a Shannon index.
  • The type of immune cells is determined based on the CDR3 sequence.
  • The immune cell homogeneity index is the Gini index.
  • The immune repertoire feature data are tabulated, and the statistical features mainly include the following:
  • V/J gene usage diversity, i.e. Shannon_index(V-J);
  • Immune cell homogeneity, i.e. Clone_Gini.
  • Shannon_index represents the Shannon index. Taking CDR3 as an example, it is calculated as $\mathrm{Shannon\_index} = -\sum_{i=1}^{S} p(i)\,\ln p(i)$, where S represents the total number of unique CDR3s and p(i) represents the frequency of the i-th unique CDR3.
  • Uniq_number represents the number of unique sequences.
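The Shannon index and the unique-sequence count can be computed directly from a list of observed clonotypes. The sketch below uses the natural logarithm (a different base only rescales the value) and applies the same function to CDR3 sequences and to V-J gene pairs. The function names and the toy data are illustrative assumptions:

```python
import math
from collections import Counter

def shannon_index(items):
    """Shannon index H = -sum(p_i * ln p_i) over the distinct items observed."""
    counts = Counter(items)
    total = sum(counts.values())
    # ln(total / c) equals -ln(p_i), which keeps every term non-negative.
    return sum((c / total) * math.log(total / c) for c in counts.values())

def uniq_number(items):
    """Number of distinct sequences (clonotypes) observed."""
    return len(set(items))

# Toy repertoire, for illustration only: one CDR3 and one V-J pair per read.
cdr3s = ["CASSL", "CASSL", "CASSQ", "CASSP"]
vj_pairs = [
    ("TRBV5-1", "TRBJ2-7"),
    ("TRBV5-1", "TRBJ2-7"),
    ("TRBV7-9", "TRBJ1-1"),
    ("TRBV20-1", "TRBJ2-1"),
]

cdr3_diversity = shannon_index(cdr3s)    # immune cell diversity
vj_diversity = shannon_index(vj_pairs)   # V/J gene usage diversity
n_clonotypes = uniq_number(cdr3s)
```

Both toy inputs have frequencies 0.5, 0.25 and 0.25, so both indices equal 1.5 ln 2.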
  • Clone_Gini represents the Gini index, calculated from x, the frequency of each immune cell type, and n, the number of immune cell types.
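The exact expression used for Clone_Gini is not reproduced in this text. A minimal sketch using one standard form of the Gini coefficient, $G = \sum_{i=1}^{n} (2i - n - 1)\,x_{(i)} / (n \sum_{i} x_i)$ with the frequencies sorted in ascending order, is shown below; the function name is hypothetical, and whether the patent applies any further normalization is not known:

```python
def gini_index(freqs):
    """Gini coefficient of clonotype frequencies (0 = perfectly even).

    Uses the sorted-rank form: sum((2i - n - 1) * x_i) / (n * sum(x)),
    with x sorted in ascending order and i running from 1 to n.
    """
    xs = sorted(freqs)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    weighted = sum((2 * i - n - 1) * x for i, x in enumerate(xs, start=1))
    return weighted / (n * total)

# Toy frequency vectors, for illustration only.
even = gini_index([0.25, 0.25, 0.25, 0.25])  # all clonotypes equally frequent
skewed = gini_index([0, 0, 0, 1])            # one clonotype dominates
```

With this form the maximum for n clonotypes is (n - 1)/n, so the dominated four-clonotype example gives 0.75 rather than 1.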
  • The immune age value of the individual is determined.
  • The immune age value is determined based on at least one statistical feature using maximum a posteriori (MAP) probability estimation.
  • Step S400 further includes: (4-1) using a predetermined distribution of immune age prediction coefficients, determining, based on each statistical feature, the immune age prediction coefficient corresponding to that feature (the prior distribution of the parameters is chosen mainly according to the characteristics of the selected feature; if the selected feature is continuous and the amount of data is large, it is generally taken to be a normal distribution); and (4-2) determining the immune age of the individual according to the formula $IA = \beta_0 + \sum_{i=1}^{n} \beta_i x_i$, where IA represents the immune age of the individual, i is the index of a statistical feature, n is the number of statistical features, $\beta_i$ is the immune age prediction coefficient corresponding to the i-th statistical feature, $x_i$ is the value of the i-th statistical feature, and $\beta_0$ is the bias term of the prediction model.
  • ¬A means "not A".
  • Biochemical indicators mainly include conventional indicators, such as a comprehensive biochemistry panel and routine blood tests.
  • For the training data, the prior distribution of the parameters is determined mainly by the characteristics of the selected features. If a selected feature is continuous and the amount of data is large, it is generally taken to follow a normal distribution; if it is discrete, it is directly weighted and multiplied according to the formula.
  • The features selected for the training set mainly include indicators obtained from immune repertoire analysis (V/J gene usage diversity, immune diversity, immune cell types, immune cell homogeneity) and some biochemical indicators (comprehensive biochemistry panel, routine blood tests, etc.).
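One common reading of MAP estimation for a linear model with a normal prior on the coefficients is ridge regression: with Gaussian noise and a zero-mean Gaussian prior, the MAP coefficients minimize the squared error plus a penalty `lam` times the sum of squared coefficients. The sketch below fits such a model in plain Python and predicts an immune age from a feature vector. The choice of prior, the unpenalized bias term, the value of `lam`, the function names and the training data are all assumptions for illustration, not the patent's trained model:

```python
def solve(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def fit_map(features, ages, lam=1.0):
    """MAP coefficients [b0, b1, ..., bn] under a zero-mean Gaussian prior.

    lam is the ratio of noise variance to prior variance; the bias b0 is
    left unpenalized (an assumption of this sketch).
    """
    rows = [[1.0] + list(f) for f in features]
    k = len(rows[0])
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    aty = [sum(r[i] * y for r, y in zip(rows, ages)) for i in range(k)]
    for i in range(1, k):
        ata[i][i] += lam
    return solve(ata, aty)

def predict_ia(coeffs, x):
    """IA = b0 + sum(b_i * x_i)."""
    return coeffs[0] + sum(b * v for b, v in zip(coeffs[1:], x))

# Synthetic training data, for illustration only: one feature, exact line.
train_x = [[1.0], [2.0], [3.0], [4.0]]
train_age = [75.0, 70.0, 65.0, 60.0]
unregularized = fit_map(train_x, train_age, lam=0.0)
shrunk = fit_map(train_x, train_age, lam=1.0)
ia = predict_ia(unregularized, [5.0])
```

With `lam=0` the fit reduces to ordinary least squares; a positive `lam` shrinks the coefficients toward the prior mean of zero, which is the practical effect of the prior.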
  • The immunity index of the individual is determined based on the immune age value.
  • The immunity index is determined by a formula in which IA represents the immune age value determined in step S400, IAmax represents the upper limit of IA in the predetermined population, and IAmin represents the lower limit of IA in the predetermined population.
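The formula itself is not reproduced in this text. One plausible form, assumed here purely for illustration, is a min-max scaling of the immune age within the population range, inverted so that a lower immune age gives a higher index. The function name, the inversion, the 0 to 100 scale and the clipping are all assumptions of this sketch:

```python
def immunity_index(ia, ia_min, ia_max, scale=100.0):
    """Map an immune age onto [0, scale]; a lower immune age gives a higher index.

    Assumed form: scale * (ia_max - ia) / (ia_max - ia_min), with ia clipped
    to the population range [ia_min, ia_max].
    """
    if ia_max <= ia_min:
        raise ValueError("ia_max must be greater than ia_min")
    clipped = min(max(ia, ia_min), ia_max)
    return scale * (ia_max - clipped) / (ia_max - ia_min)

# Hypothetical population bounds, for illustration only.
youngest = immunity_index(20.0, ia_min=20.0, ia_max=80.0)
middle = immunity_index(50.0, ia_min=20.0, ia_max=80.0)
oldest = immunity_index(80.0, ia_min=20.0, ia_max=80.0)
```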
  • Once the individual's immunity index has been determined, this technical solution enables high-sensitivity detection of the individual's adaptive immune system at the molecular level, and enables non-invasive early diagnosis, evaluation of curative effect, disease tracking, recurrence prediction and comprehensive immune evaluation.
  • The present invention provides a device for determining an individual immunity index. The device includes:
  • a sequencing data acquisition unit 100 for acquiring nucleic acid sequencing data of the individual to be tested; a sequencing result analysis unit 200 for determining the V/J sequence and CDR sequence contained in the nucleic acid sample by comparing the sequencing result with the reference sequence; a statistical unit 300 for determining statistical features based on the V/J sequences and CDR3 sequences contained in the nucleic acid sample, the statistical features including at least one selected from the following: V/J gene usage diversity index, immune cell diversity index, number of immune cell types, and immune cell homogeneity index; an immune age determination unit 400 for determining the immune age value of the individual based on the statistical features; and an immunity index determination unit 500 for determining the immunity index of the individual based on the immune age value.
  • The sequencing data acquisition unit further includes a nucleic acid sample acquisition module 110, a first amplification module 120, a second amplification module 130 and a sequencing module 140.
  • The nucleic acid sample acquisition module 110 is used to acquire a nucleic acid sample of the individual to be tested, the nucleic acid sample including at least one of DNA molecules and RNA molecules;
  • the first amplification module 120 is used to perform the first amplification process with VJ-specific primers to obtain the first amplification product;
  • the second amplification module 130 is used to perform the second amplification process on the first amplification product to obtain the second amplification product, wherein the second amplification product carries a sequencing adapter;
  • the sequencing module 140 is used to sequence the second amplification product so as to obtain a sequencing result.
  • The nucleic acid sample is obtained from an individual's blood or tissue sample.
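The chaining of units 100 to 500 can be pictured as a small software pipeline in which each unit is a replaceable function. The class name, field names and stub units below are hypothetical and serve only to show the data flow; they are not the patent's implementation:

```python
from dataclasses import dataclass
from typing import Any, Callable

@dataclass
class ImmunityIndexDevice:
    """Composes the five units; each field stands in for one unit."""
    acquire: Callable[[Any], Any]          # sequencing data acquisition unit 100
    analyze: Callable[[Any], Any]          # sequencing result analysis unit 200
    summarize: Callable[[Any], Any]        # statistical unit 300
    estimate_age: Callable[[Any], float]   # immune age determination unit 400
    to_index: Callable[[float], float]     # immunity index determination unit 500

    def run(self, sample):
        reads = self.acquire(sample)
        clonotypes = self.analyze(reads)
        features = self.summarize(clonotypes)
        immune_age = self.estimate_age(features)
        return self.to_index(immune_age)

# Stub units with made-up behaviour, for illustration only.
device = ImmunityIndexDevice(
    acquire=lambda sample: sample.split(),
    analyze=lambda reads: [r.upper() for r in reads],
    summarize=lambda clonotypes: {"uniq": len(set(clonotypes))},
    estimate_age=lambda features: 60.0 - 10.0 * features["uniq"],
    to_index=lambda immune_age: 100.0 - immune_age,
)
result = device.run("casl casl casq")
```

Keeping each unit behind a plain callable makes it easy to swap a stub for a real aligner or a trained model without touching the rest of the chain.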
  • The immune age determination unit is adapted to determine the immune age value based on the at least one statistical feature using maximum a posteriori probability estimation.
  • The immune age determination unit is configured to: using a predetermined distribution of immune age prediction coefficients, determine, based on each of the statistical features, the immune age prediction coefficient corresponding to that feature; and determine the immune age of the individual according to the formula $IA = \beta_0 + \sum_{i=1}^{n} \beta_i x_i$, where IA represents the immune age of the individual, i is the index of a statistical feature, n is the number of statistical features, $\beta_i$ is the immune age prediction coefficient corresponding to the i-th statistical feature, $x_i$ is the value of the i-th statistical feature, and $\beta_0$ is the bias term of the prediction model.
  • The immunity index is determined by a formula in which IA represents the immune age value determined by the immune age determination unit, IAmax represents the upper limit of IA in the predetermined population, and IAmin represents the lower limit of IA in the predetermined population.
  • Primers with sequencing adapters are used for further amplification and library construction, and the sequencing library is subjected to high-throughput sequencing.
  • The sequencing data are analyzed as follows:
  • SOAPnuke (v1.5.3) was used to filter adapter-contaminated sequences, low-quality bases and low-quality sequences out of the raw sequencing data. Sequences were filtered according to the average quality value of their bases and the number of N bases they contain: a read was discarded if its base quality value was less than or equal to 20, if it contained 5 or more N bases, or both.
  • Maximum a posteriori (MAP) probability estimation is used for prediction. In the model, IA represents the immune age of the predicted sample, and IAmax and IAmin represent the upper and lower bounds in the population distribution, respectively.
  • The immunity index showed a downward trend with increasing age. Although the sample size of the over-50 age group is small and the decline of the immunity index shown in FIG. 6 is therefore not obvious, the decline shown in FIG. 5 is clearer. The results of this example thus indicate that the immunity index can be used as an indicator for evaluating health.

Abstract

A method and device for determining the immunity index of an individual, an electronic device, and a machine-readable storage medium. The method comprises: acquiring nucleic acid sequencing data of an individual to be tested (S100); determining a V/J sequence and a CDR sequence contained in a nucleic acid sample by comparing a sequencing result with a reference sequence (S200); determining statistical features on the basis of the V/J sequence and the CDR sequence contained in the nucleic acid sample (S300), the statistical features comprising at least one selected from among the following: the usage diversity index of a V/J gene, the diversity index of immune cells, the number of immune cell types, and the homogeneity index of immune cells; determining an immune age value of the individual on the basis of the statistical features (S400); and determining the immunity index of the individual on the basis of the immune age value (S500). The method for determining the immunity index of an individual can be implemented by using a small number of samples by sequencing.

Description

Method, device, electronic device, and machine-readable storage medium for determining an individual's immunity index
Technical Field
The present invention relates to the field of biomedicine, and in particular to a method, a device, an electronic device, and a machine-readable storage medium for determining an individual's immunity index.
Background Art
Immunity is the body's own defense mechanism: the ability to recognize and eliminate any invading foreign matter (viruses, bacteria, etc.), to dispose of the body's own aged, damaged, dead, or degenerated cells, and to recognize and deal with mutated cells and virus-infected cells in the body. It is the physiological response by which the body recognizes and excludes "non-self". The body's immunity is maintained by the immune system, which is the best doctor in the world that the human body is born with.
The immune system consists of two cooperating subsystems that provide innate and adaptive immunity. Innate immunity refers to non-specific defense mechanisms that protect the body from toxins or foreign substances (called antigens). The rapid response of the innate immune system also activates the adaptive system, which is the body's own antigen-specific response.
The adaptive immune system consists of two main types of lymphocytes, called B cells and T cells. These lymphocytes carry unique antigen receptors, each of which recognizes only one antigen, and this range of specificity is encoded by a fixed number of gene segments. Through a mechanism called V(D)J recombination, these genetic regions undergo irreversible somatic DNA recombination during cell development, producing mature lymphocytes with a single specificity. The immune repertoire refers to all the unique genetic rearrangements of T cell receptors (TCRs) and B cell receptors (BCRs) within the adaptive immune system.
With the development of precision medicine and immunotherapy, immune repertoires are being applied in an increasingly wide range of scenarios. These include biomarker discovery, detection of autoimmune and infectious diseases, assessment of immune rejection and tolerance, tumor immunity assessment, immune reconstitution, and drug and vaccine evaluation. Immune repertoire NGS testing therefore provides technical support for evaluating the body's adaptive immune system in healthy or diseased states.
The main methods currently on the market for analyzing immune function are:
1) The five-item immune panel, which measures immunoglobulin and complement levels in the blood. Immunoglobulin G (IgG), immunoglobulin A (IgA), immunoglobulin M (IgM), and complement C3 and C4 in the blood are measured by methods such as single immunodiffusion, enzyme-linked immunosorbent assay (ELISA), radioimmunoassay (RIA), immunofixation electrophoresis, and immunoturbidimetry. Immunoglobulins and complement are the main effector components of humoral immunity. In certain diseases (such as infections, autoimmune diseases, and immunodeficiency diseases), the concentrations of these markers rise or fall relative to reference values, which gives them clinical value for assessing immunity and diagnosing disease. However, the five-item panel targets humoral immunity and cannot assess cellular immunity well. Even for humoral immunity, it measures only the overall levels of IgG, IgA, IgM, and complement C3 and C4, and cannot support in-depth analysis at the molecular sequence level.
2) The routine blood test, which uses cell counting to determine the number of white blood cells in peripheral blood; an elevated count indicates an inflammatory response in the body. White blood cells in peripheral blood are classified and counted under a microscope. A total count above the upper reference limit is called leukocytosis, and a count below the lower reference limit is called leukopenia. The increase or decrease is driven mainly by the neutrophil count, although changes in the numbers of lymphocytes and other cells also alter the total. Anything from physiological changes to malignant tumors can cause an abnormal total white blood cell count, and physicians can use routine blood test results in clinical diagnosis. However, the routine blood test gives only a rough picture of the overall level of cellular immunity; it cannot resolve immunity against specific diseases, nor can it characterize the classification and diversity of immune cells at the gene level.
3) Lymphocyte subset analysis, which uses flow cytometry and PCR to determine the number and relative proportion of each leukocyte subset in peripheral blood. The relative counts, absolute counts, and changes of immune cells in peripheral blood are monitored by flow cytometry or PCR in order to analyze immune status in disease states (such as tumors, infectious diseases, and immune diseases), thereby assisting diagnosis, tracking disease progression, and deciding the timing of medication. The most commonly measured subsets include T cells (CD3), B cells (CD19), NK cells (CD16+56), helper T cells (CD3+CD4+), and suppressor T cells (CD3+CD8+). However, there are many lymphocyte subsets, and a comprehensive analysis would require a volume of peripheral blood, a cost, and an amount of time that are all prohibitive. Analyzing only a few subsets, on the other hand, does not give a comprehensive picture of the immune system. In addition, lymphocyte subsets have different normal reference ranges at different ages, and the results are affected by many factors, which makes clinical interpretation relatively difficult.
Summary of the Invention
The present invention aims to solve, at least to some extent, one of the technical problems in the related art. To this end, one object of the present invention is to perform high-sensitivity detection of an individual's adaptive immune system at the molecular sequence level by immune repertoire sequencing, to assess the individual's immunity through a comprehensive analysis of multiple indicators (such as diversity and homogeneity) of the immunoglobulin genes and TCR genes, and to assess the health status of the individual through the immune age (Immune Age, IA), thereby enabling early prediction of health risks.
In a first aspect of the present invention, a method for determining an individual's immunity index is provided. According to an embodiment of the present invention, the method includes: (1) acquiring nucleic acid sequencing data of the individual to be tested; (2) determining the V/J sequences and CDR sequences contained in the nucleic acid sample by aligning the sequencing result with a reference sequence; (3) determining statistical features based on the V/J sequences and CDR sequences contained in the nucleic acid sample, the statistical features including at least one selected from the following: the V/J gene usage diversity index, the immune cell diversity index, the number of immune cell types, and the immune cell homogeneity index; (4) determining an immune age value of the individual based on the statistical features; and (5) determining the immunity index of the individual based on the immune age value.
According to embodiments of the present invention, because the method relies on sequencing, it can be carried out with a small amount of sample. It achieves high-sensitivity detection of the individual's adaptive immune system at the molecular level and enables non-invasive early diagnosis, evaluation of therapeutic efficacy, disease tracking, recurrence prediction, and comprehensive assessment of immunity. For example, according to an embodiment of the present invention, PCR can be used to amplify genes contained in lymphocytes in peripheral blood. This requires little blood, the subsequent sample processing is simple, and neither inaccurate manual blood cell observation nor complicated immunolabeling and flow cytometry is needed. For myeloma testing, only peripheral blood needs to be drawn and no bone marrow puncture is required, which reduces harm to the patient. In short, according to embodiments of the present invention, immune assessment by immune repertoire sequencing not only improves detection sensitivity but also supports early diagnosis, evaluation of therapeutic efficacy, disease tracking, recurrence prediction, and comprehensive assessment of immunity.
In a second aspect of the present invention, a device for determining an individual's immunity index is provided. According to an embodiment of the present invention, the device includes: a sequencing data acquisition unit for acquiring nucleic acid sequencing data of the individual to be tested; a sequencing result analysis unit for determining the V/J sequences and CDR sequences contained in the nucleic acid sample by aligning the sequencing result with a reference sequence; a statistics unit for determining statistical features based on the V/J sequences and CDR sequences contained in the nucleic acid sample, the statistical features including at least one selected from the following: the V/J gene usage diversity index, the immune cell diversity index, the number of immune cell types, and the immune cell homogeneity index; an immune age determination unit for determining an immune age value of the individual based on the statistical features; and an immunity index determination unit for determining the immunity index of the individual based on the immune age value.
With the device of this embodiment, the method for determining an individual's immunity described above can be carried out effectively. The features and advantages described above therefore apply equally to the device and are not repeated here.
In a third aspect of the present invention, an electronic device is provided. According to an embodiment of the present invention, the electronic device includes a processor and a memory, the memory storing machine-executable instructions that can be executed by the processor, and the processor executing the machine-executable instructions to implement the method for determining an individual's immunity index described above.
In a fourth aspect of the present invention, a machine-readable storage medium is provided. According to an embodiment of the present invention, the machine-readable storage medium stores machine-executable instructions which, when invoked and executed by a processor, cause the processor to implement the method for determining an individual's immunity index described in any of the foregoing.
Brief Description of the Drawings
FIG. 1 is a schematic flowchart of a method for determining an individual's immunity index according to an embodiment of the present invention;
FIG. 2 is a schematic flowchart of part of a method for determining an individual's immunity index according to an embodiment of the present invention;
FIG. 3 is a schematic structural diagram of a device for determining an individual's immunity index according to an embodiment of the present invention;
FIG. 4 is a schematic structural diagram of part of a device for determining an individual's immunity index according to an embodiment of the present invention;
FIG. 5 shows the predicted immunity index for different age groups in Example 2 of the present invention;
FIG. 6 is a distribution plot of the relationship between immunity index and individual age in Example 2 of the present invention.
Detailed Description
Embodiments of the present invention are described in detail below. The embodiments described below are exemplary; they serve only to explain the present invention and should not be construed as limiting it. Where no specific technique or condition is indicated in the examples, the techniques or conditions described in the literature of the field, or in the product specification, are followed. Reagents or instruments for which no manufacturer is indicated are conventional, commercially available products.
In a first aspect of the present invention, a method for determining an individual's immunity index is provided. Referring to FIG. 1, according to an embodiment of the present invention, the method includes:
S100 Acquiring nucleic acid sequencing data
According to an embodiment of the present invention, in this step, nucleic acid sequencing data from the individual to be tested are first acquired for use in subsequent analysis. Those skilled in the art will understand that these nucleic acid sequencing data may contain the genetic information of immune cells. For example, according to embodiments of the present invention, the data may come from a blood sample containing immune cells or a tissue sample containing immune cells (the term "tissue sample" should be understood broadly here and may include at least part of an organ), such as the non-encapsulated diffuse lymphoid tissue and lymphoid nodules found beneath the mucosa of the intestinal, respiratory, and urogenital tracts.
According to embodiments of the present invention, the nucleic acid sequencing data can be obtained by high-throughput sequencing, for example on second- or third-generation sequencing platforms, including but not limited to high-throughput sequencing platforms such as MGISEQ-T7, MGISEQ-2000, MGISEQ-200, BGISEQ-500, BGISEQ-50, MGISP-960, and MGISP-100.
After obtaining the nucleic acid, those skilled in the art can perform sequencing according to the operating manual of the sequencing platform in order to obtain the nucleic acid sequencing data. Briefly, according to one embodiment of the present invention, the sequencing process includes:
DNA or RNA is extracted from the blood or tissue sample. For each sample, a starting amount of DNA or RNA is taken, primers for a given chain (TCR or BCR) are added, and multiplex PCR amplification is performed. Two rounds of PCR are carried out in total: the first round is a PCR reaction with VJ-specific primers (carrying part of the sequencing adapter), and the second round is a conventional library-construction PCR with the sequencing adapters. Multiple samples are then pooled and sequenced together to obtain the data for each sample. According to embodiments of the present invention, a tag sequence may also be introduced in the second round of PCR so that sample batches can be distinguished.
Referring to FIG. 2, according to a specific embodiment of the present invention, acquiring the nucleic acid sequencing data may further include:
S110 Obtaining a nucleic acid sample
In this step, a nucleic acid sample of the individual to be tested is obtained, the nucleic acid sample including at least one of DNA molecules and RNA molecules. Those skilled in the art can extract DNA or RNA molecules using a commercially available kit according to the manufacturer's instructions. Those skilled in the art will understand that, once RNA molecules have been obtained, cDNA molecules can readily be obtained by reverse transcription.
S120 First amplification
After the nucleic acid sample is obtained, a first amplification can be performed with VJ-specific primers to obtain a first amplification product.
It should be noted that the VJ-specific primers can be used to amplify the immune-cell-specific sequences, namely the V genes and J genes, contained in the nucleic acid sample obtained in step S110.
Herein, VJ-specific primers are specific primers that can amplify V genes and J genes. It is worth noting that, for most loci, V genes and J genes cluster into families according to their degree of homology. These VJ-specific primers can be used to analyze the combinatorial diversity of V-J rearrangements at at least one locus selected from TRA, TRB, TRG, TRD, IgH, IgK, IgL, and the like.
According to an embodiment of the present invention, the VJ-specific primers used in the present invention have the following nucleotide sequences:
Figure PCTCN2021117149-appb-000001
Figure PCTCN2021117149-appb-000002
Figure PCTCN2021117149-appb-000003
In addition, according to an embodiment of the present invention, the VJ-specific primers contain part of the sequence of the sequencing adapter. This makes it convenient to introduce the sequencing adapters into the amplification product in the subsequent second amplification.
S130 Second amplification
A second amplification is performed on the first amplification product to obtain a second amplification product, wherein the second amplification product carries the sequencing adapters.
The second amplification can be performed using the common sequence in the first amplification product, and the primers used can be designed to introduce the sequencing adapters. The resulting second amplification product thus constitutes a sequencing library that can be used for sequencing.
Of course, those skilled in the art will understand that, to improve sequencing efficiency or facilitate analysis, the second amplification product can also be subjected to other conventional processing, such as screening with hybridization probes. This is not described further here.
S140 Sequencing
The second amplification product is sequenced to obtain the sequencing result.
According to an embodiment of the present invention, after the sequencing library is constructed, the sequencing library (the second amplification product) can be sequenced on a sequencing platform. According to embodiments of the present invention, the nucleic acid sequencing data can be obtained by high-throughput sequencing, for example on second- or third-generation sequencing platforms, including but not limited to high-throughput sequencing platforms such as MGISEQ-T7, MGISEQ-2000, MGISEQ-200, BGISEQ-500, BGISEQ-50, MGISP-960, and MGISP-100. Paired-end sequencing is preferred, as it can improve the efficiency of subsequent analysis.
S200 Sequence alignment to determine V/J sequences and CDR sequences
After the sequencing data are obtained, according to an embodiment of the present invention, the V/J sequences and CDR sequences contained in the nucleic acid sample are determined by aligning the sequencing result with a reference sequence.
According to an embodiment of the present invention, before the alignment, software such as SOAPnuke (v1.5.3) can be used to filter adapter-contaminated sequences, low-quality bases, and low-quality sequences out of the raw sequencing data.
An in-house program is used to convert the FASTQ files into FASTA files for sequence assembly; if the sequencing mode is paired-end, COPE (v1.5.3) and an in-house program are then used to merge the read pairs. Next, blastall (v2.2.25) can be used to align the preprocessed FASTA sequences to the V(D)J reference gene sequences. An in-house program then performs a re-alignment and selects the best alignment result: the non-CDR3 and CDR3 regions are scored by different methods, the highest-scoring best hit is selected, and each sequenced read is assigned by alignment with the CDR, V, and J reference sequences, so that the CDR sequences and V/J sequences are determined.
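The FASTQ-to-FASTA conversion mentioned above is performed by an in-house program that is not disclosed here. As a purely illustrative stand-in, and not the program actually used, a minimal Python sketch could look like the following. It assumes well-formed four-line FASTQ records; the function names `fastq_records` and `fastq_to_fasta` and the example reads are hypothetical.

```python
from typing import Iterable, Iterator, Tuple


def fastq_records(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (read_id, sequence) pairs from 4-line FASTQ records."""
    it = iter(lines)
    for header in it:
        header = header.rstrip("\n")
        if not header:
            continue  # tolerate blank lines between records
        seq = next(it, None)
        plus = next(it, None)   # '+' separator line
        qual = next(it, None)   # quality string, not needed for FASTA
        if seq is None or plus is None or qual is None:
            raise ValueError("truncated FASTQ record")
        if not header.startswith("@"):
            raise ValueError(f"malformed FASTQ header: {header!r}")
        yield header[1:], seq.rstrip("\n")


def fastq_to_fasta(lines: Iterable[str]) -> str:
    """Return FASTA text (one '>' header and one sequence line per read)."""
    return "".join(f">{rid}\n{seq}\n" for rid, seq in fastq_records(lines))


# Hypothetical two-read example.
example = [
    "@read1\n", "ACGT\n", "+\n", "IIII\n",
    "@read2\n", "TTGA\n", "+\n", "IIII\n",
]
fasta_text = fastq_to_fasta(example)
```

The quality line is read and discarded because FASTA carries no quality information; quality-based filtering is assumed to have been done beforehand, as described for SOAPnuke above.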
After the V gene and J gene sequences are obtained, the structure of the immune molecules is analyzed. This part has two main functions: error correction and region determination. First, an in-house program corrects the errors introduced during PCR and sequencing. Second, the CDR regions are determined from the V/J gene reference sequences and the pattern of conserved amino acids, using an established computational method.
According to embodiments of the present invention, the CDR sequences can be determined by commonly used methods. According to an embodiment of the present invention, the CDR sequence is at least one of the CDR1, CDR2, and CDR3 sequences, preferably the CDR3 sequence, because CDR3 varies the most and directly determines the antigen-binding specificity of the TCR. The CDR3 of the TCR is encoded by the V, D, and J genes. During lymphocyte maturation, rearrangement of the V, D, and J genes produces a variety of recombined sequence fragments which, together with SNP and indel mutations of the DNA bases, give rise to T-cell diversity.
The term "V/J" as used herein refers to at least part of the result of the V(D)J rearrangement carried by a particular cell. It may be a V gene sequence, a J gene sequence, or a combination of a V gene sequence and a J gene sequence, possibly with a D gene sequence between the V gene sequence and the J gene sequence.
S300 Determining statistical features
Statistical features are determined based on the V/J sequences and CDR sequences contained in the nucleic acid sample, the statistical features including at least one selected from the following: the V/J gene usage diversity index, the immune cell diversity index, the number of immune cell types, and the immune cell homogeneity index.
According to an embodiment of the present invention, at least one of the V/J gene usage diversity index and the immune cell diversity index is a Shannon index. According to an embodiment of the present invention, the immune cell types are determined based on the CDR3 sequence.
According to an embodiment of the present invention, the immune cell homogeneity index is a Gini index.
According to an embodiment of the present invention, statistics are computed on the immune repertoire feature data. The statistical features mainly include the following:
V/J gene usage diversity, i.e., Shannon_index(V-J);
Immune diversity, i.e., Shannon_index(CDR3_aa);
Immune cell types, i.e., Uniq_number(CDR3_aa);
Immune cell homogeneity, i.e., Clone_Gini.
Among the above indicators, Shannon_index denotes the Shannon index, calculated as follows:
Figure PCTCN2021117149-appb-000004
where, taking CDR3 as an example, S denotes the total number of unique CDR3s and p(i) denotes the frequency of the i-th CDR3.
Uniq_number denotes the number of unique sequences.
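Since the formula image is not reproduced in this text, the following sketch assumes the standard Shannon index, H = −Σ p(i)·log p(i) summed over the S unique items, with the natural logarithm (the logarithm base is an assumption). It illustrates how Shannon_index(CDR3_aa), Shannon_index(V-J), and Uniq_number(CDR3_aa) could be computed from per-read annotations. The CDR3 strings and V-J pairs below are made-up example data, and the function names are hypothetical.

```python
import math
from collections import Counter
from typing import Hashable, Iterable


def shannon_index(items: Iterable[Hashable]) -> float:
    """Shannon index over the observed frequencies of `items` (natural log assumed)."""
    counts = Counter(items)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log(c / total) for c in counts.values())


def uniq_number(items: Iterable[Hashable]) -> int:
    """Number of distinct items, e.g. distinct CDR3 amino acid sequences."""
    return len(set(items))


# Made-up per-read annotations: one CDR3 amino acid sequence and one (V, J) pair per read.
cdr3_aa = ["CASSLGTDTQYF", "CASSLGTDTQYF", "CASSPGQGNEQFF", "CASRDRGYEQYF"]
vj_pairs = [
    ("TRBV5-1", "TRBJ2-3"),
    ("TRBV5-1", "TRBJ2-3"),
    ("TRBV7-9", "TRBJ2-1"),
    ("TRBV20-1", "TRBJ2-7"),
]

shannon_cdr3 = shannon_index(cdr3_aa)   # Shannon_index(CDR3_aa)
shannon_vj = shannon_index(vj_pairs)    # Shannon_index(V-J)
n_clonotypes = uniq_number(cdr3_aa)     # Uniq_number(CDR3_aa)
```

Here p(i) is estimated as the read count of item i divided by the total read count; whether reads or corrected clone counts are used is left open by the text.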
Clone_Gini denotes the Gini index, calculated as follows:
Figure PCTCN2021117149-appb-000005
where x denotes the frequency of each immune cell type and n denotes the number of immune cell types.
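The exact form of the Gini formula in the image above is not reproduced in this text. As an assumption, the sketch below uses one common formulation of the Gini coefficient over the sorted frequencies x(1) ≤ … ≤ x(n): G = 2·Σ i·x(i) / (n·Σ x(i)) − (n + 1)/n. It may differ from the formula actually used, and the function name `gini_index` is hypothetical.

```python
from typing import Sequence


def gini_index(freqs: Sequence[float]) -> float:
    """Gini coefficient of clone frequencies (0 = perfectly even, near 1 = one dominant clone)."""
    n = len(freqs)
    total = sum(freqs)
    if n == 0 or total == 0:
        return 0.0
    ordered = sorted(freqs)
    # Rank-weighted sum over the ascending-sorted frequencies (ranks start at 1).
    weighted = sum(rank * x for rank, x in enumerate(ordered, start=1))
    return 2.0 * weighted / (n * total) - (n + 1) / n


even_repertoire = gini_index([0.25, 0.25, 0.25, 0.25])
skewed_repertoire = gini_index([0.0, 0.0, 0.0, 1.0])
```

Under this formulation a perfectly even repertoire gives 0, and a repertoire dominated by a single clone approaches (n − 1)/n.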
S400 Determining the immune age value
In this step, the immune age value of the individual is determined based on the statistical features.
According to an embodiment of the present invention, the immune age value is determined based on at least one statistical feature using maximum a posteriori probability estimation.
According to an embodiment of the present invention, step S400 further includes: (4-1) using a predetermined distribution of immune age prediction coefficients (the prior distribution of the parameters is determined mainly from the nature of the selected features; a continuous feature is generally assumed to be normally distributed when the data volume is large), determining, for each statistical feature, the corresponding immune age prediction coefficient; and (4-2) determining the immune age of the individual according to the formula
Figure PCTCN2021117149-appb-000006
where IA denotes the immune age of the individual, i denotes the index of a statistical feature, n denotes the number of statistical features, θi denotes the immune age prediction coefficient corresponding to the i-th statistical feature, xi denotes the value of the i-th statistical feature, and θ0 denotes the bias term of the prediction model.
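The symbol definitions above (a bias term θ0 plus one coefficient θi per feature value xi) are consistent with a linear predictor of the form IA = θ0 + Σ θi·xi for i = 1..n. Because the formula image itself is not reproduced in this text, this form is an assumption. A minimal sketch under that assumption follows; the coefficient and feature values are invented for illustration only and are not fitted values.

```python
from typing import Sequence


def immune_age(theta0: float, thetas: Sequence[float], features: Sequence[float]) -> float:
    """Linear predictor: bias term plus the coefficient-weighted sum of feature values."""
    if len(thetas) != len(features):
        raise ValueError("need exactly one coefficient per statistical feature")
    return theta0 + sum(t * x for t, x in zip(thetas, features))


# Invented illustration: two features with made-up coefficients.
example_ia = immune_age(1.0, [2.0, 0.5], [3.0, 4.0])
```

In practice the feature vector would hold the statistical features from S300 (and, where used, biochemical indicators), in the same order as the coefficients.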
For ease of understanding, the principle of maximum a posteriori probability estimation is explained below:
According to an embodiment of the present invention, based on the above feature indices and in combination with biochemical indicators, a MAP (maximum a posteriori probability estimate) model is used to calculate IA, so as to perform a comprehensive immunity assessment and predict risk to the body. The specific principle is as follows:
The theoretical basis of MAP is the Bayesian model. Bayes' formula is as follows:
Figure PCTCN2021117149-appb-000007
Expanding event B by the law of total probability gives the following formula:
Figure PCTCN2021117149-appb-000008
where ~A denotes "not A".
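The two formula images above are not reproduced in this text. For reference, the standard statements of Bayes' formula and of its expansion by the law of total probability, written with the same symbols (A, B, ~A), are given below; these are the textbook forms and are assumed, not confirmed, to match the images.

```latex
% Bayes' formula
P(A \mid B) = \frac{P(B \mid A)\, P(A)}{P(B)}

% Denominator expanded over A and ~A by the law of total probability
P(A \mid B) = \frac{P(B \mid A)\, P(A)}{P(B \mid A)\, P(A) + P(B \mid {\sim}A)\, P({\sim}A)}
```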
The biochemical indicators mainly include routine indicators, such as a comprehensive biochemistry panel and routine blood tests.
The principle of MAP is as follows:
Maximum a posteriori estimation predicts the value of the parameter θ given the observed indicator x. Let f be the sampling distribution of x, so that f(x|θ) is the probability of observing x given the parameter θ. Let g be the prior distribution of the parameter θ (which can be obtained from training data). Then, by Bayes' formula:
Figure PCTCN2021117149-appb-000009
Figure PCTCN2021117149-appb-000009
其中,训练数据主要依据选取特征的特性确定参数先验分布,如果选取的特征是连续,在大数据量的情况下,一般认为是正态分布,如果是离散的,直接按照下面的公式加权累乘即可。选取的训练集成员主要包括免疫组库分析得到的一些指标(V/J基因使用多样性,免疫多样性,免疫细胞种类,免疫细胞均一性)以及一些生化指标(大生化,血常规等)。Among them, the training data is mainly based on the characteristics of the selected features to determine the prior distribution of the parameters. If the selected features are continuous, in the case of a large amount of data, it is generally considered to be a normal distribution. If it is discrete, it is directly weighted according to the formula below. Just multiply. The selected members of the training set mainly include some indicators (V/J gene usage diversity, immune diversity, immune cell type, immune cell homogeneity) obtained from immune repertoire analysis and some biochemical indicators (large biochemical, blood routine, etc.).
where
Figure PCTCN2021117149-appb-000010
is the parameter space of θ. Since the parameter space
Figure PCTCN2021117149-appb-000011
is continuous, the denominator is computed as an integral, so:
Figure PCTCN2021117149-appb-000012
where
Figure PCTCN2021117149-appb-000013
is the parameter that maximizes the function f(x|θ)g(θ), that is, the coefficient used to predict Immune Age (IA). If the observation is n-dimensional (i.e., x = (x1, x2, …, xn)), then
Figure PCTCN2021117149-appb-000014
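The derivation above can be written out in standard notation. The equation figures are not reproduced in this text, so what follows is the textbook form of MAP estimation built from the definitions of f and g given above; it may differ in notation from the original figures. The symbol Θ for the parameter space, the dummy variable ϑ, and the factorization in the last line (which assumes the components of x are conditionally independent given θ) are assumptions.

```latex
\begin{aligned}
f(\theta \mid x) &= \frac{f(x \mid \theta)\, g(\theta)}
                        {\int_{\Theta} f(x \mid \vartheta)\, g(\vartheta)\, d\vartheta} \\[4pt]
\hat{\theta}_{\mathrm{MAP}}(x)
  &= \arg\max_{\theta}\;
     \frac{f(x \mid \theta)\, g(\theta)}
          {\int_{\Theta} f(x \mid \vartheta)\, g(\vartheta)\, d\vartheta}
   = \arg\max_{\theta}\; f(x \mid \theta)\, g(\theta) \\[4pt]
\hat{\theta}_{\mathrm{MAP}}(x_1, \dots, x_n)
  &= \arg\max_{\theta}\; \prod_{i=1}^{n} f(x_i \mid \theta)\, g(\theta)
\end{aligned}
```

The denominator can be dropped in the second line because it does not depend on θ, which is why only the product f(x|θ)g(θ) needs to be maximized.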
The prediction formula for IA is as follows:
Figure PCTCN2021117149-appb-000015
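The symbol definitions given elsewhere in the description (a bias term θ0, one coefficient θi per statistical feature, and feature values xi for i = 1 … n) suggest a linear predictor of the form IA = θ0 + Σ θi·xi. The figure itself is not reproduced here, so the sketch below assumes that linear form; the function and parameter names are illustrative.

```python
from typing import Sequence


def predict_immune_age(
    features: Sequence[float],
    coefficients: Sequence[float],
    bias: float,
) -> float:
    """Assumed linear predictor: IA = theta_0 + sum_i theta_i * x_i."""
    if len(features) != len(coefficients):
        raise ValueError("features and coefficients must have the same length")
    return bias + sum(theta * x for theta, x in zip(coefficients, features))
```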
S500: Determine the immunity index
In this step, the immunity index of the individual is determined based on the immune age value.
According to an embodiment of the present invention, the immunity index is determined by the following formula:
Figure PCTCN2021117149-appb-000016
where IA represents the immune age value determined in step S400, IAmax represents the upper limit of IA in a predetermined population, and IAmin represents the lower limit of IA in the predetermined population.
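The formula itself sits in a figure that is not reproduced in this text, so its exact form is unknown. Purely to illustrate how an immune age can be placed within population bounds, the sketch below assumes a min-max normalization, inverted so that a higher immune age gives a lower index. The inversion, the 0 to 100 scale, and the clipping to the population range are all assumptions and may not match the patented formula.

```python
def immunity_index(
    ia: float,
    ia_min: float,
    ia_max: float,
    scale: float = 100.0,
) -> float:
    """Hypothetical index: scale * (IAmax - IA) / (IAmax - IAmin).

    IA is clipped to [ia_min, ia_max] first (an assumption), so the
    result always lies between 0 and `scale`.
    """
    if ia_max <= ia_min:
        raise ValueError("ia_max must be greater than ia_min")
    clipped = min(max(ia, ia_min), ia_max)
    return scale * (ia_max - clipped) / (ia_max - ia_min)
```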
Once the individual's immunity index has been determined, this technical solution enables high-sensitivity detection of the individual's adaptive immune system at the molecular level, as well as non-invasive early diagnosis, efficacy evaluation, disease tracking, recurrence prediction, and comprehensive immunity assessment.
According to embodiments of the present invention, sequencing allows the method of the present invention to be carried out with a small amount of sample, so as to achieve high-sensitivity detection of the individual's adaptive immune system at the molecular level, together with non-invasive early diagnosis, efficacy evaluation, disease tracking, recurrence prediction, and comprehensive immunity assessment. For example, according to an embodiment of the present invention, PCR can be used to amplify genes contained in lymphocytes in peripheral blood. This requires only a small blood sample, makes subsequent sample processing simple, and requires neither inaccurate manual blood cell observation nor complicated immunolabeling and flow cytometry analysis. For myeloma testing, only peripheral blood needs to be drawn and no bone marrow aspiration is required, which reduces harm to the patient. In summary, according to embodiments of the present invention, immune assessment by immune repertoire sequencing not only improves detection sensitivity but also enables early diagnosis, efficacy evaluation, disease tracking, recurrence prediction, and comprehensive immunity assessment.
In a second aspect, the present invention provides a device for determining an individual's immunity index. According to an embodiment of the present invention, referring to FIG. 3, the device includes:
a sequencing data acquisition unit 100, a sequencing result analysis unit 200, a statistics unit 300, an immune age determination unit 400, and an immunity index determination unit 500. The sequencing data acquisition unit 100 is used to acquire nucleic acid sequencing data of the individual to be tested. The sequencing result analysis unit 200 is used to determine the V/J sequences and CDR sequences contained in the nucleic acid sample by aligning the sequencing result with a reference sequence. The statistics unit 300 is used to determine statistical features based on the V/J sequences and CDR3 sequences contained in the nucleic acid sample, the statistical features including at least one selected from the following: a V/J gene usage diversity index, an immune cell diversity index, the number of immune cell types, and an immune cell homogeneity index. The immune age determination unit 400 is used to determine the immune age value of the individual based on the statistical features. The immunity index determination unit 500 is used to determine the immunity index of the individual based on the immune age value.
With the device according to an embodiment of the present invention, the method for determining an individual's immunity described above can be carried out effectively. The features and advantages described above therefore also apply to the device and are not repeated here.
According to an embodiment of the present invention, referring to FIG. 4, the sequencing data acquisition unit further includes: a nucleic acid sample acquisition module 110, a first amplification module 120, a second amplification module 130, and a sequencing module 140. According to an embodiment of the present invention, the nucleic acid sample acquisition module 110 is used to acquire a nucleic acid sample of the individual to be tested, the nucleic acid sample including at least one of DNA molecules and RNA molecules; the first amplification module 120 is used to perform a first amplification using VJ-specific primers, so as to obtain a first amplification product; the second amplification module 130 is used to perform a second amplification on the first amplification product, so as to obtain a second amplification product, wherein the second amplification product carries a sequencing adapter; the sequencing module 140 is used to sequence the second amplification product, so as to obtain a sequencing result; and the nucleic acid sample is obtained from a blood or tissue sample of the individual.
According to an embodiment of the present invention, the VJ-specific primers contain part of the sequence of the sequencing adapter.
According to an embodiment of the present invention, the CDR sequence is at least one of the CDR1, CDR2, and CDR3 sequences, preferably the CDR3 sequence.
According to an embodiment of the present invention, at least one of the V/J gene usage diversity index and the immune cell diversity index is a Shannon index.
According to an embodiment of the present invention, the immune cell types are determined based on the CDR3 sequence.
According to an embodiment of the present invention, the immune cell homogeneity index is a Gini index.
According to an embodiment of the present invention, the immune age determination unit is adapted to determine the immune age value based on at least one statistical feature, using maximum a posteriori estimation.
According to an embodiment of the present invention, the immune age determination unit is used to: determine, using a predetermined distribution of immune age prediction coefficients and for each of the statistical features, the immune age prediction coefficient corresponding to that statistical feature; and determine the immune age of the individual according to the formula
Figure PCTCN2021117149-appb-000017
where IA represents the immune age of the individual, i represents the index of the statistical feature, n represents the number of statistical features, θi represents the immune age prediction coefficient corresponding to the i-th statistical feature, xi represents the value of the i-th statistical feature, and θ0 represents the bias term in the prediction model.
According to an embodiment of the present invention, the immunity index is determined by the following formula:
Figure PCTCN2021117149-appb-000018
where IA represents the immune age value determined in the immune age determination unit, IAmax represents the upper limit of IA in a predetermined population, and IAmin represents the lower limit of IA in the predetermined population.
In a third aspect, the present invention provides an electronic device which, according to an embodiment of the present invention, includes a processor and a memory. The memory stores machine-executable instructions that can be executed by the processor, and the processor executes the machine-executable instructions to implement the method for determining an individual's immunity index described above.
In a fourth aspect, the present invention provides a machine-readable storage medium. According to an embodiment of the present invention, the machine-readable storage medium stores machine-executable instructions which, when invoked and executed by a processor, cause the processor to implement any of the methods for determining an individual's immunity index described above.
Example 1:
1. Sequencing data acquisition
Peripheral blood (5 mL) was collected from each of 1000 volunteers. DNA was extracted from the peripheral blood samples using a DNA extraction kit, and the DNA samples were amplified using V gene-specific and J gene-specific primers carrying partial sequencing adapters, so as to obtain V gene and J gene samples carrying partial sequencing adapters.
The amplified samples were then further amplified using primers carrying sequencing adapters to construct libraries, and the sequencing libraries were subjected to high-throughput sequencing.
2. Sequencing data analysis
After the data came off the sequencer, the sequencing data were analyzed as follows:
(1) SOAPnuke (v1.5.3) was used to filter the raw sequencing data for adapter contamination, low-quality bases, and low-quality reads. Reads were filtered on two metrics, the average base quality of the read and its N-base content: a read was discarded if its base quality value was less than or equal to 20, if its number of N bases was greater than or equal to 5, or both;
(2) the FASTQ files were converted to FASTA files;
(3) blastall (v2.2.25) was used to align the preprocessed FASTA sequences to the V(D)J reference gene sequences, followed by realignment to select the best alignment result;
(4) the aligned sequence data were subjected to structural analysis (error correction and region determination): a BGI structural analysis program was used to correct errors introduced during PCR and sequencing, and the CDR3 region was then determined using the V/J gene reference sequences, the pattern of conserved amino acids, and an established computational method.
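Steps (1) and (2) above can be sketched in a few lines. This is not SOAPnuke's implementation; it only illustrates the two stated discard rules (mean base quality less than or equal to 20, or at least 5 N bases) followed by FASTQ-to-FASTA conversion. The Phred+33 quality encoding, the four-line FASTQ layout, and all function names are assumptions.

```python
from typing import Iterable, Iterator, Tuple

Record = Tuple[str, str, str]  # (read id, sequence, quality string)


def mean_phred(quality: str, offset: int = 33) -> float:
    """Mean base quality, assuming Phred+33 encoding."""
    return sum(ord(ch) - offset for ch in quality) / len(quality)


def keep_read(seq: str, qual: str, min_mean_q: float = 20.0, max_n: int = 5) -> bool:
    """Apply the two discard rules described in step (1)."""
    if not seq or len(seq) != len(qual):
        return False  # malformed or truncated record
    if mean_phred(qual) <= min_mean_q:
        return False  # mean quality less than or equal to 20
    if seq.upper().count("N") >= max_n:
        return False  # 5 or more N bases
    return True


def parse_fastq(lines: Iterable[str]) -> Iterator[Record]:
    """Yield records from four-line FASTQ entries."""
    it = iter(lines)
    for header in it:
        header = header.strip()
        if not header:
            continue
        seq = next(it, "").strip()
        next(it, "")  # '+' separator line
        qual = next(it, "").strip()
        yield header[1:], seq, qual


def fastq_to_fasta(lines: Iterable[str]) -> str:
    """Return the reads that pass filtering, formatted as FASTA."""
    return "\n".join(
        f">{read_id}\n{seq}"
        for read_id, seq, qual in parse_fastq(lines)
        if keep_read(seq, qual)
    )
```

In practice the thresholds would be set through the filtering tool's own options; the defaults here simply mirror the values quoted in step (1).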
3. Indicator statistics and prediction
The immune repertoire feature data were summarized, and immunity was predicted and analyzed using an in-house model.
The statistical features mainly include the following:
V/J gene usage diversity, i.e., Shannon_index(V-J);
immune diversity, i.e., Shannon_index(CDR3_aa);
number of immune cell types, i.e., Uniq_number(CDR3_aa);
immune cell homogeneity, i.e., Clone_Gini.
Among the above indicators, Shannon_index denotes the Shannon index, calculated as follows:
Figure PCTCN2021117149-appb-000019
where, taking CDR3 as an example, S is the total number of unique CDR3 sequences and p(i) is the frequency of the i-th CDR3 sequence.
Uniq_number denotes the number of unique sequences.
Clone_Gini denotes the Gini index, calculated as follows:
Figure PCTCN2021117149-appb-000020
where x is the frequency of each immune cell type and n is the number of immune cell types.
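The three kinds of statistic described above can be sketched as follows. The formula figures are not reproduced in this text, so the sketch assumes the standard definitions: a Shannon index computed as the negative sum of p(i)·ln p(i) over the S unique sequences (the natural logarithm is an assumption), a count of unique sequences, and the usual Gini coefficient over the sorted frequencies. Function names are illustrative.

```python
import math
from collections import Counter
from typing import Dict, Iterable, Sequence


def clonotype_frequencies(cdr3_sequences: Sequence[str]) -> Dict[str, float]:
    """Frequency of each unique CDR3 sequence; len(result) is Uniq_number."""
    counts = Counter(cdr3_sequences)
    total = sum(counts.values())
    return {seq: count / total for seq, count in counts.items()}


def shannon_index(freqs: Iterable[float]) -> float:
    """Shannon index H = -sum_i p(i) * ln p(i) (natural log assumed)."""
    return -sum(p * math.log(p) for p in freqs if p > 0)


def gini_index(freqs: Iterable[float]) -> float:
    """Standard Gini coefficient: 0 when all frequencies are equal."""
    xs = sorted(freqs)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    weighted = sum((rank + 1) * x for rank, x in enumerate(xs))
    return 2.0 * weighted / (n * total) - (n + 1) / n
```

For the toy repertoire `["A", "A", "B", "C"]` the frequencies are 0.5, 0.25 and 0.25, which gives three unique sequences, a Shannon index of 1.5·ln 2, and a Gini index of 1/6.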
Based on the above feature indices, combined with routine blood and biochemical indicators, the MAP (maximum a posteriori probability estimate) model was used to calculate IA, so as to perform a comprehensive immunity assessment and a risk prediction for the body.
Figure PCTCN2021117149-appb-000021
where
Figure PCTCN2021117149-appb-000022
is the parameter that maximizes the function f(x|θ)g(θ), that is, the coefficient used to predict Immune Age (IA). If the observation is n-dimensional (i.e., x = (x1, x2, …, xn)), then
Figure PCTCN2021117149-appb-000023
The prediction formula for IA is as follows:
Figure PCTCN2021117149-appb-000024
Determining immunity:
Based on the predicted IA, combined with the population distribution characteristics, the individual's Immune Index (II) was finally determined. The specific model is as follows:
Figure PCTCN2021117149-appb-000025
where IA represents the immune age of the predicted sample, and IAmax and IAmin represent the upper and lower limits of the population distribution, respectively.
Example 2:
1. Sequencing data acquisition
Peripheral blood (5 mL) was collected from each of 439 volunteers. DNA was extracted from the peripheral blood samples using a DNA extraction kit, and the DNA samples were amplified using V gene-specific and J gene-specific primers carrying partial sequencing adapters, so as to obtain V gene and J gene samples carrying partial sequencing adapters.
The amplified samples were then further amplified using primers carrying sequencing adapters to construct libraries, and the sequencing libraries were subjected to high-throughput sequencing.
2. Sequencing data analysis
After the data came off the sequencer, the sequencing data were analyzed as follows:
(1) SOAPnuke (v1.5.3) was used to filter the raw sequencing data for adapter contamination, low-quality bases, and low-quality reads. Reads were filtered on two metrics, the average base quality of the read and its N-base content: a read was discarded if its base quality value was less than or equal to 20, if its number of N bases was greater than or equal to 5, or both;
(2) the FASTQ files were converted to FASTA files;
(3) blastall (v2.2.25) was used to align the preprocessed FASTA sequences to the V(D)J reference gene sequences, followed by realignment to select the best alignment result;
(4) the aligned sequence data were subjected to structural analysis (error correction and region determination): a BGI structural analysis program was used to correct errors introduced during PCR and sequencing, and the CDR3 region was then determined using the V/J gene reference sequences, the pattern of conserved amino acids, and an established computational method.
3. Indicator statistics
The immune repertoire feature data were summarized. The statistical features mainly include the following three:
immune diversity, i.e., Shannon_index(CDR3_aa);
number of immune cell types, i.e., Uniq_number(CDR3_aa);
sequence diversity, i.e., Uniq_number(seq_aa).
Among the above indicators, Shannon_index denotes the Shannon index, calculated as follows:
Figure PCTCN2021117149-appb-000026
where, taking CDR3 as an example, S is the total number of unique CDR3 sequences and p(i) is the frequency of the i-th CDR3 sequence.
Uniq_number denotes the number of unique sequences.
4. Preprocessing
Three samples containing missing values were removed.
5. Model training
The remaining 436 samples were divided into three groups by age (20-30 years, 30-50 years, and >50 years). From each group, 75% of the samples were randomly selected and merged into the training set, and the remaining 111 samples were used as the test set.
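The grouped 75/25 split described above is a stratified random split. A minimal sketch follows; the handling of the shared boundaries (whether an age of exactly 30 or 50 falls in the lower or upper group), the rounding of each group's training share, and the fixed random seed are assumptions, since the source does not state them.

```python
import random
from typing import Dict, List, Sequence, Tuple


def age_group(age: float) -> str:
    """Assign an age to one of the three groups (boundary handling assumed)."""
    if age <= 30:
        return "20-30"
    if age <= 50:
        return "30-50"
    return ">50"


def stratified_split(
    ages: Sequence[float],
    train_fraction: float = 0.75,
    seed: int = 0,
) -> Tuple[List[int], List[int]]:
    """Return (train_indices, test_indices), sampling within each age group."""
    rng = random.Random(seed)
    groups: Dict[str, List[int]] = {}
    for index, age in enumerate(ages):
        groups.setdefault(age_group(age), []).append(index)

    train: List[int] = []
    test: List[int] = []
    for members in groups.values():
        shuffled = members[:]
        rng.shuffle(shuffled)
        cut = round(len(shuffled) * train_fraction)
        train.extend(shuffled[:cut])
        test.extend(shuffled[cut:])
    return sorted(train), sorted(test)
```

Sampling within each group keeps the age composition of the training and test sets similar, which matters here because the oldest group is small.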
Using the training set and the three immune repertoire feature indices above, the MAP (maximum a posteriori probability estimate) model was used to calculate IA, so as to perform a comprehensive immunity assessment and a risk prediction for the body. The model parameters were trained as follows:
Figure PCTCN2021117149-appb-000027
where
Figure PCTCN2021117149-appb-000028
is the parameter that maximizes the function f(x|θ)g(θ), that is, the coefficient used to predict Immune Age (IA). Here the observation is 3-dimensional (i.e., x = (x1, x2, x3)), so
Figure PCTCN2021117149-appb-000029
Based on the trained parameters, the prediction formula for IA is obtained:
Figure PCTCN2021117149-appb-000030
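The source does not spell out the likelihood, the prior, or the training target, so the following is only one concrete instance of MAP parameter fitting and not the patented procedure. It assumes a linear model with Gaussian noise, a zero-mean Gaussian prior on each feature coefficient (no prior on the bias), and chronological age as the training target. Under those assumptions the MAP estimate has a closed form, the regularized normal equations, which are solved here by Gaussian elimination. All names, including `prior_precision`, are hypothetical.

```python
from typing import List, Sequence


def solve_linear_system(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_map_linear(
    features: Sequence[Sequence[float]],
    ages: Sequence[float],
    prior_precision: float = 1.0,
) -> List[float]:
    """Return [theta_0, theta_1, ..., theta_n] maximizing likelihood * prior.

    With Gaussian noise and a zero-mean Gaussian prior, maximizing
    f(x|theta) * g(theta) reduces to solving
    (X^T X + lambda * I') theta = X^T y, where I' leaves the bias unpenalized.
    """
    rows = [[1.0, *map(float, f)] for f in features]  # prepend bias column
    d = len(rows[0])
    gram = [[sum(r[i] * r[j] for r in rows) for j in range(d)] for i in range(d)]
    rhs = [sum(r[i] * y for r, y in zip(rows, ages)) for i in range(d)]
    for i in range(1, d):
        gram[i][i] += prior_precision  # contribution of the Gaussian prior
    return solve_linear_system(gram, rhs)
```

Setting `prior_precision` to zero removes the prior and gives the ordinary least-squares fit; larger values pull the feature coefficients toward zero, which is the practical effect of a tighter prior.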
Based on the predicted IA, combined with the population distribution characteristics, the individual's Immune Index (II) was finally determined. The specific formula is as follows:
Figure PCTCN2021117149-appb-000031
where IA represents the immune age of the predicted sample, and IAmax and IAmin represent the upper and lower limits of the population distribution, respectively.
6. II prediction results
As can be seen from Figures 5 and 6, the immunity index tends to decrease with increasing age. Because the number of samples in the over-50 age group is small, the downward trend of the immunity index in Figure 6 is not very pronounced, but the downward trend shown in Figure 5 is clear. The results of this example therefore indicate that the immunity index can serve as an indicator for evaluating health.
Although embodiments of the present invention have been shown and described above, it should be understood that the above embodiments are exemplary and are not to be construed as limiting the present invention; those of ordinary skill in the art may make changes, modifications, substitutions, and variations to the above embodiments within the scope of the present invention.

Claims (23)

  1. A method for determining an individual's immunity index, characterized in that it comprises:
    (1) acquiring nucleic acid sequencing data of the individual to be tested;
    (2) determining the V/J sequences and CDR sequences contained in the nucleic acid sample by aligning the sequencing result with a reference sequence;
    (3) determining statistical features based on the V/J sequences and CDR sequences contained in the nucleic acid sample, the statistical features including at least one selected from the following: a V/J gene usage diversity index, an immune cell diversity index, the number of immune cell types, and an immune cell homogeneity index;
    (4) determining the immune age value of the individual based on the statistical features; and
    (5) determining the immunity index of the individual based on the immune age value.
  2. The method according to claim 1, characterized in that the sequencing data is obtained by the following steps:
    (1-1) acquiring a nucleic acid sample of the individual to be tested, the nucleic acid sample including at least one of DNA molecules and RNA molecules;
    (1-2) performing a first amplification using VJ-specific primers, so as to obtain a first amplification product;
    (1-3) performing a second amplification on the first amplification product, so as to obtain a second amplification product, wherein the second amplification product carries a sequencing adapter;
    (1-4) sequencing the second amplification product, so as to obtain a sequencing result;
    wherein the nucleic acid sample is obtained from a blood or tissue sample of the individual.
  3. The method according to claim 1, characterized in that the VJ-specific primers contain part of the sequence of the sequencing adapter.
  4. The method according to claim 1, characterized in that the CDR sequence is at least one of the CDR1, CDR2, and CDR3 sequences.
  5. The method according to claim 4, characterized in that the CDR sequence is the CDR3 sequence.
  6. The method according to claim 1, characterized in that at least one of the V/J gene usage diversity index and the immune cell diversity index is a Shannon index.
  7. The method according to claim 1, characterized in that the immune cell types are determined based on the CDR3 sequence.
  8. The method according to claim 1, characterized in that the immune cell homogeneity index is a Gini index.
  9. The method according to claim 1, characterized in that the immune age value is determined based on at least one of the statistical features, using maximum a posteriori estimation.
  10. The method according to claim 1, characterized in that step (4) further comprises:
    (4-1) determining, using a predetermined distribution of immune age prediction coefficients and for each of the statistical features, the immune age prediction coefficient corresponding to that statistical feature; and
    (4-2) determining the immune age of the individual according to the formula
    Figure PCTCN2021117149-appb-100001
    where IA represents the immune age of the individual, i represents the index of the statistical feature, n represents the number of the statistical features, θi represents the immune age prediction coefficient corresponding to the i-th statistical feature, xi represents the value of the i-th statistical feature, and θ0 represents the bias term in the prediction model.
  11. The method according to claim 10, characterized in that the immunity index is determined by the following formula:
    Figure PCTCN2021117149-appb-100002
    where IA represents the immune age value determined in step (4), IAmax represents the upper limit of IA in a predetermined population, and IAmin represents the lower limit of IA in the predetermined population.
  12. A device for determining an individual's immunity index, characterized in that it comprises:
    a sequencing data acquisition unit for acquiring nucleic acid sequencing data of the individual to be tested;
    a sequencing result analysis unit for determining the V/J sequences and CDR sequences contained in the nucleic acid sample by aligning the sequencing result with a reference sequence;
    a statistics unit for determining statistical features based on the V/J sequences and CDR sequences contained in the nucleic acid sample, the statistical features including at least one selected from the following: a V/J gene usage diversity index, an immune cell diversity index, the number of immune cell types, and an immune cell homogeneity index;
    an immune age determination unit for determining the immune age value of the individual based on the statistical features; and
    an immunity index determination unit for determining the immunity index of the individual based on the immune age value.
  13. The device according to claim 12, characterized in that the sequencing data acquisition unit further comprises:
    a nucleic acid sample acquisition module for acquiring a nucleic acid sample of the individual to be tested, the nucleic acid sample including at least one of DNA molecules and RNA molecules;
    a first amplification module for performing a first amplification using VJ-specific primers, so as to obtain a first amplification product;
    a second amplification module for performing a second amplification on the first amplification product, so as to obtain a second amplification product, wherein the second amplification product carries a sequencing adapter;
    a sequencing module for sequencing the second amplification product, so as to obtain a sequencing result;
    wherein the nucleic acid sample is obtained from a blood or tissue sample of the individual.
  14. The device according to claim 12, characterized in that the VJ-specific primers contain part of the sequence of the sequencing adapter.
  15. The device according to claim 12, characterized in that the CDR sequence is at least one of the CDR1, CDR2, and CDR3 sequences, preferably the CDR3 sequence.
  16. The device according to claim 12, characterized in that at least one of the V/J gene usage diversity index and the immune cell diversity index is a Shannon index.
  17. The device according to claim 12, characterized in that the immune cell types are determined based on the CDR3 sequence.
  18. The device according to claim 12, characterized in that the immune cell homogeneity index is a Gini index.
  19. The device according to claim 12, characterized in that the immune age determination unit is adapted to determine the immune age value based on at least one of the statistical features, using maximum a posteriori estimation.
  20. The device according to claim 12, wherein the immune age determination unit is configured to:
    determine, using a predetermined distribution of immune age prediction coefficients and based on each of the statistical features, the immune age prediction coefficient corresponding to each of the statistical features; and
    determine the immune age of the individual according to the formula
    Figure PCTCN2021117149-appb-100003
    wherein IA represents the immune age of the individual, i represents the index of the statistical feature, n represents the number of the statistical features, θi represents the immune age prediction coefficient corresponding to the i-th statistical feature, xi represents the value of the i-th statistical feature, and θ0 represents the bias term in the prediction model.
  21. The device according to claim 20, wherein the immunity index is determined by the following formula:
    Figure PCTCN2021117149-appb-100004
    wherein IA represents the immune age value determined in the immune age determination unit, IAmax represents the upper limit of IA in a predetermined population, and IAmin represents the lower limit of IA in the predetermined population.
  22. An electronic device, comprising a processor and a memory, wherein the memory stores machine-executable instructions executable by the processor, and the processor executes the machine-executable instructions to implement the method for determining an individual immunity index according to any one of claims 1-11.
  23. A machine-readable storage medium, wherein the machine-readable storage medium stores machine-executable instructions which, when invoked and executed by a processor, cause the processor to implement the method for determining an individual immunity index according to any one of claims 1-11.
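The quantities named in the device claims can be illustrated with a minimal Python sketch. It is not the applicant's implementation. The two formulas appear in the claims only as figure references, so both forms used here are assumptions: the immune age is taken to be a linear combination of the statistical features plus the bias term (consistent with the symbol definitions in claim 20), and the immunity index is taken to be a min-max scaling of IA between IAmin and IAmax, with a lower immune age giving a higher score on a 0 to 100 scale. All function names, the CDR3 strings and the numeric values are hypothetical.

```python
import math
from collections import Counter
from typing import Iterable, Sequence


def shannon_index(counts: Iterable[int]) -> float:
    """Shannon index H = sum(p * ln(1/p)) over non-zero category counts."""
    positive = [c for c in counts if c > 0]
    total = sum(positive)
    if total == 0:
        return 0.0
    return sum((c / total) * math.log(total / c) for c in positive)


def gini_index(counts: Iterable[int]) -> float:
    """Gini coefficient of abundances: 0 means perfectly even clonotypes."""
    values = sorted(counts)
    n = len(values)
    total = sum(values)
    if n == 0 or total == 0:
        return 0.0
    # Rank-weighted sum over ascending values (ranks start at 1).
    weighted = sum(rank * v for rank, v in enumerate(values, start=1))
    return 2.0 * weighted / (n * total) - (n + 1) / n


def immune_age(features: Sequence[float],
               coefficients: Sequence[float],
               bias: float) -> float:
    """ASSUMED linear form: IA = bias + sum(coefficient_i * feature_i)."""
    if len(features) != len(coefficients):
        raise ValueError("features and coefficients must have equal length")
    return bias + sum(t * x for t, x in zip(coefficients, features))


def immunity_index(ia: float, ia_min: float, ia_max: float) -> float:
    """ASSUMED min-max scaling: 100 at ia_min, 0 at ia_max, clamped."""
    if ia_max <= ia_min:
        raise ValueError("ia_max must be greater than ia_min")
    clamped = min(max(ia, ia_min), ia_max)
    return 100.0 * (ia_max - clamped) / (ia_max - ia_min)


if __name__ == "__main__":
    # Hypothetical CDR3 reads; each distinct sequence counts as one clonotype.
    cdr3_reads = ["CASSLGQETQYF", "CASSLGQETQYF", "CASSPDRGYEQYF", "CASRTGESYEQYF"]
    clonotype_counts = list(Counter(cdr3_reads).values())
    features = [shannon_index(clonotype_counts), gini_index(clonotype_counts)]
    # Hypothetical coefficients and bias; in the claims these come from a
    # predetermined coefficient distribution (MAP estimation).
    ia = immune_age(features, coefficients=[-8.0, 15.0], bias=45.0)
    print(immunity_index(ia, ia_min=20.0, ia_max=80.0))
```

In this sketch the coefficients are passed in as plain numbers. Estimating them from a reference population, for example by maximum a posteriori estimation as in claim 19, is outside its scope.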
PCT/CN2021/117149 2021-03-30 2021-09-08 Method and device for determining immunity index of individual, electronic device, and machine-readable storage medium WO2022205775A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202180065823.0A CN116391237A (en) 2021-03-30 2021-09-08 Method, device, electronic device and machine readable storage medium for determining an individual immunity index

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110342463.6 2021-03-30
CN202110342463 2021-03-30

Publications (1)

Publication Number Publication Date
WO2022205775A1 (en) 2022-10-06

Family

ID=83455556

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/117149 WO2022205775A1 (en) 2021-03-30 2021-09-08 Method and device for determining immunity index of individual, electronic device, and machine-readable storage medium

Country Status (2)

Country Link
CN (1) CN116391237A (en)
WO (1) WO2022205775A1 (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100062473A1 (en) * 2006-06-15 2010-03-11 Katsuiku Hirokawa Immunity evaluation method, immunity evaluation apparatus, immunity evaluation program and data recording medium having the immunity evaluation program stored therein
US20140235478A1 (en) * 2013-02-04 2014-08-21 The Board Of Trustees Of The Leland Stanford Junior University Measurement and Comparison of Immune Diversity by High-Throughput Sequencing
US20180356403A1 (en) * 2017-06-09 2018-12-13 The Regents Of The University Of California Use of Immune Repertoire Diversity For Predicting Transplant Rejection
WO2019215740A1 (en) * 2018-05-07 2019-11-14 Technion Research & Development Foundation Limited Immune age and use thereof
WO2020178816A1 (en) * 2019-03-04 2020-09-10 The National Institute for Biotechnology in the Negev Ltd. Kits, compositions and methods for evaluating immune system status
CN112331344A (en) * 2020-11-12 2021-02-05 深圳泛因医学有限公司 Immune state evaluation method and application

Also Published As

Publication number Publication date
CN116391237A (en) 2023-07-04

Legal Events

Date Code Title Description
121 Ep: the EPO has been informed by WIPO that EP was designated in this application (Ref document number: 21934414; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)