CN111613269B

CN111613269B - Method for predicting HLA match probability and mismatch type

Info

Publication number: CN111613269B
Application number: CN202010424265.XA
Authority: CN
Inventors: 李杨; 何军
Original assignee: First Affiliated Hospital of Suzhou University
Current assignee: First Affiliated Hospital of Suzhou University
Priority date: 2020-05-19
Filing date: 2020-05-19
Publication date: 2024-01-05
Anticipated expiration: 2040-05-19
Also published as: CN111613269A

Abstract

The invention discloses a method for predicting HLA match probability and mismatch type, comprising the following steps: (1) constructing an HLA database; (2) entering and submitting the genotypes to be compared; (3) converting the format of the alleles entered in step (2) to match the format used in the HLA database; (4) arranging and combining the format-converted genotypes from step (3) to obtain haplotype combinations; (5) comparing the haplotype combinations obtained in step (4) with the HLA database constructed in step (1); and (6) obtaining a prediction result from the comparison. The method helps clinicians select, for confirmatory typing, the preliminarily screened donors with the higher probability of a 10/10 match, and select from 8-9/10 mismatched donors the optimal unrelated donor with a permissible mismatch, thereby reducing patients' testing costs and helping to reduce transplantation complications and improve post-transplant survival.

Description

Method for predicting HLA match probability and mismatch type

Technical Field

The invention belongs to the field of biomedicine, and particularly relates to a method for predicting HLA matching probability and mismatch type.

Background

Human leukocyte antigens (HLA) are not only closely related to allogeneic transplantation but also play an important role in human evolutionary genetics, the occurrence and development of immune diseases, tumor immune escape, and vaccine development. Patients with tumors or organ failure can ultimately be saved only by allogeneic transplantation. HLA is the principal transplantation antigen, and the degree of HLA match or mismatch between patient and donor directly affects the prognosis of transplantation and the long-term survival of the patient.

At present, the potential donors for allogeneic transplantation are related donors and unrelated donors. Before transplantation, the patient's HLA genotyping results are therefore used both to screen related donors and to search for matching unrelated donors in the Chinese bone marrow bank. When no related donor is available for transplantation, it is important to select the best-matched donor from the preliminarily matched unrelated donors. The current search workflow of the Chinese bone marrow bank is as follows: 1) Preliminary screening of unrelated donors returns two kinds of results: high-resolution results for 10 alleles at the five loci HLA-A, B, C, DRB1 and DQB1; and high-resolution or low-resolution results for 6 alleles at the three loci HLA-A, B and DRB1. 2) Confirmatory typing re-tests samples from the patient and from the preliminarily screened donors with varying degrees of match, to confirm the results.

Different patients have different chances of retrieving preliminarily matched donors. Moreover, for preliminarily matched donors with an A, B, DRB1 locus 6/6 high-resolution or low-resolution match, or with a 9/10 match, the probability that each HLA locus turns out matched or mismatched at the confirmatory typing stage differs. The existing search method therefore involves considerable guesswork and high testing costs for patients.

Foreign literature has described using HLA genotyping to predict the probability of finding a matched donor. However, HLA allele and haplotype frequencies in the Chinese population differ markedly from those of other populations, and no complete search system has yet been established and applied in clinical practice.

Disclosure of Invention

In order to solve the above problems in the prior art, an object of the present invention is to provide a method for predicting HLA match probability and mismatch type.

In order to achieve the above object, the present invention provides the following technical solutions:

a method for predicting HLA match probability and mismatch type, comprising the steps of:

(1) Constructing an HLA database;

(2) Inputting and submitting genotypes to be compared, wherein the input genotypes are HLA-A, B, C, DRB1 and DQB1 locus genotypes;

(3) Converting the format of the alleles entered in step (2) to match the format in the HLA database;

(4) Arranging and combining the genotypes subjected to format conversion in the step (3) to obtain a haplotype combination;

(5) Comparing the haplotype combination obtained in the step (4) with the HLA database constructed in the step (1);

(6) Obtaining a prediction result through the comparison.

Further, the HLA database comprises an HLA allele CWD database, an HLA haplotype frequency database and an HLA negative linkage database; wherein the HLA haplotype frequency database comprises a 10/10 match prediction database, a 9/10 match prediction database and a C, DQB1 prediction database.

Further, the HLA allele CWD database comprises genotype, C/WD/R, and frequency; wherein C denotes a common allele, WD a well-documented allele, and R a rare allele (ref: Common and well-documented HLA alleles: report of the Ad-Hoc committee of the American Society for Histocompatibility and Immunogenetics. Human Immunology. 2007);

the 10/10 match prediction database comprises the A-B-C-DRB1-DQB1 haplotype; the A, B, C, DRB1 and DQB1 genotypes; frequency; rank; and C/WD/R;

the 9/10 match prediction database comprises an A mismatch database, a B mismatch database, a C mismatch database, a DRB1 mismatch database and a DQB1 mismatch database; each mismatch database comprises the haplotypes lacking the mismatched locus, together with the mismatched genotypes and frequencies corresponding to each haplotype;

the C, DQB1 prediction database comprises the A-B-DRB1 haplotypes and the C and DQB1 genotypes and frequencies corresponding to each haplotype.

Further, the number of haplotype combinations in step (4) is $2^{n-1}$ groups, where n is the number of gene loci.

Further, the comparison in step (5) comprises a 10/10 match search, a 9/10 match search, an A, B, DRB1 locus 6/6 match search, allele CWD interpretation, and a negative linkage search.

A system for predicting HLA match probability and mismatch type, which executes the above method for predicting HLA match probability and mismatch type, and comprises an HLA database, a genotype entry module, a format conversion module, a genotype combination module and a comparison module;

the HLA database is used for storing reference data;

the genotype input module is used for inputting genotypes to be compared and checking the format of the genotypes;

the format conversion module is used for processing the entered alleles and converting them into a uniform format that matches the HLA database;

the genotype combination module is used for arranging and combining genotypes after format conversion to obtain a haplotype combination;

and the comparison module is used for comparing the haplotype combination with the HLA database constructed in the step (1) to obtain HLA matching probability and mismatch type.

A storage medium for predicting HLA-matched probabilities and mismatch types, the storage medium performing the above-described method of predicting HLA-matched probabilities and mismatch types.

A processor for predicting HLA match probability and mismatch type, the processor being configured to run a program, the program being configured to perform the above-described method of predicting HLA match probability and mismatch type.

A device for predicting HLA match probability and mismatch type for performing the above-described method of predicting HLA match probability and mismatch type.

Beneficial effects: the invention provides a method for predicting HLA match probability and mismatch type. Early in the disease course, the method can predict the probability of retrieving an HLA 10/10 matched or 8-9/10 matched donor, as well as the likely C and DQB1 results of donors with an A, B, DRB1 locus 6/6 high-resolution or low-resolution match. This helps clinicians select, for confirmatory typing, the preliminarily screened donors with the higher 10/10 match probability, and select from 8-9/10 mismatched donors the optimal unrelated donor with a permissible mismatch, thereby reducing patients' testing costs and helping to reduce transplantation complications and improve post-transplant survival.

The method can be implemented on different platforms and in different programming languages, and is easy to implement, develop and popularize. It closely meets practical needs in clinical transplantation work, is highly practical, and can be extended to the fields of tumor immunity and immune diseases. The user only needs to enter the genotype results in the designated positions to obtain all predictions with one click; the method is simple to operate and has broad prospects for development and application.

Drawings

FIG. 1 is a flow chart of a technical route of the method of the present invention.

Detailed Description

The invention is further described below in connection with specific embodiments, which are exemplary only and do not limit the scope of the invention in any way. Those skilled in the art will understand that modifications and substitutions can be made to the details and form of the invention without departing from its spirit and scope.

The invention starts from the need in clinical transplantation to predict donor search outcomes in advance. Building on research findings on HLA allele and haplotype frequencies, CWD status, and linkage disequilibrium, it converts these findings into a search and prediction tool that is easy to develop, operate and popularize. First, a background reference database with a specific format is established and compared against haplotypes in a matching format that are derived virtually from the HLA genotypes entered at the foreground, thereby implementing the search function. Then, by applying screening conditions and preference parameters, the method predicts the patient's probability of retrieving 10/10 and 9/10 matched donors, the likely C and DQB1 results of 6/6 matched donors, and the likely mismatch types.

The technical scheme of the invention comprises six main parts: establishing the background reference database; entering and submitting genotype results at the foreground; processing the genotype data format; generating virtual haplotypes and linked genes; comparing the search data with the reference database; and giving a prediction result according to screening conditions and preference parameters. The technical route of the method is shown in FIG. 1.

In the first part, a Chinese population HLA allele CWD database, an HLA haplotype frequency database and an HLA linkage disequilibrium parameter database are established and written in a format that can be searched and compared, to serve as the background reference databases. The database contents should be hidden to prevent leakage and erroneous modification, protected by measures such as passwords, and updated and maintained only after authorization.

(1) HLA allele CWD reference database format

The database comprises 3 columns, A-C: genotype, C/WD/R, and frequency count. The format is shown in Table 1:

TABLE 1

Genotype	C/WD/R	Frequency count
A*01:01	C	76477
A*01:03	WD	300
A*01:06	R	3
A*01:127	R	2
A*01:129	R	1
A*01:141	R	1

(2) HLA haplotype frequency and CWD reference database format

1) 10/10 match prediction database

The database comprises 9 columns, A-I: the A-B-C-DRB1-DQB1 haplotype; the A, B, C, DRB1 and DQB1 genotypes; frequency; rank; and C/WD/R. The format is shown in Table 2:

TABLE 2

2) 9/10 match prediction database

There are 5 mismatch databases in total: A mismatch (B-C-DRB1-DQB1 haplotype), B mismatch (A-C-DRB1-DQB1 haplotype), C mismatch (A-B-DRB1-DQB1 haplotype), DRB1 mismatch (A-B-C-DQB1 haplotype) and DQB1 mismatch (A-B-C-DRB1 haplotype). Taking the A mismatch database as an example, it comprises 3 columns, A-C: the B-C-DRB1-DQB1 haplotype, the A genotype corresponding to the haplotype, and the frequency count. The format is shown in Table 3:

TABLE 3

3) C, DQB1 prediction database

The database comprises 4 columns, A-D: the A-B-DRB1 haplotype, the C genotype and the DQB1 genotype corresponding to the haplotype, and the frequency count. The format is shown in Table 4:

TABLE 4

(3) HLA negative linkage reference database

The database comprises 4 columns, A-D: linked genes, D′, r² and P value. The format is shown in Table 5:

TABLE 5

Linked genes	D′	r²	P
A*01:01-C*01:02:01G	-0.6784	0.0025	0.0002
A*02:01-C*06:02	-0.6332	0.0066	0.0000
A*02:03-C*03:04	-0.2285	0.0002	0.3161
A*02:03-C*08:01:01G	-0.5882	0.0010	0.0172
A*02:06-C*03:02	-0.7246	0.0023	0.0003

In the second part, the HLA-A, B, C, DRB1 and DQB1 locus genotype results are filled into the genotype entry boxes. The entry requirements are:

(1) Each locus contains two alleles; when one genotype entry box of a locus has content, the other cannot be empty.

(2) The entered content may contain only Arabic numerals, the 26 letters (case-insensitive) and the Western colon ":" (an entered Chinese colon is automatically converted to the Western colon); a letter may not directly follow a colon; and the total input length may not exceed 13 bytes.

(3) The A, B and DRB1 entry boxes are required. If they are not completely filled, the search cannot proceed and the prompt "Please fill in the complete A, B and DRB1 locus results and resubmit the data" is shown.

(4) If only the A, B and DRB1 results are filled in, only the prediction of likely C and DQB1 results can be performed; all prediction functions are performed only when the results for all five loci A, B, C, DRB1 and DQB1 are filled in.

Examples of the genotype entry format are shown in Table 6 below:

TABLE 6
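The entry rules of the second part can be sketched as a small validator. This is a minimal illustration only, not the implementation of the invention: the function names, the dictionary layout of the entry and the return values "all" and "c_dqb1_only" are assumptions introduced here, and the handling of a partially filled C or DQB1 locus (treated as "C, DQB1 prediction only") is likewise an assumption, since the text does not describe that case.

```python
import re

REQUIRED = ("A", "B", "DRB1")
OPTIONAL = ("C", "DQB1")
ALLOWED = re.compile(r"^[0-9A-Za-z:]+$")  # digits, letters and the Western colon only


def normalize(allele: str) -> str:
    """Trim whitespace and convert a Chinese (full-width) colon to a Western colon."""
    return allele.strip().replace("\uff1a", ":")


def check_allele(allele: str) -> bool:
    """Rule (2): allowed characters, no letter directly after a colon, at most 13 bytes."""
    if not ALLOWED.match(allele):
        return False
    if re.search(r":[A-Za-z]", allele):
        return False
    return len(allele.encode("utf-8")) <= 13


def validate_entry(entry: dict) -> str:
    """Return 'all' or 'c_dqb1_only' (hypothetical labels); raise ValueError on bad input."""
    cleaned = {}
    for locus in REQUIRED + OPTIONAL:
        pair = [normalize(a) for a in entry.get(locus, ("", ""))]
        filled = [a for a in pair if a]
        if len(filled) == 1:  # rule (1): both boxes of a locus or neither
            raise ValueError(f"Locus {locus}: both alleles must be entered")
        if not all(check_allele(a) for a in filled):
            raise ValueError(f"Locus {locus}: invalid allele format")
        cleaned[locus] = filled
    if any(not cleaned[locus] for locus in REQUIRED):  # rule (3)
        raise ValueError("Please fill in the complete A, B and DRB1 locus results")
    # Rule (4): all five loci enable every prediction, otherwise C, DQB1 prediction only.
    return "all" if all(cleaned[locus] for locus in OPTIONAL) else "c_dqb1_only"
```

For instance, an entry containing only the A, B and DRB1 pairs would return "c_dqb1_only" under these assumptions, while an entry with one empty B box would be rejected.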

In the third part, the entered alleles are processed and converted into a uniform format that matches the background database:

(1) Four-digit genotype

For every entered allele, only the digits on either side of the first colon are kept; if the last character is G or P, the complete value is kept. For example: 02:06:01:01, 02:06:01 and 02:06 all become 02:06; 24:02:01:02L, 24:02:01L and 24:02L all become 24:02; 35:108:02:01, 35:108:02 and 35:108 all become 35:108; 12:01:01G remains 12:01:01G.
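The four-digit reduction of step (1) can be illustrated with a short function. This is a sketch under the assumption that the input has already passed the entry checks and carries no locus prefix; the function name `to_two_fields` is hypothetical.

```python
def to_two_fields(allele: str) -> str:
    """Keep only the digits on either side of the first colon.

    Values ending in G or P (group names) are returned unchanged.
    """
    if allele[-1:].upper() in ("G", "P"):
        return allele
    parts = allele.split(":")
    if len(parts) < 2:  # nothing to shorten
        return allele
    # Drop any trailing expression letter such as L from the second field.
    second = "".join(ch for ch in parts[1] if ch.isdigit())
    return f"{parts[0]}:{second}"
```

Applied to the examples in the text, `to_two_fields("24:02:01:02L")` gives `"24:02"` and `to_two_fields("12:01:01G")` leaves the value as `"12:01:01G"`.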

(2) G group determination

HLA G groups are a public resource available on the website https://www.ebi.ac.uk/ipd/imgt/HLA/. The invention extracts the names of the G groups, and the alleles they contain, that occur in the HLA allele CWD database and the HLA haplotype frequency database, and writes them in a uniform format into a G group reference database. The G group reference database comprises two columns, A and B: the G group name (e.g. 04:01:01G) and the allele name (e.g. 04:01 or 04:82).

The genotype obtained in step (1) is compared with column B of the background G group reference database; if a match is found, the genotype is converted to the corresponding content of column A. For example, 04:01, 04:82 and 04:01:01G all become 04:01:01G.
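The column B to column A conversion of step (2) amounts to a dictionary lookup. The sketch below assumes one reference table per locus and uses only the two rows given as examples in the text; the names `G_GROUP_REFERENCE` and `to_g_group` are hypothetical.

```python
# Hypothetical excerpt of the G group reference database for one locus:
# key = allele name (column B), value = G group name (column A).
G_GROUP_REFERENCE = {
    "04:01": "04:01:01G",
    "04:82": "04:01:01G",
}


def to_g_group(genotype: str, reference: dict = G_GROUP_REFERENCE) -> str:
    """Return the G group name if the genotype is listed, else the genotype unchanged."""
    return reference.get(genotype, genotype)
```

A value that is already a G group name, such as 04:01:01G, is not found in column B and therefore passes through unchanged, which matches the example in the text.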

(3) Extracting the letter

The last letter is extracted from the entered allele; a null value is returned if the last character is the letter G or P, or is not a letter.

(4) Merging

This step produces two branches. Branch one combines the results of steps (2) and (3); for example, 02:06:01 and 02:06:06 both finally become 02:06; 24:02:01:02L, 24:02:01L and 24:02L all finally become 24:02L; 35:108:02:01, 35:108:02 and 35:108 all finally become 35:108. Branch two combines the results of steps (1) and (3).
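Steps (3) and (4) can be sketched together as follows. The outputs of steps (1) and (2) are passed in as arguments so that the example stays independent of how those steps are implemented; the function names are hypothetical, and "combining" is assumed here to mean appending the extracted letter.

```python
from typing import Tuple


def trailing_letter(entered: str) -> str:
    """Step (3): the last character if it is a letter other than G or P, else ''."""
    last = entered[-1:].upper()
    return last if last.isalpha() and last not in ("G", "P") else ""


def merge(step1: str, step2: str, entered: str) -> Tuple[str, str]:
    """Step (4): branch one = step (2) result + letter, branch two = step (1) result + letter."""
    letter = trailing_letter(entered)
    return step2 + letter, step1 + letter


# Entered 24:02:01:02L: steps (1) and (2) both give 24:02, the letter L is restored.
branch_one, branch_two = merge("24:02", "24:02", "24:02:01:02L")
```

For an entered 04:82, where step (2) maps the allele to its G group, the two branches differ: branch one is 04:01:01G and branch two is 04:82.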

(5) Adding the locus

The locus name is prefixed to the merged alleles of branch one and branch two from step (4) to obtain "locus genotype-1" and "locus genotype-2", respectively.

The "locus genotype-1" format is consistent with the allele format used in the HLA haplotype frequency and CWD reference databases. The results of converting the entered alleles by this branch are shown in Table 7 below:

TABLE 7

The "locus genotype-2" format is consistent with the allele format used in column A of the allele CWD reference database and of the HLA negative linkage reference database. The results of converting the entered alleles by this branch are shown in Table 8 below:

TABLE 8

In the fourth part, the locus genotype-1 values are arranged and combined to virtually obtain $2^{n-1}$ groups ($2^{n}$ haplotypes) of theoretical haplotype results, where n is the number of loci. For example:

(1) With all 5 loci A, B, C, DRB1 and DQB1, 16 groups (32 haplotypes) of theoretical A-B-C-DRB1-DQB1 haplotypes are virtually obtained;

(2) When any 1 locus is omitted, taking the A locus as an example, 8 groups (16 haplotypes) of theoretical B-C-DRB1-DQB1 haplotypes are virtually obtained;

(3) When any 2 loci are omitted, taking the C and DQB1 loci as an example, 4 groups (8 haplotypes) of theoretical A-B-DRB1 haplotypes are virtually obtained.

The alleles of the loci are then joined, separated by the Western hyphen "-", for subsequent comparison with the reference database. The three types of arrangement are shown in Tables 9 to 11 below, respectively:

TABLE 9 A-B-C-DRB1-DQB1 haplotype combinations

TABLE 10 B-C-DRB1-DQB1 (A locus omitted) haplotype combinations

TABLE 11 A-B-DRB1 (C and DQB1 loci omitted) haplotype combinations
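The $2^{n-1}$ groups of the fourth part can be enumerated by fixing the allele order at the first locus and choosing one of the two alleles at each remaining locus; the second haplotype of a group takes the complementary alleles. The sketch below is one possible way to do this, with a hypothetical function name and hypothetical example alleles.

```python
from itertools import product


def haplotype_groups(genotype: dict, loci: list) -> list:
    """Return 2**(n-1) groups, each a pair of complementary virtual haplotypes."""
    groups = []
    for flips in product((0, 1), repeat=len(loci) - 1):
        picks = (0,) + flips  # fixing the first locus avoids listing each group twice
        hap1 = "-".join(genotype[locus][p] for locus, p in zip(loci, picks))
        hap2 = "-".join(genotype[locus][1 - p] for locus, p in zip(loci, picks))
        groups.append((hap1, hap2))
    return groups


# Hypothetical locus genotype-1 values for one patient.
genotype = {
    "A": ("A*02:01", "A*11:01"),
    "B": ("B*46:01", "B*15:02"),
    "C": ("C*01:02", "C*08:01"),
    "DRB1": ("DRB1*09:01", "DRB1*12:02"),
    "DQB1": ("DQB1*03:03", "DQB1*03:01"),
}
five_locus = haplotype_groups(genotype, ["A", "B", "C", "DRB1", "DQB1"])  # 16 groups
no_a_locus = haplotype_groups(genotype, ["B", "C", "DRB1", "DQB1"])       # 8 groups
three_locus = haplotype_groups(genotype, ["A", "B", "DRB1"])              # 4 groups
```

With five heterozygous loci this yields 16 groups and 32 distinct haplotypes, in line with example (1) above.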

Fifth part: comparing the search data with the reference database

(1) Allele CWD interpretation

Each of the 10 locus genotype-2 values is searched for in column A of the background genotype CWD reference database; if a matching genotype is found, the C/WD/R and frequency values in columns B and C are assigned to the designated background positions.

(2) Complete match (10/10 match) search

A-B-C-DRB1-DQB1 haplotype search: each of the 32 virtual haplotypes is searched for in column A of the background A-B-C-DRB1-DQB1 haplotype reference database. If a matching haplotype is found, the genotype, frequency, rank and C/WD/R values in columns B-J are assigned to the designated positions of the background calculation library; if no matching content is found, a null value is assigned. Each group contains two haplotypes; when one haplotype of a group finds matching content, the other haplotype is also displayed at the designated position even if it has no match, with its haplotype CWD shown as "no match" and its rank, frequency and other contents left null.

(3) Gene mismatch (9/10 match) search

The A locus mismatch search is a search of B-C-DRB1-DQB1 haplotypes, which lack the A locus; the B locus mismatch search is a search of A-C-DRB1-DQB1 haplotypes, which lack the B locus; the C locus mismatch search is a search of A-B-DRB1-DQB1 haplotypes, which lack the C locus; the DRB1 locus mismatch search is a search of A-B-C-DQB1 haplotypes, which lack the DRB1 locus; and the DQB1 locus mismatch search is a search of A-B-C-DRB1 haplotypes, which lack the DQB1 locus. There are 16 haplotypes for each mismatch type. Taking the A locus mismatch search as an example, all values in column A of the background A mismatch reference database are compared one by one with the 16 virtual A-mismatch haplotypes; if matching content is found, the A genotypes and frequency counts in columns B and C are assigned to the designated positions of the background calculation library; if no matching content is found, a null value is assigned.
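The single-locus mismatch search can be sketched as: drop the mismatched locus, build the 16 four-locus virtual haplotypes, and look each one up in the corresponding mismatch database. The database is modeled here as a dictionary from haplotype to a list of (mismatched genotype, frequency count) rows; this layout, the function names and all sample values are assumptions for illustration. Whether the patient's own alleles are filtered out of the hits is not described in the text, so no filtering is applied.

```python
from itertools import product

LOCI = ("A", "B", "C", "DRB1", "DQB1")


def virtual_haplotypes(genotype: dict, loci: list) -> list:
    """All 2**len(loci) allele combinations, joined with '-'."""
    return ["-".join(genotype[locus][pick] for locus, pick in zip(loci, picks))
            for picks in product((0, 1), repeat=len(loci))]


def mismatch_search(genotype: dict, mismatched_locus: str, mismatch_db: dict) -> list:
    """Return (haplotype, mismatched genotype, frequency) hits, highest frequency first."""
    remaining = [locus for locus in LOCI if locus != mismatched_locus]
    hits = []
    for hap in virtual_haplotypes(genotype, remaining):
        for allele, freq in mismatch_db.get(hap, []):  # no rows means a null result
            hits.append((hap, allele, freq))
    return sorted(hits, key=lambda hit: hit[2], reverse=True)


# Hypothetical patient genotype and hypothetical rows of the A mismatch database.
genotype = {
    "A": ("A*02:01", "A*11:01"),
    "B": ("B*46:01", "B*15:02"),
    "C": ("C*01:02", "C*08:01"),
    "DRB1": ("DRB1*09:01", "DRB1*12:02"),
    "DQB1": ("DQB1*03:03", "DQB1*03:01"),
}
a_mismatch_db = {
    "B*46:01-C*01:02-DRB1*09:01-DQB1*03:03": [("A*02:07", 120), ("A*02:01", 30)],
    "B*15:02-C*08:01-DRB1*12:02-DQB1*03:01": [("A*11:01", 200)],
}
hits = mismatch_search(genotype, "A", a_mismatch_db)
```

The same function covers the B, C, DRB1 and DQB1 mismatch searches by changing `mismatched_locus` and supplying the matching database.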

(4) A, B, DRB1 locus 6/6 match search

A-B-DRB1 haplotype search: all contents of column A of the background A-B-DRB1 haplotype reference database are compared one by one with the 8 virtual haplotypes. If matching content is found, the C and DQB1 results and frequency counts in columns B-E are assigned to the designated positions of the background calculation library; if no matching content is found, a null value is assigned.

(5) Negative linkage search

The locus genotype-2 values are combined pairwise across loci, namely A-B, A-C, A-DRB1, A-DQB1, B-C, B-DRB1, B-DQB1, C-DRB1, C-DQB1 and DRB1-DQB1, and compared with column A of the background negative linkage reference database; if matching content is found, the matched content is assigned to the designated background positions.
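The pairwise combination used for the negative linkage search can be sketched with `itertools.combinations`. The database is modeled as a dictionary from the linked-gene string of Table 5 to its (D′, r², P) values; the function names and the patient genotype are hypothetical, and pairing every allele of one locus with every allele of the other is an assumption about how the combination is formed.

```python
from itertools import combinations

LOCI = ("A", "B", "C", "DRB1", "DQB1")


def linkage_pairs(genotype: dict) -> list:
    """All cross-locus allele pairs written as 'X-Y', in the locus order of LOCI."""
    pairs = []
    for locus1, locus2 in combinations(LOCI, 2):  # the 10 locus pairs
        for allele1 in genotype[locus1]:
            for allele2 in genotype[locus2]:
                pairs.append(f"{allele1}-{allele2}")
    return pairs


def negative_linkage_hits(genotype: dict, linkage_db: dict) -> dict:
    """Return the database rows whose linked-gene key occurs among the patient's pairs."""
    return {pair: linkage_db[pair] for pair in linkage_pairs(genotype) if pair in linkage_db}


# One row copied from Table 5: linked genes -> (D', r2, P).
linkage_db = {"A*02:01-C*06:02": (-0.6332, 0.0066, 0.0000)}
# Hypothetical locus genotype-2 values for one patient.
genotype = {
    "A": ("A*02:01", "A*11:01"),
    "B": ("B*13:02", "B*15:02"),
    "C": ("C*06:02", "C*08:01"),
    "DRB1": ("DRB1*07:01", "DRB1*12:02"),
    "DQB1": ("DQB1*02:02", "DQB1*03:01"),
}
found = negative_linkage_hits(genotype, linkage_db)
```

Ten locus pairs with two alleles per locus give 40 candidate strings per patient, each checked once against column A.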

Sixth part: giving a prediction result based on the screening conditions and preference parameters, and displaying it in the foreground

(1) Allele CWD interpretation

C/WD/R and the number of cases are displayed in the foreground "genotype CWD" display box. For C, only "C" is displayed; for WD or R, "WD (frequency)" or "R (frequency)" is displayed.
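The CWD lookup and its display rule can be sketched in a few lines. The reference rows below are copied from Table 1; the function name and the choice of an empty string for an allele with no match are assumptions.

```python
# Excerpt of the allele CWD reference database (rows from Table 1):
# genotype -> (C/WD/R, frequency count).
CWD_REFERENCE = {
    "A*01:01": ("C", 76477),
    "A*01:03": ("WD", 300),
    "A*01:06": ("R", 3),
}


def cwd_label(genotype: str, reference: dict = CWD_REFERENCE) -> str:
    """Text for the 'genotype CWD' display box: 'C', 'WD (n)' or 'R (n)'."""
    hit = reference.get(genotype)
    if hit is None:
        return ""  # assumption: nothing is shown when the allele is not in the database
    category, count = hit
    return category if category == "C" else f"{category} ({count})"
```

With these rows, A*01:01 is displayed as "C" and A*01:03 as "WD (300)".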

(2) Complete match (10/10 match) search

1) Calculation of the theoretical 10/10 match probability: half the sum of the two haplotype frequencies (Haplotype Frequency, HF) in a group is the theoretical probability (Haplotype Matching Probability, HMP) that the haplotype group achieves a 10/10 match, i.e. $\mathrm{HMP} = (\mathrm{HF}_1 + \mathrm{HF}_2)/2$. The total probability of a 10/10 match over the 16 haplotype groups (Total Haplotype Matching Probability, THMP) is the sum of the 16 group HMPs, i.e. $\mathrm{THMP} = \sum_{i=1}^{16} \mathrm{HMP}_i$. The THMP value is displayed in the foreground "10/10 match probability prediction" display box.

2) 10/10 match opportunity rating

The rating has 9 levels: "high+, high, high-, medium+, medium, medium-, low+, low, low-".

(1) Each haplotype group is first rated "high", "medium" or "low": if both haplotypes are Common or WD, the group is "high"; if one is Common or WD and the other is Rare, no match or null, the group is "medium"; if both are Rare, no match or null, the group is "low".

(2) Each of the three levels is then subdivided according to the HMP value of the group: "high" is divided into "high+, high, high-" at the boundaries 0.4% and 0.2%; "medium" is divided into "medium+, medium, medium-" at the boundaries 0.1% and 0.05%; "low" is divided into "low+, low, low-" at the boundaries 0.05% and 0.02%.

(3) The number of occurrences of each of the 9 levels is counted.

(4) When the same level occurs two or more times, those occurrences are promoted to one occurrence of the next higher level; finally the highest level is taken as the overall rating and displayed in the foreground "10/10 match probability prediction" display box.

3) The retrieved haplotype groups are sorted from high to low by theoretical 10/10 match probability, and the haplotype groups with their corresponding CWD, rank, frequency and opportunity rating are displayed in the foreground.
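The HMP and THMP formulas and the nine-level rating can be sketched as follows. This is an illustration under stated assumptions rather than the definitive procedure: an unmatched haplotype is given frequency 0, a value exactly on a boundary is placed in the upper level, promotions are applied from the lowest level upward so that they can cascade, and only the retrieved groups are rated. The function names and the use of None for "no match" are hypothetical.

```python
LEVELS = ["low-", "low", "low+", "medium-", "medium", "medium+", "high-", "high", "high+"]
# (upper, lower) HMP boundaries in percent for each base level, as given in the text.
BOUNDS = {"high": (0.4, 0.2), "medium": (0.1, 0.05), "low": (0.05, 0.02)}


def hmp(hf1, hf2):
    """HMP = (HF1 + HF2) / 2; a missing frequency (None) counts as 0 (assumption)."""
    return ((hf1 or 0.0) + (hf2 or 0.0)) / 2


def thmp(groups):
    """THMP = sum of HMP over all groups, each given as (HF1, HF2)."""
    return sum(hmp(hf1, hf2) for hf1, hf2 in groups)


def group_level(cwd1, cwd2, hmp_percent):
    """Rate one group from the CWD status of its two haplotypes and its HMP in percent."""
    good = sum(status in ("C", "WD") for status in (cwd1, cwd2))
    base = {2: "high", 1: "medium", 0: "low"}[good]
    upper, lower = BOUNDS[base]
    if hmp_percent >= upper:
        return base + "+"
    if hmp_percent >= lower:
        return base
    return base + "-"


def overall_rating(levels):
    """Promote levels that occur at least twice, then return the highest level present."""
    counts = [levels.count(name) for name in LEVELS]
    for i in range(len(LEVELS) - 1):  # lowest level first, so promotions can cascade
        if counts[i] >= 2:
            counts[i + 1] += 1
    present = [i for i, count in enumerate(counts) if count > 0]
    return LEVELS[max(present)] if present else ""
```

For the levels described in Example 1 (one "high+", one "medium", one "low" and two "low-"), `overall_rating` returns "high+"; for three "medium+" it returns "high-", as in Example 2.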

(3) Gene mismatch (9/10 match) search

1) The retrieved mismatched haplotypes for A, B, C, DRB1 and DQB1 are sorted by frequency from high to low;

2) The theoretical probability of each mismatch type (Haplotype Mismatching Probability, HMMP) is obtained as $\mathrm{HMMP} = \text{frequency} / \text{number of haplotypes in the database} / 16$;

3) Calculation of the total 9/10 match probability: half the sum of the HMMP values of the A, B, C, DRB1 and DQB1 loci and THMP, i.e. $\mathrm{THMMP} = (\mathrm{HMMP}_{A} + \mathrm{HMMP}_{B} + \mathrm{HMMP}_{C} + \mathrm{HMMP}_{DRB1} + \mathrm{HMMP}_{DQB1} + \mathrm{THMP})/2$;

4) 9/10 match opportunity rating: the THMMP value is divided into three levels, "high", "medium" and "low", at the boundaries 1% and 0.1%;

5) The HMMP values, the overall rating, and the likely mismatched genotypes and probabilities at each of the A, B, C, DRB1 and DQB1 loci are displayed in the foreground.
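The 9/10 formulas above can be written out directly. In this sketch the per-locus HMMP values are supplied by the caller, because the text does not spell out how the hits of one locus are aggregated; the function names are hypothetical, and a THMMP value exactly on a boundary is placed in the upper level by assumption.

```python
def hmmp(frequency, database_haplotypes):
    """HMMP = frequency / number of haplotypes in the database / 16 (a fraction)."""
    return frequency / database_haplotypes / 16


def thmmp(hmmp_by_locus, thmp_value):
    """THMMP = (HMMP_A + HMMP_B + HMMP_C + HMMP_DRB1 + HMMP_DQB1 + THMP) / 2."""
    return (sum(hmmp_by_locus.values()) + thmp_value) / 2


def rate_9_of_10(thmmp_percent):
    """Three-level rating of THMMP (in percent) at the boundaries 1% and 0.1%."""
    if thmmp_percent >= 1.0:
        return "high"
    if thmmp_percent >= 0.1:
        return "medium"
    return "low"


# Hypothetical per-locus HMMP values and THMP, all expressed as fractions.
per_locus = {"A": 0.004, "B": 0.002, "C": 0.001, "DRB1": 0.002, "DQB1": 0.001}
total = thmmp(per_locus, 0.02)        # (0.010 + 0.020) / 2
rating = rate_9_of_10(total * 100)    # convert the fraction to percent before rating
```

Keeping the probabilities as fractions and converting to percent only for the rating avoids mixing the two units in the formulas.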

(4) Prediction of C and DQB1 results for an A, B, DRB1 locus 6/6 match

1) The 8 virtual haplotypes (4 groups) and the retrieved C, DQB1 results are paired within each group. For example, haplotype-1 and haplotype-2 form a group; haplotype-1 retrieves 2 C, DQB1 results, namely C,DQB1-1 and C,DQB1-2, and haplotype-2 retrieves 2 C, DQB1 results, namely C,DQB1-3 and C,DQB1-4. The group then has 4 possible C, DQB1 outcomes: C,DQB1-1 & C,DQB1-3; C,DQB1-1 & C,DQB1-4; C,DQB1-2 & C,DQB1-3; and C,DQB1-2 & C,DQB1-4.

2) The possible C, DQB1 results of the 4 virtual haplotype groups are pooled and sorted by frequency from high to low; the probability of one C, DQB1 result is frequency / total number of haplotypes in the database / 2;

3) All possible C, DQB1 combinations and their probabilities are displayed in the foreground.
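The within-group pairing of C, DQB1 results is a Cartesian product of the hits of the two haplotypes in a group. The sketch below models the C, DQB1 prediction database as a dictionary from A-B-DRB1 haplotype to a list of (C, DQB1, frequency count) rows. Taking the frequency of a paired outcome as the sum of the two row frequencies is an assumption made here by analogy with the HMP formula; the text only gives "frequency / total number of haplotypes in the database / 2". All names and sample values are hypothetical.

```python
from itertools import product


def c_dqb1_predictions(groups, c_dqb1_db, database_haplotypes):
    """Return ((C, DQB1) of hap1, (C, DQB1) of hap2, probability), most probable first."""
    results = []
    for hap1, hap2 in groups:
        rows1 = c_dqb1_db.get(hap1, [])
        rows2 = c_dqb1_db.get(hap2, [])
        # If either haplotype has no hit, the product is empty and the group gives no result.
        for (c1, q1, f1), (c2, q2, f2) in product(rows1, rows2):
            probability = (f1 + f2) / database_haplotypes / 2  # assumed pair frequency
            results.append(((c1, q1), (c2, q2), probability))
    return sorted(results, key=lambda row: row[2], reverse=True)


# Hypothetical groups of A-B-DRB1 virtual haplotypes and hypothetical database rows.
groups = [
    ("A*02:01-B*46:01-DRB1*09:01", "A*11:01-B*15:02-DRB1*12:02"),
    ("A*02:01-B*15:02-DRB1*09:01", "A*11:01-B*46:01-DRB1*12:02"),
]
c_dqb1_db = {
    "A*02:01-B*46:01-DRB1*09:01": [("C*01:02", "DQB1*03:03", 300), ("C*03:04", "DQB1*03:01", 100)],
    "A*11:01-B*15:02-DRB1*12:02": [("C*08:01", "DQB1*03:01", 200), ("C*07:02", "DQB1*05:02", 50)],
    "A*02:01-B*15:02-DRB1*09:01": [("C*08:01", "DQB1*03:03", 10)],
}
predictions = c_dqb1_predictions(groups, c_dqb1_db, 100000)
```

The second group contributes nothing because one of its haplotypes has no database hit, which is the situation described for the null C, DQB1 prediction in Example 3.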

(5) Negative linkage search

Example 1

Example of a 10/10 match opportunity rating of "high+". Group one consists of 2 Common haplotypes, so it is first rated "high"; its HMP value is >0.4%, so it is then rated "high+". Group two consists of 1 Common and 1 Rare haplotype, so it is first rated "medium"; its HMP value satisfies 0.1% > HMP > 0.05%, so it is then rated "medium". Group three consists of 2 Rare haplotypes, so it is first rated "low"; its HMP value is >0.02%, so it is then rated "low". Groups four and five each consist of 1 Rare haplotype and 1 haplotype without a match, so they are first rated "low"; their HMP values are <0.02%, so they are then rated "low-". The two "low-" are promoted to 1 "low", which together with the original 1 "low" is promoted to 1 "low+"; finally the highest level, "high+", is taken as the final 10/10 match rating. The total 9/10 match probability THMMP is >1%, so its rating is "high".

TABLE 12

Example 2

Example of a 10/10 match opportunity rating of "high-". The 3 "medium+" are promoted to 1 "high-", so the highest level, "high-", is taken as the final 10/10 match rating. The total 9/10 match probability satisfies 1% > THMMP > 0.1%, so its rating is "medium".

TABLE 13

Example 3

Example of a 10/10 match opportunity rating of "medium+". The C, DQB1 prediction is null because one haplotype in each group has no match, so the within-group pairing step yields no result; the C and DQB1 predictions of the 9/10 match search may be consulted instead.

TABLE 14

Example 4

Example of a 10/10 match opportunity rating of "low+". The 3 "low-" are promoted to 1 "low", which together with the original 2 "low" is promoted to 1 "low+", so the highest level, "low+", is taken as the final rating. The total 9/10 match probability THMMP is <0.1%, so its rating is "low".

TABLE 15

Example 5

Example of genotype CWD interpretation results. The A genotype is WD, so its number of cases must be displayed; the rest are C, so the number of cases need not be displayed.

TABLE 16

Example 6

Example of predicting the C and DQB1 results of donors with an A, B, DRB1 locus 6/6 match. Two sets of C, DQB1 results were predicted, but the first set was significantly more probable than the second.

TABLE 17

In each of Examples 1 to 4, an unrelated donor search was performed. The unrelated donor search page displays only the first 15 retrieved donors, so when more than 15 donors are retrieved, only the displayed donors are counted for the time being. The results displayed on the search page were compared with the predictions of the invention to verify their reliability. In these cases the 10/10 match and 9/10 match predictions and the predicted mismatch types were highly consistent with the patients' actual chances of retrieving unrelated donors, and the mismatch types shown on the web page largely fell within the predicted mismatch types, as detailed in the following table:

TABLE 18

The patient of Example 6 had two donors with an A, B, DRB1 locus 6/6 match who underwent confirmatory typing, so the predicted C and DQB1 results were compared with the actual confirmatory typing results. The actual results fully matched the predicted high-probability results, as shown in the following table:

TABLE 19

Claims

1. A method for predicting HLA match probability and mismatch type, comprising the steps of: (1) constructing an HLA database; (2) inputting and submitting genotypes to be compared, wherein the input genotypes are HLA-A, B, C, DRB1 and DQB1 locus genotypes; (3) converting the format of the alleles entered in step (2) to match the format in the HLA database; (4) arranging and combining the genotypes subjected to format conversion in step (3) to obtain a haplotype combination; (5) comparing the haplotype combination obtained in step (4) with the HLA database constructed in step (1); (6) obtaining a prediction result through the comparison; the HLA database comprises an HLA allele CWD database, an HLA haplotype frequency database and an HLA negative linkage database; wherein the HLA haplotype frequency database comprises a 10/10 match prediction database, a 9/10 match prediction database and a C, DQB1 prediction database;

the HLA allele CWD database comprises genotype, C/WD/R, frequency;

2. The method of claim 1, wherein the number of haplotype combinations in step (4) is $2^{n-1}$, where n is the number of gene loci.

3. The method of claim 1, wherein the comparison in step (5) comprises a 10/10 match search, a 9/10 match search, an A, B, DRB1 locus 6/6 match search, allele CWD interpretation, and a negative linkage search.

4. A system for predicting HLA match probability and mismatch type, wherein the system performs the method for predicting HLA match probability and mismatch type according to any one of claims 1 to 3, and the system comprises an HLA database, a genotype entry module, a format conversion module, a genotype combination module, and a comparison module;

the HLA database is used for storing reference data;

the format conversion module is used for processing and converting the input alleles and unifying a format matched with an HLA database;

5. A storage medium for predicting HLA match probability and mismatch type, wherein the storage medium performs the method for predicting HLA match probability and mismatch type according to any one of claims 1 to 3.

6. A processor for predicting HLA match probability and mismatch type, wherein the processor is configured to run a program, the program being configured to execute the method for predicting HLA match probability and mismatch type according to any one of claims 1 to 3.

7. A device for predicting HLA match probability and mismatch type, characterized in that it is used for carrying out the method for predicting HLA match probability and mismatch type according to any one of claims 1-3.