CN111613269A

CN111613269A - Method for predicting HLA matching probability and mismatch type

Info

Publication number: CN111613269A
Application number: CN202010424265.XA
Authority: CN
Inventors: 李杨; 何军
Original assignee: First Affiliated Hospital of Suzhou University
Current assignee: First Affiliated Hospital of Suzhou University
Priority date: 2020-05-19
Filing date: 2020-05-19
Publication date: 2020-09-01
Anticipated expiration: 2040-05-19
Also published as: CN111613269B

Abstract

The invention discloses a method for predicting HLA matching probability and mismatch types, comprising the following steps: (1) constructing an HLA database; (2) entering and submitting the genotypes to be compared; (3) converting the format of the alleles entered in step (2) into a format matching that of the HLA database; (4) permuting and combining the format-converted genotypes from step (3) to obtain haplotype combinations; (5) comparing the haplotype combinations obtained in step (4) with the HLA database constructed in step (1); and (6) obtaining a prediction result from the comparison. The method helps clinicians select, for confirmatory typing, the 10/10 initially screened donors with the higher matching probability, and select the best unrelated donor with a permissible mismatch from among 8-9/10 mismatched donors, thereby reducing testing costs for patients and helping to reduce transplantation complications and improve post-transplant survival.

Description

Method for predicting HLA matching probability and mismatch type

Technical Field

The invention belongs to the field of biomedicine, and particularly relates to a method for predicting HLA matching probability and mismatch types.

Background

Human leukocyte antigen (HLA) is closely associated with allogeneic transplantation and also plays an important role in the genetics of human evolution, the occurrence and development of immune diseases, tumor escape and vaccine preparation. For patients with advanced tumors or organ failure, allogeneic transplantation is ultimately the only means of survival. HLA is the most important transplantation antigen, and the degree of HLA match or mismatch between patient and donor directly affects the prognosis of the transplantation and the long-term survival of the patient.

At present, potential donors for allogeneic transplantation include related donors and unrelated donors. Before transplantation, the patient's HLA genotyping result is therefore used to screen related donors and to search for matching unrelated donors in the Chinese bone marrow bank. When the patient has no related donor available for transplantation, it is important to select the best-matched donor from among the initially matched unrelated donors. The current Chinese bone marrow bank search process is as follows: 1) initial screening of unrelated donors yields two kinds of results: high-resolution results for 10 alleles at the five loci HLA-A, B, DRB1, C and DQB1; or high-resolution or low-resolution results for 6 alleles at the three loci HLA-A, B and DRB1. 2) Confirmatory typing re-confirms the samples and results of the patient and of the initially screened donors with different degrees of matching.

The chance of retrieving an initially matched donor varies from patient to patient. For initially matched donors with a 6/6 high-resolution or low-resolution match at the A, B and DRB1 loci, or with a 9/10 match, the probability of a match or mismatch at each HLA locus differs at the confirmatory typing stage. The current search method therefore suffers from considerable blindness and high testing costs for patients.

The HLA allele and haplotype frequencies of the Chinese population differ markedly from those of other populations. Although the foreign literature describes predicting donor matching probability from HLA genotyping, no complete retrieval system has been established and applied to clinical cases.

Disclosure of Invention

In order to solve the above problems of the prior art, the present invention provides a method for predicting HLA match probability and mismatch type.

In order to achieve the purpose, the invention provides the following technical scheme:

a method for predicting HLA match probability and mismatch type, comprising the steps of:

(1) constructing an HLA database;

(2) recording and submitting genotypes to be compared, wherein the recorded genotypes are HLA-A, B, C, DRB1 and DQB1 locus genotypes;

(3) converting the format of the alleles entered in step (2) to match the format in the HLA database;

(4) arranging and combining the genotypes subjected to format conversion in the step (3) to obtain a haplotype combination;

(5) comparing the haplotype combination obtained in the step (4) with the HLA database constructed in the step (1);

(6) obtaining a prediction result from the comparison.

Further, the HLA database comprises an HLA allele CWD database, an HLA haplotype frequency database and an HLA negative linkage database; wherein the HLA haplotype frequency database comprises 10/10 concordance prediction database, 9/10 concordance prediction database and a C, DQB1 prediction database.

Further, the HLA allele CWD database comprises genotype, C/WD/R and frequency, wherein C denotes a Common allele, WD a Well-Documented allele, and R a Rare allele (reference: Common and well-documented HLA alleles: report of the Ad-Hoc committee of the American Society for Histocompatibility and Immunogenetics, 2007);

the 10/10 concordance prediction database comprises A-B-C-DRB1-DQB1 haplotype, A, B, C, DRB1, DQB1 genotype, frequency, sequence, C/WD/R;

the 9/10 concordant predictive databases include an A mismatch database, a B mismatch database, a C mismatch database, a DRB1 mismatch database, a DQB1 mismatch database; each mismatch database comprises mismatched haplotypes, mismatched genotypes corresponding to the haplotypes and frequency;

the C, DQB1 prediction database contains the haplotype of A-B-DRB1, the C, DQB1 genotype corresponding to the haplotype, and frequency.

Further, the number of haplotype combinations in step (4) is $2^{n-1}$ groups, where n is the number of gene loci.

Further, the alignment mode of the step (5) comprises 10/10 concordance search, 9/10 concordance search, A, B, DRB1 locus 6/6 concordance search, allele CWD interpretation and negative linkage search.

A system for predicting HLA match probability and mismatch type, the system executes the above method for predicting HLA match probability and mismatch type, the system includes HLA database, genotype recording module, format conversion module, genotype combination module and comparison module;

the HLA database is used for storing reference data;

the genotype input module is used for inputting the genotypes to be compared and checking the formats of the genotypes;

the format conversion module is used for processing and converting the entered alleles to unify them into a format matching the HLA database;

the genotype combination module is used for arranging and combining the genotypes after format conversion to obtain haplotype combination;

and the comparison module is used for comparing the haplotype combination with the HLA database constructed in the step (1) to obtain HLA matching probability and mismatch types.

A storage medium for predicting HLA match probability and mismatch type, the storage medium performing the above-described methods of predicting HLA match probability and mismatch type.

A processor for predicting HLA match probability and mismatch type, the processor being configured to execute a program that executes the above-described method of predicting HLA match probability and mismatch type.

A device for predicting HLA match probability and mismatch type for implementing the above-described methods of predicting HLA match probability and mismatch type.

Beneficial effects: the method can predict, early in the disease, the probability of finding an HLA 10/10 matched donor or an 8-9/10 matched donor, and can predict the possible C and DQB1 results of donors with a 6/6 high-resolution or low-resolution match at the A, B and DRB1 loci. It helps clinicians select, for confirmatory typing, the 10/10 initially screened donors with the higher matching probability, and select the best unrelated donor with a permissible mismatch from among 8-9/10 mismatched donors, thereby reducing testing costs for patients and helping to reduce transplantation complications and improve post-transplant survival.

The method of the invention can be used for code development on different platforms and by adopting different programming languages, and is easy to realize, develop and popularize; closely meets the actual requirements in clinical transplantation work, has strong practicability, and can be extended to the fields of tumor immunity and immune diseases; the user can complete the prediction of all the results by only inputting the genotype results into the designated positions, the operation is simple, and the method has wide development and application prospects.

Drawings

FIG. 1 is a technical route flow diagram of the method of the present invention.

Detailed Description

The present invention is further described below with reference to specific examples, which are only exemplary and do not limit the scope of the present invention in any way. It will be understood by those skilled in the art that modifications or substitutions, additions or deletions to the details and forms of the present invention may be made without departing from the spirit and scope of the invention, and these modifications and substitutions are within the scope of the invention.

The invention starts from the need in clinical transplantation to predict donor search results. Building on a thorough understanding of research results such as HLA allele and haplotype frequencies, CWD status, interrelationships and linkage disequilibrium, it converts these research results into a retrieval and prediction tool that is easy to develop, operate and popularize. First, a background reference database with a specific format is established; foreground haplotypes in a matching format, virtualized from the HLA genotype, are compared against it to realize the retrieval function. Then, by setting screening conditions and preferred parameters, the method predicts the probability that the patient retrieves 10/10 or 9/10 matched donors, the possible C and DQB1 results of 6/6 matched donors, and the possible donor-patient mismatch types.

The technical scheme of the invention comprises six main parts: establishing a background reference database, entering and submitting foreground genotype results, processing the genotype data format, generating virtual haplotypes and linked genes, comparing the retrieval data with the reference database, and giving a prediction result according to screening conditions and preferred parameters. The technical route flow chart of the method is shown in FIG. 1.

The first part establishes a CWD database, an HLA haplotype frequency database and an HLA linkage disequilibrium parameter database for the Chinese population, and compiles them into a format suitable for retrieval and comparison to serve as the background reference database. The database contents are hidden to avoid leakage and erroneous modification, the database is protected by a password, and it can be updated and maintained after authorization.

(1) HLA allele CWD reference database format

Contains 3 columns A-C, which are genotype, C/WD/R and frequency respectively, and the format is shown in Table 1:

TABLE 1

Genotype	C/WD/R	Frequency
A*01:01	C	76477
A*01:03	WD	300
A*01:06	R	3
A*01:127	R	2
A*01:129	R	1
A*01:141	R	1

(2) HLA haplotype frequency and CWD reference database format

1) 10/10 match prediction database

It contains 9 columns, A to I: the A-B-C-DRB1-DQB1 haplotype; the A, B, C, DRB1 and DQB1 genotypes; frequency; rank; and C/WD/R. The format is shown in Table 2:

TABLE 2

2) 9/10 match prediction database

There are 5 mismatch databases in total: A mismatch (B-C-DRB1-DQB1 haplotype), B mismatch (A-C-DRB1-DQB1 haplotype), C mismatch (A-B-DRB1-DQB1 haplotype), DRB1 mismatch (A-B-C-DQB1 haplotype) and DQB1 mismatch (A-B-C-DRB1 haplotype). Taking the A mismatch database as an example, it contains 3 columns, A to C: the A-mismatch (B-C-DRB1-DQB1) haplotype, the A genotype corresponding to that haplotype, and the frequency. The format is shown in Table 3:

TABLE 3

3) C, DQB1 prediction database

It contains the A-B-DRB1 haplotype, the C, DQB1 genotypes corresponding to that haplotype, and the frequency. The format is shown in Table 4:

TABLE 4

(3) HLA negative linkage reference database

It contains 4 columns, A to D: the linked genes, D′, r² and the P value. The format is shown in Table 5:

TABLE 5

Linked genes	D′	r²	P
A*01:01-C*01:02:01G	-0.6784	0.0025	0.0002
A*02:01-C*06:02	-0.6332	0.0066	0.0000
A*02:03-C*03:04	-0.2285	0.0002	0.3161
A*02:03-C*08:01:01G	-0.5882	0.0010	0.0172
A*02:06-C*03:02	-0.7246	0.0023	0.0003

In the second part, the genotype results for the HLA-A, B, C, DRB1 and DQB1 loci are entered in the genotype entry boxes. The entry requirements are: (1) each locus has two alleles, and when one of the two entry boxes of a locus is filled in, the other cannot be empty; (2) an entry may contain only Arabic numerals, the 26 letters (case-insensitive) and the Western colon ":" (a Chinese colon is automatically converted to the Western form), a letter may not directly follow a colon, and the total length of an entry cannot exceed 13 bytes; (3) the A, B and DRB1 entry boxes are mandatory: if they are not completely filled in, retrieval cannot proceed and a prompt asks the user to complete the A, B and DRB1 results before submitting; (4) if only the A, B and DRB1 results are filled in, only the prediction of the possible C and DQB1 results can be performed; the results for all five loci, A, B, C, DRB1 and DQB1, must be filled in to perform all prediction functions. Examples of the genotype entry format are shown in Table 6 below:
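The entry rules above can be sketched as a small validator. This is only an illustrative reading of rules (1) to (3): the function names, the upper-casing of letters and the dictionary layout are assumptions, not part of the patent.

```python
import re

MAX_LEN = 13                     # maximum entry length stated in rule (2)
REQUIRED = ("A", "B", "DRB1")    # mandatory loci from rule (3)

def normalize_entry(text: str) -> str:
    """Trim, convert a full-width (Chinese) colon to ':', and upper-case."""
    return text.strip().replace("\uff1a", ":").upper()

def is_valid_entry(text: str) -> bool:
    """Check a single allele entry against rule (2)."""
    value = normalize_entry(text)
    # Only digits, letters and colons are allowed.
    if not re.fullmatch(r"[0-9A-Z:]+", value):
        return False
    if len(value) > MAX_LEN:
        return False
    # A letter may not directly follow a colon.
    return re.search(r":[A-Z]", value) is None

def can_submit(entries: dict) -> bool:
    """entries maps a locus name to its pair of entry-box strings."""
    for locus, (first, second) in entries.items():
        filled = [bool(first.strip()), bool(second.strip())]
        if any(filled) and not all(filled):
            return False          # rule (1): both boxes or neither
        if all(filled) and not (is_valid_entry(first) and is_valid_entry(second)):
            return False
    # Rule (3): A, B and DRB1 must be present and filled in.
    return all(
        locus in entries and all(box.strip() for box in entries[locus])
        for locus in REQUIRED
    )
```

A front end would presumably call `can_submit` before enabling retrieval and decide from the presence of C and DQB1 entries which prediction functions to run.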

TABLE 6

In the third part, the entered alleles are processed and converted to unify them into a format matching the background database:

(1) Taking the four-digit genotype

For every entered allele, only the first colon and the Arabic numerals immediately before and after it are kept; if the last character is G or P, the whole value is kept. For example: 02:06:01:01, 02:06:01 or 02:06 becomes 02:06; 24:02:01:02L, 24:02:01L or 24:02L becomes 24:02; 35:108:02:01, 35:108:02 or 35:108 becomes 35:108; 12:01:01G remains 12:01:01G.
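A minimal sketch of the four-digit step, assuming entries have already passed the entry checks; the function name `take_two_fields` is hypothetical.

```python
import re

def take_two_fields(entered: str) -> str:
    """Keep the digits around the first colon; keep G/P names whole."""
    allele = entered.strip().upper()
    if allele.endswith(("G", "P")):
        return allele                      # G or P group names are kept as entered
    match = re.match(r"(\d+):(\d+)", allele)
    if match is None:
        raise ValueError(f"not an allele name: {entered!r}")
    return f"{match.group(1)}:{match.group(2)}"
```

Expression-suffix letters such as L are dropped here on purpose; the text handles them separately in the letter-grabbing step.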

(2) G group judgment

HLA G groups are a public resource available online at https://www.ebi.ac.uk/ipd/imgt/HLA/. The invention extracts the names of the G groups involved in the HLA allele CWD database and the HLA haplotype frequency database, together with the alleles each contains, and compiles them in a unified format into a G group reference database with two columns, A and B: the G group name (for example 04:01:01G) and the allele name (for example 04:01 or 04:82).

The genotype obtained after four-digit selection in step (1) is compared with column B of the background G group reference database; if a match is found, the allele is assigned the content of column A. For example, 04:01, 04:82 or 04:01:01G becomes 04:01:01G.
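The G group judgment amounts to a lookup from allele name (column B) to G group name (column A). The sketch below uses only the two example rows quoted in the text; a real table would be much larger and, as an assumption worth checking, would probably need to be keyed per locus, since the same numeric name can occur at different loci.

```python
# Rows of the hypothetical G group reference table: (column A, column B).
G_GROUP_ROWS = [
    ("04:01:01G", "04:01"),
    ("04:01:01G", "04:82"),
]

# Index column B so that each allele name maps to its G group name.
ALLELE_TO_G_GROUP = {allele: group for group, allele in G_GROUP_ROWS}

def to_g_group(four_digit: str) -> str:
    """Return the G group name if the allele belongs to one, else the input."""
    return ALLELE_TO_G_GROUP.get(four_digit, four_digit)
```

An entry that is already a G group name, such as 04:01:01G, falls through unchanged, which matches the example in the text.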

(3) Taking the suffix letter

The last letter is taken from the entered allele; a null value is returned if the last character is the letter G or P, or is not a letter.

(4) Merging

This step produces two branches. Branch one combines the results of steps (2) and (3); for example, 02:06:01:01 or 02:06 finally becomes 02:06; 24:02:01:02L, 24:02:01L or 24:02L finally becomes 24:02L; and 35:108:02:01, 35:108:02 or 35:108 finally becomes 35:108. Branch two combines the results of steps (1) and (3).
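The letter step and the merge can be sketched as follows. The helper names are hypothetical, and the `core` argument stands for whichever earlier result a branch uses (the G group result for branch one, the four-digit result for branch two).

```python
def grab_suffix(entered: str) -> str:
    """Return the trailing letter, or '' if it is G, P or not a letter."""
    last = entered.strip().upper()[-1:]
    if last.isalpha() and last not in ("G", "P"):
        return last
    return ""

def merge(core: str, entered: str) -> str:
    """Append the suffix letter of the entered allele to a converted core."""
    return core + grab_suffix(entered)
```

For instance, `merge("24:02", "24:02:01L")` gives `"24:02L"`, while a G group name is returned without any extra letter.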

(5) Adding the locus

The locus is added to the merged alleles of branch one and branch two from step (4) to obtain "locus-added genotype-1" and "locus-added genotype-2", respectively.

The format of "locus-added genotype-1" is consistent with the allele format used in column A of the HLA haplotype frequency and CWD reference database; the results of converting the entered alleles by this scheme are shown in Table 7 below:

TABLE 7

The format of "locus-added genotype-2" is consistent with the allele format used in column A of the allele CWD reference database and the HLA negative linkage reference database; the results of converting the entered alleles by this scheme are shown in Table 8 below:

TABLE 8

In the fourth part, "locus-added genotype-1" is permuted and combined to virtualize $2^{n-1}$ groups ($2^n$ haplotypes) of theoretical haplotypes, where n is the number of loci. For example:

(1) with the 5 loci A, B, C, DRB1 and DQB1, 16 groups (32 haplotypes) of theoretical A-B-C-DRB1-DQB1 haplotypes can be virtualized;

(2) when any 1 locus is omitted, taking the A locus as an example, 8 groups (16 haplotypes) of theoretical B-C-DRB1-DQB1 haplotypes can be virtualized;

(3) when any 2 loci are omitted, taking the C and DQB1 loci as an example, 4 groups (8 haplotypes) of theoretical A-B-DRB1 haplotypes can be virtualized.

The alleles at each locus are then joined, separated by a Western hyphen "-", for subsequent comparison with the reference database. The three kinds of permutation and combination above are shown in Tables 9-11 below, respectively:

TABLE 9 A-B-C-DRB1-DQB1 haplotype combinations

TABLE 10 B-C-DRB1-DQB1 (A locus omitted) haplotype combinations

TABLE 11 A-B-DRB1 (C and DQB1 loci omitted) haplotype combinations
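The group counts of 16, 8 and 4 follow from pairing each virtual haplotype with its complement. The sketch below is one way to enumerate the groups; the function name and the sample genotype are illustrative only and are not taken from the patent tables.

```python
from itertools import product

def haplotype_groups(genotype: dict, loci: list) -> list:
    """Return the 2**(n-1) complementary haplotype pairs for n loci."""
    groups = []
    # Fixing the first locus to its first allele avoids listing a pair twice.
    for picks in product((0, 1), repeat=len(loci) - 1):
        choice = (0,) + picks
        hap1 = "-".join(genotype[locus][c] for locus, c in zip(loci, choice))
        hap2 = "-".join(genotype[locus][1 - c] for locus, c in zip(loci, choice))
        groups.append((hap1, hap2))
    return groups

# Placeholder genotype: two locus-prefixed alleles per locus.
GENOTYPE = {
    "A": ("A*02:01", "A*11:01"),
    "B": ("B*15:01", "B*40:01"),
    "C": ("C*03:04", "C*07:02"),
    "DRB1": ("DRB1*09:01", "DRB1*15:01"),
    "DQB1": ("DQB1*03:03", "DQB1*06:02"),
}
ALL_LOCI = ["A", "B", "C", "DRB1", "DQB1"]
```

Calling `haplotype_groups(GENOTYPE, ["B", "C", "DRB1", "DQB1"])` yields the 8 groups used for the A-mismatch search, and `["A", "B", "DRB1"]` yields the 4 groups used for the C, DQB1 prediction.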

The fifth part: comparing the retrieval data with the reference database

(1) Allele CWD interpretation

The 10 "locus-added genotype-2" values are each searched for in column A of the background genotype CWD reference database; if a matching genotype is found, the C/WD/R category and frequency from columns B and C are assigned to a designated position of the background.

(2) Perfect match (10/10 congruence) retrieval

A-B-C-DRB1-DQB1 haplotype retrieval: the 32 virtual haplotypes are searched for in column A of the background A-B-C-DRB1-DQB1 haplotype reference database. If a matching haplotype is found, the genotypes, frequency, rank and C/WD/R in columns B-I are assigned to the designated positions of the background calculation database; if no match is found, a null value is assigned. Each group contains two haplotypes; when one haplotype of a group finds a match, the other is still displayed at its designated position even if it is unmatched, with its haplotype CWD set to "no match" and its rank, frequency and other fields set to null.

(3) One gene mismatch (9/10 consensus) search

A locus mismatch retrieval is the retrieval of B-C-DRB1-DQB1 haplotypes with the A locus omitted; B locus mismatch retrieval is the retrieval of A-C-DRB1-DQB1 haplotypes with the B locus omitted; C locus mismatch retrieval is the retrieval of A-B-DRB1-DQB1 haplotypes with the C locus omitted; DRB1 locus mismatch retrieval is the retrieval of A-B-C-DQB1 haplotypes with the DRB1 locus omitted; and DQB1 locus mismatch retrieval is the retrieval of A-B-C-DRB1 haplotypes with the DQB1 locus omitted. There are 16 haplotypes for each mismatch type. Taking A locus mismatch retrieval as an example, all values in column A of the background reference database are compared one by one with the 16 virtual A-mismatch haplotypes; if a match is found, the A genotype and frequency from columns B and C are assigned to the designated positions of the background calculation database; if no match is found, a null value is assigned.

(4) A, B, DRB1 site 6/6 consensus search

A-B-DRB1 haplotype retrieval: all contents of column A of the background A-B-DRB1 haplotype reference database are compared one by one with the 8 virtual haplotypes; if a match is found, the C and DQB1 results and frequency from columns B-E are assigned to the designated positions of the background calculation database; if no match is found, a null value is assigned.

(5) Negative linkage search

The "locus-added genotype-2" values are combined pairwise by locus, in the order A-B, A-C, A-DRB1, A-DQB1, B-C, B-DRB1, B-DQB1, C-DRB1, C-DQB1 and DRB1-DQB1, and compared with column A of the background negative linkage reference database; if a match is found, the content of column A is assigned to a designated position of the background.
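The pairwise combination can be sketched with the standard library. The reference rows are copied from Table 5; pairing every allele of one locus with every allele of the other (four allele pairs per locus pair) is an assumption about how the combination is done, and the function names are hypothetical.

```python
from itertools import combinations, product

LOCI = ["A", "B", "C", "DRB1", "DQB1"]

# Two rows from Table 5: linked genes -> (D', r squared, P).
NEGATIVE_LINKAGE = {
    "A*02:01-C*06:02": (-0.6332, 0.0066, 0.0000),
    "A*02:06-C*03:02": (-0.7246, 0.0023, 0.0003),
}

def linked_pairs(genotype: dict) -> list:
    """List allele pairs for every pair of loci, in the order given in the text."""
    pairs = []
    for locus1, locus2 in combinations(LOCI, 2):
        for allele1, allele2 in product(genotype[locus1], genotype[locus2]):
            pairs.append(f"{allele1}-{allele2}")
    return pairs

def negative_linkage_hits(genotype: dict, reference: dict) -> dict:
    """Return the reference rows whose linked genes occur in the genotype."""
    return {p: reference[p] for p in linked_pairs(genotype) if p in reference}
```

`combinations(LOCI, 2)` produces exactly the ten locus pairs listed above, from A-B through DRB1-DQB1.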

In the sixth part, the prediction results are given according to the screening conditions and preferred parameters and are displayed in the foreground.

(1) Allele CWD interpretation

The C/WD/R category and the frequency are displayed in the foreground genotype CWD display box: C is displayed simply as C, whereas WD or R is displayed together with its frequency, as WD (frequency) or R (frequency).
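As a small illustration of the display rule, the sketch below looks up an allele in a CWD table and formats the label. The rows come from Table 1; the function name and the empty string returned for an unlisted allele are assumptions.

```python
# Rows from Table 1: genotype -> (C/WD/R category, frequency).
CWD_TABLE = {
    "A*01:01": ("C", 76477),
    "A*01:03": ("WD", 300),
    "A*01:06": ("R", 3),
}

def cwd_label(allele: str) -> str:
    """Format the CWD display text for one allele."""
    if allele not in CWD_TABLE:
        return ""                      # assumed behaviour when nothing matches
    category, frequency = CWD_TABLE[allele]
    if category == "C":
        return "C"                     # Common alleles show the category only
    return f"{category} ({frequency})" # WD and R also show the frequency
```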

(2) Perfect match (10/10 congruence) retrieval

1) Calculation of the theoretical 10/10 match probability: half of the sum of the haplotype frequencies (HF) of the two haplotypes in a group is the theoretical probability that the haplotype combination is 10/10 matched (HMP), i.e. $\mathrm{HMP} = (\mathrm{HF}_1 + \mathrm{HF}_2)/2$. The total 10/10 matching probability of the 16 haplotype groups (THMP) is the sum of the 16 HMP values, i.e. $\mathrm{THMP} = \sum_{i=1}^{16} \mathrm{HMP}_i$. The THMP value is displayed in the foreground "10/10 match probability prediction" display box.
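The two quantities can be sketched directly from their definitions. Treating a null (unmatched) haplotype frequency as zero is an assumption, as are the function names.

```python
from typing import Optional, Sequence, Tuple

Freq = Optional[float]  # None stands for a haplotype with no match

def hmp(hf1: Freq, hf2: Freq) -> float:
    """Theoretical 10/10 match probability of one haplotype group."""
    return ((hf1 or 0.0) + (hf2 or 0.0)) / 2

def thmp(groups: Sequence[Tuple[Freq, Freq]]) -> float:
    """Total 10/10 match probability: the sum of HMP over all groups."""
    return sum(hmp(hf1, hf2) for hf1, hf2 in groups)
```

For a full five-locus genotype, `groups` would hold the 16 frequency pairs retrieved for the 16 haplotype groups.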

2) Judgment of the 10/10 match chance

The chance is expressed on 9 levels: "high+, high, high-, medium+, medium, medium-, low+, low, low-". First, the 10/10 match chance of a haplotype combination is divided into three tiers, high, medium and low: if both haplotypes are Common or WD, the tier is "high"; if one is Common or WD and the other is Rare, unmatched or null, the tier is "medium"; if both are Rare, unmatched or null, the tier is "low". Second, each tier is subdivided according to the HMP value of the combination: "high" is divided into "high+, high, high-" with HMP values of 0.4% and 0.2% as the boundaries; "medium" is divided into "medium+, medium, medium-" with HMP values of 0.1% and 0.05% as the boundaries; "low" is divided into "low+, low, low-" with HMP values of 0.05% and 0.02% as the boundaries. Third, the number of occurrences of each of the nine levels is counted. Fourth, the overall 10/10 match chance of the 16 haplotype groups is evaluated: when the same level occurs twice or more, it is promoted by one level, and the highest level is finally taken as the overall level and displayed in the foreground "10/10 match chance prediction" display box.
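The grading rule can be sketched as below, writing "medium" for the middle tier and expressing percentages as fractions. Several details are assumptions inferred from the worked examples rather than stated rules: values exactly on a boundary keep the plain level, and two or more occurrences of a level yield exactly one occurrence of the next level up, cascading from the lowest level.

```python
from collections import Counter

LEVELS = ["low-", "low", "low+", "medium-", "medium", "medium+",
          "high-", "high", "high+"]
# (upper, lower) HMP boundaries per tier, as fractions (0.4% == 0.004).
BOUNDS = {"high": (0.004, 0.002), "medium": (0.001, 0.0005), "low": (0.0005, 0.0002)}

def level(cwd1, cwd2, hmp_value):
    """Grade one haplotype group from the two CWD categories and its HMP."""
    well_known = sum(c in ("C", "WD") for c in (cwd1, cwd2))
    tier = ("low", "medium", "high")[well_known]
    upper, lower = BOUNDS[tier]
    if hmp_value > upper:
        return tier + "+"
    if lower > hmp_value:
        return tier + "-"
    return tier

def overall(levels):
    """Promote repeated levels by one step, then return the highest level."""
    counts = Counter(levels)
    for i, name in enumerate(LEVELS[:-1]):
        if counts[name] >= 2:
            counts[LEVELS[i + 1]] += 1
    for name in reversed(LEVELS):
        if counts[name] > 0:
            return name
    return None
```

With three "low-" and two "low" groups, the three "low-" become one "low", which joins the two existing "low" to give one "low+", the pattern described in Example 4.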

3) The haplotype combinations retrieved are ranked from high to low according to the theoretical probability of 10/10 concordance and displayed in the foreground along with the corresponding CWD, rank, frequency and chance.

(3) One gene mismatch (9/10 consensus) search

1) The searched A, B, C, DRB1 and DQB1 mismatched haplotypes are arranged from large to small according to frequency;

2) the theoretical probability of a mismatch type (HMMP) is obtained as frequency / total number of haplotypes in the database / 16;

3) calculation of the total 9/10 match probability (THMMP): THMMP is the mean of the sum of the HMMP values at the A, B, C, DRB1 and DQB1 loci and THMP, i.e. $\mathrm{THMMP} = (\mathrm{HMMP}_{A} + \mathrm{HMMP}_{B} + \mathrm{HMMP}_{C} + \mathrm{HMMP}_{DRB1} + \mathrm{HMMP}_{DQB1} + \mathrm{THMP})/2$;

4) judgment of the 9/10 match chance: THMMP is divided into three levels, high, medium and low, with THMMP values of 1% and 0.1% as the boundaries;

5) the THMMP value, the overall level, and the possible mismatched genotypes at each of the A, B, C, DRB1 and DQB1 loci with their probabilities are displayed in the foreground.
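A hedged sketch of the 9/10 calculation follows. It reads the total as half of the sum of the five per-locus HMMP values plus THMP, which is one interpretation of the wording above; the function names, the per-locus dictionary and the handling of values exactly on a boundary are assumptions.

```python
def hmmp(frequency: float, total_haplotypes: float) -> float:
    """Theoretical probability of one mismatch type."""
    return frequency / total_haplotypes / 16

def thmmp(hmmp_by_locus: dict, thmp_value: float) -> float:
    """Total 9/10 match probability from per-locus HMMP values and THMP."""
    return (sum(hmmp_by_locus.values()) + thmp_value) / 2

def grade_9_of_10(thmmp_value: float) -> str:
    """Three-level rating with 1% and 0.1% as the boundaries."""
    if thmmp_value > 0.01:
        return "high"
    if 0.001 > thmmp_value:
        return "low"
    return "medium"
```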

(4) Prediction of C, DQB1 site results when A, B, DRB1 site 6/6 are combined

1) For the 8 virtual haplotypes (4 groups), the retrieved C, DQB1 results are paired within each group. For example, haplotype-1 and haplotype-2 form a group; haplotype-1 retrieves 2 C, DQB1 results, (C, DQB1)-1 and (C, DQB1)-2, and haplotype-2 retrieves 2 C, DQB1 results, (C, DQB1)-3 and (C, DQB1)-4; the group then has 4 possible C, DQB1 results: (C, DQB1)-1 & (C, DQB1)-3, (C, DQB1)-1 & (C, DQB1)-4, (C, DQB1)-2 & (C, DQB1)-3, and (C, DQB1)-2 & (C, DQB1)-4.

2) The possible C and DQB1 results of the 4 groups of virtual haplotypes are summarized and sorted by frequency in descending order; the probability of a given C, DQB1 result is frequency / total number of haplotypes in the database / 2;

3) all possible combinations and probabilities of C, DQB1 are displayed in the foreground.
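The within-group pairing is a Cartesian product of the two result lists, which can be sketched as follows; the function names and the placeholder result labels are hypothetical.

```python
from itertools import product

def pair_results(hap1_results: list, hap2_results: list) -> list:
    """Pair every C, DQB1 result of haplotype-1 with every one of haplotype-2."""
    return list(product(hap1_results, hap2_results))

def result_probability(frequency: float, total_haplotypes: float) -> float:
    """Probability of a single C, DQB1 result."""
    return frequency / total_haplotypes / 2
```

If either haplotype of a group retrieves nothing, the product is empty, which is consistent with the empty C, DQB1 prediction described in Example 3.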

(5) Negative linkage search

Example 1

Example of a 10/10 match chance rated "high+": combination one consists of 2 Common haplotypes, so it is first rated "high" and then "high+" because its HMP value is > 0.4%; combination two consists of 1 Common and 1 Rare haplotype, so it is first rated "medium" and remains "medium" because 0.1% > HMP > 0.05%; combination three consists of 2 Rare haplotypes, so it is first rated "low" and remains "low" because 0.05% > HMP > 0.02%; the combinations of 1 Rare and 1 unmatched haplotype (given as combinations three and four) are first rated "low" and then "low-" because their HMP value is below 0.02%. The two "low-" are promoted to 1 "low", which together with the original 1 "low" is promoted to 1 "low+"; the highest level, "high+", is finally taken as the final 10/10 match rating. The total 9/10 match probability THMMP is > 1%, so the 9/10 rating is "high".

TABLE 12

Example 2

Example of a 10/10 match chance rated "high-": 3 "medium+" are promoted to 1 "high-", so the highest level, "high-", is finally taken as the final 10/10 match rating. The total 9/10 match probability satisfies 1% > THMMP > 0.1%, so the 9/10 rating is "medium".

TABLE 13

Example 3

Example of a 10/10 match chance rated "medium+": the predicted C, DQB1 result is empty because each group of haplotypes contains 1 unmatched haplotype, so the within-group pairing step yields no result; in this case the C, DQB1 results predicted for the 9/10 match can be consulted instead.

TABLE 14

Example 4

Example of a 10/10 match chance rated "low+": 3 "low-" are promoted to 1 "low", which together with the original 2 "low" is promoted to 1 "low+", so the highest level, "low+", is finally taken as the final rating. The total 9/10 match probability THMMP is below 0.1%, so the 9/10 rating is "low".

TABLE 15

Example 5

Example of genotype CWD interpretation results: genotype A*24:10 is WD, so its frequency is displayed; the remaining genotypes are C, so no frequency is displayed.

TABLE 16

Example 6

Example of predicting the C, DQB1 results of a donor matched 6/6 at the A, B and DRB1 loci: two groups of C, DQB1 results were predicted, and the first group was significantly more probable than the second.

TABLE 17

In each of Examples 1 to 4, an unrelated donor search was performed. The unrelated donor search page displays only the first 15 retrieved donors at a time: when more than 15 donors are retrieved, those beyond the first 15 are temporarily not displayed, and when fewer than 15 are retrieved, all are displayed. The results displayed on the search page were therefore compared with the prediction results of the invention to verify the reliability of the prediction. In these cases, the 10/10 and 9/10 match predictions were highly consistent with the patients' actual chances of retrieving unrelated donors, and the mismatch types shown on the web page largely fell within the predicted mismatch types, as detailed in the following table:

TABLE 18

Confirmatory typing was performed for the patient of Example 6 and two donors matched 6/6 at the A, B and DRB1 loci, and the predicted C, DQB1 results were compared with the actual confirmatory typing results, which fully agreed with the high-probability prediction, as shown in the table below:

TABLE 19

Claims

1. A method for predicting HLA match probability and mismatch type, comprising the steps of: (1) constructing an HLA database; (2) entering and submitting genotypes to be compared, wherein the entered genotypes are HLA-A, B, C, DRB1 and DQB1 locus genotypes; (3) converting the format of the alleles entered in step (2) into a format matching that of the HLA database; (4) permuting and combining the format-converted genotypes from step (3) to obtain haplotype combinations; (5) comparing the haplotype combinations obtained in step (4) with the HLA database constructed in step (1); and (6) obtaining a prediction result from the comparison.

2. The method of claim 1, wherein the HLA database comprises an HLA allele CWD database, an HLA haplotype frequency database, and an HLA negative-linked database; wherein the HLA haplotype frequency database comprises 10/10 concordance prediction database, 9/10 concordance prediction database and a C, DQB1 prediction database.

3. The method of claim 2, wherein the HLA allele CWD database comprises genotype, C/WD/R, frequency;

4. The method of claim 1, wherein the number of haplotype combinations in step (4) is $2^{n-1}$ groups, where n is the number of gene loci.

5. The method of claim 2, wherein the alignment of step (5) comprises 10/10 concordance search, 9/10 concordance search, A, B, DRB1 locus 6/6 concordance search, allele CWD interpretation and negative linkage search.

6. A system for predicting HLA match probability and mismatch type according to any one of claims 1 to 5, the system comprising an HLA database, a genotype recording module, a format conversion module, a genotype combination module, and an alignment module;

the HLA database is used for storing reference data;

the format conversion module is used for processing and converting the recorded alleles to unify the recorded alleles into a format matched with an HLA database;

7. A storage medium for predicting HLA match probability and mismatch type, the storage medium performing the method of any one of claims 1-5.

8. A processor for predicting HLA match probability and mismatch type, the processor being configured to execute a program, the program executing the method for predicting HLA match probability and mismatch type according to any one of claims 1-5.

9. An apparatus for predicting HLA match probability and mismatch type according to any one of claims 1-5, wherein the apparatus is used for implementing the method for predicting HLA match probability and mismatch type according to any one of claims.