CN112921078B - Method for determining HLA-I type of subject and anchor sequence group - Google Patents

Publication number
CN112921078B
Authority
CN
China
Legal status
Active
Application number
CN202110172484.8A
Other languages
Chinese (zh)
Other versions
CN112921078A (en)
Inventor
田野
杨梦磊
葛晨涛
Current Assignee
Hangzhou Huben Biomedical Co ltd
Original Assignee
Hangzhou Huben Biomedical Co ltd

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6876 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
    • C12Q1/6881 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for tissue or cell typing, e.g. human leukocyte antigen [HLA] probes
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B30/00 ICT specially adapted for sequence analysis involving nucleotides or amino acids
    • G16B30/10 Sequence alignment; Homology search
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q2600/00 Oligonucleotides characterized by their use
    • C12Q2600/156 Polymorphic or mutational markers

Abstract

The present disclosure relates to a method of determining HLA typing. The method comprises the following steps: S1, screening out anchor sequences; S2, obtaining a DNA sample of the person to be tested and obtaining an amplification product; S3, obtaining a sequencing result sequence; S4, comparing the sequencing result sequence with the anchor sequences; S5, if the sequencing result sequence matches at least two anchor sequences in the same anchor sequence group, judging the sequencing result sequence to be an available sequence; and S6, comparing the available sequences with all alleles in the typing library one by one, and taking the successfully aligned alleles as the typing of the subject at the given HLA locus. The technical solution provides a rapid, accurate and economical HLA typing method.

Description

Method for determining HLA-I type of subject and anchor sequence group
Technical Field
The present disclosure relates to the field of biomedicine, and in particular, to a method for determining the HLA type of a subject.
Background
The HLA (human leukocyte antigen) gene system is the most polymorphic genetic system known in humans. It is located on the short arm of chromosome 6, is 3600 kb in length, and contains 224 gene loci. The HLA class I genes are the three functional genes that were discovered earliest, namely HLA-A, HLA-B and HLA-C, all of which encode the heavy chains (alpha chains) of HLA class I molecules. HLA-A, HLA-B and HLA-C have a large number of variants (alleles) in the human population: as of October 2018, 4340 alleles had been found for the HLA-A gene, 5212 for the HLA-B gene, and 3930 for the HLA-C gene. Apart from monozygotic twins, almost no two individuals in the population have identical HLA, so HLA can be regarded as an individual's "identification card". Across the human population, tens of thousands of HLA subtypes have been identified. HLA typing is important for several reasons: (1) appropriate HLA matching can reduce rejection in organ transplantation; (2) HLA type can serve as a genetic marker for certain diseases; (3) HLA typing can be an important means of paternity testing or forensic identification. At present, high-resolution HLA typing requires a large amount of sequencing data and correspondingly complex data processing, which makes it costly and limits its application.
Disclosure of Invention
It is an object of the present disclosure to provide a method for obtaining high-resolution HLA typing using less sequencing data and simple data processing. In order to achieve the above object, the present disclosure provides a method of determining the HLA-I type of a subject, the method comprising the following steps:
S1, determining a plurality of anchor sequences in an HLA-I typing library, wherein each anchor sequence appears only once in the same allele and is conserved among different alleles;
S2, obtaining a DNA sample of the person to be tested, carrying out PCR amplification to obtain an amplification product, and carrying out Sanger sequencing on the amplification product to obtain a sequencing result sequence;
S3, comparing the sequencing result sequence with the plurality of anchor sequences to obtain a plurality of aligned anchor sequences;
S4, selecting the anchor sequences with conserved sequence positions from the aligned anchor sequences, and pairing them pairwise to form anchor sequence pairs;
S5, selecting, from the sequencing result sequence, the longest sequence lying between an anchor sequence pair to obtain an available sequence;
S6, comparing all the available sequences with all the alleles in the typing library one by one to obtain candidate alleles;
S7, pairing the candidate alleles pairwise, and pairing each candidate allele with itself, to obtain candidate pairs;
S8, comparing the double peaks in the available sequence of the sequencing result with the double peaks in the candidate pairs, a candidate pair that is successfully compared being the HLA-I type of the sequencing result.
The disclosure also provides an anchor sequence group, wherein the anchor sequence group comprises the sequences of SEQ ID NO. 1-188.
According to this technical solution, exon and intron information can be read effectively from the results of one or more rounds of first-generation sequencing, and the workload of sequence comparison is reduced, so that high-resolution HLA typing can be obtained using less sequencing data and simple data processing. The method is therefore a rapid, accurate and economical HLA typing method.
Detailed Description
In one aspect, the present disclosure provides a method of determining the HLA type of a subject, the method comprising the following steps:
S1, determining a plurality of anchor sequences in an HLA-I typing library, wherein each anchor sequence appears only once in the same allele and is conserved among different alleles;
S2, obtaining a DNA sample of the person to be tested, carrying out PCR amplification to obtain an amplification product, and carrying out Sanger sequencing on the amplification product to obtain a sequencing result sequence;
S3, comparing the sequencing result sequence with the plurality of anchor sequences to obtain a plurality of aligned anchor sequences;
S4, selecting the anchor sequences with conserved sequence positions from the aligned anchor sequences, and pairing them pairwise to form anchor sequence pairs;
S5, selecting, from the sequencing result sequence, the longest sequence lying between an anchor sequence pair to obtain an available sequence;
S6, comparing all the available sequences with all the alleles in the typing library one by one to obtain candidate alleles;
S7, pairing the candidate alleles pairwise, and pairing each candidate allele with itself, to obtain candidate pairs;
S8, comparing the double peaks in the available sequence of the sequencing result with the double peaks in the candidate pairs, a candidate pair that is successfully compared being the HLA-I type of the sequencing result.
The method for determining the HLA-I type of a subject of the present disclosure can be used for non-diagnostic purposes; the HLA-I type obtained is intermediate information and is not a disease diagnosis result.
The anchor sequences are mainly used to locate a sequencing sequence and its fragments within an allele, and to avoid sequence comparison errors caused by base insertions and base deletions. The length of an anchor sequence may vary within wide limits and may, for example, be 8-20 bp, preferably 10-15 bp.
In order to select anchor sequences with conserved sequence positions, the DNA sequences of all alleles in the HLA-I typing library are divided into a plurality of extended exons, each consisting of one translated exon and the sequences on both sides of that exon. Optionally, 3-10 anchor sequences may be determined in each extended exon. For example, the anchor sequences include the sequences of SEQ ID NO. 1-188.
Wherein the aligned anchor sequences refer to anchor sequences that are successfully aligned in the sequencing result sequence. Wherein the candidate allele refers to the allele successfully aligned with the available sequence in the typing library.
Wherein, by the selection of amplification primers and the selection of sequencing primers, the sequencing result sequence comprises at least one sequence of the full length or fragment of the extended exon; specifically, the DNA sample may be PCR amplified using HLA-I site-specific amplification primers to obtain an amplification product containing an HLA-I sequence, and then sequencing primers may be used to obtain a sequencing result sequence comprising a sequence of at least one of the extended exons. To achieve higher resolution, more sequencing primers can be used to obtain more sequencing result sequences, which in turn can be based on more sequencing result sequences to obtain more accurate typing.
For example, in step S2, HLA-I includes HLA-A, HLA-B and HLA-C. The amplification primers of HLA-A are SEQ ID NO. 189-190, and the sequencing primer is selected from at least one of SEQ ID NO. 195-197; the anchor sequences of HLA-A are selected from SEQ ID NO. 1-59. The amplification primers of HLA-B are SEQ ID NO. 191-192, and the sequencing primer is selected from at least one of SEQ ID NO. 198-199; the anchor sequences of HLA-B are selected from SEQ ID NO. 60-111. The amplification primers of HLA-C are SEQ ID NO. 193-194, and the sequencing primer is selected from at least one of SEQ ID NO. 200-201; the anchor sequences of HLA-C are selected from SEQ ID NO. 112-188.
The disclosure also provides an anchor sequence group, wherein the anchor sequence group comprises the sequences of SEQ ID NO. 1-188. The anchor sequence set can be used for HLA-I typing.
In step S3, the operation of comparing the sequencing result sequence with the plurality of anchor sequences comprises: searching for each anchor sequence in the sequencing result sequence, the search criterion being that one base mutation is allowed and base insertion or base deletion is not allowed; if an anchor sequence is found exactly once in the entire sequencing result sequence, it is an aligned anchor sequence.
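The search criterion of step S3 amounts to a gap-free match with at most one substitution, accepted only when it is unique in the read. A minimal Python sketch is given below; the function names and the test read are hypothetical, and only the anchor sequence A20 (GCTGACCGCGGGGT) is taken from Example 3.

```python
def find_anchor(read: str, anchor: str, max_mismatches: int = 1) -> list[int]:
    """Return 0-based start positions where anchor matches read without gaps
    and with at most max_mismatches substituted bases."""
    hits = []
    for start in range(len(read) - len(anchor) + 1):
        window = read[start:start + len(anchor)]
        mismatches = sum(a != b for a, b in zip(anchor, window))
        if mismatches <= max_mismatches:
            hits.append(start)
    return hits


def aligned_anchors(read: str, anchors: dict[str, str]) -> dict[str, int]:
    """Keep only anchors found exactly once in the read; map name -> start."""
    result = {}
    for name, seq in anchors.items():
        hits = find_anchor(read, seq)
        if len(hits) == 1:
            result[name] = hits[0]
    return result


# Hypothetical read containing anchor A20 at 0-based position 2.
read = "TTGCTGACCGCGGGGTAA"
found = aligned_anchors(read, {"A20": "GCTGACCGCGGGGT", "X": "AAAAAAAAAA"})
```

Because the comparison is a sliding window of fixed length, an inserted or deleted base inside the anchor shifts the remaining bases and produces several mismatches, so such reads are rejected as the criterion requires.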
In step S4, the operation of selecting the anchor sequences with conserved sequence positions includes: for any anchor sequence, calculating the number of bases between the first base of the anchor sequence and the first base of the exon in the extended exon where the anchor sequence is located, recorded as n0', and taking the n0' that occurs most frequently across all alleles as n0 of the anchor sequence; in the sequencing result sequence, calculating the number of bases between the first base of each aligned anchor sequence and the first base of the sequencing result sequence, recorded as n1; for each aligned anchor sequence, subtracting n0 from n1 to obtain n2; the anchor sequences that share the same n2 are the anchor sequences with conserved sequence positions.
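The position check of step S4 can be sketched as grouping the aligned anchors by n2 = n1 - n0. The Python sketch below is illustrative only: the function name is hypothetical, the values for A20 (n0 = -24, n1 = 3) come from Example 3, and the values for "A21" and "A99" are invented for the illustration.

```python
from collections import defaultdict


def position_conserved_groups(
    n0: dict[str, int], n1: dict[str, int]
) -> dict[int, list[str]]:
    """Group aligned anchors by n2 = n1 - n0.

    Only groups with at least two anchors are returned, since at least two
    anchors sharing the same n2 are needed to form an anchor sequence pair.
    """
    groups = defaultdict(list)
    for name, position in n1.items():
        groups[position - n0[name]].append(name)
    return {n2: sorted(names) for n2, names in groups.items() if len(names) >= 2}


n0_values = {"A20": -24, "A21": 10, "A99": 50}  # A21 and A99 are hypothetical
n1_values = {"A20": 3, "A21": 37, "A99": 80}
groups = position_conserved_groups(n0_values, n1_values)
```

An anchor whose n2 differs from the others (here the hypothetical "A99") indicates an inserted or deleted base somewhere between it and the remaining anchors, so it is left out of the pairs.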
In step S6, the available sequences are compared one by one with all alleles in the typing library and scored using a base weight algorithm; if the score of an allele is higher than a critical score, the allele is determined to be a candidate allele. The critical score is the score that allows 0-3 mismatched bases.
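A minimal Python sketch of the scoring in step S6 follows, assuming the uniform weighting used in Example 3 (the score starts at 0 and drops by one per mismatched base). The function names and toy sequences are hypothetical, and double-peak positions are not treated specially here because the text does not state how the base weight algorithm handles them.

```python
def alignment_score(available: str, allele_segment: str) -> int:
    """Score = 0 minus one point per mismatched base (gap-free comparison)."""
    if len(available) != len(allele_segment):
        raise ValueError("segments must have equal length for base-by-base comparison")
    return -sum(a != b for a, b in zip(available, allele_segment))


def candidate_alleles(
    available: str, segments: dict[str, str], max_mismatches: int = 3
) -> list[str]:
    """Return names of alleles whose score tolerates at most max_mismatches."""
    return sorted(
        name
        for name, segment in segments.items()
        if alignment_score(available, segment) >= -max_mismatches
    )


# Toy data (hypothetical allele names and sequences).
segments = {"x": "ACGTACGT", "y": "ACGTTCGA", "z": "TGCAACGA"}
candidates = candidate_alleles("ACGTACGT", segments)
```

The equal-length requirement reflects the role of the anchor pairs: they guarantee that the available sequence and the allele segment contain no insertions or deletions relative to each other.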
In step S8, each double peak in the available sequence of the sequencing result is recorded as an available doublet; for each available doublet, the number of bases between the available doublet and the first base of the available sequence is calculated and recorded as n3; the first anchor sequence of the available sequence is searched for in the allele, and the base located n3 bases after the first base of that anchor sequence in the allele is taken as the comparable base; for each candidate pair, a comparable base pair (one comparable base from each allele of the pair) is thus obtained for each available doublet; if, for every available doublet, the bases of the comparable base pair of a candidate pair are identical to the bases of that doublet, the candidate pair is the HLA-I type of the sequencing result.
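The double-peak comparison of step S8 can be sketched as a set comparison at each doublet position. In the Python sketch below the function name, the toy alleles and the 0-based indexing (comparable base = anchor start + n3) are assumptions made for illustration; the exact counting convention for n3 is not spelled out in the text.

```python
def pair_matches_doublets(
    allele_a: str,
    allele_b: str,
    anchor_start_a: int,
    anchor_start_b: int,
    doublets: dict[int, set[str]],
) -> bool:
    """Check a candidate pair against every available doublet.

    doublets maps n3 (offset from the first base of the first anchor) to the
    two bases read at that double peak, e.g. {4: {"A", "T"}}.
    """
    for n3, observed in doublets.items():
        comparable = {allele_a[anchor_start_a + n3], allele_b[anchor_start_b + n3]}
        if comparable != observed:
            return False
    return True


# Toy alleles (hypothetical); the anchor "ACG" starts at index 2 in both.
allele_1 = "GGACGTAACC"
allele_2 = "GGACGTTACC"
observed = {4: {"A", "T"}}
heterozygous_ok = pair_matches_doublets(allele_1, allele_2, 2, 2, observed)
self_pair_ok = pair_matches_doublets(allele_2, allele_2, 2, 2, observed)
```

A self-pair or a pair whose two alleles carry the same base at a doublet position yields a one-element set and is rejected, which mirrors the deletion of the pair with T and T against N (A/T) in Example 3.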
The present disclosure is further illustrated by the following examples, but is not to be construed as being limited thereby.
Example 1
This example illustrates the selection of anchor sequences from a typing library and the determination of the relative positions of the anchor sequences. This step needs to be done only once for a given typing library and need not be repeated in each typing job. The HLA typing library may be updated whenever the IMGT-HLA database is updated.
The DNA sequences of HLA-A, HLA-B and HLA-C provided by IMGT-HLA (downloaded from https://www.ebi.ac.uk/ipd/imgt/hla/) were used as the database; specifically, Release 3.42.0, 2020-10 was used (Robinson J, Barker DJ, Georgiou X, Cooper MA, Flicek P, Marsh SGE. IPD-IMGT/HLA Database. Nucleic Acids Research (2020) 48: D948-55).
The analysis was programmed and run in the R language. Invalid sequences, i.e., sequences whose names contain the letter N (null), were first removed. In the full-length sequences of the remaining HLA-I alleles in the typing library, each translated exon was taken as a core and extended with a small number of intron bases before and after it to form an extended exon, so that each HLA-I gene was divided into 7 extended exon fragments, each consisting of one translated exon and the sequences flanking that exon. If an HLA-I gene has 8 exons, the seventh and eighth exons were combined into the same extended exon fragment.
Each allele was segmented into extended exons as follows. Among the 2890 alleles of HLA-A, conserved sequences of 15-20 bp were searched for, those located in introns were selected, and one conserved sequence in each intron was chosen and called a segmentation sequence. For each allele, these segmentation sequences were searched for (allowing no more than 3 base mutations), and the last base of each segmentation sequence was taken as the end of an extended exon. For example, the segmentation sequence GGTGAGTGCGGGGTCG (SEQ ID NO. 209, located in intron 1) was searched for in the 2890 HLA-A alleles; the segmentation sequence was taken as the end of extended exon one, and the base following it as the first base of extended exon two. The segmentation sequences used are listed below.
Segmentation sequences for HLA-A:
  between extended exon one and extended exon two: GGTGAGTGCGGGGTCG (SEQ ID NO. 209)
  between extended exon two and extended exon three: TTTCAGTTTAGGCCAAAA (SEQ ID NO. 210)
  between extended exon three and extended exon four: TGGTTCCCTTTGACAC (SEQ ID NO. 211)
  between extended exon four and extended exon five: CACCCTGAGATGGGGTAA (SEQ ID NO. 212)
  between extended exon five and extended exon six: GAGGAAGAGCTCAGGT (SEQ ID NO. 213)
  between extended exon six and extended exon seven: CTTCTGTGGGATCTGACC (SEQ ID NO. 214)
Segmentation sequences for HLA-B:
  between extended exon one and extended exon two: CCGGTGAGTGCGGG (SEQ ID NO. 215)
  between extended exon two and extended exon three: CGGCCCGGGTCGCC (SEQ ID NO. 216)
  between extended exon three and extended exon four: TGAGGGCCCCCTCT (SEQ ID NO. 217)
  between extended exon four and extended exon five: AGCAGGAGCCCTTC (SEQ ID NO. 218)
  between extended exon five and extended exon six: GGACCCTGTGTGCC (SEQ ID NO. 219)
  between extended exon six and extended exon seven: GTGTGGAGGAGCTC (SEQ ID NO. 220)
Segmentation sequences for HLA-C:
  between extended exon one and extended exon two: CCCGGCGAGGGCGC (SEQ ID NO. 221)
  between extended exon two and extended exon three: CTGAGATCCACCCC (SEQ ID NO. 222)
  between extended exon three and extended exon four: TCAGGCCTTGTTCTC (SEQ ID NO. 223)
  between extended exon four and extended exon five: GAAAGCAGAAG (SEQ ID NO. 224)
  between extended exon five and extended exon six: CCTGGGACCCTGTG (SEQ ID NO. 225)
  between extended exon six and extended exon seven: GGCGTGTGGAGGAGC (SEQ ID NO. 226)
For HLA-A, for each extended exon of the 2890 alleles, the relative position within the extended exon of the first base of the exon contained in it was calculated. Then, in each extended exon, as many conserved sequences as possible (3-10) were screened out that are 10-15 bp long, appear only once in the same allele, and are conserved among different alleles; these were recorded as anchor sequences. All anchor sequences from the same extended exon form one anchor sequence group.
For each anchor sequence, the number of bases between its first base and the first base of the exon in the extended exon in which it is located was calculated and recorded as n0', and the n0' occurring most frequently across all alleles was taken as n0 of that anchor sequence. The convention is as follows: if the anchor sequence starts two or more positions upstream of the first base of the exon, n0 is negative; if it starts one position upstream of the first base of the exon, n0 is 0; if its first base coincides with the first base of the exon, n0 is 1; if it starts downstream of the first base of the exon, n0 is a positive number greater than 1. For example, HLA-A anchor sequence A4, GGGGAGAAGCAA (SEQ ID NO. 4), belongs to extended exon two. In 1 of the 2890 alleles (A*03:01:01:21) it lies upstream of the first base of exon two at a distance of 88 bases (n0' is -87); in 3 alleles (A*03:01:01:06, A*24:02:01:62 and A*31:01:02:21) it lies upstream of the first base of exon two at a distance of 95 bases (n0' is -94); in the remaining 2886 alleles it lies upstream of the first base of exon two at a distance of 96 bases (n0' is -95). The n0 of anchor sequence A4 is therefore -95. If the number of different n0' values occurring across all alleles for a given anchor sequence is less than 10, the anchor sequence is preferably used. If an anchor sequence has more than 10 different n0' values across all alleles, which would make the subsequent steps difficult to perform, the anchor sequence is preferably not used.
If an anchor sequence fails to match on 10 or more alleles, it is likewise preferably not used. Anchor sequences in HLA-B and HLA-C were screened according to the method described above. An exemplary set of selected HLA-A, HLA-B and HLA-C anchor sequences is shown in Table 1.
TABLE 1
Figure BDA0002939215740000051
Figure BDA0002939215740000061
Figure BDA0002939215740000071
Figure BDA0002939215740000081
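The n0 convention of Example 1 can be sketched in Python as follows. The function names and coordinates are hypothetical; the counts for anchor sequence A4 (1 allele with n0' = -87, 3 alleles with n0' = -94, 2886 alleles with n0' = -95) are the ones reported in Example 1.

```python
from collections import Counter


def n0_prime(anchor_start: int, exon_start: int) -> int:
    """Relative position of an anchor with respect to the first exon base.

    Follows the stated convention: same position -> 1, one base upstream -> 0,
    k bases upstream -> 1 - k. Both arguments use the same coordinate system.
    """
    return anchor_start - exon_start + 1


def consensus_n0(values: list[int]) -> tuple[int, int]:
    """Return (most frequent n0', number of distinct n0' values)."""
    counts = Counter(values)
    return counts.most_common(1)[0][0], len(counts)


# n0' distribution reported for anchor sequence A4 across 2890 HLA-A alleles.
a4_values = [-87] * 1 + [-94] * 3 + [-95] * 2886
a4_n0, a4_distinct = consensus_n0(a4_values)
```

The second return value supports the usability rule stated in Example 1: an anchor with fewer than 10 distinct n0' values is preferred.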
Example 2
This example illustrates DNA amplification and sequencing for a subject.
(1) A small amount of fresh blood was drawn from the subject according to the standard blood collection procedure, and PBMCs were separated by Ficoll density gradient centrifugation. DNA was then extracted with a small-scale DNA extraction kit (Biyuntian D0063) according to the instructions. (2) Site-specific primers were used for PCR amplification (the amplification primer sequences are shown in Table 2), and a PCR product was obtained after amplification. (3) The amplified PCR product was subjected to Sanger sequencing using specific sequencing primers (shown in Table 3) (from Hangzhou Ongke biomedical Co., Ltd.). After Sanger sequencing was finished, the sequencing result of the sample was obtained.
TABLE 2 PCR amplification primers
Figure BDA0002939215740000082
Figure BDA0002939215740000091
TABLE 3 sequencing primers
Number    Sequencing primer name  Sequence                 SEQ ID NO.
HLA-A-1   Seqing-AF930            TTTCAGTTTAGGCCAAAAAT     195
HLA-A-2   Seqing-AF1775           GGTGTCCTGTCCATTCTCAAGAT  196
HLA-A-3   Seqing-AR841            ATCTCGGACCCGGAGACTG      197
HLA-B-1   Seqing-BF375            AGGGAAATGGCCTCTG         198
HLA-B-2   Seqing-BR1363           AGGGCGACATTCTAGCGC       199
HLA-C-1   Seqing-CF1986           ACCCCAGGTGTCCTGTCC       200
HLA-C-2   Seqing-CR1598           ATTCTCCATTCAAGGGAGGGC    201
The DNA sample of one subject was selected as sample No. 1, hereinafter referred to as sample-1. The clearly readable portions of the sample-1 sequencing results are shown in Table 4.
TABLE 4 sequencing results
Sequencing primer number  Sequencing result sequence
HLA-A-1                   SEQ ID NO. 202
HLA-A-2                   SEQ ID NO. 203
HLA-A-3                   SEQ ID NO. 204
HLA-B-1                   SEQ ID NO. 205
HLA-B-2                   SEQ ID NO. 206
HLA-C-1                   SEQ ID NO. 207
HLA-C-2                   SEQ ID NO. 208
Example 3
This example illustrates the alignment analysis procedure for the sequencing results of sample-1.
The Sanger sequencing results obtained in Example 2 were aligned with all anchor sequence groups of the given HLA-I gene (i.e., the anchor sequence groups shown in Table 1). The alignment was performed by searching for each anchor sequence in the entire sequencing result sequence, the search criterion being that one base mutation is allowed and base insertion or base deletion is not allowed; if an anchor sequence was found in the entire sequencing result sequence exactly once, the alignment was judged successful. The successfully aligned anchor sequences in the sequencing result sequence were recorded as aligned anchor sequences, and the aligned anchor sequences from the same extended exon were recorded as an aligned anchor sequence group. In the sequencing result sequence, the number of bases between each aligned anchor sequence and the first base of the sequencing result sequence was calculated and recorded as n1.
For example, anchor sequence A20, GCTGACCGCGGGGT, belongs to extended exon three and has an n0 of -24. It aligned at position 3 of the sequencing result sequence (Table 4, HLA-A-1), i.e., n1 is 3, so subtracting n0 from n1 gives an n2 (relative position difference) of 27. If several anchor sequences have the same n2, they can be judged to form anchor sequence pairs with one another; that is, between them the sequencing result sequence and the sequences in the HLA database contain no inserted or deleted bases and can be compared base by base. The longest sequence lying between an anchor sequence pair is the available sequence. The available sequences of sample-1 are shown in Table 5.
TABLE 5 sequencing alignment results
Figure BDA0002939215740000101
Figure BDA0002939215740000111
The above available sequences were aligned one by one with all alleles in the typing library. For the sequencing result of HLA-A-1 in Table 4, anchor sequences A20 to A35 were all successfully aligned, i.e., 16 aligned anchor sequences were obtained. The n2 of all 16 aligned anchor sequences was calculated to be 27, and the 567-base sequence spanning A20 to A35 was the longest, i.e., the available sequence. This available sequence was aligned base by base with the 567-base sequence spanning anchor sequences A20 to A35 in extended exon three of each of the 2890 HLA-A alleles in the database. During the comparison, the alignment score of an allele was reduced by one for each mismatched base (the initial alignment score of each allele is 0). The sequencing result of HLA-A-1 thus gave an alignment score for each of the 2890 HLA-A alleles. The sequencing results of HLA-A-2 and HLA-A-3 were processed in the same way, each giving an alignment score for each of the 2890 HLA-A alleles. For each HLA-A allele, the alignment scores for the sequencing results of HLA-A-1, HLA-A-2 and HLA-A-3 were added to give the alignment score of that allele for sample-1. HLA-A alleles with alignment scores from -3 to 0 (540 in total, as shown in Table 5-1) were selected as successfully aligned alleles and serve as the HLA-A typing candidate alleles of sample-1.
TABLE 5-1
Figure BDA0002939215740000112
Figure BDA0002939215740000121
Figure BDA0002939215740000131
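The summation of per-read alignment scores described in Example 3 can be sketched in Python as follows. The function names, allele names and score values are hypothetical; only the rule (add the scores of all sequencing results per allele, keep alleles scoring from -3 to 0) comes from the text.

```python
def total_scores(per_read_scores: list[dict[str, int]]) -> dict[str, int]:
    """Add up, per allele, the alignment scores from each sequencing result."""
    totals: dict[str, int] = {}
    for scores in per_read_scores:
        for allele, score in scores.items():
            totals[allele] = totals.get(allele, 0) + score
    return totals


def select_candidates(totals: dict[str, int], min_score: int = -3) -> list[str]:
    """Keep alleles whose summed score lies between min_score and 0."""
    return sorted(a for a, s in totals.items() if min_score <= s <= 0)


# Hypothetical scores for three sequencing results (e.g. three primers).
read_1 = {"allele_1": 0, "allele_2": -1, "allele_3": -2}
read_2 = {"allele_1": -1, "allele_2": -1, "allele_3": -4}
read_3 = {"allele_1": 0, "allele_2": -2, "allele_3": 0}
totals = total_scores([read_1, read_2, read_3])
selected = select_candidates(totals)
```

Summing before thresholding means the tolerance of three mismatches applies to the sample as a whole rather than to each sequencing result separately.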
The HLA-A typing candidate alleles were paired pairwise (145530 pairs), and each HLA-A typing candidate allele was also paired with itself (540 self-pairs); combining the pairwise pairs and the self-pairs gave 146070 candidate pairs.
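The pair counts above follow from simple combinatorics: 540 candidates give C(540, 2) = 145530 unordered pairs of distinct alleles, plus 540 self-pairs, for 146070 candidate pairs. A short Python check, using placeholder integers instead of real allele names:

```python
from itertools import combinations_with_replacement
from math import comb


def candidate_pairs(alleles: list) -> list[tuple]:
    """All unordered pairs of candidate alleles, including self-pairs."""
    return list(combinations_with_replacement(alleles, 2))


placeholder_alleles = list(range(540))  # stand-ins for the 540 candidates
pairs = candidate_pairs(placeholder_alleles)
distinct_pairs = comb(540, 2)
self_pairs = sum(1 for a, b in pairs if a == b)
```

Self-pairs are included because a subject may be homozygous at the locus.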
The double peaks in the available sequence of the sequencing result were recorded as available doublets, and for each available doublet the number of bases between it and the first base of the available sequence was calculated and recorded as n3. In each allele, the first anchor sequence of the available sequence was searched for, and the base located n3 bases after the first base of that anchor sequence in the allele was taken as the comparable base. Thus, for each candidate pair, a comparable base pair (one comparable base from each allele of the pair) was obtained for each available doublet. If, for every available doublet, the bases of the comparable base pair of a candidate pair are identical to the bases of that doublet, the candidate pair is the HLA-I type of the sequencing result.
For example, in the sequencing result of HLA-B-1 of sample-1, there are 9 doublets in the available sequence, N (T/A), N (C/G), N (A/C), N (G/C), N (C/A), N (C/G), N (G/T), and N (A/T), which are designated in order as the first through ninth available doublets.
For each available doublet, the number of bases between the doublet and the first base of the available sequence in which it is located was calculated and recorded as n3-1 to n3-9, respectively:
n3-1 (first available doublet): 223
n3-2 (second available doublet): 224
n3-3 (third available doublet): 226
n3-4 (fourth available doublet): 231
n3-5 (fifth available doublet): 232
n3-6 (sixth available doublet): 234
n3-7 (seventh available doublet): 236
n3-8 (eighth available doublet): 246
n3-9 (ninth available doublet): 253
In each of the two alleles of a candidate pairing, the first anchor sequence of the above available sequence (i.e., anchor sequence 9) is searched for, and the base located n3 bases after the first base of the anchor sequence found in the allele is taken as the comparable base. For example, in the pairing B46:01:01:01, B15:01:01:01, base 682 of B46:01:01 is the comparable base corresponding to the first available doublet and is A, while base 682 of B15:01:01 is the comparable base corresponding to the first available doublet and is T. This agrees exactly with the base choice N (A/T) of the first available doublet, so the pairing B46:01:01:01, B15:01:01:01 is consistent with the first available doublet at the comparable base. The pairing B46:01:01:01, B15:01:01:01 is likewise consistent with the sequencing results of the remaining available doublets at the corresponding comparable bases, and is therefore regarded as a correct typing result. For another pairing, however, base 682 of B15:65 is the comparable base corresponding to the first available doublet and is T, and base 682 of B15:01:01 is the comparable base corresponding to the first available doublet and is also T. The pairing B15:65, B15:01:01:01 is therefore inconsistent with the base choice N (A/T) of the first available doublet, and the pairing is deleted.
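The doublet check described above can be sketched in Python as follows. This is only a minimal illustration under stated assumptions, not the implementation used in the present application: the function names `comparable_base`, `pair_matches_doublets` and `filter_pairs`, the toy allele strings, and the convention that n3 is a 0-based offset counted from the first base of the anchor sequence are all hypothetical.

```python
from typing import Dict, FrozenSet, List, Tuple

# A doublet is described by its offset n3 and the two bases observed in the double peak.
Doublet = Tuple[int, FrozenSet[str]]


def comparable_base(allele: str, anchor: str, n3: int) -> str:
    """Return the base located n3 bases after the first base of `anchor` in `allele`."""
    start = allele.find(anchor)
    if start < 0 or start + n3 >= len(allele):
        raise ValueError("anchor not found, or offset falls outside the allele")
    return allele[start + n3]


def pair_matches_doublets(allele_a: str, allele_b: str, anchor: str,
                          doublets: List[Doublet]) -> bool:
    """A pairing is kept only if, at every doublet, its two comparable bases
    are exactly the two bases observed in the double peak."""
    for n3, observed in doublets:
        bases = {comparable_base(allele_a, anchor, n3),
                 comparable_base(allele_b, anchor, n3)}
        if bases != set(observed):
            return False
    return True


def filter_pairs(pairs: List[Tuple[str, str]], library: Dict[str, str],
                 anchor: str, doublets: List[Doublet]) -> List[Tuple[str, str]]:
    """Delete every candidate pairing that disagrees with any doublet."""
    return [(a, b) for a, b in pairs
            if pair_matches_doublets(library[a], library[b], anchor, doublets)]


# Toy data (hypothetical, not real HLA sequences).
toy_library = {
    "allele_1": "TTACGTAAGC",  # base A at the doublet position
    "allele_2": "TTACGTATGC",  # base T at the doublet position
    "allele_3": "TTACGTATGC",  # base T at the doublet position
}
toy_doublets = [(5, frozenset({"A", "T"}))]
kept = filter_pairs([("allele_1", "allele_2"), ("allele_3", "allele_2")],
                    toy_library, "ACGT", toy_doublets)
```

In this toy case the pairing of `allele_3` with `allele_2` is deleted because both alleles carry T at the comparable base, mirroring the deleted pairing in the example above.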
The same procedure was then applied to the sequencing result of HLA-B-2 of sample-1, and pairings were further deleted from the candidate pairings remaining after HLA-B-1. The candidate pairings that finally remained are the HLA-B typing result of sample-1, as shown in Table 6.
TABLE 6
Figure BDA0002939215740000141
Figure BDA0002939215740000151
As can be seen from the data in Table 6, the method of the present application obtains high-resolution HLA typing from one or more rounds (usually only one round) of first-generation (Sanger) sequencing results, using less sequencing data and simple data processing.
The preferred embodiments of the present disclosure have been described in detail above. However, the present disclosure is not limited to the specific details of these embodiments. Various simple modifications may be made to the technical solution of the present disclosure within the scope of its technical concept, and all such simple modifications fall within the protection scope of the present disclosure. It should be noted that the various features described in the above embodiments may be combined in any suitable manner; to avoid unnecessary repetition, the possible combinations are not described separately. In addition, the various embodiments of the present disclosure may themselves be combined in any manner, and such combinations should likewise be regarded as disclosed herein, provided they do not depart from the spirit of the present disclosure.
Sequence listing
<110> 田野 (Tian Ye)
<120> a method for determining HLA-I type of a subject and an anchor sequence set
<130> 16537TY
<160> 226
<170> SIPOSequenceListing 1.0
<210> 1
<211> 13
<212> DNA/RNA
<213> Artificial Sequence
<400> 1
atggccgtca tgg 13
<210> 2
<211> 14
<212> DNA/RNA
<213> Artificial Sequence
<400> 2
gccctgaccc agac 14
<210> 3
<211> 16
<212> DNA/RNA
<213> Artificial Sequence
<400> 3
ggtgagtgcg gggtcg 16
<210> 4
<211> 12
<212> DNA/RNA
<213> Artificial Sequence
<400> 4
ggggagaagc aa 12
<210> 5
<211> 17
<212> DNA/RNA
<213> Artificial Sequence
<400> 5
ggcgggggcg caggacc 17
<210> 6
<211> 18
<212> DNA/RNA
<213> Artificial Sequence
<400> 6
ccgcgccggg aggagggt 18
<210> 7
<211> 18
<212> DNA/RNA
<213> Artificial Sequence
<400> 7
cactccatga ggtatttc 18
<210> 8
<211> 14
<212> DNA/RNA
<213> Artificial Sequence
<400> 8
cgtgtcccgg cccg 14
<210> 9
<211> 18
<212> DNA/RNA
<213> Artificial Sequence
<400> 9
ggggagcccc gcttcatc 18
<210> 10
<211> 18
<212> DNA/RNA
<213> Artificial Sequence
<400> 10
gttcgtgcgg ttcgacag 18
<210> 11
<211> 18
<212> DNA/RNA
<213> Artificial Sequence
<400> 11
ccgcgggcgc cgtggata 18
<210> 12
<211> 18
<212> DNA/RNA
<213> Artificial Sequence
<400> 12
ctactacaac cagagcga 18
<210> 13
<211> 14
<212> DNA/RNA
<213> Artificial Sequence
<400> 13
cggtgagtga cccc 14
<210> 14
<211> 18
<212> DNA/RNA
<213> Artificial Sequence
<400> 14
catcccccac ggacgggc 18
<210> 15
<211> 18
<212> DNA/RNA
<213> Artificial Sequence
<400> 15
cacagtctcc gggtccga 18
<210> 16
<211> 14
<212> DNA/RNA
<213> Artificial Sequence
<400> 16
cccgaagccg cggg 14
<210> 17
<211> 18
<212> DNA/RNA
<213> Artificial Sequence
<400> 17
cgggagaggc ccaggcgc 18
<210> 18
<211> 18
<212> DNA/RNA
<213> Artificial Sequence
<400> 18
tttcagttta ggccaaaa 18
<210> 19
<211> 13
<212> DNA/RNA
<213> Artificial Sequence
<400> 19
ggttggtcgg ggc 13
<210> 20
<211> 14
<212> DNA/RNA
<213> Artificial Sequence
<400> 20
gctgaccgcg gggt 14
<210> 21
<211> 14
<212> DNA/RNA
<213> Artificial Sequence
<400> 21
gggccaggtt ctca 14
<210> 22
<211> 14
<212> DNA/RNA
<213> Artificial Sequence
<400> 22
gctgcgacgt gggg 14
<210> 23
<211> 18
<212> DNA/RNA
<213> Artificial Sequence
<400> 23
gcgcttcctc cgcgggta 18
<210> 24
<211> 18
<212> DNA/RNA
<213> Artificial Sequence
<400> 24
cggcaaggat tacatcgc 18
<210> 25
<211> 18
<212> DNA/RNA
<213> Artificial Sequence
<400> 25
tcttggaccg cggcggac 18
<210> 26
<211> 14
<212> DNA/RNA
<213> Artificial Sequence
<400> 26
gagagcctac ctgg 14
<210> 27
<211> 18
<212> DNA/RNA
<213> Artificial Sequence
<400> 27
atacctggag aacgggaa 18
<210> 28
<211> 18
<212> DNA/RNA
<213> Artificial Sequence
<400> 28
accaggggcc acggggcg 18
<210> 29
<211> 14
<212> DNA/RNA
<213> Artificial Sequence
<400> 29
tctcccgggc tggc 14
<210> 30
<211> 18
<212> DNA/RNA
<213> Artificial Sequence
<400> 30
tcccacaagg aggggaga 18
<210> 31
<211> 18
<212> DNA/RNA
<213> Artificial Sequence
<400> 31
cctgagggag aggaatcc 18
<210> 32
<211> 18
<212> DNA/RNA
<213> Artificial Sequence
<400> 32
agagagtgac tctgaggt 18
<210> 33
<211> 14
<212> DNA/RNA
<213> Artificial Sequence
<400> 33
aattaaggga taaa 14
<210> 34
<211> 18
<212> DNA/RNA
<213> Artificial Sequence
<400> 34
gacgatccct cgaatact 18
<210> 35
<211> 16
<212> DNA/RNA
<213> Artificial Sequence
<400> 35
tggttccctt tgacac 16
<210> 36
<211> 18
<212> DNA/RNA
<213> Artificial Sequence
<400> 36
atcccaggtg cctgtgtc 18
<210> 37
<211> 18
<212> DNA/RNA
<213> Artificial Sequence
<400> 37
tgtcctgtcc attctcaa 18
<210> 38
<211> 18
<212> DNA/RNA
<213> Artificial Sequence
<400> 38
acagatgcaa aatgcctg 18
<210> 39
<211> 18
<212> DNA/RNA
<213> Artificial Sequence
<400> 39
tctctgacca tgaggcca 18
<210> 40
<211> 18
<212> DNA/RNA
<213> Artificial Sequence
<400> 40
taccctgcgg agatcaca 18
<210> 41
<211> 18
<212> DNA/RNA
<213> Artificial Sequence
<400> 41
gaccagaccc aggacacg 18
<210> 42
<211> 18
<212> DNA/RNA
<213> Artificial Sequence
<400> 42
aggggatgga accttcca 18
<210> 43
<211> 18
<212> DNA/RNA
<213> Artificial Sequence
<400> 43
agatacacct gccatgtg 18
<210> 44
<211> 18
<212> DNA/RNA
<213> Artificial Sequence
<400> 44
caccctgaga tggggtaa 18
<210> 45
<211> 18
<212> DNA/RNA
<213> Artificial Sequence
<400> 45
tagcagggtc agggcccc 18
<210> 46
<211> 13
<212> DNA/RNA
<213> Artificial Sequence
<400> 46
tcccctcttt tcc 13
<210> 47
<211> 13
<212> DNA/RNA
<213> Artificial Sequence
<400> 47
tcttcccagc cca 13
<210> 48
<211> 18
<212> DNA/RNA
<213> Artificial Sequence
<400> 48
cgtgggcatc attgctgg 18
<210> 49
<211> 18
<212> DNA/RNA
<213> Artificial Sequence
<400> 49
ctggagctgt ggtcgctg 18
<210> 50
<211> 16
<212> DNA/RNA
<213> Artificial Sequence
<400> 50
gaggaagagc tcaggt 16
<210> 51
<211> 14
<212> DNA/RNA
<213> Artificial Sequence
<400> 51
tcacaggaca tttt 14
<210> 52
<211> 13
<212> DNA/RNA
<213> Artificial Sequence
<400> 52
atagaaaagg agg 13
<210> 53
<211> 15
<212> DNA/RNA
<213> Artificial Sequence
<400> 53
ctgcaagtaa gtatg 15
<210> 54
<211> 15
<212> DNA/RNA
<213> Artificial Sequence
<400> 54
ggatattgtg tttgg 15
<210> 55
<211> 13
<212> DNA/RNA
<213> Artificial Sequence
<400> 55
acaattcctc ctc 13
<210> 56
<211> 18
<212> DNA/RNA
<213> Artificial Sequence
<400> 56
cttctgtggg atctgacc 18
<210> 57
<211> 14
<212> DNA/RNA
<213> Artificial Sequence
<400> 57
ctaccccagg cagt 14
<210> 58
<211> 13
<212> DNA/RNA
<213> Artificial Sequence
<400> 58
cccagggctc tga 13
<210> 59
<211> 14
<212> DNA/RNA
<213> Artificial Sequence
<400> 59
gcttgtaaag gtga 14
<210> 60
<211> 10
<212> DNA/RNA
<213> Artificial Sequence
<400> 60
atgcgggtca 10
<210> 61
<211> 13
<212> DNA/RNA
<213> Artificial Sequence
<400> 61
ggcgccccga acc 13
<210> 62
<211> 12
<212> DNA/RNA
<213> Artificial Sequence
<400> 62
tcctgctgct ct 12
<210> 63
<211> 15
<212> DNA/RNA
<213> Artificial Sequence
<400> 63
tggccctgac cgaga 15
<210> 64
<211> 14
<212> DNA/RNA
<213> Artificial Sequence
<400> 64
ccggtgagtg cggg 14
<210> 65
<211> 16
<212> DNA/RNA
<213> Artificial Sequence
<400> 65
agggaaatgg cctctg 16
<210> 66
<211> 16
<212> DNA/RNA
<213> Artificial Sequence
<400> 66
aggggaccgc aggcgg 16
<210> 67
<211> 13
<212> DNA/RNA
<213> Artificial Sequence
<400> 67
ggagccgcgc cgg 13
<210> 68
<211> 14
<212> DNA/RNA
<213> Artificial Sequence
<400> 68
gggtctcagc ccct 14
<210> 69
<211> 16
<212> DNA/RNA
<213> Artificial Sequence
<400> 69
gctcccactc catgag 16
<210> 70
<211> 14
<212> DNA/RNA
<213> Artificial Sequence
<400> 70
cccggcccgg ccgc 14
<210> 71
<211> 12
<212> DNA/RNA
<213> Artificial Sequence
<400> 71
gtgggctacg tg 12
<210> 72
<211> 15
<212> DNA/RNA
<213> Artificial Sequence
<400> 72
ttcgacagcg acgcc 15
<210> 73
<211> 15
<212> DNA/RNA
<213> Artificial Sequence
<400> 73
accagagcga ggccg 15
<210> 74
<211> 15
<212> DNA/RNA
<213> Artificial Sequence
<400> 74
ggggcgcagg tcacg 15
<210> 75
<211> 14
<212> DNA/RNA
<213> Artificial Sequence
<400> 75
cggcccgggt cgcc 14
<210> 76
<211> 14
<212> DNA/RNA
<213> Artificial Sequence
<400> 76
gccagggtct caca 14
<210> 77
<211> 14
<212> DNA/RNA
<213> Artificial Sequence
<400> 77
cgcctcctcc gcgg 14
<210> 78
<211> 14
<212> DNA/RNA
<213> Artificial Sequence
<400> 78
attacatcgc cctg 14
<210> 79
<211> 14
<212> DNA/RNA
<213> Artificial Sequence
<400> 79
ggaggcggcc cgtg 14
<210> 80
<211> 14
<212> DNA/RNA
<213> Artificial Sequence
<400> 80
gagcctacct ggag 14
<210> 81
<211> 14
<212> DNA/RNA
<213> Artificial Sequence
<400> 81
gagaacggga agga 14
<210> 82
<211> 14
<212> DNA/RNA
<213> Artificial Sequence
<400> 82
ggcagtgggg agcc 14
<210> 83
<211> 14
<212> DNA/RNA
<213> Artificial Sequence
<400> 83
gatggcctcc cacg 14
<210> 84
<211> 14
<212> DNA/RNA
<213> Artificial Sequence
<400> 84
tagaatgtcg ccct 14
<210> 85
<211> 14
<212> DNA/RNA
<213> Artificial Sequence
<400> 85
tgagggcccc ctct 14
<210> 86
<211> 14
<212> DNA/RNA
<213> Artificial Sequence
<400> 86
tcccaggtgc ctgc 14
<210> 87
<211> 14
<212> DNA/RNA
<213> Artificial Sequence
<400> 87
gggttctgtg cccc 14
<210> 88
<211> 14
<212> DNA/RNA
<213> Artificial Sequence
<400> 88
ggtggtccta gggt 14
<210> 89
<211> 14
<212> DNA/RNA
<213> Artificial Sequence
<400> 89
agcgcctgaa tttt 14
<210> 90
<211> 14
<212> DNA/RNA
<213> Artificial Sequence
<400> 90
atcagacccc ccaa 14
<210> 91
<211> 14
<212> DNA/RNA
<213> Artificial Sequence
<400> 91
tgacccacca cccc 14
<210> 92
<211> 14
<212> DNA/RNA
<213> Artificial Sequence
<400> 92
ggtgctgggc cctg 14
<210> 93
<211> 14
<212> DNA/RNA
<213> Artificial Sequence
<400> 93
accaaactca ggac 14
<210> 94
<211> 14
<212> DNA/RNA
<213> Artificial Sequence
<400> 94
cagaagtggg cagc 14
<210> 95
<211> 14
<212> DNA/RNA
<213> Artificial Sequence
<400> 95
accctgagat gggg 14
<210> 96
<211> 14
<212> DNA/RNA
<213> Artificial Sequence
<400> 96
agcaggagcc cttc 14
<210> 97
<211> 14
<212> DNA/RNA
<213> Artificial Sequence
<400> 97
agcagggtca gggc 14
<210> 98
<211> 14
<212> DNA/RNA
<213> Artificial Sequence
<400> 98
tcctttccca gagc 14
<210> 99
<211> 14
<212> DNA/RNA
<213> Artificial Sequence
<400> 99
gcctggctgt ccta 14
<210> 100
<211> 14
<212> DNA/RNA
<213> Artificial Sequence
<400> 100
gatgtgtagg agga 14
<210> 101
<211> 14
<212> DNA/RNA
<213> Artificial Sequence
<400> 101
gccccaggta gaag 14
<210> 102
<211> 14
<212> DNA/RNA
<213> Artificial Sequence
<400> 102
ggaccctgtg tgcc 14
<210> 103
<211> 14
<212> DNA/RNA
<213> Artificial Sequence
<400> 103
atacttctgg aaat 14
<210> 104
<211> 14
<212> DNA/RNA
<213> Artificial Sequence
<400> 104
ttcctctaag atct 14
<210> 105
<211> 14
<212> DNA/RNA
<213> Artificial Sequence
<400> 105
tgcttcctcc cagt 14
<210> 106
<211> 14
<212> DNA/RNA
<213> Artificial Sequence
<400> 106
ttcccacagg tgga 14
<210> 107
<211> 14
<212> DNA/RNA
<213> Artificial Sequence
<400> 107
tcaggctgcg tgta 14
<210> 108
<211> 14
<212> DNA/RNA
<213> Artificial Sequence
<400> 108
gtgtggagga gctc 14
<210> 109
<211> 14
<212> DNA/RNA
<213> Artificial Sequence
<400> 109
accaggtcct gttt 14
<210> 110
<211> 14
<212> DNA/RNA
<213> Artificial Sequence
<400> 110
cagcgacagt gccc 14
<210> 111
<211> 14
<212> DNA/RNA
<213> Artificial Sequence
<400> 111
ctctcacagc ttga 14
<210> 112
<211> 15
<212> DNA/RNA
<213> Artificial Sequence
<400> 112
atgcgggtca tggcg 15
<210> 113
<211> 13
<212> DNA/RNA
<213> Artificial Sequence
<400> 113
ctgctgctct cgg 13
<210> 114
<211> 13
<212> DNA/RNA
<213> Artificial Sequence
<400> 114
cgagacctgg gcc 13
<210> 115
<211> 14
<212> DNA/RNA
<213> Artificial Sequence
<400> 115
cccggcgagg gcgc 14
<210> 116
<211> 12
<212> DNA/RNA
<213> Artificial Sequence
<400> 116
gccgcgcagg ga 12
<210> 117
<211> 12
<212> DNA/RNA
<213> Artificial Sequence
<400> 117
cgggtctcag cc 12
<210> 118
<211> 16
<212> DNA/RNA
<213> Artificial Sequence
<400> 118
cccaggctcc cactcc 16
<210> 119
<211> 14
<212> DNA/RNA
<213> Artificial Sequence
<400> 119
cagtgggcta cgtg 14
<210> 120
<211> 13
<212> DNA/RNA
<213> Artificial Sequence
<400> 120
acgccgcgag tcc 13
<210> 121
<211> 13
<212> DNA/RNA
<213> Artificial Sequence
<400> 121
aggaggggcc gga 13
<210> 122
<211> 13
<212> DNA/RNA
<213> Artificial Sequence
<400> 122
gcgccaggca cag 13
<210> 123
<211> 13
<212> DNA/RNA
<213> Artificial Sequence
<400> 123
ctgcgcggct act 13
<210> 124
<211> 13
<212> DNA/RNA
<213> Artificial Sequence
<400> 124
aaccagagcg agg 13
<210> 125
<211> 13
<212> DNA/RNA
<213> Artificial Sequence
<400> 125
tgagtgaccc cgg 13
<210> 126
<211> 13
<212> DNA/RNA
<213> Artificial Sequence
<400> 126
cggggcgcag gtc 13
<210> 127
<211> 13
<212> DNA/RNA
<213> Artificial Sequence
<400> 127
catcccccac gga 13
<210> 128
<211> 13
<212> DNA/RNA
<213> Artificial Sequence
<400> 128
gcccgggtcg ccc 13
<210> 129
<211> 14
<212> DNA/RNA
<213> Artificial Sequence
<400> 129
ctgagatcca cccc 14
<210> 130
<211> 14
<212> DNA/RNA
<213> Artificial Sequence
<400> 130
ggggctcggg ggac 14
<210> 131
<211> 14
<212> DNA/RNA
<213> Artificial Sequence
<400> 131
ggccagggtc tcac 14
<210> 132
<211> 14
<212> DNA/RNA
<213> Artificial Sequence
<400> 132
acgggcgcct cctc 14
<210> 133
<211> 14
<212> DNA/RNA
<213> Artificial Sequence
<400> 133
gcctacgacg gcaa 14
<210> 134
<211> 14
<212> DNA/RNA
<213> Artificial Sequence
<400> 134
tacatcgccc tgaa 14
<210> 135
<211> 14
<212> DNA/RNA
<213> Artificial Sequence
<400> 135
ctgcgctcct ggac 14
<210> 136
<211> 14
<212> DNA/RNA
<213> Artificial Sequence
<400> 136
ctcagatcac ccag 14
<210> 137
<211> 13
<212> DNA/RNA
<213> Artificial Sequence
<400> 137
ggaggcggcc cgt 13
<210> 138
<211> 15
<212> DNA/RNA
<213> Artificial Sequence
<400> 138
gagcctacct ggagg 15
<210> 139
<211> 14
<212> DNA/RNA
<213> Artificial Sequence
<400> 139
gtgcgtggag tggc 14
<210> 140
<211> 13
<212> DNA/RNA
<213> Artificial Sequence
<400> 140
agacgctgca gcg 13
<210> 141
<211> 13
<212> DNA/RNA
<213> Artificial Sequence
<400> 141
gggagccttc ccc 13
<210> 142
<211> 13
<212> DNA/RNA
<213> Artificial Sequence
<400> 142
aaatgggatc agc 13
<210> 143
<211> 14
<212> DNA/RNA
<213> Artificial Sequence
<400> 143
cctcccttga atgg 14
<210> 144
<211> 13
<212> DNA/RNA
<213> Artificial Sequence
<400> 144
ctgagtttcc tct 13
<210> 145
<211> 13
<212> DNA/RNA
<213> Artificial Sequence
<400> 145
aagggatgaa gtc 13
<210> 146
<211> 13
<212> DNA/RNA
<213> Artificial Sequence
<400> 146
tggaggggaa gac 13
<210> 147
<211> 15
<212> DNA/RNA
<213> Artificial Sequence
<400> 147
tgcagcagct gtggt 15
<210> 148
<211> 15
<212> DNA/RNA
<213> Artificial Sequence
<400> 148
tcaggccttg ttctc 15
<210> 149
<211> 15
<212> DNA/RNA
<213> Artificial Sequence
<400> 149
tgattccagc ttttc 15
<210> 150
<211> 14
<212> DNA/RNA
<213> Artificial Sequence
<400> 150
gcctccactc aggt 14
<210> 151
<211> 15
<212> DNA/RNA
<213> Artificial Sequence
<400> 151
tcctccctca gagac 15
<210> 152
<211> 15
<212> DNA/RNA
<213> Artificial Sequence
<400> 152
caggctggcg tctgg 15
<210> 153
<211> 14
<212> DNA/RNA
<213> Artificial Sequence
<400> 153
caggtgtcct gtcc 14
<210> 154
<211> 15
<212> DNA/RNA
<213> Artificial Sequence
<400> 154
ggatggtcac atggg 15
<210> 155
<211> 15
<212> DNA/RNA
<213> Artificial Sequence
<400> 155
aattttctga ctctt 15
<210> 156
<211> 14
<212> DNA/RNA
<213> Artificial Sequence
<400> 156
agacacacgt gacc 14
<210> 157
<211> 14
<212> DNA/RNA
<213> Artificial Sequence
<400> 157
atgaggccac cctg 14
<210> 158
<211> 14
<212> DNA/RNA
<213> Artificial Sequence
<400> 158
ggccctgggc ttct 14
<210> 159
<211> 14
<212> DNA/RNA
<213> Artificial Sequence
<400> 159
cactgacctg gcag 14
<210> 160
<211> 15
<212> DNA/RNA
<213> Artificial Sequence
<400> 160
agcttgtgga gacca 15
<210> 161
<211> 14
<212> DNA/RNA
<213> Artificial Sequence
<400> 161
aagtgggcag ctgt 14
<210> 162
<211> 13
<212> DNA/RNA
<213> Artificial Sequence
<400> 162
ggtgccttct gga 13
<210> 163
<211> 14
<212> DNA/RNA
<213> Artificial Sequence
<400> 163
agagatacac gtgc 14
<210> 164
<211> 13
<212> DNA/RNA
<213> Artificial Sequence
<400> 164
agcacgaggg gct 13
<210> 165
<211> 11
<212> DNA/RNA
<213> Artificial Sequence
<400> 165
gtaaggaggg g 11
<210> 166
<211> 11
<212> DNA/RNA
<213> Artificial Sequence
<400> 166
gaaagcagaa g 11
<210> 167
<211> 14
<212> DNA/RNA
<213> Artificial Sequence
<400> 167
tcagggcccc tcac 14
<210> 168
<211> 12
<212> DNA/RNA
<213> Artificial Sequence
<400> 168
cagcccacca tc 12
<210> 169
<211> 14
<212> DNA/RNA
<213> Artificial Sequence
<400> 169
gggcatcgtt gctg 14
<210> 170
<211> 14
<212> DNA/RNA
<213> Artificial Sequence
<400> 170
taggaggaag agct 14
<210> 171
<211> 14
<212> DNA/RNA
<213> Artificial Sequence
<400> 171
ggtctgggtt ttct 14
<210> 172
<211> 13
<212> DNA/RNA
<213> Artificial Sequence
<400> 172
tcaagcccca ggt 13
<210> 173
<211> 13
<212> DNA/RNA
<213> Artificial Sequence
<400> 173
gcaccatcca cac 13
<210> 174
<211> 14
<212> DNA/RNA
<213> Artificial Sequence
<400> 174
cctgggaccc tgtg 14
<210> 175
<211> 14
<212> DNA/RNA
<213> Artificial Sequence
<400> 175
ggggaaggtc cctg 14
<210> 176
<211> 14
<212> DNA/RNA
<213> Artificial Sequence
<400> 176
taggagggca gttg 14
<210> 177
<211> 14
<212> DNA/RNA
<213> Artificial Sequence
<400> 177
gatcctgccc tggg 14
<210> 178
<211> 14
<212> DNA/RNA
<213> Artificial Sequence
<400> 178
ttctggaaac ttct 14
<210> 179
<211> 15
<212> DNA/RNA
<213> Artificial Sequence
<400> 179
gaggttcccc taaga 15
<210> 180
<211> 15
<212> DNA/RNA
<213> Artificial Sequence
<400> 180
agggcatttt cttcc 15
<210> 181
<211> 15
<212> DNA/RNA
<213> Artificial Sequence
<400> 181
gtggaaaagg aggga 15
<210> 182
<211> 14
<212> DNA/RNA
<213> Artificial Sequence
<400> 182
ggctgcgtgt aagt 14
<210> 183
<211> 15
<212> DNA/RNA
<213> Artificial Sequence
<400> 183
ggcgtgtgga ggagc 15
<210> 184
<211> 14
<212> DNA/RNA
<213> Artificial Sequence
<400> 184
ccataattcc tctt 14
<210> 185
<211> 14
<212> DNA/RNA
<213> Artificial Sequence
<400> 185
tcctgcgggc tctg 14
<210> 186
<211> 14
<212> DNA/RNA
<213> Artificial Sequence
<400> 186
tttgttctac ccca 14
<210> 187
<211> 15
<212> DNA/RNA
<213> Artificial Sequence
<400> 187
agcaacagtg cccag 15
<210> 188
<211> 14
<212> DNA/RNA
<213> Artificial Sequence
<400> 188
atgagtctct catc 14
<210> 189
<211> 23
<212> DNA/RNA
<213> Artificial Sequence
<400> 189
ccccaactcc gcagtttctt ttc 23
<210> 190
<211> 22
<212> DNA/RNA
<213> Artificial Sequence
<400> 190
acaaagggaa gggcaggaac aa 22
<210> 191
<211> 24
<212> DNA/RNA
<213> Artificial Sequence
<400> 191
gggtccttct tccaggatac tcgt 24
<210> 192
<211> 24
<212> DNA/RNA
<213> Artificial Sequence
<400> 192
cccactctag accccaagaa tctc 24
<210> 193
<211> 21
<212> DNA/RNA
<213> Artificial Sequence
<400> 193
actcatgacg cgtccccaat t 21
<210> 194
<211> 20
<212> DNA/RNA
<213> Artificial Sequence
<400> 194
tcacggtgga cacgggggtg 20
<210> 195
<211> 20
<212> DNA/RNA
<213> Artificial Sequence
<400> 195
tttcagttta ggccaaaaat 20
<210> 196
<211> 23
<212> DNA/RNA
<213> Artificial Sequence
<400> 196
ggtgtcctgt ccattctcaa gat 23
<210> 197
<211> 19
<212> DNA/RNA
<213> Artificial Sequence
<400> 197
atctcggacc cggagactg 19
<210> 198
<211> 16
<212> DNA/RNA
<213> Artificial Sequence
<400> 198
agggaaatgg cctctg 16
<210> 199
<211> 18
<212> DNA/RNA
<213> Artificial Sequence
<400> 199
agggcgacat tctagcgc 18
<210> 200
<211> 18
<212> DNA/RNA
<213> Artificial Sequence
<400> 200
accccaggtg tcctgtcc 18
<210> 201
<211> 21
<212> DNA/RNA
<213> Artificial Sequence
<400> 201
attctccatt caagggaggg c 21
<210> 202
<211> 771
<212> DNA/RNA
<213> Artificial Sequence
<220>
<221> misc_feature
<222> (50)..(50)
<223> n is g or a
<400> 202
gggctgaccg cggggtccgg gccaggttct cacaccgtcc agaggatgtn tggctgcgac 60
gtggggtcgg actggcgctt cctccgcggg taccaccagt acgcctacga cggcaaggat 120
tacatcgccc tgaaagagga cctgcgctct tggaccgcgg cggacatggc agctcagacc 180
accaagcaca agtgggaggc ggcccatgtg gcggagcagt tgagagccta cctggagggc 240
acgtgcgtgg agtggctccg cagatacctg gagaacggga aggagacgct gcagcgcacg 300
ggtaccaggg gccacggggc gcctccctga tcgcctgtag atctcccggg ctggcctccc 360
acaaggaggg gagacaattg ggaccaacac tagaatatcg ccctccctct ggtcctgagg 420
gagaggaatc ctcctgggtt tccagatcct gtaccagaga gtgactctga ggttccgccc 480
tgctctctga cacaattaag ggataaaatc tctgaaggaa tgacgggaag acgatccctc 540
gaatactgat gagtggttcc ctttgacaca cacaggcagc agccttgggc ccgtgacttt 600
tcctctcagg ccttgttctc tgcttcacac tcaatgtgtg tgggggtctg agtccagcac 660
ttctgagtcc ttcagcctcc actcaggtca ggaccagaag tcgctgttcc ctcttcaggg 720
actagaattt tccacggaat aggagattat cccaggtgcc tgtgtccagg c 771
<210> 203
<211> 751
<212> DNA/RNA
<213> Artificial Sequence
<400> 203
ctgactcttc ctgacagacg cccccaaaac gcatatgact caccacgctg tctctgacca 60
tgaagccacc ctgaggtgct gggccctgag cttctaccct gcggagatca cactgacctg 120
gcagcgggat ggggaggacc agacccagga cacggagctc gtggagacca ggcctgcagg 180
ggatggaacc ttccagaagt gggcggctgt ggtggtgcct tctggacagg agcagagata 240
cacctgccat gtgcagcatg agggtttgcc caagcccctc accctgagat ggggtaagga 300
gggagacggg ggtgtcatgt cttttaggga aagcaggagc ctctctgacc tttagcaggg 360
tcagggcccc tcaccttccc ctcttttccc agagccgtct tcccagccca ccatccccat 420
cgtgggcatc attgctggcc tggttctctt tggagctgtg atcactggag ctgtggtcgc 480
tgctgtgatg tggaggagga agagctcagg tggggaaggg gtgaagggtg ggtctgagat 540
ttcttgtctc actgagggtt ccaagaccca ggtagaagtg tgccctgcct cgttactggg 600
aagcaccacc cacaattatg ggcctaccca gcctgggccc tgtgtgccag cacttactct 660
tttgtaaagc acctgttaaa atgaaggaca gatttatcac cttgattaca gcggtgatgg 720
gacctgatcc cagcagtcac aagtcacagg g 751
<210> 204
<211> 641
<212> DNA/RNA
<213> Artificial Sequence
<400> 204
tcaccggcct cgctctggtt gtagtagccg cgcagggtcc ccaggtccac tcggtgagtc 60
tgtgagtggg ccttcacttt ccgtgtctcc ccgtcccaat actccggacc ctcctgctct 120
atccacggcg cccgcggctc catcctctgg ctcgcggcgt cgctgtcgaa ccgcacgaac 180
tgcgtgtcgt ccacgtagcc cactgcgatg aagcggggct ccccgcggcc gggccgggac 240
acggatgtga agaaatacct catggagtga gagcctgggg acgaggagtg gctgagaccc 300
gcccgaccct cctcccggcg cggcttcccg ggtcctgcgc ccccgccagg cgggcccgtt 360
gcttctcccc acagaggccg tttccctccc gaccccgcac tcacccgccc aggtctgggt 420
cagggccaga gcccccgaga gtagcaggac gagggttcgg ggcgccatga cggccatcct 480
cggcgtctgg ggagaatctg agtcccggtg ggtgcgtgcg gactttagaa ccgcgaccgc 540
gacgacactg attggcttct ctggaaaccc gacacccaat gggagtgaga actgggtccg 600
cgtcgtgagt atccaggaag aaggacccta cataggttgg g 641
<210> 205
<211> 751
<212> DNA/RNA
<213> Artificial Sequence
<220>
<221> misc_feature
<222> (228)..(228)
<223> n is a or t
<220>
<221> misc_feature
<222> (229)..(229)
<223> n is c or g
<220>
<221> misc_feature
<222> (231)..(231)
<223> n is a or c
<220>
<221> misc_feature
<222> (236)..(236)
<223> n is a or c
<220>
<221> misc_feature
<222> (237)..(237)
<223> n is c or g
<220>
<221> misc_feature
<222> (239)..(239)
<223> n is a or c
<220>
<221> misc_feature
<222> (241)..(241)
<223> n is c or g
<220>
<221> misc_feature
<222> (242)..(242)
<223> n is a or g
<220>
<221> misc_feature
<222> (251)..(251)
<223> n is g or t
<220>
<221> misc_feature
<222> (258)..(258)
<223> n is a or t
<400> 205
tcgggcgggt ctcagcccct cctcgccccc aggctcccac tccatgaggt atttctacac 60
cgccatgtcc cggcccggcc gcggggagcc ccgcttcatc gcagtgggct acgtggacga 120
cacccagttc gtgaggttcg acagcgacgc cgcgagtccg aggatggcgc cccgggcgcc 180
atggatagag caggaggggc cggagtattg ggaccgggag acacagannt ncaagnncna 240
nncacagact naccgagnga gcctgcggaa cctgcgcggc tactacaacc agagcgaggc 300
cggtgagtga ccccggcctg gggcgcaggt cacgactccc catcccccac gtacggcccg 360
ggtcgccccg agtctccggg tccgagatcc gcccccctga ggccgcggga cccgcccaaa 420
ccctcgaccg gcgagagccc caggcgcgtt tacccggttt cattttcagt tgaggccaaa 480
atccccgcgg gttggtcggg gcggggcggg gctcggggga cggggctgac cgcggggcct 540
gggccagggt ctcacaccct ccagaggatg tacggctgcg acgtggggcc ggacgggcgc 600
ctcctccgcg ggcatgacca gtccgcctac gacggcaagg attacatcgc cctgaacgag 660
gacctgagct cctggaccgc ggcggacacg gcggctcaga tcacccagcg caagtgggag 720
gcggcccgtg aggcggagca gtggagagcc t 751
<210> 206
<211> 631
<212> DNA/RNA
<213> Artificial Sequence
<400> 206
tcgggcgggt ctcagcccct cctcgccccc aggctcccac tccatgaggt atttctacac 60
actgcccctg gtacccgcgc gctgcagcgt ctccttcccg ttctccaggt atctgcggag 120
ccactccacg cacaggccct ccaggtaggc tctccactgc tccgcctcac gggccgcctc 180
ccacttgcgc tgggtgatct gagccgccgt gtccgccgcg gtccaggagc tcaggtcctc 240
gttcagggcg atgtaatcct tgccgtcgta ggcggactgg tcatgcccgc ggaggaggcg 300
cccgtccggc cccacgtcgc agccgtacat cctctggagg gtgtgagacc ctggcccagg 360
ccccgcggtc agccccgtcc cccgagcccc gccccgcccc gaccaacccg cggggatttt 420
ggcctcaact gaaaatgaaa ccgggtaaac gcgcctgggg ctctcgccgg tcgagggttt 480
gggcgggtcc cgcggcctca ggggggcgga tctcggaccc ggagactcgg ggcgacccgg 540
gccgtacgtg ggggatgggg agtcgtgacc tgcgccccag gccggggtca ctcaccggcc 600
tcgctctggt tgtagtagcc gcgcaggttc c 631
<210> 207
<211> 721
<212> DNA/RNA
<213> Artificial Sequence
<220>
<221> misc_feature
<222> (126)..(126)
<223> n is c or t
<220>
<221> misc_feature
<222> (184)..(184)
<223> n is a or g
<220>
<221> misc_feature
<222> (332)..(332)
<223> n is a or g
<400> 207
acacacgtga cccaccatcc cgtctctgac catgaggcca ccctgaggtg ctgggccctg 60
ggcttctacc ctgcggagat cacactgacc tggcagtggg atggggagga ccaaactcag 120
gacacngagc ttgtggagac caggccagca ggagatggaa ccttccagaa gtgggcagct 180
gtgntggtgc cttctggaga agagcagaga tacacgtgcc atgtgcagca cgaggggctg 240
ccggagcccc tcaccctgag atggggtaag gagggggatg aggggtgatg tgtcttctca 300
gggaaagcag aagtcctgga gcccttcagc cnggtcaggg ctgaggcttg gaggtcaggg 360
cccctcacct tcccctcctt tcccagagcc gtcttcccag cccaccatcc ccatcgtggg 420
catcgttgct ggcctggctg tcctggctgt cctagctgtc ctaggagctg tggtggctgt 480
tgtgatgtgt aggaggaaga gctcaggtag ggaaggggtg aggagtgggg tctgggtttt 540
cttgttccac tgggagtttc aagccccagg tagaagtgtg ccccacctcg ttactggaag 600
caccatccac acatgggcca tcccagcctg ggaccctgtg tgccagcact tactctgttg 660
tgaagcacat gacaatgaag gacagatgta tcaccttgat gattatggtg ttggggtcct 720
t 721
<210> 208
<211> 291
<212> DNA/RNA
<213> Artificial Sequence
<220>
<221> misc_feature
<222> (163)..(163)
<223> n is c or g
<220>
<221> misc_feature
<222> (190)..(190)
<223> n is a or g
<220>
<221> misc_feature
<222> (281)..(281)
<223> n is t or c
<220>
<221> misc_feature
<222> (288)..(288)
<223> n is a or t
<400> 208
tggggaaggc tccccactgc ccctggtacc cgcgcgctgc agcgtctcct tcccgttctc 60
caggtatctg cggagccact ccacgcacgt gccctccagg taggctctcc gctgctccgc 120
ctcacgggcc gcctcccact tgcgctgggt gatctgagcc gcngtgtccg cggcggtcca 180
ggagcgcagn tcctcgttca gggcgatgta atccttgccg tcgtaggcgt actggtcata 240
cccgcggagg aggcgcccgt cgggccccac gtcgcagcca nacatccnct g 291
<210> 209
<211> 16
<212> DNA/RNA
<213> Artificial Sequence
<400> 209
ggtgagtgcg gggtcg 16
<210> 210
<211> 18
<212> DNA/RNA
<213> Artificial Sequence
<400> 210
tttcagttta ggccaaaa 18
<210> 211
<211> 16
<212> DNA/RNA
<213> Artificial Sequence
<400> 211
tggttccctt tgacac 16
<210> 212
<211> 18
<212> DNA/RNA
<213> Artificial Sequence
<400> 212
caccctgaga tggggtaa 18
<210> 213
<211> 16
<212> DNA/RNA
<213> Artificial Sequence
<400> 213
gaggaagagc tcaggt 16
<210> 214
<211> 18
<212> DNA/RNA
<213> Artificial Sequence
<400> 214
cttctgtggg atctgacc 18
<210> 215
<211> 14
<212> DNA/RNA
<213> Artificial Sequence
<400> 215
ccggtgagtg cggg 14
<210> 216
<211> 14
<212> DNA/RNA
<213> Artificial Sequence
<400> 216
cggcccgggt cgcc 14
<210> 217
<211> 14
<212> DNA/RNA
<213> Artificial Sequence
<400> 217
tgagggcccc ctct 14
<210> 218
<211> 14
<212> DNA/RNA
<213> Artificial Sequence
<400> 218
agcaggagcc cttc 14
<210> 219
<211> 14
<212> DNA/RNA
<213> Artificial Sequence
<400> 219
ggaccctgtg tgcc 14
<210> 220
<211> 14
<212> DNA/RNA
<213> Artificial Sequence
<400> 220
gtgtggagga gctc 14
<210> 221
<211> 14
<212> DNA/RNA
<213> Artificial Sequence
<400> 221
cccggcgagg gcgc 14
<210> 222
<211> 14
<212> DNA/RNA
<213> Artificial Sequence
<400> 222
ctgagatcca cccc 14
<210> 223
<211> 15
<212> DNA/RNA
<213> Artificial Sequence
<400> 223
tcaggccttg ttctc 15
<210> 224
<211> 11
<212> DNA/RNA
<213> Artificial Sequence
<400> 224
gaaagcagaa g 11
<210> 225
<211> 14
<212> DNA/RNA
<213> Artificial Sequence
<400> 225
cctgggaccc tgtg 14
<210> 226
<211> 15
<212> DNA/RNA
<213> Artificial Sequence
<400> 226
ggcgtgtgga ggagc 15

Claims (9)

1. A method of determining HLA-I typing in a subject, the method comprising the steps of:
S1, determining a plurality of anchor sequences in an HLA-I typing library, wherein each anchor sequence appears only once within the same allele and is conserved among different alleles;
S2, obtaining a DNA sample of a subject to be tested, performing PCR amplification to obtain an amplification product, and performing Sanger sequencing on the amplification product to obtain a sequencing result sequence;
S3, aligning the sequencing result sequence with the plurality of anchor sequences to obtain a plurality of aligned anchor sequences;
S4, selecting anchor sequences with conserved sequence positions from the aligned anchor sequences, and pairing the anchor sequences with conserved sequence positions pairwise to form anchor sequence pairs;
S5, selecting, from the sequencing result sequence, the longest sequence among those defined by the anchor sequence pairs to obtain an available sequence;
S6, aligning all the available sequences with all the alleles in the typing library one by one to obtain candidate alleles;
S7, pairing the candidate alleles with one another pairwise, and pairing each candidate allele with itself, to obtain candidate pairs;
S8, comparing the double peaks in the available sequence of the sequencing result with the candidate pairs, wherein a candidate pair that is successfully matched is the HLA-I typing of the sequencing result;
wherein, in step S4, the operation of selecting the anchor sequences with conserved sequence positions comprises:
for any anchor sequence, calculating the number of bases between the first base of the anchor sequence and the first base of the exon in the extended exon in which the anchor sequence is located, denoting this number n0', and taking the n0' with the highest frequency of occurrence among all alleles as n0 of the anchor sequence;
in the sequencing result sequence, calculating the number of bases between the first base of each aligned anchor sequence and the first base of the sequencing result sequence, and denoting this number n1;
for each aligned anchor sequence, subtracting n0 from n1 and denoting the result n2, wherein the anchor sequences having the same n2 are the anchor sequences with conserved sequence positions.
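The position check of step S4 can be illustrated with the following Python sketch. It is an illustration under assumptions rather than the claimed implementation: the function names `consensus_n0` and `position_conserved_anchors` and the anchor labels are hypothetical, offsets are treated as signed integers, and where several n2 values occur the sketch assumes that the most frequent n2 value identifies the position-conserved group.

```python
from collections import Counter
from typing import Dict, List


def consensus_n0(offsets_per_allele: List[int]) -> int:
    """n0 is the offset (n0') that occurs most often across all alleles."""
    return Counter(offsets_per_allele).most_common(1)[0][0]


def position_conserved_anchors(n0: Dict[str, int], n1: Dict[str, int]) -> List[str]:
    """Compute n2 = n1 - n0 for each aligned anchor and keep the anchors
    that share the same n2 (assumed here: the most frequent n2 value)."""
    n2 = {name: n1[name] - n0[name] for name in n1}
    if not n2:
        return []
    shared_n2, _ = Counter(n2.values()).most_common(1)[0]
    return sorted(name for name, value in n2.items() if value == shared_n2)


# Hypothetical offsets: anchors a1 and a2 agree on n2 = 25, a3 does not.
library_n0 = {"a1": consensus_n0([10, 10, 12]), "a2": 50, "a3": 90}
read_n1 = {"a1": 35, "a2": 75, "a3": 120}
conserved = position_conserved_anchors(library_n0, read_n1)
```

An anchor whose n2 differs from the others, such as `a3` here, is one whose position in the read is shifted relative to the library, for example by an insertion or deletion upstream of it.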
2. The method of claim 1, wherein,
the length of the anchor sequence is 10-15 bp;
dividing DNA sequences of all alleles in an HLA-I typing library into a plurality of extended exons, wherein each extended exon consists of a translated exon and sequences on two sides of the exon;
3-10 of said anchor sequences are identified in each of said extended exons.
3. The method of claim 1, wherein,
the aligned anchor sequence refers to an anchor sequence which is successfully aligned in the sequencing result sequence;
the candidate allele refers to an allele in the typing library which is successfully aligned with the available sequence;
the anchor sequences comprise the sequences of SEQ ID NO. 1-188.
4. The method of claim 2, wherein said sequencing result sequence comprises at least one sequence of a full length or fragment of said extended exon.
5. The method of claim 4, wherein, in step S2, the HLA-I comprises HLA-A, HLA-B and HLA-C;
the amplification primers of HLA-A are SEQ ID NO. 189-190, and the sequencing primer is selected from at least one of SEQ ID NO. 195-197;
the anchor sequences of HLA-A are selected from SEQ ID NO. 1-59;
the amplification primers of HLA-B are SEQ ID NO. 191-192, and the sequencing primer is selected from at least one of SEQ ID NO. 198-199;
the anchor sequences of HLA-B are selected from SEQ ID NO. 60-111;
the amplification primers of HLA-C are SEQ ID NO. 193-194, and the sequencing primer is selected from at least one of SEQ ID NO. 200-201;
the anchor sequences of HLA-C are selected from SEQ ID NO. 112-188.
6. The method of any one of claims 1 to 3, wherein the operation of aligning the sequencing result sequence against the plurality of anchor sequences in step S3 comprises: searching for each anchor sequence in the sequencing result sequence, wherein the search criterion is that one base substitution is allowed and base insertions or deletions are not allowed; if an anchor sequence is found exactly once in the whole sequencing result sequence, that anchor sequence is an aligned anchor sequence.
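The search criterion of claim 6 can be sketched as a sliding-window comparison. This is an illustrative sketch only: the function names and the example sequences are hypothetical and are not taken from SEQ ID NO. 1-188.

```python
def find_anchor_hits(read, anchor, max_mismatches=1):
    """Return every start position where the anchor matches the read with at
    most max_mismatches substitutions (no insertions or deletions)."""
    k = len(anchor)
    hits = []
    for i in range(len(read) - k + 1):
        mismatches = sum(1 for a, b in zip(read[i:i + k], anchor) if a != b)
        if mismatches <= max_mismatches:
            hits.append(i)
    return hits


def align_anchors(read, anchors):
    """Keep only anchors found exactly once in the read.

    anchors: dict anchor id -> anchor sequence.
    Returns dict anchor id -> start position in the read.
    """
    aligned = {}
    for name, seq in anchors.items():
        hits = find_anchor_hits(read, seq)
        if len(hits) == 1:
            aligned[name] = hits[0]
    return aligned


# Hypothetical 10-bp anchors: a1 matches exactly, a2 with one substitution,
# a3 is absent from the read.
read = "GGGGACGTTGCAATCCCCTTAGGCATCGAAAA"
anchors = {"a1": "ACGTTGCAAT", "a2": "TTAGGCTTCG", "a3": "CCCCCCCCCC"}
aligned = align_anchors(read, anchors)
```

The start positions returned here correspond to the n1 values used in step S4.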
7. The method according to claim 1, wherein in step S6, the available sequence is scored against all alleles in the typing library one by one using a base weight algorithm, and if the score of an allele is higher than a threshold score, the allele is determined to be a candidate allele; the threshold score is the score that allows 0-3 base mismatches.
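The claim does not spell out the base weight algorithm, so the following Python sketch substitutes a simple stand-in: uniform per-base weights, IUPAC ambiguity codes for double peaks, and a threshold equal to the sequence length minus the allowed number of mismatches. All of these choices, the function names, and the treatment of "higher than the threshold" as "at least the threshold" are assumptions.

```python
# IUPAC codes for single bases and two-base ambiguities (plus N).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "N": "ACGT",
}


def score(available, allele_segment, weights=None):
    """Sum the weights of positions where the allele base is compatible
    with the observed (possibly ambiguous) base."""
    weights = weights or [1.0] * len(available)
    total = 0.0
    for observed, reference, weight in zip(available, allele_segment, weights):
        if reference in IUPAC.get(observed, ""):
            total += weight
    return total


def candidate_alleles(available, segments, max_mismatches=3):
    """Return alleles whose score tolerates at most max_mismatches.

    segments: dict allele name -> allele segment already cut to the same
    span as the available sequence (assumed to be done upstream).
    """
    threshold = len(available) - max_mismatches  # uniform weights assumed
    return [
        name for name, segment in segments.items()
        if len(segment) == len(available) and score(available, segment) >= threshold
    ]


# Hypothetical data: R at index 3 stands for an A/G double peak.
available = "ACGRTTAC"
segments = {"x": "ACGATTAC", "y": "ACGGTTAC", "z": "TTTTTTAC", "w": "ACGCTTAC"}
```

With non-uniform weights the threshold would have to be derived from the weights themselves rather than from the sequence length.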
8. The method of claim 1, wherein in step S8, each double peak in the available sequence of the sequencing result is marked as an available double peak; for each available double peak, the number of bases between the available double peak and the first base of the available sequence in which it is located is calculated and marked as n3; the first anchor sequence of the available sequence is searched for in the allele, and the base located n3 bases after the first base of that anchor sequence in the allele is taken as the comparable base; for each candidate pair, a comparable base pair is thus obtained for each available double peak; if every comparable base pair of a candidate pair is identical to the bases of its corresponding available double peak, the candidate pair is the HLA-I type of the sequencing result.
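The pairing of step S7 and the double-peak comparison of step S8 can be sketched together in Python. This sketch makes several assumptions: the anchor is located in each allele by exact string search, each double peak is given as the set of its two observed bases keyed by n3, and the allele names and sequences are hypothetical.

```python
from itertools import combinations_with_replacement


def comparable_base(allele_seq, first_anchor, n3):
    """Return the allele base n3 positions after the first base of the anchor,
    or None if the anchor is absent or the position is out of range."""
    start = allele_seq.find(first_anchor)  # exact search, an assumption
    if start < 0 or start + n3 >= len(allele_seq):
        return None
    return allele_seq[start + n3]


def call_type(double_peaks, first_anchor, candidates):
    """Return candidate pairs whose bases explain every available double peak.

    double_peaks: dict n3 -> set of the two bases observed at that position.
    candidates: dict allele name -> allele sequence.
    """
    calls = []
    # Pairwise pairing, including each candidate allele with itself (step S7).
    for a, b in combinations_with_replacement(sorted(candidates), 2):
        matched = all(
            {comparable_base(candidates[a], first_anchor, n3),
             comparable_base(candidates[b], first_anchor, n3)} == set(observed)
            for n3, observed in double_peaks.items()
        )
        if matched:
            calls.append((a, b))
    return calls


# Hypothetical anchor and alleles; double peaks at n3 = 11 (C/G) and 12 (A/T).
anchor = "ACGTTGCAAT"
candidates = {
    "allele_1": "GG" + anchor + "CCAGT",
    "allele_2": "GG" + anchor + "CCTGT",
    "allele_3": "T" + anchor + "CGAGT",
}
double_peaks = {11: {"C", "G"}, 12: {"A", "T"}}
result = call_type(double_peaks, anchor, candidates)
```

A self-pair can never explain a two-base double peak in this sketch, so homozygous pairs survive only when the available sequence contains no double peaks.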
9. An anchor sequence group, characterized by comprising the sequences of SEQ ID NO. 1-188.
CN202110172484.8A 2021-02-08 2021-02-08 Method for determining HLA-I type of subject and anchor sequence group Active CN112921078B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110172484.8A CN112921078B (en) 2021-02-08 2021-02-08 Method for determining HLA-I type of subject and anchor sequence group

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110172484.8A CN112921078B (en) 2021-02-08 2021-02-08 Method for determining HLA-I type of subject and anchor sequence group

Publications (2)

Publication Number Publication Date
CN112921078A CN112921078A (en) 2021-06-08
CN112921078B CN112921078B (en) 2022-07-08

Family

ID=76171217

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110172484.8A Active CN112921078B (en) 2021-02-08 2021-02-08 Method for determining HLA-I type of subject and anchor sequence group

Country Status (1)

Country Link
CN (1) CN112921078B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO1997031126A1 (en) * 1996-02-20 1997-08-28 The Perkin-Elmer Corporation Methods and reagents for typing hla class i genes
US6103465A (en) * 1995-02-14 2000-08-15 The Perkin-Elmer Corporation Methods and reagents for typing HLA class I genes
CN101892317A (en) * 2010-07-29 2010-11-24 苏州大学 HLA high-resolution gene sequencing kit
CN101921842A (en) * 2010-06-30 2010-12-22 深圳华大基因科技有限公司 HLA (Human Leukocyte Antigen)-A,B genotyping PCR (Polymerase Chain Reaction) primer and application method thereof

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6103465A (en) * 1995-02-14 2000-08-15 The Perkin-Elmer Corporation Methods and reagents for typing HLA class I genes
WO1997031126A1 (en) * 1996-02-20 1997-08-28 The Perkin-Elmer Corporation Methods and reagents for typing hla class i genes
CN101921842A (en) * 2010-06-30 2010-12-22 深圳华大基因科技有限公司 HLA (Human Leukocyte Antigen)-A,B genotyping PCR (Polymerase Chain Reaction) primer and application method thereof
CN101892317A (en) * 2010-07-29 2010-11-24 苏州大学 HLA high-resolution gene sequencing kit

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
SOAPTyping: an open-source and cross-platform tool for sequence-based typing for HLA class I and II alleles; Yong Zhang et al.; BMC Bioinformatics; 20200708; pp. 1-9 *
Comparison of two HLA typing methods; 原应博 et al.; 《现代畜牧科技》; 20191231; pp. 1-6 *

Also Published As

Publication number Publication date
CN112921078A (en) 2021-06-08

Similar Documents

Publication Publication Date Title
CN105339508B (en) Multiple DNA typing method and kit for HLA gene
Yaspo et al. Model for a transcript map of human chromosome 21: isolation of new coding sequences from exon and enriched cDNA libraries
CN117070637B (en) Molecular marker related to chicken immune traits and application thereof
US20190153528A1 (en) Method for target specific rna transcription of dna sequences
Padhi et al. Independent expansion of the keratin gene family in teleostean fish and mammals: an insight from phylogenetic analysis and radiation hybrid mapping of keratin genes in zebrafish
Lahbib-Mansais et al. Contribution to high-resolution mapping in pigs with 101 type I markers and progress in comparative map between humans and pigs
CN112921078B (en) Method for determining HLA-I type of subject and anchor sequence group
Romanov et al. Construction of a California condor BAC library and first-generation chicken–condor comparative physical map as an endangered species conservation genomics resource
Bortolini et al. Genetic variability in two Brazilian ethnic groups: a comparison of mitochondrial and protein data
Le Provost et al. A survey of the goat genome transcribed in the lactating mammary gland
CN102994511B (en) Cloning and application for bovine slaughter trait related gene CMYA4
Phillips et al. Characterization of the OmyY1 region on the rainbow trout Y chromosome
Christian et al. An evaluation of the assembly of an approximately 15-Mb region on human chromosome 13q32–q33 linked to bipolar disorder and schizophrenia
Chua et al. Molecular genetic approaches to obesity
Alföldi Sequence of the mouse Y chromosome
Weikard et al. Targeted construction of a high-resolution, integrated, comprehensive, and comparative map for a region specific to bovine chromosome 6 based on radiation hybrid mapping
CN109628611B (en) ARID5B gene mutation site influencing intramuscular fat content of beef cattle and application thereof
CN110272994B (en) Gene mutation diagnosis of CVM and application thereof
CN116590431B (en) Molecular marker related to chicken carcass traits and application thereof
KR20220123246A (en) Nucleic Acid Sequence Analysis Methods
Iwata et al. Aging-related occurrence in Ashkenazi Jews of leukocyte heteroplasmic mtDNA mutation adjacent to replication origin frequently remodeled in Italian centenarians
WO2008050870A1 (en) Organ-specific gene, method for identifying the same and use thereof
Levitt Molecular genetic methods for mapping disease genes
GB2433320A (en) Diagnostic method
Shi et al. The novel HLA‐DRB1* 14: 246 allele, identified by Sanger dideoxy nucleotide sequencing in a Chinese individual.

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20220606

Address after: 310020 room 103-10, building 7, Chuangzhi Green Valley Development Center, 788 HONGPU Road, Jianggan District, Hangzhou City, Zhejiang Province

Applicant after: Hangzhou Huben biomedical Co.,Ltd.

Address before: 3 / F, building 4, Zhejiang agricultural science and Technology Innovation Park, 298 Desheng Middle Road, Jianggan District, Hangzhou City, Zhejiang Province, 310000

Applicant before: Tian Ye

GR01 Patent grant