CN112608994A

CN112608994A - Detection method, device and storage medium for hypertrophic cardiomyopathy and related genes

Info

Publication number: CN112608994A
Application number: CN202011630004.XA
Authority: CN
Inventors: 吴莉萍; 辜清泉; 罗宏敏; 杨旭
Original assignee: Nanchang Ruiyinkang Biotechnology Co ltd
Current assignee: Nanchang Ruiyinkang Biotechnology Co ltd
Priority date: 2020-12-31
Filing date: 2020-12-31
Publication date: 2021-04-06

Abstract

The application discloses a method, a device and a storage medium for detecting hypertrophic cardiomyopathy and related genes. The method comprises the following steps: a target sequence acquisition step, which comprises obtaining four target gene sequences from the genome sequence of a sample to be tested by PCR amplification; a sequencing step, which comprises performing high-throughput sequencing on the four obtained target gene sequences; a reo-hit interpretation step, which comprises obtaining variation information from the high-throughput sequencing result, and obtaining pathogenicity evaluation information for the sample to be tested by using a human frequency database, a disease database, a variation database and pathogenicity prediction software; and a report generation step, which comprises outputting a pathogenicity evaluation report according to the result of the reo-hit interpretation step. Because the method completes gene detection for hypertrophic cardiomyopathy with a high-throughput sequencing panel of only four genes, it reduces the cost of gene detection and greatly improves the efficiency of disease screening for hypertrophic cardiomyopathy.

Description

Detection method, device and storage medium for hypertrophic cardiomyopathy and related genes

Technical Field

The application relates to the field of gene detection of hypertrophic cardiomyopathy, in particular to a detection method, a detection device and a storage medium for hypertrophic cardiomyopathy and related genes.

Background

Hereditary hypertrophic cardiomyopathy (HCM) is a common monogenic cardiovascular disease and one of the main causes of sudden death in adolescents. It is characterized by myocardial hypertrophy that has no other identifiable cause and is accompanied by impaired diastolic function. The incidence of hereditary hypertrophic cardiomyopathy is about 1/500, but because some carriers of pathogenic HCM variations are asymptomatic, the actual prevalence may be as high as 1/200.

The genetic basis of HCM is mainly variation in genes encoding the myocardial sarcomere and related proteins, which accounts for more than 90% of HCM patients in whom a pathogenic variation has been found. Variations in non-sarcomeric genes are very rare and mainly cause syndromes and metabolic diseases accompanied by a myocardial hypertrophy phenotype. These relatively rare phenocopies are sometimes associated with extra-cardiac clinical manifestations, such as Noonan syndrome with developmental delay, and Fabry disease and Pompe disease with metabolic abnormalities. Before other systemic abnormalities appear, they can be clinically indistinguishable from simple HCM, especially for inexperienced physicians. In this situation genetic testing has a clear advantage: it allows a definitive diagnosis before the patient's symptoms are fully manifested, and it may provide some degree of clinical guidance.

MYBPC3 (myosin binding protein C, cardiac) is the gene that encodes cardiac myosin-binding protein C (MyBP-C). Cardiac MyBP-C is located in the C zone of the myocardial sarcomere and accounts for approximately 2% of total myofibrillar protein. The protein belongs to the immunoglobulin superfamily and is expressed only in the heart. MYBPC3 is located on chromosome 11p11.2, contains 35 exons and encodes a protein of 1274 amino acid residues. The protein consists of 11 major domains, numbered C0-C10, and its myosin-binding site lies in domains C8-C10. By binding myosin and titin, MyBP-C stabilizes the sarcomere structure. Cardiac MyBP-C has a unique N-terminal motif between the C1 and C2 domains, which serves as a site of action for cAMP-dependent protein kinase and calmodulin-dependent protein kinase. Phosphorylation of this cardiac-specific motif regulates myocardial contractility, so MyBP-C participates not only in maintaining myocardial structure but also in intracellular signal transduction, and it influences the relaxation and contraction of the myofilaments. Most MYBPC3 mutations are deletions, insertions, duplications and the like, which produce a truncated protein lacking the myosin and titin binding sites and thereby damage the structure and function of the sarcomere.

MYH7 (myosin heavy chain 7) is the gene that encodes the cardiac β-myosin heavy chain. MYH7 is the most important causative gene of hypertrophic cardiomyopathy (HCM), and about 30-50% of HCM cases are attributed mainly to MYH7 mutations. The cardiac β-myosin heavy chain is the main component of human ventricular myosin and plays a very important role in supplying energy to myocardial cells and in maintaining the calcium ion concentration inside and outside those cells. The MYH7 gene is located at 14q12 and contains 40 exons, 38 of which encode a protein of 1935 amino acid residues. The encoded protein has three functional regions: the globular head region (S1), the head-rod junction (S2) and the rod-like tail region. The S1 region, which contains the ATPase active site, the actin-binding site and the interface with the essential light chain (ELC), is an important functional region of myosin. More than 200 single-base mutations have been reported worldwide, and some of them show malignant features such as high penetrance, early onset, rapid progression, severe myocardial hypertrophy, a tendency to heart failure and even sudden death. Mutations in this gene are generally associated with familial hypertrophic cardiomyopathy, myosin storage myopathy, dilated cardiomyopathy and Laing early-onset distal myopathy.

TNNI3 (troponin I3, cardiac type) is the gene encoding cardiac troponin I. Troponin I (TnI), troponin T (TnT) and troponin C (TnC) are the three subunits of the troponin complex of the striated muscle thin filament. TnI is the inhibitory subunit: it blocks actin-myosin interaction and thereby mediates striated muscle relaxation. TNNI3 encodes the cardiac isoform of TnI (cTnI), which is expressed exclusively in myocardial tissue. The gene is approximately 6.2 kb long and contains 8 exons, and its mutations cause familial hypertrophic cardiomyopathy and familial restrictive cardiomyopathy.

TNNT2 (troponin T2, cardiac type) is the gene encoding cardiac troponin T. It is located on chromosome 1q32, spans approximately 17 kb and consists of 17 exons. Troponin T binds troponin C, troponin I and tropomyosin, and plays a key role in the Ca²⁺-dependent regulation of thin filament activity. Thirty different TNNT2 mutations have been found to date. HCM caused by TNNT2 mutations (missense mutations, splice-signal mutations and small deletions) accounts for approximately 15% of all HCM.

These four genes are the core genes of hypertrophic cardiomyopathy. Screening for carriers by gene detection before onset or at an early stage of the disease enables early screening and management. However, current gene detection usually relies on high-throughput sequencing panels of hundreds of genes. Although such panels ensure comprehensive detection and suit the clinical diagnosis of difficult and complicated cases, their cost is too high for screening hypertrophic cardiomyopathy in large populations.

How to simplify gene detection for hypertrophic cardiomyopathy and reduce its cost, so that it suits large-scale population screening, remains the main difficulty in hypertrophic cardiomyopathy screening.

Disclosure of Invention

The application aims to provide a method, a device and a storage medium for detecting hypertrophic cardiomyopathy and related genes, so as to be suitable for screening hypertrophic cardiomyopathy of large-scale people.

In order to achieve the purpose, the following technical scheme is adopted in the application:

The first aspect of the application discloses a method for detecting hypertrophic cardiomyopathy and related genes, which comprises the following steps:

a target sequence acquisition step, which comprises obtaining the four target gene sequences MYBPC3, MYH7, TNNT2 and TNNI3 from the genome sequence of a sample to be tested by PCR amplification;

a sequencing step, which comprises the high-throughput sequencing of the four obtained target gene sequences;

a reo-hit interpretation step, which comprises obtaining variation information from the high-throughput sequencing result, annotating each variation using a human frequency database, a disease database, a variation database and pathogenicity prediction software, collecting at least 50 evaluation parameters for each variation, and converting them into the 28 evaluation parameters of the ACMG guideline to obtain pathogenicity evaluation information for the sample to be tested; the human frequency database indicates the pathogenicity of a variation through its frequency, a higher frequency suggesting lower pathogenicity; the disease database indicates associations between diseases and genes, so that genes and variations related to the patient's phenotype can be retrieved; the variation database indicates previously reported pathogenic or benign variations and is used to adjust the weight of each detected variation;

and a report generation step, which comprises outputting, according to the result of the reo-hit interpretation step, at least one of the following: patient information, disease description information, gene information, variation characteristics, an evidence list, a pathogenicity evaluation result, information on variations of unknown significance, and experimental quality control parameters.

The key point of the application is that a reo-hit interpretation system automatically converts the 50 evaluation parameters of the variation characteristics found in the high-throughput sequencing data of the four pathogenic genes of hypertrophic cardiomyopathy into the 28 evaluation parameters of the ACMG guideline, and thereby obtains a pathogenicity evaluation for hypertrophic cardiomyopathy. Gene detection for hypertrophic cardiomyopathy is thus completed with a high-throughput sequencing panel of only four genes, and the pathogenicity of the hypertrophic cardiomyopathy genes is evaluated independently, which reduces the cost of gene detection and greatly improves the efficiency of disease screening for hypertrophic cardiomyopathy.

In one implementation of the present application, the four target gene sequences include sequences covering the exon regions of the four genes and extending at least 15 bp beyond their splice sites.

In one implementation of the present application, the conditions for high-throughput sequencing are a sequencing depth >300×, 1× coverage >99% and 20× coverage >98%.

In one implementation of the present application, the human frequency database includes the 1000 Genomes database, the ExAC database, the gnomAD database, the EVS database and an in-house database;

preferably, the disease database comprises the OMIM database and the CGD database;

preferably, the variation database comprises the ClinVar database, the HGMD database and the OMIM database;

preferably, the pathogenicity prediction software comprises at least one of LRT, MutationTaster, FATHMM, PROVEAN, MetaSVM, MetaLR, CADD, FATHMM-MKL coding, phyloP100way_vertebrate, phyloP20way_mammalian, phastCons100way_vertebrate, phastCons20way_mammalian and SiPhy_29way_logOdds;

preferably, the reo-hit interpretation step further comprises supplementary annotation using a GWAS-catalog database.

The second aspect of the application discloses multiplex PCR primers for detecting hypertrophic cardiomyopathy and related genes, wherein the multiplex PCR primers are used for amplifying the four target gene sequences MYBPC3, MYH7, TNNT2 and TNNI3.

In one implementation of the present application, the multiplex PCR primers cover the exon regions of the four target genes and sequences extending at least 15 bp beyond their splice sites.

The third aspect of the application also discloses a detection device for hypertrophic cardiomyopathy and related genes, which comprises a target sequence acquisition module, a sequencing module, a reo-hit interpretation module and a report generation module,

the target sequence acquisition module is used for obtaining the four target gene sequences MYBPC3, MYH7, TNNT2 and TNNI3 from the genome sequence of a sample to be tested by PCR amplification;

the sequencing module is used for carrying out high-throughput sequencing on the obtained four target gene sequences;

the reo-hit interpretation module is used for obtaining variation information from the high-throughput sequencing result, annotating each variation using a human frequency database, a disease database, a variation database and pathogenicity prediction software, collecting at least 50 evaluation parameters for each variation, and converting them into the 28 evaluation parameters of the ACMG (American College of Medical Genetics and Genomics) guideline to obtain pathogenicity evaluation information for the sample to be tested; the human frequency database indicates the pathogenicity of a variation through its frequency, a higher frequency suggesting lower pathogenicity; the disease database indicates associations between diseases and genes, so that genes and variations related to the patient's phenotype can be retrieved; the variation database indicates previously reported pathogenic or benign variations and is used to adjust the weight of each detected variation;

and the report generation module is used for outputting, according to the result of the reo-hit interpretation module, at least one of the following: patient information, disease description information, gene information, variation characteristics, an evidence list, a pathogenicity evaluation result, information on variations of unknown significance, and experimental quality control parameters.

In one implementation of the present application, the four target gene sequences include sequences covering exon regions of the four genes and at least 15bp of their splicing site extensions;

preferably, the conditions for high-throughput sequencing are a sequencing depth >300×, 1× coverage >99% and 20× coverage >98%;

preferably, the human frequency database comprises the 1000 Genomes database, the ExAC database, the gnomAD database, the EVS database and an in-house database;

preferably, the disease database comprises an OMIM database and a CGD database;

The fourth aspect of the present application further discloses a device for detecting hypertrophic cardiomyopathy and related genes, the device comprising a memory and a processor;

the memory is used for storing a program;

and the processor is used for executing the program stored in the memory to implement the detection method for hypertrophic cardiomyopathy and related genes.

The fifth aspect of the present application also discloses a computer-readable storage medium, in which a program is stored, and the program can be executed by a processor to implement the detection method for hypertrophic cardiomyopathy and related genes.

Due to the adoption of the technical scheme, the beneficial effects of the application are as follows:

In the method, a reo-hit interpretation system automatically converts the 50 evaluation parameters of the variation characteristics found in the high-throughput sequencing data of the four pathogenic genes of hypertrophic cardiomyopathy into the 28 evaluation parameters of the ACMG guideline, and thereby obtains a pathogenicity evaluation for hypertrophic cardiomyopathy. Gene detection for hypertrophic cardiomyopathy is thus completed with a high-throughput sequencing panel of only four genes, and the pathogenicity of the hypertrophic cardiomyopathy genes is evaluated independently, which reduces the cost of gene detection and greatly improves the efficiency of disease screening for hypertrophic cardiomyopathy.

Drawings

FIG. 1 is a block diagram of a method for detecting hypertrophic cardiomyopathy and related genes according to an embodiment of the present disclosure;

FIG. 2 is a block diagram of a detection device for hypertrophic cardiomyopathy and related genes provided in an embodiment of the present application.

Detailed Description

The present application will be described in further detail with reference to specific embodiments. In the following description, numerous details are set forth in order to provide a better understanding of the present application. However, those skilled in the art will readily recognize that some of these features may be omitted, or replaced by other elements, materials or methods, in different cases. In some instances, certain operations related to the present application are not shown or described in detail, in order to avoid obscuring the core of the present application with excessive description. A detailed description of these operations is unnecessary, because those skilled in the art can fully understand them from the specification and from general knowledge in the art.

Furthermore, the features, operations or characteristics described in the specification may be combined in any suitable manner to form various embodiments. The steps or actions in the method descriptions may also be reordered or rearranged, as will be apparent to one of ordinary skill in the art. Thus, the sequences in the specification serve only to describe a particular embodiment clearly and do not imply a required order, unless it is otherwise indicated that a certain order must be followed.

As shown in FIG. 1, the present embodiment provides a method for detecting hypertrophic cardiomyopathy and related genes, comprising the following steps:

S201, a target sequence acquisition step, which comprises obtaining the four target gene sequences MYBPC3, MYH7, TNNT2 and TNNI3 from the genome sequence of a sample to be tested by PCR amplification;

Specifically, the sample to be tested may be blood, or oral mucosa cells collected with an oral swab, and the genome sequence of the sample is obtained by extracting DNA from it. The DNA may be extracted in any manner conventional in the art, without particular limitation. In this embodiment, multiplex PCR primer pairs and probes are designed for the four pathogenic genes of hypertrophic cardiomyopathy, MYBPC3, MYH7, TNNT2 and TNNI3, and the genome sequence of the sample to be tested is subjected to multiplex PCR amplification for subsequent high-throughput sequencing. In one implementation of this embodiment, the annealing temperature for multiplex PCR amplification ranges from 55 degrees Celsius to 70 degrees Celsius, and the number of cycles ranges from 25 to 35. In one implementation of this embodiment, the four target gene sequences include sequences covering the exon regions of the four genes and extending at least 15 bp beyond their splice sites.
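The target definition above (exon regions plus at least 15 bp beyond each splice site) can be illustrated with a short Python sketch. It is not part of the application: the function name `pad_and_merge` and the exon coordinates are hypothetical, and real coordinates would come from a gene annotation for the four genes.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # (start, end), 1-based inclusive coordinates


def pad_and_merge(exons: List[Interval], flank: int = 15) -> List[Interval]:
    """Extend each exon by `flank` bp on both sides, then merge overlaps."""
    padded = sorted((max(1, start - flank), end + flank) for start, end in exons)
    merged: List[Interval] = []
    for start, end in padded:
        if merged and start <= merged[-1][1] + 1:
            # Overlaps or touches the previous region: extend that region.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


# Hypothetical exon coordinates, for illustration only.
example_exons = [(100, 200), (220, 300), (1000, 1100)]
targets = pad_and_merge(example_exons)
```

Exons that lie closer together than twice the flank collapse into a single target region, which keeps the amplicon design free of redundant overlapping targets.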

S202, a sequencing step, which comprises the step of carrying out high-throughput sequencing on the four acquired target gene sequences;

High-throughput sequencing is performed on the amplified gene sequences to obtain high-throughput sequencing data of the sample to be tested for the four pathogenic genes, and variant detection is performed on these data to obtain the variation characteristics of the sample. In one implementation of this embodiment, the conditions for high-throughput sequencing are a sequencing depth >300×, 1× coverage >99% and 20× coverage >98%.
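The three sequencing conditions can be checked from per-base depths over the target regions. The sketch below is illustrative only: the function name `coverage_qc` and the interpretation of "sequencing depth" as the mean depth over all target bases are assumptions, since the application does not say how the values are computed.

```python
from typing import Dict, Sequence, Union


def coverage_qc(depths: Sequence[int]) -> Dict[str, Union[float, bool]]:
    """Summarise per-base depths and test them against the stated thresholds."""
    if not depths:
        raise ValueError("no target bases supplied")
    n = len(depths)
    mean_depth = sum(depths) / n
    cov_1x = sum(d >= 1 for d in depths) / n    # fraction of bases covered at >= 1x
    cov_20x = sum(d >= 20 for d in depths) / n  # fraction of bases covered at >= 20x
    passed = mean_depth > 300 and cov_1x > 0.99 and cov_20x > 0.98
    return {
        "mean_depth": mean_depth,
        "coverage_1x": cov_1x,
        "coverage_20x": cov_20x,
        "passed": passed,
    }


# Synthetic depths for illustration: 995 well-covered bases, 3 shallow, 2 uncovered.
good_run = coverage_qc([400] * 995 + [10] * 3 + [0] * 2)
bad_run = coverage_qc([400] * 900 + [0] * 100)
```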

Specifically, the raw sequencing output contains adapter sequences, low-quality bases and undetermined bases (denoted by N), all of which would seriously interfere with the subsequent reo-hit interpretation. The raw data are therefore first filtered to obtain high-quality data (clean data, or clean reads). The clean data of each sample are then aligned to the human reference genome (GRCh37) using the alignment software BWA (Burrows-Wheeler Aligner) to obtain an alignment result file in BAM format. Based on the alignment results, PCR duplicate reads are removed, and GATK is used to perform variant detection and filtering on the BAM data to obtain the variation characteristics of the sample sequencing data.
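The read-filtering idea described above can be sketched as a simple per-read predicate. This is a minimal illustration, not the filter used in the application: the function name `is_clean_read`, all thresholds and the adapter string are assumptions, because the application does not specify them.

```python
from typing import Sequence

# Placeholder adapter prefix; the application does not name an adapter sequence.
EXAMPLE_ADAPTER = "AGATCGGAAGAGC"


def is_clean_read(
    seq: str,
    quals: Sequence[int],
    adapter: str = EXAMPLE_ADAPTER,
    max_n_fraction: float = 0.10,            # assumed threshold
    min_quality: int = 20,                   # assumed Phred cut-off
    max_low_quality_fraction: float = 0.50,  # assumed threshold
) -> bool:
    """Return True if the read passes the adapter, N-content and quality checks."""
    if not seq or len(seq) != len(quals):
        return False
    seq = seq.upper()
    if adapter and adapter in seq:
        return False  # adapter contamination
    if seq.count("N") / len(seq) > max_n_fraction:
        return False  # too many undetermined bases
    low_quality = sum(q < min_quality for q in quals)
    return low_quality / len(seq) <= max_low_quality_fraction


clean = is_clean_read("ACGTACGTAC", [35] * 10)
too_many_n = is_clean_read("ACGTNNNTAC", [35] * 10)
with_adapter = is_clean_read("ACGT" + EXAMPLE_ADAPTER, [35] * 17)
low_quality = is_clean_read("ACGTACGTAC", [10] * 6 + [35] * 4)
```

In practice this step is usually delegated to a dedicated trimming tool. The predicate only makes the three rejection reasons named in the text explicit.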

S203, a reo-hit interpretation step, which comprises obtaining variation information from the high-throughput sequencing result, annotating each variation using a human frequency database, a disease database, a variation database and pathogenicity prediction software, collecting at least 50 evaluation parameters for each variation, and converting them into the 28 evaluation parameters of the ACMG guideline to obtain pathogenicity evaluation information for the sample to be tested; the human frequency database indicates the pathogenicity of a variation through its frequency, a higher frequency suggesting lower pathogenicity; the disease database indicates associations between diseases and genes, so that genes and variations related to the patient's phenotype can be retrieved; the variation database indicates previously reported pathogenic or benign variations and is used to adjust the weight of each detected variation;

Specifically, the human frequency database includes the 1000 Genomes database, the ExAC database, the gnomAD database, the EVS database and an in-house database, which together provide more frequency information; the disease database comprises the OMIM database and the CGD database; the variation database comprises the ClinVar database, the HGMD database and the OMIM database; the pathogenicity prediction software comprises at least one of LRT, MutationTaster, FATHMM, PROVEAN, MetaSVM, MetaLR, CADD, FATHMM-MKL coding, phyloP100way_vertebrate, phyloP20way_mammalian, phastCons100way_vertebrate, phastCons20way_mammalian and SiPhy_29way_logOdds. The prediction algorithms of the different tools differ and complement one another, so the pathogenicity of a variation can be predicted more comprehensively.

The human frequency database, the disease database and the variation database provide the ACMG evidence set, and the pathogenicity prediction software is used to find variation characteristics that meet the ACMG evidence definitions and to highlight them for further comprehensive analysis. In this way the variation characteristics are annotated and 50 evaluation parameters are collected for each one. The 50 evaluation parameters are then automatically converted into the 28 evaluation parameters of the ACMG guideline, from which the pathogenicity evaluation for hypertrophic cardiomyopathy is generated.

Specifically, the 50 evaluation parameters include basic variation information, functional annotations, frequency information, gene-disease associations, variation-disease associations and the like. The basic variation information includes the chromosome position, the base change and the affected gene; the functional annotations include missense mutations, splice-site mutations, frameshift mutations and the like; the frequency information comprises variant frequency data extracted from databases such as 1000 Genomes, ExAC, EVS and the in-house database; the gene-disease association states which diseases a gene can cause according to the CGD and OMIM databases; and the variation-disease association states which diseases a variation can cause according to OMIM, ClinVar and HGMD.
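One way to picture the five parameter groups is as a single annotation record per variation. The following Python sketch is only an assumed data layout: the class name `VariantAnnotation`, its field names and the example values are hypothetical, and the application does not enumerate all 50 parameters.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class VariantAnnotation:
    # Basic variation information
    chromosome: str
    position: int
    ref: str
    alt: str
    gene: str
    # Functional annotation, e.g. "missense", "splice_site", "frameshift"
    functional_annotation: str
    # Frequency per database; None means the variation is not recorded there
    population_frequencies: Dict[str, Optional[float]] = field(default_factory=dict)
    # Gene-disease associations (e.g. from CGD, OMIM)
    gene_diseases: List[str] = field(default_factory=list)
    # Variation-disease associations (e.g. from OMIM, ClinVar, HGMD)
    variant_diseases: List[str] = field(default_factory=list)

    def max_population_frequency(self) -> float:
        """Highest frequency seen in any database, 0.0 if recorded in none."""
        observed = [f for f in self.population_frequencies.values() if f is not None]
        return max(observed, default=0.0)


# Hypothetical record; the coordinate and frequencies are made up for illustration.
example = VariantAnnotation(
    chromosome="11",
    position=123456,
    ref="G",
    alt="A",
    gene="MYBPC3",
    functional_annotation="missense",
    population_frequencies={"1000 Genomes": None, "ExAC": 0.0001, "EVS": None, "In-house": 0.0},
    gene_diseases=["hypertrophic cardiomyopathy"],
)
```

Taking the maximum across databases reflects the stated principle that a higher population frequency suggests lower pathogenicity: a variation is treated as common if any one database reports it as common.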

Further, the 50 evaluation parameters are automatically converted into the 28 evaluation parameters of the ACMG guideline according to rules defined by the 28 evidence items of the guideline, namely PVS1, PS1, PS2, PS3, PS4, PM1, PM2, PM3, PM4, PM5, PM6, PP1, PP2, PP3, PP4, PP5, BA1, BS1, BS2, BS3, BS4, BP1, BP2, BP3, BP4, BP5, BP6 and BP7.
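The application does not state how the 28 evidence items are combined into a final classification. As a hedged illustration, the sketch below assumes the combining rules published with the 2015 ACMG/AMP guideline; the function name `classify` is hypothetical, and a production system might apply gene-specific or updated rules instead.

```python
from typing import Iterable

ALL_CODES = frozenset(
    ["PVS1"]
    + [f"PS{i}" for i in range(1, 5)]
    + [f"PM{i}" for i in range(1, 7)]
    + [f"PP{i}" for i in range(1, 6)]
    + ["BA1"]
    + [f"BS{i}" for i in range(1, 5)]
    + [f"BP{i}" for i in range(1, 8)]
)


def classify(evidence: Iterable[str]) -> str:
    """Combine evidence codes into a five-tier class (assumed 2015 ACMG/AMP rules)."""
    codes = set(evidence)
    unknown = codes - ALL_CODES
    if unknown:
        raise ValueError(f"unknown evidence codes: {sorted(unknown)}")

    def count(prefix: str) -> int:
        return sum(code.startswith(prefix) for code in codes)

    pvs, ps, pm, pp = count("PVS"), count("PS"), count("PM"), count("PP")
    ba, bs, bp = count("BA"), count("BS"), count("BP")

    pathogenic = (
        (pvs >= 1 and (ps >= 1 or pm >= 2 or (pm >= 1 and pp >= 1) or pp >= 2))
        or ps >= 2
        or (ps == 1 and (pm >= 3 or (pm == 2 and pp >= 2) or (pm == 1 and pp >= 4)))
    )
    likely_pathogenic = (
        (pvs >= 1 and pm == 1)
        or (ps == 1 and 1 <= pm <= 2)
        or (ps == 1 and pp >= 2)
        or pm >= 3
        or (pm == 2 and pp >= 2)
        or (pm == 1 and pp >= 4)
    )
    benign = ba >= 1 or bs >= 2
    likely_benign = (bs == 1 and bp >= 1) or bp >= 2

    # Contradictory pathogenic and benign evidence yields uncertain significance.
    if (pathogenic or likely_pathogenic) and (benign or likely_benign):
        return "Uncertain significance"
    if pathogenic:
        return "Pathogenic"
    if likely_pathogenic:
        return "Likely pathogenic"
    if benign:
        return "Benign"
    if likely_benign:
        return "Likely benign"
    return "Uncertain significance"
```

For instance, under these assumed rules `classify(["PVS1", "PM2"])` yields "Likely pathogenic", while a single moderate item such as `["PM2"]` stays at "Uncertain significance".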

For example, for a variation characteristic with a population frequency of 0, the ACMG PM2 evaluation parameter can be obtained according to the ACMG guideline's definition of PM2 evidence: the variation is extremely rare and is not recorded in the 1000 Genomes database, the ExAC database, the EVS database or the in-house database. For a variation characteristic functionally annotated as a frameshift mutation, the ACMG guideline's definition of PVS1 evidence applies: certain types of variation (e.g., nonsense, frameshift, canonical ±1 or ±2 splice-site, start codon, and single-exon or multi-exon deletion variations) are generally assumed to result in complete loss of the gene product and thus in loss of gene function. The variation can therefore be assessed as ACMG PVS1: it is generally assumed to result in complete loss of the gene product and thus in loss of gene function.
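The two conversions in this example can be written as simple rules. The sketch below is illustrative: the function name `assign_pm2_pvs1` and the annotation labels are hypothetical, and a real implementation of PVS1 would also need to confirm that loss of function is a known disease mechanism for the gene.

```python
from typing import Dict, List, Optional

# Hypothetical labels for the variation types listed under PVS1.
LOSS_OF_FUNCTION_TYPES = frozenset(
    {"nonsense", "frameshift", "canonical_splice_site", "start_codon", "exon_deletion"}
)


def assign_pm2_pvs1(
    functional_annotation: str,
    population_frequencies: Dict[str, Optional[float]],
) -> List[str]:
    """Return the subset of {PVS1, PM2} supported by the annotations."""
    codes: List[str] = []
    if functional_annotation in LOSS_OF_FUNCTION_TYPES:
        codes.append("PVS1")
    # PM2: absent (None) or zero frequency in every database that was consulted.
    if population_frequencies and all(
        freq is None or freq == 0 for freq in population_frequencies.values()
    ):
        codes.append("PM2")
    return codes


absent_everywhere = {"1000 Genomes": None, "ExAC": 0.0, "EVS": None, "In-house": None}
frameshift_codes = assign_pm2_pvs1("frameshift", absent_everywhere)
common_missense_codes = assign_pm2_pvs1("missense", {"ExAC": 0.02})
```

PM2 is withheld when no frequency database was consulted at all, so that missing data is not mistaken for evidence of rarity.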

In one implementation of this embodiment, some ACMG rules are ambiguous and make it difficult to generate evaluation parameters automatically, for example: PS1, a change to the same amino acid at the same position has been reported as an established pathogenic variation, indicating that the variation is very likely pathogenic as well; PS4, population studies of the disease have shown that the frequency of the variation in the patient population is significantly higher than in the normal population; PM5, a change to a different amino acid at the same position has been reported as an established pathogenic variation, indicating that the variation may also be pathogenic; PP2, the variation is a missense mutation in a gene in which missense mutation is a known common pathogenic mechanism and benign missense mutations are rare; BS1, the variation is looked up in the population frequency database and compared with the expected frequency of pathogenic variations for the target disease, and a population frequency greater than the expected frequency suggests that the variation may not be pathogenic. For such rules, the reo-hit interpretation step can also define the ACMG evidence according to the interpretation practice of the nine laboratories of the Clinical Sequencing Exploratory Research (CSER) Consortium, and automatically generate the corresponding evaluation parameters of the ACMG guideline.
For example, the ACMG PP2 evidence reads: the variation is a missense mutation in a gene in which missense mutation is a known common pathogenic mechanism and benign missense mutations are rare. The ACMG guideline, however, does not explain how PP2 evidence should be defined. The reo-hit interpretation step in this embodiment defines PP2 evidence as more than 60% of the missense mutations in the gene being pathogenic, so that the corresponding evaluation parameter of the ACMG guideline can be generated automatically from this definition.
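The 60% definition of PP2 given above translates directly into a threshold rule. In the sketch below, the function name `pp2_applies` is hypothetical, and so is the way the fraction is computed: it assumes counts of classified pathogenic and benign missense mutations for the gene taken from a variation database, which the application does not specify.

```python
def pp2_applies(
    is_missense: bool,
    pathogenic_missense: int,
    benign_missense: int,
    threshold: float = 0.60,
) -> bool:
    """PP2 holds if the variation is missense and more than `threshold`
    of the gene's classified missense mutations are pathogenic."""
    total = pathogenic_missense + benign_missense
    if not is_missense or total == 0:
        return False
    return pathogenic_missense / total > threshold


# Hypothetical counts for illustration.
mostly_pathogenic = pp2_applies(True, pathogenic_missense=70, benign_missense=30)
exactly_at_threshold = pp2_applies(True, pathogenic_missense=60, benign_missense=40)
not_missense = pp2_applies(False, pathogenic_missense=90, benign_missense=10)
```

The comparison is strict because the text says "more than 60%", so a gene sitting exactly at the threshold does not receive PP2.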

In one implementation of this embodiment, the reo-hit interpretation step further includes supplementary annotation using the GWAS Catalog database. For example, automated supplementary annotation is performed for PS4 evidence, which states that population studies of the disease show the frequency of the variation in the patient population to be significantly higher than in the normal population.

S204, a report generation step, which comprises outputting, according to the result of the reo-hit interpretation step, at least one of the following: patient information, disease description information, gene information, variation characteristics, an evidence list, a pathogenicity evaluation result, information on variations of unknown significance, and experimental quality control parameters.

Specifically, a gene detection report for hypertrophic cardiomyopathy is generated from modules such as patient information, disease description information, gene information, variation characteristics, the evidence list, the pathogenicity evaluation result, information on variations of unknown significance and experimental quality control parameters. In one implementation of this embodiment, individual report modules can be added or removed as required, so that a more concise report can be provided to the user.
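The selectable report modules can be sketched as follows. The module keys and the function name `build_report` are hypothetical names chosen for this illustration; the application lists the module contents but not an output format.

```python
from typing import Any, Dict, Iterable, Optional

REPORT_MODULES = (
    "patient_information",
    "disease_description",
    "gene_information",
    "variation_characteristics",
    "evidence_list",
    "pathogenicity_evaluation",
    "variations_of_unknown_significance",
    "quality_control_parameters",
)


def build_report(
    results: Dict[str, Any],
    include: Optional[Iterable[str]] = None,
) -> Dict[str, Any]:
    """Assemble a report from the selected modules, in the canonical module order."""
    selected = set(REPORT_MODULES) if include is None else set(include)
    unknown = selected - set(REPORT_MODULES)
    if unknown:
        raise ValueError(f"unknown report modules: {sorted(unknown)}")
    if not selected:
        raise ValueError("at least one report module must be selected")
    return {name: results.get(name) for name in REPORT_MODULES if name in selected}


# Hypothetical interpretation results for illustration.
example_results = {
    "patient_information": {"sample_id": "S001"},
    "pathogenicity_evaluation": "Likely pathogenic",
    "evidence_list": ["PVS1", "PM2"],
}
brief_report = build_report(
    example_results,
    include=["pathogenicity_evaluation", "patient_information"],
)
```

Keeping a fixed module order means that a shortened report still presents its sections in the same sequence as the full one.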

According to the embodiment, the detection of the related gene of the hypertrophic cardiomyopathy is completed only through the high-throughput sequencing panel of four genes, and the pathogenicity of the hypertrophic cardiomyopathy gene is independently evaluated, so that the gene detection cost of the hypertrophic cardiomyopathy is reduced, and the disease screening efficiency of the hypertrophic cardiomyopathy is greatly improved.

The embodiment also provides multiplex PCR primers for detecting hypertrophic cardiomyopathy and related genes. The multiplex PCR primers are used for amplifying the four target gene sequences MYBPC3, MYH7, TNNT2 and TNNI3, each of which comprises a plurality of amplification fragments. The sequences of the multiplex PCR primers and their amplification fragments are shown in Table 1:

TABLE 1

In one implementation of this embodiment, the multiplex PCR primers cover the exon regions of the four target genes and sequences extending at least 15 bp beyond their splice sites.

Those skilled in the art will appreciate that all or part of the functions of the various methods in the above embodiments may be implemented by hardware, or may be implemented by computer programs. When all or part of the functions of the above embodiments are implemented by a computer program, the program may be stored in a computer-readable storage medium, and the storage medium may include: a read only memory, a random access memory, a magnetic disk, an optical disk, a hard disk, etc., and the program is executed by a computer to realize the above functions. For example, the program may be stored in a memory of the device, and when the program in the memory is executed by the processor, all or part of the functions described above may be implemented. In addition, when all or part of the functions in the above embodiments are implemented by a computer program, the program may be stored in a storage medium such as a server, another computer, a magnetic disk, an optical disk, a flash disk, or a removable hard disk, and may be downloaded or copied to a memory of a local device, or may be version-updated in a system of the local device, and when the program in the memory is executed by a processor, all or part of the functions in the above embodiments may be implemented.

Therefore, as shown in fig. 2, in an embodiment of the present invention, an apparatus for detecting hypertrophic cardiomyopathy and related genes includes: a target sequence acquisition module 301, a sequencing module 302, an reo-hit interpretation module 303, a report generation module 304.

Specifically, the target sequence acquisition module 301 is configured to obtain the four target gene sequences of MYBPC3, MYH7, TNNT2 and TNNI3 from the genome sequence of a sample to be tested by PCR amplification; in one implementation of this embodiment, the four target gene sequences comprise the exon regions of the four genes plus at least 15 bp of flanking sequence extending beyond each splice site.

The sequencing module 302 is configured to perform high-throughput sequencing on the four obtained target gene sequences; in one implementation of this embodiment, the conditions for high-throughput sequencing are: sequencing depth > 300×, 1× coverage > 99%, and 20× coverage > 98%.

The reo-hit interpretation module 303 is configured to obtain variation information from the high-throughput sequencing result, annotate each variation using a human frequency database, a disease database, a variation database and pathogenicity prediction software, collect at least 50 evaluation parameters for each variation, and convert them into the 28 evaluation parameters of the ACMG guideline to obtain pathogenicity evaluation information for the sample to be tested. The human frequency database indicates the pathogenicity of a variation through its frequency: the higher the frequency, the lower the pathogenicity. The disease database indicates associations between diseases and genes, so that genes and variations related to the patient's phenotype can be searched accordingly. The variation database indicates previously reported pathogenic or benign variations and is used to adjust the weight of the detected variation. In one implementation of this embodiment, the human frequency database includes the 1000 Genomes database, the ExAC database, the gnomAD database, the EVS database and an in-house database; the disease database comprises the OMIM database and the CGD database; the variation database comprises the ClinVar database, the HGMD database and the OMIM database; and the pathogenicity prediction software includes at least one of LRT, MutationTaster, FATHMM, PROVEAN, MetaSVM, MetaLR, CADD, FATHMM-MKL coding, phyloP100way vertebrate, phyloP20way mammalian, phastCons100way vertebrate, phastCons20way mammalian, and SiPhy 29way logOdds.
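The frequency-based part of this annotation can be sketched as follows. This is a minimal sketch under stated assumptions: the function names, the database keys and the 0.001 rarity cutoff are all hypothetical, since the application only states that a higher frequency implies lower pathogenicity and does not give a numeric threshold.

```python
from typing import Dict, Optional


def max_population_frequency(freqs: Dict[str, Optional[float]]) -> Optional[float]:
    """Return the highest allele frequency seen in any database, or None if unrecorded."""
    observed = [f for f in freqs.values() if f is not None]
    return max(observed) if observed else None


def frequency_flag(freqs: Dict[str, Optional[float]], rare_threshold: float = 0.001) -> str:
    """Label a variation as 'absent', 'rare' or 'common' (threshold is an assumption)."""
    top = max_population_frequency(freqs)
    if top is None or top == 0:
        return "absent"
    return "rare" if top < rare_threshold else "common"


# Hypothetical lookups; None means the variation is not recorded in that database.
unrecorded = {"1000 Genomes": None, "ExAC": None, "gnomAD": None, "EVS": None, "in-house": None}
polymorphism = {"1000 Genomes": 0.05, "ExAC": 0.04, "gnomAD": 0.045, "EVS": None, "in-house": 0.06}
print(frequency_flag(unrecorded), frequency_flag(polymorphism))
```

Taking the maximum across databases is a conservative design choice in this sketch: a variation is treated as common if any one population resource reports it as common.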

In one implementation of this embodiment, the reo-hit interpretation module further uses the GWAS Catalog database for supplementary annotation.

The report generation module 304 is configured to output, according to the result of the reo-hit interpretation module, at least one group of the following: patient information, disease description information, gene information, variation characteristics, an evidence list, pathogenicity evaluation results, unknown variation information, and experimental quality control parameters.

Another embodiment of the present application further provides a device for detecting hypertrophic cardiomyopathy and related genes, comprising: a memory for storing a program; and a processor for implementing the following method by executing the program stored in the memory: a target sequence obtaining step, which comprises obtaining the four target gene sequences of MYBPC3, MYH7, TNNT2 and TNNI3 from the genome sequence of a sample to be tested by PCR amplification; a sequencing step, which comprises performing high-throughput sequencing on the four obtained target gene sequences; a reo-hit interpretation step, which comprises obtaining variation information from the high-throughput sequencing result, annotating each variation using a human frequency database, a disease database, a variation database and pathogenicity prediction software, collecting at least 50 evaluation parameters for each variation, and converting them into the 28 evaluation parameters of the ACMG guideline to obtain pathogenicity evaluation information for the sample to be tested, wherein the human frequency database indicates the pathogenicity of a variation through its frequency (the higher the frequency, the lower the pathogenicity), the disease database indicates associations between diseases and genes so that genes and variations related to the patient's phenotype can be searched accordingly, and the variation database indicates previously reported pathogenic or benign variations and is used to adjust the weight of the detected variation; and a report generation step, which comprises outputting, according to the result of the reo-hit interpretation step, at least one group of the following: patient information, disease description information, gene information, variation characteristics, an evidence list, a pathogenicity evaluation result, unknown variation information, and experimental quality control parameters.

Another embodiment of the present application also provides a computer-readable storage medium containing a program executable by a processor to implement the following method: a target sequence obtaining step, which comprises obtaining the four target gene sequences of MYBPC3, MYH7, TNNT2 and TNNI3 from the genome sequence of a sample to be tested by PCR amplification; a sequencing step, which comprises performing high-throughput sequencing on the four obtained target gene sequences; a reo-hit interpretation step, which comprises obtaining variation information from the high-throughput sequencing result, annotating each variation using a human frequency database, a disease database, a variation database and pathogenicity prediction software, collecting at least 50 evaluation parameters for each variation, and converting them into the 28 evaluation parameters of the ACMG guideline to obtain pathogenicity evaluation information for the sample to be tested, wherein the human frequency database indicates the pathogenicity of a variation through its frequency (the higher the frequency, the lower the pathogenicity), the disease database indicates associations between diseases and genes so that genes and variations related to the patient's phenotype can be searched accordingly, and the variation database indicates previously reported pathogenic or benign variations and is used to adjust the weight of the detected variation; and a report generation step, which comprises outputting, according to the result of the reo-hit interpretation step, at least one group of the following: patient information, disease description information, gene information, variation characteristics, an evidence list, a pathogenicity evaluation result, unknown variation information, and experimental quality control parameters.

The present application will be described in further detail with reference to specific examples. The following examples are intended to be illustrative of the present application only and should not be construed as limiting the present application.

Example 1

A blood sample was obtained from a patient with cardiomyopathy. The subject's clinical complaint was hypertrophic obstructive heart disease; the record notes heart disease at 46 years of age and a family history of surgical treatment. After genomic DNA was extracted and purified from the subject's blood sample, a genomic library was constructed, the coding regions and adjacent intron regions (15 bp) of the 4 target genes MYBPC3, MYH7, TNNT2 and TNNI3 were amplified with the multiplex PCR primer pairs, and sequencing was performed on an NGS high-throughput sequencer to obtain sequencing data. The sequencing coverage results are shown in Table 2:

TABLE 2

Sequencing quality parameter	Value
Length of target region (bp)	20k
Coverage of target region	99.9%
Average coverage depth of target region	500.17
Proportion of target region with coverage depth > 20×	99.1%
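The values in Table 2 can be checked against the sequencing conditions stated for this application (sequencing depth > 300×, 1× coverage > 99%, 20× coverage > 98%). The sketch below is illustrative only: the class and function names are hypothetical, and it assumes that the "coverage of target region" row in Table 2 corresponds to 1× coverage.

```python
from dataclasses import dataclass


@dataclass
class SequencingQC:
    mean_depth: float        # average coverage depth of the target region
    coverage_1x_pct: float   # % of target region covered at least once (assumed mapping)
    coverage_20x_pct: float  # % of target region with depth > 20x


def passes_qc(qc: SequencingQC,
              min_depth: float = 300.0,
              min_1x_pct: float = 99.0,
              min_20x_pct: float = 98.0) -> bool:
    """Return True when all three stated sequencing conditions are met."""
    return (qc.mean_depth > min_depth
            and qc.coverage_1x_pct > min_1x_pct
            and qc.coverage_20x_pct > min_20x_pct)


# Values taken from Table 2.
table2 = SequencingQC(mean_depth=500.17, coverage_1x_pct=99.9, coverage_20x_pct=99.1)
print(passes_qc(table2))
```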

The sequencing data were aligned to the human reference sequence GRCh37 using the BWA tool, and variants were called using the GATK workflow to obtain the subject's genetic variation information. Based on the variation information, 50 evaluation parameters were then collected from the human frequency database, the disease database, the variation database and the pathogenicity prediction software, yielding the annotated evaluation parameters for the MYBPC3 gene variant, which are shown in Table 3:
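The alignment and variant-calling stage might be assembled roughly as follows. This sketch only builds the command lines and does not run them. The application names BWA and GATK but not the specific subcommands or options, so the use of `bwa mem`, `samtools sort`, `samtools index` and `gatk HaplotypeCaller`, together with every file name, is an assumption made for illustration.

```python
from typing import Dict, List


def build_commands(ref: str, fq1: str, fq2: str, sample: str,
                   targets_bed: str) -> Dict[str, List[str]]:
    """Build (but do not execute) command lines for alignment and variant calling."""
    bam = f"{sample}.sorted.bam"
    vcf = f"{sample}.vcf.gz"
    # A read group with a sample name is needed by GATK; "\\t" is passed literally to bwa.
    read_group = f"@RG\\tID:{sample}\\tSM:{sample}"
    return {
        # bwa mem writes SAM to stdout; it would be piped into samtools sort.
        "align": ["bwa", "mem", "-R", read_group, ref, fq1, fq2],
        "sort": ["samtools", "sort", "-o", bam, "-"],
        "index": ["samtools", "index", bam],
        # -L restricts calling to the panel's target intervals.
        "call": ["gatk", "HaplotypeCaller", "-R", ref, "-I", bam,
                 "-L", targets_bed, "-O", vcf],
    }


# Hypothetical file names, for illustration only.
cmds = build_commands("GRCh37.fa", "sample_R1.fastq.gz", "sample_R2.fastq.gz",
                      "sample01", "hcm_panel_targets.bed")
for step, argv in cmds.items():
    print(step, " ".join(argv))
```

Keeping each command as an argument list makes it straightforward to pass to `subprocess.run` later without shell quoting issues.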

TABLE 3

Thus, a heterozygous variant, c.3190+5G>A, was detected in the subject's MYBPC3 gene. It is a splice-region variant located at exon 29 and may alter the splicing pattern of the precursor RNA, so that the resulting mature RNA retains intron sequence or lacks exon sequence, thereby changing the function of the protein. The evaluation parameters annotated for the MYBPC3 gene variant were then automatically converted into ACMG guideline evaluation parameters according to the rules defined by the ACMG guideline, specifically:

ACMG PM2: the variant is not recorded in the 1000 Genomes East Asian sub-database, the ExAC East Asian sub-database or the in-house database; it is an extremely rare variant, and extreme rarity is a characteristic of pathogenic variants;

ACMG PS4: population studies of the disease have shown that the frequency of this variant is significantly higher in the patient population than in the normal population. The presence of this variant was found in at least 10;

ACMG PP1: co-segregation of this variant with the disease has been found among multiple family members; co-segregation of the variant with hypertrophic cardiomyopathy has been reported in 7 patients from 3 families;

ACMG PS3: well-established functional validation experiments show that the variant has a deleterious effect on protein function. In vitro functional validation experiments have shown that this variant leads to exon 29 skipping, which may lead to loss of function through premature protein truncation or nonsense-mediated mRNA decay;

ACMG PP5: this variant is reported as pathogenic in the ClinVar database with a review status of 2 stars, where this star rating means: assertion criteria provided, multiple submitters, and no conflicting reports;

ACMG PP3: protein function prediction was performed on the variant using software, and multiple software tools consistently predict that the variant is likely deleterious;

ACMG PP4: the subject's phenotype and family history support the characteristics of hypertrophic obstructive cardiomyopathy.

Among these, ACMG PM2 is moderate pathogenic evidence, ACMG PS4 is strong pathogenic evidence, ACMG PP1 is supporting pathogenic evidence, ACMG PS3 is strong pathogenic evidence, ACMG PP5 is supporting pathogenic evidence, and ACMG PP3 is supporting pathogenic evidence. Combining the above ACMG evidence, the subject's MYBPC3 gene variant is classified as a pathogenic variant.
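How the listed evidence combines into a "pathogenic" call can be sketched with the combining rules of the 2015 ACMG/AMP guideline. This is a simplified illustration, not the implementation used in this application: it handles pathogenic evidence only (benign criteria are ignored), derives evidence strength from the code prefix, and the function names are hypothetical.

```python
from collections import Counter
from typing import Iterable

# Evidence strength is inferred from the ACMG code prefix (PVS, PS, PM, PP).
PREFIX_TO_STRENGTH = (
    ("PVS", "very_strong"),
    ("PS", "strong"),
    ("PM", "moderate"),
    ("PP", "supporting"),
)


def strength(code: str) -> str:
    for prefix, level in PREFIX_TO_STRENGTH:
        if code.startswith(prefix):
            return level
    raise ValueError(f"not a pathogenic ACMG evidence code: {code}")


def classify(codes: Iterable[str]) -> str:
    """Combine pathogenic evidence codes per the 2015 ACMG/AMP rules (benign side omitted)."""
    n = Counter(strength(c) for c in codes)
    vs, s, m, p = n["very_strong"], n["strong"], n["moderate"], n["supporting"]
    if ((vs >= 1 and (s >= 1 or m >= 2 or (m >= 1 and p >= 1) or p >= 2))
            or s >= 2
            or (s == 1 and (m >= 3 or (m >= 2 and p >= 2) or (m >= 1 and p >= 4)))):
        return "Pathogenic"
    if ((vs >= 1 and m >= 1)
            or (s >= 1 and m >= 1)
            or (s >= 1 and p >= 2)
            or m >= 3
            or (m >= 2 and p >= 2)
            or (m >= 1 and p >= 4)):
        return "Likely pathogenic"
    return "Uncertain significance"


# Evidence codes with the strengths listed for the MYBPC3 variant in Example 1.
example_codes = ["PM2", "PS4", "PP1", "PS3", "PP5", "PP3"]
print(classify(example_codes))
```

In Example 1 the two strong items (PS4 and PS3) are already sufficient under the "two or more strong" rule, so the moderate and supporting items do not change the outcome.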

A genetic test report for the patient's hypertrophic cardiomyopathy is then output according to the patient information, the disease description information, the gene information, the variation characteristics, the evidence list, the pathogenicity evaluation result and the experimental quality control parameters.
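The report contents listed above could be held in a simple record before rendering. The following is a minimal sketch under stated assumptions: the `DetectionReport` class, its field names and the JSON output format are hypothetical, while the example values are taken from Example 1 and Table 2.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Dict, List


@dataclass
class DetectionReport:
    patient_info: Dict[str, str]
    disease_description: str
    gene: str
    variant: str
    variant_features: Dict[str, str]
    evidence_list: List[str]
    pathogenicity_result: str
    qc_parameters: Dict[str, float]
    unknown_variants: List[str] = field(default_factory=list)


report = DetectionReport(
    patient_info={"sample_type": "blood"},
    disease_description="hypertrophic cardiomyopathy",
    gene="MYBPC3",
    variant="c.3190+5G>A",
    variant_features={"zygosity": "heterozygous", "type": "splice region"},
    evidence_list=["PM2", "PS4", "PP1", "PS3", "PP5", "PP3", "PP4"],
    pathogenicity_result="pathogenic",
    qc_parameters={"mean_depth": 500.17, "coverage_pct": 99.9, "coverage_20x_pct": 99.1},
)

# Serialize to JSON; a real report would likely be rendered into a document template.
report_json = json.dumps(asdict(report), indent=2)
print(report_json)
```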

The present application has been described with reference to specific examples, which are provided only to aid understanding of the present application and are not intended to limit the present application. For a person skilled in the art to which the application pertains, several simple deductions, modifications or substitutions may be made according to the idea of the application.

Claims

1. A method for detecting hypertrophic cardiomyopathy and related genes, characterized by comprising the following steps:

and a report generation step, which comprises outputting at least one group of patient information, disease description information, genetic information, variation characteristics, an evidence list, pathogenicity evaluation results, unknown variation information and experiment quality control parameters according to the result of the reo-hit interpretation step.

2. The detection method according to claim 1, characterized in that: the four target gene sequences comprise the exon regions of the four genes plus at least 15 bp of flanking sequence extending beyond each splice site.

3. The detection method according to claim 1, characterized in that: the conditions for high-throughput sequencing are sequencing depth > 300×, 1× coverage > 99%, and 20× coverage > 98%.

4. The detection method according to claim 1, characterized in that: the human frequency database comprises the 1000 Genomes database, the ExAC database, the gnomAD database, the EVS database and an in-house database;

preferably, the disease database comprises an OMIM database and a CGD database;

preferably, the pathogenicity prediction software comprises at least one of LRT, MutationTaster, FATHMM, PROVEAN, MetaSVM, MetaLR, CADD, FATHMM-MKL coding, phyloP100way vertebrate, phyloP20way mammalian, phastCons100way vertebrate, phastCons20way mammalian, and SiPhy 29way logOdds;

5. A multiplex PCR primer for detecting hypertrophic cardiomyopathy and related genes, characterized in that: the multiplex PCR primers are used to amplify the four target gene sequences of MYBPC3, MYH7, TNNT2 and TNNI3.

6. The multiplex PCR primer according to claim 5, characterized in that: the multiplex PCR primers cover the exon regions of the four target genes plus at least 15 bp of flanking sequence extending beyond each splice site.

7. A detection device for hypertrophic cardiomyopathy and related genes, characterized in that: it comprises a target sequence acquisition module, a sequencing module, a reo-hit interpretation module and a report generation module,

the target sequence acquisition module is configured to obtain the four target gene sequences of MYBPC3, MYH7, TNNT2 and TNNI3 from the genome sequence of a sample to be tested by PCR amplification;

8. The detection device according to claim 7, characterized in that: the four target gene sequences comprise the exon regions of the four genes plus at least 15 bp of flanking sequence extending beyond each splice site;

preferably, the conditions for high throughput sequencing are sequencing depth >300 ×, 1 × coverage > 99%, 20 × coverage > 98%;

preferably, the disease database comprises an OMIM database and a CGD database;

9. A detection device for hypertrophic cardiomyopathy and related genes, characterized in that: the device includes a memory and a processor;

the memory is configured to store a program;

the processor is configured to implement the method for detecting hypertrophic cardiomyopathy and related genes according to any one of claims 1-4 by executing the program stored in the memory.

10. A computer-readable storage medium characterized by: the storage medium stores a program that can be executed by a processor to implement the method for detecting hypertrophic cardiomyopathy and related genes according to any one of claims 1 to 4.