CN102533992B - Method and kit for sequencing phenylalanine hydroxylase gene

Info

Publication number: CN102533992B
Application number: CN201110445128.5A
Authority: CN
Inventors: 盛司潼
Original assignee: Individual
Current assignee: Individual
Priority date: 2011-12-27
Filing date: 2011-12-27
Publication date: 2014-05-07
Anticipated expiration: 2031-12-27
Also published as: CN102533992A

Abstract

The invention relates to the field of genetic engineering and provides a method and a kit for sequencing the phenylalanine hydroxylase (PAH) gene. The method comprises the following steps: A. amplifying a plurality of target regions in a sample to be tested using PAH gene-specific primers, and constructing a sequencing library from the amplification products; B. performing single-molecule amplification on the sequencing library to obtain a plurality of single-molecule amplification products corresponding to the plurality of target regions; and C. performing high-throughput sequencing on the plurality of single-molecule amplification products to obtain the sequence information of the plurality of target regions. With the method and the kit, the plurality of target regions are sequenced simultaneously by high-throughput sequencing, so that detection efficiency is improved, the accuracy and sensitivity of detection can also be improved, and multi-region detection can furthermore be carried out on a large number of samples simultaneously.

Description

Method and kit for sequencing the phenylalanine hydroxylase gene

Technical field

The present invention relates to the field of genetic engineering and, more particularly, to a method and a kit for sequencing the phenylalanine hydroxylase gene.

Background art

The phenylalanine hydroxylase (PAH) gene is located on chromosome 12 (12q22-12q24.1) and consists of approximately 1.5 Mb. Its coding region comprises 13 exons and 12 introns; the mRNA is 1353 bp long and is translated into an enzyme monomer of 451 amino acids. Mutations of the PAH gene have the following characteristics: (1) the mutation sites are variable; (2) the mutation types are diverse. More than 530 PAH gene mutation types have been found to date, distributed across all exons; more than 30 of these have been found in China, and more than 60% of them are missense mutations. PAH is an amino acid metabolic enzyme produced by the liver. It catalyzes the conversion of phenylalanine to tyrosine and thereby participates in gluconeogenesis, and it is an essential enzyme in intracellular phenylalanine metabolism. Sequencing the PAH gene to determine its sequence in a given sample is therefore of great significance.

At present, the common method for sequencing the PAH gene is Sanger sequencing, which can be used to examine a target region. However, Sanger sequencing can sequence only one segment (no more than 900 bp) of the PAH gene of one sample at a time, so detection efficiency is low and sequencing cost is high. In addition, owing to the principle of Sanger sequencing, the signal obtained from a heterozygous sample shows double or even multiple peaks; the resulting chromatogram is disordered, the sequence information may be impossible to analyze, and sequencing fails. Numerous studies have confirmed that direct sequencing of PCR products can only detect templates that make up >=20% of a mixed template, and cannot give the exact variant proportion at a mutated site. Prior-art kits based on Sanger sequencing have the same defects.

Therefore, a new method and a new kit for examining the phenylalanine hydroxylase gene are needed that can examine multiple regions of the PAH gene simultaneously, improving detection efficiency as well as the accuracy and sensitivity of detection, and that can furthermore test a large number of samples simultaneously.

Summary of the invention

The object of the present invention is to provide a method and a kit for sequencing the phenylalanine hydroxylase gene that can examine multiple target regions of the PAH gene simultaneously, improving detection efficiency as well as the accuracy and sensitivity of detection, and that can furthermore test a large number of samples simultaneously.

The present invention is achieved as follows: a method for sequencing the phenylalanine hydroxylase gene, comprising the following steps:

A. amplifying multiple target regions in a sample to be tested using PAH gene-specific primers, and constructing a sequencing library from the amplification products;

B. performing single-molecule amplification on the sequencing library to obtain multiple single-molecule amplification products corresponding to the multiple target regions;

C. performing high-throughput sequencing on the multiple single-molecule amplification products simultaneously to obtain the sequence information of the multiple target regions.

Wherein, step A comprises the following steps:

A1. amplifying multiple target regions in the sample to be tested using PAH gene-specific primers to obtain amplification products corresponding to the multiple target regions;

A2. ligating an adapter component to the amplification products corresponding to the multiple target regions to obtain a sequencing library; the adapter component is at least one of a blunt-end adapter, a protruding-end adapter, an adapter with a stem-loop structure, and a Y-shaped adapter.

Wherein, step A2 comprises the following steps:

A21. fragmenting the amplification products corresponding to the multiple target regions to obtain fragmentation products;

A22. ligating the adapter component to the fragmentation products to construct the sequencing library.

The length of the target-region sequencing library in step A is not particularly limited; it is preferably 25 bp to 500 bp, more preferably 50 bp to 200 bp, and still more preferably 70 bp to 130 bp.

Wherein, step A22 comprises the following steps:

A221. ligating a first adapter to both ends of the fragmentation product to obtain a first ligation product;

A222. circularizing the first ligation product to obtain a circularized product;

A223. digesting the circularized product with a type IIS restriction endonuclease to obtain a digestion product;

A224. ligating a second adapter and a third adapter to the two ends of the digestion product to obtain the sequencing library.

Alternatively, step A22 comprises the following steps:

A221'. ligating a fourth adapter to the fragmentation product to obtain a second ligation product;

A222'. digesting the second ligation product with a type IIS restriction endonuclease to obtain a digested fragment carrying the fourth adapter;

A223'. ligating the digested fragment carrying the fourth adapter to a fifth adapter to form the sequencing library.

Wherein, at least one adapter in the adapter component of step A2 includes a first tag sequence, which marks the sequencing libraries of different samples during library construction. The first tag sequence is preferably a nucleic acid molecule with a specific base sequence, and its number of bases is not limited.

Further, the first tag sequence has 3 to 20 bases, more preferably 4 to 10.

Wherein, the specific primers in step A are fully or partially complementary to the target region.

Further, among the specific primers corresponding to each target region, at least one primer is partially complementary to that target region, and the 5' end of this partially complementary primer includes a second tag sequence. The second tag sequence marks the target-region amplification products of different samples during target-region amplification. The second tag sequence is preferably a nucleic acid molecule with a specific base sequence, and its number of bases is not limited.

Further, the second tag sequence has 3 to 20 bases, more preferably 4 to 10.

Wherein, in step A, the target regions of the PAH gene of the same sample are amplified all together, partly together, or each independently.

Wherein, the PAH gene-specific primers in step A comprise: SEQ ID NO:1 and SEQ ID NO:26; at least one of SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:6, SEQ ID NO:8, SEQ ID NO:10, SEQ ID NO:12, SEQ ID NO:14, SEQ ID NO:16, SEQ ID NO:18, SEQ ID NO:20, SEQ ID NO:22 and SEQ ID NO:24; and at least one of SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:9, SEQ ID NO:11, SEQ ID NO:13, SEQ ID NO:15, SEQ ID NO:17, SEQ ID NO:19, SEQ ID NO:21, SEQ ID NO:23 and SEQ ID NO:25.

Wherein, the single-molecule amplification method in step B is at least one of emulsion PCR and bridge PCR.

Wherein, the high-throughput sequencing technology in step C is polymerase-based sequencing by synthesis or ligase-based sequencing by ligation.

The present invention also provides a kit that can be used in any of the sequencing methods of the present invention. The present invention is achieved as follows: a kit for sequencing the phenylalanine hydroxylase gene, comprising:

PAH gene-specific primers, for amplifying multiple target regions of the sample to be tested;

an adapter component, for joining with the amplification products to construct a sequencing library.

Wherein, the adapter component is at least one of a blunt-end adapter, a protruding-end adapter, an adapter with a stem-loop structure, and a Y-shaped adapter.

Wherein, at least one adapter in the adapter component includes a first tag sequence, which marks the sequencing libraries of different samples during library construction. The first tag sequence is preferably a nucleic acid molecule with a specific base sequence; its number of bases is not limited and is preferably 3 to 20.

Wherein, the PAH gene-specific primers are fully or partially complementary to the target region.

Further, among the PAH gene-specific primers corresponding to each target region, at least one primer is partially complementary to that target region, and the 5' end of this partially complementary primer includes a second tag sequence, which marks the target-region amplification products of different samples during target-region amplification. The second tag sequence is preferably a nucleic acid molecule with a specific base sequence; its number of bases is not limited and is preferably 3 to 20, more preferably 4 to 10.

Wherein, the PAH gene-specific primers comprise: SEQ ID NO:1 and SEQ ID NO:26; at least one of SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:6, SEQ ID NO:8, SEQ ID NO:10, SEQ ID NO:12, SEQ ID NO:14, SEQ ID NO:16, SEQ ID NO:18, SEQ ID NO:20, SEQ ID NO:22 and SEQ ID NO:24; and at least one of SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:9, SEQ ID NO:11, SEQ ID NO:13, SEQ ID NO:15, SEQ ID NO:17, SEQ ID NO:19, SEQ ID NO:21, SEQ ID NO:23 and SEQ ID NO:25.

Compared with the prior art, the method and kit of the present invention use high-throughput sequencing to perform deep sequencing on multiple target regions of the PAH gene simultaneously, which improves detection efficiency as well as the accuracy and sensitivity of detection, and furthermore allows multi-region sequencing of a large number of samples simultaneously.

Brief description of the drawings

Fig. 1 is a flow chart of the method for sequencing the PAH gene in one embodiment of the present invention;

Fig. 2 is a structural diagram of the single protruding-end adapter in one embodiment of the present invention;

Fig. 3 is a structural diagram of the double protruding-end adapter in one embodiment of the present invention;

Fig. 4 is a structural diagram of the adapter with a stem-loop structure in one embodiment of the present invention;

Fig. 5 is a structural diagram of the Y-shaped adapter in one embodiment of the present invention;

Fig. 6 is a structural diagram of the T-end Y-shaped adapter in one embodiment of the present invention;

Fig. 7 is a structural diagram of the Y-shaped adapter in another embodiment of the present invention;

Fig. 8 is a structural diagram of the dideoxy Y-shaped adapter in one embodiment of the present invention;

Fig. 9 is a flow chart of the method for constructing a sequencing library from fragmentation products and an adapter component in one embodiment of the present invention;

Fig. 10 is a flow chart of the method for constructing a sequencing library from fragmentation products and an adapter component in another embodiment of the present invention;

Fig. 11 is a flow chart of the method for constructing a sequencing library from fragmentation products and an adapter component in yet another embodiment of the present invention.

Detailed description of the embodiments

In order to make the object, technical solution and advantages of the present invention clearer, the present invention is further described below with reference to the drawings and embodiments.

A target region in the present invention is any sequence of the PAH gene and can be selected as required. Target regions include, but are not limited to, internal sequences of the gene and external regulatory regions of the gene. The internal sequences of the gene include, but are not limited to, intron regions, exon regions, and regions containing both introns and exons.

Fig. 1 shows the flow of a method for sequencing the phenylalanine hydroxylase gene according to the present invention. The method comprises the following steps:

S1. amplifying multiple target regions in a sample to be tested using PAH gene-specific primers, and constructing a sequencing library from the amplification products;

S2. performing single-molecule amplification on the sequencing library to obtain multiple single-molecule amplification products corresponding to the multiple target regions;

S3. performing high-throughput sequencing on the multiple single-molecule amplification products simultaneously to obtain the sequence information of the multiple target regions.

The method performs deep sequencing of the target regions of the sample by high-throughput sequencing. It can examine multiple target regions of the PAH gene simultaneously, which improves detection efficiency as well as the accuracy and sensitivity of detection, and it accurately yields the sequence information of these regions, including the variants present at each mutation site (known and unknown mutations) and the frequency of variation at each mutation site. In addition, by controlling the size of each target-region amplification product, the fragmentation step can be omitted, which reduces the number of experimental steps, improves experimental efficiency and lowers cost.

The sequence information obtained by the method can be used for various kinds of scientific research, including but not limited to population sequence analysis, gene function research and protein function research.

It should be noted that:

The sample to be tested in step S1 is any form of sample from which nucleic acid can be extracted, including but not limited to whole blood, serum, plasma and tissue samples; tissue samples include but are not limited to paraffin-embedded tissue, fresh tissue and frozen sections.

The sequencing library obtained in step S1 contains many library molecules. Performing single-molecule amplification on the sequencing library means that the library molecules are spatially isolated in very small numbers (even as single molecules), while still belonging to the same reaction system as a whole, and are amplified within their respective spaces, so as to boost the signal of each molecule in the subsequent sequencing reaction.

In the prior art, Sanger sequencing can sequence only one segment of one sample at a time, so sequencing multiple target regions requires multiple reactions. In the present invention, after single-molecule amplification each sequencing library molecule forms a single-molecule copy array. During high-throughput sequencing each single-molecule copy array occupies a different position, so that hybridization between the sequencing primer and the single-molecule copy arrays, and extension under the action of the enzyme, can proceed simultaneously without mutual interference. Sequencing reactions can therefore be carried out simultaneously on a very large number of single-molecule copy arrays (millions, tens of millions, even hundreds of millions or billions), and the required sequence information is then obtained by collecting the corresponding signals at the same time. The sensitivity of sequencing is also higher than that of Sanger sequencing.

Wherein, the PAH gene-specific primers in step S1 are fully or partially complementary to the target region.

Further, in each primer pair for a PAH gene target region, at least one primer is partially complementary to the target region, and the 5' end of this primer carries a second tag sequence. The second tag sequence marks the target-region amplification products of different samples during target-region amplification.

The second tag sequence is preferably a nucleic acid molecule with a specific sequence, and its number of bases is not limited. The number of bases of the second tag sequence is preferably 3 to 20, more preferably 4 to 10.
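As an illustration of how a tag at the 5' end of a primer lets amplification products be traced back to their sample, here is a minimal Python sketch. The tag sequences, the target-specific primer portion and the sample names are hypothetical placeholders chosen for the example; they are not taken from the sequence listing of this patent, and real assignment would also have to tolerate sequencing errors.

```python
from typing import Dict, Optional

# Hypothetical 4-base second tags, one per sample (not from the patent).
SAMPLE_TAGS: Dict[str, str] = {
    "sample_A": "ACGT",
    "sample_B": "TGCA",
    "sample_C": "GATC",
}

# Hypothetical target-specific portion of a primer, written 5'->3'.
TARGET_SPECIFIC = "CCTGTGGTTTTGGTCTTAGG"


def tagged_primer(tag: str, target_specific: str) -> str:
    """Return a primer whose 5' end carries the tag (sequence written 5'->3')."""
    return tag + target_specific


def assign_sample(read: str, tags: Dict[str, str]) -> Optional[str]:
    """Assign a read to the sample whose tag matches the start of the read."""
    for sample, tag in tags.items():
        if read.startswith(tag):
            return sample
    return None  # no known tag at the start of the read


# One tagged primer per sample; each amplification product starts with its tag.
primers = {s: tagged_primer(t, TARGET_SPECIFIC) for s, t in SAMPLE_TAGS.items()}
reads = [primers["sample_B"] + "AAAC", primers["sample_A"] + "GGTT", "NNNN"]
assignments = [assign_sample(r, SAMPLE_TAGS) for r in reads]
```

In this sketch the tags all have the same length and no tag is a prefix of another, so exact prefix matching is unambiguous.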

In addition, the specific primers may also carry other labels, including but not limited to biotin labels, polyhistidine labels, antigens and antibodies, which makes purification of the target-region amplification products very convenient.

In addition, the different target regions of the same sample may be amplified all together, each independently, or partly together. In a specific experiment, any of these schemes can be chosen as required.

If each target region is amplified separately, the amount of each amplification product can be measured to ensure that the numbers of molecules of the target-region amplification products used to construct the target-region sequencing library in step S1 are consistent. The amplification step then does not introduce copy-number differences between the target regions of the same sample that would affect the subsequent sequencing results.
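One way to equalize molecule numbers from measured amounts is sketched below in Python. It assumes the common approximation of about 660 g/mol per base pair of double-stranded DNA; the amplicon names, concentrations, lengths and the target amount are hypothetical values chosen only for illustration.

```python
AVG_BP_MASS = 660.0  # g/mol per base pair of dsDNA (standard approximation)


def picomoles(mass_ng: float, length_bp: int) -> float:
    """Convert a mass of double-stranded DNA (ng) to picomoles of molecules."""
    return mass_ng * 1000.0 / (AVG_BP_MASS * length_bp)


def volume_for_equal_molecules(conc_ng_per_ul: float, length_bp: int,
                               target_pmol: float) -> float:
    """Volume (uL) of one amplicon that contributes target_pmol molecules."""
    pmol_per_ul = picomoles(conc_ng_per_ul, length_bp)
    return target_pmol / pmol_per_ul


# Hypothetical amplicons: name -> (concentration in ng/uL, length in bp).
amplicons = {"exon_1": (20.0, 300), "exon_7": (35.0, 450)}

# Volume of each amplicon needed so that every one contributes 0.05 pmol.
volumes = {
    name: volume_for_equal_molecules(conc, length, 0.05)
    for name, (conc, length) in amplicons.items()
}
```

Pooling by molecule number rather than by mass matters because, at equal mass, a shorter amplicon contributes more molecules than a longer one.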

Of course, when the target regions are similar in size and in GC content, step S1 can amplify multiple target regions simultaneously by multiplex PCR with rationally designed primers, while keeping the amplification efficiency of the target regions essentially consistent. This effectively improves experimental efficiency and reduces the cost of the reaction.
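The "similar size, similar GC content" condition can be checked with a few lines of Python. This is only a sketch: the thresholds `max_len_ratio` and `max_gc_diff` are hypothetical defaults, not values stated in this document, and a real multiplex design would also consider primer annealing temperatures and primer-primer interactions.

```python
def gc_content(seq: str) -> float:
    """Fraction of G and C bases in a non-empty sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def compatible(a: str, b: str, max_len_ratio: float = 1.5,
               max_gc_diff: float = 0.10) -> bool:
    """True if two target sequences are close in length and in GC content.

    The thresholds are illustrative assumptions, not values from the patent.
    """
    len_ratio = max(len(a), len(b)) / min(len(a), len(b))
    gc_diff = abs(gc_content(a) - gc_content(b))
    return len_ratio <= max_len_ratio and gc_diff <= max_gc_diff
```

For example, `compatible("ATGCATGC", "ATGCATGG")` is true (same length, same GC fraction), whereas an AT-only and a GC-only sequence of the same length are rejected.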

When there are multiple samples to be tested, the target regions of different samples must be amplified separately.

Wherein, the target regions in step S1 comprise at least one of exon 1, exon 2, exon 3, exon 4, exon 5, exon 6, exon 7, exon 8, exon 9, exon 10, exon 11, exon 12 and exon 13 of the PAH gene.

Further, the PAH gene-specific primers comprise: SEQ ID NO:1 and SEQ ID NO:26; at least one of SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:6, SEQ ID NO:8, SEQ ID NO:10, SEQ ID NO:12, SEQ ID NO:14, SEQ ID NO:16, SEQ ID NO:18, SEQ ID NO:20, SEQ ID NO:22 and SEQ ID NO:24; and at least one of SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:9, SEQ ID NO:11, SEQ ID NO:13, SEQ ID NO:15, SEQ ID NO:17, SEQ ID NO:19, SEQ ID NO:21, SEQ ID NO:23 and SEQ ID NO:25.
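The membership rule for the primer set (two required primers plus at least one from each of two groups) can be written as a small check. The Python sketch below encodes only the rule as stated here, using SEQ ID numbers as integers; the function name is hypothetical, and the sketch says nothing about which primers pair with which.

```python
from typing import Iterable

REQUIRED = {1, 26}                  # SEQ ID NO:1 and SEQ ID NO:26
EVEN_GROUP = set(range(2, 25, 2))   # SEQ ID NO:2, 4, ..., 24
ODD_GROUP = set(range(3, 26, 2))    # SEQ ID NO:3, 5, ..., 25


def satisfies_primer_rule(seq_ids: Iterable[int]) -> bool:
    """True if the set has both required primers and one from each group."""
    ids = set(seq_ids)
    return (REQUIRED <= ids
            and bool(ids & EVEN_GROUP)
            and bool(ids & ODD_GROUP))
```

For instance, `{1, 2, 3, 26}` satisfies the rule, while `{1, 2, 26}` does not, because it contains no primer from the odd-numbered group.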

In selecting primers with the above specific sequences, one basic principle must be followed: the selected primers must not cause the same fragment of the PAH gene to be amplified repeatedly in the subsequent amplification.
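One possible reading of this principle is that no two amplicons produced by the selected primers should cover the same stretch of the gene. Under that assumption, a minimal Python sketch of the check looks like this; the amplicon names and coordinates are hypothetical.

```python
from itertools import combinations
from typing import Dict, List, Tuple

Interval = Tuple[int, int]  # half-open (start, end) positions on the gene


def overlaps(a: Interval, b: Interval) -> bool:
    """True if two half-open intervals share at least one position."""
    return a[0] < b[1] and b[0] < a[1]


def redundant_pairs(amplicons: Dict[str, Interval]) -> List[Tuple[str, str]]:
    """List every pair of amplicons that would amplify a shared fragment."""
    items = sorted(amplicons.items())
    return [
        (name_1, name_2)
        for (name_1, iv_1), (name_2, iv_2) in combinations(items, 2)
        if overlaps(iv_1, iv_2)
    ]


# Hypothetical amplicon coordinates for illustration only.
example = {"amp1": (0, 500), "amp2": (450, 900), "amp3": (1000, 1400)}
conflicts = redundant_pairs(example)
```

Here `amp1` and `amp2` share positions 450 to 499, so a primer selection yielding both would be rejected under this reading.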

In one embodiment, step S1 is implemented as follows:

S11. amplifying multiple target regions in the sample to be tested using PAH gene-specific primers to obtain amplification products corresponding to the multiple target regions;

S12. ligating an adapter component to the amplification products corresponding to the multiple target regions to obtain a sequencing library; the adapter component is at least one of a blunt-end adapter, a protruding-end adapter, an adapter with a stem-loop structure, and a Y-shaped adapter.

It should be noted that:

In step S11, the multiple target regions of the same sample may be amplified all together, each independently, or partly together. This can be adjusted according to the actual situation, for example the annealing temperature of the specific amplification primers, the size and GC content of the target regions to be amplified, and the number of target regions to be amplified. The target regions of different samples must be amplified separately.

In step S12, the adapter component can be joined to the amplification products in various ways, including ligating the adapter component directly to the amplification products, or ligating it after the amplification products have been processed.

The adapter component in step S12 is used to construct the sequencing library and may comprise one or more adapters.

Wherein, at least one adapter in the adapter component of step S12 includes a first tag sequence, which marks the sequencing libraries of different samples during library construction. In this way, once the target-region sequencing libraries have been obtained separately, the target-region sequencing libraries of different samples can be mixed in the same reaction system for single-molecule amplification and then sequenced together by high-throughput sequencing. This improves the efficiency of the sequencing reaction and reduces the cost of sample detection.

The first tag sequence is preferably a nucleic acid molecule with a specific base sequence, and its number of bases is not limited. Further, the first tag sequence has 3 to 20 bases, so that at least 4^3 (that is, 64) samples can be tested simultaneously each time. Taking into account factors such as the specificity of the tag, the cost of the adapter and the length of the adapter, the number of bases of the first tag sequence is preferably 4 to 10. By combining the second tag with the first tag, the present invention can test at least 4^3 × 4^3 samples, i.e. 4096 samples, at a time.
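The sample-capacity arithmetic can be reproduced with a short Python sketch. It simply counts all possible sequences over the four bases; the function names are hypothetical, and in practice fewer tags would be usable because tags must remain distinguishable despite sequencing errors.

```python
def tag_count(length: int) -> int:
    """Number of distinct tag sequences of a given length over A, C, G, T."""
    return 4 ** length


def max_samples(first_tag_len: int, second_tag_len: int) -> int:
    """Upper bound on samples distinguishable by combining both tags."""
    return tag_count(first_tag_len) * tag_count(second_tag_len)
```

With the shortest stated tag length of 3 bases, `tag_count(3)` gives 64 and `max_samples(3, 3)` gives 4096, matching the figures in the text.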

The adapter can be modified in several ways, including but not limited to biotinylation, methylation, or both. In one embodiment, the adapter is biotinylated and is ligated to non-biotinylated fragmentation products; the presence of biotin facilitates the separation and purification of the constructed sequencing library. In another embodiment, the adapter is methylated and is ligated to unmethylated fragmentation products, and the ligation products are then digested with a restriction endonuclease that cuts only methylated DNA. Because only successfully ligated products can be cut, the uniformity of the digestion products is ensured.

The adapter can also take several structural forms, including but not limited to a blunt-end adapter, a protruding-end adapter, an adapter with a stem-loop structure, and a Y-shaped adapter. One or more adapters may be used in constructing the sequencing library. Protruding-end adapters, adapters with a stem-loop structure and Y-shaped adapters can all effectively prevent adapters from ligating to one another during ligation. Several embodiments of these adapter forms are given below.

In the first embodiment, the adapter component is a blunt-end adapter, which is a fully complementary double-stranded nucleic acid molecule.

In the second embodiment, the adapter component is a protruding-end adapter. This adapter is a double-stranded nucleic acid molecule that comprises at least one protruding end. The number of bases of the protruding end is not specifically limited and is preferably 1 to 10. According to the structure of the double-stranded nucleic acid molecule, protruding-end adapters fall into two classes: single protruding-end adapters and double protruding-end adapters.

A single protruding-end adapter is shown in Fig. 2: one end is blunt and the other end is protruding. The end carrying the protruding end can prevent adapter self-ligation. To prevent a single protruding-end adapter from self-ligating through its blunt end, the 3' OH at the blunt end can be modified (including but not limited to blocking the hydroxyl group with an amino group), or the 5' phosphate group at the blunt end can be removed.

A double protruding-end adapter is shown in Fig. 3. It has two protruding ends, which may lie on the same nucleotide strand (Fig. 3a) or on different nucleotide strands (Fig. 3b). When the two protruding ends lie on different strands, they are not complementary to each other, so that the adapter does not self-ligate during ligation.
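Whether two protruding ends are complementary can be tested by comparing one with the reverse complement of the other. The Python sketch below assumes both overhangs are of the same polarity (both 5' or both 3') and are written 5'->3'; the example sequences in the note are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5'->3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def can_anneal(overhang_a: str, overhang_b: str) -> bool:
    """True if two same-polarity overhangs are fully complementary.

    Such a pair could base-pair with each other, which is what a double
    protruding-end adapter is designed to avoid.
    """
    return overhang_a.upper() == reverse_complement(overhang_b)
```

For example, the overhangs `ACGG` and `CCGT` can anneal and would be an unsuitable pair, while `ACGG` and `TTCA` cannot. A palindromic overhang such as `AATT` is complementary to itself and is therefore also unsuitable.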

In the third embodiment, the adapter component is an adapter with a stem-loop structure, as shown in Fig. 4. This adapter is a single-stranded nucleic acid molecule comprising a first complementary pairing region 1, a loop region 2 and a second complementary pairing region 3 (Fig. 4a). The first complementary pairing region 1 can pair with the second complementary pairing region 3, and the paired region they form contains at least one restriction enzyme recognition site. Through this recognition site, a specific endonuclease can cut open or excise the loop region, converting the single-stranded nucleic acid molecule into a double-stranded nucleic acid molecule for subsequent operations. Preferably, as shown in Fig. 4b, the adapter with a stem-loop structure may also carry a protruding end 4, which may be located at the 5' end or the 3' end of the single-stranded nucleic acid molecule. The presence of protruding end 4 further prevents adapter self-ligation. This protruding end is preferably T.

In the fourth embodiment, the adapter component is a Y-shaped adapter, as shown in Fig. 5. This adapter is a double-stranded nucleic acid molecule comprising a complementary region and a fork region, and each of the two strands of the fork region comprises at least one amplification primer binding site. Preferably, the complementary region of the Y-shaped adapter comprises at least one restriction enzyme recognition site, which can be cut during library construction to form an end for subsequent operations.

The fork design of the Y-shaped adapter prevents adapters from ligating to one another during library construction. The amplification primer binding sites in the fork region can be used directly to bind amplification primers for the amplification reaction.

Wherein, each strand of the fork region of the Y-shaped adapter contains N nucleotides; preferably, 9≤N≤30. The number of paired nucleotide pairs in the complementary region of the Y-shaped adapter is not limited; it is preferably 7 to 15, and more preferably 9 to 13.

Wherein, the 3' end of the complementary region of the Y-shaped adapter is a protruding end or a blunt end. Preferably, the 3' end of the complementary region is a protruding end that can pair with the sticky end of the fragmentation product, which improves ligation efficiency and helps the library construction reaction proceed smoothly.

Preferably, the Y-shaped adapter is a T-end Y-shaped adapter: the 3' end of its complementary region is a protruding end whose last base is T. An example of a T-end Y-shaped adapter is shown in Fig. 6, in which N is any of the bases A, T, C and G.

Preferably, the 3' end of the complementary region of the Y-shaped adapter is a protruding end whose nucleotides include universal bases. The number of bases of the protruding end is not particularly limited and is preferably 1 to 4. An example of such a Y-shaped adapter is shown in Fig. 7, in which N is any of the bases A, T, C and G, and X is a universal base.

Preferably, the Y-shaped adapter is a dideoxy adapter: the 3' end of its complementary region is a blunt end, and the last nucleotide at the 3' end is a dideoxynucleotide. An example of a dideoxy Y-shaped adapter is shown in Fig. 8, in which N is any of the bases A, T, C and G, and dd indicates that the last nucleotide at the 3' end is a dideoxycytidine nucleotide.

It should be noted that the adapter components described above are only some of the possible embodiments and are not intended to limit the scope of the present invention.

About the implementation of step S12:

In one embodiment, step S12 adopts joint component and the direct-connected mode of amplified production, constructs sequencing library.

In another embodiment, when target area amplified production is larger or need the target area sequencing library that builds hour, after amplified production being carried out to fragmentation processing, fragmentation product is connected with joint component again, build sequencing library, as shown in Figure 9, step S12 comprises the following steps:

S121. the amplified production corresponding with described multiple target areas carried out to fragmentation, obtain fragmentation product;

S122. utilize joint component, be connected with fragmentation product, build sequencing library.

By fragmentation step S121, target area amplified production is become to less fragmentation product, thereby contribute to the further degree of depth order-checking to target area.In addition, can also, by fragmentation processing, different amplified productions be become to the similar fragmentation product of length, can contribute to follow-up unified order-checking.

It should be noted that:

In step S121, the method for described fragmentation target area amplified production has multiple, includes but not limited to: ultrasonic method, spray method, chemical shearing method and enzyme cutting method.Can be according to practical situation, adopt the method adapting to test.

Described fragmentation also can comprise the separation and purification of fragmentation product and end modified step after processing.According to the fragment length needs of order-checking, the nucleic acid fragment obtaining for fragmentation, carries out the separation and purification of object nucleic acid fragment, and separation method can adopt common method, as gel electrophoresis, saccharose gradient or cesium chloride gradient sedimentation, column chromatography for separation etc.According to used fragmentation method, further end modified to the object nucleic acid fragment of gained, include but not limited to: phosphorylation or dephosphorylation, end-filling and end add A, so that follow-up joint component connects.Above-mentioned purpose nucleic acid fragment length is not limit, and is preferably 25bp～500bp, more preferably 30bp～200bp, more preferably 40bp～100bp.

The length of the fragmentation products in step S121 is not limited; it is preferably 25 bp to 500 bp, more preferably 30 bp to 200 bp, and still more preferably 40 bp to 100 bp. Provided the target regions can still be sequenced, the shorter the target-region fragment contained in each library molecule, the greater the depth at which high-throughput sequencing covers the target regions. The greater the sequencing depth, the more times each base position of a target region is read, the more accurate the result, and the more sensitive the detection of low-abundance mutations in the sample. This effectively prevents inaccurate results caused by a mutation making up too small a proportion of the target region in the sample, which would otherwise give too low an absolute sequencing signal for that mutation.
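The relationship between sequencing depth and sensitivity to low-abundance mutations can be illustrated with a simple binomial model. This is only a minimal sketch: the model (independent reads, no sequencing error), the function name and the `min_reads` threshold are assumptions made for illustration and are not part of the described method.

```python
from math import comb

def detection_probability(depth: int, mutant_fraction: float, min_reads: int = 3) -> float:
    """Probability of seeing at least `min_reads` mutant reads at one position.

    Assumes each read covering the position independently carries the
    mutation with probability `mutant_fraction` (binomial model, no errors).
    """
    p_fewer = sum(
        comb(depth, k) * mutant_fraction**k * (1 - mutant_fraction) ** (depth - k)
        for k in range(min_reads)
    )
    return 1.0 - p_fewer

# Deeper coverage makes a mutation present in 1% of molecules easier to observe.
for depth in (100, 1000, 10000):
    print(depth, round(detection_probability(depth, 0.01), 4))
```

Under these assumptions, the probability of observing the mutation rises with depth, which is the effect the paragraph above attributes to shorter fragments and deeper sequencing.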

To restrict the size of the target-region sequencing library, the fragmentation products or the target-region sequencing library may be separated and purified after step S121 or S122. Separation and purification methods include but are not limited to gel-based methods, sucrose-gradient or cesium chloride-gradient sedimentation, and column chromatography. A method suited to the actual circumstances may be chosen.

Depending on the adapter elements described above, step S122 is further explained below through several embodiments and the accompanying drawings.

In one embodiment of the invention, adapters are ligated directly to both ends of the fragmentation products to form the sequencing library.

The adapter may be at least one of the blunt-end adapter, protruding-end adapter, adapter with a stem-loop structure, and Y-shaped adapter described above.

In another embodiment of the invention, as shown in Figure 10, step S122 may be carried out as follows:

S1221. Ligate a first adapter to both ends of the fragmentation products to obtain a first ligation product;

S1222. Circularize the first ligation product to obtain a circularized product;

S1223. Digest the circularized product with a Type IIS restriction enzyme to obtain a digestion product;

S1224. Ligate a second adapter and a third adapter to the two ends of the digestion product to obtain the sequencing library.

In step S1221, the first adapter may be one of a blunt-end adapter, a protruding-end adapter, an adapter with a stem-loop structure and a Y-shaped adapter, and it contains a Type IIS restriction enzyme recognition site. A Type IIS restriction enzyme is a restriction enzyme whose cleavage site lies outside its recognition sequence. Such enzymes include but are not limited to: Acu I, Alw I, Bbs I, Bbv I, Bcc I, BceA I, BciV I, BfuA I, Bmr I, Bpm I, BpuE I, Bsa I, BseM II, BseR I, Bsg I, BsmA I, BsmB I, BsmF I, BspCN I, BspM I, BspQ I, BtgZ I, Ear I, Eci I, EcoP15 I, Fau I, Fok I, Hga I, Hph I, HpyAV, Mbo II, Mly I, Mme I, Mnl I, NmeA III, Ple I, Sap I, SfaN I and TspDT I. Acu I, Bsg I, EcoP15 I or Mme I is preferred.
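Because a Type IIS enzyme cuts at a fixed distance outside its recognition sequence, a site placed in the adapter directs the cut into the adjacent insert. The sketch below illustrates this for the four preferred enzymes. The recognition sequences and cut offsets are the values commonly listed by enzyme suppliers and should be treated as assumptions to be checked against a datasheet; the function name and the example sequences are hypothetical.

```python
import re

# Recognition site and cut offsets (top strand, bottom strand), counted in
# bases downstream of the site. Assumed values; confirm against a datasheet.
TYPE_IIS = {
    "MmeI": ("TCCRAC", 20, 18),
    "AcuI": ("CTGAAG", 16, 14),
    "BsgI": ("GTGCAG", 16, 14),
    "EcoP15I": ("CAGCAG", 25, 27),
}
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "R": "[AG]", "N": "[ACGT]"}

def top_strand_cuts(seq: str, enzyme: str) -> list[int]:
    """Return 0-based top-strand cut indices for forward-orientation sites.

    An index i means the top strand is cut between seq[i-1] and seq[i].
    Sites on the reverse strand are ignored to keep the sketch short.
    """
    site, top_offset, _bottom_offset = TYPE_IIS[enzyme]
    pattern = "".join(IUPAC[b] for b in site)
    cuts = []
    # The lookahead lets overlapping sites be found as well.
    for m in re.finditer(f"(?={pattern})", seq.upper()):
        cut = m.start() + len(site) + top_offset
        if cut <= len(seq):
            cuts.append(cut)
    return cuts

adapter_end = "CTGAAG"        # Acu I site at the end of a hypothetical adapter
insert = "ACGT" * 10          # 40 bases of hypothetical insert
print(top_strand_cuts(adapter_end + insert, "AcuI"))
```

In this example the cut falls 16 bases into the insert, so the adapter remains attached to a short tag of insert sequence of defined length.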

When the first adapter is a Y-shaped adapter, the Type IIS restriction enzyme recognition site is located in its complementary region. When the first adapter is an adapter with a stem-loop structure, the restriction enzyme recognition site lies closer to the stem-loop structure than the Type IIS restriction enzyme recognition site does.

If the fragmentation products have been repaired with an end-repair enzyme and A-tailed, the first adapter is preferably a Y-shaped adapter with a T overhang. If the fragmentation products have only been repaired with an end-repair enzyme, so that their ends are filled in, the first adapter is preferably a dideoxy adapter.

In step S1222, the first ligation product can be circularized in several ways.

In one embodiment of the invention, step S1222 comprises the following steps:

S12221. Amplify the first ligation product with restriction-site primers to obtain an amplified product;

S12222. Digest the amplified product so that it forms sticky ends and self-circularizes into the circularized product.

The 3' ends of the restriction-site primers are complementary to the two terminal portions of the first ligation product, respectively, and their 5' ends each contain a restriction enzyme recognition site. The amplified product formed in step S12221 therefore carries a restriction enzyme recognition site at each end. Under the action of the corresponding enzyme, both ends of the amplified product form sticky ends; these two sticky ends are complementary, so the product can circularize.

In another embodiment of the invention, the first adapter contains two recognition sites. One is a restriction enzyme recognition site, which in step S1222 allows the two ends of the first ligation product to form mutually complementary sticky ends under the action of the corresponding enzyme, so that the product can circularize. The other is a Type IIS restriction enzyme recognition site: in step S1223, the enzyme that recognizes this site digests the circularized product to give the digestion product.

It should be noted that the two embodiments above are merely two ways of circularizing the first ligation product and do not limit the scope of protection of the invention in any way.

In step S1223, digestion is carried out with an enzyme that recognizes the recognition site on the first adapter and cuts the circularized product (DNA) without cutting the first adapter. Recognition sites on the first adapter include but are not limited to the Mme I, Acu I and Bsg I recognition sites.

In step S1224, the second adapter may be one of a blunt-end adapter, a protruding-end adapter, an adapter with a stem-loop structure and a Y-shaped adapter, and may carry a biotin label. The third adapter may likewise be one of a blunt-end adapter, a protruding-end adapter, an adapter with a stem-loop structure and a Y-shaped adapter. The second and third adapters may be identical or different. Preferably, the second and third adapters are identical and are Y-shaped adapters. More preferably, the Y-shaped adapter is a T-overhang Y-shaped adapter or a dideoxy Y-shaped adapter.

It should be noted that a step S1222A may also be included between steps S1222 and S1223: amplify the circularized product by rolling-circle amplification to obtain a rolling-circle amplification product. Step S1222A ensures that the subsequent digestion step S1223 has enough starting material.

Alternatively, a step S1221A may be included between steps S1221 and S1222: amplify the first ligation product with amplification primers to obtain an amplified product. The amplification primers are complementary to the adapter sequences at the two ends of the first ligation product, respectively. Step S1221A ensures that the subsequent circularization step has enough starting material.

This scheme avoids rolling-circle amplification after step S1222 by replacing it with ordinary PCR amplification in step S1221A before step S1222, and it can effectively reduce the amount of Type IIS restriction enzyme consumed in the subsequent digestion step.

In this scheme, the first adapter is preferably a Y-shaped adapter.

The complementary region of the Y-shaped adapter may contain at least one recognition site, which may be an ordinary restriction enzyme recognition site or a Type IIS restriction enzyme recognition site.

The amplification primers in step S1221A are preferably biotinylated primers, which facilitates recovery and purification of the amplified product.

The amplification primers in step S1221A carry at least one specific cleavage site.

If the specific cleavage site on the amplification primers is a uracil base, a uracil-specific excision reagent is used for cleavage in step S1222, followed by ligation to circularize the product.

If the specific cleavage site on the amplification primers is a restriction enzyme recognition site, the corresponding restriction enzyme is used for digestion in step S1222, followed by ligation to circularize the product.

In another embodiment of the invention, as shown in Figure 11, step S122 may be carried out as follows:

S1221'. Ligate a fourth adapter to the fragmentation products to obtain a second ligation product;

S1222'. Digest the second ligation product with a Type IIS restriction enzyme to obtain digestion fragments carrying the fourth adapter;

S1223'. Ligate the digestion fragments carrying the fourth adapter to a fifth adapter to form the sequencing library.

It should be noted that:

In step S1221', the fourth adapter may be one of a blunt-end adapter, a protruding-end adapter, an adapter with a stem-loop structure and a Y-shaped adapter, and it contains a Type IIS restriction enzyme recognition site. When the fourth adapter is a Y-shaped adapter, this site is located in its complementary region. When the fourth adapter is an adapter with a stem-loop structure, the restriction enzyme recognition site lies closer to the stem-loop structure than the Type IIS restriction enzyme recognition site does.

In step S1222', digestion is carried out with an enzyme that recognizes the recognition site on the fourth adapter and cuts the second ligation product (DNA) without cutting the fourth adapter. Recognition sites on the fourth adapter include but are not limited to the Mme I, Acu I and Bsg I recognition sites. If the fragmentation products have been repaired with an end-repair enzyme and A-tailed, the fourth adapter is preferably a Y-shaped adapter with a T overhang. If the fragmentation products have only been repaired with an end-repair enzyme, the fourth adapter is preferably a dideoxy adapter.

In step S1223', the fifth adapter may be one of a blunt-end adapter, a protruding-end adapter, an adapter with a stem-loop structure and a Y-shaped adapter, and may carry a biotin label.

In another embodiment of the invention, step S122 comprises the following steps:

S1221''. Ligate adapters with a stem-loop structure directly to both ends of the fragmentation products to form fragmentation products carrying stem-loop structures;

S1222''. Use a restriction enzyme to cut or excise the loop region of the fragmentation products carrying stem-loop structures, thereby forming the sequencing library.

By using adapters with a stem-loop structure, this scheme prevents multiple adapters from ligating to one another.

When the amount of sequencing library obtained by the schemes above is too small for the subsequent single-molecule amplification step, the following step may also be carried out:

Using primers complementary to the adapter portions at the two ends of the target-region sequencing library, amplify the library as template to obtain an amplified sequencing library. This step provides the amount of target sequencing library required by the subsequent experiments.

The single-molecule amplification in step S2 means that the molecules of the target-region sequencing library are spatially isolated in trace amounts (even as single molecules), while still belonging to the same reaction system as a whole, and are amplified within their own spaces so as to raise the signal of each molecule in the subsequent sequencing reaction. Methods of single-molecule amplification include but are not limited to emulsion PCR (EPCR) and bridge PCR.

The most notable feature of EPCR is that it creates a very large number of independent reaction compartments for DNA amplification. In outline, before the PCR reaction, an aqueous solution containing all PCR components is injected onto the surface of rapidly rotating mineral oil, and the aqueous solution instantly forms numerous small droplets enclosed by mineral oil. Each droplet constitutes an independent PCR compartment. Ideally, each droplet contains only one DNA template (a target-region sequencing library molecule) and one magnetic bead. The bead carries primers complementary to the common sequence (introduced by the adapter elements) of the library molecules, so after the PCR reaction the bead surface is coated with a very large number of copies of amplified product derived from the same DNA template. For the detailed steps of EPCR, see: Takaaki Kojima, Yoshiaki Takei, Miharu Ohtsuka et al., PCR amplification from single DNA molecules on magnetic beads in emulsion: application for high-throughput screening of transcription factor targets, Nucleic Acids Research, 2005, Vol. 33, No. 17; Ming Yan Xu, Anthony D. Aragon, Monica R. Mascarenas et al., Dual primer emulsion PCR for next-generation DNA sequencing, BioTechniques 48:409-412 (May 2010); Frank Diehl, Meng Li, Yiping He, BEAMing: single-molecule PCR on microparticles in water-in-oil emulsions, Nature Methods, Vol. 3, No. 7, July 2006.
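How often the ideal case of one template per droplet occurs can be estimated if templates are assumed to distribute randomly among droplets. The sketch below uses a Poisson model for this; the Poisson assumption, the function names and the example loading values are illustrative assumptions and are not stated in the description.

```python
from math import exp, factorial

def poisson_pmf(k: int, lam: float) -> float:
    """Probability that a droplet holds exactly k templates at mean loading lam."""
    return lam**k * exp(-lam) / factorial(k)

def droplet_fractions(lam: float) -> dict[str, float]:
    """Fractions of droplets that are empty, hold one template, or hold several."""
    empty = poisson_pmf(0, lam)
    single = poisson_pmf(1, lam)
    return {"empty": empty, "single": single, "multiple": 1.0 - empty - single}

# Lower template loading gives fewer mixed droplets at the cost of more empty ones.
for lam in (0.1, 0.5, 1.0):
    print(lam, {k: round(v, 3) for k, v in droplet_fractions(lam).items()})
```

Under this model, diluting the library reduces the share of droplets with more than one template, which is why the amount of template added is usually kept low in single-molecule amplification.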

The basic principle of bridge PCR is that the primers are immobilized on a solid support. During PCR, the amplified products are thereby immobilized on the support and can pair with complementary primers on the support to form a bridge; the paired primer is then extended using the bridged amplified product as template. By controlling the amount of initial template added, after the bridge PCR reaction the amplified products are present on the solid support as clusters, and the products within each cluster derive from the same DNA template. For the principle and embodiments, see CN20061009879.X and US6227604.

As noted above, in the prior art Sanger sequencing, because of its inherent technical limitations, can sequence only one region of one sample at a time. To detect multiple regions of a sample in a single run, the invention uses high-throughput gene sequencing. Compared with Sanger sequencing, high-throughput sequencing detects sequence information more conveniently and sensitively. After single-molecule amplification, each molecule of the sequencing library forms a single-molecule copy array. During high-throughput sequencing each copy array occupies a different position, so hybridization of sequencing primers to the copy arrays and enzymatic extension can proceed simultaneously without mutual interference. Sequencing reactions can therefore be carried out simultaneously on a very large number of copy arrays (millions, tens of millions, even hundreds of millions or billions), and the corresponding signals are then collected to obtain the required sequence information accurately, with higher sensitivity than Sanger sequencing. In particular, when the amplified products of the multiple target regions have been fragmented, the number of times each base of target-region molecules of identical sequence is sequenced is effectively increased, which further improves the sensitivity of sequencing.

The high-throughput sequencing technology in step S3 includes but is not limited to polymerase-based sequencing by synthesis and ligase-based sequencing by ligation.

Polymerase-based sequencing by synthesis uses nucleotides carrying a removable label. In each synthesis reaction, each template strand can be extended by at most one base. The general procedure is as follows:

a. The sequencing primer binds by complementary pairing to the common known sequence of the single-molecule amplified product (which is immobilized on the primer-solid support complex). Under the action of DNA polymerase, a single-base extension reaction is carried out with nucleotides carrying a removable label. Collecting the label signal of the nucleotide added in this cycle gives the identity of the base of the single-molecule amplified product immediately following the base that pairs with the 3'-terminal base of the sequencing primer.

b. The removable label is cleaved off, and single-base extension with removable-label nucleotides is continued under the action of DNA polymerase. Collecting the label signal of the added nucleotide gives the identity of the second base of the single-molecule amplified product following the base that pairs with the 3'-terminal base of the sequencing primer.

Step b is repeated until synthesis can no longer proceed, giving the complete sequence information of the single-molecule amplified product.
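The cycle structure of steps a and b can be sketched as a small simulation in which exactly one labeled base is incorporated and read per cycle. This is a simplified illustration only: the function name, the template orientation convention and the example sequence are assumptions, and chemistry such as incomplete extension or label cleavage is not modeled.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

def sequence_by_synthesis(template: str, primer_len: int, max_cycles: int) -> str:
    """Simulate removable-label sequencing of `template`.

    `template` is written in the order the polymerase reads it; its first
    `primer_len` bases are the known sequence covered by the sequencing primer.
    Returns the bases incorporated, one per cycle.
    """
    read = []
    position = primer_len
    for _ in range(max_cycles):
        if position >= len(template):
            break                                   # no template left to copy
        incorporated = COMPLEMENT[template[position]]
        read.append(incorporated)                   # signal collected this cycle
        position += 1                               # label removed, next cycle
    return "".join(read)

# Hypothetical template: 4 known bases followed by 4 unknown bases.
print(sequence_by_synthesis("AAAACCGT", primer_len=4, max_cycles=10))
```

Because each cycle adds a single base, the number of cycles directly bounds the read length, and repeated bases are read one at a time.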

Ligase-based sequencing by ligation uses fluorescently labeled oligonucleotide probes. One form of sequencing by ligation uses oligonucleotide probes that are fluorescently labeled at a specific position. The probe has n bases, and the a-th base counted from its 5' end carries the fluorescent label, with different bases corresponding to specific fluorescent labels. Because the 3' or 5' end of the probe is specifically modified, probes cannot ligate directly to one another, and in each ligation reaction each single-molecule amplified product can ligate only one probe. The general procedure is as follows:

A. The sequencing primer binds by complementary pairing to the common known sequence of the single-molecule amplified product (which is immobilized on the primer-solid support complex). Using the probes described above and ligase, a probe is ligated to the primer strand, and the fluorescence signal is collected, giving the identity of the base at position a after the 3' end, or before the 5' end, of the common known sequence of the single-molecule amplified product.

B. The fluorescent label on the oligonucleotide is cleaved off, ligation is continued with the probes described above under the action of ligase, and the fluorescence signal is collected, giving the identity of the base at position 2a after the 3' end, or before the 5' end, of the common known sequence of the single-molecule amplified product.

Step B is repeated until ligation can no longer proceed, giving the base information at positions a, 2a, 3a, 4a, ... after the 3' end, or before the 5' end, of the common known sequence of the single-molecule amplified product.

The sequencing primer and the probes ligated to it are then denatured and eluted from the single-molecule amplified product, and the reactions above are repeated with a primer that is one base shorter at its 3' or 5' end than the previous sequencing primer, giving the base information at positions a-1, 2a-1, 3a-1, 4a-1, .... Repeating this step finally gives the base information at positions a-(a-1), 2a-(a-1), 3a-(a-1), 4a-(a-1), ..., and thus the complete sequence information of the single-molecule amplified product.
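The way successive primer rounds fill in all positions can be checked by enumerating the positions read in each round. The following is a minimal sketch; the function name and the example values of `a` and `read_length` are chosen for illustration only.

```python
def interrogated_positions(a: int, read_length: int) -> dict[int, list[int]]:
    """Map each primer offset to the positions read in that round.

    Offset 0 is the original sequencing primer, offset 1 a primer one base
    shorter, and so on. Round r reads positions a-r, 2a-r, 3a-r, ...
    """
    rounds = {}
    for offset in range(a):
        rounds[offset] = list(range(a - offset, read_length + 1, a))
    return rounds

rounds = interrogated_positions(a=5, read_length=20)
for offset, positions in rounds.items():
    print(offset, positions)

# After a rounds, every position from 1 to read_length has been read once.
covered = sorted(p for positions in rounds.values() for p in positions)
print(covered == list(range(1, 21)))
```

With the label at position a, a total of a primer rounds are needed before the interleaved reads cover every base.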

Another form of sequencing by ligation likewise uses fluorescently labeled oligonucleotide probes. The probes have n bases and are divided into h groups (h ≤ n). Within a group, different fluorescent labels correspond to different bases at the same specific position; the groups differ in the specific position to which their labels correspond. Because the 3' or 5' end of the probe is specifically modified, probes cannot ligate directly to one another, and in each ligation reaction each single-molecule amplified product can ligate only one probe. The general procedure is as follows:

a. The sequencing primer binds by complementary pairing to the common known sequence of the single-molecule amplified product (which is immobilized on the primer-solid support complex). Using one group of the probes described above (whose labels correspond to base position x, x ≤ h) and ligase, a probe is ligated to the primer strand, and the fluorescence signal is collected, giving the identity of the base at position x after the 3' end, or before the 5' end, of the common known sequence of the single-molecule amplified product. The sequencing primer and the probe ligated to it are then denatured and eluted from the single-molecule amplified product.

b. The sequencing primer is again bound to the single-molecule amplified product, and a probe group different from that of step a is used (whose labels correspond to base position y, y ≤ h). Under the action of ligase, a probe is ligated to the primer strand, and the fluorescence signal is collected, giving the identity of the base at position y after the 3' end, or before the 5' end, of the common known sequence of the single-molecule amplified product. The sequencing primer and the probe ligated to it are then denatured and eluted from the single-molecule amplified product.

c. Step b is repeated until each of the h probe groups has been used in one ligation reaction, giving the base information at positions 1, 2, ..., h after the 3' end, or before the 5' end, of the common known sequence of the single-molecule amplified product.

Using a primer that carries one or more additional universal bases at its 3' or 5' end compared with the previous sequencing primer, and proceeding by the same principle, extends the read length obtained after the 3' end, or before the 5' end, of the common known sequence of the single-molecule amplified product.

For the principle and specific embodiments of this ligase-based sequencing by ligation, see CN200710170507.1.

Polymerase-based sequencing by synthesis and pyrosequencing are similar to some extent, so in principle pyrosequencing is also applicable to the detection method of the invention. However, existing pyrosequencing uses natural dNTPs, which makes runs of a single repeated base that may be present in the sequencing library difficult to determine. In polymerase-based sequencing by synthesis, by contrast, the nucleotides carry a removable label, which ensures that only one base is added per cycle. In ligase-based sequencing by ligation, the 3' or 5' end of the fluorescently labeled probe is modified, which ensures that only one fluorescent probe is ligated to each single-molecule amplified product fragment per reaction. The accuracy of the polymerase-based sequencing by synthesis and the ligase-based sequencing by ligation used in the invention is therefore higher than that of pyrosequencing.
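The difference in how the two readouts handle runs of a repeated base can be illustrated with an idealized signal model. The sketch below is an assumption-laden simplification: it ignores the fixed nucleotide flow order of real pyrosequencing and any noise, and both function names are hypothetical.

```python
from itertools import groupby

def natural_dntp_signals(sequence: str) -> list[tuple[str, int]]:
    """Idealized natural-dNTP readout: one signal per run of identical bases,
    with an intensity proportional to the run length that must be estimated."""
    return [(base, len(list(run))) for base, run in groupby(sequence)]

def removable_label_signals(sequence: str) -> list[tuple[str, int]]:
    """Removable-label readout: exactly one base of unit intensity per cycle."""
    return [(base, 1) for base in sequence]

seq = "GAAAAAC"  # hypothetical sequence with a run of five A bases
print(natural_dntp_signals(seq))
print(removable_label_signals(seq))
```

In the first readout the length of the run has to be inferred from a single signal intensity, whereas in the second each base of the run produces its own signal.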

In addition, the wells of the etched fiber-optic slide (PTP plate) in existing pyrosequencers are large (55 μm × 44 μm) in order to hold the emulsion PCR products to be sequenced, which are immobilized on 10 μm beads. This greatly limits the throughput of pyrosequencing and raises the reagent cost of its sequencing reactions. Pyrosequencing also requires a mixture of several proteins to be added to the wells of the PTP plate during sequencing to ensure that the reaction proceeds smoothly, which further increases the reagent cost considerably.

By contrast, polymerase-based sequencing by synthesis and ligase-based sequencing by ligation can immobilize the single-molecule amplified products on 1 μm magnetic beads or on slides, giving higher throughput. Apart from nucleotides carrying a removable label or fluorescently labeled probes modified at the 3' or 5' end, the other reagents have no special requirements, which greatly reduces the reagent cost of the sequencing reaction. For the same amount of data, the sequencing cost of polymerase-based sequencing by synthesis and ligase-based sequencing by ligation is two thousandths of that of pyrosequencing or less. The invention therefore uses polymerase-based sequencing by synthesis or ligase-based sequencing by ligation to sequence the single-molecule amplified products.

In a specific embodiment of the invention, exons 1 to 13 of the PAH gene are detected simultaneously. Specific primers were designed to amplify each of these exons: E1F (SEQ ID NO: 1) and E1R (SEQ ID NO: 2), E2F (SEQ ID NO: 3) and E2R (SEQ ID NO: 4), E3F (SEQ ID NO: 5) and E3R (SEQ ID NO: 6), E4F (SEQ ID NO: 7) and E4R (SEQ ID NO: 8), E5F (SEQ ID NO: 9) and E5R (SEQ ID NO: 10), E6F (SEQ ID NO: 11) and E6R (SEQ ID NO: 12), E7F (SEQ ID NO: 13) and E7R (SEQ ID NO: 14), E8F (SEQ ID NO: 15) and E8R (SEQ ID NO: 16), E9F (SEQ ID NO: 17) and E9R (SEQ ID NO: 18), E10F (SEQ ID NO: 19) and E10R (SEQ ID NO: 20), E11F (SEQ ID NO: 21) and E11R (SEQ ID NO: 22), E12F (SEQ ID NO: 23) and E12R (SEQ ID NO: 24), E13F (SEQ ID NO: 25) and E13R (SEQ ID NO: 26).

I. Extraction of DNA from the test samples

DNA was extracted from whole blood samples (1 to 10) and paraffin-embedded tissue samples (11 to 20) using a commercially available nucleic acid extraction kit, and each was labeled accordingly. Mutant plasmid 1 (containing mutant sequence SEQ ID NO: 27), wild-type plasmid 1 (containing wild-type sequence SEQ ID NO: 28), mutant plasmid 2 (containing mutant sequence SEQ ID NO: 29) and wild-type plasmid 2 (containing wild-type sequence SEQ ID NO: 30) served as controls and were numbered 21, 22, 23 and 24, respectively.

II. Amplification of the PAH gene target regions

The target regions of the PAH gene were amplified with the PAH gene-specific primers described above to obtain amplified products. Each target region of the PAH gene was amplified separately. The reaction system is as follows:

The PCR conditions are as follows:

95 °C for 3 min;

94 °C for 30 s, 58 °C for 30 s, 72 °C for 30 s; 25 cycles;

72 °C for 7 min.

Using a PCR clean-up kit, the amplified products of each sample were cleaned to remove unincorporated primers and dNTPs, and the amplified products were recovered.

III. Construction of the sequencing library

This step can be carried out in several ways. In one embodiment of the invention it comprises the following two steps:

1. Fragmentation of the amplified products

Fragmentation was performed by sonication, as follows:

The concentration of each recovered amplified product was measured, and the recovered amplified products of the different PAH target regions of the same sample were mixed in equimolar amounts to give a mixture, which was labeled accordingly.

The mixture of each sample (about 50 μL) was added to 400 μL of TE buffer and sonicated at 430 W for 4 s with 20 s intervals, repeated 5 times, to obtain fragmentation products. The fragmentation products of each sample were separated on a 1% agarose gel, and fragments of 40 bp to 100 bp were recovered by gel excision.
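The equimolar mixing of amplicons described above requires converting mass concentrations into molar concentrations, since amplicons of different length contain different numbers of molecules per nanogram. The sketch below shows one way to do this; the average mass of 660 g/mol per base pair is a standard approximation, while the function names, amplicon lengths and concentrations are hypothetical values used only for illustration.

```python
AVG_BP_MASS = 660.0  # approximate g/mol per base pair of double-stranded DNA

def pmol_per_ul(conc_ng_per_ul: float, length_bp: int) -> float:
    """Convert a dsDNA concentration in ng/uL to pmol/uL for a given length."""
    return conc_ng_per_ul * 1000.0 / (AVG_BP_MASS * length_bp)

def equimolar_volumes(
    amplicons: dict[str, tuple[float, int]], pmol_each: float
) -> dict[str, float]:
    """Volume (uL) of each amplicon needed to contribute `pmol_each` picomoles.

    `amplicons` maps a name to (concentration in ng/uL, length in bp).
    """
    return {
        name: pmol_each / pmol_per_ul(conc, length)
        for name, (conc, length) in amplicons.items()
    }

# Hypothetical measurements for two exon amplicons of one sample.
example = {"exon_7": (40.0, 300), "exon_12": (25.0, 250)}
for name, volume in equimolar_volumes(example, pmol_each=0.5).items():
    print(name, round(volume, 2), "uL")
```

Pooling by moles rather than by mass keeps the number of template molecules per target region comparable, so that no exon is under-represented in the library.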

2. Construction of the sequencing library

Before the target-region sequencing library is built, the gel-recovered fragmentation products must be end-modified to allow ligation of the adapter elements. In this example, end modification of the gel-recovered fragmentation products comprised phosphorylation, end filling and A-tailing.

The procedure is as follows:

1) Phosphorylation and end-filling reaction

The reaction system is:

Reaction conditions: incubate at 20 °C for 20 min. After the reaction, purify and recover the product with a recovery kit.

2) A-tailing

The reaction system is:

Reaction conditions: incubate at 37 °C for 30 min. After the reaction, purify and recover the product with a purification kit.

3) Ligation of adapter 1

In this example, the T-overhang Y-shaped adapter shown in Figure 6 is used as adapter 1. The same T-overhang Y-shaped adapter is used throughout for a given sample, and different samples use different T-overhang Y-shaped adapters, which differ in their tag sequences. The tag sequence corresponding to each sample is shown in Table 1 below.

Table 1. Tag sequence on the first adapter for each sample during library construction

Under the action of T4 ligase, the T-overhang Y-shaped adapters were ligated to the purified A-tailed products to form fragments carrying adapter 1. The ligation system is:

Reaction conditions: incubate at 16 °C for at least 4 h. After the reaction, purify and recover the product with a purification kit.

4) PCR amplification of the fragments carrying adapter 1

The amplification system is:

The PCR conditions are as follows:

95 °C for 3 min;

94 °C for 30 s, 58 °C for 30 s, 72 °C for 30 s; 25 cycles;

72 °C for 7 min.

Using a PCR clean-up kit, the amplified products of each sample were cleaned to remove unincorporated primers and dNTPs, and the amplified products of the fragments carrying adapter 1 were recovered.

5) Type IIS restriction enzyme digestion

The recovered amplified products of the fragments carrying adapter 1 were digested with Acu I. The reaction system is as follows:

Reaction conditions: incubate at 37 °C for 1 h. After the reaction, the digestion products were purified and recovered with a purification kit.

6) Ligation of adapter 2

The protruding-end Y-shaped adapter shown in Figure 7 was used as adapter 2 and ligated to the recovered digestion products. The ligation system is as follows:

Ligation conditions: incubate at 14 °C for 2 h. After the reaction, the product was purified and recovered with a purification kit and then denatured into single strands to obtain the sequencing library.

Four. Single-molecule amplification of the sequencing libraries

Measure the concentration of each of the 24 sequencing libraries obtained in step three, mix them at equal concentrations, and then perform single-molecule amplification to obtain the single-molecule amplified products. The single-molecule amplification may be carried out by emulsion PCR (EPCR) or bridge PCR.

EPCR amplification is preferred, and the single-molecule amplification primers are preferably SEQ ID NO:25 and SEQ ID NO:26.
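The pooling step can be illustrated with a short calculation. This sketch assumes one reading of the text, namely that every library should contribute the same amount of DNA to the pool. The function name, the units (ng/µL), and the example concentrations are hypothetical and do not come from the patent.

```python
def pooling_volumes(concentrations_ng_per_ul, amount_ng_each):
    """Return the volume (uL) to take from each library so that every
    library contributes the same amount (ng) to the pool.

    concentrations_ng_per_ul: dict of library name -> measured concentration.
    amount_ng_each: amount each library should contribute (hypothetical target).
    """
    volumes = {}
    for name, conc in concentrations_ng_per_ul.items():
        if conc <= 0:
            raise ValueError(f"library {name!r} has a non-positive concentration")
        volumes[name] = amount_ng_each / conc
    return volumes


if __name__ == "__main__":
    # Hypothetical measurements for two of the libraries.
    measured = {"library_01": 10.0, "library_02": 20.0}
    for name, volume in pooling_volumes(measured, 50.0).items():
        print(f"{name}: take {volume:.2f} uL")
```

A more dilute library simply contributes a proportionally larger volume, so the pool is balanced before single-molecule amplification.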

Five. High-throughput sequencing of the single-molecule amplified products

The sequencing may be polymerase-based sequencing by synthesis or ligase-based sequencing by ligation. Bioinformatic analysis of the sequencing results gives the following PAH gene sequence information for the above 26 samples:

Exon 12 of sample 2 carries a heterozygous G1238C (i.e., R413P) mutation; exon 7 of sample 5 carries a heterozygous G728A (i.e., R243Q) mutation; exon 7 of sample 8 carries a heterozygous G728A (i.e., R243Q) mutation; exon 12 of sample 9 carries a homozygous G1238C (i.e., R413P) mutation; exon 12 of sample 13 carries a heterozygous G1238C (i.e., R413P) mutation; exon 7 of sample 15 carries a heterozygous G728A (i.e., R243Q) mutation; exon 12 of sample 18 carries a heterozygous G1238C (i.e., R413P) mutation. The remaining sequences of these samples are wild-type. All other samples are wild-type.
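The nucleotide and amino acid names of the reported mutations can be cross-checked with simple arithmetic. The sketch below assumes that the nucleotide positions are numbered along the coding sequence starting at the first base of the start codon; the patent does not state its numbering convention, and the function name is illustrative.

```python
def codon_of(cds_position):
    """Map a 1-based coding-sequence position to
    (codon number, position within the codon), both 1-based."""
    if cds_position < 1:
        raise ValueError("positions are 1-based")
    return (cds_position - 1) // 3 + 1, (cds_position - 1) % 3 + 1


if __name__ == "__main__":
    for name, position in [("G728A", 728), ("G1238C", 1238)]:
        codon, offset = codon_of(position)
        print(f"{name}: codon {codon}, base {offset} of the codon")
```

Under that numbering assumption, position 728 falls in codon 243 and position 1238 in codon 413, which matches the residue numbers in R243Q and R413P.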

The results for the control samples all agree with the gene sequences that the control samples are known to contain.

It should be noted that this embodiment is one specific embodiment of the invention and does not limit the invention in any way. For example, the specific sequences and combinations of the PAH gene-specific primers may be replaced accordingly, and several steps of this embodiment may likewise be replaced with reference to the methods described above; these alternatives are not repeated here.

The following embodiment was used to verify the sensitivity of the PAH gene sequencing method of the invention.

By conventional design, plasmids were constructed containing, respectively, the PAH gene exon 7 wild-type sequence (SEQ ID NO:28) and mutant sequence (SEQ ID NO:27), and the PAH gene exon 12 wild-type sequence (SEQ ID NO:30) and mutant sequence (SEQ ID NO:29).

Plasmid mixtures were then prepared with mutation rates of 20%, 10%, 5%, 3%, 1%, and 0%. Mixtures containing 1000 plasmid copies were used as template and tested by the method of the previous embodiment. The results are shown in Table 2.

Table 2. Results of the PAH gene sensitivity experiment

The results show that the detection limit of the PAH gene sequencing method of the invention is about 3%, and that at 5% and above the measured results are essentially consistent with the actual values.
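To make the scale of the sensitivity experiment concrete, the expected number of mutant templates in each mixture can be computed from the stated mutation rates and the 1000-copy input. This is only a sketch: the names are illustrative, the 3% threshold is the approximate limit quoted in the text, and no values from Table 2 are reproduced.

```python
TEMPLATE_COPIES = 1000                                 # plasmid copies per reaction
MUTATION_RATES = [0.20, 0.10, 0.05, 0.03, 0.01, 0.0]   # mixtures described in the text
DETECTION_LIMIT = 0.03                                 # "about 3%" per the text


def expected_mutant_copies(rate, copies=TEMPLATE_COPIES):
    """Expected number of mutant plasmid copies in the template."""
    return round(rate * copies)


def at_or_above_limit(rate, limit=DETECTION_LIMIT):
    """True if the mixture's mutation rate reaches the stated detection limit."""
    return rate >= limit


if __name__ == "__main__":
    for rate in MUTATION_RATES:
        print(f"{rate:.0%}: {expected_mutant_copies(rate)} mutant copies, "
              f"at or above limit: {at_or_above_limit(rate)}")
```

At the stated limit of about 3%, roughly 30 of the 1000 template copies carry the mutation.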

A kit for sequencing the phenylalanine hydroxylase gene, comprising:

This kit is intended for target-region sequencing. It allows deep sequencing of multiple target regions of a sample simultaneously, which improves detection efficiency as well as the accuracy and sensitivity of detection. It accurately yields the sequence information of these target regions, including the variation at each mutation site for both known and unknown mutations and the frequency at which each site is mutated. Because its detection sensitivity is high, the kit can be used for large-scale population screening of the target regions to determine the proportion of each genotype in those regions.

Wherein, the adapter component is used to construct the target-region sequencing library and comprises one or more adapters.

Wherein, the adapter component may take various forms, such as at least one of: a blunt-end adapter, a protruding-end adapter, an adapter with a stem-loop structure, and a Y-adapter.

Wherein, the adapter may be methylated or biotinylated, or both biotinylated and methylated.

Wherein, at least one adapter in the adapter component includes a first tag sequence, which labels the sequencing libraries of different test samples during library construction. The first tag sequence is preferably a nucleic acid molecule with a specific base sequence. Its length is not limited, but is preferably 3 to 20 bases, more preferably 4 to 20.
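How a tag sequence separates pooled samples after sequencing can be sketched as a simple demultiplexing step. The example below is an illustration under assumptions: it supposes the tag is read at the 5' end of each read and requires an exact match, and the tag sequences, sample names, and function name are hypothetical (the actual tags are those of Table 1).

```python
def demultiplex(reads, tag_to_sample):
    """Group reads by the tag found at their 5' end.

    Returns a dict of sample name -> list of reads with the tag removed.
    Reads whose prefix matches no tag are collected under the key None.
    """
    # Try longer tags first so a short tag cannot shadow a longer one.
    tag_lengths = sorted({len(tag) for tag in tag_to_sample}, reverse=True)
    bins = {}
    for read in reads:
        sample, insert = None, read
        for length in tag_lengths:
            prefix = read[:length]
            if prefix in tag_to_sample:
                sample, insert = tag_to_sample[prefix], read[length:]
                break
        bins.setdefault(sample, []).append(insert)
    return bins


if __name__ == "__main__":
    # Hypothetical 4-base tags, within the preferred length range in the text.
    tags = {"ACGT": "sample_1", "TGCA": "sample_2"}
    reads = ["ACGTGGGAAA", "TGCATTTCCC", "NNNNGGGG"]
    for sample, inserts in demultiplex(reads, tags).items():
        print(sample, inserts)
```

A production pipeline would normally also tolerate sequencing errors in the tag; exact matching is used here only to keep the sketch short.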

Wherein, the PAH gene-specific primers are fully or partially complementary to the target regions.

Further, among the specific primers for each target region, at least one primer is partially complementary to that target region, and the 5' end of this primer includes a second tag sequence, which labels the target-region amplified products of different test samples during target-region amplification. The second tag sequence is preferably a nucleic acid molecule with a specific base sequence, and its length is not limited.

Further, the second tag sequence is 3 to 20 bases long, more preferably 4 to 10.

Wherein, the target regions comprise at least one of exon 1, exon 2, exon 3, exon 4, exon 5, exon 6, exon 7, exon 8, exon 9, exon 10, exon 11, exon 12, and exon 13 of the PAH gene.

Wherein, the kit may also comprise PCR amplification reagents, which comprise a PCR enzyme, dNTPs, PCR buffer, and Mg²⁺ solution.

Wherein, the kit may also comprise an enzyme that recognizes the restriction recognition site on the adapter component.

It should be noted that the gene sequence information obtained with the method or kit of the invention can be used in a variety of scientific research, including but not limited to population sequence analysis, gene function research, and protein function research. For example, it can be combined with subsequent molecular biology experiments, clinical trials, clinical observation, and statistical analysis of the pooled data to serve multiple research purposes, including but not limited to determining the genotype distribution of particular genes in a particular population and the relationship between the mutation types of each gene and the relevant physiological activity.

The foregoing describes only preferred embodiments of the invention and is not intended to limit it. Any modification, equivalent replacement, or improvement made within the spirit and principles of the invention shall fall within its scope of protection.

Claims

1. A kit for sequencing the phenylalanine hydroxylase gene, characterized in that it comprises:

an adapter component, for combining with amplified products to construct a sequencing library;

wherein the adapter component uses at least one of a Y-adapter, a blunt-end adapter, a protruding-end adapter, and an adapter with a stem-loop structure, and uses at least the Y-adapter; the Y-adapter is a double-stranded nucleic acid molecule comprising a complementary region and a fork region; each of the two strands of the fork region comprises at least one amplification primer binding site; and the complementary region comprises a Type IIS restriction enzyme recognition site.

2. The kit for sequencing the phenylalanine hydroxylase gene according to claim 1, characterized in that the 3' end of the complementary region of the Y-adapter is a protruding end or a blunt end.

3. The kit for sequencing the phenylalanine hydroxylase gene according to claim 2, characterized in that the 3' end of the complementary region of the Y-adapter is a protruding end, and the protruding end is T.

4. The kit for sequencing the phenylalanine hydroxylase gene according to claim 2, characterized in that the 3' end of the complementary region of the Y-adapter is a blunt end, and the last nucleotide of the blunt end is a dideoxynucleotide.

5. The kit for sequencing the phenylalanine hydroxylase gene according to claim 4, characterized in that at least one adapter in the adapter component includes a first tag sequence, for labeling the sequencing libraries of different test samples during library construction.

6. The kit for sequencing the phenylalanine hydroxylase gene according to any one of claims 1 to 4, characterized in that, among the specific primers for each target region, at least one primer is partially complementary to that target region, and the 5' end of this partially complementary primer includes a second tag sequence, for labeling the target-region amplified products of different test samples during target-region amplification.

7. The kit for sequencing the phenylalanine hydroxylase gene according to any one of claims 1 to 4, characterized in that the PAH gene-specific primers comprise: SEQ ID NO:1 and SEQ ID NO:26; at least one of SEQ ID NO:2, SEQ ID NO:4, SEQ ID NO:6, SEQ ID NO:8, SEQ ID NO:10, SEQ ID NO:12, SEQ ID NO:14, SEQ ID NO:16, SEQ ID NO:18, SEQ ID NO:20, SEQ ID NO:22, and SEQ ID NO:24; and at least one of SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7, SEQ ID NO:9, SEQ ID NO:11, SEQ ID NO:13, SEQ ID NO:15, SEQ ID NO:17, SEQ ID NO:19, SEQ ID NO:21, SEQ ID NO:23, and SEQ ID NO:25.

8. The kit for sequencing the phenylalanine hydroxylase gene according to any one of claims 1 to 4, characterized in that it further comprises PCR amplification reagents, which comprise a PCR enzyme, dNTPs, PCR buffer, and Mg²⁺ solution.

9. The kit for sequencing the phenylalanine hydroxylase gene according to any one of claims 1 to 4, characterized in that it further comprises a Type IIS restriction enzyme that recognizes the Type IIS restriction enzyme recognition site on the Y-adapter.