CN105567681B - Method and barcode adapters for noninvasive virus biopsy based on high-throughput gene sequencing - Google Patents

Info

Publication number
CN105567681B
Authority
CN
China
Prior art keywords
seq
dna
sequence
label
nucleotide sequence
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201511035148.XA
Other languages
Chinese (zh)
Other versions
CN105567681A (en
Inventor
伍泳彰
曾宏彬
陈杰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangzhou Sagene Biotech Corp
Original Assignee
Guangzhou Sagene Biotech Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangzhou Sagene Biotech Corp filed Critical Guangzhou Sagene Biotech Corp
Priority to CN201511035148.XA priority Critical patent/CN105567681B/en
Publication of CN105567681A publication Critical patent/CN105567681A/en
Application granted granted Critical
Publication of CN105567681B publication Critical patent/CN105567681B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q: MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6869: Methods for sequencing
    • C12Q1/6806: Preparing nucleic acids for analysis, e.g. for polymerase chain reaction [PCR] assay

Landscapes

  • Chemical & Material Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Organic Chemistry (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Zoology (AREA)
  • Wood Science & Technology (AREA)
  • Health & Medical Sciences (AREA)
  • Engineering & Computer Science (AREA)
  • Analytical Chemistry (AREA)
  • Microbiology (AREA)
  • Immunology (AREA)
  • Molecular Biology (AREA)
  • Biotechnology (AREA)
  • Biophysics (AREA)
  • Physics & Mathematics (AREA)
  • Biochemistry (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Genetics & Genomics (AREA)
  • Chemical Kinetics & Catalysis (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)

Abstract

The present invention provides a method for noninvasive virus biopsy based on high-throughput gene sequencing. The method sequences DNA and RNA in parallel and then accurately determines the ecological composition of the viruses in a patient's body. It minimizes interference from host-cell nucleic acids, reduces sequencing cost, and can detect multiple unknown co-infecting viruses at the same time. The invention designs 80 DNA barcode adapters in total and assembles the related reagents into a library construction kit, so that, where throughput allows, 80 samples can be sequenced in parallel. The experimental design is rigorous and highly automated; virus identification can be completed in batches within 24 hours, which meets the needs of scientific research and clinical testing to some extent and provides a reference for the clinical diagnosis and medication of diseases related to viral infection.

Description

Method and barcode adapters for noninvasive virus biopsy based on high-throughput gene sequencing
Technical field
The present invention relates to a method for detecting viruses, and in particular to a method for noninvasive virus biopsy based on high-throughput gene sequencing and to the barcode adapters used in it.
Background art
Viruses are broadly divided into DNA viruses, RNA viruses, and protein viruses. They live as heterotrophic parasites in nearly all organisms, and some of them harm or even kill their hosts. In many environments several viruses together form an ecosystem, and cross-infection leads to concurrent diseases. The nucleic acid genetic material of these viruses may be of four types: double-stranded DNA, single-stranded DNA, double-stranded RNA, and single-stranded RNA.
Traditional virus identification methods include filtration screening, tissue culture, electron microscopy, serology, vaccination, and cell culture observation. These methods, however, depend on isolating and enriching the virus and on information such as the conservation of viral sequences at the protein and nucleic acid level. They are of no help in identifying viruses that cannot be cultured or are hard to amplify under experimental conditions, or new viruses whose sequences are poorly conserved relative to known ones. Conventional PCR has low throughput, covers only a short detection region, and targets only DNA viruses; these limitations have held back the development of viral nucleic acid detection.
Viral metagenomics, an emerging approach to studying viral communities, does not require culturing the virus. Even when the viral sequences are unknown, it can take the genetic material of all viruses in an environmental sample directly as the object of study and quickly identify the viral composition of the sample. The conventional sequencing approach, which handles one nucleic acid type at a time, nevertheless has a defect: when a DNA library is built, RNA is treated as contamination and removed, and when an RNA library is built, DNA is treated as an impurity and removed. The DNA and RNA that are finally sequenced do not add up to the complete metagenome and cannot fully reflect the original ecological composition of the viral sample.
The Ion Torrent PGM high-throughput sequencing platform is fast, has moderate throughput, and does not depend on known sequences for amplification, which makes it well suited to sequencing small genomes. One problem to consider is that current high-throughput sequencing platforms are imported instruments and that sequencing runs are required to use the manufacturer's proprietary kits. The cost of preparing a single sample for a run alone exceeds 1,000, which many users find hard to bear.
In addition, because high-throughput sequencing is so sensitive, it places high demands on sample purity. A viral genome is usually present at only a few copies inside or outside a cell and is 3 kb to several hundred kb long, whereas the human nuclear genome is 3 Gb, a difference on the order of a million-fold. Traditional extraction methods therefore often cannot avoid contamination by the host-cell nuclear genome, which makes the sequencing data unusable. To reflect the original physiological state of the viruses in a sample economically, effectively, and completely, two problems must be solved: first, how to reduce sequencing cost effectively; second, how to remove host nucleic acid contamination effectively and raise the abundance of viral nucleic acid.
It is therefore particularly important to establish a virus detection method that shortens the turnaround time, detects all nucleic acids in a sample in situ at the same time (double-stranded DNA, single-stranded DNA, double-stranded RNA, and single-stranded RNA), reduces experimental cost, removes host nucleic acid contamination effectively, and fully reflects the original physiological state of the sample.
Summary of the invention
The object of the present invention is to overcome the shortcomings of the prior art by providing a method for noninvasive virus biopsy based on high-throughput gene sequencing. The invention also provides the DNA barcodes and DNA barcode adapters used in the method.
To achieve this object, the following technical solution is adopted: a group of DNA barcode adapters comprising A-Barcode001-F/R to A-Barcode010-F/R.
Preferably, the DNA barcode adapters comprise A-Barcode011-F/R to A-Barcode020-F/R.
Preferably, the DNA barcode adapters comprise A-Barcode021-F/R to A-Barcode030-F/R.
Preferably, the DNA barcode adapters comprise A-Barcode031-F/R to A-Barcode040-F/R.
Preferably, the DNA barcode adapters comprise A-Barcode041-F/R to A-Barcode050-F/R.
Preferably, the DNA barcode adapters comprise A-Barcode051-F/R to A-Barcode060-F/R.
Preferably, the DNA barcode adapters comprise A-Barcode061-F/R to A-Barcode070-F/R.
Preferably, the DNA barcode adapters comprise A-Barcode071-F/R to A-Barcode080-F/R.
In the present invention, A-Barcode001-F/R to A-Barcode010-F/R refers to the combination of A-Barcode001-F, A-Barcode001-R, A-Barcode002-F, A-Barcode002-R, A-Barcode003-F, A-Barcode003-R, A-Barcode004-F, A-Barcode004-R, A-Barcode005-F, A-Barcode005-R, A-Barcode006-F, A-Barcode006-R, A-Barcode007-F, A-Barcode007-R, A-Barcode008-F, A-Barcode008-R, A-Barcode009-F, A-Barcode009-R, A-Barcode010-F, and A-Barcode010-R. The ranges A-Barcode011-F/R to A-Barcode020-F/R, A-Barcode021-F/R to A-Barcode030-F/R, A-Barcode031-F/R to A-Barcode040-F/R, A-Barcode041-F/R to A-Barcode050-F/R, A-Barcode051-F/R to A-Barcode060-F/R, A-Barcode061-F/R to A-Barcode070-F/R, and A-Barcode071-F/R to A-Barcode080-F/R are to be read in the same way as the definition of A-Barcode001-F/R to A-Barcode010-F/R above.
The present invention also provides the use of the above group of DNA barcode adapters in constructing barcoded DNA libraries.
The present invention also provides the use of the above group of DNA barcode adapters in noninvasive virus biopsy based on high-throughput gene sequencing.
The present invention provides a method for designing a barcode adapter, comprising the following steps:
(1) The barcode sequence barcode X is set to 11 bases, and its last base is C, so that together with the G at the start of the adapter core signal recognition sequence Barcode Adapter it forms a stable CG pair;
The adapter core signal recognition sequence Barcode Adapter is GAT;
The 3' overhang of the double-stranded barcode adapter is doubly thio-modified;
The barcode sequence barcode X contains no more than 2 GC or CG structures;
The base percentages of barcode X satisfy: %A, %T, %C, and %G are each 18.00-30.00; %A+%T = 45.00-60.00; %C+%G = 40.00-55.00; %A+%T+%C+%G = 100%;
(2) The universal sequence, the barcode sequence barcode X designed in step (1), and the adapter core signal recognition sequence Barcode Adapter are joined in that order to give the barcode adapter.
Preferably, the 11 bases of barcode X in step (1) have the composition: %A = 27.27, %G = 18.18, %T = 27.27, %C = 27.27, %A+%T = 54.55, %C+%G = 45.45;
The base counts of barcode X are 3 A, 3 C, 2 G, and 3 T.
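The design rules above can be checked mechanically. The following Python sketch is illustrative only and is not part of the patent: the function names are hypothetical, the example barcode is made up rather than taken from the patent's barcode table, and the rule on GC or CG structures is read here as a count of GC and CG dinucleotides inside barcode X, which is an assumption.

```python
def gc_cg_count(seq):
    """Count GC and CG dinucleotides at every position of seq."""
    return sum(seq[i:i + 2] in ("GC", "CG") for i in range(len(seq) - 1))


def check_barcode(barcode):
    """Return a list of violated rules; an empty list means the barcode passes."""
    b = barcode.upper()
    problems = []
    if len(b) != 11 or set(b) - set("ACGT"):
        problems.append("must be exactly 11 bases of A, C, G, T")
        return problems
    if not b.endswith("C"):
        problems.append("last base must be C")
    if gc_cg_count(b) > 2:
        problems.append("more than 2 GC/CG dinucleotides")
    # Percentage of each base, compared against the ranges given in the text.
    pct = {base: 100.0 * b.count(base) / len(b) for base in "ACGT"}
    for base, value in pct.items():
        if not 18.0 <= value <= 30.0:
            problems.append(f"%{base} = {value:.2f} outside 18.00-30.00")
    if not 45.0 <= pct["A"] + pct["T"] <= 60.0:
        problems.append("%A+%T outside 45.00-60.00")
    if not 40.0 <= pct["C"] + pct["G"] <= 55.0:
        problems.append("%C+%G outside 40.00-55.00")
    return problems


example = "ATGATCATGCC"  # hypothetical barcode: 3 A, 3 C, 2 G, 3 T, ends in C
print(check_barcode(example))        # no violations expected
print(check_barcode("GCGCGCGCGCC"))  # violates the composition and GC/CG rules
```

With 11 bases, the 18.00-30.00 range admits only 2 or 3 copies of each base, which is why the preferred composition is three bases present 3 times and one present twice.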
The present invention provides a method for noninvasive virus biopsy based on high-throughput gene sequencing, comprising the following steps:
(1) DNase I and RNase are added to n samples to be tested, where n is an integer and 1 ≤ n ≤ 80; after the host nucleic acids outside the viral capsids have been digested, DNA and RNA are co-extracted;
(2) The DNA and RNA extracted in step (1) are converted into blunt-ended double-stranded DNA;
(3) The double-stranded DNA obtained in step (2) is fragmented;
(4) The ends of the DNA fragments from step (3) are repaired;
(5) The P1 universal adapter is ligated to the 3' end and a DNA barcode adapter to the 5' end, with a different DNA barcode adapter for each sample; the DNA barcode adapters are those described above;
(6) The ligation products carrying different DNA barcode adapters from step (5) are recovered, purified, and amplified by PCR with the primers pcr-priemer-A and pcr-priemer-P1; the sequence of pcr-priemer-A is shown in SEQ ID NO: 241, and the sequence of pcr-priemer-P1 is shown in SEQ ID NO: 242;
(7) The amplified ligation products from step (6), each carrying a different barcode adapter, are pooled and sequenced; the barcode adapter sequences are then identified and the reads compared against databases to determine the viral species contained in each sample and the related ecosystem information.
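Step (7) relies on assigning pooled reads back to their samples by barcode. A minimal sketch of that assignment follows. It assumes that each read begins with the 11-base barcode X followed by GAT and that matching is exact, with no error tolerance; the read layout, the function name, and all barcodes and reads are hypothetical and do not come from the patent. The later database comparison is not shown.

```python
BARCODE_LEN = 11
CORE = "GAT"  # adapter core signal recognition sequence from the text


def demultiplex(reads, barcode_to_sample):
    """Group reads by sample; reads without a known barcode go to 'unassigned'."""
    bins = {sample: [] for sample in barcode_to_sample.values()}
    bins["unassigned"] = []
    for read in reads:
        barcode = read[:BARCODE_LEN]
        after = read[BARCODE_LEN:BARCODE_LEN + len(CORE)]
        sample = barcode_to_sample.get(barcode)
        if sample is not None and after == CORE:
            # Strip the barcode and core sequence, keep only the insert.
            bins[sample].append(read[BARCODE_LEN + len(CORE):])
        else:
            bins["unassigned"].append(read)
    return bins


# Hypothetical barcodes and reads, for illustration only.
barcodes = {"ATGATCATGCC": "sample_1", "TACTAGTAGCC": "sample_2"}
reads = [
    "ATGATCATGCC" + "GAT" + "ACGTACGTAA",
    "TACTAGTAGCC" + "GAT" + "TTGGCCAATT",
    "CCCCCCCCCCC" + "GAT" + "ACGT",
]
result = demultiplex(reads, barcodes)
print({sample: len(seqs) for sample, seqs in result.items()})
```

Because every barcode has the same length, a dictionary lookup on the first 11 bases is enough; real sequencing reads contain errors, so a production pipeline would normally tolerate mismatches.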
The barcode adapters of the present invention were designed and synthesized as 80 optimized custom barcode sequences according to the structural features of Ion Torrent high-throughput sequencing platform barcode adapters and to instrument compatibility; the single-stranded adapter strands are annealed to form specific double-stranded adapters.
The library adapters are divided into the A adapter at the 5' end of the target fragment and the P1 adapter at the 3' end. The A adapter is the barcode adapter and contains a partial universal sequence; P1 is the universal adapter. The universal sequence portions serve as complementary binding sites for the amplification primers. The adapter structures are as follows:
A-BarcodeX-F is
CCATCTCATCCCTGCGTGTCTCCGACTCAG|barcode X-F|Barcode Adapter-F|;
A-BarcodeX-R is
|Barcode Adapter-R|barcode X-R|CTGAGTCGGAGACACGCAGGGATGAGATGG*T*T;
P1-Forward is
CCACTACGCCTCCGCTTTCCTCTCTATGGGCAGTCGGTGAT;
P1-Reverse is
ATCACCGACTGCCCATAGAGAGGAAAGCGGAGGCGTAGTGG*T*T.
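The structures above can be assembled programmatically for a given barcode. The sketch below is a hedged illustration, not the patent's procedure: it assumes that the "-R" parts (Barcode Adapter-R and barcode X-R) are the reverse complements of the corresponding "-F" parts, which is consistent with the universal sequences shown but is not stated explicitly. The function names and the example barcode are hypothetical.

```python
UNIVERSAL_A = "CCATCTCATCCCTGCGTGTCTCCGACTCAG"  # universal part of A-BarcodeX-F
P1_FORWARD = "CCACTACGCCTCCGCTTTCCTCTCTATGGGCAGTCGGTGAT"
CORE = "GAT"          # Barcode Adapter-F
THIO_TAIL = "*T*T"    # doubly thio-modified 3' overhang, as written in the text

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def build_a_adapter(barcode):
    """Return (A-BarcodeX-F, A-BarcodeX-R) for one 11-base barcode."""
    forward = UNIVERSAL_A + barcode + CORE
    # Assumption: the reverse strand is the reverse complement of the forward
    # strand, followed by the thio-modified TT overhang.
    reverse = revcomp(CORE) + revcomp(barcode) + revcomp(UNIVERSAL_A) + THIO_TAIL
    return forward, reverse


fwd, rev = build_a_adapter("ATGATCATGCC")  # hypothetical barcode
print(fwd)
print(rev)
```

The same reverse-complement relation holds for the two P1 strands listed above, which gives a quick consistency check on the printed sequences.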
Design principles for the optimized custom barcode adapters:
For a custom barcode adapter, the base sequences barcodeX-F and barcodeX-R of the 5'-end A adapter are encoded in-house according to fixed adapter construction rules and ordering. The design principles are as follows:
1. The barcode sequence barcode X is 11 bases long, and the last base of every barcode X is C, which together with the G at the start of Barcode Adapter forms a stable CG pair;
2. The adapter core signal recognition sequence Barcode Adapter is GAT;
3. The 3' overhang of the double-stranded barcode adapter is doubly thio-modified (*T*T) to prevent exonuclease digestion during the polymerase reaction;
4. Each barcode sequence barcode X contains no more than 2 GC or CG structures;
5. The base percentages of the barcode sequence should satisfy: %A, %T, %C, %G = 18.00-30.00 each;
%A+%T = 45.00-60.00; %C+%G = 40.00-55.00; %A+%T+%C+%G = 100%.
The 11 bases of the DNA library barcode sequence barcode X in this design keep the following optimal base composition:
Base percentages:
%A = 27.27, %G = 18.18, %T = 27.27, %C = 27.27;
%A+%T = 54.55, %C+%G = 45.45;
Base counts: 3 A, 3 C, 2 G, 3 T.
The beneficial effects of the present invention are as follows. The invention provides a method for noninvasive virus biopsy based on high-throughput gene sequencing which, compared with the prior art, has the following advantages:
1. Wide applicability: all nucleic acids in a sample (double-stranded DNA, single-stranded DNA, double-stranded RNA, and single-stranded RNA) can be detected in situ at the same time and in their original state. The method can fully detect the viral ecological composition of a cross-infection and offers a reference scheme for research on viral metagenomes and metatranscriptomes; it is also applicable to sequencing other mixed DNA and RNA sample types.
2. High throughput and speed: the invention provides 80 barcode adapters, and the detection method is paired with the Life Ion Torrent PGM sequencer, currently the fastest high-throughput sequencing platform, so that on the order of a hundred samples can be tested in parallel with an on-instrument sequencing time of only 2.5 hours.
3. Lower sequencing cost: designing custom barcode adapters, synthesizing primers, and assembling the kit in-house greatly reduce cost and better meet research and clinical needs.
4. High abundance of extracted viral nucleic acid: DNase I and RNase first remove the nucleic acids outside the viral capsid, and the viral nucleic acid is then released by lysis and extracted, which minimizes contamination by host-cell nucleic acids.
5. In situ biopsy: fresh or frozen samples can be used without screening or culture. With the whole genome as the marker, the original viral ecological composition is examined in situ, and species identification is highly accurate.
6. Practical case: respiratory adenovirus and respiratory syncytial virus (RSV) were chosen as the study objects (their nucleic acid types are double-stranded DNA and single-stranded RNA, respectively, simulating a viral cross-infection), and a mixture of the two viruses was used as the sample in a hands-on test. Both are among the most common respiratory viruses in clinical practice.
Using high-throughput omics as its means, the present invention detects the ecological composition of a viral sample quickly, simply, economically, efficiently, and sensitively, and achieves taxonomic identification and accurate quantification of mixed viruses in complex samples.
In clinical diagnosis, sequencing unknown viral samples can detect respiratory diseases and can monitor the dynamic relationship between tumors and certain viruses. This provides a reference for disease prevention and treatment and allows real-time monitoring of clinical respiratory viral infections before and after recovery.
In scientific research, because the method detects all nucleic acid types in one pass, it can reveal new microbial metagenomes and metatranscriptomes as well as regulatory relationships between viral species. The method is suitable for disease prevention and control centers to screen viruses and identify metabolites; it has broad market prospects and considerable economic and social benefits, and is suitable for wide adoption.
Brief description of the drawings
Fig. 1 shows the result of DNA library fragment selection by E-gel excision in Embodiment 1 of the present invention;
Fig. 2 shows the Agilent 2100 analysis of library fragment distribution and molar concentration in Embodiment 1 of the present invention;
Fig. 3 shows the effective library coverage on the chip in Embodiment 1 of the present invention;
Fig. 4 shows the mean read length of the library in Embodiment 1 of the present invention;
Fig. 5 is a box plot of the sequence quality distribution in Embodiment 1 of the present invention;
Fig. 6 is the base quality distribution in Embodiment 1 of the present invention.
Detailed description of the embodiments
To better illustrate the objects, technical solutions, and advantages of the present invention, the invention is further described below with reference to specific embodiments.
Embodiment 1
The material used in this embodiment is 8 mixtures of respiratory adenovirus and respiratory syncytial virus (RSV) samples provided by Zhujiang Hospital of Southern Medical University (simulating nasopharyngeal secretions from a virus-infected respiratory tract; the nucleic acid types are double-stranded DNA and single-stranded RNA, respectively). The 8 mixtures serve as 8 sample groups, each with 10 replicates, for 80 reactions in total. During library construction, one DNA barcode adapter is added to each sample: the replicates of group 1 receive Barcode01-10, those of group 2 receive Barcode11-20, those of group 3 receive Barcode21-30, and so on up to Barcode80. This embodiment therefore validates all 80 DNA barcode adapters together. The invention is not limited to these 80 barcode adapter sequences; any single barcode adapter sequence, or any combination of them, falls within the scope of the invention.
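The barcode assignment described above (8 groups of 10 replicates mapped to Barcode01 through Barcode80) is a simple index calculation. The sketch below spells it out; the function name and the `(group, replicate)` key layout are hypothetical choices made for illustration.

```python
def assign_barcodes(n_groups=8, replicates=10):
    """Map (group, replicate), both 1-based, to a barcode adapter name."""
    plan = {}
    for group in range(1, n_groups + 1):
        for rep in range(1, replicates + 1):
            index = (group - 1) * replicates + rep
            plan[(group, rep)] = f"Barcode{index:02d}"
    return plan


plan = assign_barcodes()
print(plan[(1, 1)], plan[(2, 1)], plan[(8, 10)])
```

Group g therefore uses barcodes 10(g-1)+1 through 10g, which matches the ranges listed in the text.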
The experimental workflow of the present invention is as follows:
1. Preparation of viral nucleic acid samples
1.1 Sample pretreatment
The above mixtures were divided into 8 equal sample groups. To each, 1 ml of 1× PBS was added and mixed; the samples were held at 4 °C for 1 h, then thoroughly vortexed to resuspend and centrifuged at 3000 g for 10 min. The supernatant was collected and stored at -80 °C until use.
1.2 Digestion of host nucleic acids
To every 200 µl of supernatant, 10× DNase I buffer was added, followed by 50 U of DNase I and 50 U of RNase. The mixture was digested at 37 °C for 30 min and then heated at 75 °C for 5 min to inactivate the enzymes.
1.3 Co-extraction of viral nucleic acids
Sample preparation: in a room dedicated to virus handling, DNA and RNA were co-extracted from the digested supernatant strictly according to the viral nucleic acid extraction kit (Thermo Scientific GeneJET Viral DNA and RNA Purification Kit). The procedure is as follows:
1) Add 50 µl of Column Preparation Liquid to the center of the spin column to equilibrate the membrane.
2) Add 200 µl of sample to an empty 1.5 ml lysis tube; add 200 µl of Lysis Solution and 50 µl of Proteinase K and mix thoroughly by pipetting; incubate at 56 °C for 15 min, then centrifuge at full speed for 5 s.
3) Add 300 µl of 100% ethanol and mix by pipetting; leave at room temperature for 3 min; centrifuge at full speed for 5 s.
4) Load the lysate onto the spin column and centrifuge at 6000 g for 1 min; discard the collection tube and place the column in a new 2 ml collection tube.
5) Add 700 µl of wash buffer 1 to the column, centrifuge at 6000 g for 1 min, discard the collection tube, and place the column in a new 2 ml collection tube.
6) Add 500 µl of wash buffer 2 to the column, centrifuge at 6000 g for 1 min, discard the collection tube, and place the column in a new 2 ml collection tube.
7) Add another 500 µl of wash buffer 2 to the column, centrifuge at 6000 g for 1 min, discard the collection tube, and place the column in a new 2 ml collection tube.
8) Centrifuge at 16000 g for 3 min and discard the collection tube.
9) Transfer the column to a new 1.5 ml elution tube; add 20 µl of Eluent preheated to 56 °C; leave at room temperature for 2 min; centrifuge at 13000 g for 1 min, discard the column, and keep the eluate (which contains the sample nucleic acid).
1.4 Nucleic acid concentration measurement
Take 1 µl of the extracted nucleic acid and measure the sample concentration accurately with the Qubit 2.0 HS DNA and RNA kits, respectively, to calculate the starting amount for library construction; the total amount should be 10 ng or more.
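The 10 ng threshold can be checked from the two Qubit readings. The sketch below assumes that the total is the sum of the DNA and RNA amounts in the eluate volume still available; that interpretation, the function names, and the example readings are assumptions rather than details given in the text.

```python
MIN_INPUT_NG = 10.0  # minimum total amount stated in the text


def total_input_ng(dna_ng_per_ul, rna_ng_per_ul, volume_ul):
    """Total nucleic acid (ng) in the volume available for library construction."""
    return (dna_ng_per_ul + rna_ng_per_ul) * volume_ul


def enough_input(dna_ng_per_ul, rna_ng_per_ul, volume_ul):
    """True if the sample meets the 10 ng minimum."""
    return total_input_ng(dna_ng_per_ul, rna_ng_per_ul, volume_ul) >= MIN_INPUT_NG


# Hypothetical readings: 0.4 ng/µl DNA, 0.3 ng/µl RNA, 18 µl of eluate left.
print(round(total_input_ng(0.4, 0.3, 18.0), 2))
print(enough_input(0.4, 0.3, 18.0))
```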
2. RNA reverse transcription
2.1 cDNA first-strand synthesis
The sample contains viral DNA genomes or RNA genetic material, so no DNA digestion and no high-temperature treatment is performed during reverse transcription, in order to keep the DNA genomes intact. Heating at 65 °C opens the secondary structure of double-stranded RNA and denatures it uniformly into single-stranded RNA, so that both single-stranded and double-stranded viral RNA can be converted into double-stranded DNA through reverse transcription. The reverse transcription kit used is the GoScript Reverse Transcription System kit (#A5001).
1) As shown in Table 1, add the following reagents in order to a nuclease-free 0.2 ml EP tube, mix, and centrifuge briefly.
Table 1. cDNA first-strand synthesis reaction system 1
2) Place the tube in a PCR instrument at 65 °C for 5 min, then cool on ice immediately.
3) As shown in Table 2, add the following reagents in order to the cooled 0.2 ml EP tube.
Table 2. cDNA first-strand synthesis reaction system 2
4) Mix gently, centrifuge, and run the reaction in the PCR instrument.
2.2 cDNA second-strand synthesis
Residual RNA that was not reverse-transcribed is removed from the sample, and the single-stranded DNA is converted into double-stranded DNA.
1) As shown in Table 3, add the following reagents in order to the 20 µl of cDNA first-strand reaction mixture, keeping it on ice.
Table 3. cDNA second-strand synthesis
2) Mix gently and centrifuge. Incubate at 15 °C for 2 h. Do not let the temperature exceed 15 °C.
3) Add 2 µl (12.5 U) of T4 DNA Polymerase (#EP0061) and incubate at 15 °C for 5 min.
4) Add 0.5 µl of 0.5 M EDTA, pH 8.0 (#R1021) to stop the reaction.
2.3 Product purification
The product obtained is blunt-ended double-stranded DNA, which is purified with XP Beads magnetic beads to remove reaction substrates and enzymes.
1) Add 180 µl of Ampure XP Beads (1.8×) to the DNA product, resuspend by gentle mixing, and let stand for 5 min; place on an MPC magnetic rack for 2 min and carefully aspirate the supernatant.
2) Add 500 µl of fresh 75% ethanol to the low-bind tube and, with the tube on the magnet, wash the beads twice by rotating the tube; remove the residual ethanol after the last wash.
3) Place the low-bind tube containing the beads on a 37 °C metal heating block to dry off residual ethanol.
4) Resuspend the beads in 43 µl of low TE elution buffer, leave at room temperature for 3 min, place the tube on the magnetic rack for 2 min, and collect the supernatant, which is the purified product.
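The "1.8×" in step 1) is a bead-to-sample volume ratio, so 180 µl of beads corresponds to roughly 100 µl of DNA product. The arithmetic is sketched below; the function names are hypothetical, and the 100 µl product volume is inferred from the ratio rather than stated in the text.

```python
def bead_volume_ul(sample_volume_ul, ratio=1.8):
    """Volume of bead suspension to add for a given sample volume."""
    return ratio * sample_volume_ul


def implied_sample_volume_ul(bead_volume, ratio=1.8):
    """Sample volume implied by a bead volume at the given ratio."""
    return bead_volume / ratio


print(round(bead_volume_ul(100.0), 1))
print(round(implied_sample_volume_ul(180.0), 1))
```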
3. Barcode adapter design and synthesis
The barcode adapters in this experiment were designed and synthesized strictly according to the sequence rules for DNA library adapters on the Ion Torrent high-throughput sequencing platform, and the single-stranded adapter strands were annealed to form specific double-stranded adapters.
3.1 Optimized design rationale and principles
1) The library adapters are divided into the A adapter at the 5' end of the target fragment and the P1 adapter at the 3' end. The A adapter is the barcode adapter and contains a partial universal sequence; P1 is the universal adapter. The universal sequence portions serve as complementary binding sites for the amplification primers. The Ion Torrent barcode adapter structure used in this design is shown in Table 4. The two single strands A-BarcodeX-F and A-BarcodeX-R pair complementarily to form the double-stranded, barcoded 5'-end A adapter; the two strands P1-Forward and P1-Reverse pair complementarily to form the double-stranded 3'-end P1 adapter. In the sequence listing of the present invention, *T*T denotes thio-modification.
Table 4. Barcode adapter structure
2) Design principles for the custom barcode sequences. The barcodeX-F and barcodeX-R series of the 5'-end A adapter are defined in-house according to fixed adapter construction rules and ordering. The design principles are as follows:
(1) The barcode sequence barcode X is 11 bases long, and the last base of every barcode X is C, which together with the G at the start of Barcode Adapter forms a stable CG pair;
(2) The adapter core signal recognition sequence Barcode Adapter is GAT;
(3) The 3' overhang of the double-stranded barcode adapter is doubly thio-modified (*T*T) to prevent exonuclease digestion during the polymerase reaction;
(4) Each barcode sequence barcode X contains no more than 2 GC or CG structures;
(5) The base percentages of the barcode sequence should satisfy: %A, %T, %C, %G = 18.00-30.00 each;
%A+%T = 45.00-60.00; %C+%G = 40.00-55.00; %A+%T+%C+%G = 100%.
The 11 bases of the DNA library barcode sequence barcode X in this design keep the following optimal base composition:
Base percentages:
%A = 27.27, %G = 18.18, %T = 27.27, %C = 27.27;
%A+%T = 54.55, %C+%G = 45.45;
Base counts: 3 A, 3 C, 2 G, 3 T.
The custom barcode sequences used in this experiment are listed in Table 5.
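To see how many candidate sequences such rules leave room for, one can sample random arrangements of the preferred composition and keep those that pass. The sketch below does this; it is not the patent's own selection procedure, the sequences it prints are not the entries of Table 5, and the function names, the random seed, and the reading of "GC or CG structures" as dinucleotide counts are all assumptions.

```python
import random


def satisfies_rules(b):
    """Check length, terminal C, GC/CG count, and base percentage ranges."""
    n = len(b)
    if n != 11 or not b.endswith("C"):
        return False
    gc_cg = sum(b[i:i + 2] in ("GC", "CG") for i in range(n - 1))
    pct = {base: 100.0 * b.count(base) / n for base in "ACGT"}
    return (gc_cg <= 2
            and all(18.0 <= p <= 30.0 for p in pct.values())
            and 45.0 <= pct["A"] + pct["T"] <= 60.0
            and 40.0 <= pct["C"] + pct["G"] <= 55.0)


def random_candidates(k, seed=0):
    """Sample k distinct barcodes with 3 A, 3 C, 2 G, 3 T that end in C."""
    rng = random.Random(seed)
    pool = list("AAACCGGTTT")  # the 11th base, the terminal C, is appended below
    found = set()
    while len(found) < k:
        rng.shuffle(pool)
        candidate = "".join(pool) + "C"
        if satisfies_rules(candidate):
            found.add(candidate)
    return sorted(found)


candidates = random_candidates(5)
for barcode in candidates:
    print(barcode)
```

A practical barcode set would also need the chosen sequences to differ from one another at several positions so that sequencing errors do not turn one barcode into another; the text does not state such a criterion, so it is not enforced here.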
Combining Tables 4 and 5 gives the specific sequences of the 5'-end A adapters and the 3'-end P1 adapter that can be synthesized directly. The A adapters are named Barcode 01 to Barcode 80 in order. Each BarcodeX is the double-stranded barcoded adapter formed by complementary pairing of the two single strands A-BarcodeX-F and A-BarcodeX-R: Barcode01 is formed from A-Barcode001-F and A-Barcode001-R, Barcode02 from A-Barcode002-F and A-Barcode002-R, and so on up to Barcode80, which is formed from A-Barcode080-F and A-Barcode080-R. The nucleotide sequences of the individual strands are shown in the following SEQ ID NOs:
A-Barcode001-F: SEQ ID NO: 81; A-Barcode001-R: SEQ ID NO: 82;
A-Barcode002-F: SEQ ID NO: 83; A-Barcode002-R: SEQ ID NO: 84;
A-Barcode003-F: SEQ ID NO: 85; A-Barcode003-R: SEQ ID NO: 86;
A-Barcode004-F: SEQ ID NO: 87; A-Barcode004-R: SEQ ID NO: 88;
A-Barcode005-F: SEQ ID NO: 89; A-Barcode005-R: SEQ ID NO: 90;
A-Barcode006-F: SEQ ID NO: 91; A-Barcode006-R: SEQ ID NO: 92;
A-Barcode007-F: SEQ ID NO: 93; A-Barcode007-R: SEQ ID NO: 94;
A-Barcode008-F: SEQ ID NO: 95; A-Barcode008-R: SEQ ID NO: 96;
A-Barcode009-F: SEQ ID NO: 97; A-Barcode009-R: SEQ ID NO: 98;
A-Barcode010-F: SEQ ID NO: 99; A-Barcode010-R: SEQ ID NO: 100;
A-Barcode011-F: SEQ ID NO: 101; A-Barcode011-R: SEQ ID NO: 102;
A-Barcode012-F: SEQ ID NO: 103; A-Barcode012-R: SEQ ID NO: 104;
A-Barcode013-F: SEQ ID NO: 105; A-Barcode013-R: SEQ ID NO: 106;
A-Barcode014-F: SEQ ID NO: 107; A-Barcode014-R: SEQ ID NO: 108;
A-Barcode015-F: SEQ ID NO: 109; A-Barcode015-R: SEQ ID NO: 110;
A-Barcode016-F: SEQ ID NO: 111; A-Barcode016-R: SEQ ID NO: 112;
A-Barcode017-F: SEQ ID NO: 113; A-Barcode017-R: SEQ ID NO: 114;
A-Barcode018-F: SEQ ID NO: 115; A-Barcode018-R: SEQ ID NO: 116;
A-Barcode019-F: SEQ ID NO: 117; A-Barcode019-R: SEQ ID NO: 118;
A-Barcode020-F: SEQ ID NO: 119; A-Barcode020-R: SEQ ID NO: 120;
A-Barcode021-F: SEQ ID NO: 121; A-Barcode021-R: SEQ ID NO: 122;
A-Barcode022-F: SEQ ID NO: 123; A-Barcode022-R: SEQ ID NO: 124;
A-Barcode023-F: SEQ ID NO: 125; A-Barcode023-R: SEQ ID NO: 126;
A-Barcode024-F: SEQ ID NO: 127; A-Barcode024-R: SEQ ID NO: 128;
A-Barcode025-F: SEQ ID NO: 129; A-Barcode025-R: SEQ ID NO: 130;
A-Barcode026-F: SEQ ID NO: 131; A-Barcode026-R: SEQ ID NO: 132;
A-Barcode027-F: SEQ ID NO: 133; A-Barcode027-R: SEQ ID NO: 134;
A-Barcode028-F: SEQ ID NO: 135; A-Barcode028-R: SEQ ID NO: 136;
A-Barcode029-F: SEQ ID NO: 137; A-Barcode029-R: SEQ ID NO: 138;
A-Barcode030-F: SEQ ID NO: 139; A-Barcode030-R: SEQ ID NO: 140;
A-Barcode031-F: SEQ ID NO: 141; A-Barcode031-R: SEQ ID NO: 142;
A-Barcode032-F: SEQ ID NO: 143; A-Barcode032-R: SEQ ID NO: 144;
A-Barcode033-F: SEQ ID NO: 145; A-Barcode033-R: SEQ ID NO: 146;
A-Barcode034-F: SEQ ID NO: 147; A-Barcode034-R: SEQ ID NO: 148;
A-Barcode035-F: SEQ ID NO: 149; A-Barcode035-R: SEQ ID NO: 150;
A-Barcode036-F: SEQ ID NO: 151; A-Barcode036-R: SEQ ID NO: 152;
A-Barcode037-F: SEQ ID NO: 153; A-Barcode037-R: SEQ ID NO: 154;
A-Barcode038-F: SEQ ID NO: 155; A-Barcode038-R: SEQ ID NO: 156;
A-Barcode039-F: SEQ ID NO: 157; A-Barcode039-R: SEQ ID NO: 158;
A-Barcode040-F: SEQ ID NO: 159; A-Barcode040-R: SEQ ID NO: 160;
A-Barcode041-F: SEQ ID NO: 161; A-Barcode041-R: SEQ ID NO: 162;
A-Barcode042-F: SEQ ID NO: 163; A-Barcode042-R: SEQ ID NO: 164;
A-Barcode043-F: SEQ ID NO: 165; A-Barcode043-R: SEQ ID NO: 166;
A-Barcode044-F: SEQ ID NO: 167; A-Barcode044-R: SEQ ID NO: 168;
A-Barcode045-F: SEQ ID NO: 169; the nucleotide
sequence of Barcode045-R such as SEQ ID NO:Shown in 170, the nucleotide sequence of the A-Barcode046-F is such as SEQ ID NO:Shown in 171, the nucleotide sequence such as SEQ ID NO of the A-Barcode046-R:Shown in 172, the A- The nucleotide sequence of Barcode047-F such as SEQ ID NO:Shown in 173, the nucleotide sequence of the A-Barcode047-R is such as SEQ ID NO:Shown in 174, the nucleotide sequence such as SEQ ID NO of the A-Barcode048-F:Shown in 175, the A- The nucleotide sequence of Barcode048-R such as SEQ ID NO:Shown in 176, the nucleotide sequence of the A-Barcode049-F is such as SEQ ID NO:Shown in 177, the nucleotide sequence such as SEQ ID NO of the A-Barcode049-R:Shown in 178, the A- The nucleotide sequence of Barcode050-F such as SEQ ID NO:Shown in 179, the nucleotide sequence of the A-Barcode050-R is such as SEQ ID NO:Shown in 180, the nucleotide sequence such as SEQ ID NO of the A-Barcode051-F:Shown in 181, the A- The nucleotide sequence of Barcode051-R such as SEQ ID NO:Shown in 182, the nucleotide sequence of the A-Barcode052-F is such as SEQ ID NO:Shown in 183, the nucleotide sequence such as SEQ ID NO of the A-Barcode052-R:Shown in 184, the A- The nucleotide sequence of Barcode053-F such as SEQ ID NO:Shown in 185, the nucleotide sequence of the A-Barcode053-R is such as SEQ ID NO:Shown in 186, the nucleotide sequence such as SEQ ID NO of the A-Barcode054-F:Shown in 187, the A- The nucleotide sequence of Barcode054-R such as SEQ ID NO:Shown in 188, the nucleotide sequence of the A-Barcode055-F is such as SEQ ID NO:Shown in 189, the nucleotide sequence such as SEQ ID NO of the A-Barcode055-R:Shown in 190, the A- The nucleotide sequence of Barcode056-F such as SEQ ID NO:Shown in 191, the nucleotide sequence of the A-Barcode056-R is such as SEQ ID NO:Shown in 192, the nucleotide sequence such as SEQ ID NO of the A-Barcode057-F:Shown in 193, the A- The nucleotide sequence of Barcode057-R such as SEQ ID NO:Shown in 194, the nucleotide sequence of the 
A-Barcode058-F is such as SEQ ID NO:Shown in 195, the nucleotide sequence such as SEQ ID NO of the A-Barcode058-R:Shown in 196, the A- The nucleotide sequence of Barcode059-F such as SEQ ID NO:Shown in 197, the nucleotide sequence of the A-Barcode059-R is such as SEQ ID NO:Shown in 198, the nucleotide sequence such as SEQ ID NO of the A-Barcode060-F:Shown in 199, the A- The nucleotide sequence of Barcode060-R such as SEQ ID NO:Shown in 200, the nucleotide sequence of the A-Barcode061-F is such as SEQ ID NO:Shown in 201, the nucleotide sequence such as SEQ ID NO of the A-Barcode061-R:Shown in 202, the A- The nucleotide sequence of Barcode062-F such as SEQ ID NO:Shown in 203, the nucleotide sequence of the A-Barcode062-R is such as SEQ ID NO:Shown in 204, the nucleotide sequence such as SEQ ID NO of the A-Barcode063-F:Shown in 205, the A- The nucleotide sequence of Barcode063-R such as SEQ ID NO:Shown in 206, the nucleotide sequence of the A-Barcode064-F is such as SEQ ID NO:Shown in 207, the nucleotide sequence such as SEQ ID NO of the A-Barcode064-R:Shown in 208, the A- The nucleotide sequence of Barcode065-F such as SEQ ID NO:Shown in 209, the nucleotide sequence of the A-Barcode065-R is such as SEQ ID NO:Shown in 210, the nucleotide sequence such as SEQ ID NO of the A-Barcode066-F:Shown in 211, the A- The nucleotide sequence of Barcode066-R such as SEQ ID NO:Shown in 212, the nucleotide sequence of the A-Barcode067-F is such as SEQ ID NO:Shown in 213, the nucleotide sequence such as SEQ ID NO of the A-Barcode067-R:Shown in 214, the A- The nucleotide sequence of Barcode068-F such as SEQ ID NO:Shown in 215, the nucleotide sequence of the A-Barcode068-R is such as SEQ ID NO:Shown in 216, the nucleotide sequence such as SEQ ID NO of the A-Barcode069-F:Shown in 217, the A- The nucleotide sequence of Barcode069-R such as SEQ ID NO:Shown in 218, the nucleotide sequence of the A-Barcode070-F is such as SEQ ID NO:Shown in 219, the nucleotide sequence such as SEQ ID NO of 
the A-Barcode070-R:Shown in 220, the A- The nucleotide sequence of Barcode071-F such as SEQ ID NO:Shown in 221, the nucleotide sequence of the A-Barcode071-R is such as SEQ ID NO:Shown in 222, the nucleotide sequence such as SEQ ID NO of the A-Barcode072-F:Shown in 223, the A- The nucleotide sequence of Barcode072R such as SEQ ID NO:Shown in 224, the nucleotide sequence of the A-Barcode073-F is such as SEQ ID NO:Shown in 225, the nucleotide sequence such as SEQ ID NO of the A-Barcode073-R:Shown in 226, the A- The nucleotide sequence of Barcode074-F such as SEQ ID NO:Shown in 227, the nucleotide sequence of the A-Barcode074-R is such as SEQ ID NO:Shown in 228, the nucleotide sequence such as SEQ ID NO of the A-Barcode075-F:Shown in 229, the A- The nucleotide sequence of Barcode075-R such as SEQ ID NO:Shown in 230, the nucleotide sequence of the A-Barcode076-F is such as SEQ ID NO:Shown in 231, the nucleotide sequence such as SEQ ID NO of the A-Barcode076-R:Shown in 232, the A- The nucleotide sequence of Barcode077-F such as SEQ ID NO:Shown in 233, the nucleotide sequence of the A-Barcode077-R is such as SEQ ID NO:Shown in 234, the nucleotide sequence such as SEQ ID NO of the A-Barcode078-F:Shown in 235, the A- The nucleotide sequence of Barcode078-R such as SEQ ID NO:Shown in 236, the nucleotide sequence of the A-Barcode079-F is such as SEQ ID NO:Shown in 237, the nucleotide sequence such as SEQ ID NO of the A-Barcode079-R:Shown in 238, the A- The nucleotide sequence of Barcode080-F such as SEQ ID NO:Shown in 239, the nucleotide sequence of the A-Barcode080-R is such as SEQ ID NO:Shown in 240.
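The numbering above is regular: the forward and reverse strands of Barcode01 to Barcode80 occupy SEQ ID NO: 81 to 240 in alternating order. As a minimal sketch (the function name `seq_id_numbers` is an illustrative assumption, not part of the specification), the lookup can be written as:

```python
def seq_id_numbers(barcode_index: int) -> dict:
    """Map a barcode index (1..80) to the SEQ ID NOs of its two strands.

    Follows the enumeration in the text: the F strand of barcode N is
    SEQ ID NO 2N+79 and the R strand is SEQ ID NO 2N+80.
    """
    if not 1 <= barcode_index <= 80:
        raise ValueError("barcode index must be between 1 and 80")
    name = f"A-Barcode{barcode_index:03d}"
    return {
        f"{name}-F": 2 * barcode_index + 79,
        f"{name}-R": 2 * barcode_index + 80,
    }


if __name__ == "__main__":
    for n in (1, 45, 80):
        print(seq_id_numbers(n))
```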
3.2 Pretreatment of the label connectors
The synthesized primers are hydroxyl-terminated single-stranded nucleotide sequences and must be processed into double-stranded connectors.
1) Dissolve each synthesized primer (dry powder) in 1× Low TE to 200 μM.
2) Annealing of the 5′-end A connector of the DNA library into a duplex (taking one pair of 5′-end A connector primers as an example): take 10 μL each of the diluted A-BarcodeX-F and A-BarcodeX-R, add 20 μL of 5× T4 DNA Ligase Buffer and 40 μL of nuclease-free water, and mix to a final concentration of 40 μM. Anneal into double-stranded connectors in a PCR instrument under the following conditions: 95 °C, 5 min; 72 °C, 5 min; 60 °C, 5 min; 50 °C, 3 min; 40 °C, 3 min; 30 °C, 3 min; 20 °C, 3 min; 10 °C, 3 min; 4 °C, hold. This yields a 20 μM Adaptor Mix, which is stored at -20 °C.
3) The 3′-end P1 connector of the DNA library is annealed into a duplex by the same method.
4) Finally, the connectors for the two ends of the library are mixed in equal amounts at equal concentration, giving a 20 μM Adaptor Mix that can be used directly in reactions; store at -20 °C.
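The stepwise annealing program in step 2) can be written down as plain data, which makes it easy to check before programming a thermal cycler. The sketch below is illustrative only; the names `ANNEALING_PROGRAM`, `total_minutes` and `is_stepwise_cooling` are assumptions introduced here, and the final 4 °C hold is excluded because it has no fixed duration.

```python
# (temperature in deg C, hold time in minutes), as listed in step 2).
ANNEALING_PROGRAM = [
    (95, 5), (72, 5), (60, 5),
    (50, 3), (40, 3), (30, 3), (20, 3), (10, 3),
]
FINAL_HOLD_C = 4  # held indefinitely after the last timed step


def total_minutes(program):
    """Sum of the timed holds, excluding the final indefinite hold."""
    return sum(minutes for _, minutes in program)


def is_stepwise_cooling(program):
    """True if every step is cooler than the one before it."""
    temps = [temp for temp, _ in program]
    return all(a > b for a, b in zip(temps, temps[1:]))


if __name__ == "__main__":
    print("timed steps:", total_minutes(ANNEALING_PROGRAM), "min")
    print("monotonic cooling:", is_stepwise_cooling(ANNEALING_PROGRAM))
```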
3.3 Synthesis of PCR primers for library amplification
The sequencing PCR primer sequences are taken from the instructions of the Life Technologies DNA library construction kit; see Table 6. After synthesis, pcr-priemer-A and pcr-priemer-P1 are diluted to 10 μM with 10 mM Tris (pH 7.5) to prepare a PCR primer mix that can be used directly in reactions. The sequence of pcr-priemer-A is shown in SEQ ID NO: 241, and the sequence of pcr-priemer-P1 is shown in SEQ ID NO: 242.
Table 6. Sequencing PCR primer sequences
4. DNA library construction and kit assembly
Single-tube reagents of the Thermo brand are combined with the amplification primers and the self-synthesized barcode label connectors and assembled into a DNA library construction kit. Libraries are built for a 200 bp read length with a 100 ng input, and the procedure is optimized for the characteristics of viral genomes.
4.1 DNA fragmentation
DNA is fragmented precisely with a non-contact Bioruptor sonicator. About 100 ng of the prepared viral nucleic acid product is added to a low-adsorption shearing tube and brought to 80 μL with nuclease-free water. The water bath is filled with ice water and pre-chilled for 30 min, the frequency is set to high ("H"), and each cycle consists of 0.5 min of sonication and a 0.5 min pause, for 18 cycles in total. The viral DNA is thereby sheared into fragments with a main peak at 250 bp and a range of 50-600 bp, which is verified for quality control by electrophoresis on a 2% low-range agarose gel.
4.2 End repair
Add the reagents according to the reaction system in Table 7, mix, and incubate at room temperature for 30 min.
Table 7. End-repair reaction system
The End Repair Enzymes are a mixture of three enzymes (T4 DNA polymerase, Klenow fragment and T4 PNK) combined at equal unit titers (20 U), with 0.2× volume of 10 nM dNTP added.
Product purification: Agencourt AMPure magnetic beads are added at 1.8 times the product volume to bind and capture the sample, the beads are washed twice with freshly prepared 70% ethanol, and the end-repaired product is recovered by elution with 1× Low TE.
4.3 Nick fill-in and connector ligation
Add the reagents according to the reaction system in Table 8 to a 0.2 mL PCR tube, mix, and place on a PCR instrument: 25 °C, 25 min; 72 °C, 5 min; 4 °C, hold. This example uses 8 groups of samples with 10 replicate wells per group, 80 reaction wells in total. Each reaction well corresponds to one Barcode; the groups use, in order, Barcode01-10, Barcode11-20, Barcode21-30, Barcode31-40, Barcode41-50, Barcode51-60, Barcode61-70 and Barcode71-80.
Table 8. Connector ligation and nick repair reaction system
Purification of the ligation product: Agencourt AMPure magnetic beads are added at 1.2 times the product volume to bind and capture the sample, the beads are washed twice with freshly prepared 70% ethanol, and the ligation product is recovered by elution with 1× Low TE.
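The well layout of this section (8 sample groups, 10 replicate wells each, one Barcode per well) can be generated programmatically, which helps avoid assigning the same label to two wells. This is only a sketch; the function name `barcode_layout` and the group keys are assumptions introduced for illustration.

```python
def barcode_layout(groups: int = 8, replicates: int = 10) -> dict:
    """Assign consecutive barcodes to each sample group.

    Group 1 receives Barcode01-10, group 2 receives Barcode11-20, and so
    on, matching the order given in the text.
    """
    layout = {}
    for g in range(groups):
        first = g * replicates + 1
        layout[f"group{g + 1}"] = [
            f"Barcode{n:02d}" for n in range(first, first + replicates)
        ]
    return layout


if __name__ == "__main__":
    for group, barcodes in barcode_layout().items():
        print(group, barcodes[0], "to", barcodes[-1])
```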
4.4 Fragment selection
Because of the read-length limit of the sequencing instrument, the average read length of the Ion Torrent platform is 200 bp. The connectors at the two ends are 43 bp each, so fragments with a main peak at 330 bp should be selected for recovery; this improves the quality of subsequent template preparation and sequencing.
Fragment recovery: using the E-Gel SizeSelect™ Agarose Gel automated gel-cutting system with a 2% precast gel, the program is set to "SizeSelect™ 2%" and the time to "20 min". A 50 bp DNA Ladder is run alongside, and the 330 bp band is recovered when it reaches the collection well. Fig. 1 shows part of the results of DNA library fragment selection by E-Gel gel cutting: with the 350 bp band of the 50 bp DNA Ladder marker as reference, the 330 bp main-peak band of the target DNA is recovered.
4.5 Library amplification
Add the reagents according to the reaction system in Table 9, mix, and run the reaction on a PCR instrument.
Table 9. Library amplification reaction system and reaction program
Purification of the amplified library: Agencourt AMPure magnetic beads are added at 1.2 times the product volume to bind and capture the sample, the beads are washed twice with freshly prepared 70% ethanol, and the product is recovered by elution with 1× Low TE. This retains as much as possible of the 330 bp target main peak from fragment selection while removing connector dimers and primer contamination. The purified product is the DNA library.
4.6 Library quantification and quality control
1) Take 1 μL of the DNA library and measure its approximate concentration range by fluorometry with the Qubit dsDNA HS Assay Kit;
2) Take 1 μL of the library (diluted to 1 ng/μL according to the concentration measured in the previous step) and determine the fragment size and concentration of the library with the High Sensitivity DNA Kit. As shown in the 2100 results for some of the libraries in Fig. 2 (abscissa: fragment size; ordinate: fluorescence signal), the library fragments have a main peak at 330 bp, a single-peak concentration of 593.97 pg/μL and a molar concentration of 2990.0 pmol/L. The PCR reaction therefore maintains a high amplification efficiency, and the customized label connectors and the assembled DNA library construction kit pass preliminary verification.
The molar concentration is calculated as:
Library molar concentration (pM) = library concentration (ng/μL) × 1.515 × 1000 / library length (kb)
With reference to the molar concentration reported by the Agilent 2100, the library is diluted to 26 pM for template preparation.
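The molar concentration formula and the dilution to 26 pM can be sketched as two small functions. The function names are assumptions introduced here. The worked input uses the peak concentration and main-peak length quoted in the text; the molarity reported in the text is the value provided by the Agilent 2100, so the formula-based estimate need not coincide with it.

```python
def library_molarity_pM(conc_ng_per_ul: float, length_kb: float) -> float:
    """Library molar concentration (pM) from the formula in the text:
    concentration (ng/uL) x 1.515 x 1000 / library length (kb)."""
    if length_kb <= 0:
        raise ValueError("library length must be positive")
    return conc_ng_per_ul * 1.515 * 1000 / length_kb


def dilution_factor(stock_pM: float, target_pM: float = 26.0) -> float:
    """Fold dilution needed to bring a stock to the target molarity."""
    if target_pM <= 0:
        raise ValueError("target molarity must be positive")
    return stock_pM / target_pM


if __name__ == "__main__":
    # 593.97 pg/uL = 0.59397 ng/uL; main peak 330 bp = 0.330 kb
    estimate = library_molarity_pM(0.59397, 0.330)
    print(f"formula estimate: {estimate:.1f} pM")
    # Dilution based on the instrument-reported 2990.0 pmol/L
    print(f"fold dilution to 26 pM: {dilution_factor(2990.0):.0f}")
```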
5. Template preparation
The above 80 barcoded libraries are pooled in equal amounts and diluted, and 20 μL of the dilution is used for emulsion PCR with the Ion OneTouch OT2200 Kit and the OneTouch instrument. On the amplification plate, the reagents are added according to the reaction system in Table 10 and mixed. After template preparation, ISPs are enriched with MyOne magnetic beads and the Ion OneTouch ES instrument, finally giving a positive template library with magnetic beads at the 3′ end.
Table 10. Template preparation reaction system
6. Ion Torrent PGM sequencing
Sequencing strictly follows the standard operating procedure of the Ion PGM Sequencing 200 Kit v2. After chlorine-water cleaning, deionized-water cleaning, adjustment of the pH to 7.65 and initialization, the 318 chip is loaded on the PGM sequencer, 500 flows (sequencing cycles) are set, and the reaction runs for three hours until sequencing is complete.
Sequencing report: as shown in the chip sequencing results in Fig. 3, the measured effective coverage of the 318 chip reached 81%, and the total raw sequencing data generated was 864 Mb; the key signal was 66. After data filtering, the proportion carrying library was 100%; after removal of polyclonal, low-quality and test-fragment reads, the final usable library was 61%, 5,513,567 reads in total. As shown in the read-length histogram in Fig. 4 (abscissa: fragment length; ordinate: number of reads), the main peak of the read length is 267 bp.
The total bases obtained after filtering are 527 Mb. In this run, the reads of the 80 samples Barcode01 to Barcode80 were approximately equal, at about 5 Mb of data each, which indicates that the label connectors Barcode01 to Barcode80 are of high quality.
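The per-barcode data volumes above presuppose that each read is assigned to its sample by the label sequence it carries. The sketch below illustrates that assignment under explicit assumptions: reads are taken to begin directly with the 11-base label followed by the core sequence GAT (any instrument key already trimmed), only exact matches are accepted, and the two label sequences in the usage example are hypothetical placeholders rather than sequences from the sequence listing. On the actual platform this step is performed by the vendor software.

```python
from collections import defaultdict

BARCODE_LENGTH = 11   # length of the label sequence barcode X
CORE = "GAT"          # connector core signal identification sequence


def demultiplex(reads, barcode_to_sample):
    """Group reads by sample using an exact match on the leading label.

    A read is assigned only if its first 11 bases are a known label and
    the next three bases are the core sequence; the label and core are
    trimmed from assigned reads. Everything else goes to 'unassigned'.
    """
    bins = defaultdict(list)
    prefix = BARCODE_LENGTH + len(CORE)
    for read in reads:
        label = read[:BARCODE_LENGTH]
        core = read[BARCODE_LENGTH:prefix]
        sample = barcode_to_sample.get(label) if core == CORE else None
        if sample is None:
            bins["unassigned"].append(read)
        else:
            bins[sample].append(read[prefix:])
    return dict(bins)


if __name__ == "__main__":
    # Hypothetical labels, for illustration only.
    labels = {"ATCAGTCATGC": "sample01", "TACGATCTAGC": "sample02"}
    reads = [
        "ATCAGTCATGC" + "GAT" + "ACGTACGT",
        "TACGATCTAGC" + "GAT" + "TTTT",
        "GGGGGGGGGGG" + "GAT" + "AAAA",
    ]
    print(demultiplex(reads, labels))
```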
7. Data quality control
The raw sequencing data are stored in FASTQ format. The results are analyzed with Torrent Suite 4.2, the analysis software supplied with the Ion Torrent PGM server system, and the FastQC tool is used for data quality control and preprocessing. The results are shown in Figs. 5 and 6:
Fig. 5 is a box plot of the sequence quality distribution. The abscissa is the base position in the reads (5′ to 3′) and the ordinate summarizes the base quality of all reads at that position, including the maximum, minimum, upper quartile, lower quartile, median (red solid line) and mean (blue line). In general, base quality is lower at the 5′ and 3′ ends of reads and higher in the middle. The filtered data from this run have a very high average quality, lying almost entirely in the green region (quality value Q > 28).
Fig. 6 shows the distribution of read quality. The abscissa is the mean quality of a read and the ordinate is the number of reads. The great majority of reads have a mean quality of about 32, showing that the average quality of the reads as a whole is very high.
As Figs. 5 and 6 show, the reads of this sequencing library fall within 50-250 bp and their quality value averages Q30, corresponding to a reliability of 99.9%. Data are generally considered acceptable at Q20, a reliability of 99%. In this run the quality values of the 80 samples Barcode01 to Barcode80 all reach Q30, so the label connectors Barcode01 to Barcode80 are judged to be of acceptable quality and pass verification.
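The reliabilities quoted for Q20 and Q30 follow from the standard Phred definition, Q = -10·log10(P), where P is the probability that a base call is wrong. A minimal sketch of the conversion (the function names are illustrative):

```python
def phred_to_error_probability(q: float) -> float:
    """Probability that a base call with Phred quality q is wrong."""
    return 10 ** (-q / 10.0)


def phred_to_accuracy(q: float) -> float:
    """Base-call reliability corresponding to Phred quality q."""
    return 1.0 - phred_to_error_probability(q)


if __name__ == "__main__":
    for q in (20, 28, 30):
        print(f"Q{q}: reliability {phred_to_accuracy(q):.4%}")
```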
8. Screening analysis of viral composition
The data are aligned and analyzed with the Alignment plug-in supplied with the Life server and with the Ion Reporter cloud analysis platform.
Annotation of the viral metagenome: VMGAP software is used together with the MG-RAST automated software, integrating published metagenomic data, to perform viral classification and function prediction. Statistics show that more than 95% of the sequencing data of Barcode01 to Barcode80 align to the genomes of respiratory tract adenovirus and syncytial virus (RSV). Thus both double-stranded DNA and single-stranded RNA nucleic acid fragments were successfully ligated to the label connectors, and construction of the viral gene library was completed. The library construction kit is therefore assembled successfully, and this large-scale sequencing workflow for viral metagenomes is established.
Finally, it should be noted that the above embodiments only illustrate the technical solutions of the present invention and do not limit its scope of protection. Although the present invention has been described in detail with reference to preferred embodiments, those skilled in the art should understand that the technical solutions of the present invention may be modified or equivalently replaced without departing from the essence and scope of the technical solutions of the present invention.

Claims (12)

1. A set of DNA label connectors, characterized in that the DNA label connectors comprise A-Barcode001-F/R to A-Barcode010-F/R; the nucleotide sequences of A-Barcode001-F/R to A-Barcode010-F/R are shown in SEQ ID NO: 81 to SEQ ID NO: 100.
2. The set of DNA label connectors according to claim 1, characterized in that the DNA label connectors further comprise A-Barcode011-F/R to A-Barcode020-F/R; the nucleotide sequences of A-Barcode011-F/R to A-Barcode020-F/R are shown in SEQ ID NO: 101 to SEQ ID NO: 120.
3. The set of DNA label connectors according to claim 1, characterized in that the DNA label connectors further comprise A-Barcode021-F/R to A-Barcode030-F/R; the nucleotide sequences of A-Barcode021-F/R to A-Barcode030-F/R are shown in SEQ ID NO: 121 to SEQ ID NO: 140.
4. The set of DNA label connectors according to claim 1, characterized in that the DNA label connectors further comprise A-Barcode031-F/R to A-Barcode040-F/R; the nucleotide sequences of A-Barcode031-F/R to A-Barcode040-F/R are shown in SEQ ID NO: 141 to SEQ ID NO: 160.
5. The set of DNA label connectors according to claim 1, characterized in that the DNA label connectors further comprise A-Barcode041-F/R to A-Barcode050-F/R; the nucleotide sequences of A-Barcode041-F/R to A-Barcode050-F/R are shown in SEQ ID NO: 161 to SEQ ID NO: 180.
6. The set of DNA label connectors according to claim 1, characterized in that the DNA label connectors further comprise A-Barcode051-F/R to A-Barcode060-F/R; the nucleotide sequences of A-Barcode051-F/R to A-Barcode060-F/R are shown in SEQ ID NO: 181 to SEQ ID NO: 200.
7. The set of DNA label connectors according to claim 1, characterized in that the DNA label connectors further comprise A-Barcode061-F/R to A-Barcode070-F/R; the nucleotide sequences of A-Barcode061-F/R to A-Barcode070-F/R are shown in SEQ ID NO: 201 to SEQ ID NO: 220.
8. The set of DNA label connectors according to claim 1, characterized in that the DNA label connectors further comprise A-Barcode071-F/R to A-Barcode080-F/R; the nucleotide sequences of A-Barcode071-F/R to A-Barcode080-F/R are shown in SEQ ID NO: 221 to SEQ ID NO: 240.
9. Use of the set of DNA label connectors according to any one of claims 1-8 in constructing a DNA tag library.
10. A method for designing a label connector, characterized in that the design method comprises the following steps:
(1) the label sequence barcode X is 11 bases long and its last base is C, which forms a stable CG base pair with the G at the beginning of the connector core signal identification sequence Barcode Adapter;
The connector core signal identification sequence Barcode Adapter is GAT;
The 3′ overhang of the reverse strand in the double-stranded label connector carries a double thio-modification;
The label sequence barcode X contains no more than 2 GC or CG structures;
The base percentages of the label sequence barcode X are: A, T, C and G each 18.00%-30.00%; A+T = 45.00%-60.00%; C+G = 40.00%-55.00%; A+T+C+G = 100%;
(2) the universal sequence, the label sequence barcode X designed in step (1) and the connector core signal identification sequence Barcode Adapter are connected in that order to give the label connector;
The universal sequence is CCATCTCATCCCTGCGTGTCTCCGACTCAG.
11. The method for designing a label connector according to claim 10, characterized in that in step (1) the proportions of the 11 bases of the label sequence barcode X are: A = 27.27%, G = 18.18%, T = 27.27%, C = 27.27%, A+T = 54.55%, C+G = 45.45%;
The label sequence barcode X contains 3 A, 3 C, 2 G and 3 T.
12. A method for non-invasive biopsy of viruses based on high-throughput gene sequencing, characterized in that the method comprises the following steps:
(1) DNase I and RNase are added to n samples to be tested, where n is an integer and 1≤n≤80; after the host nucleic acid outside the viral capsid has been digested, DNA and RNA are co-extracted;
(2) the DNA and RNA extracted in step (1) are converted into blunt-ended double-stranded DNA sequences;
(3) the double-stranded DNA sequences obtained in step (2) are fragmented;
(4) the ends of the DNA fragments obtained in step (3) are repaired;
(5) the P1 universal connector is ligated to the 3′ end and a DNA label connector to the 5′ end, with different samples receiving different DNA label connectors; the DNA label connectors are any of the DNA label connectors of claims 1-8;
(6) the ligation products carrying different DNA label connectors from step (5) are recovered, purified and amplified by PCR, the PCR amplification primers being pcr-priemer-A and pcr-priemer-P1;
The sequence of pcr-priemer-A is shown in SEQ ID NO: 241, and the sequence of pcr-priemer-P1 is shown in SEQ ID NO: 242;
(7) the amplified ligation products carrying different label connectors from step (6) are pooled and then sequenced, and the viral species and related ecological information in each sample are identified by database alignment and by locking onto the label connector sequences.
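The design rules for the label sequence in claims 10 and 11 can be checked mechanically. The following is a sketch under stated assumptions, not the inventors' tool: the function names are hypothetical, the limit on "GC or CG structures" is interpreted as a combined count of GC and CG dinucleotides within the label, the thio-modification is a synthesis detail that is not modeled, and the example label is a made-up sequence that merely satisfies the rules.

```python
UNIVERSAL = "CCATCTCATCCCTGCGTGTCTCCGACTCAG"  # universal sequence (claim 10)
CORE = "GAT"                                   # Barcode Adapter core sequence


def check_barcode(barcode: str) -> list:
    """Return the list of violated design rules (empty if none)."""
    if len(barcode) != 11 or set(barcode) - set("ATCG"):
        return ["label must be 11 bases of A, T, C, G"]
    problems = []
    if not barcode.endswith("C"):
        problems.append("last base is not C")
    # Assumed interpretation: GC and CG dinucleotides counted together.
    pairs = sum(barcode[i:i + 2] in ("GC", "CG") for i in range(10))
    if pairs > 2:
        problems.append("more than 2 GC/CG structures")
    pct = {base: 100.0 * barcode.count(base) / 11 for base in "ATCG"}
    if any(not 18.0 <= p <= 30.0 for p in pct.values()):
        problems.append("a base falls outside 18%-30%")
    if not 45.0 <= pct["A"] + pct["T"] <= 60.0:
        problems.append("A+T outside 45%-60%")
    if not 40.0 <= pct["C"] + pct["G"] <= 55.0:
        problems.append("C+G outside 40%-55%")
    return problems


def build_forward_strand(barcode: str) -> str:
    """Join universal sequence, label and core sequence in that order."""
    problems = check_barcode(barcode)
    if problems:
        raise ValueError("; ".join(problems))
    return UNIVERSAL + barcode + CORE


if __name__ == "__main__":
    example = "ATCAGTCATGC"  # hypothetical label: 3 A, 3 T, 3 C, 2 G
    print(check_barcode(example))
    print(build_forward_strand(example))
```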
CN201511035148.XA 2015-12-31 2015-12-31 A kind of method and label connector based on the noninvasive biopsy virus of high-throughput gene sequencing Active CN105567681B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201511035148.XA CN105567681B (en) 2015-12-31 2015-12-31 A kind of method and label connector based on the noninvasive biopsy virus of high-throughput gene sequencing

Publications (2)

Publication Number Publication Date
CN105567681A CN105567681A (en) 2016-05-11
CN105567681B true CN105567681B (en) 2018-08-31

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
PE01 Entry into force of the registration of the contract for pledge of patent right

Denomination of invention: A noninvasive virus biopsy method based on high-throughput gene sequencing and its label connector

Effective date of registration: 2021-05-27

Granted publication date: 2018-08-31

Pledgee: Yuexiu Sub-branch of Bank of Guangzhou Co., Ltd.

Pledgor: GUANGZHOU SAGENE BIOTECH Co., Ltd.

Registration number: Y2021440000182

PP01 Preservation of patent right

Effective date of registration: 2022-07-04

Granted publication date: 2018-08-31
