CN110957005B - Design of primer for amplicon sequencing and construction method of amplicon sequencing library - Google Patents

Info

Publication number
CN110957005B
Authority
CN
China
Prior art keywords
primer
primers
pcr
sequencing
distance
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910761748.6A
Other languages
Chinese (zh)
Other versions
CN110957005A (en)
Inventor
刘继强
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Compass Biotechnology Technology Co ltd
Original Assignee
Beijing Compass Biotechnology Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Compass Biotechnology Technology Co ltd
Priority to CN201910761748.6A
Publication of CN110957005A
Application granted
Publication of CN110957005B
Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B20/00 ICT specially adapted for functional genomics or proteomics, e.g. genotype-phenotype associations
    • G16B20/20 Allele or variant detection, e.g. single nucleotide polymorphism [SNP] detection
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B40/00 ICT specially adapted for biostatistics; ICT specially adapted for bioinformatics-related machine learning or data mining, e.g. knowledge discovery or pattern finding
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B50/00 ICT programming tools or database systems specially adapted for bioinformatics
    • G16B50/30 Data warehousing; Computing architectures

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Theoretical Computer Science (AREA)
  • Medical Informatics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • General Health & Medical Sciences (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Biotechnology (AREA)
  • Evolutionary Biology (AREA)
  • Biophysics (AREA)
  • Databases & Information Systems (AREA)
  • Bioethics (AREA)
  • Artificial Intelligence (AREA)
  • Analytical Chemistry (AREA)
  • Chemical & Material Sciences (AREA)
  • Genetics & Genomics (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • Epidemiology (AREA)
  • Evolutionary Computation (AREA)
  • Public Health (AREA)
  • Software Systems (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)

Abstract

The invention provides a method for designing primers for amplicon sequencing and a method for constructing an amplicon sequencing library. The primer design process is based mainly on the Primer3 software. With this design method, target sites at a normal distance from one another and target sites at an abnormal distance are handled with different design treatments, which reduces non-specific amplification and gives the designed primers higher specificity.

Description

Design of primer for amplicon sequencing and construction method of amplicon sequencing library
Technical Field
The invention belongs to the technical field of biology, and particularly relates to a design method of a primer for amplicon sequencing and a construction method of an amplicon sequencing library.
Background
Amplicon sequencing is a highly targeted method for sequencing target regions: designed primer sequences capture the target region, high-throughput sequencing (NGS) is then performed, and the sequencing results are analyzed to obtain the corresponding information. Because the primers are designed by the user, the amplification reaction is highly flexible, and amplicon sequencing has become a popular choice in the field of gene sequencing. Successful primer design is therefore the key to, and a prerequisite for, a successful amplicon sequencing experiment. Existing methods for designing amplicon sequencing primers describe the design details unclearly, lack an effective design approach for special sites, and support a relatively low level of amplicon multiplexing. The invention provides a treatment for special sites together with a screening method for ultra-high-plex amplicon primer panels. Primers can be designed efficiently for the required sites with a high multiplexing level, little non-specific amplification and few primer dimers, which overcomes the shortcomings of existing primer design methods.
Disclosure of Invention
The purpose of the present invention is to provide a method for designing a primer for sequencing an amplicon.
Another objective of the invention is to provide a method for constructing an amplicon sequencing library.
In order to achieve the object of the present invention, in a first aspect, the present invention provides a method for designing a Primer for sequencing an amplicon, the method comprising the steps of:
A. The user prepares the following files: a reference genome sequence file (reference.fa), an SNP site physical position file, and a length information file for each chromosome of the species;
B. Primer3 software is run to generate a reference genome index file (reference.fa.myfasteridx), and the SNP site flanking sequences are obtained from the reference genome at the length specified by the user;
C. Depending on the distance between two SNP sites, one of the following three treatments is applied:
Treatment 1: if the distance between two SNP sites is greater than or equal to ND (Normal Distance), the software takes the primer design parameters input by the user, calls primer3_core, and writes the primer design results for SNP sites of this type to the file Primer3_output.txt;
Treatment 2: if adjacent sites exist, that is, the distance between two SNP sites is less than CD (Close Distance) and the distance between the two SNPs at the two ends of the group of SNPs is less than BD (Bad Distance), the section containing the SNP sites is taken as the target region, primer3_core is called with the primer design parameters input by the user, and the results are appended to the file Primer3_output.txt;
Treatment 3: if the distance between two SNP sites is less than CD (Close Distance) and the distance between the two SNPs at the two ends of the group of SNPs is greater than BD (Bad Distance), the software generates a flanking sequence file, sequence_bases.fa, for the section containing the SNPs; the FASTA header line gives the relative positions of the SNP sites in the sequence, the user selects or discards SNP sites according to these relative positions and sets the parameters, and Primer3 is used to design primers for each group of SNP sites;
D. Specific multiplex PCR primers are selected according to the following principles:
d1. Alignment of primer sequences to the genome: primers that align to a unique position in the reference genome are selected;
d2. Alignment between primers: primers that do not form primer dimers with one another are retained;
d3. The amplification product is 180-400 bp in size;
The primers meeting requirements d1 to d3 are checked manually to obtain the primers for amplicon sequencing.
In the present invention, ND is 2 times the designated flanking sequence length; CD is the distance between two SNP sites at which they are no longer placed on one sequence for primer design; and BD is the distance between the two farthest-apart SNP sites on the same sequence.
Preferably, in step d2, the screening principle is that the 10 bp at the 3' end of a primer has no region complementary to any other primer.
In a second aspect, the invention provides a primer for sequencing beef cattle amplicons obtained according to the method, wherein the sequence of the primer is shown as SEQ ID NO: 1-200.
In a third aspect, the present invention provides the use of primers designed according to the above method for the construction of an amplicon sequencing library.
In a fourth aspect, the present invention provides a method of constructing an amplicon sequencing library, the method comprising:
1) Designing a primer for sequencing the amplicon according to the method;
2) Extracting the genome DNA of a sample to be detected as a template, carrying out PCR reaction by using the primer designed in the step 1), and recovering and purifying a PCR product.
The PCR reaction system in step 2): DNA template 22 μL, PCR Enzyme Mix 45 μL, PCR Primer Mix 10 μL, RNase-free ddH2O (NF Water) 23 μL.
PCR reaction conditions: 95 ℃ for 3 min; 7 cycles of 98 ℃ for 20 s, 60 ℃ for 15 s and 72 ℃ for 30 s; 72 ℃ for 10 min.
With the above technical solution, the invention has at least the following advantages and beneficial effects:
With the method for designing primers for amplicon sequencing, target sites (SNP sites) at a normal distance and target sites at an abnormal distance are handled with different design treatments, which reduces non-specific amplification and gives the designed primers higher specificity.
Drawings
FIG. 1 is a flow chart of primer design in Example 1 of the present invention.
FIG. 2 is an example of running the software in Example 1 of the present invention.
FIG. 3 is the reference genome sequence file of beef cattle in Example 1 of the present invention.
FIG. 4 is the chromosome length file of beef cattle in Example 1 of the present invention.
FIG. 5 is the target site file of beef cattle in Example 1 of the present invention.
FIG. 6 is the prompt shown when the beef cattle reference genome index file is generated in Example 1 of the present invention.
FIG. 7 is the prompt shown when the program is run again in the same directory for the same genome file in Example 1 of the present invention.
FIG. 8 illustrates the primer3_core parameters input in Example 1 of the present invention.
FIG. 9 shows the electrophoresis result of the PCR amplification product in Example 1 of the present invention, where M is the DNA molecular weight standard and 1 is the PCR amplification product.
Detailed Description
The invention provides a method for designing primers for amplicon sequencing, which mainly comprises two parts, primer design and primer screening. The specific method is as follows:
1. Primer design process
The primer design process is based mainly on the Primer3 software, and the user needs to prepare the following files:
(1) A reference genome sequence file (reference.fa);
(2) A target site physical position file;
(3) A length information file for each chromosome of the species.
When the software runs, it generates a reference genome index file (reference.fa.myfasteridx), which lets the user obtain the sequences flanking each target site from the genome at the user-specified length.
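The flanking-sequence step can be illustrated with a minimal Python sketch. It assumes 1-based SNP positions and a chromosome already loaded as a string; the function names are hypothetical and are not taken from the patented software. The chromosome length file is what allows the window to be clipped at the chromosome ends.

```python
def flanking_window(pos, flank, chrom_length):
    """Return the 1-based inclusive (start, end) window around pos, clipped to the chromosome."""
    start = max(1, pos - flank)
    end = min(chrom_length, pos + flank)
    return start, end


def flanking_sequence(chrom_seq, pos, flank):
    """Return the target base plus up to `flank` bases on each side."""
    start, end = flanking_window(pos, flank, len(chrom_seq))
    return chrom_seq[start - 1:end]


chrom = "ACGT" * 250  # toy 1000 bp chromosome, for illustration only
print(flanking_window(500, 200, len(chrom)))    # (300, 700)
print(len(flanking_sequence(chrom, 500, 200)))  # 401
```

With a 200 bp flank, a site far from the chromosome ends yields a 401 bp template with the SNP at position 201.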
Since SNPs are distributed at different intervals on a chromosome, there are 3 cases to be treated according to the distance between two SNPs:
Treatment 1: if the distance between two SNP sites is greater than or equal to ND (Normal Distance), the software takes the primer design parameters input by the user, calls primer3_core, and writes the primer design results for SNP sites of this type to a file (Primer3_output.txt);
Treatment 2: if adjacent sites are present, that is, the distance between two SNP sites is less than CD (Close Distance) and the distance between the two SNPs at the two ends of the group of SNPs is less than BD (Bad Distance), the section containing the SNP sites is taken as the target region, primer3_core is called with the primer design parameters input by the user, and the results are appended to the file (Primer3_output.txt);
Treatment 3: if the distance between two SNP sites is less than CD (Close Distance) and the distance between the two SNPs at the two ends of the group of SNPs is greater than BD (Bad Distance), the software outputs the flanking sequence (sequences_bases.fa) of the section containing the SNPs. The FASTA header gives the relative positions of the sites in the sequence; the user splits the SNP sites into groups according to these relative positions, sets the parameters, and designs primers for each group of sites with the web version of Primer3.
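The three treatments can be sketched as a small Python routine. This is one possible reading of the rules and not the patented implementation: SNPs on one chromosome are chained into a group while consecutive gaps stay below CD, a group is labelled by its end-to-end span against BD, and an isolated SNP is labelled Treatment 1 only when its neighbours are at least ND away. Cases the text does not cover (a gap between CD and ND, or a span exactly equal to BD) are returned as `None`. The function name and return shape are hypothetical.

```python
def classify_snps(positions, nd, cd, bd):
    """Group SNP positions on one chromosome and label each group 1, 2, 3 or None."""
    if not positions:
        return []
    positions = sorted(positions)

    # Chain consecutive SNPs whose spacing is below CD into one group.
    groups, current = [], [positions[0]]
    for prev, pos in zip(positions, positions[1:]):
        if pos - prev < cd:
            current.append(pos)
        else:
            groups.append(current)
            current = [pos]
    groups.append(current)

    labelled = []
    for i, group in enumerate(groups):
        if len(group) > 1:
            span = group[-1] - group[0]  # distance between the two end SNPs
            label = 2 if span < bd else 3 if span > bd else None
        else:
            gaps = []
            if i > 0:
                gaps.append(group[0] - groups[i - 1][-1])
            if i < len(groups) - 1:
                gaps.append(groups[i + 1][0] - group[0])
            label = 1 if all(gap >= nd for gap in gaps) else None
        labelled.append((group, label))
    return labelled


sites = [1000, 2000, 2100, 2150, 5000, 5100, 5250, 5390]
for group, label in classify_snps(sites, nd=400, cd=160, bd=300):
    print(label, group)
```

With ND = 400, CD = 160 and BD = 300, the toy positions give one isolated site (Treatment 1), one compact group spanning 150 bp (Treatment 2) and one group spanning 390 bp (Treatment 3).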
The primer design process above depends on the distance between SNP sites. After primer design is completed, the primer information must be compared in order to screen out multiplex PCR primers with higher specificity. The specific method is as follows:
2. Primer screening
1. Alignment of primer sequences to the genome
Alignment to the reference genome mainly checks whether a designed primer aligns to a unique position in the reference genome; if it aligns to multiple positions, whether the primer is retained is decided according to set conditions.
Alignment condition 1: the primer sequences are aligned to the reference genome with blastn, the alignment results are analyzed, and qualified primers are screened.
Screening condition 1: for primers that align to multiple positions, an e-value parameter is set for the second alignment position and the primers are screened accordingly.
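A minimal Python sketch of this screen follows. It assumes the blastn results were written in the standard 12-column tabular format (`-outfmt 6`, with the query name in column 1 and the e-value in column 11). The retention rule is an assumption: the text only says an e-value parameter is set for the second alignment position, so here a primer is kept if it has a single hit or if its second-best hit has an e-value at or above a hypothetical threshold `min_second_evalue`.

```python
from collections import defaultdict


def screen_by_genome_hits(blast_lines, min_second_evalue=1.0):
    """Keep primers with one genomic hit, or whose second-best hit is weak."""
    evalues = defaultdict(list)
    for line in blast_lines:
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        evalues[fields[0]].append(float(fields[10]))  # column 11 is the e-value

    kept = []
    for primer, values in evalues.items():
        values.sort()
        if len(values) == 1 or values[1] >= min_second_evalue:
            kept.append(primer)
    return sorted(kept)


hits = [
    "p1\tchr1\t100.0\t20\t0\t0\t1\t20\t5000\t5019\t2e-05\t40.1",
    "p2\tchr1\t100.0\t20\t0\t0\t1\t20\t9000\t9019\t2e-05\t40.1",
    "p2\tchr7\t100.0\t20\t0\t0\t1\t20\t1200\t1219\t2e-05\t40.1",
    "p3\tchr2\t100.0\t20\t0\t0\t1\t20\t300\t319\t2e-05\t40.1",
    "p3\tchr9\t100.0\t12\t0\t0\t9\t20\t770\t781\t5.0\t22.3",
]
print(screen_by_genome_hits(hits))  # ['p1', 'p3']
```

In the toy data, p2 has two equally strong hits and is dropped, while p3 is kept because its second hit is only a weak partial match.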
2. Alignment between primers
Alignment between primers mainly determines whether primer dimers can form between primers; the screening criterion is that the 10 bp at the 3' end of a primer has no region complementary to any other primer.
Alignment condition 2: the 10 bp at the 3' end of each primer is taken as the input sequence and aligned against the full-length sequences of the other primers; if a region of complementary bases exists, the alignment position, the corresponding e-value and other parameters are recorded.
Screening condition 2: primers whose 3' end has no base complementarity with other primers are selected as initial qualified primers; then, among primer pairs with a product length of 180-400 bp, one pair is randomly retained for each site. With this set of paired primers as the reference, the alignment results of all primers are screened: each primer in the alignment results is compared one by one against the qualified set, and it is retained if the 10 bp at its 3' end has no region of base complementarity with other primers, its paired primer is present, and the product length is 180-400 bp.
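The 3'-end check can be sketched in Python as follows. This is a simplification and not the alignment-based procedure described above: instead of an aligner reporting e-values, it looks for an exact reverse-complement match of at least `min_match` bases between the 10 bp 3' tail of one primer and the full length of every other primer. The threshold `min_match` and the function names are assumptions made for illustration.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA string (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def tail_pairs_with(primer, other, tail=10, min_match=6):
    """True if a min_match-long stretch of the primer's 3' tail can base-pair with `other`."""
    tail_seq = primer.upper()[-tail:]
    other = other.upper()
    for k in range(len(tail_seq) - min_match + 1):
        if reverse_complement(tail_seq[k:k + min_match]) in other:
            return True
    return False


def dimer_free(primers, tail=10, min_match=6):
    """Names of primers whose 3' tail pairs with no other primer in the set."""
    kept = []
    for name, seq in primers.items():
        others = [s for n, s in primers.items() if n != name]
        if not any(tail_pairs_with(seq, o, tail, min_match) for o in others):
            kept.append(name)
    return kept


primers = {
    "A": "AAAAAAAAAACCCCCCAAAA",
    "B": "TTTTTTTTGGGGGGTTTTTT",
    "C": "ACACACACACACACACACAC",
}
print(dimer_free(primers))  # ['C']
```

In the toy set, the tails of A and B are complementary to each other, so both are rejected and only C is retained.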
3. Manual comparison
To verify the reliability of the alignment-based screening method, several primers are randomly selected from the final set, compared with the other primers, and checked manually.
If the manual comparison result meets all the conditions, the primer design is finished.
The primer design flow is shown in FIG. 1.
Terms:
Flanking sequence length: the length (in bp), specified by the user, of the SNP site flanking sequence obtained from the reference genome;
ND (Normal Distance) value: 2 × the designated flanking sequence length;
CD (Close Distance) value: the distance between two SNP sites at which they are no longer placed on one sequence for primer design;
BD (Bad Distance) value: the distance between the two farthest-apart SNP sites on the same sequence;
Intermediate files: files produced while the software runs, which are deleted after the run finishes;
sequences_normalsites.txt: flanking sequences, extracted as the user requires, of sites whose spacing from adjacent SNPs is greater than the ND value;
sequences_closesites.txt: flanking sequences, extracted as the user requires, of sites whose SNP spacing is smaller than the CD value and whose two farthest-apart SNPs on the same sequence are closer than the ND value;
sequences_badsites.txt: flanking sequences, extracted as the user requires, of the two farthest-apart sites on the same sequence whose SNP spacing is smaller than the BD value;
sequence_closesites.txt: flanking sequences, obtained as the user requires, of a series of sites in which the distance between two SNP sites is smaller than the user-specified length;
primer3_outfile.txt: the output file produced by calling primer3_core with the user-set parameters;
primer.csv: the primer sequences designed for the SNP sites provided by the user; the file name can be chosen by the user.
The following examples are intended to illustrate the invention but are not intended to limit the scope of the invention. Unless otherwise specified, the technical means used in the examples are conventional means well known to those skilled in the art, and the raw materials used are commercially available products.
Example 1 design of primers for amplicon sequencing and construction of amplicon sequencing library, exemplified by beef cattle genome UMD version 3.1
1. Primer design
PrimersdesignV2.0.py is run and the corresponding information is entered as prompted on screen; see FIG. 2 for an example run.
1. Input files and related parameters
Beef cattle reference genome sequence file (.fasta): bos_taurus.umd3.1.dna.chromosomes.fa (FIG. 3).
Beef cattle chromosome length file (.txt): ttlelechlength txt (FIG. 4).
Beef cattle target site file (.txt): chr_pos.
Flanking sequence length of sites: the user specifies 200 bp on each side of the SNP site;
ND value: 400;
CD value: 160;
BD value: 300.
The first time the software is run, the beef cattle reference genome index file is generated and the prompt in FIG. 6 is shown. When the program is run again in the same directory for the same genome file, the prompt in FIG. 7 is shown, indicating that the existing index is used directly and does not need to be regenerated.
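The build-once, reuse-later behaviour of the index file can be sketched in Python. The actual format of the index file is not described in the text, so this sketch assumes a hypothetical two-column layout (sequence name, length) stored next to the FASTA file under the suffix `.myfasteridx`; the function name is also hypothetical.

```python
import os
import tempfile


def load_or_build_index(fasta_path, suffix=".myfasteridx"):
    """Return ({name: length}, built_now); reuse the index file if it already exists."""
    index_path = fasta_path + suffix
    if os.path.exists(index_path):
        with open(index_path) as handle:
            rows = (line.split("\t") for line in handle if line.strip())
            return {name: int(length) for name, length in rows}, False

    index, name = {}, None
    with open(fasta_path) as handle:
        for line in handle:
            line = line.strip()
            if line.startswith(">"):
                name = line[1:].split()[0]  # sequence name up to the first space
                index[name] = 0
            elif name is not None:
                index[name] += len(line)
    with open(index_path, "w") as out:
        for seq_name, length in index.items():
            out.write(f"{seq_name}\t{length}\n")
    return index, True


with tempfile.TemporaryDirectory() as workdir:
    fasta = os.path.join(workdir, "reference.fa")
    with open(fasta, "w") as handle:
        handle.write(">chr1\nACGTACGT\nACGT\n>chr2\nGGCC\n")
    print(load_or_build_index(fasta))  # first run: index is built
    print(load_or_build_index(fasta))  # second run: index is reused
```

The second call returns the same lengths with the flag set to `False`, which mirrors the "no regeneration needed" prompt.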
Processing according to the distance between two SNP sites specified by the user
Treatment 1: for SNP sites whose distance from one another is greater than or equal to 400, an intermediate file (sequences_normalsites.txt) is generated, the primer3_core parameters are input (FIG. 8), and the primer design results for SNP sites of this type are written to the file (Primer3_output.txt);
Description of the primer3_core primer design parameters:
PRIMER_PRODUCT_SIZE_RANGE: <product length range>
PRIMER_NUM_RETURN: <maximum number of primers returned; must be greater than 1>
PRIMER_MIN_SIZE: <minimum primer length>
PRIMER_OPT_SIZE: <optimal primer length>
PRIMER_MAX_SIZE: <maximum primer length>
PRIMER_MIN_TM: <minimum primer Tm>
PRIMER_OPT_TM: <optimal primer Tm>
PRIMER_MAX_TM: <maximum primer Tm>
PRIMER_PAIR_MAX_DIFF_TM: <maximum Tm difference within a primer pair>
PRIMER_MIN_GC: <minimum GC content of primer>
PRIMER_OPT_GC: <optimal GC content of primer>
PRIMER_MAX_GC: <maximum GC content of primer>
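As an illustration of how such parameters reach primer3_core, the Python sketch below assembles one input record in Primer3's Boulder-IO format (one `TAG=value` per line, with the record closed by a line containing only `=`). The helper name is hypothetical, and the numeric values are placeholders rather than the settings used in the example; only the 180-400 bp product range comes from the text. `PRIMER_FIRST_BASE_INDEX=1` is set explicitly so that the target position is read as 1-based.

```python
def boulder_record(seq_id, template, target_start, target_len, params):
    """Build one Boulder-IO record for primer3_core as a string."""
    lines = [
        f"SEQUENCE_ID={seq_id}",
        f"SEQUENCE_TEMPLATE={template}",
        f"SEQUENCE_TARGET={target_start},{target_len}",
        "PRIMER_FIRST_BASE_INDEX=1",
    ]
    lines += [f"{tag}={value}" for tag, value in params.items()]
    lines.append("=")  # record terminator
    return "\n".join(lines) + "\n"


# Placeholder values for illustration only.
params = {
    "PRIMER_PRODUCT_SIZE_RANGE": "180-400",
    "PRIMER_NUM_RETURN": 5,
    "PRIMER_MIN_SIZE": 18,
    "PRIMER_OPT_SIZE": 20,
    "PRIMER_MAX_SIZE": 25,
    "PRIMER_MIN_TM": 57.0,
    "PRIMER_OPT_TM": 60.0,
    "PRIMER_MAX_TM": 63.0,
    "PRIMER_PAIR_MAX_DIFF_TM": 3.0,
    "PRIMER_MIN_GC": 40.0,
    "PRIMER_MAX_GC": 60.0,
}
template = "ACGT" * 100 + "A"  # toy 401 bp template with the SNP at position 201
print(boulder_record("chr1_1000", template, 201, 1, params))
```

The resulting text could be passed to primer3_core on standard input; for a Treatment 2 group, the target would instead cover the whole section between the two end SNPs.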
Treatment 2: if adjacent sites are present, that is, the distance between two SNP sites is less than 160 (Close Distance) and the distance between the two SNPs at the two ends of the group of SNPs is less than 300 (Bad Distance), the section containing the SNP sites is taken as the target region, primer3_core is called with the primer design parameters input by the user, and the results are appended to the file (Primer3_output.txt);
Treatment 3: if the distance between two SNP sites is less than 160 (Close Distance) and the distance between the two SNPs at the two ends is greater than 300 (Bad Distance), the software outputs the flanking sequence (sequences_bases.fa) of the section containing the SNPs. The FASTA header gives the relative positions of the sites in the sequence; the user splits the SNP sites into groups according to these relative positions, sets the parameters, and designs primers for each group of sites with the web version of Primer3.
2. Primer screening
According to the primer design results from Primer3 (Primer3_outfile.txt), suitable primers are screened to obtain the primer name, sequence, start position and end position for each site. An original primer sequence file is generated, and the primer information is then compared.
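A short Python sketch of this extraction step is given here, assuming the output file holds standard Primer3 Boulder-IO records (`TAG=value` lines, each record closed by `=`). It relies on Primer3's convention that `PRIMER_LEFT_i` is "start,length" and that `PRIMER_RIGHT_i` gives the position of the right primer's 5' end, which is its rightmost base on the template. The function names and the returned dictionary layout are hypothetical.

```python
def parse_boulder(text):
    """Split Primer3 Boulder-IO output into a list of {tag: value} records."""
    records, current = [], {}
    for line in text.splitlines():
        if line == "=":
            if current:
                records.append(current)
            current = {}
        elif "=" in line:
            tag, value = line.split("=", 1)
            current[tag] = value
    return records


def primer_pairs(record):
    """Name, sequences and template coordinates of every returned primer pair."""
    pairs = []
    for i in range(int(record.get("PRIMER_PAIR_NUM_RETURNED", 0))):
        l_start, l_len = map(int, record[f"PRIMER_LEFT_{i}"].split(","))
        r_end, r_len = map(int, record[f"PRIMER_RIGHT_{i}"].split(","))
        pairs.append({
            "name": f"{record['SEQUENCE_ID']}_{i}",
            "left_seq": record[f"PRIMER_LEFT_{i}_SEQUENCE"],
            "right_seq": record[f"PRIMER_RIGHT_{i}_SEQUENCE"],
            "left_span": (l_start, l_start + l_len - 1),
            "right_span": (r_end - r_len + 1, r_end),
            "product_size": int(record[f"PRIMER_PAIR_{i}_PRODUCT_SIZE"]),
        })
    return pairs


output = """SEQUENCE_ID=chr1_1000
PRIMER_PAIR_NUM_RETURNED=1
PRIMER_LEFT_0_SEQUENCE=ACGTACGTACGTACGTACGT
PRIMER_RIGHT_0_SEQUENCE=TGCATGCATGCATGCATGCA
PRIMER_LEFT_0=51,20
PRIMER_RIGHT_0=300,20
PRIMER_PAIR_0_PRODUCT_SIZE=250
=
"""
for record in parse_boulder(output):
    print(primer_pairs(record))
```

The toy record is made up for illustration; coordinates are relative to the flanking-sequence template and would still need to be shifted to genome coordinates.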
1) Alignment of primer sequences to the genome
Alignment to the reference genome mainly checks whether a designed primer aligns to a unique position in the reference genome; if it aligns to multiple positions, whether the primer is retained is decided according to set conditions.
Alignment condition 1: the primer sequences are aligned to the reference genome with blastn, the alignment results are analyzed, and qualified primers are screened.
Screening condition 1: primers that align to multiple positions are deleted according to the set e-value parameter, and the other primers are retained.
2) Alignment between primers
Alignment between primers mainly determines whether primer dimers can form between primers; the screening criterion is that the 10 bp at the 3' end of a primer has no region complementary to any other primer.
Alignment condition 2: the 10 bp at the 3' end of each primer is taken as the input sequence and aligned against the full-length sequences of the other primers; if a region of complementary bases exists, the alignment position, the corresponding e-value and other parameters are recorded.
Screening condition 2: primers whose 3' end has no base complementarity with other primers are selected as initial qualified primers; then, among primer pairs with a product length of 180-400 bp, one pair is randomly retained for each site. With this set of paired primers as the reference, the alignment results of all primers are screened: each primer in the alignment results is compared one by one against the qualified set, and it is retained if the 10 bp at its 3' end has no region of base complementarity with other primers, its paired primer is present, and the product length is 180-400 bp.
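The pair-retention part of screening condition 2 can be sketched in Python as follows. It assumes the dimer check has already produced a set of qualified primer names; the record layout (`site`, `left`, `right`, `product_size`) and the function name are hypothetical. A fixed seed is used only to make the random choice reproducible.

```python
import random


def pick_one_pair_per_site(pairs, qualified, min_size=180, max_size=400, seed=0):
    """Randomly keep one pair per site among pairs that pass size and dimer checks."""
    rng = random.Random(seed)
    by_site = {}
    for pair in pairs:
        size_ok = min_size <= pair["product_size"] <= max_size
        both_ok = pair["left"] in qualified and pair["right"] in qualified
        if size_ok and both_ok:
            by_site.setdefault(pair["site"], []).append(pair)
    return {site: rng.choice(candidates) for site, candidates in by_site.items()}


pairs = [
    {"site": "chr1:1000", "left": "L1", "right": "R1", "product_size": 250},
    {"site": "chr1:1000", "left": "L2", "right": "R2", "product_size": 450},
    {"site": "chr2:500", "left": "L3", "right": "R3", "product_size": 200},
]
qualified = {"L1", "R1", "L2", "R2", "L3"}  # R3 failed the 3'-end check
print(pick_one_pair_per_site(pairs, qualified))
```

Here the 450 bp pair is rejected on product size and the chr2:500 pair is rejected because its right primer is not qualified, so that site ends up with no primer pair.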
3) Manual comparison
To verify the reliability of the alignment-based screening method, 10 primers are randomly selected from the final set, compared with the other primers, and checked manually.
If the manual comparison result meets all the screening conditions, the primer design is finished.
200 primers were obtained after screening, and primer synthesis and library construction verification experiments were carried out on them.
3. Primer synthesis
The 100 primer pairs obtained through the primer design process above were synthesized by a primer synthesis company, with product lengths of about 200 bp. The primer information is shown in Table 1 (SEQ ID NO: 1-200):
TABLE 1
Figure BSA0000188110230000071
Figure BSA0000188110230000081
Figure BSA0000188110230000091
4. Library construction and sequencing data analysis
After primer screening was completed, cattle blood was taken as the sample. Quality control, alignment and variant detection were performed on the sequencing data to verify whether the primer design was successful.
1. Library construction with 100 primer pairs
1) Preparing the primer panel: 100 primer pairs were selected for the library construction reaction. The primers were first collected at the bottom of the plate by centrifugation and then dissolved in water or TE buffer to prepare a 10× stock solution.
2) Genome quantification: the genome was quantified using the
Figure BSA0000188110230000092
dsDNA HS Assay Kit (Thermo Fisher).
3) Multiplex PCR amplification: the amplification reaction was performed by multiplex PCR, and agarose gel electrophoresis was used to check whether the PCR product showed the target band, whether the band was diffuse, whether primer dimers were present, and so on.
PCR reaction system:
Component                             Amount
Primer Pool mix                       8 μL
Bovine genome template                50 ng
RNase-free water (NF water)           Adjust the volume according to the other components
2× QIAGEN Multiplex PCR Master Mix    20 μL
Total                                 40 μL
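The water volume in this system follows from simple arithmetic once the template concentration is known. The Python sketch below shows the calculation; the function name is hypothetical and the concentration of 25 ng/μL is a made-up example value, not a measurement from the experiment.

```python
def water_volume(template_conc_ng_per_ul, template_ng=50.0,
                 primer_pool_ul=8.0, master_mix_ul=20.0, total_ul=40.0):
    """Return (template volume, water volume) in μL for the 40 μL multiplex PCR."""
    template_ul = template_ng / template_conc_ng_per_ul
    water_ul = total_ul - primer_pool_ul - master_mix_ul - template_ul
    if water_ul < 0:
        raise ValueError("template is too dilute to fit in the reaction volume")
    return template_ul, water_ul


print(water_volume(25.0))  # (2.0, 10.0)
```

At 25 ng/μL, 50 ng of template takes 2 μL, which leaves 10 μL of water; a template more dilute than about 4.2 ng/μL would not fit in the remaining 12 μL.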
PCR reaction conditions:
Figure BSA0000188110230000101
4) Magnetic bead fragment selection: fragments were selected with magnetic beads to remove impurities, genomic fragments and primer dimers from the amplification product, and the product was purified.
5) End repair and adapter ligation: according to the quantified concentration, 50 ng of sample was transferred to a PCR tube and TE buffer was added to a total volume of 40 μL. The end repair reaction mixture was prepared in the PCR tube, and the tube was placed on a PCR instrument for end repair. 5 μL of Adapter Mix (MGIEasy DNA library preparation kit) was then added to the tube containing the end repair reaction solution for adapter ligation. After the reaction was complete, the PCR tube was removed, 20 μL of TE buffer was added to bring the volume to 100 μL, and the entire product was transferred to a new 1.5 mL EP tube.
6) PCR reaction: the purified ligation product was combined with the blood sample DNA to prepare the PCR reaction solution, and the PCR reaction was carried out.
PCR reaction system:
Figure BSA0000188110230000102
Figure BSA0000188110230000111
PCR reaction conditions:
Figure BSA0000188110230000112
7) PCR product purification: the product was purified using magnetic beads.
8) Circularization and digestion: after circularization was finished, the digestion product was purified and quality-checked, which completed library construction and readied the library for sequencing.
2. Sequencing data analysis
1) Data filtering: the sequencing data were obtained and filtered. Three filters were applied: removing low-quality reads, removing reads with a high N content (more than 1%), and removing reads containing adapter sequence.
Because the PCR products span a wide range of lengths (70-400 bp), a minimum read length must be set during filtering, generally 5-10 bp shorter than the shortest PCR product.
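These filters can be sketched as a single Python predicate. Only the 1% N-content cutoff and the minimum-length rule come from the text; the low-quality criterion (a mean Phred score below `min_mean_q`) and the exact-match adapter test are assumptions made for illustration, and the function name and adapter string are hypothetical.

```python
def keep_read(seq, quals, min_len, max_n_fraction=0.01, min_mean_q=20, adapter=None):
    """Return True if a read passes the length, N-content, quality and adapter filters."""
    if not seq or len(seq) < min_len:
        return False
    if seq.upper().count("N") / len(seq) > max_n_fraction:
        return False
    if sum(quals) / len(quals) < min_mean_q:  # assumed definition of "low quality"
        return False
    if adapter and adapter in seq:  # assumed exact-match adapter check
        return False
    return True


# Shortest PCR product is 70 bp, so a minimum length 10 bp shorter is used.
min_len = 70 - 10
good = "ACGT" * 20
print(keep_read(good, [30] * 80, min_len))              # True
print(keep_read("N" + "A" * 79, [30] * 80, min_len))    # False, 1.25% N
print(keep_read("ACGT" * 10, [30] * 40, min_len))       # False, too short
```

In practice this kind of filtering would normally be done with an established read-trimming tool; the sketch only makes the three conditions and the length rule concrete.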
2) Alignment: in a normal analysis workflow, duplicates must be removed after alignment. For an amplicon project, the library is built directly from PCR products, so information such as sequencing depth and coverage is counted directly without duplicate removal.
3) Variant detection: GATK is currently used for variant detection, and the results are tallied in two cases.
In the first case the genotype at the site is known, and the calls are compared with it directly to compute the typing consistency.
In the second case the genotype at the site is unknown, and different samples from the same individual are compared with one another to compute the typing consistency.
Some sites carry no variation and therefore produce no result in variant calling; if a site has no data in the different samples from the same individual, its genotype defaults to 0/0 and the consistency is 100%.
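The consistency calculation, including the 0/0 default for sites with no variant call, can be sketched in Python. The function name and the dictionary-based input (site name to genotype string) are hypothetical, and the genotypes in the example are made up.

```python
def typing_consistency(calls_a, calls_b, sites, default="0/0"):
    """Percentage of sites with identical genotypes; sites without a call default to 0/0."""
    same = sum(
        calls_a.get(site, default) == calls_b.get(site, default)
        for site in sites
    )
    return 100.0 * same / len(sites)


sites = ["s1", "s2", "s3", "s4"]
replicate_1 = {"s1": "0/1", "s2": "1/1", "s3": "0/1"}
replicate_2 = {"s1": "0/1", "s2": "1/1", "s3": "1/1"}
print(typing_consistency(replicate_1, replicate_2, sites))  # 75.0
```

Site s4 has no call in either replicate, so both default to 0/0 and count as consistent. The same function covers the known-genotype case if one of the two dictionaries holds the known genotypes.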
5. Experimental results
1. Construction of sequencing libraries
After the multiplex PCR product was purified, adapters were added and the ligation product was PCR-amplified again; the purified product was as follows:
Figure BSA0000188110230000113
The electrophoresis results are shown in FIG. 9. The purified product shows a single band, and the total amount meets the requirement for library construction. This indicates that amplification and library construction with the 100 primer pairs were successful and that the quality meets the requirement for sequencing.
2. High throughput sequencing
Samples were sequenced on an MGISEQ-2000 gene sequencer with a PE150 read length; the sample parameters are as follows:
Sample name   Barcode   Primer pair number   Annealing temperature (℃)   Number of reaction cycles   Template amount (ng)
A100-35       512       100                  60                          35                          100
3. Sequencing data analysis
1) The quality control results for the off-machine data are shown in Table 2. The effective rate is 99.23%, Q20 is 97.47% and Q30 is 91.13%, so both the data quality and the library construction quality are acceptable.
TABLE 2 Off-machine data quality control results
Figure BSA0000188110230000121
2) The MAP table results show a correct coverage of 96.7%, indicating few impurities, little wasted data and good quality of the multiplex PCR product (Table 3).
TABLE 3 MAP Table
Figure BSA0000188110230000122
3) The proportion of sites at different sequencing depths was calculated. The proportion is 100% at every depth threshold up to 50×, indicating good sequencing quality (Table 4).
TABLE 4 Proportion of sites at different sequencing depths
Sample name   Total sites   Depth>10 (%)   Depth>20 (%)   Depth>50 (%)
A100-35       101           100            100            100
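The depth proportions in Table 4 are simple threshold counts over the per-site depths. A minimal Python sketch follows; the function name is hypothetical and the depth values are toy numbers, not the experimental data.

```python
def depth_proportions(depths, thresholds=(10, 20, 50)):
    """Percentage of sites whose sequencing depth exceeds each threshold."""
    total = len(depths)
    return {t: 100.0 * sum(d > t for d in depths) / total for t in thresholds}


toy_depths = [12, 35, 80, 500]  # one depth value per target site
print(depth_proportions(toy_depths))  # {10: 100.0, 20: 75.0, 50: 50.0}
```

A result of 100% in every column, as reported for sample A100-35, means every one of the 101 sites was covered at more than 50×.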
According to different data analysis results, the method is feasible and can be used for designing primers for sequencing the amplicon.
Although the invention has been described in detail hereinabove with respect to a general description and specific embodiments thereof, it will be apparent to those skilled in the art that modifications or improvements may be made thereto based on the invention. Accordingly, such modifications and improvements are intended to be within the scope of the invention as claimed.

Claims (5)

1. A method for designing primers for amplicon sequencing, characterized in that the primers are designed based on Primer3 software, the method comprising the following steps:
A. the user prepares the following files: a reference genome sequence file, an SNP site physical position file, and a file containing the length of each chromosome of the species;
B. running Primer3 software to generate a reference genome index file, and acquiring the SNP site flanking sequences from the reference genome according to the length specified by the user;
C. handling the SNP sites in one of the following three ways according to the distance between two SNP sites:
treatment 1: if the distance between two SNP sites is greater than or equal to ND, the software applies the primer design parameters input by the user, calls primer3_core, and writes the primer design results for SNP sites of this type to the file Primer3_output.txt;
treatment 2: if adjacent sites exist such that the distance between two SNP sites is smaller than CD and the distance between the two outermost SNPs of the group is smaller than BD, the region containing these SNP sites is taken as the target region, primer3_core is called with the primer design parameters input by the user, and the primer design results are appended to the file Primer3_output.txt;
treatment 3: if the distance between two SNP sites is smaller than CD and the distance between the two outermost SNPs of the group is larger than BD, the software generates a flanking sequence file sequence_badsites.fa for the region containing the SNPs; based on the relative positions of the SNP sites within the sequence, which are given in the FASTA title line, the user keeps or rejects individual SNP sites, sets the parameters, and uses Primer3 to design primers for each group of SNP sites;
D. screening the primers for amplicon sequencing as follows:
d1. aligning the primer sequences to the reference genome: retaining primers that align to a unique position in the reference genome;
d2. aligning the primers against each other: retaining primers that do not form primer dimers with each other;
manually correcting the primers satisfying d1 and d2 to obtain the primers for amplicon sequencing;
wherein in step C, ND represents 2 times the length of the specified flanking sequence, CD represents the distance value between two SNP sites beyond which they are not placed on one sequence for primer design, and BD represents the distance value between the two SNP sites that are farthest apart on the same sequence.
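The three-way handling in step C can be illustrated with a short Python sketch. This is only one possible reading of the claim, not the software's actual implementation: it assumes SNP positions on a single chromosome, groups consecutive SNPs whose spacing is below CD, and labels each group. The claim does not state what happens when the spacing lies between CD and ND, or when a group span equals BD exactly, so the sketch marks the first case as "unspecified" and assigns the second to treatment 3 as an assumption. The function name `classify_snps` is hypothetical.

```python
from typing import List, Tuple


def classify_snps(positions: List[int], nd: int, cd: int, bd: int) -> List[Tuple[str, List[int]]]:
    """Group SNP positions on one chromosome and label each group by treatment."""
    # Build groups of consecutive SNPs that are closer together than CD.
    groups: List[List[int]] = []
    for pos in sorted(positions):
        if groups and pos - groups[-1][-1] < cd:
            groups[-1].append(pos)
        else:
            groups.append([pos])

    labelled = []
    for i, group in enumerate(groups):
        if len(group) >= 2:
            # Distance between the two outermost SNPs of the group.
            span = group[-1] - group[0]
            # Assumption: a span exactly equal to BD is handled like treatment 3.
            label = "treatment 2" if span < bd else "treatment 3"
        else:
            gaps = []
            if i > 0:
                gaps.append(group[0] - groups[i - 1][-1])
            if i < len(groups) - 1:
                gaps.append(groups[i + 1][0] - group[0])
            # Treatment 1 requires every neighbouring SNP to be at least ND away.
            label = "treatment 1" if all(g >= nd for g in gaps) else "unspecified"
        labelled.append((label, group))
    return labelled


# Illustrative values only: flanking length 200 gives ND = 400.
result = classify_snps([100, 150, 180, 5000, 20000, 20090, 20180, 20270], nd=400, cd=100, bd=200)
```

Here the first three SNPs form a compact group (treatment 2), the SNP at 5000 is isolated (treatment 1), and the last four SNPs are closely spaced but span more than BD (treatment 3).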
2. The method of claim 1, wherein in step d2 the primers are screened according to the following criterion: the 10 bp at the end of a primer contain no region complementary to any other primer.
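The criterion of claim 2 can be sketched as a simple string check. The sketch below makes two assumptions that the claim does not state: the "end" of the primer is taken to be its 3' end, and a "complementary region" is taken to be a run of at least `min_match` consecutive complementary bases (5 by default). The function names are hypothetical, and a real screen would more likely rely on an alignment tool.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence given in the 5'->3' direction."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def has_end_complementarity(primer: str, other: str, tail: int = 10, min_match: int = 5) -> bool:
    """True if the last `tail` bases of `primer` can pair with a stretch of `other`.

    Assumptions: the tail is the 3' end, and pairing means at least
    `min_match` consecutive complementary bases in antiparallel orientation.
    """
    # A stretch of the tail pairs with `other` exactly when the reverse
    # complement of that stretch occurs in `other`.
    tail_rc = reverse_complement(primer[-tail:])
    target = other.upper()
    for start in range(len(tail_rc) - min_match + 1):
        if tail_rc[start:start + min_match] in target:
            return True
    return False


# Illustrative sequences only, not primers from this document.
risky = has_end_complementarity("ACGTTGCAAGGCTAGCTAGG", "TTTTCCTAGCTTTT")
safe = has_end_complementarity("ACGTTGCAAGGCTAGCTAGG", "AAAAAAAAAAAAAAAAAAAA")
```

Applying this check to every ordered pair of primers in a pool and discarding primers that return True would give one conservative approximation of the d2 screen.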
3. The primer for sequencing the beef cattle amplicon is characterized by comprising the following primer sequences:
Figure FDA0004000806170000011
Figure FDA0004000806170000021
Figure FDA0004000806170000031
Figure FDA0004000806170000041
4. Use of primers designed according to the method of claim 1 or 2 for amplicon sequencing library construction.
5. A method of constructing an amplicon sequencing library, said method comprising:
1) Designing primers for amplicon sequencing according to the method of claim 1 or 2; centrifuging so that the primers collect at the bottom of the plate, and dissolving the primers in water or TE buffer to prepare a 10× stock solution;
2) Genome quantification: quantifying the genome using the [Figure FDA0004000806170000042] dsDNA HS Assay Kit;
3) Multiplex PCR amplification: performing the amplification reaction by multiplex PCR, and checking by agarose gel electrophoresis whether the PCR amplification product contains the target band, whether the band is smeared, and whether primer dimers are present;
the multiplex PCR amplification reaction system: Primer Pool mix 8 μL, genomic template 50 ng, 2× QIAGEN Multiplex PCR Master Mix 20 μL, RNase-free water to 40 μL;
the multiplex PCR amplification reaction conditions: 95 °C for 15 min; 25-35 cycles of 94 °C for 30 s, 60 °C for 90 s and 72 °C for 60 s; 60 °C for 30 min;
4) Magnetic bead fragment selection: performing fragment selection with magnetic beads to remove impurities, genomic fragments and primer dimers from the PCR amplification product, and purifying the product;
5) End repair and adapter ligation: taking 50 ng of sample into a PCR tube according to the quantified concentration, and adding TE buffer to a total volume of 40 μL; preparing the end repair reaction mixture in the PCR tube, and placing the tube on a PCR instrument for end repair; adding 5 μL of Adapter Mix to the PCR tube containing the end repair reaction mixture to carry out the adapter ligation reaction; after the adapter ligation reaction is finished, taking out the PCR tube, adding 20 μL of TE buffer to bring the volume to 100 μL, and transferring all of the ligation product into a new 1.5 mL EP tube;
6) PCR reaction: taking the ligation product to fuse with the DNA of the blood sample, preparing the PCR reaction solution, and carrying out the PCR reaction;
the PCR reaction system: DNA template 22 μL, PCR Enzyme Mix 45 μL, PCR Primer Mix 10 μL, RNase-free water 23 μL;
the PCR reaction conditions: 95 °C for 3 min; 7 cycles of 98 °C for 20 s, 60 °C for 15 s and 72 °C for 30 s; 72 °C for 10 min;
7) PCR product purification: purifying the product using magnetic beads;
8) Digestion and circularization: after circularization is finished, purifying the digestion product and checking its quality, which completes library construction and prepares the library for on-machine sequencing.
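The multiplex PCR reaction system in step 3) fixes the primer pool and master mix volumes and the template mass, so the template and water volumes depend on the measured template concentration. The following Python sketch works through that arithmetic and the total programmed hold time of the cycling conditions. It is an illustration only: the function names are hypothetical, the template concentration is an assumed input from the quantification in step 2), and instrument ramp times are ignored.

```python
from typing import Dict


def multiplex_pcr_mix(template_conc_ng_per_ul: float,
                      template_ng: float = 50.0,
                      total_ul: float = 40.0,
                      primer_pool_ul: float = 8.0,
                      master_mix_ul: float = 20.0) -> Dict[str, float]:
    """Volumes (in microlitres) for one 40 uL multiplex PCR reaction."""
    if template_conc_ng_per_ul <= 0:
        raise ValueError("template concentration must be positive")
    # Volume of template needed to deliver the requested mass.
    template_ul = template_ng / template_conc_ng_per_ul
    # RNase-free water fills the reaction up to the total volume.
    water_ul = total_ul - primer_pool_ul - master_mix_ul - template_ul
    if water_ul < 0:
        raise ValueError("template is too dilute to fit into the reaction volume")
    return {
        "primer_pool": primer_pool_ul,
        "master_mix_2x": master_mix_ul,
        "template": template_ul,
        "water": water_ul,
    }


def multiplex_hold_minutes(cycles: int) -> float:
    """Programmed hold time of the step 3) program, excluding ramp times."""
    if not 25 <= cycles <= 35:
        raise ValueError("the claim specifies 25-35 cycles")
    seconds = 15 * 60 + cycles * (30 + 90 + 60) + 30 * 60
    return seconds / 60


# Assumed template concentration of 10 ng/uL, for illustration only.
mix = multiplex_pcr_mix(10.0)
minutes = multiplex_hold_minutes(35)
```

With a 10 ng/μL template this gives 5 μL of template and 7 μL of water, and 35 cycles correspond to 150 minutes of programmed hold time. The library PCR system in step 6) needs no such calculation, since its four listed volumes (22 + 45 + 10 + 23 μL) already sum to 100 μL.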
CN201910761748.6A 2019-08-19 2019-08-19 Design of primer for amplicon sequencing and construction method of amplicon sequencing library Active CN110957005B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910761748.6A CN110957005B (en) 2019-08-19 2019-08-19 Design of primer for amplicon sequencing and construction method of amplicon sequencing library

Publications (2)

Publication Number Publication Date
CN110957005A CN110957005A (en) 2020-04-03
CN110957005B true CN110957005B (en) 2023-03-24

Family

ID=69976256

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910761748.6A Active CN110957005B (en) 2019-08-19 2019-08-19 Design of primer for amplicon sequencing and construction method of amplicon sequencing library

Country Status (1)

Country Link
CN (1) CN110957005B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2012061814A1 (en) * 2010-11-05 2012-05-10 Transgenomic, Inc. Pcr primers and methods for rapid and specific genotyping
CN105718759A (en) * 2016-02-17 2016-06-29 湖南圣维基因科技有限公司 bPrimer batch PCR primer design method based on Primer 3
CN107345251A (en) * 2017-07-10 2017-11-14 中国烟草总公司郑州烟草研究院 Primer for identifying flue-cured tobacco Longjiang 911 combines and kit, application and authentication method
CN107937497A (en) * 2017-11-29 2018-04-20 拓普基因科技(广州)有限责任公司 A kind of multiple PCR primer design method based on Primer3
CN109415764A (en) * 2016-07-01 2019-03-01 纳特拉公司 For detecting the composition and method of nucleic acid mutation

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
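Construction of a high-density genetic linkage map of southern catfish based on RAD-seq technology; Xie Mimi; China Master's Theses Full-text Database, Agricultural Science and Technology; 20170215 (No. 02); full text *
A multiplex PCR primer design system for targeted sequencing; Wang Yaheng; China Master's Theses Full-text Database, Basic Sciences; 20190615 (No. 06); full text *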

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant