CN110364225A - A method for mining ASFV nucleic acid detection sequences using bioinformatics - Google Patents
A method for mining ASFV nucleic acid detection sequences using bioinformatics
- Publication number
- CN110364225A CN201910763772.3A CN201910763772A
- Authority
- CN
- China
- Prior art keywords
- asfv
- file
- gene
- detection
- genome
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B30/00—ICT specially adapted for sequence analysis involving nucleotides or amino acids
- G16B30/10—Sequence alignment; Homology search
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B50/00—ICT programming tools or database systems specially adapted for bioinformatics
- G16B50/30—Data warehousing; Computing architectures
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02A—TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
- Y02A90/00—Technologies having an indirect contribution to adaptation to climate change
- Y02A90/10—Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Theoretical Computer Science (AREA)
- Health & Medical Sciences (AREA)
- Bioinformatics & Cheminformatics (AREA)
- General Health & Medical Sciences (AREA)
- Spectroscopy & Molecular Physics (AREA)
- Medical Informatics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Biotechnology (AREA)
- Evolutionary Biology (AREA)
- Biophysics (AREA)
- Bioethics (AREA)
- Databases & Information Systems (AREA)
- Chemical & Material Sciences (AREA)
- Analytical Chemistry (AREA)
- Proteomics, Peptides & Aminoacids (AREA)
- Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)
Abstract
The invention discloses a method for mining ASFV nucleic acid detection sequences using bioinformatics, and relates to the technical fields of bioinformatics and virus detection. The method provides three R scripts and three Perl scripts which, together with the chewBBACA software, analyze and mine all ASFV whole-genome sequences available in public databases to find conserved and specific sequences, while retaining a matrix file that records the sequence information. Using the matrix file, these sequences are reassigned to the ASFV genomes they came from and sorted in the 5' to 3' direction of each genome. The names of the functional genes corresponding to the ORFs that contain these sequences are then obtained from the ASFV annotation, finally yielding all genes and sequences that can be used for ASFV nucleic acid detection. This bioinformatics approach offers important guidance and considerable practical value for mining nucleic acid detection sequences of other viruses.
Description
Technical field
The present invention relates to the technical fields of bioinformatics and virus detection, and specifically to a method for mining ASFV nucleic acid detection sequences using bioinformatics.
Background art
African swine fever virus (hereinafter ASFV) is a highly contagious, acute and highly lethal virus that infects pigs. Since it was first reported in Kenya in 1921 it has spread widely around the world, and it broke out in China in 2018, causing heavy losses to the agricultural economy. Accurate and rapid detection of ASFV therefore plays an important role in preventing infection and controlling the spread of the virus.
Nucleic acid detection offers high sensitivity and good specificity and is the mainstream method for diagnosing early ASFV infection. At present, almost all ASFV nucleic acid tests use primers and probes designed against the p72 gene. The ASFV genome is 171-193 kb in size, whereas the p72 gene is only 1941 bp and covers only about 1% of the whole genome. Current research shows that the ASFV genome contains many insertion and deletion mutations as well as a considerable number of recombination events. Although no large-scale mutation has yet been observed in the p72 gene, there is no guarantee that the virus will not acquire larger insertions or deletions in this gene in the future. If such mutations occur, the many detection kits currently designed around p72 will no longer detect ASFV effectively. It is therefore of great significance, both for comprehensive detection of ASFV and for controlling its spread, to develop a method that uses other viral genes as nucleic acid detection sequences and to mine as many such sequences as possible as a reserve.
Nucleic acid detection of a virus requires the amplified target sequence to be both conserved and specific. Although ASFV is a DNA virus, its genome is relatively prone to insertion and deletion mutations, so mining different gene fragments as detection targets is an effective way to ensure the diversity and long-term validity of nucleic acid detection methods. Mining as many conserved and specific nucleic acid sequences as possible requires bioinformatics tools. The present invention therefore uses bioinformatics to develop a method that can mine all conserved and specific nucleic acid sequences in ASFV for use in ASFV nucleic acid detection.
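The "about 1%" coverage figure quoted in the background follows directly from the stated sizes. The short Python sketch below only redoes that arithmetic; the constant and function names are illustrative and do not come from the patent.

```python
# Fraction of the ASFV genome covered by the p72 gene, using the sizes
# quoted in the background section (1941 bp gene, 171-193 kb genome).
P72_LENGTH_BP = 1941
GENOME_SIZES_BP = (171_000, 193_000)  # smallest and largest genome sizes quoted


def coverage_percent(gene_bp: int, genome_bp: int) -> float:
    """Return the percentage of the genome spanned by the gene."""
    return 100.0 * gene_bp / genome_bp


coverages = [coverage_percent(P72_LENGTH_BP, size) for size in GENOME_SIZES_BP]
for size, pct in zip(GENOME_SIZES_BP, coverages):
    print(f"{size} bp genome: p72 covers {pct:.2f}% of the genome")
```

Both ends of the size range give a value close to 1%, which is why a single-gene assay leaves almost all of the genome unused as a detection target.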
Summary of the invention
(1) Technical problem to be solved
In view of the deficiencies of the prior art, the present invention provides a method for mining ASFV nucleic acid detection sequences using bioinformatics. It addresses the problem that nucleic acid detection of a virus requires amplified target sequences that are conserved and specific, whereas ASFV, although a DNA virus, is relatively prone to genomic insertion and deletion mutations.
(2) Technical solution
To achieve the above object, the present invention is realized by the following technical solution: a method for mining ASFV nucleic acid detection sequences using bioinformatics, comprising the following steps:
S1. First obtain the existing ASFV genomes from the NCBI nucleotide database, downloading each genome as a separate FASTA file. Name each file, store all files in one folder, and name the folder. For example, the FASTA file of the genome whose ID is AM712239 is named AM712239.fa; if an ID contains underscores, all underscores are removed;
S2. Create a separate ref-genome folder (the name may be customized), randomly select two or more genome files and place them in it, and use one of them as the reference genome. The reference genome selected in the present invention is the ASFV genome with GenBank accession U18466.2. First analyze these genomes with the chewBBACA software to mine all genes;
S3. Using the chewBBACA software with the gene-by-gene allele calling algorithm, call prodigal 2.6.0 to predict the genes of 40 ASFV whole-genome sequences and call blastp to compare all genes, calculating the BLAST Score Ratio (BSR) and screening on the BSR value. Genes with a BSR value greater than 0.6 are treated as alleles. The software is then used for allele calling to screen out the core genes and to output a matrix file of the core gene types of all genomes, including the reference genome. The software also calls clustalw2.1 and mafft v7.4.07 to perform multiple sequence alignment of all types of ASFV core genes against the corresponding core gene sequences of the reference genome, and outputs an alignment result file covering every ASFV core gene type, thereby obtaining information on all conserved core genes of ASFV;
S4. Use the written R script to read the core gene type matrix file output by chewBBACA and the alignment file of the representative core gene sequences. By traversing the matrix data and pattern-matching the core genes, reassign each ASFV core gene and output one total FASTA file, named total.fasta, containing all core gene sequences of all ASFV genomes;
S5. Use the written Perl script to read the total FASTA file output by the R script in step S4 and assign all core genes of each ASFV genome to a separate FASTA file, so that each FASTA file contains only the core genes of one ASFV genome;
S6. Use another written Perl script to loop over the separate per-genome core gene FASTA files output by the Perl script in step S5, sort all core genes of each ASFV genome in the 5' to 3' direction of the genome, and generate the sorted gene files;
S7. Use the written R script to extract all gene sequences and gene names from the fully annotated ASFV reference genome according to its gff3 file, and build a BLAST database from all extracted gene nucleotide sequences. The reference genome selected in the present invention is the ASFV genome with GenBank accession U18466.2;
S8. Run a local BLAST of the sorted gene files from step S6 against the BLAST database built in step S7, and screen for the best hits with similarity greater than 90% and length greater than 450 bp. Using the gene names in the step S7 output file that correspond to the best hits, extract the names and annotation information of the genes used from the gbk file of the reference genome with the written R script. This output file contains the names of all screened genes whose sequences can be used for nucleic acid detection. Merge the sorted gene sequences of all ASFV genomes into one FASTA file;
S9. Extract the target detection gene sequences separately from the reference genome and build a local BLAST database. Compare the FASTA file obtained in step S8 with this database, screen for results with similarity greater than 90% and length greater than 450 bp, name the result file result.txt, and extract all sequences with the written Perl script. The output of this step is the mined conserved sequences, which are then subjected to multiple sequence alignment, after which primers and probes are designed for nucleic acid detection.
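The screening rule used in steps S8 and S9 (similarity greater than 90%, length greater than 450 bp, best hit retained) can be sketched in Python as follows. This is a minimal illustration, not the patent's R or Perl code: it assumes BLAST results in the standard 12-column tabular layout (`-outfmt 6`) and assumes that the "best" hit means the highest bit score per query. The gene and reference names in the example are made up.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    query: str
    subject: str
    identity: float  # percent identity
    length: int      # alignment length in bp
    bitscore: float


def parse_hits(lines):
    """Parse BLAST tabular lines (assumed -outfmt 6 column order)."""
    hits = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        f = line.split("\t")
        hits.append(Hit(f[0], f[1], float(f[2]), int(f[3]), float(f[11])))
    return hits


def best_hits(hits, min_identity=90.0, min_length=450):
    """Keep hits above both thresholds, then the top-bitscore hit per query."""
    best = {}
    for h in hits:
        if h.identity > min_identity and h.length > min_length:
            if h.query not in best or h.bitscore > best[h.query].bitscore:
                best[h.query] = h
    return best


# Hypothetical example rows: only geneA passes both thresholds.
example = [
    "geneA\tref1\t98.5\t900\t13\t0\t1\t900\t1\t900\t0.0\t1500",
    "geneA\tref2\t91.0\t500\t45\t0\t1\t500\t1\t500\t1e-150\t600",
    "geneB\tref3\t85.0\t800\t120\t0\t1\t800\t1\t800\t1e-100\t400",
    "geneC\tref4\t99.0\t300\t3\t0\t1\t300\t1\t300\t1e-120\t520",
]
selected = best_hits(parse_hits(example))
for query, hit in selected.items():
    print(query, hit.subject, hit.identity, hit.length)
```

Both thresholds are strict inequalities, matching the "greater than" wording of the steps.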
Preferably, in step S1 each file is named with the genome ID plus the .fa suffix, and the folder is named Genomes.fa or given a custom name.
Preferably, the two genomes randomly selected in step S2 are genomes with complete annotation information.
Preferably, the R script written in step S4 is named merge_all_allele2total_fasta.R.
Preferably, the Perl script written in step S5 is named assign_each_sample_core_allele2each_file.pl.
Preferably, the Perl script written in step S6 is named sort_each_sample_allele.pl.
Preferably, the R script written in step S7 is named extract_genesBygff.R.
Preferably, the R script written in step S8 is named extract_gene_name_infoBygbk.R.
Preferably, the Perl script written in step S9 is named extract_seqByid.pl.
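The patent does not reproduce the source of these scripts. As an illustration of the kind of operation the name extract_seqByid.pl suggests for step S9 (pulling sequences out of a FASTA file by ID), here is a minimal Python sketch. It assumes that the sequence ID is the first word of each FASTA header and that the wanted IDs have already been read from a result file. All names and data in the example are hypothetical.

```python
def read_fasta(text):
    """Parse FASTA text into a dict of {sequence ID: sequence}."""
    records = {}
    current = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            fields = line[1:].split()
            current = fields[0] if fields else ""  # ID = first word of header
            records[current] = []
        elif current is not None:
            records[current].append(line)
    return {seq_id: "".join(parts) for seq_id, parts in records.items()}


def extract_by_id(records, wanted_ids):
    """Return the wanted records in the requested order, skipping unknown IDs."""
    return {i: records[i] for i in wanted_ids if i in records}


def to_fasta(records, width=60):
    """Format records as FASTA text with fixed-width sequence lines."""
    lines = []
    for seq_id, seq in records.items():
        lines.append(f">{seq_id}")
        lines.extend(seq[i:i + width] for i in range(0, len(seq), width))
    return "\n".join(lines) + "\n"


# Hypothetical input: three short records, two of which are requested.
fasta_text = ">geneA first\nATGC\nGGTA\n>geneB\nTTTT\n>geneC\nCCCC\n"
wanted = ["geneC", "geneA", "geneX"]
picked = extract_by_id(read_fasta(fasta_text), wanted)
print(to_fasta(picked), end="")
```

IDs that are requested but absent (geneX here) are silently skipped; a production script would probably report them instead.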
(3) Beneficial effects
The present invention provides a method for mining ASFV nucleic acid detection sequences using bioinformatics. Compared with the prior art, it has the following beneficial effects. By mining all sequences in ASFV that can serve as nucleic acid detection sequences, the method overcomes the reliance of existing nucleic acid detection methods on p72 as the single target gene, and provides a bioinformatics method for mining viral nucleic acid detection genes. A total of 52 genes usable for ASFV detection, including the p72 gene, were mined, and 4 randomly selected genes among them were verified to be effective for ASFV nucleic acid detection. In the method, existing ASFV genome information is first obtained from the NCBI nucleotide database, and the gene-by-gene allele calling method is used to obtain a matrix of the different allele types across ASFV genomes. Written scripts then use the matrix information to restore and assign the allele nucleotide sequences to the corresponding ASFV strains and sort them in the 5' to 3' direction of the genome, building an allele library. Next, the gbk file of an ASFV strain genome with complete gene annotation is selected at random and the names of all mined nucleic acid sequences are extracted. Finally, written scripts extract the conserved and specific sequences from the allele library for ASFV nucleic acid detection.
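The matrix-driven reassignment described above can be sketched in Python as follows. This is an assumption-laden illustration rather than the patent's scripts. The matrix is modelled as a dictionary of genome to locus to cell value. A cell counts as a valid call if it is an integer allele ID, optionally prefixed with `INF-` (the label assumed here for newly inferred alleles), and any other label is treated as "locus not found". The locus names, sequences and start positions are invented for the example.

```python
def allele_id(cell):
    """Return the allele ID in a matrix cell, or None if the locus was not called."""
    if cell.startswith("INF-"):
        cell = cell[4:]
    return cell if cell.isdigit() else None


def core_loci(matrix):
    """Loci that have a valid allele call in every genome of the matrix."""
    if not matrix:
        return []
    loci = list(next(iter(matrix.values())))
    return [locus for locus in loci
            if all(allele_id(row.get(locus, "")) is not None
                   for row in matrix.values())]


def assign_core_sequences(matrix, alleles):
    """Give each genome the sequence of its own allele at every core locus."""
    core = core_loci(matrix)
    return {genome: {locus: alleles[locus][allele_id(row[locus])] for locus in core}
            for genome, row in matrix.items()}


def order_by_position(loci, start_positions):
    """Sort loci by start coordinate, i.e. 5' to 3' along the genome."""
    return sorted(loci, key=lambda locus: start_positions[locus])


# Hypothetical toy data: locus3 is missing in one genome, so it is not core.
matrix = {
    "U18466": {"locus1": "1", "locus2": "1", "locus3": "LNF"},
    "AM712239": {"locus1": "2", "locus2": "INF-2", "locus3": "1"},
}
alleles = {
    "locus1": {"1": "ATGAAA", "2": "ATGAAG"},
    "locus2": {"1": "ATGCCC", "2": "ATGCCT"},
    "locus3": {"1": "ATGTTT"},
}
assigned = assign_core_sequences(matrix, alleles)
ordered = order_by_position(core_loci(matrix), {"locus1": 1200, "locus2": 300})
print(ordered, assigned["AM712239"])
```

Restricting the output to loci called in every genome is what makes the resulting library a set of candidates that are conserved across all analyzed strains.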
Brief description of the drawings
Fig. 1 is a flow chart of the present invention;
Fig. 2 shows the results of running the program of the present invention;
Fig. 3 is a table of the names of all genes usable for ASFV nucleic acid detection according to the present invention and the proteins they encode;
Fig. 4 is a table of the primer and probe sequences of the present invention;
Fig. 5 shows the conventional PCR amplification results for the detection sequences of the MGF 360-15R and CP312R genes of the present invention;
Fig. 6 shows the conventional PCR amplification results for the detection sequence of the E184L gene of the present invention;
Fig. 7 is the amplification curve of the ASFV MGF 360-15R gene of the present invention;
Fig. 8 is the amplification curve of the ASFV CP312R gene of the present invention;
Fig. 9 is the amplification curve of the ASFV E184L gene of the present invention;
Fig. 10 is the amplification curve of the ASFV MGF 505 gene of the present invention;
Fig. 11 shows the qPCR results of the detection sensitivity analysis for the MGF 360-15R gene;
Fig. 12 shows the qPCR results of the detection sensitivity analysis for the CP312R gene;
Fig. 13 shows the qPCR results of the detection sensitivity analysis for the E184L gene;
Fig. 14 shows the qPCR results of the detection sensitivity analysis for the multicopy fragment of the MGF 505 gene family.
Detailed description of the embodiments
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are evidently only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art on the basis of these embodiments without creative effort fall within the protection scope of the present invention.
Referring to Figs. 1-14, an embodiment of the present invention provides a technical solution: a method for mining ASFV nucleic acid detection sequences using bioinformatics, comprising the following steps:
S1. First obtain the existing ASFV genomes from the NCBI nucleotide database, downloading each genome as a separate FASTA file. Name each file, store all files in one folder, and name the folder. For example, the FASTA file of the genome whose ID is AM712239 is named AM712239.fa; if an ID contains underscores, all underscores are removed. Each file is named with the genome ID plus the .fa suffix, and the folder is named genomes.fa or given a custom name;
S2. Create a separate ref-genome folder (the name may be customized), randomly select two genome files and place them in it, and use one of them as the reference genome. The reference genome selected in the present invention is the ASFV genome with GenBank accession U18466.2, and the two randomly selected genomes are genomes with complete annotation information. Analyze these genomes to mine all genes;
S3. Using the chewBBACA software with the gene-by-gene allele calling algorithm, call prodigal 2.6.0 to predict the genes of all genomes and call blastp to compare all genes, calculating the BLAST Score Ratio (BSR) and screening on the BSR value. Genes with a BSR value greater than 0.6 are treated as alleles. The software is then used for allele calling to screen out the core genes and to output a matrix file of the core gene types of all genomes, including the reference genome. The software also calls clustalw2.1 and mafft v7.4.07 to perform multiple sequence alignment of all types of ASFV core genes against the corresponding core gene sequences of the reference genome, and outputs an alignment result file covering every ASFV core gene type, thereby obtaining information on the conserved core genes of ASFV;
S4. Use the written R script to read the core gene type matrix file output by chewBBACA and the alignment file of the representative core gene sequences. By traversing the matrix data and pattern-matching the core genes, reassign each ASFV core gene and output one total FASTA file, named total.fasta, containing all core gene sequences of all ASFV genomes. The R script is named merge_all_allele2total_fasta.R;
S5. Use the written Perl script to read the total FASTA file output by the R script in step S4 and assign all core genes of each ASFV genome to a separate FASTA file, so that each FASTA file contains only the core genes of one ASFV genome. The Perl script is named assign_each_sample_core_allele2each_file.pl;
S6. Use another written Perl script to loop over the separate per-genome core gene FASTA files output in step S5, sort all core genes of each ASFV genome in the 5' to 3' direction of the genome, and obtain the sorted gene files. The Perl script is named sort_each_sample_allele.pl;
S7. Use the written R script to extract all gene sequences from the fully annotated ASFV reference genome according to its gff3 file, and build a BLAST database from the extracted gene nucleotide sequences. The R script is named extract_genesBygff.R, and the reference genome selected in the present invention is the ASFV genome with GenBank accession U18466.2;
S8. Run a local BLAST of the sorted gene files from step S6 against the BLAST database built in step S7, and screen for the best hits with similarity greater than 90% and length greater than 450 bp. Using the gene names in the step S7 output file that correspond to the best hits, extract the names and annotation information of the genes used from the gbk file of the reference genome with the written R script. This output file contains the names of all screened genes whose sequences can be used for nucleic acid detection. The R script is named extract_gene_name_infoBygbk.R. Merge the sorted gene sequences of all ASFV genomes into one FASTA file;
S9. Extract the target detection gene sequences from the reference genome and build a local BLAST database. Compare the FASTA file obtained in step S8 with this database, screen for results with similarity greater than 90% and length greater than 450 bp, name the result file result.txt, and extract all sequences with the written Perl script. The output of this step is the mined conserved sequences, which are then subjected to multiple sequence alignment, after which primers and probes are designed for nucleic acid detection. The Perl script is named extract_seqByid.pl. Information on all conserved ASFV genes mined by the present invention that can be used for ASFV nucleic acid detection is shown in the table in Fig. 3.
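The BSR screen in step S3 is carried out inside chewBBACA, so the patent gives no code for it. The Python sketch below only illustrates the underlying idea, under the usual definition of the BLAST Score Ratio: the bit score of a query aligned to a candidate, divided by the bit score of the query aligned to itself. The function names and the example scores are hypothetical.

```python
def blast_score_ratio(pair_bitscore: float, self_bitscore: float) -> float:
    """BSR = bit score of query vs. candidate / bit score of query vs. itself."""
    if self_bitscore <= 0:
        raise ValueError("self-alignment bit score must be positive")
    return pair_bitscore / self_bitscore


def is_allele(pair_bitscore: float, self_bitscore: float,
              threshold: float = 0.6) -> bool:
    """Treat the candidate as an allele of the same gene if BSR > threshold."""
    return blast_score_ratio(pair_bitscore, self_bitscore) > threshold


# Hypothetical bit scores: the query scores 600 against itself.
print(blast_score_ratio(450.0, 600.0))  # candidate with BSR 0.75
print(is_allele(450.0, 600.0))          # above the 0.6 cut-off
print(is_allele(300.0, 600.0))          # BSR 0.5, below the cut-off
```

Because the score is normalized by the self-alignment score, the same 0.6 cut-off can be applied to genes of very different lengths.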
ASFV detection primers and probes were designed using the method proposed by the present invention:
The genes MGF 360-15R, CP312R and E184L were randomly selected, together with a multicopy (3 copies), relatively conserved fragment of the MGF 505 gene family on the ASFV genome, as ASFV detection targets. Primers and probes were designed with the software Beacon Designer 8.
For the MGF 360-15R gene, the detection interval spans bases 112-263. The upstream primer starts at base 112: 5'-ATGGACATGATATGTCTAGAC-3'. The downstream primer starts at base 245: 5'-GCACATCATCTACTACAAG-3'. The probe starts at base 148: 5'(6-FAM)-CCTGCTCCTCTGGCGATGAT-3'(BHQ-1).
For the CP312R gene, the detection interval spans bases 310-461. The upstream primer starts at base 310: 5'-GATCCCTGTTTGCAGTTC-3'. The downstream primer starts at base 441: 5'-GCTTCTTCTAACAGTTCAATA-3'. The probe starts at base 339: 5'(6-FAM)-AATCTCGCCGCCATTGGAAG-3'(BHQ-1).
For the E184L gene, the detection interval spans bases 92-235. The upstream primer starts at base 92: 5'-CACCATTCTAAACCATATCTG-3'. The downstream primer starts at base 218: 5'-CACCTGAGGAGAAGAATC-3'. The probe starts at base 191: 5'(6-FAM)-CCTCCTTCGAGAGCCCATCTTTGA-3'(BHQ-1).
For the MGF 505 gene, the detection interval spans bases 1-101. Because the sequences within the MGF 505 gene family are not highly conserved, degenerate primers and a degenerate probe were designed so as to detect as many copies of the family as possible. The upstream primer starts at base 1: 5'-TTACTRTGGRAGGGRA-3'. The downstream primer starts at base 85: 5'-TGRCAGTCYYCRATTTG-3'. The probe starts at base 28: 5'(6-FAM)-TCYAARGCTCCTATGATGGC-3'(BHQ-1).
The primer and probe sequences used in the experiments are listed in the table in Fig. 4.
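The MGF 505 oligonucleotides above use IUPAC degenerate codes (R = A or G, Y = C or T), so each one stands for a small pool of concrete sequences. The following Python sketch expands a degenerate sequence into that pool using the standard IUPAC nucleotide code table. It is an illustrative helper, not part of the patent's workflow or of Beacon Designer.

```python
from itertools import product

# Standard IUPAC nucleotide codes mapped to the bases they represent.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand_degenerate(seq: str) -> list:
    """Return every concrete sequence encoded by a degenerate oligonucleotide."""
    choices = [IUPAC[base] for base in seq.upper()]
    return ["".join(combo) for combo in product(*choices)]


mgf505_oligos = {
    "upstream primer": "TTACTRTGGRAGGGRA",
    "downstream primer": "TGRCAGTCYYCRATTTG",
    "probe": "TCYAARGCTCCTATGATGGC",
}
for name, seq in mgf505_oligos.items():
    variants = expand_degenerate(seq)
    print(f"{name}: {len(variants)} concrete sequences, e.g. {variants[0]}")
```

Each two-fold degenerate position doubles the pool size, so three R positions give 8 variants and four R/Y positions give 16.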
The sensitivity and specificity of the ASFV detection primers and probes were evaluated by conventional PCR and fluorescent quantitative PCR. The specific detection method comprises the following steps:
T1. Simulated test samples: plasmids containing the above target fragments were added in different amounts to whole blood samples from different pigs to produce 10 simulated ASFV test samples. The fragment sequences on the plasmids are those of the four genes above in the reference genome with GenBank accession U18466.2. Samples 2, 3, 9, 10 and 11 are simulated positive samples, and samples 1, 5, 6, 7 and 8 are simulated negative samples.
T2. Nucleic acids were extracted from the simulated ASFV test samples with the Qiagen QIAamp DNA Blood Mini Kit (Cat. 51104) according to the manual.
T3. The primer and probe dry powders synthesized by the company were each dissolved to a concentration of 20 μM.
T4. Conventional PCR amplification: the total reaction volume is 25 μL, consisting of 2.5 μL 10× buffer, 2 μL dNTPs, 0.4 μL 20 μM upstream primer, 0.4 μL 20 μM downstream primer, 0.5 μL DNA polymerase and 17.2 μL sterile water, with 2 μL of the DNA template extracted in step T2 added last. The reaction mixture is mixed and briefly centrifuged. The reaction conditions are initial denaturation at 95 °C for 3 min; 35 cycles of 95 °C for 30 s, 58 °C for 30 s and 72 °C for 30 s; then 72 °C for 5 min and 12 °C for 5 min. Amplification is performed on a BioRad T100 PCR instrument. After amplification, the products are examined by agarose gel electrophoresis to verify the reaction system and the primer specificity. In this experiment, conventional PCR was used to evaluate the specificity of the three primer pairs designed for the MGF 360-15R, CP312R and E184L genes, giving the agarose gel electrophoresis results of the conventional PCR products shown in Fig. 5 and Fig. 6. In the conventional PCR results for the MGF 360-15R and CP312R detection sequences in Fig. 5, 1-10 are the MGF 360-15R amplification results, of which 2, 3, 8, 9 and 10 are simulated positive samples and 1, 4, 5, 6 and 7 are simulated negative samples; 11-20 are the CP312R amplification results, of which 12, 13, 18, 19 and 20 are simulated positive samples and 11, 14, 15, 16 and 17 are simulated negative samples. In the conventional PCR results for the E184L detection sequence in Fig. 6, 21-30 are the E184L amplification results, of which 22, 23, 28, 29 and 30 are simulated positive samples and 21, 24, 25, 26 and 27 are simulated negative samples.
T5. Fluorescent quantitative PCR: a TaqMan real-time fluorescent quantitative PCR system is used with a total volume of 25 μL, consisting of 12.5 μL Takara MIX (Code No. RR390A), 0.4 μL 20 μM upstream primer, 0.4 μL 20 μM downstream primer, 0.4 μL 20 μM probe and 9.3 μL DEPC water, with 2 μL of the DNA template extracted in step T2 added last. The reaction mixture is mixed and briefly centrifuged. The reaction conditions are initial denaturation at 95 °C for 1 min, followed by 40 cycles of 95 °C for 10 s and 58 °C for 30 s. Amplification is performed on a BioRad CFX96 fluorescent quantitative PCR instrument. This experiment evaluated the detection performance of the primers and probes designed for the MGF 360-15R, CP312R, E184L and MGF 505 genes, giving the fluorescent quantitative PCR amplification results shown in Fig. 7, Fig. 8, Fig. 9 and Fig. 10.
T6. Further evaluation of the specificity and sensitivity of the 4 primer-probe sets: (1) nucleic acids were extracted from Streptococcus suis, hemolytic streptococcus, Listeria monocytogenes, Escherichia coli, Salmonella, the liver and muscle tissue of ASFV-negative pigs, classical swine fever virus (CSFV), porcine reproductive and respiratory syndrome virus (PRRSV) and H7N8 avian influenza virus to test the specificity of the primers and probes; (2) simulated positive sample No. 2 was serially diluted to determine the detection sensitivity of the four primer-probe sets. The resulting fluorescent quantitative PCR results are shown in Fig. 11, Fig. 12, Fig. 13 and Fig. 14.
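The reaction recipes in steps T4 and T5 can be checked with simple arithmetic. The Python sketch below sums the component volumes and derives the final primer and probe concentration from the 20 μM stocks. The component labels are paraphrased from the text, and the helper names are illustrative only.

```python
# Component volumes in microlitres, taken from steps T4 and T5 above.
CONVENTIONAL_PCR_UL = {
    "10x buffer": 2.5,
    "dNTPs": 2.0,
    "upstream primer (20 uM)": 0.4,
    "downstream primer (20 uM)": 0.4,
    "DNA polymerase": 0.5,
    "sterile water": 17.2,
    "DNA template": 2.0,
}
QPCR_UL = {
    "Takara MIX": 12.5,
    "upstream primer (20 uM)": 0.4,
    "downstream primer (20 uM)": 0.4,
    "probe (20 uM)": 0.4,
    "DEPC water": 9.3,
    "DNA template": 2.0,
}


def total_volume_ul(mix: dict) -> float:
    """Sum the component volumes of a reaction mix."""
    return sum(mix.values())


def final_concentration_um(stock_um: float, volume_ul: float, total_ul: float) -> float:
    """Dilution formula C1*V1 = C2*V2, solved for the final concentration C2."""
    return stock_um * volume_ul / total_ul


for name, mix in (("conventional PCR", CONVENTIONAL_PCR_UL), ("qPCR", QPCR_UL)):
    print(f"{name}: total {total_volume_ul(mix):.1f} uL")
print(f"final primer/probe concentration: {final_concentration_um(20, 0.4, 25):.2f} uM")
```

Both recipes add up to the stated 25 μL, and each 0.4 μL addition of a 20 μM stock corresponds to a final concentration of 0.32 μM.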
Test results:
The conventional PCR results show that the three primer pairs designed for the MGF 360-15R, CP312R and E184L genes accurately detect the simulated ASFV positive samples, while the negative samples show no amplification, indicating that pig genes do not interfere with the result. The qPCR curves for the MGF 360-15R, CP312R and E184L genes and the multicopy fragment of the MGF 505 gene family show that the designed primers and probes detect simulated positive samples 2, 3, 9, 10 and 11, with no amplification in the simulated negative samples or the negative control group. The minimum detection limits of the four primer-probe sets designed for MGF 360-15R, CP312R, E184L and the multicopy fragment of the MGF 505 gene family, measured on simulated positive sample No. 2, are 22 copies/μL, 22 copies/μL, 22 copies/μL and 2256 copies/μL respectively, indicating that these primer-probe sets are highly sensitive for their detection fragments. In addition, qPCR with the four primer-probe sets gave no positive signal for Streptococcus suis, hemolytic streptococcus, Listeria monocytogenes, Escherichia coli, Salmonella, the liver and muscle tissue of African swine fever negative pigs, classical swine fever virus (CSFV), porcine reproductive and respiratory syndrome virus (PRRSV) or H7N8 avian influenza virus, showing high specificity. The method proposed by the present invention therefore has high practical value for ASFV detection, and it is not limited to ASFV: the same approach can be used to mine all candidate genes usable for nucleic acid detection of other viruses.
It should be noted that, in this document, relational terms such as first and second are used only to distinguish one entity or operation from another and do not necessarily require or imply any actual relationship or order between these entities or operations. Moreover, the terms "include", "comprise" and any variants thereof are intended to cover non-exclusive inclusion, so that a process, method, article or device comprising a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, method, article or device.
Although embodiments of the present invention have been shown and described, a person of ordinary skill in the art will understand that various changes, modifications, substitutions and variations can be made to these embodiments without departing from the principles and spirit of the invention, the scope of which is defined by the appended claims.
Claims (9)
1. a kind of method using raw letter technology mining ASFV detection of nucleic acids sequence, it is characterised in that: specifically includes the following steps:
S1, from the nucleic acid database of NCBI existing ASFV genome is obtained first, individually using each genome as one
The downloading of FASTA formatted file, is then named file, then All Files are stored in a file, orders file
Name;
S2, separately one ref-genome file of creation randomly choose two or more genome files and are put into wherein, use
ChewBBACA software first carries out analysis mining to the genome in ref-genome file and goes out whole genes;
S3, all genomes are called using Gene by gene allele calling algorithm using chewBBACA software
Prodigal2.6.0 predicted gene, blastp are compared to whole genes, and are based on BSR calculating sifting BSR value, then will
Gene of the BSR value greater than 0.6 recycles the software to carry out allele calling, filters out core as allele
Genes, the matrix file of one core genes type comprising all genomes of output, also with the software transfer
Clustalw2.1 and mafft v7.4.07 by all types of core genes of ASFV with reference to the corresponding core of genome
Genes sequence carries out Multiple Sequence Alignment, exports the comparison result file comprising each core gene type of ASFV, thus
Obtain the information of all Conserved core genes of ASFV;
S4, use a custom R script to read in the core gene type matrix file and the representative sequence alignment files of the core genes output by the chewBBACA software, reassign each core gene of each ASFV by traversing the matrix data and pattern-matching the core genes, and output one total FASTA file containing all core gene sequences of all ASFV genomes, named total.fasta;
S5, use a custom Perl script to read the total FASTA file output by the R script in step S4 and assign all core genes of each ASFV to a separate FASTA file, so that each individual FASTA file contains only the core genes of one ASFV genome;
S6, use another custom Perl script to loop over the separate FASTA files output in step S5, each containing all core genes of one ASFV, and sort the core genes of each ASFV in the 5' to 3' direction of its genome; this step produces the sorted gene files;
S7, use a custom R script to extract, according to the gff3 file, all gene sequences and gene names from one ASFV reference genome with complete annotation information that was selected and placed in the ref-genome folder, and build a BLAST database from all of the extracted gene nucleic acid sequences;
S8, using the sorted gene files from step S6, run a local BLAST against the BLAST database built in step S7 and keep the best hits with similarity greater than 90% and length greater than 450 bp; using the gene names in the step S7 output file that correspond to the best hits, extract the names and annotation information of the genes used from the gbk file of the reference genome with a custom R script; this output file contains the names of all screened genes whose sequences can be used for nucleic acid detection. Merge the sorted gene sequences of all ASFV genomes into one FASTA file. In summary, this builds the library of all ASFV nucleic acid sequences usable as nucleic acid detection sequences (the FASTA file of merged, sorted gene sequences generated in S8) together with the corresponding gene names (the S8 output containing the names of all screened genes usable as nucleic acid detection sequences).
S9, to extract a nucleic acid sequence intended for detection and design primers, separately extract the conserved target detection gene sequence from the reference genome and build a local BLAST database from it; compare the FASTA file obtained in step S8 against the local BLAST database built in this step, screen for results with similarity greater than 90% and length greater than 450 bp, and name the result file result.txt; use a custom Perl script to extract all the sequences; the output of this step is the mined conserved sequences to be detected, which are then subjected to multiple sequence alignment, after which primers and probes for nucleic acid detection are designed separately.
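Step S3 screens candidate alleles by BSR value with a cutoff of 0.6. As a minimal sketch of that arithmetic only (not chewBBACA's internal code), the BSR of a candidate can be taken as its alignment score against a reference allele divided by the reference allele's self-alignment score. The function names, the locus and CDS identifiers, and the scores below are hypothetical, chosen for illustration.

```python
def blast_score_ratio(hit_score, self_score):
    """BSR = score of reference-vs-candidate alignment / score of reference-vs-itself."""
    if self_score <= 0:
        raise ValueError("self-alignment score must be positive")
    return hit_score / self_score


def candidates_above_threshold(self_scores, hit_scores, threshold=0.6):
    """Keep (reference, candidate, BSR) triples whose BSR exceeds the threshold.

    self_scores: {reference_id: self-alignment score}
    hit_scores:  {(reference_id, candidate_id): alignment score}
    """
    kept = []
    for (ref_id, cand_id), score in hit_scores.items():
        bsr = blast_score_ratio(score, self_scores[ref_id])
        if bsr > threshold:  # strict "greater than 0.6", as in step S3
            kept.append((ref_id, cand_id, round(bsr, 3)))
    return kept


# Hypothetical scores, for illustration only.
self_scores = {"locus1": 500.0}
hit_scores = {
    ("locus1", "cdsA"): 450.0,  # BSR 0.9, kept
    ("locus1", "cdsB"): 200.0,  # BSR 0.4, dropped
    ("locus1", "cdsC"): 250.0,  # BSR 0.5, dropped
}
kept = candidates_above_threshold(self_scores, hit_scores)
print(kept)
```

Because the ratio is normalized by the self-alignment score, the same 0.6 cutoff can be applied to genes of different lengths.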
2. The method for mining ASFV nucleic acid detection sequences using bioinformatics technology according to claim 1, characterized in that: in step S1, each file is named with the genome ID number plus the fa suffix, and the folder is named genomes.fa or given a custom name.
3. The method for mining ASFV nucleic acid detection sequences using bioinformatics technology according to claim 1, characterized in that: the genomes randomly selected in step S2 are genomes with complete annotation information.
4. The method for mining ASFV nucleic acid detection sequences using bioinformatics technology according to claim 1, characterized in that: the R script written in step S4 is named merge_all_allele2total_fasta.R.
5. The method for mining ASFV nucleic acid detection sequences using bioinformatics technology according to claim 1, characterized in that: the Perl script written in step S5 is named assign_each_sample_core_allele2each_file.pl.
6. The method for mining ASFV nucleic acid detection sequences using bioinformatics technology according to claim 1, characterized in that: the Perl script written in step S6 is named sort_each_sample_allele.pl.
7. The method for mining ASFV nucleic acid detection sequences using bioinformatics technology according to claim 1, characterized in that: the R script written in step S7 is named extract_genesBygff.R.
8. The method for mining ASFV nucleic acid detection sequences using bioinformatics technology according to claim 1, characterized in that: the R script written in step S8 is named extract_gene_name_infoBygbk.R.
9. The method for mining ASFV nucleic acid detection sequences using bioinformatics technology according to claim 1, characterized in that: the Perl script written in step S9 is named extract_seqByid.pl.
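The screening and extraction in step S9 can be sketched as follows. This is a hedged Python illustration, not the patent's Perl script extract_seqByid.pl, whose contents are not given. It assumes that result.txt is in the default BLAST+ tabular layout (-outfmt 6), that "similarity" means the percent-identity column, and that sequences are extracted by query ID; the function names and sample records are hypothetical.

```python
import io


def passing_query_ids(blast_tab, min_identity=90.0, min_length=450):
    """Collect query IDs from BLAST tabular lines that pass both cutoffs."""
    ids = set()
    for line in blast_tab:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        # Columns 1, 3 and 4 of -outfmt 6: qseqid, pident, alignment length.
        qseqid, pident, length = fields[0], float(fields[2]), int(fields[3])
        if pident > min_identity and length > min_length:
            ids.add(qseqid)
    return ids


def extract_by_id(fasta, wanted):
    """Return {id: sequence} for FASTA records whose ID is in `wanted`."""
    records, current = {}, None
    for line in fasta:
        line = line.strip()
        if line.startswith(">"):
            header = line[1:].split()
            seq_id = header[0] if header else ""
            current = seq_id if seq_id in wanted else None
            if current is not None:
                records[current] = ""
        elif current is not None:
            records[current] += line
    return records


# Hypothetical BLAST hits and sequences, for illustration only.
result_txt = io.StringIO(
    "geneA\tref1\t98.5\t600\t9\t0\t1\t600\t1\t600\t0.0\t1050\n"
    "geneB\tref1\t88.0\t900\t108\t0\t1\t900\t1\t900\t0.0\t1100\n"  # identity too low
    "geneC\tref1\t99.0\t300\t3\t0\t1\t300\t1\t300\t1e-150\t540\n"  # too short
)
fasta = io.StringIO(">geneA sample1\nATGC\nGGTA\n>geneB sample1\nTTTT\n")

selected = extract_by_id(fasta, passing_query_ids(result_txt))
print(selected)
```

Both cutoffs are applied as strict inequalities, matching the "greater than 90%" and "greater than 450 bp" wording of the claim.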
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910763772.3A CN110364225B (en) | 2019-08-19 | 2019-08-19 | Method for mining ASFV nucleic acid detection sequences using bioinformatics technology |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110364225A (en) | 2019-10-22 |
CN110364225B (en) | 2023-08-08 |
Family
ID=68225216
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910763772.3A (Active, CN110364225B) | Method for mining ASFV nucleic acid detection sequences using bioinformatics technology | 2019-08-19 | 2019-08-19 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110364225B (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2012079016A1 (en) * | 2010-12-10 | 2012-06-14 | Brandeis University | Compositions and methods for the detection and analysis of african swine fever virus |
US9474797B1 (en) * | 2014-06-19 | 2016-10-25 | The United States Of America, As Represented By The Secretary Of Agriculture | African swine fever virus georgia strain adapted to efficiently grow in the vero cell line |
CN107784199A (en) * | 2017-10-18 | 2018-03-09 | 中国科学院昆明植物研究所 | An organelle genome screening method based on total DNA sequencing results |
CN109295255A (en) * | 2018-09-18 | 2019-02-01 | 张薇 | A rapid nucleic acid detection method for African swine fever virus |
Non-Patent Citations (1)
Title |
---|
乔彩霞; 刘洋; 刘艳华; 高志强; 林志雄; 刘巍; 田纯见; 王传彬; 王强; 倪建强: "Preparation and preliminary application of proficiency testing samples for African swine fever virus nucleic acid" * |
Also Published As
Publication number | Publication date |
---|---|
CN110364225B (en) | 2023-08-08 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
GR01 | Patent grant | |