CN105274092A - Batch acquiring method for specific isothermal oligonucleotide probes - Google Patents

Publication number
CN105274092A
Authority
CN
China
Prior art keywords
pathogenic bacterium
gene
sequence
pathogenic
peculiar
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201510849278.0A
Other languages
Chinese (zh)
Inventor
牛超
高志贤
刘颖
王涛
王静
王尚
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Institute of Hygiene and Environmental Medicine Academy of Military Medical Sciences of Chinese PLA
Original Assignee
Institute of Hygiene and Environmental Medicine Academy of Military Medical Sciences of Chinese PLA
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Institute of Hygiene and Environmental Medicine Academy of Military Medical Sciences of Chinese PLA
Priority to CN201510849278.0A
Publication of CN105274092A
Legal status: Pending

Landscapes

  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)

Abstract

The invention provides a method for obtaining specific isothermal oligonucleotide probes in batches based on whole-genome sequences. The method comprises the following steps: first, a design algorithm is used to obtain the specific gene fragment sequences of the whole genomes of pathogenic bacteria; next, a specific isothermal probe design algorithm based on these pathogen-specific gene fragment sequences is used to design isothermal oligonucleotide probes in batches on the genes of different pathogenic bacteria; then, in combination with chip scanning, scanning analysis of the whole genomes of pathogenic bacteria is realized, and a detection chip is developed for known bacterial specific genes and virulence genes, thereby achieving rapid screening and confirmation of unknown pathogenic bacteria. Implementation of the method has considerable economic and social significance for the related technical fields.

Description

A batch acquisition method for specific isothermal oligonucleotide probes
Technical field
The present invention relates to the field of biotechnology, and in particular to a method for obtaining specific isothermal oligonucleotide probes in batches based on whole-genome sequences.
Background art
In mid-May 2011, an outbreak of enterohemorrhagic Escherichia coli (EHEC) occurred in Germany. Within just two weeks the outbreak had spread to at least nine European countries, shocking the world. On June 2, the food safety office of the World Health Organization (WHO) announced that the pathogen responsible for the recent German EHEC outbreak was a novel (recombinant) bacterium arising from mutation of two different strains, that it carried a lethal gene (the Stx2 virulence gene), that it was a highly infectious and highly virulent strain, and that it was resistant to some antibiotics. As of June 20, the outbreak had caused 39 deaths and more than 3,000 infections. This outbreak fully demonstrates that new, unknown pathogens that may emerge in the future will carry new, unknown characteristics in virulence, recombination and drug resistance, posing a great threat to public safety.
Since the beginning of the 21st century, the world population has grown steadily, international exchange has flourished, the food supply has become increasingly globalized, ecological environments have been damaged, antibiotics have been widely used, and humans have continually altered the environment. As a result, the range of pathogenic microorganisms causing infectious diseases has become increasingly complex. Not only has the threat of common known pathogenic microorganisms not been eliminated, but drug-resistant strains have appeared, such as Staphylococcus, Enterococcus, Pseudomonas aeruginosa and Escherichia coli, and newly emerging pathogens may appear as well, all of which makes the detection and identification of pathogens very difficult. The World Health Organization (WHO) has announced that dozens of new pathogens have been discovered over the past 30 years, many of them causing infectious diseases of bacterial origin, for example: infection with EHEC O157:H7, cholera caused by Vibrio cholerae serotype O139, Legionnaires' disease caused by Legionella pneumophila, food poisoning caused by Listeria monocytogenes, and Yersinia enterocolitica infection.
Faced with a sudden outbreak, we must identify its cause as quickly as possible (which includes not only determining the category of the pathogen, but also determining its virulence properties and whether it shows recombination or drug-resistance features), so that a treatment plan can be determined and effective measures taken in time to prevent the outbreak from worsening; this also provides key information for drug screening, new drug development and later prevention. Rapid identification of known pathogens is currently developing quickly worldwide, and rapid detection methods based on microbiology, chemistry, molecular biology and immunology can each be used to detect and identify pathogens rapidly. However, these conventional biochemical and immunological tests cannot provide information on the potential pathogenicity or toxicity of a pathogenic microorganism, and most of these rapid detection methods are one-to-one methods, or one-to-many methods of limited scope. The new pathogenic bacteria that may emerge in the future are all unknown pathogens, and identifying them quickly requires a one-to-many detection method. The biochip, as a high-throughput technology, is one of the most active research areas in medical detection and real-time analysis worldwide. Biochip technology makes it possible to develop new, general "one-to-many" pathogen detection techniques that detect multiple pathogens in a single reaction, quickly and accurately; this gains time for handling emergencies and provides a new tool for monitoring systems that respond to biological threats.
In array experiments, specific probe design is one of the key factors determining the detection performance of a gene chip. In chip design, probes are mainly designed from relatively conserved regions of the target genes of the pathogen of interest, so selecting target genes suitable for chip detection is vital for preparing a gene chip. A whole-genome chip is built from the genome of a single pathogen, with probe fragments taken from a cDNA library or from the open reading frames of the pathogen genome. Given the high-throughput nature of gene chips, gene probes for multiple pathogens can also be integrated on the same chip to detect and identify multiple related pathogens. A whole-genome chip uses specific probes to identify pathogens and classify them phylogenetically, enabling rapid detection of known species-specific genes, virulence genes and resistance-related genes; it can also be used to study the genetic diversity of pathogens by detecting variability in the relevant genes, while providing clues for finding related virulence genes.
Summary of the invention
The object of the present invention is to address the technical deficiencies of the prior art by providing a method for obtaining specific isothermal oligonucleotide probes in batches based on whole-genome sequences.
The technical solution adopted to achieve this object is a batch acquisition method for specific isothermal oligonucleotide probes, characterized by comprising the following steps:
(1) building a non-pathogenic bacteria protein database;
(2) bioinformatic prediction of pathogen-specific genes;
A sequence alignment method is used to compare the similarity of the proteins in the databases;
(3) identifying pathogen-specific gene fragment sequences in batches;
Pathogen-specific genes and pathogen orphan genes are obtained by a similarity comparison method based on sample-library subtraction;
(4) obtaining pathogen-specific fingerprint fragment sequences in batches;
(5) obtaining pathogen-specific probes in batches.
In step (1), the non-pathogenic bacteria protein database is built by selecting, from the bacterial strains with completed genome sequencing published by the NCBI Genome Project, all strains that are non-pathogenic to humans, animals and plants, and obtaining the whole-genome protein sequences of these non-pathogenic strains from NCBI GenBank.
In step (2), the bioinformatic prediction of pathogen-specific genes uses the Blast tool to perform a homology comparison of each protein sequence in the pathogenic bacteria protein database against all protein sequences in the non-pathogenic bacteria protein database. Every protein sequence in the pathogenic bacteria protein database that has a homology match in the non-pathogenic database with an e-value less than 1e-7 is removed, and the protein sequences remaining in the pathogenic database are defined as pathogen candidate proteins. An internal homology comparison is then performed among all the pathogen candidate proteins obtained: if a candidate protein has no homology match with an e-value less than 1e-5 to any of the other candidate proteins, it is defined as a pathogen orphan protein (arising from gene substitution); otherwise it is defined as a pathogen-specific protein.
Step (3), identifying pathogen-specific gene fragment sequences in batches, comprises the following steps:
1. obtaining from the NCBI GenBank database the nucleic acid sequences corresponding to the pathogen-specific genes and pathogen orphan genes;
2. obtaining the pathogen-specific gene fragment sequences in batches, in four steps:
First step: remove the biologically interfering sequence fragments present in the gene sequences, that is, remove the low-complexity regions (LCR);
Second step: break the full-length gene sequences into gene fragment sequences: a gene sequence fragmentation algorithm is designed, and the 112,421 pathogenic strain gene sequences are fragmented with fragment length = 29mer, step = 7mer and overlap = 22mer;
Third step: remove from the full set of fragment sequences those containing degenerate nucleotide codes; the degenerate codes comprise: Y=CT, R=AG, W=AT, S=CG, K=GT, M=AC, D=AGT, V=ACG, H=ACT, B=CGT, N=ACGT;
Fourth step: remove the gene fragments that occur in non-pathogenic bacteria gene sequences;
After the fragments containing degenerate codes have been removed, any remaining fragment in which 15 consecutive bases exactly match a sequence of the non-pathogenic bacteria database built in step (1) is removed; the set of fragments finally remaining constitutes the pathogen-specific gene fragment sequences;
(4) obtaining pathogen-specific fingerprint fragment sequences in batches;
Based on the 29mer pathogenic strain gene fragment sequences obtained in step (3), seven items of information are obtained for each pathogen-specific fragment:
Id: number of the pathogen-specific gene fragment;
Seq: sequence of the pathogen-specific gene fragment;
Gi: protein number of the gene on which the pathogen-specific gene fragment is located;
Chromosome/plasmid: number of the chromosome or plasmid carrying the gene on which the pathogen-specific gene fragment is located;
Geneproduct: description of the product of the gene on which the pathogen-specific gene fragment is located;
Strain: name of the bacterial strain in which the pathogen-specific gene fragment occurs;
Position: start position of the pathogen-specific gene fragment on the gene.
The distribution of each pathogen-specific gene fragment across the different pathogenic bacteria and their genes is calculated, the conserved fragments specific to each pathogen genus, species and strain are obtained, and the pathogen-specific fingerprint fragment sequences are built in batches;
(5) obtaining pathogen-specific probes in batches.
From the pathogen-specific fingerprint fragment sequences obtained in step (4), the distribution information of all specific gene fragments across the different pathogenic bacteria and their genes is extracted, comprising:
Gi: protein number of the gene on which the pathogen-specific gene fragment is located;
Position: start position of the pathogen-specific gene fragment on the gene.
In addition, the non-pathogenic bacteria protein database built in step (1) is required. A specific isothermal probe design algorithm based on the pathogen-specific gene fragments is used to design isothermal probes on the different pathogenic bacteria and their genes, and a pathogen-specific probe database is built.
In step (5), the pathogen-specific probes bind tightly to the target sequence, contain no self-dimer or hairpin structure, and cannot form homodimers.
In step (5), the overall homology between a pathogen-specific probe and any non-target sequence is less than 75%;
In step (5), a pathogen-specific probe and a non-target sequence must not share a contiguous common sequence longer than 15nt;
In step (5), the pathogen-specific probes contain no low-complexity sequence; a low-complexity sequence is a run of consecutive identical bases, and no base may be repeated more than 4 times in a row.
In step (5), the pathogen-specific probes have the same melting temperature, and the GC content is 40-70%.
Compared with the prior art, the working principle and beneficial effects of the present invention are as follows. The invention provides a method for obtaining specific isothermal oligonucleotide probes in batches based on whole-genome sequences. A design algorithm is first used to obtain the specific gene fragment sequences of the whole genomes of pathogenic bacteria. A specific isothermal probe design algorithm based on these pathogen-specific gene fragment sequences is then used to design isothermal oligonucleotide probes in batches on the genes of different pathogenic bacteria. In combination with chip scanning, this realizes scanning analysis of the whole genomes of pathogenic bacteria and allows a detection chip to be developed for known bacterial specific genes and virulence genes, thereby achieving rapid screening and confirmation of unknown pathogenic bacteria. Implementation of the method has considerable economic and social significance for the related technical fields.
Detailed description of the embodiments
The present invention is described in further detail below with reference to a specific embodiment. It should be understood that the specific embodiment described herein serves only to explain the present invention and is not intended to limit it.
Embodiment:
A method for obtaining specific isothermal oligonucleotide probes in batches based on whole-genome sequences comprises the following steps:
(1) building a non-pathogenic bacteria protein database;
The non-pathogenic bacteria protein database is built by selecting, from the bacterial strains with completed genome sequencing published by the NCBI Genome Project (http://www.ncbi.nlm.nih.gov/genomes/lproks.cgi), all strains that are non-pathogenic to humans, animals and plants, and obtaining the whole-genome protein sequences of these non-pathogenic strains from NCBI GenBank (ftp://ftp.ncbi.nih.gov/genbank).
Selection criteria for non-pathogenic strains: because of the complexity of bacterial pathogen evolution, opportunistic pathogens are not included in the non-pathogenic bacteria protein database. With the rapid development of sequencing technology, the number of bacterial genomes with completed sequencing keeps increasing. To make it easy to manage and query the strains with completed genome sequencing, the Genomic Standards Consortium (GSC) proposed the minimum information about a genome sequence (MIGS) specification. In MIGS, the pathogenicity of a bacterial strain and the habitat in which it lives are key descriptive data items. Pathogenicity mainly describes whether a strain is pathogenic or non-pathogenic to humans, animals or plants, and habitat describes the place or environment in which the strain lives in nature or under normal growth conditions. At present, the GenBank database and the GOLD (Genomes OnLine Database) database are the two main data resources that carry minimum-information annotation for whole-genome sequences, and the pathogenicity and habitat information for the 1063 non-pathogenic strains included in the non-pathogenic bacteria protein database was obtained mainly from the following tables in these two databases:
1) Three fields in the Prokaryotic Attributes Tables of the GenBank (http://www.ncbi.nlm.nih.gov/Genbank/) database: pathogenicity (Pathogenic in), disease (Disease), and the habitat (Habitat) field under environment (Environment).
2) Three fields in the organism information (Organism information) of the GOLD (Genomes OnLine Database, http://www.genomesonline.org/) database: phenotype (Phenotype), disease (Disease) and habitat (Habitat).
(2) bioinformatic prediction of pathogen-specific genes;
On the basis of an understanding of the evolution of pathogen virulence factors, the present invention uses a "similarity comparison method based on sample-library subtraction" to predict pathogen-specific genes. In this method, "all possible genes that accomplish general life processes" are at present mainly represented by the genes of all non-pathogenic bacteria, which achieves a preliminary prediction and determination of pathogen-specific genes.
In the concrete prediction process, a sequence alignment method (the Blast tool) is mainly used to compare the similarity of the proteins in the databases. Specifically, Blast is used to perform a homology comparison of each protein sequence in the pathogenic bacteria protein database against all protein sequences in the non-pathogenic bacteria protein database. Every protein sequence in the pathogenic bacteria protein database that has a homology match in the non-pathogenic database with an e-value less than 1e-7 is removed, and the protein sequences remaining in the pathogenic database are defined as pathogen candidate proteins. An internal homology comparison is then performed among all the pathogen candidate proteins obtained: if a candidate protein has no homology match with an e-value less than 1e-5 to any of the other candidate proteins, it is defined as a pathogen orphan protein (arising from gene substitution); otherwise it is defined as a pathogen-specific protein.
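The two-stage filtering described above can be sketched in Python as follows. This is only an illustrative sketch: it assumes the Blast results have already been parsed into (query, subject, e-value) tuples, and the function name `classify` and the input layout are hypothetical, not part of the described method.

```python
NONPATHOGEN_E = 1e-7   # e-value cutoff against the non-pathogenic database (from the text)
CANDIDATE_E = 1e-5     # e-value cutoff for the comparison among candidates (from the text)

def classify(pathogen_ids, hits_vs_nonpathogen, hits_among_candidates):
    """Split pathogen proteins into (specific, orphan) sets.

    Both hit arguments are iterables of (query_id, subject_id, e_value)
    tuples; this layout is an assumption made for the sketch.
    """
    # Stage 1: drop every pathogen protein that has a hit in the
    # non-pathogenic database with an e-value below the cutoff.
    removed = {q for q, _s, e in hits_vs_nonpathogen if e < NONPATHOGEN_E}
    candidates = set(pathogen_ids) - removed

    # Stage 2: a candidate that matches any *other* candidate is
    # pathogen-specific; a candidate with no such match is an orphan.
    has_partner = set()
    for q, s, e in hits_among_candidates:
        if q != s and q in candidates and s in candidates and e < CANDIDATE_E:
            has_partner.update((q, s))
    return candidates & has_partner, candidates - has_partner
```

Self-hits are ignored in the second stage, since every protein trivially matches itself.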
(3) identifying pathogen-specific gene fragment sequences in batches;
In step (3), the pathogen-specific gene fragment sequences are identified in batches starting from the pathogen-specific genes and pathogen orphan genes obtained by the similarity comparison method based on sample-library subtraction. Specifically, the step comprises the following:
1. obtaining from the NCBI GenBank database the nucleic acid sequences corresponding to the pathogen-specific genes and pathogen orphan genes;
2. obtaining the pathogen-specific gene fragment sequences in batches, in four steps:
First step: remove the biologically interfering sequence fragments present in the gene sequences, that is, remove the low-complexity regions (LCR);
The DUST module of the WindowMasker software package developed by NCBI is used; the software can be downloaded from ftp://ftp.ncbi.nih.gov/toolbox/ncbi_tools++/CURRENT. The DUST program is mainly used to identify or filter low-complexity sequences in nucleotide sequences. Its advantage is that it is a heuristic algorithm that requires no comparison against any pre-built database, only the gene sequence itself: it computes and scores the repetition frequency of nucleotides within windows of 64 consecutive bases, and thereby identifies low-complexity sequences.
Second step: break the full-length gene sequences into gene fragment sequences: a gene sequence fragmentation algorithm is designed, and the 112,421 pathogenic strain gene sequences are fragmented with fragment length = 29mer, step = 7mer and overlap = 22mer;
Third step: remove from the full set of fragment sequences those containing degenerate nucleotide codes; the degenerate codes comprise: Y=CT, R=AG, W=AT, S=CG, K=GT, M=AC, D=AGT, V=ACG, H=ACT, B=CGT, N=ACGT;
Fourth step: remove the gene fragments that occur in non-pathogenic bacteria gene sequences;
After the fragments containing degenerate codes have been removed, any remaining fragment in which 15 consecutive bases exactly match a sequence of the non-pathogenic bacteria database built in step (1) is removed; the set of fragments finally remaining constitutes the pathogen-specific gene fragment sequences;
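The second to fourth steps above can be illustrated with a short Python sketch. It is written under stated assumptions: the function names are hypothetical, the non-pathogenic sequences are assumed to be available as nucleotide strings, only the forward strand is checked, and low-complexity regions are assumed to have been removed beforehand.

```python
FRAGMENT_LEN = 29   # fragment length from the text
STEP = 7            # step from the text; overlap is 29 - 7 = 22
SEED = 15           # length of the exact match that disqualifies a fragment

def fragment(seq, length=FRAGMENT_LEN, step=STEP):
    """Yield (start, fragment) windows of `length` bases every `step` bases."""
    for start in range(0, len(seq) - length + 1, step):
        yield start, seq[start:start + length]

def is_unambiguous(frag):
    """True if the fragment contains only A, C, G, T (no degenerate codes)."""
    return set(frag) <= set("ACGT")

def kmer_index(sequences, k=SEED):
    """Collect every k-mer of the non-pathogenic sequences into a set."""
    index = set()
    for seq in sequences:
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            index.add(seq[i:i + k])
    return index

def specific_fragments(gene_seq, nonpathogen_kmers, k=SEED):
    """Return the (start, fragment) pairs that survive both filters."""
    kept = []
    for start, frag in fragment(gene_seq.upper()):
        if not is_unambiguous(frag):
            continue  # third step: contains a degenerate code
        if any(frag[i:i + k] in nonpathogen_kmers
               for i in range(len(frag) - k + 1)):
            continue  # fourth step: shares 15 consecutive bases
        kept.append((start, frag))
    return kept
```

Holding all 15-mers of the non-pathogenic sequences in a set makes each lookup constant time, at the cost of memory; for genome-scale data a disk-backed or compressed index would likely be needed.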
(4) obtaining pathogen-specific fingerprint fragment sequences in batches;
Based on the 29mer pathogenic strain gene fragment sequences obtained in step (3), seven items of information are obtained for each pathogen-specific fragment:
Id: number of the pathogen-specific gene fragment (the number has no special meaning; all pathogen-specific gene fragments are simply numbered from 1 to 115,152 in sorted order);
Seq: sequence of the pathogen-specific gene fragment;
Gi: protein number of the gene on which the pathogen-specific gene fragment is located;
Chromosome/plasmid: number of the chromosome or plasmid carrying the gene on which the pathogen-specific gene fragment is located;
Geneproduct: description of the product of the gene on which the pathogen-specific gene fragment is located;
Strain: name of the bacterial strain in which the pathogen-specific gene fragment occurs;
Position: start position of the pathogen-specific gene fragment on the gene.
The distribution of each pathogen-specific gene fragment across the different pathogenic bacteria and their genes is calculated, the conserved fragments specific to each pathogen genus, species and strain are obtained, and the pathogen-specific fingerprint fragment sequences are built in batches;
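One possible way to hold the seven fields and derive taxon-specific conserved fragments is sketched below. The record type `FragmentHit`, the `taxon_of` mapping and the rule "present in every member strain and in no other strain" are assumptions made for illustration; the text does not specify how conservation is computed.

```python
from collections import defaultdict
from dataclasses import dataclass

@dataclass(frozen=True)
class FragmentHit:
    """One occurrence of a specific fragment, with the seven fields of step (4)."""
    id: int
    seq: str
    gi: str
    replicon: str       # the Chromosome/plasmid field
    gene_product: str
    strain: str
    position: int

def fingerprint_fragments(hits, taxon_of):
    """Map each taxon to the fragments found in all of its strains and nowhere else.

    `taxon_of` maps strain name -> taxon name (a genus, a species, or the
    strain itself); it is a hypothetical input for this sketch.
    """
    strains_by_seq = defaultdict(set)
    for h in hits:
        strains_by_seq[h.seq].add(h.strain)

    strains_by_taxon = defaultdict(set)
    for strain, taxon in taxon_of.items():
        strains_by_taxon[taxon].add(strain)

    result = defaultdict(set)
    for seq, strains in strains_by_seq.items():
        for taxon, members in strains_by_taxon.items():
            if strains == members:  # conserved in the taxon, absent outside it
                result[taxon].add(seq)
    return dict(result)
```

Calling the function once with a strain-to-genus mapping and once with a strain-to-species mapping would give genus-level and species-level fingerprints respectively.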
(5) obtaining pathogen-specific probes in batches.
From the pathogen-specific fingerprint fragment sequences obtained in step (4), the distribution information of all specific gene fragments across the different pathogenic bacteria and their genes is extracted, comprising:
Gi: protein number of the gene on which the pathogen-specific gene fragment is located;
Position: start position of the pathogen-specific gene fragment on the gene.
In addition, the non-pathogenic bacteria protein database built in step (1) is required. A specific isothermal probe design algorithm based on the pathogen-specific gene fragments is used to design isothermal probes on the different pathogenic bacteria and their genes, and a pathogen-specific probe database is built.
Sensitivity means that a probe must first of all be able to bind tightly to its target sequence. To give the probes good sensitivity, in step (5) the pathogen-specific probes bind tightly to the target sequence and avoid stable secondary structures, which include self-dimers and hairpin structures; the probes must also have no tendency to form homodimers, because such secondary structures or probe dimers would impair binding of the probe to the target sequence.
In the concrete implementation of the algorithm, secondary structure in the designed probes is avoided mainly by using minimum free energy theory to predict the free energy of the secondary structures a probe may form (selecting ΔG > -5.0 kcal/mol).
The specificity of a probe means that, during hybridization, the designed probe must not bind tightly to non-target sequences. This requires that the designed probe not have too much sequence similarity to the non-target sequences present during hybridization. In the concrete implementation of the algorithm, three criteria are mainly used to judge whether the homology between a probe and a non-target sequence is too high:
1. in step (5), the overall homology between a pathogen-specific probe and any non-target sequence is less than 75%;
2. in step (5), a pathogen-specific probe and a non-target sequence must not share a contiguous common sequence longer than 15nt;
3. in step (5), the pathogen-specific probes contain no low-complexity sequence; a low-complexity sequence is a run of consecutive identical bases, and no base may be repeated more than 4 times in a row, so as to avoid non-specific hybridization.
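The three criteria can be checked with a sketch like the one below. The text does not say how "overall homology" is measured, so this sketch substitutes a simple ungapped identity (the best fraction of matching positions over all placements of the probe along the other sequence); a real pipeline would more likely take identity from an alignment tool. All function names are hypothetical.

```python
def ungapped_identity(probe, other):
    """Best fraction of probe positions matched over all ungapped placements."""
    if not probe or not other:
        return 0.0
    short, long_ = (probe, other) if len(probe) <= len(other) else (other, probe)
    best = max(
        sum(a == b for a, b in zip(short, long_[off:off + len(short)]))
        for off in range(len(long_) - len(short) + 1)
    )
    return best / len(probe)

def longest_common_run(a, b):
    """Length of the longest contiguous substring shared by a and b."""
    best = 0
    prev = [0] * (len(b) + 1)
    for ca in a:
        cur = [0] * (len(b) + 1)
        for j, cb in enumerate(b, 1):
            if ca == cb:
                cur[j] = prev[j - 1] + 1
                best = max(best, cur[j])
        prev = cur
    return best

def longest_homopolymer(seq):
    """Length of the longest run of one repeated base."""
    best = run = 0
    last = None
    for c in seq:
        run = run + 1 if c == last else 1
        last = c
        best = max(best, run)
    return best

def passes_specificity(probe, non_targets,
                       max_identity=0.75, max_shared=15, max_repeat=4):
    """Apply the three criteria from the text to one probe."""
    if longest_homopolymer(probe) > max_repeat:
        return False                      # criterion 3
    for nt in non_targets:
        if ungapped_identity(probe, nt) >= max_identity:
            return False                  # criterion 1: must be below 75%
        if longest_common_run(probe, nt) > max_shared:
            return False                  # criterion 2: no shared run over 15 nt
    return True
```

The quadratic common-substring search is acceptable for a handful of sequences; screening against whole genomes would call for an indexed search instead.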
Consistency among probes mainly means that all probes in the designed probe set have a consistent melting temperature (Tm), while maintaining a reasonable GC content.
Preferably, in step (5), the pathogen-specific probes have the same melting temperature (Tm), and the GC content is 40-70%.
The present invention uses an isothermal probe design algorithm. The hybridization melting temperatures of the oligonucleotide probes designed by this method are approximately the same, which effectively reduces base mismatches during hybridization and improves the reliability of the detection results. There are currently three main ways to calculate the melting temperature (Tm): the basic Tm calculation, the salt-concentration-adjusted calculation, and calculation based on the nearest-neighbor thermodynamic formula. The nearest-neighbor thermodynamic formula (nearest-neighbor thermodynamic theory) is mainly used here; it makes full use of thermodynamic parameters, and the melting temperatures it yields are closer to those in the actual hybridization process.
The above is only a preferred embodiment of the present invention. It should be noted that those of ordinary skill in the art may make further improvements and modifications without departing from the principles of the present invention, and such improvements and modifications shall also be regarded as falling within the scope of protection of the present invention.

Claims (6)

1. the batch acquisition methods of a species specificity isothermal oligonucleotide probe, is characterized in that comprising the following steps:
(1) non-pathogenic bacteria Protein Data Bank is built;
Building non-pathogenic bacteria Protein Data Bank is choose all non-pathogenic non-pathogenic bacteria bacterial strain of people, animal and plant from completing the bacterial isolates of genome sequencing of NCBI genome project announcement, and builds non-pathogenic bacteria Protein Data Bank from the full-length genome protein sequence of NCBIGenBank acquisition non-pathogenic bacteria bacterial strain;
(2) Bioinformatics Prediction of the peculiar gene of pathogenic bacteria;
The Bioinformatics Prediction of the peculiar gene of pathogenic bacteria utilizes Blast instrument to carry out homology comparison the protein sequence in each protein sequence in pathogenic bacterium albumen database and whole non-pathogenic bacteria albumen database, remove in pathogenic bacterium albumen database all proteins sequence that there is homology comparison e value and be less than 1e-7 in non-pathogenic bacteria storehouse, protein sequence in remaining pathogenic bacterium storehouse is defined as pathogenic bacterium candidate protein matter; Inner homology comparison is carried out in all pathogenic bacterium candidate protein matter obtained, if there is not e value when homology comparison and be less than 1e-5 in any one pathogenic bacterium candidate protein matter and all the other all candidate protein matter, then this candidate pathogenic bacterium candidate protein matter is defined as pathogenic bacterium orphan proteins (being produced by gene substitution), otherwise is defined as the peculiar albumen of pathogenic bacterium;
(3) identifying pathogen-specific gene fragment sequences in batches;
A similarity comparison method based on progressively reducing the size of the sample library is used to obtain the pathogen-specific genes and pathogenic bacteria orphan genes, comprising the following steps:
1. obtaining, from the NCBI GenBank database, the nucleic acid sequences corresponding to the pathogen-specific genes and pathogenic bacteria orphan genes;
2. obtaining the pathogen-specific gene fragment sequences in batches, in four steps:
First step: remove the sequence fragments in the gene sequences that act as interference in a biological sense, namely the low-complexity regions (LCR);
Second step: split the full-length gene sequences into gene fragment sequences: a gene sequence splitting algorithm is designed, and the 112,421 pathogenic strain gene sequences are split with fragment length = 29 mer, step = 7 mer and overlap = 22 mer;
Third step: remove, from all the fragment sequences produced by the split, those containing degenerate bases; the degenerate base codes comprise: Y=CT, R=AG, W=AT, S=CG, K=GT, M=AC, D=AGT, V=ACG, H=ACT, B=CGT, N=ACGT;
Fourth step: remove the gene fragments that occur in non-pathogenic bacteria gene sequences;
After the fragment sequences containing degenerate bases have been removed, for the remaining fragment sequences, if any fragment has 15 consecutive bases that exactly match a sequence in the non-pathogenic bacteria protein database built in step (1), that fragment is removed; the set of fragments finally remaining constitutes the pathogen-specific gene fragment sequences;
(4) obtaining pathogen-specific fingerprint fragment sequences in batches;
Based on the 29-mer pathogenic strain gene fragment sequences obtained in step (3), seven items of information are obtained for each pathogen-specific fragment:
Id: number of the pathogen-specific gene fragment;
Seq: sequence of the pathogen-specific gene fragment;
Gi: protein number of the gene on which the pathogen-specific gene fragment is located;
Chromosome/plasmid: number of the chromosome or plasmid to which the gene carrying the pathogen-specific gene fragment belongs;
Geneproduct: description of the product of the gene on which the pathogen-specific gene fragment is located;
Strain: name of the bacterial strain in which the pathogen-specific gene fragment occurs;
Position: start position of the pathogen-specific gene fragment on the gene;
The distribution of each pathogen-specific gene fragment across the different pathogenic bacteria and their genes is calculated, conserved fragments specific to different pathogenic genera, species and strains are obtained, and the batch of pathogen-specific fingerprint fragment sequences is built;
(5) obtaining pathogen-specific probes in batches;
The distribution information of all specific gene fragments across the different pathogenic bacteria and their genes is extracted from the batch of pathogen-specific fingerprint fragment sequences obtained in step (4), comprising:
Gi: protein number of the gene on which the pathogen-specific gene fragment is located;
Position: start position of the pathogen-specific gene fragment on the gene;
A specific isothermal probe design algorithm based on the pathogen-specific gene fragments is used to design isothermal probes on the different pathogenic bacteria and their genes, thereby building a pathogen-specific probe database.
2. The batch acquisition method for specific isothermal oligonucleotide probes according to claim 1, characterized in that the pathogen-specific probe in step (5) binds tightly to the target sequence, does not contain self-dimer or hairpin structures, and cannot form homodimers.
3. The batch acquisition method for specific isothermal oligonucleotide probes according to claim 1, characterized in that the overall homology between the pathogen-specific probe in step (5) and non-target sequences is less than 75%.
4. The batch acquisition method for specific isothermal oligonucleotide probes according to claim 1, characterized in that the pathogen-specific probe in step (5) shares no contiguous common sequence longer than 15 nt with non-target sequences.
5. The batch acquisition method for specific isothermal oligonucleotide probes according to claim 1, characterized in that the pathogen-specific probe in step (5) contains no low-complexity sequence; the low-complexity sequence is a run of consecutive identical bases, and any single base is repeated no more than 4 times in succession.
6. The batch acquisition method for specific isothermal oligonucleotide probes according to claim 1, characterized in that the pathogen-specific probes in step (5) have the same melting temperature and a GC content of 40-70%.
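For illustration only, the two-stage e-value filter described in step (2) of claim 1 might be sketched as follows. The function name `classify_proteins` and the input layout (pre-parsed BLAST hits as plain tuples) are assumptions made for this sketch and do not come from the patent; only the thresholds 1e-7 and 1e-5 are taken from the claim. Self-hits in the internal comparison are ignored, which is also an assumption.

```python
# Thresholds taken from step (2) of claim 1.
NONPATHOGEN_EVALUE = 1e-7   # hits against the non-pathogenic database
INTERNAL_EVALUE = 1e-5      # hits among the candidate proteins themselves


def classify_proteins(pathogen_ids, nonpathogen_hits, internal_hits):
    """Split pathogen proteins into orphan and pathogen-specific sets.

    pathogen_ids     -- protein identifiers in the pathogen database
    nonpathogen_hits -- (query_id, e_value) pairs, hits in the non-pathogen database
    internal_hits    -- (query_id, subject_id, e_value) triples from the
                        all-against-all comparison (hypothetical pre-parsed input)
    """
    # Remove every protein with a significant hit in the non-pathogen database.
    homologous = {q for q, e in nonpathogen_hits if e < NONPATHOGEN_EVALUE}
    candidates = set(pathogen_ids) - homologous

    # Record candidates that have a significant hit to a different candidate.
    has_partner = set()
    for query, subject, e_value in internal_hits:
        if query == subject:
            continue  # assumption: a protein matching itself does not count
        if query in candidates and subject in candidates and e_value < INTERNAL_EVALUE:
            has_partner.update((query, subject))

    orphans = candidates - has_partner      # no significant internal match
    specific = candidates & has_partner     # at least one internal match
    return orphans, specific
```

In this sketch a protein is dropped as soon as one non-pathogen hit falls below 1e-7, so weaker hits never rescue or remove a candidate.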
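The second to fourth sub-steps of step (3) in claim 1 (splitting into 29-mers with a 7-base step, discarding fragments with degenerate bases, and discarding fragments that share 15 consecutive bases with non-pathogen sequences) can be sketched roughly as below. All function names are hypothetical. The sketch assumes the non-pathogen reference is available as nucleotide strings, checks the forward strand only, and omits the low-complexity-region removal of the first sub-step.

```python
FRAGMENT_LENGTH = 29   # fragment length from claim 1
STEP = 7               # step from claim 1; overlap = 29 - 7 = 22
SEED_LENGTH = 15       # exact-match length that disqualifies a fragment
UNAMBIGUOUS = set("ACGT")


def split_gene(sequence, length=FRAGMENT_LENGTH, step=STEP):
    """Yield (start, fragment) windows over one gene sequence."""
    sequence = sequence.upper()
    for start in range(0, len(sequence) - length + 1, step):
        yield start, sequence[start:start + length]


def is_unambiguous(fragment):
    """True if the fragment holds only A, C, G, T (no Y, R, W, S, K, M, D, V, H, B, N)."""
    return set(fragment) <= UNAMBIGUOUS


def kmer_index(sequences, k=SEED_LENGTH):
    """Collect every k-mer of the non-pathogen sequences into a set."""
    index = set()
    for seq in sequences:
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            index.add(seq[i:i + k])
    return index


def specific_fragments(gene, nonpathogen_kmers, k=SEED_LENGTH):
    """Return the (start, fragment) pairs of one gene that survive both filters."""
    kept = []
    for start, fragment in split_gene(gene):
        if not is_unambiguous(fragment):
            continue  # third sub-step: degenerate bases present
        if any(fragment[i:i + k] in nonpathogen_kmers
               for i in range(len(fragment) - k + 1)):
            continue  # fourth sub-step: 15 consecutive bases match a non-pathogen
        kept.append((start, fragment))
    return kept
```

Holding the non-pathogen 15-mers in a set makes each lookup constant-time on average, at the cost of memory; for a large reference an on-disk index or an alignment tool would probably be needed instead.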
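Some of the probe constraints in claims 4 to 6 lend themselves to a simple sequence-level check. The following is a minimal sketch, assuming probes and non-target sequences are plain strings; the function names are hypothetical. It covers only the GC content range (40-70%), the limit of 4 consecutive identical bases, and the limit of 15 nt on contiguous sequence shared with non-targets. The dimer and hairpin condition of claim 2, the 75% overall homology limit of claim 3 and the equal melting temperature of claim 6 are not checked here, since they would require alignment or thermodynamic tools.

```python
import re


def gc_fraction(probe):
    """Fraction of G and C bases in a non-empty probe."""
    probe = probe.upper()
    return (probe.count("G") + probe.count("C")) / len(probe)


def longest_homopolymer(probe):
    """Length of the longest run of one repeated character."""
    return max((len(m.group(0)) for m in re.finditer(r"(.)\1*", probe.upper())),
               default=0)


def longest_shared_run(probe, other):
    """Length of the longest substring common to both sequences (dynamic programming)."""
    best = 0
    prev = [0] * (len(other) + 1)
    for a in probe:
        curr = [0] * (len(other) + 1)
        for j, b in enumerate(other, start=1):
            if a == b:
                curr[j] = prev[j - 1] + 1
                best = max(best, curr[j])
        prev = curr
    return best


def passes_basic_filters(probe, non_targets, gc_range=(0.40, 0.70),
                         max_run=4, max_shared=15):
    """Apply the GC, homopolymer and shared-run limits from claims 4 to 6."""
    probe = probe.upper()
    if not probe:
        return False
    if not gc_range[0] <= gc_fraction(probe) <= gc_range[1]:
        return False
    if longest_homopolymer(probe) > max_run:
        return False
    return all(longest_shared_run(probe, t.upper()) <= max_shared
               for t in non_targets)
```

The shared-run check is quadratic in sequence length for each probe and non-target pair, which is adequate for a sketch but would likely be replaced by an indexed search when screening probes in batches.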
CN201510849278.0A 2015-11-30 2015-11-30 Batch acquiring method for specific isothermal oligonucleotide probes Pending CN105274092A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510849278.0A CN105274092A (en) 2015-11-30 2015-11-30 Batch acquiring method for specific isothermal oligonucleotide probes

Publications (1)

Publication Number Publication Date
CN105274092A true CN105274092A (en) 2016-01-27

Family

ID=55143926

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510849278.0A Pending CN105274092A (en) 2015-11-30 2015-11-30 Batch acquiring method for specific isothermal oligonucleotide probes

Country Status (1)

Country Link
CN (1) CN105274092A (en)

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107292125A (en) * 2016-04-01 2017-10-24 深圳华大基因科技有限公司 The method and system of design object regiospecificity liquid phase probe
CN107292125B (en) * 2016-04-01 2021-03-05 深圳华大基因科技有限公司 Method and system for designing target area specific liquid phase probe
CN107653299A (en) * 2016-07-23 2018-02-02 成都十洲科技有限公司 A kind of acquisition methods of the gene chip probes sequence based on high-flux sequence
CN109266768A (en) * 2018-11-19 2019-01-25 南京工业大学 Screening method of nucleotide fragments for identifying closely related microorganisms
CN110534157A (en) * 2019-07-26 2019-12-03 江苏省农业科学院 A kind of batch extracting genomic gene information simultaneously translates the method for comparing analytical sequence
CN110534157B (en) * 2019-07-26 2023-07-25 江苏省农业科学院 Method for extracting genome gene information in batches and translating and comparing analysis sequences
CN111477274A (en) * 2020-04-02 2020-07-31 上海之江生物科技股份有限公司 Method and device for identifying specific region in microbial target fragment and application
CN111477274B (en) * 2020-04-02 2020-11-24 上海之江生物科技股份有限公司 Method and device for identifying specific region in microbial target fragment and application
EP4116983A4 (en) * 2020-04-02 2023-12-13 Shanghai Zj Bio-tech Co., Ltd. Method and device for identifying specific region in microorganism target fragment and use thereof
WO2024109093A1 (en) * 2022-11-22 2024-05-30 壹健生物科技(苏州)有限公司 Probe for detecting methane metabolic gene in sample, chip, kit, and method

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20160127
