Background Art
Infectious diseases are a class of diseases, caused by various pathogens, that can spread from person to person, from animal to animal, or between humans and animals. Each infectious disease has its own specific pathogen; these include viruses, bacteria, fungi, protozoa, spirochetes, and rickettsiae. Compared with other kinds of disease, infectious diseases are characterized by high incidence intensity, rapid spread, wide reach, and strong regional and seasonal patterns. The harm they cause is severe: mortality is high, and they readily cause public panic. The resulting secondary harm is often greater still, directly affecting the economic activity of society and the normal order of people's lives.
Although infectious diseases are in theory divided into those specific to humans, those specific to animals, and those shared between humans and animals, many communicable diseases, including epidemic diseases, originate from diseases common to humans and animals. It is not easy to determine which diseases gradually evolved from infecting animals to infecting humans, but evidence indicates that measles, smallpox, influenza, and diphtheria all did so. Acquired immune deficiency syndrome (AIDS), flu, and tuberculosis also came from species other than humans. Zoonoses attract great international attention because they are usually diseases that were previously undiscovered, or whose virulence has increased in the course of evolution, or that have been accidentally introduced into a population or species lacking immunity to them. Systematic monitoring of pathogens that have vertebrate hosts is therefore a necessary part of effective infectious disease prevention and control.
In the course of infectious disease outbreaks, humankind has moved from passively enduring them, to controlling and treating them, and then to preventing and controlling them in advance, accumulating extensive technical experience in fighting infectious disease. In particular, with the development of modern medicine and molecular biology techniques, a variety of specific infectious disease detection methods have been established: 1) microbial culture; 2) serological marker detection; 3) detection of viral and pathogenesis-related proteins in blood or secretions, including methods such as ELISA and colloidal gold; 4) specific fluorescence quantitative PCR detection based on the nucleotide sequence of the infectious agent; 5) rapidly developing high-throughput biochip microarray methods.
Because it is intuitive and convenient, microbial culture remains the main diagnostic tool for infectious diseases, but some infectious agents, such as viruses and Leptospira, cannot be cultured artificially and must be diagnosed by other means. Serological marker detection works by detecting the specific antibodies produced after a pathogen infects the body. Current antibody tests can confirm a diagnosis only 2-4 weeks after infection; because of this "serological window period", the method also needs to be cross-checked against microbial culture. Detection of pathogen proteins in blood or secretions, for example by ELISA and colloidal gold methods, is an improvement on serological testing but has the same drawback. The recently emerged fluorescence quantitative PCR methods targeting pathogen RNA or DNA offer high sensitivity and high accuracy and can effectively shorten the "pathogen window period". However, this approach can only use PCR primers and probes designed specifically for known pathogens, cannot achieve high-throughput testing, and cannot meet the need for rapid, accurate, and sensitive diagnosis of new and sudden-onset infectious diseases. This is one of the main technical bottlenecks in the prevention, control, and timely treatment of major communicable diseases.
The biochip method is an infectious disease pathogen detection method established by taking into account the limitations of traditional and existing detection methods and combining the advantages of modern high-throughput molecular biology techniques. Its main technical advantages are as follows. 1) High throughput. One array on a chip can analyze a single sample for thousands of pathogens simultaneously, and one chip can analyze dozens of clinical samples simultaneously. 2) Speed, accuracy, and sensitivity. A single test can be completed within 1 day and, together with its high-throughput specificity, the detection performance is clearly better than that of other existing methods. Because a fully enclosed automated fluorescence detection system and specific probe sets are used in the detection process, detection accuracy is high and sensitivity is good. 3) Ability to detect unknown pathogens. Previous pathogen detection methods can only confirm known pathogens and are helpless against unknown ones. Fluorescence quantitative PCR, for example, has many technical advantages, but it requires that the nucleic acid sequence of the tested pathogen be known; otherwise detection is impossible. In a biochip detection system, by contrast, the probe design itself is tolerant, so mutations in the detected sequence will not affect hybridization detection. In fact, most new pathogen variants are mutants of known pathogens that arose under drug and environmental pressure, and their sequences are highly homologous.
Owing to the technical merits of biochip detection and its potential value in clinical practice, many researchers in China and abroad have focused on biochip detection technology in the field of infectious diseases. Examples include the Virochip, developed by the DeRisi laboratory at the University of California, San Francisco, which can detect multiple viruses, and the GreeneChip, developed by the Lipkin laboratory at Columbia University, which can simultaneously detect multiple viruses, bacteria, fungi, and parasites.
The goal of biochip probe design is for the probes, after computational optimization, to detect as many biomolecules as possible while maintaining high detection reliability. Balancing coverage and accuracy in this way is vital for high-throughput pathogen detection. The common approach is first to query international public databases such as EMBL and GenBank to obtain the corresponding DNA sequence data as reference target sequences for probe design, and then to select highly specific nucleotide fragments from them for designing probes. Specificity refers to the differences that exist between target and non-target species, and is the core basis on which a detection biochip discriminates between species. Selecting specific probes is the key step in probe design, and research on probe design optimization algorithms has become a pressing problem in information processing for detection-type gene chips. For discriminating among a small number of species, selection relies mainly on manual analysis of sequence alignment results. However, as the number of species to be detected on a single chip grows rapidly, there are more and more sequences to analyze, and probe design must also take many other complicating factors into account. Manual design is then not only time-consuming and laborious, but its quality is also difficult to guarantee, so computational methods are widely used in probe design. Waibhav proposed a probe design workflow starting from whole pathogen genome sequences. Satya improved on this workflow, which not only effectively reduced computation time but also applied multiple criteria for measuring probe specificity to evaluate probe quality theoretically. Jabado et al. carried out probe design for a virus detection chip. They argued that, for sequence conservation analysis, protein-protein comparison has advantages over comparison between nucleotide sequences, and so proposed a probe design workflow starting from viral protein sequences. To meet the requirement of high probe coverage, they also added some probes designed using non-coding regions as templates.
In summary, current biochip probe design methods are increasingly scientific and rational, and the probes they produce have reasonably good coverage and accuracy and can meet the needs of high-throughput detection. However, these design methods still have two main shortcomings. 1) Computation is time-consuming and design efficiency is low. Taking the TOFI-beta workflow of Satya et al. as an example, designing detection probes for a single species, Brucella melitensis, takes 21 hours on 74 CPUs. 2) Owing to limited sequence resources, many probe design workflows can only obtain qualifying detection probes at the genus level, and finer-grained detection is difficult. As sequence resources continue to grow, detecting pathogens at the species or subspecies level will become possible, yet existing design workflows all lack a dynamic data management and update system and cannot stay synchronized with rapidly growing sequence databases.
Embodiment
The method is further described below with reference to specific examples and the accompanying drawings.
One, design workflow for bacteria, fungi, and protozoa using rRNA as the template
We take the bacterium Brevibacterium epidermidis as the target species and use it as an example to introduce the probe design workflow shown in Figure 1. First, the 16S rRNA sequences of the target species are obtained from the Ribosomal Database Project (RDP) database. Using these 16S rRNA sequences, sequence alignment is performed to retrieve additional 16S rRNA sequences of the species from GenBank; at the same time, the species annotation of each sequence is checked and corrected to ensure that it is a 16S rRNA sequence of the target species. Multiple sequence alignment is then performed on the 16S rRNA sequences of the target species to extract the regions conserved within the species. Figure 3 shows one such conserved sequence fragment.
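The extraction of intra-species conserved regions from a multiple sequence alignment can be sketched as follows. This is only an illustration: the function name `conserved_regions`, the `min_len` and `min_identity` thresholds, and the use of per-column identity as the conservation measure are assumptions, not details given in the text. The input is a list of already aligned, equal-length sequences with `-` as the gap character.

```python
from collections import Counter


def conserved_regions(aligned, min_len=20, min_identity=1.0):
    """Return half-open (start, end) column ranges that are conserved.

    A column counts as conserved when the most common non-gap base
    occurs in at least `min_identity` of all sequences (hypothetical rule).
    """
    if not aligned:
        return []
    width = len(aligned[0])
    if any(len(s) != width for s in aligned):
        raise ValueError("aligned sequences must have equal length")

    regions, start = [], None
    for col in range(width):
        counts = Counter(s[col].upper() for s in aligned)
        counts.pop("-", None)  # gaps never count as conserved
        best = max(counts.values(), default=0)
        ok = best / len(aligned) >= min_identity
        if ok and start is None:
            start = col  # open a new conserved run
        elif not ok and start is not None:
            if col - start >= min_len:
                regions.append((start, col))
            start = None
    # Close a run that extends to the end of the alignment.
    if start is not None and width - start >= min_len:
        regions.append((start, width))
    return regions


if __name__ == "__main__":
    toy = ["ACGTACGT", "ACGTTCGT", "ACGT-CGT"]
    print(conserved_regions(toy, min_len=3))
```

In practice the alignment itself would come from a dedicated multiple-alignment tool; the sketch only covers the step of reading conserved column runs out of its result.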
Through phylogenetic analysis, a phylogenetic tree is built from representative sequences of all bacterial species within the scope of the study, and the nearest-neighbor species of the target species can be found from the branches of the tree. As shown in Figure 4, the nearest-neighbor species of Brevibacterium epidermidis is Kineosporia aurantiaca. The sequences of the two species are aligned to obtain the inter-species conserved regions. Removing these inter-species conserved regions from the intra-species conserved regions of Brevibacterium epidermidis yields the regions specific to the target species, which serve as candidate sequences for the next step of probe design.
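Removing the inter-species conserved regions from the intra-species conserved regions amounts to an interval subtraction. A minimal sketch is given below, assuming both sets of regions are expressed as half-open (start, end) positions on the same target sequence; the function name `subtract_intervals` and the optional `min_len` filter are hypothetical.

```python
def subtract_intervals(intra, inter, min_len=0):
    """Remove every `inter` range from every `intra` range.

    All ranges are half-open (start, end) pairs in the same coordinate
    system. Remaining pieces shorter than `min_len` are discarded.
    """
    inter = sorted(inter)
    result = []
    for start, end in sorted(intra):
        cursor = start
        for r_start, r_end in inter:
            if r_end <= cursor:
                continue  # removal range lies entirely before the cursor
            if r_start >= end:
                break  # removal range lies entirely after this region
            if r_start > cursor:
                result.append((cursor, r_start))
            cursor = max(cursor, r_end)
            if cursor >= end:
                break
        if cursor < end:
            result.append((cursor, end))
    return [(s, e) for s, e in result if e - s >= min_len]


if __name__ == "__main__":
    intra = [(0, 100), (150, 300)]
    inter = [(40, 60), (90, 160), (290, 400)]
    print(subtract_intervals(intra, inter))
    # Keep only pieces long enough to hold a 60-mer probe.
    print(subtract_intervals(intra, inter, min_len=60))
```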
Candidate probes satisfying the following experimental conditions are extracted from the candidate sequences: a probe length of 60 mer, theoretical melting temperatures of all probes varying within 2 degrees, GC content in the range of 30%-70%, and so on. A non-target species sequence database is built by combining vertebrate sequences with the sequences of the relevant pathogens, and the candidate probes are checked for homology against it using Blastn. The specificity criteria we set are that any contiguous stretch of a candidate probe complementary to a non-target species gene must be shorter than 15 bp, and the total complementary length must be less than 75% of the probe length. This screening eliminates candidates that might cross-hybridize with non-target species sequences, yielding highly specific probes.
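The candidate filtering and specificity criteria above can be sketched as follows. Several points are assumptions made for illustration only: the text does not say which melting temperature formula is used, so a common salt-adjusted approximation for long oligonucleotides is swapped in; the 2-degree spread is implemented as a window centred on the median Tm; and `is_specific` takes one pair of gapped alignment strings, such as those reported for a Blastn hit, rather than running Blastn itself. All function and parameter names are hypothetical.

```python
import math
from statistics import median


def gc_fraction(seq):
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def melting_temp(seq, na_molar=0.05):
    # Assumed approximation for long oligos (not specified in the text):
    # Tm = 81.5 + 16.6*log10([Na+]) + 0.41*(%GC) - 675/N
    return (81.5 + 16.6 * math.log10(na_molar)
            + 41.0 * gc_fraction(seq) - 675.0 / len(seq))


def candidate_probes(region, length=60, gc_min=0.30, gc_max=0.70,
                     tm_window=2.0, tm_target=None):
    """Slide a window over `region` and keep probes meeting all conditions."""
    windows = [region[i:i + length] for i in range(len(region) - length + 1)]
    windows = [w for w in windows
               if set(w.upper()) <= set("ACGT")
               and gc_min <= gc_fraction(w) <= gc_max]
    if not windows:
        return []
    if tm_target is None:
        tm_target = median(melting_temp(w) for w in windows)
    half = tm_window / 2.0  # total Tm spread stays within tm_window
    return [w for w in windows if abs(melting_temp(w) - tm_target) <= half]


def is_specific(probe_len, probe_aln, hit_aln, max_run=15, max_fraction=0.75):
    """Apply the two criteria to one alignment against a non-target hit."""
    matches = longest = run = 0
    for a, b in zip(probe_aln.upper(), hit_aln.upper()):
        if a == b and a != "-":
            matches += 1
            run += 1
            longest = max(longest, run)
        else:
            run = 0
    # Contiguous stretch < 15 bp AND total matched length < 75% of the probe.
    return longest < max_run and matches / probe_len < max_fraction
```

A probe would be kept only if `is_specific` returns `True` for every hit reported against the non-target database.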
Two, design workflow for viruses using protein-coding sequences as the template
The probe design workflow shown in Figure 2 targets viruses and uses protein-coding sequences as the template. First, the standard virus sequence files are downloaded from the European Molecular Biology Laboratory (EMBL) database. The sequences belonging to the target virus are extracted and organized and, based on the information provided in the sequence files, the nucleotide sequences encoding structural proteins and the encoded protein sequences are further extracted. These protein sequences are compared with the seed sequences in the Pfam protein family database to obtain conserved sequence regions, and the corresponding nucleic acid coding regions serve as candidate sequences for the next design step. For sequences whose conserved regions cannot be obtained by comparison with the Pfam database, their nucleic acid coding regions are directly aligned and clustered to obtain conserved regions, which serve as another source of candidate sequences. The subsequent steps, in which probes are designed from the candidate sequences, are the same as in the workflow that uses rRNA as the template.
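Once a conserved region has been located on a protein sequence, the corresponding nucleic acid coding region follows from the three-nucleotides-per-residue relationship. The sketch below assumes a single contiguous forward-strand coding sequence with no frameshifts, and 0-based half-open coordinates; the function names `protein_to_cds_range` and `candidate_from_domain` are hypothetical and not taken from the text.

```python
def protein_to_cds_range(aa_start, aa_end, cds_start=0):
    """Map a half-open residue range to a half-open nucleotide range.

    `cds_start` is the 0-based position of the first coding nucleotide
    in whatever sequence the result should index into.
    """
    if aa_start < 0 or aa_end < aa_start:
        raise ValueError("invalid residue range")
    return cds_start + 3 * aa_start, cds_start + 3 * aa_end


def candidate_from_domain(cds, aa_start, aa_end):
    """Cut the coding nucleotides for a conserved protein region out of `cds`."""
    nt_start, nt_end = protein_to_cds_range(aa_start, aa_end)
    if nt_end > len(cds):
        raise ValueError("residue range extends past the coding sequence")
    return cds[nt_start:nt_end]


if __name__ == "__main__":
    # Codons: ATG GCC AAA GGG TTT TAA; residues 1..2 map to GCC AAA.
    print(candidate_from_domain("ATGGCCAAAGGGTTTTAA", 1, 3))
```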