CN112309499A - Method and device for quickly annotating bacterial pdif - Google Patents

Publication number
CN112309499A
CN112309499A CN202011239023.XA CN202011239023A CN112309499A CN 112309499 A CN112309499 A CN 112309499A CN 202011239023 A CN202011239023 A CN 202011239023A CN 112309499 A CN112309499 A CN 112309499A
Authority
CN
China
Prior art keywords
pdif
bacterial
sequence
database
pdifdb
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011239023.XA
Other languages
Chinese (zh)
Inventor
华孝挺
俞云松
刘海洋
娄永锋
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hangzhou Weishu Biotechnology Co ltd
Zhejiang University ZJU
Original Assignee
Hangzhou Weishu Biotechnology Co ltd
Zhejiang University ZJU
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2020-11-09
Filing date
2020-11-09
Publication date
2021-02-02
Application filed by Hangzhou Weishu Biotechnology Co ltd, Zhejiang University ZJU
Priority to CN202011239023.XA
Publication of CN112309499A

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B25/00 ICT specially adapted for hybridisation; ICT specially adapted for gene or protein expression
    • G16B25/10 Gene or protein expression profiling; Expression-ratio estimation or normalisation

Abstract

The invention discloses a method and a device for rapidly annotating bacterial pdif sites. The method comprises: constructing a pdifDB database; obtaining a bacterial DNA sequence to be analyzed through whole genome sequencing; and aligning the bacterial DNA sequence against the pdif sequences in the pdifDB database by BLASTN to obtain the pdif sequence annotation results for the bacterial DNA sequence. The method is of great significance for rapidly finding pdif sequences in bacterial DNA sequences and for determining the pdif-mediated mechanism of drug-resistance gene transfer in bacteria.

Description

Method and device for quickly annotating bacterial pdif
Technical Field
The invention relates to the field of bacterial DNA sequence annotation, and in particular to a method and a device for rapidly annotating bacterial pdif sequences, which can be used to determine whether pdif sites are present in a bacterial DNA sequence and where they are located.
Background
In recent years, Whole Genome Sequencing (WGS) has developed rapidly. Furthermore, the falling cost of sequencing has made rapid bacterial WGS in the laboratory increasingly feasible. The biggest advantage of WGS is that, once the sequencing results are obtained and assembled, the drug resistance genes, mobile elements, etc. of bacteria can be quickly annotated and predicted with website-based and command-line-based tools, greatly accelerating research on the molecular characterization and epidemiological monitoring of bacteria.
Bacterial drug resistance is a major public health problem worldwide. The wide prevalence of drug-resistant bacteria presents a significant challenge to the treatment of infectious diseases. Resistance can be transmitted between bacteria of the same species or of different species by mobile elements, so understanding the transfer of resistance-associated mobile elements is crucial to explaining how resistance spreads between bacteria. At present, the mobile elements mainly reported are insertion sequences (IS), integrons (In), transposons (Tn), and the like. In recent years, the XerCD-dif site-specific recombination system has come to be regarded as another route for mediating drug-resistance gene transfer. It is found mostly in Acinetobacter baumannii and is gradually becoming a research hotspot, but its specific mechanism is still unknown. Many bacteria encode two homologous recombinases that usually occur as a pair, called XerC and XerD. XerC and XerD belong to the tyrosine recombinase family and catalyze the sequential cleavage and exchange of DNA strands at a defined site, dif, located in the terminus region of the chromosome. Generally, a dif site is 28 bp long and consists of two 11 bp Xer-binding regions arranged as inverted repeats at the two ends, separated by a 6 bp central spacer. One monomer each of XerC and XerD binds to one 11 bp half-site: XerC binds the left half-site and XerD binds the right half-site. A dif site located on a plasmid is called pdif, and a single plasmid may carry multiple pdif sites, as many as 16. Several important drug-resistance genes have been found between pdif sites, for example blaOXA-24, blaOXA-58, blaOXA-72, and tet39. The pdif sites can mediate the transfer of drug-resistance genes within a plasmid and between different plasmids, and play a particularly crucial role in the spread of drug resistance in Acinetobacter baumannii.
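The 28 bp layout described above (11 bp XerC arm, 6 bp spacer, 11 bp XerD arm) can be illustrated with a minimal Python sketch. The function names and the example sequence are hypothetical and are not taken from the pdifDB database; the sketch only splits a site into its three parts and counts how far the two arms deviate from a perfect inverted repeat.

```python
# Illustrative only: helper names and the example site are made up.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def split_dif_site(site: str) -> dict:
    """Split a 28 bp dif-like site into XerC arm, spacer and XerD arm."""
    site = site.upper()
    if len(site) != 28:
        raise ValueError("expected a 28 bp site (11 + 6 + 11)")
    return {
        "xerc_arm": site[:11],    # left 11 bp, bound by XerC
        "spacer": site[11:17],    # central 6 bp
        "xerd_arm": site[17:],    # right 11 bp, bound by XerD
    }


def arm_mismatches(site: str) -> int:
    """Count mismatches between the left arm and the reverse
    complement of the right arm (0 means a perfect inverted repeat)."""
    parts = split_dif_site(site)
    mirrored = reverse_complement(parts["xerd_arm"])
    return sum(a != b for a, b in zip(parts["xerc_arm"], mirrored))


# A made-up, perfectly palindromic placeholder site.
example_site = "ATTTCGTATAA" + "GGTGTA" + "TTATACGAAAT"
```

Counting mismatches rather than requiring an exact match is a design choice of this sketch, since the text does not state that the two arms must be perfect repeats.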
However, there is currently no annotation method for identifying these important pdif sites. Therefore, identifying pdif sites and further studying the mechanism by which they co-transfer the drug-resistance genes they carry is of great significance for controlling the spread of clinically drug-resistant bacteria.
Disclosure of Invention
The invention aims to provide a method and a device for rapidly annotating bacterial pdif sequences, so as to solve the problem that no annotation method for pdif site sequences currently exists.
In order to achieve the purpose, the invention adopts the following technical scheme:
in a first aspect, the present invention provides a method for rapidly annotating a bacterial pdif, comprising:
constructing a pdifDB database;
obtaining a bacterial DNA sequence to be analyzed through whole genome sequencing;
aligning the bacterial DNA sequence against the pdif sequences in the pdifDB database by BLASTN to obtain the pdif sequence annotation results for the bacterial DNA sequence.
Further, constructing a pdifDB database, comprising:
extracting pdif sequences from plasmid Genbank files at NCBI to establish the pdifDB database.
Further, the method also comprises:
outputting the pdif sequence annotation result in tabular and graphical form.
Further, the content shown in the graph includes the number of pdif sites and their specific positions in the bacterial DNA sequence.
Further, the format of the bacterial DNA sequence is Fasta or Genbank format.
In a second aspect, the present invention provides a bacterial pdif rapid annotation device, comprising:
the database construction module is used for constructing a pdifDB database;
the acquisition module is used for acquiring a bacterial DNA sequence to be analyzed through whole genome sequencing;
an alignment module for aligning said bacterial DNA sequence with a pdif sequence in said pdifDB database by BLASTN to obtain pdif sequence annotation results in bacterial DNA sequences.
Through the above technical scheme, the invention has the following beneficial effects: the invention provides a method and a device for rapidly annotating bacterial pdif sequences, with which the number and positions of the pdif sequences in bacterial DNA can be annotated simply by submitting the sequence. The user can also download the annotation files as needed, which solves the problem that no annotation website for identifying pdif sequences currently exists. The user only needs to upload the bacterial DNA sequence; the number and positions of its pdif sequences are returned within a very short time after submission and can be further visualized in graphical form. Establishing this methodology is of great significance for studying the pdif-mediated mechanism of drug-resistance gene transfer in bacteria.
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and together with the description serve to explain the principles of the invention and not to limit the invention.
FIG. 1 is a flow chart of a method for rapidly annotating a bacterial pdif sequence according to an embodiment of the present invention;
FIG. 2 is a pdif sequence annotation main interface of Acinetobacter baumannii in the examples of the present invention;
FIG. 3a is a schematic diagram showing the result of pdif site analysis of the genomic sequence of Acinetobacter baumannii 255_n in the example of the present invention;
FIG. 3b is a schematic diagram showing the sequence information of the aligned fragment of the Acinetobacter baumannii 255_n genome in the embodiment of the present invention;
FIG. 4 is a block diagram of a device for fast annotation of bacterial pdif sequences provided in the examples of the present invention.
Detailed Description
To make the claimed subject matter clearer and more detailed, the invention is described below with reference to the accompanying drawings. The embodiments described are only a part of the embodiments of the present application, not all of them. All other embodiments that a person skilled in the art can derive from the embodiments given herein without creative effort shall fall within the protection scope of the present application.
FIG. 1 is a flow chart of a method for rapidly annotating bacterial pdif according to an embodiment of the present invention; the method for rapidly annotating bacterial pdif mainly comprises the following steps:
step S101, constructing a pdifDB database;
specifically, the pdifDB database is established by extracting pdif sequences from the annotations in plasmid Genbank files at the National Center for Biotechnology Information (NCBI). In this example, 41 pdif sequences were extracted from those annotations into the pdifDB database.
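As a rough illustration of step S101, the sketch below turns a collection of already-extracted pdif sequences into FASTA text and builds a `makeblastdb` command line for it. The function names, record names, and file names are hypothetical; how the original work extracts the 41 sequences from the Genbank annotations is not described here, so that part is left out.

```python
# Sketch under assumptions: sequences are already extracted and named.
def pdif_fasta_text(sites: dict) -> str:
    """Render {name: sequence} as FASTA text, one record per site."""
    records = []
    for name, seq in sites.items():
        seq = seq.upper()
        unexpected = set(seq) - set("ACGTN")
        if unexpected:
            raise ValueError(f"{name}: unexpected characters {sorted(unexpected)}")
        records.append(f">{name}\n{seq}")
    return "\n".join(records) + "\n"


def makeblastdb_command(fasta_path: str, db_prefix: str) -> list:
    """Argument list for building a nucleotide BLAST database.

    The list can be passed to subprocess.run(); it is not executed here.
    """
    return ["makeblastdb", "-in", fasta_path, "-dbtype", "nucl", "-out", db_prefix]


# Hypothetical placeholder records, not real pdifDB entries.
demo_sites = {
    "pdif_demo_1": "ATTTCGTATAAGGTGTATTATACGAAAT",
    "pdif_demo_2": "attaagcataatgcgcattatgttaaat",
}
```

Returning the command as a list rather than running it keeps the sketch free of side effects and lets the caller decide where the database files go.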
Step S102, obtaining a bacterial DNA sequence to be analyzed through whole genome sequencing, or directly downloading a fully assembled bacterial DNA sequence from NCBI;
specifically, the sequence to be analyzed is either obtained by whole genome sequencing and then assembled, or downloaded already assembled from the NCBI database. The sequence may be in Fasta or Genbank format, which are the most common formats for DNA sequences.
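Because both Fasta and Genbank inputs are accepted, a tool of this kind has to tell them apart. The following is a minimal sketch of one way to do that, based only on the well-known conventions that a FASTA record starts with `>` and a GenBank flat file starts with `LOCUS`; the function name is hypothetical and the original implementation may rely on the command-line flag instead.

```python
def detect_sequence_format(text: str) -> str:
    """Guess 'fasta' or 'genbank' from the first non-blank line."""
    first_line = next((ln for ln in text.splitlines() if ln.strip()), "")
    if first_line.startswith(">"):
        return "fasta"
    if first_line.startswith("LOCUS"):
        return "genbank"
    raise ValueError("input looks like neither Fasta nor Genbank")
```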
Step S103, aligning the bacterial DNA sequence against the pdif sequences in the pdifDB database using the Basic Local Alignment Search Tool (BLAST), a search tool based on a local alignment algorithm, to obtain the pdif sequence annotation result for the bacterial DNA sequence;
specifically, the bacterial DNA sequence is aligned by BLAST against the pdifDB database constructed in step S101, and the corresponding pdif sites in the bacterial DNA sequence are searched for, so as to obtain the annotation result of the pdif sequences in the bacterial DNA sequence. Before alignment, the minimum coverage and the minimum identity need to be set; the default values are 60% and 90%, respectively. The sequence is then uploaded and compared with the pdifDB database established in step S101 to obtain the positions, number, and sequences of the pdif sites in the bacterial DNA sequence.
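The two thresholds can be expressed as a small filter. This is a sketch under one stated assumption: coverage is taken to be the aligned length divided by the length of the pdif reference sequence, which the text does not define explicitly. The function name is hypothetical; the defaults follow the 60% and 90% values given above.

```python
def passes_thresholds(pident: float, align_len: int, pdif_len: int,
                      min_coverage: float = 60.0,
                      min_identity: float = 90.0) -> bool:
    """Return True if a hit meets both the coverage and identity cut-offs.

    Assumption: coverage = aligned length / pdif reference length * 100.
    """
    if pdif_len <= 0:
        raise ValueError("pdif_len must be positive")
    coverage = 100.0 * align_len / pdif_len
    return coverage >= min_coverage and pident >= min_identity
```

For a 28 bp reference, for example, an alignment of 14 bp gives 50% coverage and is rejected even at 100% identity.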
Step S104, outputting the pdif sequence annotation result in tabular and graphical form;
specifically, in this example the sequence information of the best-matching fragment in the pdifDB database is obtained based on the coverage (60%) and identity (90%) of the alignment, and the system outputs the annotation result in a very short time.
The output annotation result is then visualized as follows: a Biopython tool is used to convert the sequence file into Genbank format and add the annotation information; a website is then built with the Python Django framework and an nginx server, and the result is displayed on a web page, where GBrowse and ECharts plug-ins display the Genbank-format file in graphical form.
The visualized content mainly comprises the number of pdifs, the sequence of the pdifs and the positions of the pdifs.
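To make the "add annotation information to the Genbank file" step more concrete, here is a small sketch that formats one annotated site as GenBank feature-table lines using only the standard library. The original uses Biopython for this; the function name, the choice of the `misc_feature` key, and the `/note` qualifier are assumptions made for illustration.

```python
def genbank_feature_lines(start: int, end: int, strand: str, label: str) -> list:
    """Format one annotated site as GenBank feature-table lines.

    Layout follows the flat-file convention: feature key at column 6,
    location at column 22, qualifiers at column 22 on following lines.
    """
    location = f"{start}..{end}"
    if strand == "-":
        location = f"complement({location})"
    key_line = " " * 5 + "misc_feature".ljust(16) + location
    note_line = " " * 21 + f'/note="{label}"'
    return [key_line, note_line]
```

A viewer that reads GenBank files can then draw each site at its position, which covers the number and positions of the pdif sites mentioned above.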
In addition, the method can be implemented as a website using a Python program. The website is named pdif, and its workflow is as follows:
first, pdif has one required parameter and four optional parameters. The required parameter is the input file, i.e. the assembled bacterial DNA sequence (Fasta or Genbank format). A Fasta-format sequence is loaded with --nucleotide (-n), and a Genbank-format sequence is loaded with --genbank (-g). In addition, the following optional parameters may be supplied:
output file directory: --resultdir; reference database: --databases; minimum coverage: --coverage; minimum identity: --identity.
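The parameter list above maps naturally onto Python's `argparse`. The sketch below assumes the options are long flags written with a double dash and that exactly one of the two input flags must be given; the default values for the result directory and database name are placeholders, while the 60 and 90 defaults come from the text.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Command-line parser mirroring the parameters listed above."""
    parser = argparse.ArgumentParser(
        prog="pdif", description="Annotate pdif sites in a bacterial DNA sequence.")
    # Exactly one input file is required: Fasta or Genbank.
    source = parser.add_mutually_exclusive_group(required=True)
    source.add_argument("-n", "--nucleotide", metavar="FASTA",
                        help="assembled sequence in Fasta format")
    source.add_argument("-g", "--genbank", metavar="GENBANK",
                        help="assembled sequence in Genbank format")
    # Optional parameters; the two path defaults are hypothetical.
    parser.add_argument("--resultdir", default="pdif_result",
                        help="output file directory")
    parser.add_argument("--databases", default="pdifDB",
                        help="reference database")
    parser.add_argument("--coverage", type=float, default=60.0,
                        help="minimum coverage in percent")
    parser.add_argument("--identity", type=float, default=90.0,
                        help="minimum identity in percent")
    return parser
```

For example, `build_parser().parse_args(["-n", "plasmid.fasta", "--identity", "95"])` would keep the default coverage and raise the identity cut-off to 95.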
Next, the corresponding pdif sequences in the bacterial DNA sequence are found by BLAST comparison with the pdif sequences in the pdifDB database (E-value: 1e-5). The comparison result comprises:
qseqid: the query sequence id;
sseqid: the target sequence id;
pident: the percentage of identical matches;
length: the length of the matched sequence;
mismatch: the number of mismatched bases;
gapopen: the number of gap openings;
qstart: the start position of the match in the query sequence;
qend: the end position of the match in the query sequence;
sstart: the start position of the match in the target sequence;
send: the end position of the match in the target sequence;
evalue: the expected value;
bitscore: the bit score of the match;
qseq: the nucleic acid sequence of the matched part of the query sequence;
sseq: the nucleic acid sequence of the matched part of the target sequence;
slen: the length of the target sequence.
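The fifteen fields listed above correspond to column specifiers of BLAST's tabular output (`-outfmt 6`). As a sketch, the code below builds a `blastn` argument list that requests exactly those columns and parses one tab-separated result line into a dictionary. The function names are hypothetical, and the derived `strand` entry is an addition of this sketch based on the BLAST convention that `sstart` is greater than `send` for minus-strand matches.

```python
FIELDS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
          "qstart", "qend", "sstart", "send", "evalue", "bitscore",
          "qseq", "sseq", "slen"]
INT_FIELDS = {"length", "mismatch", "gapopen", "qstart", "qend",
              "sstart", "send", "slen"}
FLOAT_FIELDS = {"pident", "evalue", "bitscore"}


def blastn_command(query: str, db: str, evalue: str = "1e-5") -> list:
    """Argument list for blastn with the tabular columns listed above."""
    return ["blastn", "-query", query, "-db", db, "-evalue", evalue,
            "-outfmt", "6 " + " ".join(FIELDS)]


def parse_hit(line: str) -> dict:
    """Convert one tab-separated BLAST result line into a typed dict."""
    values = line.rstrip("\n").split("\t")
    if len(values) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} columns, got {len(values)}")
    hit = dict(zip(FIELDS, values))
    for name in INT_FIELDS:
        hit[name] = int(hit[name])
    for name in FLOAT_FIELDS:
        hit[name] = float(hit[name])
    # Minus-strand matches are reported with sstart > send.
    hit["strand"] = "+" if hit["sstart"] <= hit["send"] else "-"
    return hit
```

Since pdif references are only about 28 bp long, it may be worth noting that `blastn` also offers a `-task blastn-short` mode for short sequences; whether the original tool uses it is not stated.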
Thirdly, the sequence information of the best-matching fragment is obtained based on the coverage and identity of the alignment; the default minimum coverage and minimum identity are 60% and 90%, respectively.
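When several reference pdif sequences match the same stretch of the query, only one should be reported. The text does not say how the best-matching fragment is chosen, so the following sketch makes an assumption: among hits that overlap on the same query sequence, keep the one with the highest bit score. The function name is hypothetical, and the input is expected to be hits that already passed the coverage and identity cut-offs.

```python
def best_hits(hits: list) -> list:
    """Keep the highest-scoring hit at each query locus.

    Each hit is a dict with at least qseqid, qstart, qend and bitscore.
    Hits are visited from best to worst; a hit is dropped if it overlaps
    an already accepted hit on the same query sequence.
    """
    chosen = []
    for hit in sorted(hits, key=lambda h: h["bitscore"], reverse=True):
        lo, hi = sorted((hit["qstart"], hit["qend"]))
        overlaps = any(
            kept["qseqid"] == hit["qseqid"]
            and lo <= max(kept["qstart"], kept["qend"])
            and hi >= min(kept["qstart"], kept["qend"])
            for kept in chosen
        )
        if not overlaps:
            chosen.append(hit)
    # Report in genomic order.
    return sorted(chosen, key=lambda h: (h["qseqid"], min(h["qstart"], h["qend"])))
```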
Finally, the pdif sequence results are output in tabular and graphical form.
The tool can be used in two ways, one through the website and the other through the command line:
based on the website:
the pdif website is opened and the Fasta or Genbank format file of the target bacterial DNA sequence is submitted (FIG. 2), with the minimum coverage and identity set (defaults 60% and 90%, respectively). After the sequence is submitted, each run of the program takes less than 1 minute.
The analysis results mainly show the specific pdif information for the bacterial DNA sequence. As shown in FIG. 3, the test strain was Acinetobacter baumannii 255_n; the analysis showed that the strain carries 8 pdif sites (FIG. 3a), and the positions of the pdif sequences in the bacterial DNA sequence can be clearly seen (FIG. 3b).
Based on the command line:
the method has been uploaded to GitHub and works as follows:
the script accepts sequence files in Fasta or Genbank format (plasmid sequences smaller than 1M);
the output files include: amrgene.txt, isfider.filter.xls, pdif_site.txt, plasma.html;
the process comprises the following steps:
firstly, drug-resistance genes are searched for in the bacterial DNA sequence; if none is found, the program terminates;
secondly, all pdif sites are searched for based on the pdif database; if none is found, the program terminates;
further, paired pdif sites are looked for and only paired pdif sites are kept; if none is found, the program terminates;
and finally, the result is output as a chart, in which blue represents drug-resistance genes, gray lines represent pdif sites, and green represents insertion elements.
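The pairing step can be sketched as follows. The text does not define what makes two pdif sites "paired", so this example assumes one plausible reading: two neighbouring pdif sites form a pair when at least one drug-resistance gene lies entirely between them. The function name and the coordinate tuples are hypothetical.

```python
def paired_pdif_modules(pdif_sites: list, genes: list) -> list:
    """Find neighbouring pdif sites that flank a resistance gene.

    pdif_sites: list of (start, end) tuples on one sequence.
    genes:      list of (name, start, end) tuples on the same sequence.
    Returns a list of (left_site, right_site, [gene names]).
    Assumption: a "pair" is two adjacent sites with a gene between them.
    """
    sites = sorted(pdif_sites)
    modules = []
    for left, right in zip(sites, sites[1:]):
        enclosed = [name for name, g_start, g_end in genes
                    if g_start > left[1] and g_end < right[0]]
        if enclosed:
            modules.append((left, right, enclosed))
    return modules
```

An empty result corresponds to the "if none is found, the program terminates" branch of the workflow above.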
Operation:
pdifFinder is a Python 3.x script that is easy to use and supports Windows and Linux systems;
first, BLAST is required and should be added to the environment variables;
second, read requirements.txt to check the dependencies, or install them directly by running: pip install -r requirements.txt;
finally, the command is run as follows:
python pdifFinder.py -i fastafile -o outdir or python pdifFinder.py -g genbankfile -o outdir
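Since the script depends on BLAST being reachable through the environment variables, a quick preflight check can save a failed run. The sketch below uses the standard library's `shutil.which` to report which executables are missing from `PATH`; the function name and the default tool list are assumptions, not part of the original script.

```python
import shutil


def missing_tools(tools=("blastn", "makeblastdb")) -> list:
    """Return the names of required executables not found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]
```

A caller could print the returned list and exit early when it is not empty, before any sequence is processed.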
Example 2:
the embodiment of the present invention provides a bacterial pdif rapid annotation device (as shown in FIG. 4), which can perform the bacterial pdif rapid annotation method provided by any embodiment of the present invention and has the corresponding functional modules and beneficial effects. The device includes:
a database construction module 21, configured to construct a pdifDB database;
an obtaining module 22, configured to obtain a bacterial DNA sequence to be analyzed by whole genome sequencing;
an alignment module 23 for aligning the bacterial DNA sequence with the pdif sequences in the pdifDB database by BLAST to obtain pdif sequence annotation results in bacterial DNA sequences.
Further comprising: and the output module 24 is used for outputting the pdif sequence annotation result in a tabular and graphical form.
In the embodiments of the present invention, the description of each embodiment has its own emphasis; for parts not described in detail in one embodiment, reference may be made to the related descriptions of other embodiments.
In the embodiments provided in the present application, the disclosed technical content can be implemented in other ways. The above-mentioned embodiments are merely illustrative, and for example, the division of the units may be a logical division, and in actual implementation, there may be another division, for example, multiple units or components may be combined/integrated into another system, or some features may be omitted, or not executed. The units described as separate parts are not physically separate and may be located in one unit or distributed in a plurality of units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment. In addition, the integrated unit may be implemented in a hardware form/a software form.
The integrated unit, if implemented in software and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. The storage medium includes: a removable hard disk, a USB flash drive, a Random Access Memory (RAM), a Read-Only Memory (ROM), or an optical disk.
The above description is only exemplary of the invention and should not be taken as limiting the invention; any modification, equivalent replacement, or improvement made within the spirit and principle of the invention shall be included within the scope of the invention.

Claims (10)

1. A bacterial pdif rapid annotation method is characterized by comprising the following steps:
constructing a pdifDB database;
obtaining a bacterial DNA sequence to be analyzed through whole genome sequencing;
aligning the bacterial DNA sequence against the pdif sequences in the pdifDB database by BLASTN to obtain the pdif sequence annotation results for the bacterial DNA sequence.
2. The method for rapid annotation of bacterial pdif according to claim 1, wherein the pdifDB database is constructed by:
extracting pdif sequences from plasmid Genbank files at NCBI to establish the pdifDB database.
3. The method for rapid annotation of bacterial pdif according to claim 1, further comprising:
and outputting the pdif sequence annotation result in a tabular and graphical form.
4. The method for fast annotation of bacterial pdif according to claim 3, wherein the graphical representation comprises the number of pdif and the specific location of pdif in the bacterial DNA sequence.
5. The method for rapidly annotating bacterial pdif according to claim 1, wherein said bacterial DNA sequence is in Fasta or Genbank format.
6. A bacterial pdif rapid annotation device, comprising:
the database construction module is used for constructing a pdifDB database;
the acquisition module is used for acquiring a bacterial DNA sequence to be analyzed through whole genome sequencing;
an alignment module for aligning said bacterial DNA sequence with a pdif sequence in said pdifDB database by BLASTN to obtain pdif sequence annotation results in bacterial DNA sequences.
7. The bacterial pdif rapid annotation device of claim 6, wherein the pdifDB database is constructed by:
extracting pdif sequences from plasmid Genbank files at NCBI to establish the pdifDB database.
8. The bacterial pdif rapid annotation device of claim 6, further comprising:
and the output module is used for outputting the pdif sequence annotation result in a tabular and graphical form.
9. The apparatus for rapid annotation of bacterial pdif according to claim 8, wherein the graphical representation comprises the number of pdif and the specific location of pdif in the bacterial DNA sequence.
10. The device for rapidly annotating bacterial pdif according to claim 6, wherein said bacterial DNA sequence is in Fasta or Genbank format.
CN202011239023.XA 2020-11-09 2020-11-09 Method and device for quickly annotating bacterial pdif Pending CN112309499A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011239023.XA CN112309499A (en) 2020-11-09 2020-11-09 Method and device for quickly annotating bacterial pdif

Publications (1)

Publication Number Publication Date
CN112309499A true CN112309499A (en) 2021-02-02

Family

ID=74325245

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011239023.XA Pending CN112309499A (en) 2020-11-09 2020-11-09 Method and device for quickly annotating bacterial pdif

Country Status (1)

Country Link
CN (1) CN112309499A (en)


Patent Citations (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH02128692A (en) * 1988-11-09 1990-05-17 Yakult Honsha Co Ltd Composite shuttle vector
US6960465B1 (en) * 2001-06-27 2005-11-01 Northwestern University Increased cell resistance to toxic organic substances
EP1574519A1 (en) * 2002-12-18 2005-09-14 Japan Science and Technology Agency Transcription control factor ZHX3
CN101128479A (en) * 2005-02-25 2008-02-20 味之素株式会社 Novel plasmid autonomously replicable in enterobacteriaceae family
CN103093123A (en) * 2011-11-08 2013-05-08 北京健数通生物计算技术有限公司 Pathogen genome sequence database system
CN108350510A (en) * 2015-09-09 2018-07-31 优比欧迈公司 For diagnosis of the gastrointestinal health associated disease from microbial population and therapy and system
CN106886689A (en) * 2015-12-15 2017-06-23 浙江大学 A kind of pathogenic microorganism genome rapid analysis method and system
CN106884037A (en) * 2015-12-16 2017-06-23 博奥生物集团有限公司 A kind of gene chip kit of detection bacterium drug resistant gene
CN106544353A (en) * 2016-11-08 2017-03-29 宁夏医科大学总医院 A kind of method that utilization CRISPR Cas9 remove Acinetobacter bauamnnii drug resistance gene
CN107384926A (en) * 2017-08-13 2017-11-24 中国人民解放军疾病预防控制所 A kind of CRISPR Cas9 systems for targetting bacteria removal Drug Resistance Plasmidss and application
CN110021348A (en) * 2018-06-19 2019-07-16 上海交通大学医学院附属瑞金医院 Oncogene mutation detection methods and system based on RNA-seq data
CN111009286A (en) * 2018-10-08 2020-04-14 深圳华大因源医药科技有限公司 Method and apparatus for microbiological analysis of host samples
CN110349630A (en) * 2019-06-21 2019-10-18 天津华大医学检验所有限公司 Analysis method, device and its application of the macro gene order-checking data of blood
CN110423772A (en) * 2019-07-17 2019-11-08 上海科技大学 One kind being used for Acinetobacter bauamnnii cytosine base editor plasmid and its application
CN110910960A (en) * 2019-11-30 2020-03-24 浙江天科高新技术发展有限公司 Acinetobacter baumannii molecular serotype rapid analysis method
CN111254158A (en) * 2020-02-27 2020-06-09 山东省千佛山医院 Method for eliminating drug-resistant plasmids in enterobacteriaceae bacteria
CN111378788A (en) * 2020-04-08 2020-07-07 广州微远基因科技有限公司 Bacterial marker for assisting COVID-19 diagnosis and application thereof

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
李一鸣 等: "肠杆菌科细菌耐药基因表达的遗传和环境调控", 《生物工程学报》 *
陈代杰 等: "从靶标到网络———抗菌药物作用机制与细菌耐药机制的研究进展", 《中国感染与化疗杂志》 *

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20210202