CN110556163A - Analysis method of long-chain non-coding RNA translation small peptide based on translation group - Google Patents

Analysis method of long-chain non-coding RNA translation small peptide based on translation group

Info

Publication number
CN110556163A
Authority
CN
China
Prior art keywords
open reading
short open
reading frame
coding
translation
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910830146.1A
Other languages
Chinese (zh)
Other versions
CN110556163B (en)
Inventor
夏昊强
周煌凯
高川
张羽
陶勇
罗玥
陈飞钦
邢燕
张秋雪
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangzhou Gene Denovo Biotechnology Co ltd
Original Assignee
Guangzhou Gene Denovo Biotechnology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangzhou Gene Denovo Biotechnology Co ltd
Priority to CN201910830146.1A
Publication of CN110556163A
Application granted
Publication of CN110556163B
Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B30/00 ICT specially adapted for sequence analysis involving nucleotides or amino acids
    • G16B30/10 Sequence alignment; Homology search
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B40/00 ICT specially adapted for biostatistics; ICT specially adapted for bioinformatics-related machine learning or data mining, e.g. knowledge discovery or pattern finding

Landscapes

  • Life Sciences & Earth Sciences (AREA)
  • Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Biophysics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Biotechnology (AREA)
  • Evolutionary Biology (AREA)
  • Theoretical Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Chemical & Material Sciences (AREA)
  • Analytical Chemistry (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Bioethics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Epidemiology (AREA)
  • Evolutionary Computation (AREA)
  • Public Health (AREA)
  • Software Systems (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)

Abstract

The invention discloses a method for analyzing small peptides translated from long-chain non-coding RNA based on the translation group, comprising the following steps: step S1, searching for and identifying short open reading frames in non-coding regions; step S2, evaluating the translation ability of the short open reading frames from 4 aspects: calculating the ribosome imprinting abundance of the short open reading frames, screening potentially translatable short open reading frames based on ribosome imprinting, evaluating the coding potential of the short open reading frames based on sequence characteristics, and annotating the potential protein domains of the short open reading frames; and step S3, comprehensively analyzing the coding capacity of each short open reading frame obtained in step S2 and drawing a Venn diagram. By combining translatomics with bioinformatics, the invention can predict and analyze the translation capability of non-coding regions and enables visual study of translation and of ribosome movement.

Description

Analysis method of long-chain non-coding RNA translation small peptide based on translation group
Technical Field
The invention relates to the field of high-throughput sequencing and bioinformatics analysis, and in particular to a translation-group-based method for analyzing small peptides translated from long-chain non-coding RNA.
Background
Long non-coding RNAs (lncRNAs) are a class of RNA molecules more than 200 nucleotides in length. They were originally regarded as "noise" of genome transcription, with neither biological function nor protein-coding capacity. As more and more long-chain non-coding RNAs have been discovered and studied, many reports indicate that they act in biological processes such as disease occurrence, the cell cycle and stem cell differentiation through four molecular mechanisms: as signal molecules, induction molecules, guide molecules and scaffold molecules.
With the development and maturation of ribosome imprinting technology (ribosome profiling, Ribo-seq for short), translation group sequencing emerged, providing a technical means for discovering new modes of translation. Studies have shown that some RNA regions traditionally thought not to encode proteins (including long non-coding RNAs, 5' UTRs and 3' UTRs) may in fact take part in regulating translation in vivo and be translated into small peptides, typically fewer than 100 amino acids long. These small peptides also play diverse roles in organisms, including in individual development, muscle contraction and DNA repair. The open reading frames encoding these small peptides are typically shorter than 300 nt and are commonly referred to as short open reading frames (sORFs). Searching for and identifying unknown small peptides and further studying their functions is therefore of great value in scientific research.
However, ribosome imprinting alone may detect RNA that is only indirectly bound to the ribosome and not translated, or RNA bound to other types of protein complexes; in addition, owing to random factors, a ribosome complex may bind to RNA without carrying out normal translation. It is therefore necessary to combine bioinformatics techniques to find non-coding RNAs that are actually translated and to mine new functional small peptides. Common gene annotation generally focuses only on proteins longer than 100 amino acids. The coding sequences of these common coding genes are also called consensus coding sequences (CCDS), and the corresponding open reading frames are called CCDS ORFs; these ORFs are usually also the main protein-coding ORFs (mORFs). Proteins shorter than 100 amino acids, by contrast, are treated as noise or false positives by most software and algorithms and have therefore long been ignored. Because short open reading frames are very numerous and can exert important biological functions, a genome annotation method designed for screening them is needed.
In summary, current methods for studying long-chain non-coding RNA and tools for predicting its translation function are neither comprehensive nor accurate and do not form a systematic method. As a result, the regulatory mechanisms and biological functions of long-chain non-coding RNA cannot be predicted well, and long-chain non-coding RNAs and their translated small peptides in organisms such as humans, other mammals and plants cannot be studied systematically and in depth. By combining translatomics with bioinformatics, the invention searches for potential small peptides in non-coding regions of the genome, judges whether the detected read signal is consistent with the footprint of a translating ribosome, and enables visual study of translation and of ribosome movement.
Disclosure of Invention
In order to overcome the defects in the prior art, the invention aims to provide a method for analyzing a long-chain non-coding RNA translated small peptide based on a translation group, so as to solve the problems in the prior art. The invention provides an analysis method of a long-chain non-coding RNA translation small peptide based on a translation group, which comprises the following steps:
Step S1: searching for and identifying short open reading frames in non-coding regions;
Preferably, the short open reading frames searched for in non-coding region RNA are required to be more than 20 amino acids (60 nt) and less than 150 amino acids (450 nt) long. The short open reading frames of non-coding regions are searched as follows: (1) transcript sequences are located in and extracted from the genome; (2) for coding genes, the longest transcript of each gene is extracted, and this sequence contains one mORF; (3) for lncRNA, every transcript sequence is retained, i.e., alternative splicing is preserved.
Preferably, short open reading frames in non-coding regions are identified on the transcript sequences extracted from a reference genome, by retrieving all potential open reading frames between start and stop codons within the non-coding regions.
Step S2: analyzing the translation ability of the short open reading frames, including but not limited to the following:
Step S2.1: calculating the ribosome imprinting abundance of the short open reading frames;
Preferably, first, the ribosome imprinting reads are aligned to the extracted transcript sequences with the software bowtie2, and only reads aligned to a gene are retained; second, based on the alignment result, the abundance of each open reading frame region (the number of reads overlapping the ORF) is counted, and the expression level (RPKM value) of each open reading frame is calculated; finally, based on the RPKM values, the fold difference in expression of each short open reading frame between samples and the relationship between the fold difference in uORF translation abundance and that in downstream mORF translation abundance are calculated, and the analysis results are displayed graphically.
Step S2.2: screening a short open reading frame which is potentially translatable based on ribosome imprinting;
Based on the characteristics shown by ribosome imprinting signals from active translation, the ORF score and RRS (Ribosome Release Score) calculation methods are used to screen for potentially translatable short open reading frames.
Step S2.3: evaluating the coding potential of the short open reading frame based on sequence characteristics;
With known coding sequences as the reference database, the software CPAT is used to calculate Fickett_score and Hexamer_score and thereby evaluate the coding potential of the short open reading frames.
Step S2.4: annotating short open reading frame potential protein domains;
The protein sequences potentially encoded by the short open reading frames are aligned to the Pfam database (http://pfam.xfam.org/), and the protein domains potentially contained in the small peptides are analyzed.
Step S3: comprehensive analysis of multiple evaluation methods;
The coding capacity of each short open reading frame from step S2 is analyzed comprehensively: potentially translatable short open reading frames are first screened with each of 4 criteria, and a Venn diagram is then drawn.
Preferably, the 4 criteria and their screening thresholds are: (1) M: short open reading frames with an average expression level greater than 1 in at least one group of samples; (2) ORF score: short open reading frames whose ORF score exceeds the threshold; (3) RRS: short open reading frames whose RRS exceeds the threshold; (4) Fickett_score: short open reading frames whose Fickett_score exceeds 0.74.
Compared with the prior art, the translation-group-based method for analyzing small peptides translated from long-chain non-coding RNA combines translatomics with bioinformatics to address the problem that existing methods for studying long-chain non-coding RNA and for predicting its translation function are not comprehensive and do not form a systematic method. The invention establishes a systematic method for analyzing small peptides translated from long-chain non-coding RNA and can predict and analyze the translation capability of non-coding regions.
Drawings
FIG. 1 is a flow chart of the method for analyzing small peptides translated from long-chain non-coding RNA based on the translation group according to the present invention;
FIG. 2 is the distribution of fold differences for the three types of ORF (lncORF, uORF, dORF) in the example of the invention;
FIG. 3 is a volcano plot of differential expression in the example of the invention;
FIG. 4 is the score distribution of mORFs and sORFs calculated by the RRS method in the example of the invention;
FIG. 5 is the score distribution of mORFs and sORFs calculated by the Fickett_score method in the example of the invention;
FIG. 6 shows the results of screening for potentially translatable sORFs using the 4 criteria in the example of the invention.
Detailed Description
Embodiments of the present invention are described below through specific examples and with reference to the accompanying drawings; other advantages and effects of the invention will be readily apparent to those skilled in the art from this disclosure. The invention can also be implemented or applied through other embodiments, and the details herein can be modified in various respects without departing from the spirit and scope of the invention.
The method for analyzing small peptides translated from long-chain non-coding RNA based on the translation group overcomes the lack of comprehensiveness of existing methods for studying long-chain non-coding RNA and of tools for predicting its translation function, and can accurately and systematically predict the translation capability of non-coding regions. The present invention therefore provides such a method (FIG. 1), comprising the following steps:
Step S1: searching for and identifying short open reading frames in non-coding regions;
Preferably, the short open reading frames of non-coding regions are searched as follows: (1) transcript sequences are located in and extracted from the genome; (2) for coding genes, the longest transcript of each gene is extracted, and this sequence contains one mORF; (3) for lncRNA, every transcript sequence is retained, i.e., alternative splicing is preserved.
Preferably, the short open reading frames searched for in non-coding region RNA are required to be more than 20 amino acids (60 nt) and less than 150 amino acids (450 nt) long.
Preferably, short open reading frames in non-coding regions are identified on the transcript sequences extracted from a reference genome, by retrieving all potential open reading frames between start and stop codons within the non-coding regions. In this embodiment, the short open reading frames are classified by region of origin as follows: (1) a short open reading frame derived from an lncRNA region is defined as an lncORF (lncRNA ORF); (2) a short open reading frame derived from the 5' UTR of a known coding gene is defined as a uORF (upstream ORF); (3) a short open reading frame derived from the 3' UTR of a known coding gene is defined as a dORF (downstream ORF).
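The search and classification described above can be illustrated with a minimal sketch. It assumes ATG as the only start codon, scans the forward strand of a transcript only, counts the start codon but not the stop codon in the peptide length, and treats a uORF or dORF as lying wholly inside the respective UTR; the function names `find_sorfs` and `classify_sorf` are hypothetical and are not taken from the patent.

```python
# Sketch only: enumerate candidate sORFs on one transcript sequence.
# Assumptions (not stated in the patent): ATG start codon, forward strand,
# half-open 0-based coordinates, stop codon excluded from the peptide length.
START = "ATG"
STOPS = {"TAA", "TAG", "TGA"}


def find_sorfs(seq, min_aa=20, max_aa=150):
    """Return (start, end, n_aa) for each ATG followed by an in-frame stop
    whose peptide length is more than min_aa and less than max_aa."""
    seq = seq.upper()
    orfs = []
    for i in range(len(seq) - 2):
        if seq[i:i + 3] != START:
            continue
        # Walk codon by codon until the first in-frame stop codon.
        for j in range(i + 3, len(seq) - 2, 3):
            if seq[j:j + 3] in STOPS:
                n_aa = (j - i) // 3
                if min_aa < n_aa < max_aa:
                    orfs.append((i, j + 3, n_aa))
                break
    return orfs


def classify_sorf(orf_start, orf_end, cds_start=None, cds_end=None):
    """Label an sORF by region of origin; cds_* is None for an lncRNA."""
    if cds_start is None or cds_end is None:
        return "lncORF"
    if orf_end <= cds_start:
        return "uORF"   # entirely within the 5' UTR
    if orf_start >= cds_end:
        return "dORF"   # entirely within the 3' UTR
    return "other"      # overlaps the annotated mORF


if __name__ == "__main__":
    transcript = "ATG" + "GCT" * 25 + "TAA"
    for start, end, n_aa in find_sorfs(transcript):
        print(start, end, n_aa, classify_sorf(start, end))
```

Because every ATG is considered, nested ORFs that share a stop codon are all reported; a real pipeline might keep only the longest one per stop codon.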
Step S2: analyzing the translation ability of the short open reading frames, including but not limited to the following:
Step S2.1: calculating the ribosome imprinting abundance of the short open reading frames: ribosome footprints (RFs) mark the positions on the RNA where the ribosome complex resides as it slides along. A higher RFs signal within an sORF therefore indicates that more ribosome complexes occupy this region and that the sORF is more likely to be translated into protein.
Preferably, first, the ribosome imprinting reads are aligned to the extracted transcript sequences with the software bowtie2, and only reads aligned to a gene are retained; second, based on the alignment result, the abundance of each open reading frame region (the number of reads overlapping the ORF) is counted, and the expression level (RPKM value) of each open reading frame is calculated; finally, based on the RPKM values, the fold difference in expression of each short open reading frame between samples is calculated with the differential analysis software edgeR, and the relationship between the fold difference in translation abundance of each uORF and that of its downstream mORF is further calculated.
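As a hedged illustration of the counting and normalization step, the sketch below counts reads that overlap an ORF and converts the count to RPKM using the standard definition (reads per kilobase of ORF per million mapped reads). The read representation as `(start, end)` half-open intervals on the transcript and the function names are assumptions for illustration; the alignment itself (bowtie2) and the differential analysis (edgeR) are outside this sketch.

```python
# Sketch only: overlap counting and RPKM for one ORF.
# Reads are assumed to be (start, end) half-open intervals on the transcript.

def count_overlapping_reads(reads, orf_start, orf_end):
    """Number of reads sharing at least one base with [orf_start, orf_end)."""
    return sum(1 for start, end in reads if start < orf_end and end > orf_start)


def rpkm(read_count, orf_length_nt, total_mapped_reads):
    """Reads per kilobase of ORF per million mapped reads."""
    if orf_length_nt <= 0 or total_mapped_reads <= 0:
        raise ValueError("ORF length and total mapped reads must be positive")
    return read_count * 1e9 / (orf_length_nt * total_mapped_reads)


if __name__ == "__main__":
    reads = [(0, 30), (290, 320), (300, 330)]
    n = count_overlapping_reads(reads, 10, 300)
    print(n, rpkm(n, 290, 10_000_000))
```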
Preferably, for each pairwise comparison, ORFs (both sORFs and mORFs) with an average RPKM value ≥ 1 in at least one group of samples are selected and judged to have sufficient RFs coverage for subsequent analysis, and short open reading frames that fail this criterion are filtered out. The subsequent analysis includes, but is not limited to: a distribution plot of short open reading frame fold differences (FIG. 2), a scatter plot of sORF change versus downstream mORF change, a volcano plot of differential expression (FIG. 3) and a bar chart of significantly different short open reading frames.
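The coverage filter can be sketched as follows, assuming the RPKM values are held in a nested dictionary keyed by ORF name and then by sample group; this data layout and the names `passes_coverage` and `filter_orfs` are hypothetical.

```python
# Sketch only: keep ORFs whose mean RPKM is >= 1 in at least one sample group.
from statistics import mean


def passes_coverage(rpkm_by_group, min_mean=1.0):
    """rpkm_by_group maps a group name to the list of RPKM values of its samples."""
    return any(mean(values) >= min_mean
               for values in rpkm_by_group.values() if values)


def filter_orfs(table, min_mean=1.0):
    """table maps an ORF name to its rpkm_by_group dictionary."""
    return {orf: groups for orf, groups in table.items()
            if passes_coverage(groups, min_mean)}


if __name__ == "__main__":
    table = {
        "orfA": {"control": [0.2, 0.4], "treated": [1.5, 0.9]},
        "orfB": {"control": [0.1, 0.3], "treated": [0.5, 0.7]},
    }
    print(sorted(filter_orfs(table)))
```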
Step S2.2: screening for potentially translatable short open reading frames based on ribosome imprinting: ribosome footprints (RFs) are the traces left when a ribosome complex binds RNA. Such binding may be random and involve no productive translation, or it may be the transient footprint left by a ribosome complex sliding along a translatable RNA during peptide chain synthesis. RFs signals from active translation show characteristic properties: the signal follows a three-base periodicity, and the signal in the 3' UTR after the stop codon is significantly lower than in the upstream ORF region. The invention therefore calculates two parameters, ORF score and RRS, to determine whether the RFs signal on a non-coding RNA sequence comes from translating ribosomes and thereby to screen for potentially translatable short open reading frames.
Preferably, ORF score is a method for determining whether the RFs signal within an ORF shows three-base periodicity. The main analysis steps are: first, the RFs of each sample are classified by length, the predominant reading frame into which RFs of each length fall is counted, and the P-site position of each individual RF is predicted, so that the reading frame of each RF can be determined; second, based on these reading-frame assignments, the RFs of all samples are pooled, and the ORF score of each open reading frame covered by RFs data is calculated; finally, the lowest 5% (lower 5% quantile) of the ORF scores of coding genes is used as the threshold for judging whether a sequence is translated.
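The patent does not state the ORF score formula. The sketch below assumes the chi-square style definition commonly used in the ribosome profiling literature: the P-site counts in the three frames are compared with a uniform expectation, the statistic is log2-transformed, and the sign is made negative when frame 1 is not the dominant frame. The per-length P-site offsets and all function names are hypothetical placeholders that would have to be estimated from the data.

```python
# Sketch only: frame assignment and an assumed ORF score definition.
import math


def frame_counts(reads, orf_start, orf_end, psite_offsets):
    """Count P-sites per reading frame.

    reads: iterable of (five_prime_position, read_length) on the transcript.
    psite_offsets: hypothetical map read_length -> offset of the P-site
    from the 5' end of the read; lengths without an offset are skipped.
    """
    counts = [0, 0, 0]
    for pos, length in reads:
        offset = psite_offsets.get(length)
        if offset is None:
            continue
        p_site = pos + offset
        if orf_start <= p_site < orf_end:
            counts[(p_site - orf_start) % 3] += 1
    return counts


def orf_score(counts):
    """log2(chi-square + 1) against a uniform frame distribution,
    negated when frame 1 has fewer P-sites than frame 2 or frame 3."""
    total = sum(counts)
    if total == 0:
        return 0.0
    expected = total / 3
    chi_square = sum((c - expected) ** 2 / expected for c in counts)
    sign = -1.0 if counts[0] < counts[1] or counts[0] < counts[2] else 1.0
    return sign * math.log2(chi_square + 1)


if __name__ == "__main__":
    reads = [(0, 28), (3, 28), (4, 28), (100, 28), (0, 25)]
    counts = frame_counts(reads, 12, 60, {28: 12})
    print(counts, orf_score(counts))
```

A strongly periodic ORF (most P-sites in frame 1) gets a large positive score, a uniform distribution scores 0, and a signal concentrated in another frame scores negative.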
Preferably, RRS (Ribosome Release Score) is a method that compares, for each ORF, the translation signal of the CDS region with that of the downstream UTR region. The reads of all samples are first pooled, and the RRS value of each ORF is then calculated. The lowest 5% (lower 5% quantile) of the RRS values of coding genes is used as the threshold for judging whether a sequence is translated.
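The threshold rule (lower 5% quantile of the scores of coding genes) applies to both ORF score and RRS and can be sketched as below. Linear interpolation between order statistics is one of several quantile conventions and is an assumption here, as are the function names.

```python
# Sketch only: derive a threshold from coding-gene scores and apply it to sORFs.

def lower_quantile(values, q=0.05):
    """q-quantile with linear interpolation between order statistics."""
    xs = sorted(values)
    if not xs:
        raise ValueError("at least one value is required")
    pos = q * (len(xs) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(xs) - 1)
    return xs[lo] + (pos - lo) * (xs[hi] - xs[lo])


def translated_candidates(sorf_scores, coding_scores, q=0.05):
    """Names of sORFs whose score exceeds the coding-gene q-quantile."""
    threshold = lower_quantile(coding_scores, q)
    return {name for name, score in sorf_scores.items() if score > threshold}


if __name__ == "__main__":
    coding = list(range(101))           # toy mORF scores 0..100
    sorfs = {"sorf1": 7.0, "sorf2": 2.0}
    print(lower_quantile(coding), translated_candidates(sorfs, coding))
```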
Preferably, the conventional calculation formula of the RRS is:
$$\mathrm{RRS} = \frac{\mathrm{Ribo\_RPKM}(\mathrm{CDS}) \,/\, \mathrm{Ribo\_RPKM}(3'\mathrm{UTR})}{\mathrm{RNA\_RPKM}(\mathrm{CDS}) \,/\, \mathrm{RNA\_RPKM}(3'\mathrm{UTR})}$$
Wherein: Ribo_RPKM(CDS) is the Ribo-seq abundance (RPKM value) of the CDS region of an ORF; Ribo_RPKM(3'UTR) is the Ribo-seq abundance (RPKM value) of the 3' UTR region downstream of the ORF; RNA_RPKM(CDS) is the RNA-seq abundance (RPKM value) of the CDS region of the ORF; RNA_RPKM(3'UTR) is the RNA-seq abundance (RPKM value) of the 3' UTR region downstream of the ORF.
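A minimal sketch of the RRS calculation from the four quantities defined here, assuming the commonly used ratio-of-ratios form: the Ribo-seq CDS to 3' UTR ratio divided by the RNA-seq CDS to 3' UTR ratio. The small pseudocount that guards against division by zero is an assumption and not part of the patent text; the function name is hypothetical.

```python
# Sketch only: Ribosome Release Score as a ratio of ratios (assumed form).

def ribosome_release_score(ribo_cds, ribo_utr3, rna_cds, rna_utr3, pseudo=1e-3):
    """All inputs are RPKM values; pseudo is added to every term (assumption)."""
    ribo_ratio = (ribo_cds + pseudo) / (ribo_utr3 + pseudo)
    rna_ratio = (rna_cds + pseudo) / (rna_utr3 + pseudo)
    return ribo_ratio / rna_ratio


if __name__ == "__main__":
    # Ribosomes are released at the stop codon: strong CDS signal, weak 3' UTR signal.
    print(ribosome_release_score(100.0, 5.0, 50.0, 50.0))
    # No release: footprints continue into the 3' UTR at the same density.
    print(ribosome_release_score(100.0, 100.0, 50.0, 50.0))
```

A translated ORF is expected to show a high score because footprint density drops after the stop codon while RNA-seq coverage does not.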
In this example, potentially translatable short open reading frames were screened by calculating the two screening indices ORF score and RRS, and the score distributions of mORFs and sORFs were obtained (FIG. 4).
Step S2.3: evaluating the coding potential of the short open reading frame based on sequence characteristics;
With known coding sequences as the reference database, the software CPAT is used to calculate Fickett_score and Hexamer_score and thereby evaluate the coding potential of the short open reading frames.
Preferably, Fickett_score: the likelihood of coding is calculated from the preference of bases for absolute and relative (percentage) positions in coding versus non-coding sequences. A value less than or equal to 0.74 is generally taken to indicate no coding capability, a value greater than or equal to 0.95 indicates coding capability, and a value between 0.74 and 0.95 is uncertain.
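The three-way interpretation of a Fickett_score value can be written down directly. The sketch below only applies the cut-offs quoted here to a score that has already been computed (for example by CPAT); it does not compute the score itself, and the function name is hypothetical.

```python
# Sketch only: interpret a precomputed Fickett_score with the quoted cut-offs.

def classify_fickett(score, low=0.74, high=0.95):
    if score <= low:
        return "non-coding"
    if score >= high:
        return "coding"
    return "uncertain"


if __name__ == "__main__":
    for value in (0.60, 0.80, 0.97):
        print(value, classify_fickett(value))
```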
Preferably, Hexamer_score: coding and non-coding sequences are broken into 6-mers, giving 4096 (4^6) possible hexamers, and for each 6-mer the likelihood of coding is calculated from its relative frequency in coding versus non-coding sequences. A positive value indicates that a sequence is more likely to encode a protein, the more so the larger the value; a negative value indicates that it is less likely to encode a protein, the more so the larger its magnitude.
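The idea behind a hexamer score can be sketched as a mean log-likelihood ratio over the in-frame 6-mers of a sequence. This is a simplified illustration and not CPAT's implementation: the in-frame step of 3 for both training sets, the pseudocount and the function names are all assumptions.

```python
# Sketch only: hexamer usage tables and a mean log-ratio score.
import math
from collections import Counter

N_HEXAMERS = 4 ** 6  # 4096 possible 6-mers


def hexamer_table(seqs, step=3):
    """Count 6-mers in a set of training sequences, sampling every `step` bases."""
    counts = Counter()
    for seq in seqs:
        seq = seq.upper()
        counts.update(seq[i:i + 6] for i in range(0, len(seq) - 5, step))
    return counts


def hexamer_score(seq, coding, noncoding, pseudo=1.0):
    """Mean log ratio of coding to non-coding hexamer frequency (smoothed)."""
    seq = seq.upper()
    kmers = [seq[i:i + 6] for i in range(0, len(seq) - 5, 3)]
    if not kmers:
        return 0.0
    n_coding = sum(coding.values()) + pseudo * N_HEXAMERS
    n_noncoding = sum(noncoding.values()) + pseudo * N_HEXAMERS
    total = 0.0
    for kmer in kmers:
        p_coding = (coding[kmer] + pseudo) / n_coding
        p_noncoding = (noncoding[kmer] + pseudo) / n_noncoding
        total += math.log(p_coding / p_noncoding)
    return total / len(kmers)


if __name__ == "__main__":
    coding = hexamer_table(["ATGGCTGCTGCTGCTGCT"] * 5)
    noncoding = hexamer_table(["TTTTTTTTTTTTTTTTTT"] * 5)
    print(hexamer_score("ATGGCTGCTGCT", coding, noncoding))
    print(hexamer_score("TTTTTTTTTTTT", coding, noncoding))
```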
Preferably, an overall coding probability can also be obtained by combining the two indices above. In this embodiment, the coding capacity of the short open reading frames was evaluated by calculating the two screening indices Fickett_score and Hexamer_score, and the score distributions of mORFs and sORFs were obtained (FIG. 5).
Step S2.4: annotating short open reading frame potential protein domains;
The protein sequences potentially encoded by the short open reading frames are aligned to the Pfam database (http://pfam.xfam.org/), and the protein domains potentially contained in the small peptides are analyzed.
Step S3: comprehensive analysis of multiple evaluation methods;
The coding capacity of each short open reading frame from step S2 is analyzed comprehensively: potentially translatable short open reading frames are first screened with each of 4 criteria, and a Venn diagram is then drawn.
Preferably, the 4 criteria and their screening thresholds are: (1) M (mean): short open reading frames with an average expression level greater than 1 in at least one group of samples; (2) ORF score: short open reading frames whose ORF score exceeds the threshold; (3) RRS: short open reading frames whose RRS exceeds the threshold; (4) Fickett_score: short open reading frames whose Fickett_score exceeds 0.74.
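The combined analysis amounts to set operations over the four lists of sORFs that pass each criterion. The sketch below computes the contents of every Venn region (the sORFs that belong to exactly a given combination of criteria) without drawing the diagram; the set names and the function name `venn_regions` are hypothetical.

```python
# Sketch only: Venn regions for the four screening criteria.

def venn_regions(sets):
    """sets maps a criterion name to the set of sORFs passing it.
    Returns a dict from frozenset of criterion names to the sORFs that
    pass exactly those criteria."""
    regions = {}
    universe = set().union(*sets.values())
    for orf in universe:
        key = frozenset(name for name, members in sets.items() if orf in members)
        regions.setdefault(key, set()).add(orf)
    return regions


if __name__ == "__main__":
    passed = {
        "M": {"sorf1", "sorf2", "sorf3"},
        "ORF score": {"sorf1", "sorf2"},
        "RRS": {"sorf1", "sorf3"},
        "Fickett_score": {"sorf1"},
    }
    for criteria, members in venn_regions(passed).items():
        print(sorted(criteria), sorted(members))
    # sORFs supported by all four criteria
    print(set.intersection(*passed.values()))
```

The region sizes are what a four-set Venn diagram displays; the intersection of all four sets gives the most strongly supported candidates.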
In summary, the translation-group-based method analyzes small peptides translated from long-chain non-coding RNA by combining translatomics with bioinformatics. It addresses the problem that prior-art methods for studying long-chain non-coding RNA and for predicting its translation function are not comprehensive and do not form a systematic method, and it establishes a systematic analysis method that can predict and analyze the translation capability of non-coding regions.
The foregoing embodiments are merely illustrative of the principles and utilities of the present invention and are not intended to limit the invention. Modifications and variations can be made to the above-described embodiments by those skilled in the art without departing from the spirit and scope of the present invention. Therefore, the scope of the invention should be determined from the following claims.

Claims (8)

1. A method for analyzing small peptides translated from long-chain non-coding RNA based on a translation group, characterized by comprising the following steps:
step S1, searching for and identifying short open reading frames in non-coding regions;
step S2, evaluating the translation ability of the short open reading frames from 4 aspects: calculating the ribosome imprinting abundance of the short open reading frames, screening potentially translatable short open reading frames based on ribosome imprinting, evaluating the coding potential of the short open reading frames based on sequence characteristics, and annotating the potential protein domains of the short open reading frames;
and step S3, comprehensively analyzing the coding capacity of the short open reading frames obtained in step S2, and drawing a Venn diagram.
2. The analysis method of claim 1, wherein in step S1 the short open reading frames searched for in non-coding region RNA are required to be greater than 20 amino acids (60 nt) and less than 150 amino acids (450 nt); the short open reading frames of non-coding regions are searched as follows: (1) transcript sequences are located in and extracted from the genome; (2) for coding genes, the longest transcript of each gene is extracted, and this sequence contains one mORF; (3) for long non-coding RNAs, every transcript sequence is retained.
3. The analytical method of claim 2, wherein: short open reading frame recognition of non-coding regions is based on transcript sequences extracted from a reference genome, with potential open reading frames between all start and stop codons being retrieved in the non-coding region.
4. The method of claim 1, wherein the step of calculating the ribosome imprinting abundance of the short open reading frames in step S2 comprises: first, aligning the ribosome imprinting reads to the extracted transcript sequences with the software bowtie2 and retaining only reads aligned to a gene; second, based on the alignment result, counting the abundance falling in each open reading frame region and calculating the expression level (RPKM value) of each open reading frame; and finally, based on the RPKM values, calculating the fold difference in expression of each short open reading frame between samples and the relationship between the fold difference in uORF translation abundance and that in downstream mORF translation abundance, and displaying the analysis results graphically.
5. The method of claim 1, wherein in step S2 the screening for potentially translatable short open reading frames based on ribosome imprinting uses the ORF score and RRS calculation methods, according to the characteristics shown by ribosome imprinting signals from active translation.
6. The method of claim 1, wherein the step of evaluating the coding potential of the short open reading frames based on sequence characteristics in step S2 comprises calculating Fickett_score and Hexamer_score with the software CPAT, using known coding sequences as the reference database.
7. The method of claim 1, wherein the step of annotating potential protein domains of the short open reading frames in step S2 is performed by aligning the protein sequences potentially encoded by the short open reading frames to the Pfam database and analyzing the protein domains potentially contained in the small peptides.
8. The method of claim 1, wherein in step S3 the coding capacity of the short open reading frames obtained in step S2 is comprehensively analyzed with 4 criteria: (1) M: short open reading frames with an average expression level greater than 1 in at least one group of samples; (2) ORF score: short open reading frames whose ORF score exceeds the threshold; (3) RRS: short open reading frames whose RRS exceeds the threshold; (4) Fickett_score: short open reading frames whose Fickett_score exceeds 0.74.
CN201910830146.1A 2019-09-04 2019-09-04 Analysis method of long-chain non-coding RNA translation small peptide based on translation group Active CN110556163B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910830146.1A CN110556163B (en) 2019-09-04 2019-09-04 Analysis method of long-chain non-coding RNA translation small peptide based on translation group

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910830146.1A CN110556163B (en) 2019-09-04 2019-09-04 Analysis method of long-chain non-coding RNA translation small peptide based on translation group

Publications (2)

Publication Number Publication Date
CN110556163A (en) 2019-12-10
CN110556163B (en) 2022-12-30

Family

ID=68738912

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910830146.1A Active CN110556163B (en) 2019-09-04 2019-09-04 Analysis method of long-chain non-coding RNA translation small peptide based on translation group

Country Status (1)

Country Link
CN (1) CN110556163B (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111243665A (en) * 2020-01-07 2020-06-05 广州基迪奥生物科技有限公司 Analysis method and system for ribosome imprinting sequencing data
CN111690673A (en) * 2020-06-22 2020-09-22 扬州大学 Verification method for promoting PGCs (PGCs) formation by lncRNA (long chain ribonucleic acid) through encoding small peptide
CN114038500A (en) * 2021-08-27 2022-02-11 海南医学院 Method for identifying non-coding RNA polypeptide
CN114639442A (en) * 2022-03-30 2022-06-17 中国农业科学院农业基因组研究所 Method and system for predicting open reading frame based on single nucleotide polymorphism
CN118038995A (en) * 2024-01-23 2024-05-14 常州大学 Method and system for predicting small open reading window coding polypeptide capacity in non-coding RNA

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
AUPQ580300A0 (en) * 2000-02-23 2000-03-16 National Cancer Centre Of Singapore Pte Ltd Genetic analysis
US6303297B1 (en) * 1992-07-17 2001-10-16 Incyte Pharmaceuticals, Inc. Database for storage and analysis of full-length sequences
US6321163B1 (en) * 1999-09-02 2001-11-20 Genetics Institute, Inc. Method and apparatus for analyzing nucleic acid sequences
WO2003064608A2 (en) * 2002-01-25 2003-08-07 The Board Of Trustees Of The Leland Stanford Junior University Surface based translation system
US20100120625A1 (en) * 2008-11-03 2010-05-13 The Regents Of The University Of California Methods for detecting modification resistant nucleic acids
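CN106957354A (en) * 2017-04-14 2017-07-18 The Third Affiliated Hospital of Guangzhou Medical University (Guangzhou Critical Maternal Treatment Center, Guangzhou Rouji Hospital) HOXB-AS3 polypeptide and the open reading frame encoding it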
CA3054487A1 (en) * 2017-03-01 2018-09-07 Bluedot Llc Systems and methods for metagenomic analysis
CN109979528A (en) * 2019-03-28 2019-07-05 Guangzhou Gene Denovo Biotechnology Co ltd Analysis method for single-cell immune repertoire sequencing data

Patent Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6303297B1 (en) * 1992-07-17 2001-10-16 Incyte Pharmaceuticals, Inc. Database for storage and analysis of full-length sequences
US6321163B1 (en) * 1999-09-02 2001-11-20 Genetics Institute, Inc. Method and apparatus for analyzing nucleic acid sequences
AUPQ580300A0 (en) * 2000-02-23 2000-03-16 National Cancer Centre Of Singapore Pte Ltd Genetic analysis
WO2003064608A2 (en) * 2002-01-25 2003-08-07 The Board Of Trustees Of The Leland Stanford Junior University Surface based translation system
US20100120625A1 (en) * 2008-11-03 2010-05-13 The Regents Of The University Of California Methods for detecting modification resistant nucleic acids
CA3054487A1 (en) * 2017-03-01 2018-09-07 Bluedot Llc Systems and methods for metagenomic analysis
CN106957354A (en) * 2017-04-14 2017-07-18 The Third Affiliated Hospital of Guangzhou Medical University (Guangzhou Critical Maternal Treatment Center, Guangzhou Rouji Hospital) HOXB-AS3 polypeptide and the open reading frame encoding it
CN109979528A (en) * 2019-03-28 2019-07-05 Guangzhou Gene Denovo Biotechnology Co ltd Analysis method for single-cell immune repertoire sequencing data

Non-Patent Citations (6)

* Cited by examiner, † Cited by third party
Title
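Qingqing Liu et al.: "Long Non-coding RNA Expression Profile and Functional Analysis in Children With Acute Fulminant Myocarditis", Frontiers in Pediatrics *
Wei Zheng et al.: "Integrated analysis of long non-coding RNAs and mRNAs associated with peritendinous fibrosis", Journal of Advanced Research *
Gene Denovo: "How to analyze the coding capacity of circular RNA", HTTPS://WWW.SOHU.COM/A/284986175_278730 *
Zhang Xiaoqin et al.: "Research progress on long non-coding RNAs related to adipogenesis", Chinese Journal of Clinical Laboratory Science *
Jia Zhilong et al.: "Ribosome profiling technology and its applications", Progress in Biochemistry and Biophysics
Guo Rui et al.: "Differential expression analysis of long non-coding RNAs during midgut development of Apis mellifera ligustica workers", Scientia Agricultura Sinica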

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111243665A (en) * 2020-01-07 2020-06-05 Guangzhou Gene Denovo Biotechnology Co ltd Analysis method and system for ribosome footprint sequencing data
CN111690673A (en) * 2020-06-22 2020-09-22 Yangzhou University Verification method for lncRNA (long non-coding RNA) promoting PGC formation by encoding a small peptide
CN114038500A (en) * 2021-08-27 2022-02-11 Hainan Medical University Method for identifying non-coding RNA polypeptide
CN114639442A (en) * 2022-03-30 2022-06-17 Agricultural Genomics Institute, Chinese Academy of Agricultural Sciences Method and system for predicting open reading frame based on single nucleotide polymorphism
CN114639442B (en) * 2022-03-30 2024-01-30 Agricultural Genomics Institute, Chinese Academy of Agricultural Sciences Method and system for predicting open reading frame based on single nucleotide polymorphism
CN118038995A (en) * 2024-01-23 2024-05-14 Changzhou University Method and system for predicting the polypeptide-coding capacity of small open reading frames in non-coding RNA

Also Published As

Publication number Publication date
CN110556163B (en) 2022-12-30

Similar Documents

Publication Publication Date Title
CN110556163B (en) Analysis method of long-chain non-coding RNA translation small peptide based on translation group
CN113160887B (en) Screening method of tumor neoantigen fused with single cell TCR sequencing data
CN109767810B (en) High-throughput sequencing data analysis method and device
US20120191685A1 (en) Method for identifying peptides and proteins from mass spectrometry data
CN110189796A (en) Analysis method for sheep whole-genome resequencing
CN113160877B (en) Prediction method of cell-specific genome G-quadruplex
JP6644672B2 (en) Characterization of biological materials using unassembled sequence information, stochastic methods, and trait-specific database catalogs
CN115101128B (en) Method for evaluating off-target risk of hybridization capture probe
CN113096737B (en) Method and system for automatically analyzing pathogen type
CN111180013B (en) Device for detecting blood disease fusion gene
Crysup et al. Using unique molecular identifiers to improve allele calling in low-template mixtures
Yu et al. Prediction of protein-coding small ORFs in multi-species using integrated sequence-derived features and the random forest model
CN117275577A (en) Algorithm for detecting human mitochondrial genetic mutation sites based on second-generation sequencing technology
CN110438235B (en) Method for deducing crowd source based on hair shaft proteome nsSNP
CN115478113A (en) Beef cattle fatty acid component candidate marker multi-omics screening method and application thereof
CN111164701A (en) Fixed-point noise model for target sequencing
CN110751985B (en) Gut microbial markers highly correlated with large heavy chickens
CN109215736A (en) High-throughput detection method for the enterovirus group and its application
CN110684830A (en) RNA analysis method for paraffin section tissue
CN111028885B (en) Method and device for detecting yak RNA editing site
CN110592093B (en) Aptamer capable of recognizing EpCAM protein, and preparation method and application thereof
CN113528631B (en) Method and system for predicting sample quality in NGS sequencing
Wang Improved Basecalling and Base Modification Detection Through Signal-level Analysis of Nanopore Direct RNA Data
Pfeil Development of a novel barcode calling algorithm for long error-prone reads
CN118230820A (en) Metagene sequencing data-based drug-resistant gene species source identification method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant