CN105787290B - Synthetic biology data processing method and its processing system - Google Patents

Publication number
CN105787290B
Authority
CN
China
Application number
CN201610077123.4A
Other languages
Chinese (zh)
Other versions
CN105787290A (en)
Inventor
师佩
李盼盼
吕琪
吴涛
董亚非
Original Assignee
陕西师范大学 (Shaanxi Normal University)
Application filed by 陕西师范大学
Priority to CN201610077123.4A
Publication of CN105787290A
Application granted
Publication of CN105787290B

Abstract

The invention discloses a synthetic biology data processing method and system, in the technical field of synthetic biology. The method splits the data obtained from iGEM into multiple feature sequences and processes those feature sequences according to information such as their sequence, name and type, so that the resulting data are comprehensive and free of redundancy.

Description

Synthetic biology data processing method and its processing system

Technical field

The present invention relates to the technical field of synthetic biology, and in particular to a synthetic biology data processing method and its processing system.

Background art

With the development of synthetic biology, the volume of data on biological parts (components) keeps growing and the repetition rate keeps rising. Data file formats differ and are not interoperable, which has become a problem that synthetic biologists face in common. It is well known that the essence of a biological part is its sequence: parts are built by assembling different DNA sequences, and the goal of assembly is to give the part a specific function and make it more complete. Much current work in synthetic biology focuses on assembly standards for parts, and many new assembly strategies have been proposed, for example BioBrick and BglBrick.

Many journals provide complete sequences and their descriptions, but user access is restricted. Many journals give no description of plasmid sequences at all. Addgene addresses the problem of plasmid data by providing complete, annotated plasmid sequences, but even its considerable effort covers only a small fraction of published plasmids. Earlier review articles of the same kind involved very few biological parts, only about 2000, which is not comprehensive enough for analysis.

Standard biological parts are registered in iGEM and GenoCAD. The data volume is large, the submitted data are recognized by synthetic biologists, and the submitted data follow a unified format. However, data are only submitted to iGEM; no one has organized and analyzed them. Only the raw data tables are available, and they contain a large amount of redundancy.

Summary of the invention

Embodiments of the invention provide a synthetic biology data processing method and its processing system, to solve the problems in the prior art that the volume of biological part data is small and that the data are redundant.

A synthetic biology data processing method, comprising the following steps:

obtaining data from iGEM and storing the data in a raw data table;

splitting the parts in the raw data table to obtain a plurality of feature sequences and storing them in a first feature sequence table, wherein each record in the first feature sequence table includes feature ID, name, type, complement, start, sequence and part ID information;

querying, in the first feature sequence table, feature sequences whose sequence, name and type are all identical, retaining one of each, and storing the result in a second feature sequence table;

querying, in the second feature sequence table, feature sequences whose sequences are identical but whose names differ, and storing them in a first name data table;

querying and retaining, in the first name data table, the feature sequences whose complement is 1, and deleting the other feature sequences, to obtain a first sub-name data table;

if the feature sequences in the first sub-name data table are all identical, querying and retaining, in the first sub-name data table, the feature sequences that have the same complement and the smallest start, and deleting the other feature sequences, to obtain a second sub-name data table;

if the feature sequences in the second sub-name data table are all identical, querying and retaining, in the second sub-name data table, the feature sequences that have a complete conceptual name, and deleting the other feature sequences, to obtain a second name data table;

querying, in the second feature sequence table, feature sequences whose sequence and name are both identical but whose types differ, and storing them in a first type data table;

querying and retaining, in the first type data table, the feature sequences whose complement is 1, and deleting the other feature sequences, to obtain a first sub-type data table;

if the feature sequences in the first sub-type data table are all identical, querying and retaining, in the first sub-type data table, the feature sequences whose type meets a preset condition; retaining, in the second name data table, the feature sequences retained in the first sub-type data table and deleting the other feature sequences, to obtain a second type data table;

querying separately, in the first feature sequence table, feature sequences whose sequence, name and type are all identical but whose complement differs, whose start differs, and whose part ID differs, and retaining one of each; storing those retained feature sequences whose sequences differ in the second type data table, to obtain a third feature sequence table;

renaming the feature sequences in the third feature sequence table to obtain a fourth feature sequence table.

Preferably, the steps further include:

if different feature sequences exist in the first sub-name data table, using the first sub-name data table as the second name data table; or

if different feature sequences exist in the second sub-name data table, using the second sub-name data table as the second name data table.

Preferably, the steps further include:

if different feature sequences exist in the first sub-type data table, using the first sub-type data table as the second type data table.

Preferably, the step of renaming the feature sequences in the third feature sequence table to obtain the fourth feature sequence table is specifically:

combining the name and the feature ID of each feature sequence in the third feature sequence table to obtain a new name, thereby obtaining the fourth feature sequence table.

A synthetic biology data processing system, comprising:

a data acquisition module, configured to obtain data from iGEM and store the data in a raw data table;

a part splitting module, configured to split the parts in the raw data table to obtain a plurality of feature sequences and store them in a first feature sequence table, wherein each record in the first feature sequence table includes feature ID, name, type, complement, start, sequence and part ID information;

a first data screening module, configured to query, in the first feature sequence table, feature sequences whose sequence, name and type are all identical, retain one of each, and store the result in a second feature sequence table;

a second data screening module, configured to query, in the second feature sequence table, feature sequences whose sequences are identical but whose names differ, and store them in a first name data table;

a first data deletion module, configured to first query and retain the feature sequences whose complement is 1 and delete the other feature sequences, to obtain a first sub-name data table; if the feature sequences in the first sub-name data table are all identical, to query and retain in the first sub-name data table the feature sequences that have the same complement and the smallest start and delete the other feature sequences, to obtain a second sub-name data table; and if the feature sequences in the second sub-name data table are all identical, to query and retain in the second sub-name data table the feature sequences that have a complete conceptual name and delete the other feature sequences, to obtain a second name data table;

a third data screening module, configured to query, in the second feature sequence table, feature sequences whose sequence and name are both identical but whose types differ, and store them in a first type data table;

a second data deletion module, configured to first query and retain the feature sequences whose complement is 1 and delete the other feature sequences, to obtain a first sub-type data table; and if the feature sequences in the first sub-type data table are all identical, to query and retain in the first sub-type data table the feature sequences whose type meets a preset condition, and to retain in the second name data table the feature sequences retained in the first sub-type data table and delete the other feature sequences, to obtain a second type data table;

a fourth data screening module, configured to query separately, in the first feature sequence table, feature sequences whose sequence, name and type are all identical but whose complement differs, whose start differs, and whose part ID differs, to retain one of each, and to store those retained feature sequences whose sequences differ in the second type data table, to obtain a third feature sequence table;

a data renaming module, configured to rename the feature sequences in the third feature sequence table to obtain a fourth feature sequence table.

Preferably, when different feature sequences exist in the first sub-name data table, the first data deletion module uses the first sub-name data table as the second name data table; or

when different feature sequences exist in the second sub-name data table, it uses the second sub-name data table as the second name data table.

Preferably, when different feature sequences exist in the first sub-type data table, the second data deletion module uses the first sub-type data table as the second type data table.

Preferably, the data renaming module combines the name and the feature ID of each feature sequence in the third feature sequence table to obtain a new name, thereby obtaining the fourth feature sequence table.

With the synthetic biology data processing method and processing system of the embodiments of the invention, the data obtained from iGEM are split into multiple feature sequences, and the feature sequences are processed according to information such as sequence, name and type, so that the resulting data are comprehensive and free of redundancy.

Brief description of the drawings

To describe the technical solutions in the embodiments of the invention or in the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below show only some embodiments of the invention; a person of ordinary skill in the art can derive other drawings from them without creative effort.

Fig. 1 is a flow chart of the steps of the synthetic biology data processing method provided by an embodiment of the invention;

Fig. 2 is a schematic diagram of the raw data table used by the method of Fig. 1;

Fig. 3 is a schematic diagram of the first feature sequence table used by the method of Fig. 1;

Fig. 4 is a functional block diagram of a synthetic biology data processing system that uses the method of Fig. 1.

Detailed description of the embodiments

The technical solutions in the embodiments of the invention are described clearly and completely below with reference to the drawings. Obviously, the described embodiments are only some of the embodiments of the invention, not all of them. All other embodiments obtained by a person of ordinary skill in the art from the embodiments of the invention without creative effort fall within the protection scope of the invention.

Referring to Fig. 1, an embodiment of the invention provides a synthetic biology data processing method, which comprises:

Step 100: using the Perl language (Practical Extraction and Report Language), obtain data from the iGEM (International Genetically Engineered Machine Competition) database and store the data in a raw data table. As shown in Fig. 2, each row of the raw data table holds the information of one biological part (component): part ID (parts_id), name (name), description (definition), sequence (sequence), length (length) and status (status).
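The raw data table of step 100 can be pictured as a single relational table. The patent names Perl as its language and does not say which storage engine is used, so the following is only a minimal sketch in Python with an in-memory SQLite database. The column names follow the field names given in the text; the table name `raw_parts` and the sample row are hypothetical.

```python
import sqlite3

# In-memory database, used here only for illustration.
conn = sqlite3.connect(":memory:")

# One row per biological part; columns mirror the fields listed in step 100.
conn.execute("""
    CREATE TABLE raw_parts (
        parts_id   INTEGER PRIMARY KEY,
        name       TEXT,
        definition TEXT,
        sequence   TEXT,
        length     INTEGER,
        status     TEXT
    )
""")

# Hypothetical sample part, not taken from iGEM.
part = (1, "demo_part", "illustrative entry", "ATGCCCTAA", 9, "Available")
conn.execute("INSERT INTO raw_parts VALUES (?, ?, ?, ?, ?, ?)", part)

row = conn.execute(
    "SELECT name, length FROM raw_parts WHERE parts_id = 1"
).fetchone()
```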

Step 101: split each biological part in the raw data table to obtain multiple feature sequences, and store them in a first feature sequence table. As shown in Fig. 3, each row of the first feature sequence table holds the information of one feature sequence: feature ID (feature_id), original ID (origin_id), name (name), type (type), complement (complement), start (start), end (end), sequence (sequence) and part ID (parts_id).
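The text does not spell out how a part is cut into feature sequences. As a rough sketch only, the Python below assumes that each part comes with a list of annotations carrying a name, type, complement flag and 1-based inclusive start and end positions, and slices the part sequence accordingly. The function `split_part`, the annotation format and the sample data are all hypothetical, and the `origin_id` field is left out because its meaning is not defined in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Feature:
    """One row of the first feature sequence table (origin_id omitted)."""
    feature_id: int
    name: str
    type: str
    complement: int
    start: int
    end: int
    sequence: str
    parts_id: int


def split_part(part, annotations, next_feature_id=1):
    """Cut one part into feature rows.

    Assumes start/end are 1-based and inclusive (an assumption, not stated
    in the patent).
    """
    rows = []
    for offset, ann in enumerate(annotations):
        rows.append(Feature(
            feature_id=next_feature_id + offset,
            name=ann["name"],
            type=ann["type"],
            complement=ann["complement"],
            start=ann["start"],
            end=ann["end"],
            # Convert 1-based inclusive coordinates to a Python slice.
            sequence=part["sequence"][ann["start"] - 1:ann["end"]],
            parts_id=part["parts_id"],
        ))
    return rows


# Hypothetical part and annotations.
part = {"parts_id": 7, "name": "demo_part", "sequence": "ATGCCCTAA"}
annotations = [
    {"name": "start codon", "type": "start", "complement": 1, "start": 1, "end": 3},
    {"name": "stop codon", "type": "end", "complement": 1, "start": 7, "end": 9},
]
features = split_part(part, annotations)
```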

Step 102: in the first feature sequence table, query the feature sequences whose sequence, name and type are all identical; for each distinct combination of sequence, name and type, retain one feature sequence and store it in a second feature sequence table.
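Step 102 is a deduplication keyed on sequence, name and type. A minimal Python sketch follows, with rows modelled as dictionaries. The text does not say which duplicate is kept, so keeping the first one encountered is an assumption, and the sample rows are hypothetical.

```python
def dedupe_features(rows):
    """Keep one row per distinct (sequence, name, type) combination."""
    seen = set()
    kept = []
    for row in rows:
        key = (row["sequence"], row["name"], row["type"])
        if key not in seen:  # first occurrence wins (assumption)
            seen.add(key)
            kept.append(row)
    return kept


# Hypothetical rows of the first feature sequence table.
first_table = [
    {"feature_id": 1, "name": "rbs1", "type": "rbs", "sequence": "AAAGAGG"},
    {"feature_id": 2, "name": "rbs1", "type": "rbs", "sequence": "AAAGAGG"},
    {"feature_id": 3, "name": "rbs1", "type": "misc", "sequence": "AAAGAGG"},
]
second_table = dedupe_features(first_table)
```

Rows 1 and 2 collapse into one entry, while row 3 survives because its type differs; that kind of leftover is what the later name and type comparisons deal with.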

Step 103: in the second feature sequence table, query the feature sequences whose sequences are identical but whose names differ, and store them in a first name data table.
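The query in step 103 amounts to grouping rows by sequence and keeping the groups that contain more than one name. The sketch below uses a hypothetical helper, `conflicting_rows`, that groups on the fields that must match and checks one field that must differ. With `same=("sequence", "name")` and `differ="type"`, the same helper would express the type query of step 105.

```python
from collections import defaultdict


def conflicting_rows(rows, same, differ):
    """Return rows that agree on the `same` fields but disagree on `differ`."""
    groups = defaultdict(list)
    for row in rows:
        groups[tuple(row[field] for field in same)].append(row)
    conflicts = []
    for group in groups.values():
        if len({row[differ] for row in group}) > 1:
            conflicts.extend(group)
    return conflicts


# Hypothetical rows of the second feature sequence table.
second_table = [
    {"feature_id": 1, "name": "lacI", "type": "cds", "sequence": "ATGAAA"},
    {"feature_id": 2, "name": "LacI repressor", "type": "cds", "sequence": "ATGAAA"},
    {"feature_id": 3, "name": "rbs1", "type": "rbs", "sequence": "AAAGAGG"},
]
first_name_table = conflicting_rows(second_table, same=("sequence",), differ="name")
first_type_table = conflicting_rows(second_table, same=("sequence", "name"), differ="type")
```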

Step 104: with reference to the first name data table, delete feature sequences from the second feature sequence table to obtain a second name data table.

Specifically, first query and retain, in the first name data table, the feature sequences whose complement is 1 and delete the other feature sequences, obtaining a first sub-name data table. If the feature sequences in the first sub-name data table are all identical, query and retain in it the feature sequences that have the same complement and the smallest start, and delete the others, obtaining a second sub-name data table. If the feature sequences in the second sub-name data table are all identical, query and retain in it the feature sequences that have a complete conceptual name, and delete the others, obtaining the second name data table.

If different feature sequences exist in the first sub-name data table, the first sub-name data table is used as the second name data table.

If different feature sequences exist in the second sub-name data table, the second sub-name data table is used as the second name data table.
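The reduction in step 104 reads as a cascade of tie-breakers, and the wording leaves room for interpretation. The sketch below shows one possible reading, not the patent's definitive procedure: rows that share a sequence are treated as one group, the complement filter is always applied, and each later tie-breaker is applied only while more than one row remains in the group. The text does not define what counts as a complete conceptual name, so the predicate `is_complete_name` is supplied by the caller; the function name, the sample rows and the sample predicate are all hypothetical. The sketch covers only the reduction of the conflicting rows, not the subsequent deletion from the second feature sequence table.

```python
from collections import defaultdict


def reduce_name_conflicts(first_name_table, is_complete_name):
    """Apply the complement, start and name tie-breakers group by group."""
    groups = defaultdict(list)
    for row in first_name_table:
        groups[row["sequence"]].append(row)

    kept = []
    for group in groups.values():
        # Stage 1: keep only rows whose complement is 1.
        group = [r for r in group if r["complement"] == 1]
        # Stage 2: if several rows remain, keep those with the smallest start.
        if len(group) > 1:
            smallest = min(r["start"] for r in group)
            group = [r for r in group if r["start"] == smallest]
        # Stage 3: if several rows still remain, keep "complete" names only.
        if len(group) > 1:
            group = [r for r in group if is_complete_name(r["name"])]
        kept.extend(group)
    return kept


# Hypothetical rows that share one sequence but carry different names.
rows = [
    {"feature_id": 1, "name": "lacI", "complement": 1, "start": 5, "sequence": "ATGAAA"},
    {"feature_id": 2, "name": "LacI repressor", "complement": 1, "start": 5, "sequence": "ATGAAA"},
    {"feature_id": 3, "name": "lacI coding region", "complement": 1, "start": 40, "sequence": "ATGAAA"},
    {"feature_id": 4, "name": "lacI rev", "complement": 0, "start": 1, "sequence": "ATGAAA"},
]


def has_several_words(name):
    """Stand-in predicate: treat multi-word names as 'complete' (assumption)."""
    return " " in name


second_name_rows = reduce_name_conflicts(rows, has_several_words)
```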

Step 105: in the second feature sequence table, query the feature sequences whose sequence and name are both identical but whose types differ, and store them in a first type data table.

Step 106: with reference to the first type data table, delete feature sequences from the second name data table to obtain a second type data table.

Specifically, first query and retain the feature sequences whose complement is 1 and delete the other feature sequences, obtaining a first sub-type data table. If the feature sequences in the first sub-type data table are all identical, query and retain in the first sub-type data table the feature sequences whose type meets a preset condition; then, in the second name data table, retain the feature sequences retained in the first sub-type data table and delete the other feature sequences, obtaining the second type data table.

If different feature sequences exist in the first sub-type data table, the first sub-type data table is used as the second type data table.

In this embodiment, the preset condition is a priority order over multiple types, as shown in Table 1:

Table 1: Priority order of feature sequence types

Retain      Delete
bioBrick    cds, dna, binding, promoter
cds         misc, protein
dna         cds, misc
protein     mutation, misc
binding     operater
operator    stem_loop
promoter    operator, mutation
rbs         mutation, binding
mutation    start, end
misc        binding, mutation, promoter, stem_loop, rbs

In Table 1, the feature sequence types in the Retain column have higher priority than the types in the Delete column of the same row. That is, when the queried feature sequences include both a Retain-column type and one of its Delete-column types, the feature sequences with the Delete-column type are deleted and only the feature sequences with the Retain-column type are kept.
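The priority rule can be sketched as a lookup from each Retain-column type to the set of types it displaces. In the Python below, the dictionary transcribes Table 1 as reconstructed from the text, with the spelling `operater` kept as it appears in the source. Applying all rules at once to a group of rows that share sequence and name is an assumption of this sketch, since the text does not say how chained rules interact; the function name and sample rows are hypothetical.

```python
# Retain-column type -> Delete-column types, transcribed from Table 1.
TYPE_PRIORITY = {
    "bioBrick": {"cds", "dna", "binding", "promoter"},
    "cds": {"misc", "protein"},
    "dna": {"cds", "misc"},
    "protein": {"mutation", "misc"},
    "binding": {"operater"},
    "operator": {"stem_loop"},
    "promoter": {"operator", "mutation"},
    "rbs": {"mutation", "binding"},
    "mutation": {"start", "end"},
    "misc": {"binding", "mutation", "promoter", "stem_loop", "rbs"},
}


def filter_by_type_priority(group):
    """Drop rows whose type is displaced by another type present in the group."""
    present = {row["type"] for row in group}
    displaced = set()
    for feature_type in present:
        displaced |= TYPE_PRIORITY.get(feature_type, set())
    return [row for row in group if row["type"] not in displaced]


# Hypothetical rows sharing sequence and name but not type.
group = [
    {"feature_id": 1, "name": "lacI", "type": "cds", "sequence": "ATGAAA"},
    {"feature_id": 2, "name": "lacI", "type": "misc", "sequence": "ATGAAA"},
]
kept = filter_by_type_priority(group)
```

A group whose types do not appear together in any row of the table is returned unchanged.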

Step 107: in the first feature sequence table, query separately the feature sequences whose sequence, type and name are all identical but whose complement differs; those whose sequence, type and name are all identical but whose start differs; and those whose sequence, type and name are all identical but whose part ID differs. Retain one feature sequence of each kind, and store those retained feature sequences whose sequences differ in the second type data table, obtaining a third feature sequence table.

Step 108: rename the feature sequences in the third feature sequence table to obtain a fourth feature sequence table.

Specifically, the rule for renaming the feature sequences in the third feature sequence table is: combine the name and the feature ID of each feature sequence in the third feature sequence table to obtain a new name, which yields the fourth feature sequence table. The fourth feature sequence table contains comprehensive, non-redundant feature sequences, so that new biological parts can be built from them.
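The renaming rule combines two existing fields, which can be sketched in a few lines of Python. The text does not state how the name and the feature ID are joined, so the underscore separator is an assumption, as are the function name and the sample rows.

```python
def rename_features(rows, separator="_"):
    """Return copies of the rows whose name is '<name><separator><feature_id>'."""
    return [
        {**row, "name": f"{row['name']}{separator}{row['feature_id']}"}
        for row in rows
    ]


# Hypothetical rows of the third feature sequence table.
third_table = [
    {"feature_id": 12, "name": "lacI", "type": "cds", "sequence": "ATGAAA"},
    {"feature_id": 57, "name": "lacI", "type": "cds", "sequence": "ATGAAG"},
]
fourth_table = rename_features(third_table)
```

Because feature IDs are distinct per row, two feature sequences that shared a name end up with distinct names after this step.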

Based on the same inventive concept, an embodiment of the invention provides a synthetic biology data processing system, as shown in Fig. 4. Because the system solves the technical problem on the same principle as the synthetic biology data processing method, the implementation of the system may refer to the implementation of the method, and repeated details are omitted.

A data acquisition module 200, configured to obtain data from the iGEM database using the Perl language and store the data in a raw data table. The raw data table stores the information of each biological part: part ID (parts_id), name (name), description (definition), sequence (sequence), length (length) and status (status).

A part splitting module 201, configured to split each biological part in the raw data table to obtain multiple feature sequences and store them in a first feature sequence table. The first feature sequence table stores the information of each feature sequence: feature ID (feature_id), original ID (origin_id), name (name), type (type), complement (complement), start (start), end (end), sequence (sequence) and part ID (parts_id).

A first data screening module 202, configured to query, in the first feature sequence table, the feature sequences whose sequence, name and type are all identical, to retain one feature sequence for each distinct combination of sequence, name and type, and to store it in a second feature sequence table.

A second data screening module 203, configured to query, in the second feature sequence table, the feature sequences whose sequences are identical but whose names differ, and to store them in a first name data table.

A first data deletion module 204, configured to first query and retain the feature sequences whose complement is 1 and delete the other feature sequences, obtaining a first sub-name data table; if the feature sequences in the first sub-name data table are all identical, to query and retain in it the feature sequences that have the same complement and the smallest start and delete the others, obtaining a second sub-name data table; and if the feature sequences in the second sub-name data table are all identical, to query and retain in it the feature sequences that have a complete conceptual name and delete the others, obtaining a second name data table. If different feature sequences exist in the first sub-name data table, the first sub-name data table is used as the second name data table; if different feature sequences exist in the second sub-name data table, the second sub-name data table is used as the second name data table.

A third data screening module 205, configured to query, in the second feature sequence table, the feature sequences whose sequence and name are both identical but whose types differ, and to store them in a first type data table.

A second data deletion module 206, configured to first query and retain the feature sequences whose complement is 1 and delete the other feature sequences, obtaining a first sub-type data table; and if the feature sequences in the first sub-type data table are all identical, to query and retain in the first sub-type data table the feature sequences whose type meets the preset condition, and to retain in the second name data table the feature sequences retained in the first sub-type data table and delete the others, obtaining a second type data table. If different feature sequences exist in the first sub-type data table, the first sub-type data table is used as the second type data table.

A fourth data screening module 207, configured to query separately, in the first feature sequence table, the feature sequences whose sequence, type and name are all identical but whose complement differs; those whose sequence, type and name are all identical but whose start differs; and those whose sequence, type and name are all identical but whose part ID differs; to retain one feature sequence of each kind; and to store those retained feature sequences whose sequences differ in the second type data table, obtaining a third feature sequence table.

A data renaming module 208, configured to rename the feature sequences in the third feature sequence table to obtain a fourth feature sequence table.

It should be understood that the modules of the above synthetic biology data processing system are only a logical division according to the functions the system implements; in practice, the modules may be combined or split. The functions implemented by the system of this embodiment correspond one-to-one to the synthetic biology data processing method of the embodiment above. The more detailed processing flow of the system has been described in the method embodiment above and is not repeated here.

Those skilled in the art will understand that embodiments of the invention may be provided as a method, a system, or a computer program product. The invention may therefore take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware. Moreover, the invention may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to magnetic disk storage, CD-ROM and optical storage) that contain computer-usable program code.

The invention is described with reference to flow charts and/or block diagrams of the method, device (system) and computer program product according to embodiments of the invention. It should be understood that computer program instructions can implement each flow and/or block in the flow charts and/or block diagrams, as well as combinations of flows and/or blocks. These computer program instructions may be provided to the processor of a general-purpose computer, special-purpose computer, embedded processor or other programmable data processing device to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing device produce means for implementing the functions specified in one or more flows of the flow chart and/or one or more blocks of the block diagram.

These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing device to operate in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means that implement the functions specified in one or more flows of the flow chart and/or one or more blocks of the block diagram.

These computer program instructions may also be loaded onto a computer or other programmable data processing device, so that a series of operational steps is executed on the computer or other programmable device to produce computer-implemented processing. The instructions executed on the computer or other programmable device thereby provide steps for implementing the functions specified in one or more flows of the flow chart and/or one or more blocks of the block diagram.

Although preferred embodiments of the invention have been described, those skilled in the art can make further changes and modifications to these embodiments once they grasp the basic inventive concept. The appended claims are therefore intended to be construed as covering the preferred embodiments and all changes and modifications that fall within the scope of the invention.

Obviously, those skilled in the art can make various changes and variations to the invention without departing from its spirit and scope. If such modifications and variations fall within the scope of the claims of the invention and their technical equivalents, the invention is intended to cover them as well.

Claims (8)

1. A synthetic biology data processing method, characterized by comprising the following steps:
obtaining data from iGEM and storing the data in a raw data table;
splitting the parts in the raw data table to obtain a plurality of feature sequences and storing them in a first feature sequence table, wherein each record in the first feature sequence table includes feature ID, name, type, complement, start, sequence and part ID information;
querying, in the first feature sequence table, feature sequences whose sequence, name and type are all identical, retaining one of each, and storing the result in a second feature sequence table;
querying, in the second feature sequence table, feature sequences whose sequences are identical but whose names differ, and storing them in a first name data table;
querying and retaining, in the first name data table, the feature sequences whose complement is 1, and deleting the other feature sequences, to obtain a first sub-name data table;
if the feature sequences in the first sub-name data table are all identical, querying and retaining, in the first sub-name data table, the feature sequences that have the same complement and the smallest start, and deleting the other feature sequences, to obtain a second sub-name data table;
if the feature sequences in the second sub-name data table are all identical, querying and retaining, in the second sub-name data table, the feature sequences that have a complete conceptual name, and deleting the other feature sequences, to obtain a second name data table;
querying, in the second feature sequence table, feature sequences whose sequence and name are both identical but whose types differ, and storing them in a first type data table;
querying and retaining, in the first type data table, the feature sequences whose complement is 1, and deleting the other feature sequences, to obtain a first sub-type data table;
if the feature sequences in the first sub-type data table are all identical, querying and retaining, in the first sub-type data table, the feature sequences whose type meets a preset condition; retaining, in the second name data table, the feature sequences retained in the first sub-type data table and deleting the other feature sequences, to obtain a second type data table;
querying separately, in the first feature sequence table, feature sequences whose sequence, name and type are all identical but whose complement differs, whose start differs, and whose part ID differs, and retaining one of each; storing those retained feature sequences whose sequences differ in the second type data table, to obtain a third feature sequence table;
renaming the feature sequences in the third feature sequence table to obtain a fourth feature sequence table.
2. the method as described in claim 1, which is characterized in that the step further include:
If in the first sub- name data table, there are different characteristic sequences, using the described first sub- name data table as institute State the second title tables of data;Or
If in the second sub- name data table, there are different characteristic sequences, using the described second sub- name data table as institute State the second title tables of data.
3. The method of claim 1, characterized in that the method further comprises:
If different characteristic sequences exist in the first subtype data table, using the first subtype data table as the second type data table.
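The type-resolution step of claims 1 and 3 follows the same pattern in shortened form. The sketch below assumes one group of rows sharing a sequence and a name, and again reads "all the same" as "more than one candidate remains". The preset condition is not spelled out in the claims, so it is passed in as a hypothetical predicate `type_meets_condition`; the set of preferred types in the example is invented for illustration. The cross-check against the second name data table is omitted.

```python
def resolve_type_group(group, type_meets_condition):
    """Resolve a type conflict within one group of same-sequence, same-name rows."""
    # Retain rows whose complementary value is 1 (first subtype data table).
    candidates = [r for r in group if r["complementary"] == 1]
    if len(candidates) <= 1:
        return candidates  # claim 3: use the first subtype data table as-is
    # Otherwise retain rows whose type meets the preset condition.
    return [r for r in candidates if type_meets_condition(r["type"])]

# Hypothetical preset condition: the type must belong to a preferred set.
preferred_types = {"promoter", "rbs", "terminator"}

group = [
    {"name": "p1", "type": "promoter", "complementary": 1, "sequence": "TTGACA"},
    {"name": "p1", "type": "misc", "complementary": 1, "sequence": "TTGACA"},
    {"name": "p1", "type": "other", "complementary": 0, "sequence": "TTGACA"},
]
resolved = resolve_type_group(group, lambda t: t in preferred_types)
```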
4. The method of claim 1, characterized in that the step of renaming the characteristic sequences in the third feature sequence table to obtain the fourth feature sequence table specifically comprises:
Combining the name and the feature serial number of each data item in the third feature sequence table to obtain a new name, thereby obtaining the fourth feature sequence table.
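The renaming of claim 4 amounts to building a new name from two existing fields. A minimal sketch follows, assuming dictionary rows with hypothetical `name` and `feature_id` fields; the underscore separator and the order of the two parts are also assumptions, since the claim only says the name and the feature serial number are combined.

```python
def rename_rows(rows, separator="_"):
    """Return copies of rows whose name is the old name joined with the feature serial number."""
    return [
        {**row, "name": f'{row["name"]}{separator}{row["feature_id"]}'}
        for row in rows
    ]

# Made-up rows standing in for the third feature sequence table.
third_table = [
    {"feature_id": 1, "name": "p1", "sequence": "TTGACA"},
    {"feature_id": 4, "name": "p1", "sequence": "TATAAT"},
]
fourth_table = rename_rows(third_table)
```

Because feature serial numbers are distinct per row, two rows that shared a name end up with different names after the combination.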
5. A synthetic biology data processing system, characterized by comprising:
A data acquisition module, configured to obtain data from iGEM and store the data in a raw data table;
A component splitting module, configured to split the components in the raw data table into multiple characteristic sequences and store them in a first feature sequence table, wherein each data item in the first feature sequence table includes feature serial number, name, type, complementary, start, sequence and part serial number information;
A first data screening module, configured to query the first feature sequence table for characteristic sequences whose sequence, name and type are all identical, retain one of each, and store them in a second feature sequence table;
A second data screening module, configured to query the second feature sequence table for characteristic sequences whose sequences are identical but whose names differ, and store them in a first name data table;
A first data deletion module, configured to first query and retain the characteristic sequences whose complementary value is 1, delete the other characteristic sequences, and obtain a first sub-name data table; if the characteristic sequences in the first sub-name data table are all the same, to query the first sub-name data table and retain the characteristic sequences with identical complementary values and the smallest start, delete the other characteristic sequences, and obtain a second sub-name data table; and if the characteristic sequences in the second sub-name data table are all the same, to query the second sub-name data table and retain the characteristic sequence that has a complete concept name, delete the other characteristic sequences, and obtain a second name data table;
A third data screening module, configured to query the second feature sequence table for characteristic sequences whose sequence and name are both identical but whose types differ, and store them in a first type data table;
A second data deletion module, configured to first query and retain the characteristic sequences whose complementary value is 1, delete the other characteristic sequences, and obtain a first subtype data table; and if the characteristic sequences in the first subtype data table are all the same, to query the first subtype data table and retain the characteristic sequences whose type meets a preset condition, retain in the second name data table the characteristic sequences retained in the first subtype data table, delete the other characteristic sequences, and obtain a second type data table;
A fourth data screening module, configured to separately query the first feature sequence table for characteristic sequences whose sequence, name and type are all identical but whose complementary values differ, whose start positions differ, or whose part serial numbers differ, retain one of each, and store the characteristic sequences whose sequences differ in the second type data table, obtaining a third feature sequence table;
A data renaming module, configured to rename the characteristic sequences in the third feature sequence table to obtain a fourth feature sequence table.
6. The system of claim 5, characterized in that the first data deletion module, when different characteristic sequences exist in the first sub-name data table, uses the first sub-name data table as the second name data table; or
When different characteristic sequences exist in the second sub-name data table, uses the second sub-name data table as the second name data table.
7. The system of claim 5, characterized in that the second data deletion module, when different characteristic sequences exist in the first subtype data table, uses the first subtype data table as the second type data table.
8. The system of claim 5, characterized in that the data renaming module combines the name and the feature serial number of each data item in the third feature sequence table to obtain a new name, thereby obtaining the fourth feature sequence table.
CN201610077123.4A 2016-01-30 2016-01-30 Synthetic biology data processing method and its processing system CN105787290B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610077123.4A CN105787290B (en) 2016-01-30 2016-01-30 Synthetic biology data processing method and its processing system

Publications (2)

Publication Number Publication Date
CN105787290A CN105787290A (en) 2016-07-20
CN105787290B true CN105787290B (en) 2018-12-18

Family

ID=56402493

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610077123.4A CN105787290B (en) 2016-01-30 2016-01-30 Synthetic biology data processing method and its processing system

Country Status (1)

Country Link
CN (1) CN105787290B (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101714187A (en) * 2008-10-07 2010-05-26 中国科学院计算技术研究所 Index acceleration method and corresponding system in scale protein identification
US9177101B2 (en) * 2010-08-31 2015-11-03 Annai Systems Inc. Method and systems for processing polymeric sequence data and related information

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8725418B2 (en) * 2002-03-25 2014-05-13 Janssen Pharmaceutica, N.V. Data mining of SNP databases for the selection of intragenic SNPs
US8108384B2 (en) * 2002-10-22 2012-01-31 University Of Utah Research Foundation Managing biological databases

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
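A graph-theory-based algorithm for computing representative sequences of protein databases; Liu Pengfei (刘鹏飞) et al.; Computers and Applied Chemistry (《计算机与应用化学》); 20080528; Vol. 25, No. 5; full text *
A method for handling data redundancy in biological databases; Guo Jiankui (郭建奎) et al.; Computer Science (《计算机科学》); 20041231; Vol. 31, No. 10; full text *
Construction of a non-redundant Exon/Intron database of the human genome; Luo Dongmei (罗冬梅) et al.; Journal of South China Normal University (《华南师范大学学报》); 20101130; Vol. 2010, No. 4; full text *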

Also Published As

Publication number Publication date
CN105787290A (en) 2016-07-20

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant