CN104484750B - Method and system for automatically matching product parameters for bioinformatics projects - Google Patents

Info

Publication number
CN104484750B
Authority
CN
China
Prior art keywords
project
sub
sample
quality control
analysis
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201410742454.6A
Other languages
Chinese (zh)
Other versions
CN104484750A (en)
Inventor
苏海桥
李卡麟
徐伟玲
黄泽辉
石俊杰
郑媛
梁绍光
刘娜
李国庆
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
BGI Technology Solutions Co Ltd
Original Assignee
BGI Technology Solutions Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by BGI Technology Solutions Co Ltd
Priority to CN201410742454.6A
Publication of CN104484750A
Application granted
Publication of CN104484750B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B45/00 ICT specially adapted for bioinformatics-related data visualisation, e.g. displaying of maps or networks
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B40/00 ICT specially adapted for biostatistics; ICT specially adapted for bioinformatics-related machine learning or data mining, e.g. knowledge discovery or pattern finding
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B50/00 ICT programming tools or database systems specially adapted for bioinformatics
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/06 Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling

Landscapes

  • Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Biophysics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • General Health & Medical Sciences (AREA)
  • Evolutionary Biology (AREA)
  • Biotechnology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Theoretical Computer Science (AREA)
  • Bioethics (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Software Systems (AREA)
  • Public Health (AREA)
  • Evolutionary Computation (AREA)
  • Epidemiology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Artificial Intelligence (AREA)
  • Investigating Or Analysing Biological Materials (AREA)

Abstract

Disclosed is a method for automatically matching product parameters for a bioinformatics project, comprising the following steps. When the sub-project is of the filter-only type, a preset default parameter configuration is automatically matched according to the sample situation, and the off-machine data of each sample are filtered and analyzed against a unified filter standard once sequencing on the sequencer is complete, generating an analysis result. When the sub-project is of the standardized type, standard workflow analyses are created for the corresponding samples while they are being sequenced, and during the creation of each standard workflow analysis the user enters the corresponding alignment parameters and filter parameters according to the sample situation of the current sub-project. Each sample's data are then filtered and aligned using the preset filter parameters, which are automatically matched according to the sample situation, and the alignment parameters, so that sample data that do not meet the alignment parameters are removed. Finally, the created standard workflow analyses are run on the sample data that meet the filter parameters and have been aligned, generating an analysis result.

Description

Method and system for automatically matching product parameters for bioinformatics projects
Technical field
The present invention relates to the field of bioinformatics analysis, and in particular to a method and system for automatically matching product parameters for bioinformatics projects.
Background
With the rapid development of experimental techniques in the life sciences, scientific instruments have become increasingly automated and intelligent, and data output capacity has made a qualitative leap. At the same time, the life sciences place higher demands on analysis and testing in terms of sample size, analysis cycle, analysis items, and data accuracy, and the information produced by biology laboratories is growing geometrically.
In a traditional biology laboratory, data types are varied and formats are inconsistent, so storing, exchanging, querying, analyzing, and maintaining data are all inconvenient, which seriously hinders the exchange of information between researchers. Massively parallel sequencing (also called next-generation sequencing or high-throughput sequencing, NGS) experiments and bioinformatics analysis involve many stages, such as DNA library construction, genome sequencing, data processing, result analysis, output of results, and data sharing. Different technical staff take part in each stage, so information loss or inefficiency can occur when work is handed over between stages. This applies especially to the sequencing and high-performance computing stages of bioinformatics: a professional sequencing laboratory must receive a large number of sequencing orders, schedule sequencing experiments, and promptly process sequencing results that are produced at high speed.
In the bioinformatics stage after sequencing is complete, every project is handled by the project team of the corresponding service line. A service line handles filter-only work, standardized work, and personalized work. Each project also passes through (1) project initiation by project management, (2) approval by the person in charge of information analysis, (3) confirmation by the information analysis executor, (4) scheduling and running of the workflow, (5) report writing, (6) delivery, and (7) communication with project management, all of which consumes the service line's already limited resources.
For each type of project workflow, the parameters are set independently by the corresponding service line and must be set again for every run. This causes the following problems: (1) differences in sample number lead to different parameter settings; (2) factors such as sample IDs that the large-scale computer does not recognize cause errors during parameter setting.
Summary of the invention
The object of the present invention is to provide a method and system for automatically matching product parameters for bioinformatics projects. Parameters are associated directly with the product type, the sample situation of the off-machine data is matched automatically, and paths are found automatically, which reduces both the errors that manual settings can introduce and the problem of differing standards between service lines.
The present invention provides a method for automatically matching product parameters for a bioinformatics project, comprising the steps of:
Step 1: Create a project and store it in a business management system, each project comprising multiple sub-projects; and select the sub-projects and task information in the created project. The types of sub-project include filter-only sub-projects and standardized sub-projects.
Step 2: When the sub-project is of the filter-only type, obtain in turn, according to the selected sub-project type and task information, the corresponding sample data that have completed sequencing from the off-machine data management system. Each time a sample's data are obtained, they are filtered and analyzed against a unified filter standard using the preset default parameter configuration that is automatically matched according to the sample situation, so that sample data that do not meet the default parameter configuration are filtered out. After all obtained sample data have been filtered and analyzed, an analysis result is generated; the analysis result includes sub-project information and the corresponding sample information.
Step 3: When the sub-project is of the standardized type, then while the samples corresponding to the sub-project are being sequenced on the sequencer, create for the samples one or more standard workflow analyses from among filter analysis, expression profile quantitative analysis, differential comparison analysis, Cluster clustering analysis, microRNA target prediction analysis, KOGO analysis, and base editing analysis. During the creation of each standard workflow analysis, the user enters the corresponding alignment parameters and filter parameters according to the sample situation of the current sub-project. After sequencing is complete, each sample's data are filtered and aligned using the preset filter parameters, which are automatically matched according to the sample situation, and the alignment parameters, so that sample data that do not meet the alignment parameters are removed. The created standard workflow analyses are then run on the sample data that meet the filter parameters and have been aligned; that is, each sample's data are filtered and analyzed using the default parameters, the entered alignment parameters, and the filter standard corresponding to the created standard workflow analysis, so that samples that do not meet the default parameters and alignment parameters are filtered out. After all sample data have been filtered and analyzed, an analysis result is generated; the analysis result includes sub-project information and the corresponding sample information.
Step 4: Perform comparative QC on the analysis result using the preset filter/QC parameters that are automatically matched according to the sample situation. If QC passes, output the analysis result directly. If QC fails and the gap between the analysis result and the QC standard is within the threshold range, update the sample data or the filter/QC parameters and repeat the filtering and analysis of Step 2 or Step 3 until the analysis result passes QC. If QC fails and the gap between the analysis result and the QC standard exceeds the threshold, edit the sample and discard the relevant Lane, and place the order again in the business management system.
As an improvement of the above technical solution, the summary information of each sub-project includes the sub-project code, sub-project name, sub-project type, whether it is filter-only, total number of samples, executor, start and end times, sub-project status, and sub-project-related operations.
As an improvement of the above technical solution, the sample information includes the sample ID, library name, Lane ID, sequencing strategy, Flowcell ID, Raw data, Raw Reads, Read Length, GC%, Q20%, Q30%, Error Rate, a base distribution plot, and a base quality distribution plot.
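The sample information fields listed above could be held in a simple record type. The following is a minimal sketch, not the patented implementation: the class name, field names, field types, the unit of `raw_data`, and the example values are all assumptions made for illustration.

```python
from dataclasses import asdict, dataclass


@dataclass
class SampleInfo:
    """One row of sample information, following the field list in the text."""
    sample_id: str
    library_name: str
    lane_id: str
    sequencing_strategy: str
    flowcell_id: str
    raw_data: int                 # unit not stated in the source; bases assumed
    raw_reads: int
    read_length: int
    gc_percent: float
    q20_percent: float
    q30_percent: float
    error_rate: float
    base_distribution_plot: str   # assumed: path to an image file
    base_quality_plot: str        # assumed: path to an image file


# Illustrative record; every value below is made up.
example = SampleInfo(
    sample_id="S001",
    library_name="LIB-A",
    lane_id="L01",
    sequencing_strategy="SE100",
    flowcell_id="FC-0001",
    raw_data=100_000_000,
    raw_reads=1_000_000,
    read_length=100,
    gc_percent=48.5,
    q20_percent=97.2,
    q30_percent=92.1,
    error_rate=0.03,
    base_distribution_plot="S001.base.png",
    base_quality_plot="S001.qual.png",
)
```

A record of this kind can be turned into a plain dictionary with `asdict(example)` when it has to be attached to an analysis result.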
As an improvement of the above technical solution, the method further comprises:
Step 5: Store and back up the analysis result.
As an improvement of the above technical solution, in Step 4, if QC fails and the gap between the analysis result and the QC standard is within the threshold range, the sample data may be updated either by editing a single sample's data or by editing samples in batch.
The invention also discloses a system for automatically matching product parameters for a bioinformatics project, comprising:
A creating unit, for creating a project and storing it in a business management system, each project comprising multiple sub-projects, and for selecting the sub-projects and task information in the created project. The types of sub-project include filter-only sub-projects and standardized sub-projects.
A first filter analysis unit, for, when the sub-project is of the filter-only type, obtaining in turn, according to the selected sub-project type and task information, the corresponding sample data that have completed sequencing from the off-machine data management system. Each time a sample's data are obtained, they are filtered and analyzed against a unified filter standard using the preset default parameter configuration that is automatically matched according to the sample situation, so that sample data that do not meet the default parameter configuration are filtered out. After all obtained sample data have been filtered and analyzed, an analysis result is generated; the analysis result includes sub-project information and the corresponding sample information.
A second filter analysis unit, for, when the sub-project is of the standardized type, creating for the corresponding samples, while they are being sequenced on the sequencer, one or more standard workflow analyses from among filter analysis, expression profile quantitative analysis, differential comparison analysis, Cluster clustering analysis, microRNA target prediction analysis, KOGO analysis, and base editing analysis. During the creation of each standard workflow analysis, the user enters the corresponding alignment parameters and filter parameters according to the sample situation of the current sub-project. After sequencing is complete, each sample's data are filtered and aligned using the preset filter parameters, which are automatically matched according to the sample situation, and the alignment parameters, so that sample data that do not meet the alignment parameters are removed. The created standard workflow analyses are then run on the sample data that meet the filter parameters and have been aligned, generating an analysis result; the analysis result includes sub-project information and the corresponding sample information.
A QC unit, for performing comparative QC on the analysis result using the preset filter/QC parameters that are automatically matched according to the sample situation. If QC passes, the analysis result is output directly. If QC fails and the gap between the analysis result and the QC standard is within the threshold range, the sample data or the filter/QC parameters are edited again and the filtering and analysis of Step 2 or Step 3 are repeated until the analysis result passes QC. If QC fails and the gap between the analysis result and the QC standard exceeds the threshold, the sample is edited and the relevant Lane discarded, and the order is placed again in the business management system.
As an improvement of the above technical solution, the summary information of each sub-project includes the sub-project code, sub-project name, sub-project type, whether it is filter-only, total number of samples, executor, start and end times, sub-project status, and sub-project-related operations.
As an improvement of the above technical solution, the sample information includes the sample ID, library name, Lane ID, sequencing strategy, Flowcell ID, Raw data, Raw Reads, Read Length, GC%, Q20%, Q30%, Error Rate, a base distribution plot, and a base quality distribution plot.
As an improvement of the above technical solution, the system further comprises:
A storage unit, for storing and backing up the analysis results that have passed QC.
As an improvement of the above technical solution, in the QC unit, if QC fails and the gap between the analysis result and the QC standard is within the threshold range, the sample data may be updated either by editing a single sample's data or by editing samples in batch.
Compared with the prior art, the method and system for automatically matching product parameters for bioinformatics projects disclosed by the present invention have the following advantages. Parameters are associated directly with the product type, the sample situation of the off-machine data is matched automatically, and paths are found automatically. The parameters of all service lines are thereby consolidated, standardized, and unified, which reduces both the errors that manual settings can introduce and the problem of differing standards between service lines.
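The idea of matching a preset parameter configuration to the product type and sample situation can be illustrated with a lookup table. This is a minimal sketch under stated assumptions: the source does not say which attributes make up the "sample situation", so the `SampleSituation` fields, the parameter names, and all threshold values below are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class SampleSituation:
    """Hypothetical description of a sample; the source does not list these fields."""
    product_type: str          # "filter_only" or "standardized"
    sequencing_strategy: str   # e.g. "PE100" (illustrative value)


# Hypothetical preset table, maintained once for all service lines.
PRESET_PARAMS: Dict[Tuple[str, str], Dict[str, float]] = {
    ("filter_only", "PE100"): {"min_q20_percent": 90.0, "max_n_ratio": 0.05},
    ("standardized", "PE100"): {"min_q20_percent": 95.0, "max_n_ratio": 0.01},
}

# Hypothetical fallback used when no preset matches the situation.
FALLBACK_PARAMS: Dict[str, float] = {"min_q20_percent": 85.0, "max_n_ratio": 0.10}


def match_parameters(situation: SampleSituation) -> Dict[str, float]:
    """Return a copy of the preset for this situation, or of the fallback preset."""
    key = (situation.product_type, situation.sequencing_strategy)
    return dict(PRESET_PARAMS.get(key, FALLBACK_PARAMS))
```

Returning a copy means that a later per-sample edit of the matched parameters does not alter the shared presets.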
Brief description of the drawings
Fig. 1 is a flow diagram of a method for automatically matching product parameters for a bioinformatics project in an embodiment of the present invention.
Fig. 2 shows the detailed flow of step S2 in Fig. 1.
Fig. 3 shows the detailed flow of step S3 in Fig. 1.
Fig. 4 shows the detailed flow of step S4 in Fig. 1.
Fig. 5 shows the detailed flow of step S5 in Fig. 1.
Fig. 6 is a structural diagram of a system for automatically matching product parameters for a bioinformatics project in an embodiment of the present invention.
Fig. 7 is a screenshot of a UI page of one embodiment of the system for automatically matching product parameters for a bioinformatics project of the present invention, showing the sub-project selection list.
Fig. 8 is a screenshot of a UI page of one embodiment of the system for automatically matching product parameters for a bioinformatics project of the present invention, showing the summary information of each sub-project.
Fig. 9 is a screenshot of a UI page of one embodiment of the system for automatically matching product parameters for a bioinformatics project of the present invention, showing the parameter setting interface of a filter-only sub-project.
Fig. 10 is a screenshot of a UI page of one embodiment of the system for automatically matching product parameters for a bioinformatics project of the present invention, showing the parameter setting interface and the standard workflow analysis selection interface of a standardized sub-project.
Detailed description of the embodiments
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art on the basis of the embodiments of the present invention without creative effort fall within the scope of protection of the present invention.
Referring to Fig. 1, which is a flow diagram of a method for automatically matching product parameters for a bioinformatics project provided by an embodiment of the present invention, the method comprises the steps of:
S1: Create a project and store it in the business management system, each project comprising multiple sub-projects; and select the sub-projects and task information in the created project. The types of sub-project include filter-only sub-projects and standardized sub-projects.
In this step, the summary information of each selected sub-project includes the sub-project code, sub-project name, sub-project type, whether it is filter-only, total number of samples, executor, start and end times, sub-project status, and sub-project-related operations.
S2: When the sub-project is of the filter-only type, obtain in turn, according to the selected sub-project type and task information, the corresponding sample data that have completed sequencing from the off-machine data management system. Each time a sample's data are obtained, they are filtered and analyzed against a unified filter standard using the preset default parameter configuration that is automatically matched according to the sample situation, so that sample data that do not meet the default parameter configuration are filtered out. After all obtained sample data have been filtered and analyzed, an analysis result is generated; the analysis result includes sub-project information and the corresponding sample information.
In this step, the sample information includes the sample ID, library name, Lane ID, sequencing strategy, Flowcell ID, Raw data, Raw Reads, Read Length, GC%, Q20%, Q30%, Error Rate, a base distribution plot, and a base quality distribution plot.
S3: When the sub-project is of the standardized type, then while the samples corresponding to the sub-project are being sequenced on the sequencer, create for the samples one or more standard workflow analyses from among filter analysis, expression profile quantitative analysis, differential comparison analysis, Cluster clustering analysis, microRNA target prediction analysis, KOGO analysis, and base editing analysis. During the creation of each standard workflow analysis, the user enters the corresponding alignment parameters and filter parameters according to the sample situation of the current sub-project. After sequencing is complete, each sample's data are filtered and aligned using the preset filter parameters, which are automatically matched according to the sample situation, and the alignment parameters, so that sample data that do not meet the alignment parameters are removed. The created standard workflow analyses are then run on the sample data that meet the filter parameters and have been aligned, generating an analysis result; the analysis result includes sub-project information and the corresponding sample information.
S4: Perform comparative QC on the analysis result using the preset filter/QC parameters that are automatically matched according to the sample situation. If QC passes, output the analysis result directly. If QC fails and the gap between the analysis result and the QC standard is within the threshold range, update the sample data or the filter/QC parameters and repeat the filtering and analysis of step S2 or step S3 until the analysis result passes QC. If QC fails and the gap between the analysis result and the QC standard exceeds the threshold, edit the sample and discard the relevant Lane, and place the order again in the business management system.
S5: Store and back up the analysis result.
The filter analysis of the present invention is carried out separately depending on whether the sub-project is a filter-only sub-project or a standardized sub-project, as described in detail below with reference to Fig. 2 and Fig. 3.
As shown in Fig. 2, when the sub-project is of the filter-only type, the process of performing filter analysis on the sample information comprises the steps of:
S201: Detect that a corresponding sequencing-only sample (sample) has come off the machine;
In this step, "off-machine" refers to the sample data obtained after the sample has completed sequencing on the sequencer.
S202: Filter and analyze (run) the sequenced sample using the preset default parameter configuration that is automatically matched according to the sample situation;
In this step, the off-machine data of each sequencing-only sample are processed against a unified filter analysis standard (that is, the automatically matched preset default parameter configuration), so that non-conforming off-machine data are filtered out.
S203: Determine whether all sequenced samples (sample) of the sub-project (project) have completed filtering and analysis (run). If so, go to step S204; otherwise return to step S202;
S204: Generate the analysis result.
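Steps S201 to S204 amount to a loop over the off-machine samples of one sub-project. The sketch below shows that control flow only; the function names, the dictionary-based sample records, and the Q20 threshold in the helper functions are assumptions for illustration and do not come from the source.

```python
from typing import Callable, Dict, Iterable, List


def run_filter_only(
    samples: Iterable[dict],
    match_params: Callable[[dict], dict],
    passes_filter: Callable[[dict, dict], bool],
) -> Dict[str, List[str]]:
    """Filter every off-machine sample of a filter-only sub-project."""
    kept: List[str] = []
    dropped: List[str] = []
    for sample in samples:                     # S201: next off-machine sample
        params = match_params(sample)          # S202: auto-match the preset configuration
        if passes_filter(sample, params):      # S202: apply the unified filter standard
            kept.append(sample["sample_id"])
        else:
            dropped.append(sample["sample_id"])
    # S203: the loop ends only when every sample has been filtered and analyzed.
    return {"kept": kept, "dropped": dropped}  # S204: generate the analysis result


# Illustrative helpers; the threshold value is made up.
def demo_match(sample: dict) -> dict:
    return {"min_q20_percent": 90.0}


def demo_filter(sample: dict, params: dict) -> bool:
    return sample["q20_percent"] >= params["min_q20_percent"]
```

Passing the matching and filtering functions in as arguments keeps the loop independent of any particular preset table or filter standard.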
As shown in Fig. 3, when the sub-project is of the standardized type, the process of performing filter analysis on the sample information comprises the steps of:
S301: Detect that a standardized sample (sample) has been loaded onto the machine;
In this step, "on-machine" refers to loading the sample onto the sequencer for sequencing.
S302: Create one or more standard workflow analyses for the standardized sample, with the user setting the corresponding alignment parameters as each standard workflow analysis is created. The standard workflow analyses include, but are not limited to, filter analysis, expression profile quantitative analysis, differential comparison analysis, Cluster clustering analysis, microRNA target prediction analysis, KOGO analysis, and base editing analysis;
S303: Detect that the selected sample (sample) has come off the machine;
In this step, "off-machine" refers to the sample data obtained after the sample has completed sequencing on the sequencer.
S304: Filter and align each sample's data using the preset filter parameters, which are automatically matched according to the sample situation, and the alignment parameters, so that sample data that do not meet the alignment parameters are removed; then run the created standard workflow analyses on the sample data that meet the filter parameters and have been aligned;
S305: Determine whether all standardized samples (sample) of the sub-project (project) have completed filtering and analysis (run). If so, go to step S306; otherwise return to step S304;
S306: Generate the analysis result.
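The standardized flow differs from the filter-only flow in that analyses are created, with user-entered alignment parameters, before the data come off the machine. The sketch below illustrates that ordering under stated assumptions: `StandardAnalysis`, `run_standardized`, and the metric names used in the usage note are hypothetical, and the analysis itself is reduced to a callable.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class StandardAnalysis:
    """An analysis created in S302, while the sample is still on the sequencer."""
    name: str
    alignment_params: Dict[str, float]    # entered by the user at creation time
    run: Callable[[List[dict]], object]   # the analysis itself, run after S303


def run_standardized(
    samples: List[dict],
    analyses: List[StandardAnalysis],
    filter_params: Dict[str, float],
    passes_filter: Callable[[dict, dict], bool],
    passes_alignment: Callable[[dict, dict], bool],
) -> Dict[str, object]:
    """S304 to S306: filter, align, then run every created analysis."""
    # S304: apply the auto-matched filter parameters once to all samples.
    filtered = [s for s in samples if passes_filter(s, filter_params)]
    results: Dict[str, object] = {}
    for analysis in analyses:
        # S304: drop samples that do not meet this analysis's alignment parameters.
        aligned = [s for s in filtered
                   if passes_alignment(s, analysis.alignment_params)]
        results[analysis.name] = analysis.run(aligned)
    return results                         # S306: generate the analysis result
```

For example, with a made-up `mapping_rate` metric, an analysis created as `StandardAnalysis("filter analysis", {"min_mapping_rate": 0.8}, lambda ss: [s["sample_id"] for s in ss])` would report only the samples that pass both the filter and the alignment check.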
Fig. 4 shows the process of performing QC on the analysis result obtained after filter analysis of any one sample's data in a sub-project. Note that QC is carried out only after the sample data of all samples in the sub-project have been filtered and analyzed, and that QC is then performed on each sample in turn. The process comprises the steps of:
S401: Detect that a sample has completed filter analysis and that an analysis result has been generated;
S402: Perform QC on the analysis result;
Specifically, the preset filter/QC parameters that are automatically matched according to the sample situation are compared with the analysis result in order to perform QC.
S403: Determine whether QC passes. If so, go to step S404; otherwise go to step S405;
S404: Output the analysis result;
S405: Determine whether the gap between the analysis result and the QC standard falls outside the threshold range (that is, the gap is too large). If it does not, go to step S406; otherwise go to step S408;
S406: Update the sample data or the filter/QC parameters;
In this step, a single sample's data can be edited, or samples can be edited in batch.
S407: Filter and analyze the sample data again according to the sub-project type, generate a new analysis result, and return to step S402;
S408: Edit the sample and discard the relevant Lane, and place the order again in the BMS (Business Management System);
S409: Wait for the new sample data to come off the machine, carry out the corresponding filtering and analysis according to the sample's sub-project type, generate an analysis result, and return to step S402;
Then, after QC has been performed on all sample data of a sub-project, a QC report is generated.
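The three-way QC decision in S403 and S405 can be written as a small function. This is a sketch under a simplifying assumption that is not in the source: QC is reduced to a single numeric metric for which higher is better (for instance a Q20 percentage), and the names `QcAction` and `qc_decision` are hypothetical.

```python
from enum import Enum


class QcAction(Enum):
    OUTPUT = "output the analysis result"                # S404
    UPDATE_AND_RERUN = "update data or parameters"       # S406 and S407
    DISCARD_AND_REORDER = "discard lane and re-order"    # S408 and S409


def qc_decision(value: float, standard: float, threshold: float) -> QcAction:
    """Choose the next action for one sample's analysis result.

    Assumes a single higher-is-better metric; `threshold` is the largest
    shortfall below `standard` that can still be repaired by a rerun.
    """
    if value >= standard:                 # S403: QC passes
        return QcAction.OUTPUT
    gap = standard - value                # S405: how far below the standard
    if gap <= threshold:
        return QcAction.UPDATE_AND_RERUN
    return QcAction.DISCARD_AND_REORDER
```

With a standard of 90 and a threshold of 5, a value of 87 would lead to an update and rerun, while a value of 80 would lead to discarding the Lane and placing the order again.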
Referring to Fig. 5, the process of storing and backing up the analysis result of the sample data comprises the steps of:
S501: Analyze the sample data;
S502: Determine whether analysis of the sample is complete. If so, go to step S503; otherwise continue with step S501;
S503: The system makes the backup function available;
S504: The user confirms the backup and clicks "Backup";
S505: The system indicates that the backup request has been submitted;
S506: The system copies the data to the delivery system;
S507: Determine whether the copy succeeded. If so, go to step S509; otherwise go to step S508;
S508: Notify the user that the backup failed, and return to step S504.
S509: Notify the user that the backup succeeded, and end.
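The copy-and-verify part of the backup flow (S506 to S509) can be sketched with the standard library. The following assumes, purely for illustration, that the delivery system is a directory on a reachable file system and that success means the copied file exists with the same size; the function names and the `max_attempts` limit are hypothetical, since in the source each retry is triggered by the user.

```python
import shutil
from pathlib import Path


def backup_result(result_file: Path, delivery_dir: Path) -> bool:
    """S506 and S507: copy one result file and report whether the copy succeeded."""
    try:
        delivery_dir.mkdir(parents=True, exist_ok=True)
        target = delivery_dir / result_file.name
        shutil.copy2(result_file, target)          # S506: copy data and metadata
    except OSError:
        return False                               # S508: the backup failed
    # S507: treat the copy as successful only if the sizes match.
    return target.exists() and target.stat().st_size == result_file.stat().st_size


def backup_with_retry(result_file: Path, delivery_dir: Path,
                      max_attempts: int = 3) -> bool:
    """Repeat the backup; each pass stands in for one user confirmation (S504)."""
    for _ in range(max_attempts):
        if backup_result(result_file, delivery_dir):
            return True                            # S509: the backup succeeded
    return False
```

A size comparison is a cheap check; a checksum comparison would be stricter if the delivery system is on unreliable storage.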
It can be seen that the method for automatically matching product parameters for a bioinformatics project disclosed in this embodiment associates parameters directly with the product type, automatically matches the sample situation of the off-machine data, and automatically finds paths. The parameters of all service lines are thereby consolidated, standardized, and unified, which reduces both the errors that manual settings can introduce and the problem of differing standards between service lines.
Present invention also offers a kind of product parameters automatic patching system of biological information project, as shown in fig. 6, including wound Unit 10, the first filter analysis unit 20, the second filter analysis unit 30, Quality Control unit 40 and storage unit 50 are built, wherein Creating unit 10, the first filter analysis unit 20, the second filter analysis unit 30, Quality Control unit 40 and storage unit 50 can be with It is incorporated into a background server, and front end directly operates on webpage, is operated by user and input parameter, specifically 's:
Creating unit 10, for create project and be stored in business management system (Business Management System, BMS, sequencing and the distribution of information analysis task and management system, contain the organizational informations such as sub-project, person liable, data) in, often A project includes more sub-projects;And select the sub-project and mission bit stream in the establishment project;The type of the sub-project Including only filtering sub-project and normalizer project;
As shown in fig. 7, one embodiment for the product parameters automatic patching system of thing information project that grows directly from seeds for the present invention The screenshot capture of the UI pages, the sectional drawing show the selective listing of sub-project.More sub-projects are shown in the sub-project list, And it is labeled as a filtering items (Y) or standardization project (N) per sub-project.And Fig. 8 is to specifically show a sub-project Summary info.Per sub-project summary info include sub-project code, sub-project title, sub-project type, whether be only Filtering, total sample number, executor, starting and end time, sub-project state and sub-project relevant operation.
First filter analysis unit 20, configured so that, when the sub-project is a filter-only sub-project, the sample data that correspond to the selected sub-project type and task information and have already been sequenced are obtained one after another from the off-machine data management system (Data Management System, DMS, which performs quality monitoring and data management on the off-machine data produced by completed sequencing runs); each time one sample's data is obtained, a preset default parameter configuration is matched automatically according to the sample's characteristics, and the data is filtered and analyzed under a unified filter standard, so that sample data not meeting the default parameter configuration is filtered out; once all obtained sample data has been filtered and analyzed, an analysis result is generated, which includes the sub-project information and the corresponding sample information;
Fig. 9 is a screenshot of a UI page of one embodiment of the product parameter automatic matching system for bioinformatics projects of the present invention; the screenshot shows the parameter-setting interface for a filter-only sub-project.
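The filter-only flow handled by the first filter analysis unit can be sketched as a loop over samples as they arrive. This is an illustrative sketch under stated assumptions: the `product_type` key, the parameter names `min_q20` and `max_error_rate`, and every numeric value are hypothetical, since the description does not specify which parameters the default configuration contains.

```python
from typing import Any, Dict, Iterable, List

# Hypothetical default configurations, keyed by a hypothetical product type
DEFAULT_FILTER_PARAMS: Dict[str, Dict[str, float]] = {
    "RNA-Seq": {"min_q20": 90.0, "max_error_rate": 1.0},
    "small RNA": {"min_q20": 95.0, "max_error_rate": 0.5},
}


def match_default_params(sample: Dict[str, Any]) -> Dict[str, float]:
    """Pick the preset configuration that matches the sample's characteristics."""
    return DEFAULT_FILTER_PARAMS[sample["product_type"]]


def meets_params(sample: Dict[str, Any], params: Dict[str, float]) -> bool:
    """Unified filter standard: the same checks are applied to every sample."""
    return (sample["q20"] >= params["min_q20"]
            and sample["error_rate"] <= params["max_error_rate"])


def run_filter_only(sub_project_info: Dict[str, Any],
                    samples: Iterable[Dict[str, Any]]) -> Dict[str, Any]:
    """Filter each sample as it is obtained, then assemble the analysis result."""
    kept: List[Dict[str, Any]] = []
    for sample in samples:                 # one sample at a time, in order
        params = match_default_params(sample)
        if meets_params(sample, params):
            kept.append(sample)
    return {"sub_project": sub_project_info, "samples": kept}


# Illustrative input only
samples = [
    {"sample_id": "S1", "product_type": "RNA-Seq", "q20": 96.2, "error_rate": 0.3},
    {"sample_id": "S2", "product_type": "RNA-Seq", "q20": 85.0, "error_rate": 0.4},
    {"sample_id": "S3", "product_type": "small RNA", "q20": 97.0, "error_rate": 0.8},
]
result = run_filter_only({"code": "SP001"}, samples)
print([s["sample_id"] for s in result["samples"]])
```

Because `samples` is consumed as an iterable, the same function would work with a generator that yields each sample as soon as it becomes available from the data management system.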
Second filter analysis unit 30, configured so that, when the sub-project is a standardized sub-project, while the samples corresponding to the sub-project are still being sequenced, a standard analysis workflow is created for the sub-project that includes one or more of filter analysis, expression profile quantitative analysis, differential comparison analysis, Cluster clustering analysis, microRNA target prediction analysis, KOGO analysis, and base editing analysis; while each standard analysis workflow is being created, the user enters the corresponding alignment parameters and filter parameters according to the sample characteristics of the current sub-project. After sequencing completes, the preset filter parameters and alignment parameters are matched automatically according to the sample characteristics and used to filter and align each sample's data, so that sample data not meeting the alignment parameters is removed; the created standard analysis workflow is then applied to each sample's data that meets the filter parameters and has been aligned, generating an analysis result that includes the sub-project information and the corresponding sample information. The sample information includes the sample ID, library name, Lane ID, sequencing strategy, Flowcell ID, Raw data, Raw Reads, Read Length, GC%, Q20%, Q30%, Error Rate, a base distribution plot, and a base quality-control distribution plot.
Fig. 10 is a screenshot of a UI page of one embodiment of the product parameter automatic matching system for bioinformatics projects of the present invention; the screenshot shows the parameter-setting interface and the standard analysis workflow selection interface for a standardized sub-project.
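The two phases of a standardized sub-project (workflow creation during sequencing, then filtering, alignment checking, and analysis once sequencing is done) can be sketched as follows. This is a simplified illustration, not the patented implementation: the function names, the parameter names `min_q20` and `min_mapping_rate`, and the placeholder analysis step are assumptions, and the sketch stores the parameters on the workflow rather than modelling the automatic matching step separately.

```python
from typing import Any, Dict, List, Sequence

# The analysis types named in the description
STANDARD_ANALYSES = (
    "filter analysis",
    "expression profile quantitative analysis",
    "differential comparison analysis",
    "Cluster clustering analysis",
    "microRNA target prediction analysis",
    "KOGO analysis",
    "base editing analysis",
)


def create_workflow(selected: Sequence[str],
                    alignment_params: Dict[str, float],
                    filter_params: Dict[str, float]) -> Dict[str, Any]:
    """Phase 1 (during sequencing): choose analyses and record parameters."""
    if not selected:
        raise ValueError("select at least one standard analysis")
    unknown = [name for name in selected if name not in STANDARD_ANALYSES]
    if unknown:
        raise ValueError(f"not a standard analysis: {unknown}")
    return {"analyses": list(selected),
            "alignment_params": dict(alignment_params),
            "filter_params": dict(filter_params)}


def run_workflow(workflow: Dict[str, Any],
                 sub_project_info: Dict[str, Any],
                 samples: Sequence[Dict[str, Any]]) -> Dict[str, Any]:
    """Phase 2 (after sequencing): filter, check alignment, then analyze."""
    fp, ap = workflow["filter_params"], workflow["alignment_params"]
    usable: List[Dict[str, Any]] = [
        s for s in samples
        if s["q20"] >= fp["min_q20"]                      # hypothetical filter check
        and s["mapping_rate"] >= ap["min_mapping_rate"]   # hypothetical alignment check
    ]
    # Placeholder: a real analysis step would compute results per sample
    performed = {name: [s["sample_id"] for s in usable]
                 for name in workflow["analyses"]}
    return {"sub_project": sub_project_info,
            "samples": usable,
            "analyses": performed}


# Illustrative values only
workflow = create_workflow(
    ["filter analysis", "KOGO analysis"],
    alignment_params={"min_mapping_rate": 80.0},
    filter_params={"min_q20": 90.0},
)
report = run_workflow(workflow, {"code": "SP002"}, [
    {"sample_id": "S1", "q20": 95.0, "mapping_rate": 88.0},
    {"sample_id": "S2", "q20": 95.0, "mapping_rate": 60.0},
])
print(report["analyses"])
```

Separating creation from execution reflects the point of the description that the workflow is prepared while the sequencer is still running, so analysis can start as soon as the data is available.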
Quality control unit 40, configured to perform comparison quality control on the analysis result using preset filter/quality-control parameters that are matched automatically according to the sample characteristics. If quality control passes, the analysis result is output directly. If quality control fails and the gap between the analysis result and the quality control standard is within a threshold range, the sample data (editable one sample at a time or in batches) or the filter/quality-control parameters are updated, and the filtering and analysis of the first filter analysis unit 20 or the second filter analysis unit 30 are run again, until the analysis result passes quality control. If quality control fails and the gap between the analysis result and the quality control standard exceeds the threshold, the sample is edited, the related Lane is discarded, and a new order is placed in the business management system; and
Storage unit 50, configured to store and back up the analysis result.
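The three-way decision made by the quality control unit can be sketched as a single function. This is a hedged illustration: the description does not say how the gap between the analysis result and the quality control standard is measured, so the sketch assumes one numeric metric where higher is better and defines the gap as the shortfall below the standard. All names and numbers are hypothetical.

```python
from enum import Enum


class QCOutcome(Enum):
    OUTPUT = "output the analysis result"
    RERUN = "update sample data or parameters, then filter and analyze again"
    REORDER = "edit the sample, discard the lane, and place a new order"


def qc_decide(value: float, standard: float, threshold: float) -> QCOutcome:
    """Classify one QC comparison (assumes higher metric values are better)."""
    if value >= standard:
        return QCOutcome.OUTPUT        # quality control passes
    gap = standard - value             # assumed definition of the gap
    if gap <= threshold:
        return QCOutcome.RERUN         # fails, but within the threshold range
    return QCOutcome.REORDER           # fails by more than the threshold


# Illustrative values: standard 90.0, threshold 5.0
for metric in (92.0, 88.0, 80.0):
    print(metric, qc_decide(metric, standard=90.0, threshold=5.0).name)
```

In a full system the `RERUN` outcome would loop back to the first or second filter analysis unit until the result passes, which is why it is kept distinct from `REORDER`, the only outcome that leaves the analysis loop without a usable result.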
As it can be seen that the product parameters automatic patching system of biological information project disclosed in the present embodiment, passes through direct and product Type association is got up, the sample situation of machine data, Automatic-searching path under Auto-matching.So as to summarize the ginseng of all service lines Number, has reached standardization, unitizes, and reduces the issuable error artificially set, and the various criterion of each service line Problem.
The above is the preferred embodiment of the present invention, it is noted that for those skilled in the art For, various improvements and modifications may be made without departing from the principle of the present invention, these improvements and modifications are also considered as Protection scope of the present invention.

Claims (8)

1. A product parameter automatic matching method for bioinformatics projects, characterized by comprising the steps of:
Step 1: creating a project and storing it in a business management system, each project including multiple sub-projects; and selecting a sub-project and task information in the created project; the sub-project types include filter-only sub-projects and standardized sub-projects; the summary information of each sub-project includes the sub-project code, sub-project name, sub-project type, whether it is filter-only, total sample count, executor, start and end times, sub-project status, and sub-project-related operations;
Step 2: when the sub-project is a filter-only sub-project, obtaining one after another, from the off-machine data management system, the sample data that correspond to the selected sub-project type and task information and have already been sequenced; each time one sample's data is obtained, automatically matching a preset default parameter configuration according to the sample characteristics and filtering and analyzing the data under a unified filter standard, so that sample data not meeting the default parameter configuration is filtered out; and, once all obtained sample data has been filtered and analyzed, generating an analysis result that includes the sub-project information and the corresponding sample information;
Step 3: when the sub-project is a standardized sub-project, while the samples corresponding to the sub-project are being sequenced, creating for the sub-project a standard analysis workflow that includes one or more of filter analysis, expression profile quantitative analysis, differential comparison analysis, Cluster clustering analysis, microRNA target prediction analysis, KOGO analysis, and base editing analysis, the user entering the corresponding alignment parameters and filter parameters according to the sample characteristics of the current sub-project while each standard analysis workflow is being created; after sequencing completes, automatically matching preset filter parameters and alignment parameters according to the sample characteristics and using them to filter and align each sample's data, so that sample data not meeting the alignment parameters is removed; and then applying the created standard analysis workflow to each sample's data that meets the filter parameters and has been aligned, thereby generating an analysis result that includes the sub-project information and the corresponding sample information;
Step 4: performing comparison quality control on the analysis result using preset filter/quality-control parameters matched automatically according to the sample characteristics; if quality control passes, outputting the analysis result directly; if quality control fails and the gap between the analysis result and the quality control standard is within a threshold range, updating the sample data or the filter/quality-control parameters and running the filtering and analysis of step 2 or step 3 again, until the analysis result passes quality control; if quality control fails and the gap between the analysis result and the quality control standard exceeds the threshold, editing the sample, discarding the related Lane, and placing a new order in the business management system.
2. The product parameter automatic matching method for bioinformatics projects according to claim 1, characterized in that the sample information includes the sample ID, library name, Lane ID, sequencing strategy, Flowcell ID, Raw data, Raw Reads, Read Length, GC%, Q20%, Q30%, Error Rate, a base distribution plot, and a base quality-control distribution plot.
3. The product parameter automatic matching method for bioinformatics projects according to claim 1, characterized by further comprising:
Step 5: storing and backing up the analysis result.
4. The product parameter automatic matching method for bioinformatics projects according to claim 1, characterized in that, in step 4, if quality control fails and the gap between the analysis result and the quality control standard is within the threshold range, the sample data is updated either by editing a single sample's data or by editing samples in batches.
5. A product parameter automatic matching system for bioinformatics projects, characterized by comprising:
a creating unit, configured to create a project and store it in a business management system, each project including multiple sub-projects; and to select a sub-project and task information in the created project; the sub-project types include filter-only sub-projects and standardized sub-projects; the summary information of each sub-project includes the sub-project code, sub-project name, sub-project type, whether it is filter-only, total sample count, executor, start and end times, sub-project status, and sub-project-related operations;
a first filter analysis unit, configured so that, when the sub-project is a filter-only sub-project, the sample data that correspond to the selected sub-project type and task information and have already been sequenced are obtained one after another from the off-machine data management system; each time one sample's data is obtained, a preset default parameter configuration is matched automatically according to the sample characteristics and the data is filtered and analyzed under a unified filter standard, so that sample data not meeting the default parameter configuration is filtered out; and, once all obtained sample data has been filtered and analyzed, an analysis result is generated that includes the sub-project information and the corresponding sample information;
a second filter analysis unit, configured so that, when the sub-project is a standardized sub-project, while the samples corresponding to the sub-project are being sequenced, a standard analysis workflow is created for the sub-project that includes one or more of filter analysis, expression profile quantitative analysis, differential comparison analysis, Cluster clustering analysis, microRNA target prediction analysis, KOGO analysis, and base editing analysis, the user entering the corresponding alignment parameters and filter parameters according to the sample characteristics of the current sub-project while each standard analysis workflow is being created; after sequencing completes, preset filter parameters and alignment parameters are matched automatically according to the sample characteristics and used to filter and align each sample's data, so that sample data not meeting the alignment parameters is removed; the created standard analysis workflow is then applied to each sample's data that meets the filter parameters and has been aligned, thereby generating an analysis result that includes the sub-project information and the corresponding sample information;
a quality control unit, configured to perform comparison quality control on the analysis result using preset filter/quality-control parameters matched automatically according to the sample characteristics; if quality control passes, the analysis result is output directly; if quality control fails and the gap between the analysis result and the quality control standard is within a threshold range, the sample data or the filter/quality-control parameters are updated and the filtering and analysis of step 2 or step 3 are run again, until the analysis result passes quality control; if quality control fails and the gap between the analysis result and the quality control standard exceeds the threshold, the sample is edited, the related Lane is discarded, and a new order is placed in the business management system.
6. The product parameter automatic matching system for bioinformatics projects according to claim 5, characterized in that the sample information includes the sample ID, library name, Lane ID, sequencing strategy, Flowcell ID, Raw data, Raw Reads, Read Length, GC%, Q20%, Q30%, Error Rate, a base distribution plot, and a base quality-control distribution plot.
7. The product parameter automatic matching system for bioinformatics projects according to claim 5, characterized by further comprising:
a storage unit, configured to store and back up the analysis result that has passed quality control.
8. The product parameter automatic matching system for bioinformatics projects according to claim 5, characterized in that, in the quality control unit, if quality control fails and the gap between the analysis result and the quality control standard is within the threshold range, the sample data is updated either by editing a single sample's data or by editing samples in batches.
CN201410742454.6A 2014-12-08 2014-12-08 The product parameters automatic matching method and system of biological information project Active CN104484750B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410742454.6A CN104484750B (en) 2014-12-08 2014-12-08 The product parameters automatic matching method and system of biological information project

Publications (2)

Publication Number Publication Date
CN104484750A CN104484750A (en) 2015-04-01
CN104484750B true CN104484750B (en) 2018-04-24

Family

ID=52759291

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410742454.6A Active CN104484750B (en) 2014-12-08 2014-12-08 The product parameters automatic matching method and system of biological information project

Country Status (1)

Country Link
CN (1) CN104484750B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106650319A (en) * 2016-11-15 2017-05-10 上海派森诺生物科技股份有限公司 Automatic filtering method for high-throughout Miseq sequencing data
CN112365928B (en) * 2020-11-16 2021-07-06 赛福解码(北京)基因科技有限公司 Biological information data analysis and result quality control automation method and system

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102982409A (en) * 2012-11-07 2013-03-20 浪潮电子信息产业股份有限公司 Informationalized management design method for information biology high-performance computing platform
CN103324866A (en) * 2013-03-26 2013-09-25 张弘 Ripple system
CN103993069A (en) * 2014-03-21 2014-08-20 深圳华大基因科技服务有限公司 Virus integration site capture sequencing analysis method

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20050273272A1 (en) * 2004-04-22 2005-12-08 Applera Corporation, A Delaware Corporation System and method for laboratory-wide information management
CN103714180A (en) * 2014-01-08 2014-04-09 浪潮(北京)电子信息产业有限公司 Bioinformatics database system and data processing method

Also Published As

Publication number Publication date
CN104484750A (en) 2015-04-01

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant