CN109656910A - Extensible large-scale biomedical sample management and visualization platform - Google Patents

Publication number
CN109656910A
Authority
CN
China
Prior art keywords
platform
sample
user
large scale
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201811487666.9A
Other languages
Chinese (zh)
Other versions
CN109656910B (en)
Inventor
臧天仪
刘春圃
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Harbin Institute of Technology
Original Assignee
Harbin Institute of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Harbin Institute of Technology
Priority to CN201811487666.9A
Publication of CN109656910A
Application granted
Publication of CN109656910B
Legal status: Active
Anticipated expiration

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present invention is an extensible large-scale biomedical sample visualization platform. The platform comprises a sample statistics module, a visualization module, a retrieval module, and a MongoDB database system. A user logs in to the platform and undergoes identity and permission verification; once verified, the user uploads data to add it to the platform. The added data undergoes a quality check, and data that passes the check is stored in the MongoDB database system. The user retrieves sample information held in the MongoDB database through multi-condition combined retrieval and views statistics of the retrieval results. After a search completes, the platform compiles statistics on the sample search results and presents them to the user as visual charts and similar forms. Aimed at massive volumes of biomedical samples, the invention enables more convenient distributed deployment and scalable storage while also offering more convenient operation, greatly improving the efficiency of biomedical sample management and use.

Description

Extensible large-scale biomedical sample management and visualization platform
Technical field
The present invention relates to the field of biology or medicine, and in particular to an extensible large-scale biomedical sample management and visualization platform.
Background art
With the rapid development of modern biomedicine, the volume of biomedical data keeps growing, and a particularly critical part of it is data about biomedical samples. These data are all derived from real organisms and are of great significance to research in medicine and biology. The management of biomedical samples is therefore a problem worth attention.
At present, most domestic platforms for managing or retrieving biology- and medicine-related information are based on literature management, and there is a lack of management platforms dedicated to biomedical samples. The biomedical sample management platform established in this patent can manage sample information better and help researchers make better use of biomedical sample information.
Most existing sample database systems use a relational database such as MySQL. The relational database is a classic kind of database, but it has problems with storage scale and scalability. When massive amounts of data must be stored, it is difficult for a traditional database to support convenient distributed deployment and storage expansion, which limits the data scale it can accommodate. Moreover, once the amount of data in a relational database grows large, the speed and efficiency of operations on it drop considerably, which degrades the experience of using the sample database system. How to use new storage management technology to improve the platform's operating efficiency is therefore a problem worth studying. To solve these problems, this patent adopts a newer non-relational database, which not only makes it very convenient to expand data storage but also improves the efficiency of database operations and the user experience.
In addition, sample quality is uneven in most sample database systems: sample information entering the database lacks corresponding checking and control, which makes it hard for users to obtain truly valuable data. To address this, this patent adds a sample quality check step to the sample storage process, improving the quality of the sample information that enters the database by checking the sample data.
In most sample databases the forms of retrieval are limited: retrieval is possible only by certain items or conditions, and search terms and search forms cannot be chosen freely, so users' varied needs are sometimes hard to meet. Search results also lack corresponding statistics and visualization, so the system cannot give users a vivid, overall grasp of the retrieved samples.
Summary of the invention
To make the management of massive numbers of biomedical samples more convenient and efficient, the present invention provides an extensible large-scale biomedical sample management and visualization platform, with the following technical solutions:
An extensible large-scale biomedical sample management and visualization platform, the platform comprising a sample statistics module, a visualization module, a retrieval module, and a MongoDB database; the platform integrates the MongoDB database, and the MongoDB database is responsible for functions such as storing data and adding sample data.
Preferably, the platform uses XML Schema technology to control data quality for the different data items.
Preferably, the platform uses XML to define sample metadata, and uses files in Excel format as the carrier of sample data information.
Preferably, by adding key-value pairs to the MongoDB database, the amount of data information is increased and operating efficiency is improved.
An operating method for the extensible large-scale biomedical sample management and visualization platform, comprising the following steps:
Step 1: The user logs in to the extensible large-scale biomedical sample management and visualization platform and undergoes identity and permission verification;
Step 2: After the user passes permission verification, the user uploads data to add it to the platform;
Step 3: A quality check is performed on the added data; data that fails the quality check is not stored in the MongoDB database and is returned to the client for the errors to be corrected, while data that passes the quality check is stored in the MongoDB database;
Step 4: The user retrieves sample information held in the MongoDB database through multi-condition combined retrieval and views statistics of the retrieval results;
Step 5: After the search completes, the extensible large-scale biomedical sample management and visualization platform compiles statistics on the sample search results and presents them to the user as visual charts.
Preferably, the multi-condition combined retrieval supports any combination of search terms and joint searches in AND, OR, and NOT form.
The invention has the following advantages:
1. The platform uses MongoDB, a non-relational database oriented toward big data. Compared with a traditional relational database, it can hold a larger volume of sample data and makes distributed deployment and storage expansion more convenient, while also enabling faster operations, which greatly improves the platform's efficiency in use.
2. The platform provides quality control for samples. During sample storage, the sample data in an uploaded sample file can be checked to prevent samples from missing too much data or using the wrong format. This safeguards the quality of samples in the database to a certain extent and lets platform users make better use of the sample data.
3. The platform provides multi-condition combined retrieval of samples. Compared with other management platforms, its retrieval module lets users add search terms freely and supports fuzzy search and AND/OR/NOT logical search, so the platform can better meet users' differing search needs.
4. The platform provides better statistics and visualization of sample data. It has statistics and visualization modules for sample data, which not only provide various types of sample statistics but also let users see the distribution of samples in the platform more intuitively.
Description of the drawings
Fig. 1 is a flowchart of the extensible large-scale biomedical sample management and visualization platform.
Detailed description of the embodiments
The present invention is described in detail below with reference to specific embodiments.
Embodiment 1:
An extensible large-scale biomedical sample management and visualization platform, characterized in that the platform comprises a sample statistics module, a visualization module, a retrieval module, and a MongoDB database, and the platform is connected to the MongoDB database.
The platform uses the non-relational database MongoDB for data storage management; adding sample data and other operations are all handled by MongoDB. Under current requirements the cluster contains only one database server. In future, as the data volume grows and operating-efficiency requirements rise, more database servers can be added to the cluster to expand data storage, and data backup and load balancing can be introduced, further improving system reliability and availability.
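As a rough illustration of why a document store suits this kind of storage, the sketch below builds sample records as flat key-value documents in plain Python. The field names (`sample_id`, `type`, `institution`, `fixation`) and the helper `make_sample_document` are hypothetical and do not come from the patent; no database connection is made.

```python
import json

def make_sample_document(sample_id, core_fields, extra_fields=None):
    """Build one sample record as a flat key-value document (hypothetical layout)."""
    doc = {"sample_id": sample_id}
    doc.update(core_fields)
    # Extra keys may differ from sample to sample; no table schema has to change.
    doc.update(extra_fields or {})
    return doc

blood = make_sample_document("S001", {"type": "blood", "institution": "Hospital A"})
tissue = make_sample_document(
    "S002",
    {"type": "tissue", "institution": "Hospital B"},
    {"fixation": "FFPE"},  # a key that only some samples carry
)

# With a MongoDB driver, each dict could be stored as one document.
print(json.dumps(tissue, ensure_ascii=False))
```

Because each record carries its own keys, a new attribute can be introduced for some samples without altering the records already stored.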
The platform provides quality control for samples. During sample storage, the sample data in an uploaded sample file can be checked to prevent samples from missing too much data or using the wrong format. This safeguards the quality of samples in the database to a certain extent and lets platform users make better use of the sample data.
The platform provides multi-condition combined retrieval of samples. Compared with other management platforms, its retrieval module lets users add search terms freely and supports fuzzy search and AND/OR/NOT logical search, so the platform can better meet users' search needs.
The platform provides better statistics and visualization of sample data. It has statistics and visualization modules for sample data, which not only provide various types of sample statistics but also let users see the distribution of samples in the platform more intuitively.
Embodiment 2:
An operating method for the extensible large-scale biomedical sample management and visualization platform, comprising the following steps:
Step 1: The user logs in to the extensible large-scale biomedical sample management and visualization platform and undergoes identity and permission verification;
Step 2: After the user passes permission verification, the user uploads data to add it to the platform;
Step 3: A quality check is performed on the added data; data that fails the quality check is not stored in the MongoDB database and is returned to the client for the errors to be corrected, while data that passes the quality check is stored in the MongoDB database;
Step 4: The user retrieves sample information held in the MongoDB database through multi-condition combined retrieval and views statistics of the retrieval results.
Step 5: After the search completes, the extensible large-scale biomedical sample visualization platform compiles statistics on the sample search results and presents them to the user as visual charts.
Users of the platform must log in and undergo identity and permission verification. Different accounts can be granted different permissions, and different permissions allow different operations, which protects the security of the sample data.
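The permission model described above could be sketched as a simple role-to-operation table. The role names and operation names below are assumptions made for illustration; the patent does not list specific roles.

```python
# Hypothetical roles and operations; the patent only states that different
# accounts hold different permissions.
ROLE_PERMISSIONS = {
    "admin": {"upload", "search", "view_statistics", "manage_users"},
    "uploader": {"upload", "search", "view_statistics"},
    "viewer": {"search", "view_statistics"},
}

def is_allowed(role, operation):
    """Return True if the given role may perform the operation."""
    return operation in ROLE_PERMISSIONS.get(role, set())

print(is_allowed("uploader", "upload"))
print(is_allowed("viewer", "upload"))
```

An unknown role maps to an empty permission set, so it is denied every operation by default.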
After identity and permission verification, users with the corresponding permission can add sample data to the platform. For convenience, the platform uses Excel files as the carrier of sample information: the user only needs to fill the sample information into an Excel sheet according to the specification and upload the sample file through the interface. To standardize the quality of sample information, the platform provides uploaders with a corresponding Excel template, which can be downloaded from the relevant page. Uploading sample information with the provided template gives the uploaded samples better quality and a higher check pass rate.
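One way to picture the template-based upload is to treat the sheet as a header row followed by data rows and convert each data row into a record. The sketch below assumes the rows have already been read from the Excel file by some spreadsheet library; the column names and the helper `rows_to_records` are hypothetical.

```python
def rows_to_records(rows):
    """Turn a header row plus data rows into a list of dicts, skipping blank rows."""
    rows = iter(rows)
    header = [str(h).strip() for h in next(rows)]
    records = []
    for row in rows:
        # Rows left empty in the template carry no sample and are ignored.
        if all(cell is None or str(cell).strip() == "" for cell in row):
            continue
        records.append(dict(zip(header, row)))
    return records

# Stand-in for rows read from an uploaded template (assumed column names).
sheet = [
    ("sample_id", "gender", "institution"),
    ("S001", "F", "Hospital A"),
    (None, None, None),
    ("S002", "M", "Hospital B"),
]
records = rows_to_records(sheet)
print(records)
```

Keeping the header in the template fixed means every uploaded file yields records with the same keys, which simplifies the later quality check.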
During upload, the sample information in the uploaded file undergoes a quality check. Here the platform uses XML Schema technology to control data quality for the different data items; this technology can find data items that are missing from a sample and data items that do not conform to the specified data format. If the sample information in an uploaded sample file fails the sample quality check, the sample information in that file is not stored, and the platform returns the nonconforming information to the user so that the user can correct the errors. If the sample information passes the quality check, it is added to the database. The invention increases the amount of data information and improves operating efficiency by adding key-value pairs to the MongoDB database.
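The check-then-store flow in this paragraph can be sketched as follows. Note that this is not XML Schema validation: as a stand-in, it uses a plain Python table of regular expressions to detect the same two kinds of problem (missing items and wrongly formatted items). The field names and patterns are assumptions.

```python
import re

# Stand-in for the schema: each required field and the format it must match.
RULES = {
    "sample_id": re.compile(r"S\d{3,}"),
    "gender": re.compile(r"[FM]"),
    "collected_on": re.compile(r"\d{4}-\d{2}-\d{2}"),
}

def check_sample(record):
    """Return a list of problems found in one sample record."""
    errors = []
    for field, pattern in RULES.items():
        value = record.get(field)
        if value is None or str(value).strip() == "":
            errors.append(f"{field}: missing")
        elif not pattern.fullmatch(str(value)):
            errors.append(f"{field}: bad format {value!r}")
    return errors

def check_file(records):
    """Return (accepted, report); the file is accepted only if every record passes."""
    report = {}
    for row_number, record in enumerate(records, start=1):
        errors = check_sample(record)
        if errors:
            report[row_number] = errors
    return (not report), report

good = {"sample_id": "S001", "gender": "F", "collected_on": "2018-12-06"}
bad = {"sample_id": "S002", "gender": None, "collected_on": "2018/12/06"}
accepted, report = check_file([good, bad])
print(accepted, report)
```

Rejecting the whole file when any record fails mirrors the behavior described above, where the nonconforming information is returned to the user and nothing from that file is stored.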
Users can retrieve sample information held in the platform database. The platform offers several retrieval modes, supporting any combination of search terms and joint searches in AND, OR, and NOT form; users can choose the search terms and search forms they want to use. After a search completes, the platform compiles simple statistics on the sample search results, such as statistics by gender and by sample storage institution, and presents them to the user as visual charts.
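A minimal sketch of how such combined conditions might be expressed as a MongoDB filter document, using the standard query operators `$and`, `$or`, `$nor`, and `$regex`. The helper functions and field names are hypothetical, and the code only builds the filter and summarizes an in-memory result list; it does not query a database.

```python
import re
from collections import Counter

def exact(field, value):
    return {field: value}

def fuzzy(field, text):
    # Case-insensitive substring match; the text is escaped so it is taken literally.
    return {field: {"$regex": re.escape(text), "$options": "i"}}

def all_of(*conditions):   # AND
    return {"$and": list(conditions)}

def any_of(*conditions):   # OR
    return {"$or": list(conditions)}

def none_of(*conditions):  # NOT
    return {"$nor": list(conditions)}

# (type is blood OR serum) AND institution contains "hospital" AND NOT gender M
query = all_of(
    any_of(exact("type", "blood"), exact("type", "serum")),
    fuzzy("institution", "hospital"),
    none_of(exact("gender", "M")),
)

def summarize(results, field):
    """Count search results by one field, e.g. gender or institution."""
    return dict(Counter(r.get(field, "unknown") for r in results))

print(query)
print(summarize([{"gender": "F"}, {"gender": "F"}, {"gender": "M"}], "gender"))
```

Because each helper returns an ordinary filter document, conditions can be nested to any depth, which is one way to support freely chosen search terms. The counts returned by `summarize` are the kind of simple statistic that could feed a chart.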
Beyond search, users can view statistics and visualizations in various forms for the data across the whole platform, for example: statistics and visualization of the sample distribution among different sample storage institutions and within each institution, and of the nationwide distribution of samples.
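The platform-wide statistics mentioned here could be computed along the following lines. `group_count_pipeline` builds a MongoDB aggregation pipeline using the standard `$group`, `$sum`, and `$sort` stages; the other lines do the equivalent counting in plain Python on a small in-memory list. Field names and sample values are made up for illustration.

```python
from collections import Counter, defaultdict

def group_count_pipeline(field):
    """Aggregation pipeline that counts documents per value of `field`."""
    return [
        {"$group": {"_id": f"${field}", "count": {"$sum": 1}}},
        {"$sort": {"count": -1}},
    ]

def distribution_within(samples, outer, inner):
    """Count samples per `inner` value inside each `outer` value."""
    table = defaultdict(Counter)
    for sample in samples:
        table[sample[outer]][sample[inner]] += 1
    return {key: dict(counts) for key, counts in table.items()}

samples = [
    {"institution": "Hospital A", "type": "blood"},
    {"institution": "Hospital A", "type": "tissue"},
    {"institution": "Hospital A", "type": "blood"},
    {"institution": "Hospital B", "type": "serum"},
]

# Distribution among institutions, then within each institution.
between = dict(Counter(s["institution"] for s in samples))
within = distribution_within(samples, "institution", "type")
print(between)
print(within)
```

The same grouping applied to a region field would give the nationwide distribution; the resulting counts are what a bar chart or map view would display.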
The above is only the preferred embodiment of the extensible large-scale biomedical sample management and visualization platform. The scope of protection of the platform is not limited to the above embodiment; all technical solutions under the same concept fall within the scope of protection of the present invention. It should be noted that, for those skilled in the art, improvements and changes made without departing from the principle of the invention shall also be regarded as within its scope of protection.

Claims (6)

1. An extensible large-scale biomedical sample management and visualization platform, characterized in that: the platform comprises a sample statistics module, a visualization module, a retrieval module, and a MongoDB database; the platform is connected to the MongoDB database, and the MongoDB database is responsible for storing data and adding sample data.
2. The extensible large-scale biomedical sample management and visualization platform according to claim 1, characterized in that: the platform uses XML Schema technology to control data quality for the different data items.
3. The extensible large-scale biomedical sample management and visualization platform according to claim 1, characterized in that: the platform uses XML to define metadata, and uses files in Excel format as the carrier of sample data information.
4. The extensible large-scale biomedical sample management and visualization platform according to claim 1, characterized in that: by adding key-value pairs to the MongoDB database, the amount of data information is increased and operating efficiency is improved.
5. An operating method for the extensible large-scale biomedical sample management and visualization platform according to claim 1, characterized by comprising the following steps:
Step 1: The user logs in to the extensible large-scale biomedical sample management and visualization platform and undergoes identity and permission verification;
Step 2: After the user passes permission verification, the user uploads data to add it to the platform;
Step 3: A quality check is performed on the added data; data that fails the quality check is not stored in the MongoDB database and is returned to the client for the errors to be corrected, while data that passes the quality check is stored in the MongoDB database;
Step 4: The user retrieves sample information held in the MongoDB database through multi-condition combined retrieval and views statistics of the retrieval results;
Step 5: After the search completes, the extensible large-scale biomedical sample management and visualization platform compiles statistics on the sample search results and presents them to the user as visual charts.
6. The operating method according to claim 5, characterized in that: the multi-condition combined retrieval supports any combination of search terms and joint searches in AND, OR, and NOT form.
CN201811487666.9A 2018-12-06 2018-12-06 Extensible large-scale biomedical sample management and visualization platform Active CN109656910B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811487666.9A CN109656910B (en) 2018-12-06 2018-12-06 Extensible large-scale biomedical sample management and visualization platform

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811487666.9A CN109656910B (en) 2018-12-06 2018-12-06 Extensible large-scale biomedical sample management and visualization platform

Publications (2)

Publication Number Publication Date
CN109656910A true CN109656910A (en) 2019-04-19
CN109656910B CN109656910B (en) 2021-04-13

Family

ID=66112703

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811487666.9A Active CN109656910B (en) 2018-12-06 2018-12-06 Extensible large-scale biomedical sample management and visualization platform

Country Status (1)

Country Link
CN (1) CN109656910B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112699161A (en) * 2019-10-23 2021-04-23 上海磐门信息科技有限公司 Medical statistical system and method

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104933112A (en) * 2015-06-04 2015-09-23 浙江力石科技股份有限公司 Distributed Internet transaction information storage and processing method
US20180218114A1 (en) * 2017-01-31 2018-08-02 Onramp Bioinformatics, Inc. Method for managing complex genomic data workflows
CN107066531A (en) * 2017-03-01 2017-08-18 苏州朗动网络科技有限公司 A kind of business data radar monitoring method and system based on enterprise's big data platform
CN107066532A (en) * 2017-03-01 2017-08-18 苏州朗动网络科技有限公司 A kind of method and system for generating enterprise's transverse and longitudinal graph of a relation

Also Published As

Publication number Publication date
CN109656910B (en) 2021-04-13


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant