CN106354759A - Retrieving and automatically downloading system of articles and data based on biological cloud platform - Google Patents
- Publication number
- CN106354759A (application number CN201610687029.0A)
- Authority
- CN
- China
- Prior art keywords
- data
- module
- retrieval
- article
- retrieving
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/951—Indexing; Web crawling techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
- G06F16/3331—Query processing
- G06F16/334—Query execution
- G06F16/3344—Query execution using natural language analysis
Abstract
The invention discloses a system for retrieving and automatically downloading articles and data based on a biological cloud platform. The system comprises a data download module, a data parsing module, a data storage module, a web graphical interface module, and a data retrieval module. After the raw data is downloaded, it is parsed into a standard format and integrated; word segmentation, index building, and storage are then carried out according to a preset word segmentation strategy, and a retrieval interface is provided. The system parses disorganized raw data into a standard format according to fixed rules and stores it in a retrieval cluster, and it provides a web interface through which users can retrieve and browse articles and data, which makes the data convenient to reuse and study.
Description
Technical field
The present invention relates to the field of data download and retrieval, and in particular to a system for retrieving and automatically downloading articles and data based on a biological cloud platform.
Background
With the continuous development of sequencing technology, biological data is being produced ever faster. According to statistics, next-generation (second-generation) sequencing worldwide produces data at a rate of 13 Pbp per year, and the rate is still accelerating; bioinformatics research has formally entered the big data era. The rate at which articles are published is also increasing. However, retrieval of these data in the public databases on the Internet is isolated. For example, after finding an article, a user cannot directly obtain the SRA or GSM data associated with it and must search databases such as SRA and GEO DataSets again, which makes reusing Internet data extremely tedious and difficult.
Summary of the invention
In view of the defects of the prior art, the present invention provides a system for retrieving and automatically downloading articles and data based on a biological cloud platform.
An embodiment of the present invention proposes a system for retrieving and automatically downloading articles and data based on a biological cloud platform, comprising:
a data download module, a data parsing module, a data storage module, a web graphical interface module, and a data retrieval module; wherein
the data download module is configured to download all articles and data in the sequencing field from databases on the network;
the data parsing module is configured to parse the downloaded articles and data into data in a standard data format;
the data storage module is configured to process the data obtained by the data parsing module according to a preset word segmentation and indexing strategy, and to store the resulting data;
the web graphical interface module is configured to provide the user with a search interface for articles and data, and to display the results of the searches that the user performs through the search interface;
the data retrieval module is configured to retrieve results from the data stored by the data storage module according to the search conditions that the user sets through the search interface, and to feed the results back to the web graphical interface module.
The system provided by the embodiment of the present invention downloads all data and articles in the sequencing field, parses them, associates and integrates them, performs word segmentation, builds an index, and stores the result, so that users can retrieve articles and data from a single web page, which facilitates the reuse and study of public data.
Brief description of the drawings
Fig. 1 is a schematic structural diagram of one embodiment of the system for retrieving and automatically downloading articles and data based on a biological cloud platform according to the present invention.
Detailed description
To make the purpose, technical solutions, and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments are described clearly below with reference to the accompanying drawings. The described embodiments are evidently only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art on the basis of these embodiments without creative work fall within the scope of protection of the present invention.
Referring to Fig. 1, this embodiment discloses a system for retrieving and automatically downloading articles and data based on a biological cloud platform, comprising:
a data download module 1, a data parsing module 2, a data storage module 3, a web graphical interface module 4, and a data retrieval module 5; wherein
the data download module 1 is configured to download all articles and data in the sequencing field from databases on the network;
the data parsing module 2 is configured to parse the downloaded articles and data into data in a standard data format;
in a specific application, the standard data format may be JSON;
the data storage module 3 is configured to perform word segmentation on the data parsed by the data parsing module 2 according to a preset word segmentation strategy to obtain segmented data, to build a search index on the segmented data, and to store the indexed segmented data;
the web graphical interface module 4 is configured to provide the user with a search interface for articles and data, and to display the results of the searches that the user performs through the search interface;
the data retrieval module 5 is configured to retrieve results from the data stored by the data storage module 3 according to the search conditions that the user sets through the search interface, and to feed the results back to the web graphical interface module 4.
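As an illustration of the parsing step (module 2), the following minimal sketch converts a loosely formatted raw record into a JSON document. The raw "KEY: value" layout, the field names, and the `parse_record` helper are hypothetical and chosen only for the example; the description states only that the standard format may be JSON.

```python
import json

# Hypothetical mapping from raw field labels to standard JSON keys.
FIELD_MAP = {"TI": "title", "AB": "abstract", "ACC": "accessions"}

def parse_record(raw: str) -> dict:
    """Parse 'KEY: value' lines into a dict with a fixed set of keys."""
    record = {"title": "", "abstract": "", "accessions": []}
    for line in raw.splitlines():
        if ":" not in line:
            continue  # skip lines that do not follow the assumed layout
        key, value = line.split(":", 1)
        field = FIELD_MAP.get(key.strip().upper())
        if field is None:
            continue  # unknown labels are ignored in this sketch
        value = value.strip()
        if field == "accessions":
            # Accession IDs are assumed to be comma-separated.
            record[field].extend(v.strip() for v in value.split(",") if v.strip())
        else:
            record[field] = value
    return record

raw = "TI: Example sequencing study\nAB: A short abstract.\nACC: SRR000001, GSM000001"
doc = parse_record(raw)
print(json.dumps(doc, indent=2))
```

Giving every record the same keys, whatever its source database, is what allows the later segmentation and indexing steps to treat all records uniformly.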
In the embodiment of the present invention, data storage module 3 may first perform data association and integration on the data parsed by data parsing module 2, that is, strongly associate the records held in the various different databases with one another. This supports combined retrieval under multiple conditions by the data retrieval module and ensures accurate queries for the user. The integrated data can then be segmented into words, indexed, and stored.
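One plausible way to associate records across databases is to link each article to the data records whose accession IDs appear in its text. The sketch below assumes this approach; the accession pattern is deliberately simplified, and the record layout and the `integrate` helper are hypothetical rather than taken from the description.

```python
import re

# Simplified, assumed accession pattern: SRA runs (SRR/ERR/DRR) and
# GEO samples or series (GSM/GSE), each followed by digits.
ACCESSION_RE = re.compile(r"\b(?:SRR|ERR|DRR|GSM|GSE)\d+\b")

def integrate(articles, datasets):
    """Attach to each article the dataset records whose accession it mentions."""
    by_accession = {d["accession"]: d for d in datasets}
    merged = []
    for article in articles:
        found = ACCESSION_RE.findall(article["text"])
        # dict.fromkeys removes repeated accessions while keeping their order.
        linked = [by_accession[a] for a in dict.fromkeys(found) if a in by_accession]
        merged.append({**article, "datasets": linked})
    return merged

articles = [{"id": "art1", "text": "Reads are available as SRR000001 and GSM000002. See SRR000001."}]
datasets = [
    {"accession": "SRR000001", "source": "sra"},
    {"accession": "GSM000002", "source": "geo"},
    {"accession": "GSM000003", "source": "geo"},
]
merged = integrate(articles, datasets)
print([d["accession"] for d in merged[0]["datasets"]])
```

With the links stored on the article record itself, a single query can return an article together with its data, instead of requiring a second search in each source database.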
In the embodiment of the present invention, the search interface may display multiple search conditions. When searching, the user can retrieve articles and data by entering or selecting a corresponding search expression. Retrieval from the segmented data stored by the data storage module according to the search expression entered or selected by the user may use existing methods for retrieving documents from a bibliographic database; the specific retrieval process is not described here.
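The description leaves the retrieval method to existing techniques. As one common example of such a technique, the sketch below builds an inverted index over segmented text and answers a query that combines several field conditions with AND. The tokenizer (lowercased alphanumeric runs), the field names, and the helper names are assumptions for illustration; the preset word segmentation strategy itself is not specified in the source.

```python
import re
from collections import defaultdict

def tokenize(text):
    """Assumed segmentation strategy: lowercase runs of word characters."""
    return re.findall(r"\w+", text.lower())

def build_index(docs):
    """Map each (field, token) pair to the set of document IDs containing it."""
    index = defaultdict(set)
    for doc_id, doc in docs.items():
        for field, text in doc.items():
            for token in tokenize(text):
                index[(field, token)].add(doc_id)
    return index

def search(index, conditions):
    """AND-combine conditions given as {field: query string}."""
    result = None
    for field, query in conditions.items():
        for token in tokenize(query):
            ids = index.get((field, token), set())
            result = ids if result is None else result & ids
    return sorted(result or [])

docs = {
    1: {"title": "RNA-seq of maize roots", "organism": "Zea mays"},
    2: {"title": "RNA-seq of rice leaves", "organism": "Oryza sativa"},
}
index = build_index(docs)
print(search(index, {"title": "rna seq", "organism": "zea"}))
```

Indexing by (field, token) rather than by token alone is what lets each search condition on the interface target its own field.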
The system provided by the embodiment of the present invention downloads all data and articles in the sequencing field, parses them, associates and integrates them, performs word segmentation, builds an index, and stores the result, so that users can retrieve articles and data from a single web page, which facilitates the reuse and study of public data.
Optionally, another embodiment of the system for retrieving and automatically downloading articles and data based on a biological cloud platform further includes:
a timed update module; wherein
the timed update module is configured to periodically obtain the latest data and articles from the network and to send them to the data storage module.
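A timed update of this kind can be sketched as a single update cycle that is invoked on a schedule. In the sketch below, `run_update`, the checkpoint-based `fetch_since` callback, and the in-memory source are hypothetical; how the cycle is triggered (for example by a cron job or a timer) is not specified in the description.

```python
from datetime import datetime, timezone

def run_update(fetch_since, store, last_run):
    """Fetch records published after last_run, hand them to storage,
    and return the timestamp to use as the next checkpoint."""
    started = datetime.now(timezone.utc)
    for record in fetch_since(last_run):
        store(record)
    return started

# Hypothetical in-memory source standing in for a remote database.
SOURCE = [
    {"id": "A1", "published": datetime(2016, 8, 1, tzinfo=timezone.utc)},
    {"id": "A2", "published": datetime(2016, 8, 20, tzinfo=timezone.utc)},
]

def fetch_since(ts):
    return [r for r in SOURCE if r["published"] > ts]

stored = []
last_run = datetime(2016, 8, 18, tzinfo=timezone.utc)
checkpoint = run_update(fetch_since, stored.append, last_run)
print([r["id"] for r in stored])
```

Taking the checkpoint before fetching means a record that appears while the cycle is running is picked up by the next cycle rather than missed.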
Optionally, in another embodiment of the system, the data retrieval module is further configured to push the latest data and articles to subscribed users.
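The description does not say how subscriptions are expressed. Assuming each subscriber registers a set of keywords, the matching step behind such a push could look like the following sketch; `match_subscriptions` and the record layout are hypothetical, and the delivery channel is left out.

```python
def match_subscriptions(new_docs, subscriptions):
    """Return {user: [doc ids]} for new documents whose title contains
    at least one of the user's subscribed keywords."""
    notifications = {}
    for user, keywords in subscriptions.items():
        wanted = {k.lower() for k in keywords}
        hits = [d["id"] for d in new_docs
                if wanted & set(d["title"].lower().split())]
        if hits:
            notifications[user] = hits
    return notifications

new_docs = [
    {"id": 10, "title": "New RNA-seq data for maize"},
    {"id": 11, "title": "Rice genome update"},
]
subscriptions = {"alice": ["maize"], "bob": ["rice"], "carol": ["wheat"]}
notifications = match_subscriptions(new_docs, subscriptions)
print(notifications)
```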
Optionally, in another embodiment of the system, the data download module is configured to download all articles and data in the sequencing field from databases on the network by means of a web crawler.
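The crawler itself is not detailed in the description. A minimal breadth-first crawl loop, with the network access injected as a function so that the sketch runs without a connection, might look as follows. The `crawl` helper, its parameters, and the in-memory site are assumptions; a real crawler would also need to respect each site's robots.txt and rate limits.

```python
from collections import deque

def crawl(seed_urls, fetch, extract_links, max_pages=100):
    """Breadth-first crawl: visit each URL once, up to max_pages pages."""
    seen = set(seed_urls)
    queue = deque(seed_urls)
    pages = {}
    while queue and len(pages) < max_pages:
        url = queue.popleft()
        body = fetch(url)
        if body is None:
            continue  # the page could not be fetched; skip it
        pages[url] = body
        for link in extract_links(body):
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return pages

# Hypothetical in-memory site: each "page body" is simply its list of links.
SITE = {"a": ["b", "c"], "b": ["c", "missing"], "c": ["a"]}
pages = crawl(["a"], SITE.get, lambda links: links)
print(list(pages))
```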
Optionally, in another embodiment of the system, the data storage module is specifically configured to store the data processed according to the word segmentation and indexing strategy in an internal distributed Elasticsearch cluster.
In the embodiment of the present invention, the Elasticsearch cluster offers high availability and high scalability. For retrieval, the Elasticsearch cluster can expose a search API for the web graphical interface module to call, which makes it convenient for users of the graphical interface to view and use the data. After logging in to the graphical interface, a user can run various combined searches over the large volume of data in Elasticsearch under different combinations of conditions and view the details of the results. Storing the data in a cluster ensures its integrity, security, availability, and fast response.
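A combined-condition search of this kind maps naturally onto an Elasticsearch `bool` query. The sketch below only builds the JSON request body that would be sent to an index's `_search` endpoint; it does not contact a cluster. The `build_query` helper and the field names `title` and `organism` are hypothetical, and the `term` filter assumes that `organism` is mapped as a keyword field.

```python
import json

def build_query(must=None, filters=None, size=10):
    """Build an Elasticsearch search body.
    must:    {field: text} full-text conditions (match queries)
    filters: {field: value} exact-match conditions (term queries)"""
    bool_clause = {}
    if must:
        bool_clause["must"] = [{"match": {f: t}} for f, t in must.items()]
    if filters:
        bool_clause["filter"] = [{"term": {f: v}} for f, v in filters.items()]
    query = {"bool": bool_clause} if bool_clause else {"match_all": {}}
    return {"query": query, "size": size}

body = build_query(must={"title": "RNA-seq"}, filters={"organism": "Zea mays"}, size=5)
print(json.dumps(body, indent=2))
```

Conditions placed under `filter` restrict the result set without affecting relevance scoring, which suits exact selections made from lists on a search form.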
Although the embodiments of the present invention have been described with reference to the accompanying drawings, those skilled in the art can make various modifications and variations without departing from the spirit and scope of the invention, and all such modifications and variations fall within the scope defined by the appended claims.
Claims (5)
1. A system for retrieving and automatically downloading articles and data based on a biological cloud platform, characterized by comprising:
a data download module, a data parsing module, a data storage module, a web graphical interface module, and a data retrieval module; wherein
the data download module is configured to download all articles and data in the sequencing field from databases on the network;
the data parsing module is configured to parse the downloaded articles and data into data in a standard data format;
the data storage module is configured to process the data obtained by the data parsing module according to a preset word segmentation and indexing strategy, and to store the resulting data;
the web graphical interface module is configured to provide the user with a search interface for articles and data, and to display the results of the searches that the user performs through the search interface;
the data retrieval module is configured to retrieve results from the data stored by the data storage module according to the search conditions that the user sets through the search interface, and to feed the results back to the web graphical interface module.
2. The system according to claim 1, characterized by further comprising:
a timed update module; wherein
the timed update module is configured to periodically obtain the latest data and articles from the network and to send them to the data storage module.
3. The system according to claim 2, characterized in that the data retrieval module is further configured to push the latest data and articles to subscribed users.
4. The system according to claim 1, characterized in that the data download module is configured to download all articles and data in the sequencing field from databases on the network by means of a web crawler.
5. The system according to claim 1, characterized in that the data storage module is specifically configured to store the data processed according to the word segmentation and indexing strategy in an internal distributed Elasticsearch cluster.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610687029.0A CN106354759B (en) | 2016-08-18 | 2016-08-18 | The retrieval of article and data based on biological cloud platform and automatic download system |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610687029.0A CN106354759B (en) | 2016-08-18 | 2016-08-18 | The retrieval of article and data based on biological cloud platform and automatic download system |
Publications (2)
Publication Number | Publication Date |
---|---|
CN106354759A (en) | 2017-01-25 |
CN106354759B (en) | 2019-07-12 |
Family
ID=57843505
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201610687029.0A Active CN106354759B (en) | 2016-08-18 | 2016-08-18 | The retrieval of article and data based on biological cloud platform and automatic download system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106354759B (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113032436A (en) * | 2021-04-16 | 2021-06-25 | 苏州臻璇数据信息技术有限公司 | Searching method and device based on article content and title |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103412933A (en) * | 2013-08-20 | 2013-11-27 | 南京物联网应用研究院有限公司 | Cloud search platform |
CN103699572A (en) * | 2013-11-26 | 2014-04-02 | 北京航空航天大学 | Digital media content and resource integration and sharing method in cloud environment |
CN104462865A (en) * | 2014-10-17 | 2015-03-25 | 北京百迈客生物科技有限公司 | Article analysis system and method based on biological cloud platform |
CN105159971A (en) * | 2015-08-26 | 2015-12-16 | 成都布林特信息技术有限公司 | Cloud platform data retrieval method |
CN105183809A (en) * | 2015-08-26 | 2015-12-23 | 成都布林特信息技术有限公司 | Cloud platform data query method |
CN105205104A (en) * | 2015-08-26 | 2015-12-30 | 成都布林特信息技术有限公司 | Cloud platform data acquisition method |
Also Published As
Publication number | Publication date |
---|---|
CN106354759B (en) | 2019-07-12 |
Legal Events
Code | Title |
---|---|
C06 | Publication |
PB01 | Publication |
SE01 | Entry into force of request for substantive examination |
GR01 | Patent grant |