CN106506673A - Large-scale distributed data management system and method thereof - Google Patents
Large-scale distributed data management system and method thereof
- Publication number
- CN106506673A
- Authority
- CN
- China
- Prior art keywords
- data
- crawler
- collection server
- data collection
- business
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Links
- 238000013523 data management Methods 0.000 title claims abstract description 20
- 238000000034 method Methods 0.000 title claims abstract description 17
- 241000270322 Lepidosauria Species 0.000 claims abstract description 108
- 238000013480 data collection Methods 0.000 claims abstract description 91
- 238000003860 storage Methods 0.000 claims abstract description 32
- 238000009826 distribution Methods 0.000 claims abstract description 21
- 230000010354 integration Effects 0.000 claims abstract description 11
- 230000003993 interaction Effects 0.000 claims abstract description 4
- 238000004422 calculation algorithm Methods 0.000 claims description 15
- 230000009193 crawling Effects 0.000 claims description 15
- 238000004140 cleaning Methods 0.000 claims description 12
- 238000006243 chemical reaction Methods 0.000 claims description 6
- 230000004913 activation Effects 0.000 claims description 5
- 238000005201 scrubbing Methods 0.000 claims description 5
- 230000004048 modification Effects 0.000 claims description 4
- 238000012986 modification Methods 0.000 claims description 4
- 238000001914 filtration Methods 0.000 claims description 3
- 238000013500 data storage Methods 0.000 claims description 2
- 230000006978 adaptation Effects 0.000 claims 1
- 238000004220 aggregation Methods 0.000 claims 1
- 230000002776 aggregation Effects 0.000 claims 1
- 230000005540 biological transmission Effects 0.000 description 9
- 238000004458 analytical method Methods 0.000 description 5
- 238000004590 computer program Methods 0.000 description 2
- 230000007812 deficiency Effects 0.000 description 2
- 238000005516 engineering process Methods 0.000 description 2
- 230000013016 learning Effects 0.000 description 2
- 238000012544 monitoring process Methods 0.000 description 2
- 230000011218 segmentation Effects 0.000 description 2
- 230000001502 supplementing effect Effects 0.000 description 2
- 230000001360 synchronised effect Effects 0.000 description 2
- 108010022579 ATP dependent 26S protease Proteins 0.000 description 1
- 241000938605 Crocodylia Species 0.000 description 1
- 241001269238 Data Species 0.000 description 1
- 230000002159 abnormal effect Effects 0.000 description 1
- 230000015572 biosynthetic process Effects 0.000 description 1
- 238000004364 calculation method Methods 0.000 description 1
- 238000010276 construction Methods 0.000 description 1
- 230000008878 coupling Effects 0.000 description 1
- 238000010168 coupling process Methods 0.000 description 1
- 238000005859 coupling reaction Methods 0.000 description 1
- 238000013499 data model Methods 0.000 description 1
- 238000001514 detection method Methods 0.000 description 1
- 238000011161 development Methods 0.000 description 1
- 230000018109 developmental process Effects 0.000 description 1
- 230000000694 effects Effects 0.000 description 1
- 230000006870 function Effects 0.000 description 1
- 230000010247 heart contraction Effects 0.000 description 1
- 238000010801 machine learning Methods 0.000 description 1
- 230000014759 maintenance of location Effects 0.000 description 1
- 238000007726 management method Methods 0.000 description 1
- 230000003287 optical effect Effects 0.000 description 1
- 238000005457 optimization Methods 0.000 description 1
- 230000008520 organization Effects 0.000 description 1
- 230000008569 process Effects 0.000 description 1
- 238000012546 transfer Methods 0.000 description 1
- 239000002699 waste material Substances 0.000 description 1
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L67/00—Network arrangements or protocols for supporting network services or applications
- H04L67/01—Protocols
- H04L67/10—Protocols in which an application is distributed across nodes in the network
- H04L67/1097—Protocols in which an application is distributed across nodes in the network for distributed storage of data in networks, e.g. transport arrangements for network file system [NFS], storage area networks [SAN] or network attached storage [NAS]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/951—Indexing; Web crawling techniques
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L63/00—Network architectures or network communication protocols for network security
- H04L63/08—Network architectures or network communication protocols for network security for authentication of entities
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L67/00—Network arrangements or protocols for supporting network services or applications
- H04L67/01—Protocols
- H04L67/10—Protocols in which an application is distributed across nodes in the network
- H04L67/1001—Protocols in which an application is distributed across nodes in the network for accessing one among a plurality of replicated servers
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L67/00—Network arrangements or protocols for supporting network services or applications
- H04L67/01—Protocols
- H04L67/10—Protocols in which an application is distributed across nodes in the network
- H04L67/1001—Protocols in which an application is distributed across nodes in the network for accessing one among a plurality of replicated servers
- H04L67/1004—Server selection for load balancing
Landscapes
- Engineering & Computer Science (AREA)
- Computer Networks & Wireless Communication (AREA)
- Signal Processing (AREA)
- General Engineering & Computer Science (AREA)
- Databases & Information Systems (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Computer Hardware Design (AREA)
- Computer Security & Cryptography (AREA)
- Computing Systems (AREA)
- Information Transfer Between Computers (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
Description
Claims (12)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201611055775.4A CN106506673B (zh) | 2016-11-25 | 2016-11-25 | Large-scale distributed data management system and method thereof |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201611055775.4A CN106506673B (zh) | 2016-11-25 | 2016-11-25 | Large-scale distributed data management system and method thereof |
Publications (2)
Publication Number | Publication Date |
---|---|
CN106506673A (zh) | 2017-03-15 |
CN106506673B (zh) | 2019-08-02 |
Family
ID=58328899
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201611055775.4A Active CN106506673B (zh) | 2016-11-25 | 2016-11-25 | Large-scale distributed data management system and method thereof |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106506673B (zh) |
Cited By (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107241319A (zh) * | 2017-05-26 | 2017-10-10 | 山东省科学院情报研究所 | VPN-based distributed web crawler system and scheduling method |
CN107679233A (zh) * | 2017-10-24 | 2018-02-09 | 麦格创科技(深圳)有限公司 | Distributed crawler task allocation method and system |
CN108108423A (zh) * | 2017-12-15 | 2018-06-01 | 吉旗(成都)科技有限公司 | Method for stream processing of Internet of Things data |
CN108460093A (zh) * | 2018-01-30 | 2018-08-28 | 青岛中兴智能交通有限公司 | Data processing method and device for a public security system |
WO2019079992A1 (zh) * | 2017-10-25 | 2019-05-02 | 麦格创科技(深圳)有限公司 | Method and system for allocating task managers in a distributed crawler system |
CN109922083A (zh) * | 2019-04-10 | 2019-06-21 | 武汉金盛方圆网络科技发展有限公司 | Network protocol traffic control system |
CN110737647A (zh) * | 2019-08-20 | 2020-01-31 | 广州宏数科技有限公司 | Internet big data cleaning method |
CN110955853A (zh) * | 2018-09-26 | 2020-04-03 | 北京国双科技有限公司 | Data storage method and device |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6434548B1 (en) * | 1999-12-07 | 2002-08-13 | International Business Machines Corporation | Distributed metadata searching system and method |
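CN102932448A (zh) * | 2012-10-30 | 2013-02-13 | 工业和信息化部电信传输研究所 | URL deduplication system and method for a distributed web crawler |
CN103455636A (zh) * | 2013-09-27 | 2013-12-18 | 浪潮齐鲁软件产业有限公司 | Method for automatic capture and intelligent analysis of Internet-based tax data |
CN103973744A (zh) * | 2013-02-01 | 2014-08-06 | 北京英富森信息技术有限公司 | Distributed file progressive storage technique |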
2016
- 2016-11-25: CN application CN201611055775.4A, patent CN106506673B (zh), status Active
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6434548B1 (en) * | 1999-12-07 | 2002-08-13 | International Business Machines Corporation | Distributed metadata searching system and method |
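CN102932448A (zh) * | 2012-10-30 | 2013-02-13 | 工业和信息化部电信传输研究所 | URL deduplication system and method for a distributed web crawler |
CN103973744A (zh) * | 2013-02-01 | 2014-08-06 | 北京英富森信息技术有限公司 | Distributed file progressive storage technique |
CN103455636A (zh) * | 2013-09-27 | 2013-12-18 | 浪潮齐鲁软件产业有限公司 | Method for automatic capture and intelligent analysis of Internet-based tax data |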
Cited By (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107241319A (zh) * | 2017-05-26 | 2017-10-10 | 山东省科学院情报研究所 | VPN-based distributed web crawler system and scheduling method |
CN107241319B (zh) * | 2017-05-26 | 2020-06-02 | 山东省科学院情报研究所 | VPN-based distributed web crawler system and scheduling method |
CN107679233A (zh) * | 2017-10-24 | 2018-02-09 | 麦格创科技(深圳)有限公司 | Distributed crawler task allocation method and system |
WO2019079992A1 (zh) * | 2017-10-25 | 2019-05-02 | 麦格创科技(深圳)有限公司 | Method and system for allocating task managers in a distributed crawler system |
CN108108423A (zh) * | 2017-12-15 | 2018-06-01 | 吉旗(成都)科技有限公司 | Method for stream processing of Internet of Things data |
CN108460093A (zh) * | 2018-01-30 | 2018-08-28 | 青岛中兴智能交通有限公司 | Data processing method and device for a public security system |
CN110955853A (zh) * | 2018-09-26 | 2020-04-03 | 北京国双科技有限公司 | Data storage method and device |
CN109922083A (zh) * | 2019-04-10 | 2019-06-21 | 武汉金盛方圆网络科技发展有限公司 | Network protocol traffic control system |
CN110737647A (zh) * | 2019-08-20 | 2020-01-31 | 广州宏数科技有限公司 | Internet big data cleaning method |
CN110737647B (zh) * | 2019-08-20 | 2023-07-25 | 广州宏数科技有限公司 | Internet big data cleaning method |
Also Published As
Publication number | Publication date |
---|---|
CN106506673B (zh) | 2019-08-02 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN106506673A (zh) | | Large-scale distributed data management system and method thereof |
US10685283B2 (en) | | Demand classification based pipeline system for time-series data forecasting |
Siddiqa et al. | | A survey of big data management: Taxonomy and state-of-the-art |
US10430480B2 (en) | | Enterprise data processing |
US20190138639A1 (en) | | Generating a subquery for a distinct data intake and query system |
US20190147084A1 (en) | | Distributing partial results from an external data system between worker nodes |
CN108549580A (zh) | | Method and terminal device for automatically deploying Kubernetes slave nodes |
US9183031B2 (en) | | Provisioning of a virtual machine by using a secured zone of a cloud environment |
CN108847989A (zh) | | Log processing method based on microservice architecture, business service system, and electronic device |
CN104966006A (zh) | | Intelligent face recognition system based on a cloud platform |
CN110543464A (zh) | | Big data platform for smart parks and operating method thereof |
CN109919771B (zh) | | Industrial Internet transaction device using layered blockchain technology |
CN110838065A (zh) | | Transaction data processing method and device |
CN101420458B (zh) | | Multimedia content monitoring system, method, and device based on a content delivery network |
CN104969213A (zh) | | Data stream splitting for low-latency data access |
CN102147809B (zh) | | Parallel file system and management method thereof |
US12058269B2 (en) | | Systems and methods for providing secure internet of things data notifications using blockchain |
WO2021108582A1 (en) | | Managed materialized views created from heterogeneous data sources |
CN104331464A (zh) | | MapReduce-based priority prefetching method for monitoring data |
CN113505260A (zh) | | Face recognition method and apparatus, computer-readable medium, and electronic device |
CN1682190A (zh) | | Method and apparatus for managing hardware and software components |
Ding et al. | | DS‐Harmonizer: A Harmonization Service on Spatiotemporal Data Stream in Edge Computing Environment |
CN107729218A (zh) | | System and method for monitoring and processing computing resource devices |
CN105426770B (zh) | | Configuration method of a permission management mechanism for multidimensional data |
Song et al. | | Towards modeling large-scale data flows in a multidatacenter computing system with petri net |
Legal Events
Code | Title | Description |
---|---|---|
C06 | Publication | |
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
GR01 | Patent grant | |
PE01 | Entry into force of the registration of the contract for pledge of patent right | Denomination of invention: Large-scale distributed data management system and method thereof; Effective date of registration: 20200518; Granted publication date: 20190802; Pledgee: Zhongguancun Beijing technology financing Company limited by guarantee; Pledgor: SIC YOUE DATA Co.,Ltd.; Registration number: Y2020990000482 |
CP03 | Change of name, title or address | Address after: 100070, No. 101-8, building 1, 31, zone 188, South Fourth Ring Road, Beijing, Fengtai District; Patentee after: Guoxin Youyi Data Co., Ltd; Address before: 100070 Beijing city Fengtai District South Fourth Ring Road No. 188 (ABP) B headquarters mansion 9 floor; Patentee before: SIC YOUE DATA Co.,Ltd. |
PC01 | Cancellation of the registration of the contract for pledge of patent right | Date of cancellation: 20211129; Granted publication date: 20190802; Pledgee: Zhongguancun Beijing technology financing Company limited by guarantee; Pledgor: Guoxin Youyi Data Co., Ltd; Registration number: Y2020990000482 |
PM01 | Change of the registration of the contract for pledge of patent right | Change date: 20211129; Registration number: Y2020990000482; Pledgor after: Guoxin Youyi Data Co., Ltd; Pledgor before: SIC YOUE DATA Co.,Ltd. |