WO2016065776A1 - Tightly coupled and scalable big data interaction method - Google Patents

Tightly coupled and scalable big data interaction method

Info

Publication number
WO2016065776A1
WO2016065776A1 PCT/CN2015/072975 CN2015072975W WO2016065776A1 WO 2016065776 A1 WO2016065776 A1 WO 2016065776A1 CN 2015072975 W CN2015072975 W CN 2015072975W WO 2016065776 A1 WO2016065776 A1 WO 2016065776A1
Authority
WO
WIPO (PCT)
Prior art keywords
read
nodes
write
node
data
Application number
PCT/CN2015/072975
Other languages
English (en)
Chinese (zh)
Inventor
王恩东
张东
亓开元
刘成平
辛国茂
杨勇
卢俊佐
Original Assignee
浪潮电子信息产业股份有限公司
Priority date
2014-10-28
Filing date
2015-02-13
Publication date
2016-05-06
Application filed by 浪潮电子信息产业股份有限公司
Publication of WO2016065776A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L65/00 Network arrangements, protocols or services for supporting real-time applications in data packet communication
    • H04L65/40 Support for services or applications
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/10 Protocols in which an application is distributed across nodes in the network
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/10 Protocols in which an application is distributed across nodes in the network
    • H04L67/1095 Replication or mirroring of data, e.g. scheduling or transport for data synchronisation between network nodes

Definitions

  • The invention relates to the field of big data technology, and in particular to a tightly coupled and scalable big data interaction method.
  • Existing interactive analysis engines support table structures and SQL statements, and the underlying data systems adopt a distributed architecture, but interactive analysis performance in practical applications is still very poor.
  • Hive uses the MapReduce engine, which applies strict synchronization and step-by-step materialization at every processing stage, so the processing latency is high.
  • Shark is based on an in-memory computing engine and optimizes processing performance through pipelining and intermediate-result caching. However, it uses the traditional client/server model, and the server side, which handles SQL parsing, path planning, and metadata processing, supports only single-point deployment, which limits high-concurrency interactive processing. There is therefore a need for a new driver architecture that meets the needs of online, high-concurrency interactive analysis of big data.
  • The object of the present invention is achieved in the following manner.
  • By constructing a distributed, tightly coupled client driver layer, on the basis of ensuring consistency, a single point of failure at the client or the server is avoided and the communication overhead between clients is reduced. This gives the system near-linear scalability in scenarios dominated by metadata queries and meets the requirements of online, high-concurrency interactive analysis of big data.
  • The application instance receives the returned result and processes it in the business logic layer. This avoids a single point of failure at the client or the server and reduces the communication overhead between clients. Because the client driver in this architecture only needs to hold a small amount of system metadata state, and the metadata is accessed mainly through read and query operations, it can be scaled out effectively and supports high concurrency. When metadata write operations occur, however, the metadata must be synchronized, so read/write consistency has to be ensured through inter-node interaction;
  • In the read/write synchronization process, each read or write first reads the current version from the node; after the data is updated, the version number is incremented by 1 and a write (data update) request is sent to all nodes; when a node receives the new-version update, it returns its approval if it has not previously approved a higher version, and otherwise notifies the sender of the latest version number;
  • When the latest version must be read, step 4.1) is actively performed to synchronize the data.
  • An advantage of the present invention is that the above method ensures read/write consistency of the data: although a read operation may be delayed, the order in which versions are read is guaranteed. When the latest version must be read, a data synchronization process can be actively performed.
  • The method is also highly fault tolerant: as long as fewer than half of the nodes have failed, reads and writes on the other nodes are not affected. When a failed node recovers, a single read or write operation is enough for it to synchronize through the steps above.
  • Figure 1 is an architecture diagram of a single-client, single-server system
  • Figure 2 is an architecture diagram of a single-client, multi-server system
  • Figure 3 is an architecture diagram of a multi-client, multi-server separated system
  • Figure 4 is an architecture diagram of a multi-node tightly coupled system
  • Figure 5 is a diagram of the read/write synchronization process of the multi-node driver architecture
  • The single-client, single-server system shown in Figure 1 has a single point of failure and a performance bottleneck on the server side.
  • The single-client, multi-server system shown in Figure 2 establishes a cluster on the server side.
  • The multi-client, multi-server separated system shown in Figure 3 establishes a cluster on the client side and on the server side respectively and can perform load balancing at both ends, which avoids a single point of failure.
  • The client driver accepts the interactive request sent by the application, completes the SQL parsing, performs operation compilation and path optimization, and sends an operation request to the distributed big data processing system;
  • The big data processing system performs the processing on each processing node and returns the results to the client driver for aggregation;
  • The application instance receives the returned result and processes it in the business logic layer.
  • The above architecture avoids a single point of failure at the client or the server and reduces the communication overhead between clients. Because the client driver in this architecture only needs to hold a small amount of system metadata state, and read and query operations on the metadata dominate, it can be scaled out effectively and supports high concurrency.
  • The above method ensures read/write consistency of the data. Although a simple read operation may be delayed because of step (6), the order in which versions are read is guaranteed. When the latest version must be read, step 4.1) can be actively performed to synchronize the data. In addition, the method has good fault tolerance: as long as the number of failed nodes is less than n/2+1, reads and writes on the other nodes are not affected. When a node recovers, only one read or write operation is required for it to synchronize through steps 4.2) and 4.3).
  • The driver architecture and synchronization method for big data interactive processing proposed by the present invention can be applied to big data processing systems such as MapReduce, Spark, and HBase. By constructing a client driver layer, and on the basis of ensuring consistency, the invention gives the client driver layer near-linear scalability in scenarios dominated by metadata queries, meeting the needs of online, high-concurrency interactive analysis of big data.
  • Taking a driver architecture built on MapReduce as an example: where the original Hive single-point mode supports a concurrency of only 100, a 5-node tightly coupled driver architecture can reach a concurrency of 500.
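
The read/write synchronization described above (read the current version, increment it by 1, send the update to all nodes, and have each node approve unless it has already approved a higher version) can be illustrated with a minimal single-process Python sketch. It is not the patented implementation. The names DriverNode, vote, apply, sync, and write are hypothetical. The sketch assumes that an update takes effect once more than half of all nodes approve it, that a node also rejects a version equal to one it has already approved so that two writers cannot both win the same version number, and that the writing node is itself a live member of the node list.

```python
from dataclasses import dataclass, field


@dataclass
class DriverNode:
    """Hypothetical client-driver node holding a versioned metadata copy."""
    name: str
    version: int = 0      # version of the metadata currently applied
    approved: int = 0     # highest version this node has approved so far
    data: dict = field(default_factory=dict)
    alive: bool = True

    def vote(self, version):
        """Approve a proposed version, or report the latest known version."""
        if version > self.approved:
            self.approved = version
            return True, version
        return False, self.approved

    def apply(self, version, data):
        """Install an update that a majority has approved (assumed rule)."""
        if version > self.version:
            self.version = version
            self.data = dict(data)
            self.approved = max(self.approved, version)


def sync(local, nodes):
    """Copy the newest applied metadata from the reachable nodes."""
    newest = max((n for n in nodes if n.alive), key=lambda n: n.version)
    local.apply(newest.version, newest.data)


def write(local, nodes, key, value, max_attempts=3):
    """Return the committed version number, or None without a majority."""
    hint = 0
    for _ in range(max_attempts):
        version = max(local.version, hint) + 1   # current version + 1
        data = {**local.data, key: value}
        replies = [n.vote(version) for n in nodes if n.alive]
        if sum(ok for ok, _ in replies) > len(nodes) // 2:
            for n in nodes:
                if n.alive:
                    n.apply(version, data)
            return version
        # Rejected: remember the latest version reported, catch up, retry.
        hint = max((latest for _, latest in replies), default=hint)
        sync(local, nodes)
    return None


if __name__ == "__main__":
    nodes = [DriverNode(f"n{i}") for i in range(5)]
    print(write(nodes[0], nodes, "table_a", "schema_1"))
    nodes[3].alive = nodes[4].alive = False      # two of five nodes fail
    print(write(nodes[1], nodes, "table_b", "schema_1"))
    nodes[3].alive = nodes[4].alive = True       # both nodes come back
    print(write(nodes[4], nodes, "table_c", "schema_1"))
    print(nodes[4].version, sorted(nodes[4].data))
```

In this sketch a write still commits with two of five nodes down, and a stale node that comes back catches up during its next write: its first proposal is rejected, it learns the latest version from the replies, copies the newest metadata, and retries. With three of five nodes down no majority can be formed and write returns None.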

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Multimedia (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Information Transfer Between Computers (AREA)
  • Computer And Data Communications (AREA)

Abstract

The invention relates to a tightly coupled and scalable big data interaction method. By constructing a distributed, tightly coupled client driver layer while ensuring consistency, a single point of failure at a client or server is avoided and the communication overhead between clients is reduced, which gives the system near-linear scalability in scenarios dominated by metadata queries and meets the requirements of online, high-concurrency interactive analysis. The method ensures read/write consistency of the data. Even if an individual read operation is delayed, the order in which versions are read remains consistent. When the latest version must be read, a data synchronization process can be performed proactively. In addition, the method has good fault tolerance: as long as the number of failed nodes is less than half the number of nodes, the read/write data of the other nodes is not affected; once a node responds again, a single successful read/write operation is enough to achieve synchronization.
PCT/CN2015/072975 2014-10-28 2015-02-13 Tightly coupled and scalable big data interaction method WO2016065776A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201410585403.7A CN104348913B (zh) 2014-10-28 2014-10-28 Tightly coupled and scalable big data interaction method
CN201410585403.7 2014-10-28

Publications (1)

Publication Number Publication Date
WO2016065776A1 (fr) 2016-05-06

Family

ID=52503695

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2015/072975 WO2016065776A1 (fr) Tightly coupled and scalable big data interaction method

Country Status (2)

Country Link
CN (1) CN104348913B (fr)
WO (1) WO2016065776A1 (fr)

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
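CN104348913B (zh) * 2014-10-28 2016-08-24 浪潮电子信息产业股份有限公司 Tightly coupled and scalable big data interaction method
CN108063780B (zh) * 2016-11-08 2021-02-19 中国电信股份有限公司 Method and system for dynamically replicating data
CN106599195B (zh) * 2016-12-14 2020-07-31 北京邮电大学 Metadata synchronization method and system in a massive network data environment
CN108234641B (zh) * 2017-12-29 2021-01-29 北京奇元科技有限公司 Data read/write method and apparatus based on a distributed consistency protocol
CN110825309B (zh) * 2018-08-08 2021-06-29 华为技术有限公司 Data reading method, apparatus and system, and distributed system
CN109542872B (zh) * 2018-10-26 2021-01-22 金蝶软件(中国)有限公司 Data reading method, apparatus, computer device, and storage medium
CN111090665A (zh) * 2019-11-15 2020-05-01 广东数果科技有限公司 Data task scheduling method and scheduling system
CN116483739B (zh) * 2023-06-21 2023-08-25 深存科技(无锡)有限公司 Fast key-value pair write architecture based on hash computation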

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
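CN103218210A (zh) * 2013-04-28 2013-07-24 北京航空航天大学 File-level striping system suited to high-concurrency access to big data
CN103235807A (zh) * 2013-04-19 2013-08-07 浪潮集团山东通用软件有限公司 Data extraction and processing method supporting high concurrency and large data volumes
CN103428292A (zh) * 2013-08-20 2013-12-04 浪潮集团有限公司 Apparatus and method for efficient storage of big data
CN104348913A (zh) * 2014-10-28 2015-02-11 浪潮电子信息产业股份有限公司 Tightly coupled and scalable big data interaction method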

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
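CN102023920B (zh) * 2010-10-27 2012-09-05 西安交通大学 Tree-based message aggregation method in a remote parallel program debugging system
CN102521044B (zh) * 2011-12-30 2013-12-25 北京拓明科技有限公司 Distributed task scheduling method and system based on message middleware
CN103188346A (zh) * 2013-03-05 2013-07-03 北京航空航天大学 Load balancing system for large-scale, high-concurrency-access I/O servers supporting distributed decision-making
CN103227754B (zh) * 2013-04-16 2017-02-08 浪潮(北京)电子信息产业有限公司 Dynamic load balancing method and node device for a high-availability cluster system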

Also Published As

Publication number Publication date
CN104348913A (zh) 2015-02-11
CN104348913B (zh) 2016-08-24

Legal Events

Date Code Title Description
121 Ep: the EPO has been informed by WIPO that EP was designated in this application
Ref document number: 15855578; Country of ref document: EP; Kind code of ref document: A1
NENP Non-entry into the national phase
Ref country code: DE
122 Ep: PCT application non-entry in European phase
Ref document number: 15855578; Country of ref document: EP; Kind code of ref document: A1