WO2020191930A1 - Method for effectively reducing I/O consumption of a containerized relational database - Google Patents

Method for effectively reducing I/O consumption of a containerized relational database

Info

Publication number
WO2020191930A1
WO2020191930A1 PCT/CN2019/092672 CN2019092672W WO2020191930A1 WO 2020191930 A1 WO2020191930 A1 WO 2020191930A1 CN 2019092672 W CN2019092672 W CN 2019092672W WO 2020191930 A1 WO2020191930 A1 WO 2020191930A1
Authority
WO
WIPO (PCT)
Prior art keywords
memcached
layer
distributed cache
container
storage layer
Prior art date
Application number
PCT/CN2019/092672
Other languages
English (en)
French (fr)
Inventor
李鹏
杨菲
王汝传
徐鹤
李超飞
樊卫北
朱枫
程海涛
Original Assignee
南京邮电大学
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 南京邮电大学 filed Critical 南京邮电大学
Priority to JP2021522369A priority Critical patent/JP2022505720A/ja
Publication of WO2020191930A1 publication Critical patent/WO2020191930A1/zh

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/28Databases characterised by their database models, e.g. relational or object models
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00Arrangements for software engineering
    • G06F8/20Software design
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44Arrangements for executing specific programs
    • G06F9/455Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines

Definitions

  • the invention belongs to the technical field of performance optimization of container virtualization, and specifically relates to a method for effectively reducing I/O consumption of a containerized relational database.
  • Containers can solve many distributed application challenges, such as portability and performance overhead.
  • Kubernetes is a system that implements container-based deployment in Platform-as-a-Service (PaaS) clouds and is the Docker cluster solution most widely recognized in the industry; it can deploy cloud-native applications and is a distributed, horizontally scalable system composed of (micro)services, with elasticity and resilience support.
  • With the emptyDir or hostPath (local storage) volume types provided by Kubernetes, a container cannot retain its data after a restart or migration, the storage capacity is limited by the capacity of a single node, and the choice of RDS instance deployment nodes is limited by the underlying storage media (SSD/HDD); the cloud storage and distributed storage volume types provided by Kubernetes, in contrast, can achieve persistent storage of data.
  • This method of persisting data to remote storage makes use of an architecture that separates computing and storage.
  • The biggest advantage of separating computing and storage is that volumes are used to mount stateful data onto the storage layer.
  • The architecture is clear and the storage capacity is easy to expand. Compared with the local storage (local) method, however, this separated architecture requires remote data transfer: each I/O incurs extra network overhead and the request response time increases. For a latency-sensitive application such as a database, network latency greatly affects performance and leads to poor service quality of the business system; in high-density deployment scenarios it may also cause insufficient utilization of computing and storage resources.
  • a single microservice usually corresponds to a separate database.
  • Such a large application usually has multiple databases to share the huge amount of data, possibly with multiple backup instances at the same time, resulting in a large number of database instances.
  • The compute-storage separation architecture then faces multiple instances that need to persist data to the storage layer, causing network I/O overhead; especially when the RDS instance layer (all RDS instances in the platform) accesses the remote storage system with high concurrency, network bandwidth becomes a performance bottleneck and network traffic consumption increases sharply.
  • the distributed storage system will introduce the two major bottlenecks of the computer system (disk I/O and network I/O) into the business system, further aggravating the I/O overhead of the separate architecture.
  • The existing methods for optimizing the performance of the compute-storage separation architecture are: (1) optimization at the RDS instance layer: a database instance can improve I/O throughput by speeding up redo writes at transaction commit, or through database read/write splitting, DB sharding, and so on; (2) optimization at the storage layer: the multi-replica write design of the storage layer returns once a majority of replicas have acknowledged, hardware is upgraded, or flow control is applied at the storage layer. These methods are not only expensive but also unable to deliver an order-of-magnitude improvement in the performance of the storage-separation architecture, and cannot meet the requirements.
  • Accordingly, the present invention proposes a method to effectively reduce the I/O consumption of a containerized relational database. The method works at the RDS instance layer:
  • a high-availability distributed cache is added between the RDS instance layer and the storage layer to absorb the I/O overhead caused by the compute-storage separation architecture.
  • a method for effectively reducing I/O consumption of containerized relational databases includes:
  • Use a StorageClass to dynamically create Persistent Volumes at the storage layer, create a shared storage in the high-availability distributed cache architecture based on the storage-layer protocol for dynamic volume provisioning, and indicate the shared path created by the storage layer while specifying provisioner_name in env;
  • S15: Define an svc.yaml file in the high-availability distributed cache architecture, and set in it the Persistent Volume corresponding to each memcached pod;
  • the data that needs to be written to the storage layer at the RDS instance layer is first written into the high-availability distributed cache architecture for persistent storage, and then refreshed to the storage layer by the high-availability distributed cache architecture;
  • the data access mode between the RDS instance layer, the high-availability distributed cache architecture, and the storage layer is a serial mode; and the RDS instance layer directly performs read and write operations on the high-availability distributed cache architecture.
  • the high-availability distributed architecture uses the Persistent Volume to refresh data according to a specified period size.
  • The method of the present invention for effectively reducing the I/O consumption of containerized relational databases builds a memcached-based high-availability distributed cache architecture between the RDS instance layer and the storage layer on the Kubernetes and Docker platforms, and arranges the data interaction among the RDS instance layer, the high-availability distributed cache architecture, and the storage layer in series, which effectively shortens the network I/O distance.
  • Data in the RDS instance layer is persisted by the high-availability distributed architecture, and the high-availability distributed cache architecture flushes the data to the storage layer, so the data interaction between the RDS instance layer and the storage layer is completed in a single pass, which effectively reduces I/O consumption in RDS.
  • Compared with the prior art, the beneficial effects of the present invention are: High availability: the design of the high-availability distributed cache architecture takes disaster tolerance into account; master-slave replication with the master and slave on different nodes enables data backup and data synchronization between cache instances.
  • Light weight: the high-availability distributed cache architecture packages the memcache application in containers for rapid distribution and deployment, and uses Kubernetes to deploy the distributed system, which simplifies the management of each instance.
  • Figure 1 is a schematic diagram of the overall architecture using the high-availability distributed architecture based on the Kubernetes and Docker platforms in an embodiment of the present invention
  • FIG. 2 is a schematic diagram of the RDS instance layer cache mode in the embodiment of the present invention.
  • FIG. 3 is a schematic diagram of the composition structure of the high-availability distributed architecture in an embodiment of the present invention.
  • FIG. 4 is a schematic diagram of a processing flowchart of a write request at the RDS instance layer in an embodiment of the present invention
  • FIG. 5 is a schematic diagram of the processing flowchart of the read request of the RDS instance layer in the embodiment of the present invention.
  • A method for effectively reducing the I/O consumption of a containerized relational database is provided. Specifically, the method builds a memcached-based high-availability distributed cache architecture between the RDS instance layer and the storage layer on the Kubernetes and Docker platforms; data that the RDS instance layer needs to write to the storage layer is first written into the high-availability distributed cache architecture for persistent storage and is then flushed from the high-availability distributed cache architecture to the storage layer, and the high-availability distributed cache architecture caches the hot data of the RDS instance layer.
  • The data access mode among the RDS instance layer, the high-availability distributed cache architecture, and the storage layer is the serial mode, and the RDS instance layer performs read and write operations directly on the high-availability distributed cache architecture.
  • The process of building the memcached-based high-availability distributed cache architecture includes: first, adding the namespace_name prefix to the key of the data stored in memcached on the client side. Specifically, the consistent hashing algorithm of memcached is selected to shard the data horizontally; in Kubernetes, services may be defined in different namespaces, and to avoid identical keys appearing in different namespaces, data must be sharded separately for each namespace, that is, every record must have a globally unique primary key.
  • The StatefulSet resource object is used to create the magent and memcached instances; the container instances start sequentially and the generated pods are numbered from 0 to n-1.
  • The memcached master container and the memcached slave container use the same image, but separate StatefulSet files are created for the memcached master container and the memcached slave container.
  • The definition file of the memcached master container specifies that the generated memcached instance name is the memcached master container and sets two ports, the service port and the synchronization port.
  • The parameter TaintBasedEvictions is set to true to control the placement of memcached master containers on different nodes, and in the command section of the container template the startup command of the memcached master container is defined and replication:listen is set; the definition file of the memcached slave container specifies that the generated pod name is the memcached slave container together with the two ports, and after the TaintBasedEvictions parameter is set to true, the slave startup script is added to the command, which specifies that a master and a slave with the same sequence number may not be started on the same node.
  • The definition files of the memcached master container and the memcached slave container also need to set volumeClaimTemplates (persistent storage) pointing to the created shared path.
  • When creating a magent instance, the master and slave with the same number as the magent instance are matched first, and in the startup command -s is specified as master-x and -b as slave-x.
  • The storage layer needs to import the memcached plug-in libmemcached.so, and the configuration information is added and activated through libmemcached.so.
  • Data written to the storage layer is passed to the storage layer through the provisioner, and read, write, add, delete, and other operations on the storage-layer data are performed through functions in the libmemcached.so plug-in.
  • The RDS instance layer sends read and write requests to the memcached client as specified by the environment variables env: service_name and port; the client forwards the read and write requests to the corresponding memcached magent container through the consistent hashing algorithm, and the memcached magent container then passes the requests to memcached.
  • Specifically, through the consistent hashing algorithm, the key of the cached data corresponding to a read or write request and the memcached magent containers can each be hashed onto the ring hash space; the mapping between a cache key and a magent container is: hash(key) maps to the first magent container hash(magent x) encountered in the clockwise direction. For a write request, the memcached magent container writes the data to the memcached master container and the memcached slave container; for a read request, the request is sent to the memcached instance whose role is the memcached master container. The data of each memcached instance is periodically flushed to the storage-layer Persistent Volume through the volume definition.
  • repcached is added to realize the data synchronization and backup between the single master and single slave of the cache instance.
  • the memcached master container and memcached slave container are both readable and writable.
  • When the memcached master container goes down or becomes temporarily unavailable, the memcached slave container automatically listens and becomes the master, and waits for a new instance to be created; the memcached magent container is added to achieve load balancing of the distributed cluster: the memcached client connects to the memcached magent container, the memcached magent container connects to the memcached master container and the memcached slave container, and every write is written to both the memcached master container and the memcached slave container. When the roles of the memcached master container and the memcached slave container are exchanged, the ordering of the multiple memcached magent containers remains unchanged for the client, so data migration is not affected.
  • the method for the RDS instance layer to access the shared cache in the present invention is the serial mode.
  • the serial mode can completely block the direct data interaction between each RDS instance layer and the storage layer.
  • all access requests are sent to the shared cache.
  • the RDS instance layer write data is directly written to the shared cache, and the read request is also sent directly to the shared cache.
  • When the data to be read is not in the shared cache, the request is sent to the storage layer; the storage layer looks up the corresponding data, writes it into the shared cache, and the shared cache then returns it.
  • The high-availability distributed architecture in the present invention flushes data through the Persistent Volume; the present invention does not fix or limit the size of a data flush, which can be set according to the actual situation.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present invention discloses a method for effectively reducing the I/O consumption of a containerized relational database. In the method, a memcached-based high-availability distributed cache architecture is built between the RDS instance layer and the storage layer on the Kubernetes and Docker platforms; data that the RDS instance layer needs to write to the storage layer is first written into the high-availability distributed cache architecture for persistent storage and is then flushed to the storage layer by the high-availability distributed cache architecture; and the high-availability distributed cache architecture caches the hot data of the RDS instance layer. The present invention can use the high-availability distributed cache architecture to block direct interaction between the RDS instance layer and the storage layer, which effectively reduces I/O consumption in the RDS instance layer and at the same time shortens the network I/O distance.

Description

Method for effectively reducing the I/O consumption of a containerized relational database
Technical Field
The invention belongs to the technical field of performance optimization for container virtualization, and specifically relates to a method for effectively reducing the I/O consumption of a containerized relational database.
Background
With the rapid development of information technology, cluster systems keep growing in scale, and making full and efficient use of cluster resources has become a pressing problem. Because traditional virtualization is difficult to deploy, update, and upgrade, containerization has emerged as its replacement, offering light weight, resource sharing, and rapid scaling. Containers can solve many challenges of distributed applications, such as portability and performance overhead, but when containers are used as the foundation of large-scale systems, resource management still poses many challenges. Kubernetes is a system for container-based deployment in Platform-as-a-Service (PaaS) clouds and is the Docker cluster solution most widely recognized by the industry; it can deploy cloud-native applications and is a distributed, horizontally scalable system composed of (micro)services, with elasticity and resilience support. The cloud industry has embraced the combination of Kubernetes and Docker beyond expectation and has gradually introduced it into the RDS (Relational Database Service) field. A database, however, is a stateful application, so data persistence must be considered when it is deployed in containers, which leads to the choice between local storage and remote storage (the origin of the separated architecture). With the emptyDir or hostPath (local storage) volume types provided by Kubernetes, a container cannot retain its data after a restart or migration, storage capacity is limited by that of a single node, and the choice of nodes for deploying RDS instances is constrained by the underlying storage media (SSD/HDD). The cloud storage and distributed storage volume types provided by Kubernetes, in contrast, can persist data; persisting data to remote storage makes use of a compute-storage separation architecture. The biggest advantage of separating compute and storage is that volumes are used to mount stateful data onto the storage layer: when an RDS instance is deployed, there is no need to be aware of a node's storage medium as with local storage; the instance only needs to be scheduled onto a node that satisfies its compute resource requests and limits, and when the database instance starts it simply mounts the matching volume from the storage layer. This significantly increases the deployment density of database containers and the utilization of compute resources, while the architecture stays clear and storage capacity is easy to expand. Compared with local storage, however, this separated architecture requires remote data transfer; each I/O incurs extra network overhead, and request response time increases. For a latency-sensitive application such as a database, network latency severely degrades performance and hence the quality of service of the business system, and in high-density deployment scenarios it may lead to under-utilization of compute and storage resources.
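To make the contrast above concrete, the following is a minimal configuration sketch, not taken from the original disclosure, of the two volume styles being compared; the names rds-local-example, cache-data, rds-data and the StorageClass name rds-remote are illustrative assumptions.

```yaml
# Local storage: the data lives on the node, is lost when the pod is rescheduled
# (emptyDir) or is tied to one node's disk (hostPath).
apiVersion: v1
kind: Pod
metadata:
  name: rds-local-example            # hypothetical pod name
spec:
  containers:
    - name: mysql
      image: mysql:5.7
      volumeMounts:
        - name: cache-data
          mountPath: /var/lib/mysql
  volumes:
    - name: cache-data
      emptyDir: {}                   # or: hostPath: { path: /data/mysql }
---
# Remote storage: a PersistentVolumeClaim backed by a distributed-storage
# StorageClass, so the data survives container restarts and node migration.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: rds-data                     # hypothetical claim name
spec:
  accessModes: ["ReadWriteOnce"]
  storageClassName: rds-remote       # assumed remote/distributed StorageClass
  resources:
    requests:
      storage: 20Gi
```

With the first form the data follows the node; with the second the data follows the claim, which is what makes the compute-storage separation discussed below possible, at the cost of the extra network hop per I/O.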
The rapid development of the Internet and the continuous expansion of business have caused data volumes to grow explosively. A single microservice usually corresponds to a separate database, so a large application typically uses multiple databases to share the huge data volume, possibly with several backup instances at the same time, resulting in a very large number of database instances. The compute-storage separation architecture then faces many instances that need to persist data to the storage layer, which causes network I/O overhead; in particular, when the RDS instance layer (all RDS instances in the platform) accesses the remote storage system with high concurrency, network bandwidth becomes the performance bottleneck and network traffic consumption surges. Moreover, when distributed storage is introduced at the storage layer, the distributed storage system brings the two major bottlenecks of a computer system (disk I/O and network I/O) into the business system, further aggravating the I/O overhead of the separated architecture.
The existing methods for optimizing the performance of the compute-storage separation architecture are: (1) optimization at the RDS instance layer: a database instance can improve I/O throughput by speeding up redo writes at transaction commit, or through database read/write splitting, DB sharding, and so on; (2) optimization at the storage layer: the multi-replica write design of the storage layer returns once a majority of replicas have acknowledged, hardware is upgraded, or flow control is applied at the storage layer. These methods are not only costly but also unable to deliver an order-of-magnitude improvement in the performance of the storage-separation architecture, and therefore cannot meet the requirements.
Summary of the Invention
To address the high cost and limited performance gains of the existing optimizations of the compute-storage separation architecture described above, the present invention proposes a method for effectively reducing the I/O consumption of a containerized relational database. The method adds a high-availability distributed cache between the RDS instance layer and the storage layer to absorb the I/O overhead caused by persisting data under the compute-storage separation architecture. The specific technical solution is as follows:
A method for effectively reducing the I/O consumption of a containerized relational database, the method comprising:
S1. Building a memcached-based high-availability distributed cache architecture between the RDS instance layer and the storage layer on the Kubernetes and Docker platforms:
S11. Prefixing the key of the data stored in memcached on the client side with namespace_name;
S12. Building the container images for the libevent, memcached, repcached, and magent components of the high-availability distributed cache architecture: libevent+magent and libevent+memcache+repcached;
S13. Using a StorageClass to dynamically create Persistent Volumes at the storage layer, creating a shared storage in the high-availability distributed cache architecture based on the storage-layer protocol for dynamic volume provisioning, and indicating the shared path created by the storage layer and specifying provisioner_name in env (a configuration sketch of this step follows this summary);
S14. Deploying the memcached master container, memcached slave container, and memcached magent container from the container images libevent+magent and libevent+memcache+repcached, and placing the memcached master container and memcached slave container on different nodes;
S15. Defining an svc.yaml file in the high-availability distributed cache architecture and setting, in the svc.yaml file, the Persistent Volume corresponding to each memcached pod;
S2. Data that the RDS instance layer needs to write to the storage layer is first written into the high-availability distributed cache architecture for persistent storage and is then flushed to the storage layer by the high-availability distributed cache architecture;
S3. Hot data of the RDS instance layer is cached by the high-availability distributed cache architecture.
Further, the data access mode among the RDS instance layer, the high-availability distributed cache architecture, and the storage layer is a serial mode, and the RDS instance layer performs read and write operations directly on the high-availability distributed cache architecture.
Further, the high-availability distributed architecture flushes data through the Persistent Volume according to a specified period.
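As a reference for step S13 above, the following sketch shows one way the StorageClass, the external provisioner, and the env entries could be wired together; the provisioner name example.com/nfs, the NFS-style server address and shared path, the image, and the variable names are assumptions, since the patent does not fix a particular storage-layer protocol.

```yaml
# Hypothetical StorageClass used by the cache layer's volumeClaimTemplates.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: cache-shared-storage
provisioner: example.com/nfs               # must match PROVISIONER_NAME below
reclaimPolicy: Retain
---
# External provisioner that dynamically creates Persistent Volumes at the
# storage layer (service account and RBAC omitted for brevity).
apiVersion: apps/v1
kind: Deployment
metadata:
  name: storage-provisioner
spec:
  replicas: 1
  selector:
    matchLabels: { app: storage-provisioner }
  template:
    metadata:
      labels: { app: storage-provisioner }
    spec:
      containers:
        - name: provisioner
          image: example/nfs-client-provisioner:latest   # assumed image
          env:
            - name: PROVISIONER_NAME       # provisioner_name specified in env
              value: example.com/nfs
            - name: NFS_SERVER             # storage-layer endpoint (assumption)
              value: "10.0.0.10"
            - name: NFS_PATH               # shared path created by the storage layer
              value: /export/shared
```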
With the method of the present invention for effectively reducing the I/O consumption of a containerized relational database, a memcached-based high-availability distributed cache architecture is built between the RDS instance layer and the storage layer on the Kubernetes and Docker platforms, and the data interaction among the RDS instance layer, the high-availability distributed cache architecture, and the storage layer is arranged in series, which effectively shortens the network I/O distance. Data of the RDS instance layer is persisted by the high-availability distributed architecture and then flushed by the high-availability distributed cache architecture to the storage layer, so the data interaction between the RDS instance layer and the storage layer is completed in a single pass, which effectively reduces I/O consumption in RDS. Compared with the prior art, the beneficial effects of the present invention are: High availability: the design of the high-availability distributed cache architecture takes disaster tolerance into account; master-slave replication with the master and slave on different nodes is used for deployment, enabling data backup and data synchronization between cache instances. Light weight: the high-availability distributed cache architecture packages the memcache application in containers for rapid distribution and deployment, and uses Kubernetes to deploy the distributed system, which simplifies the management of each instance.
Brief Description of the Drawings
Figure 1 is a schematic diagram of the overall architecture using the high-availability distributed architecture based on the Kubernetes and Docker platforms in an embodiment of the present invention;
Figure 2 is a schematic diagram of the cache mode of the RDS instance layer in an embodiment of the present invention;
Figure 3 is a schematic diagram of the composition of the high-availability distributed architecture in an embodiment of the present invention;
Figure 4 is a schematic flowchart of the processing of a write request at the RDS instance layer in an embodiment of the present invention;
Figure 5 is a schematic flowchart of the processing of a read request at the RDS instance layer in an embodiment of the present invention.
Detailed Description of the Embodiments
To enable those skilled in the art to better understand the solution of the present invention, the technical solutions in the embodiments of the present invention are described below clearly and completely with reference to the accompanying drawings.
With reference to Figures 1 to 5, an embodiment of the present invention provides a method for effectively reducing the I/O consumption of a containerized relational database. Specifically, the method builds a memcached-based high-availability distributed cache architecture between the RDS instance layer and the storage layer on the Kubernetes and Docker platforms; data that the RDS instance layer needs to write to the storage layer is first written into the high-availability distributed cache architecture for persistent storage and is then flushed by the high-availability distributed cache architecture to the storage layer, and the high-availability distributed cache architecture caches the hot data of the RDS instance layer.
In this embodiment of the present invention, the data access mode among the RDS instance layer, the high-availability distributed cache architecture, and the storage layer is the serial mode, and the RDS instance layer performs read and write operations directly on the high-availability distributed cache architecture. The process of building the memcached-based high-availability distributed cache architecture is as follows. First, the namespace_name prefix is added to the key of the data stored in memcached on the client side. Specifically, the consistent hashing algorithm of memcached is chosen to shard the data horizontally; in Kubernetes, services may be defined in different namespaces, and to avoid identical keys appearing in different namespaces, data must be sharded separately for each namespace, that is, every record must have a globally unique primary key. On the client side, the key rule is designed as key = namespace_name + value_key, where namespace_name is the string identifying the namespace and value_key is the key of the cached data within that namespace. The container images for the libevent, memcached, repcached, and magent components of the high-availability distributed cache architecture are also built: libevent+magent and libevent+memcache+repcached.
Next, a StorageClass is used to dynamically create Persistent Volumes at the storage layer, and a shared storage is created in the high-availability distributed cache architecture based on the storage-layer protocol for dynamic volume provisioning, indicating the shared path created by the storage layer and specifying provisioner_name in env. Meanwhile, the memcached master container, memcached slave container, and memcached magent container are deployed from the container images libevent+magent and libevent+memcache+repcached, with the memcached master container and memcached slave container placed on different nodes. Because memcached is a stateful application, each instance needs a unique identity and the instances must start in a defined order, so the StatefulSet resource object is used to create the magent and memcached instances; the container instances start sequentially and the generated pods are numbered from 0 to n-1. The memcached master container and the memcached slave container use the same image, but separate StatefulSet files are created for the memcached master container and the memcached slave container. The definition file of the memcached master container specifies that the generated memcached instance name is the memcached master container and sets two ports, the service port and the synchronization port; the parameter TaintBasedEvictions is set to true to control the placement of memcached master containers on different nodes, and in the command section of the container template the startup command of the memcached master container is defined and replication:listen is set. The definition file of the memcached slave container specifies that the generated pod name is the memcached slave container together with the two ports; after the TaintBasedEvictions parameter is set to true, the slave startup script is added to the command, which specifies that a master and a slave with the same sequence number may not be started on the same node; before the startup command is executed, the master with the same number as the slave instance is matched, and the startup command is then executed with replication:accept(peer=master-x), replication:marugoto copying, and replication:start set. Preferably, the definition files of the memcached master container and the memcached slave container also set volumeClaimTemplates (persistent storage) pointing to the created shared path.
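A minimal sketch of what the memcached master StatefulSet definition described above might look like is given below; it is an illustration under assumptions, not the patent's own file. The image name, labels, and ports (11211 for the service port, 11212 for the repcached synchronization port) are assumed; start-master.sh is a hypothetical script standing in for the startup command that sets replication:listen; and the requirement that masters and slaves land on different nodes is expressed here with podAntiAffinity as one concrete mechanism, whereas the text itself mentions the TaintBasedEvictions parameter.

```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: memcached-master               # pods are created as memcached-master-0 ... n-1
spec:
  serviceName: memcached-master
  replicas: 3
  selector:
    matchLabels: { app: memcached, role: master }
  template:
    metadata:
      labels: { app: memcached, role: master }
    spec:
      affinity:
        podAntiAffinity:
          # Simple way to keep memcached pods apart; the patent only requires
          # that a master and the slave with the same number not share a node.
          requiredDuringSchedulingIgnoredDuringExecution:
            - labelSelector:
                matchLabels: { app: memcached }
              topologyKey: kubernetes.io/hostname
      containers:
        - name: memcached-master
          image: libevent-memcache-repcached:latest      # assumed image name
          # start-master.sh is a hypothetical script that launches memcached with
          # repcached enabled (service port 11211, sync port 11212) in the
          # replication:listen role described in the text.
          command: ["/bin/sh", "-c", "/scripts/start-master.sh"]
          ports:
            - { name: service, containerPort: 11211 }
            - { name: sync, containerPort: 11212 }
          volumeMounts:
            - name: cache-data
              mountPath: /data
  volumeClaimTemplates:                # persistent storage pointing at the shared path
    - metadata:
        name: cache-data
      spec:
        accessModes: ["ReadWriteOnce"]
        storageClassName: cache-shared-storage   # assumed class from the earlier sketch
        resources:
          requests:
            storage: 5Gi
```

The memcached slave StatefulSet would mirror this file, with the slave startup script issuing replication:accept(peer=master-x), replication:marugoto copying and replication:start instead, as described above.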
When creating a magent instance, the master and slave with the same number as the magent instance are matched first, and in the startup command -s is specified as master-x and -b as slave-x.
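Expressed in the magent StatefulSet, the startup described in this paragraph could look roughly like the fragment below; -s and -b are magent's options for the memcached server and the backup server, as stated above, while the image name, the listening port 12000, and the pod DNS names (which follow the usual <pod>.<headless-service> StatefulSet pattern) are assumptions.

```yaml
# Fragment of a hypothetical magent pod template (the enclosing StatefulSet is
# omitted): magent-x fronts memcached-master-x via -s and memcached-slave-x via -b.
containers:
  - name: magent
    image: libevent-magent:latest        # assumed image name
    # ${HOSTNAME##*-} extracts the pod ordinal so that magent-x is paired with
    # the master and slave that carry the same number.
    command:
      - /bin/sh
      - -c
      - >
        ORD=${HOSTNAME##*-};
        magent -l 0.0.0.0 -p 12000
        -s memcached-master-${ORD}.memcached-master:11211
        -b memcached-slave-${ORD}.memcached-slave:11211
    ports:
      - containerPort: 12000             # assumed magent service port
```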
Finally, an svc.yaml file is defined in the high-availability distributed cache architecture, and the Persistent Volume corresponding to each memcached pod is set in the svc.yaml file.
Specifically, so that the memcached client can discover magent, an svc.yaml must be created for magent, specifying a globally unique service name and the service port, and the memcache (client) image is modified with the key rule; on this basis a headless service is created that specifies the service name of the shared cache and the port providing the service. The environment variables env of the RDS instance layer are modified so that env specifies the service name and port of the shared cache service, and the RDS instance layer accesses the shared cache through the service name and port number. The storage layer, if left unmodified, cannot handle the read requests sent by the cache layer; the storage layer therefore needs to import the memcached plug-in libmemcached.so, through which the configuration information is added and activated. Data written to the storage layer is passed to the storage layer through the provisioner, and read, write, add, delete, and other operations on the storage-layer data are performed through the functions in the libmemcached.so plug-in.
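A sketch of the svc.yaml headless service for magent, together with the env fragment through which an RDS instance could receive the shared-cache coordinates, is shown below; the service name shared-cache, the port 12000, and the variable names MEMCACHED_SERVICE_NAME and MEMCACHED_PORT are illustrative assumptions.

```yaml
# svc.yaml: headless service that gives the magent pods a stable, globally
# unique service name that the memcached client can discover.
apiVersion: v1
kind: Service
metadata:
  name: shared-cache                   # assumed global service name
spec:
  clusterIP: None                      # headless service
  selector:
    app: magent                        # assumed label on the magent pods
  ports:
    - name: magent
      port: 12000
      targetPort: 12000
---
# Fragment of an RDS instance pod spec: env carries the service name and port of
# the shared cache, which the instance then uses for all reads and writes.
containers:
  - name: rds-instance
    image: mysql:5.7                   # example database image
    env:
      - name: MEMCACHED_SERVICE_NAME   # assumed variable names
        value: shared-cache
      - name: MEMCACHED_PORT
        value: "12000"
```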
In this embodiment of the present invention, the RDS instance layer sends read and write requests to the memcached client as specified by the environment variables env: service_name and port; the client forwards the read and write requests to the corresponding memcached magent container through the consistent hashing algorithm, and the memcached magent container then passes the requests on to memcached. Specifically, through the consistent hashing algorithm, the key of the cached data corresponding to a read or write request and the memcached magent containers are each hashed onto the ring hash space, and the mapping between a cache key and a magent container is: hash(key) maps to the first magent container hash(magent x) encountered in the clockwise direction. For a write request, the memcached magent container writes the data to the memcached master container and the memcached slave container; for a read request, the request is sent to the memcached instance whose role is the memcached master container. The data of each memcached instance is periodically flushed to the storage-layer Persistent Volume through the volume definition.
In addition, on the basis of the memcached-based high-availability distributed cluster architecture, repcached is added to achieve data synchronization and backup between the single master and single slave of a cache instance; both the memcached master container and the memcached slave container are readable and writable. When the memcached master container goes down or becomes temporarily unavailable, the memcached slave container automatically listens and becomes the master, and waits for a new instance to be created. The memcached magent container is added to achieve load balancing of the distributed cluster: the memcached client connects to the memcached magent container, the memcached magent container connects to the memcached master container and the memcached slave container, and every write is written to both the memcached master container and the memcached slave container. When the roles of the memcached master container and the memcached slave container are exchanged, the ordering of the multiple memcached magent containers remains unchanged for the client, so data migration is not affected.
Preferably, in the present invention the RDS instance layer accesses the shared cache in the serial mode, which completely blocks direct data interaction between each RDS instance layer and the storage layer. When the RDS instance layer needs to exchange data with the storage layer, all access requests are sent to the shared cache: data written by the RDS instance layer is written directly into the shared cache, and read requests are also sent directly to the shared cache. When the data to be read is not in the shared cache, the request is sent to the storage layer, which looks up the corresponding data, writes it into the shared cache first, and the shared cache then returns it.
Preferably, the high-availability distributed architecture of the present invention flushes data through the Persistent Volume; the present invention does not fix or limit the size of a data flush, which can be set according to the actual situation.
With the method of the present invention for effectively reducing the I/O consumption of a containerized relational database, a memcached-based high-availability distributed cache architecture is built between the RDS instance layer and the storage layer on the Kubernetes and Docker platforms, and the data interaction among the RDS instance layer, the high-availability distributed cache architecture, and the storage layer is arranged in series, which effectively shortens the network I/O distance. Data of the RDS instance layer is persisted by the high-availability distributed architecture and then flushed by the high-availability distributed cache architecture to the storage layer, so the data interaction between the RDS instance layer and the storage layer is completed in a single pass, which effectively reduces I/O consumption in RDS. Compared with the prior art, the beneficial effects of the present invention are: High availability: the design of the high-availability distributed cache architecture takes disaster tolerance into account; master-slave replication with the master and slave on different nodes is used for deployment, enabling data backup and data synchronization between cache instances. Light weight: the high-availability distributed cache architecture packages the memcache application in containers for rapid distribution and deployment, and uses Kubernetes to deploy the distributed system, which simplifies the management of each instance.
The above are only preferred embodiments of the present invention and do not limit the scope of the patent. Although the present invention has been described in detail with reference to the foregoing embodiments, those skilled in the art may still modify the technical solutions recorded in the foregoing specific embodiments or make equivalent substitutions for some of their technical features. Any equivalent structure made using the contents of the specification and drawings of the present invention, applied directly or indirectly in other related technical fields, likewise falls within the scope of patent protection of the present invention.

Claims (3)

  1. A method for effectively reducing the I/O consumption of a containerized relational database, characterized in that the method comprises:
    S1. Building a memcached-based high-availability distributed cache architecture between the RDS instance layer and the storage layer on the Kubernetes and Docker platforms:
    S11. Prefixing the key of the data stored in memcached on the client side with namespace_name;
    S12. Building the container images for the libevent, memcached, repcached, and magent components of the high-availability distributed cache architecture: libevent+magent and libevent+memcache+repcached;
    S13. Using a StorageClass to dynamically create Persistent Volumes at the storage layer, creating a shared storage in the high-availability distributed cache architecture based on the storage-layer protocol for dynamic volume provisioning, and indicating the shared path created by the storage layer and specifying provisioner_name in env;
    S14. Deploying the memcached master container, memcached slave container, and memcached magent container from the container images libevent+magent and libevent+memcache+repcached, and placing the memcached master container and memcached slave container on different nodes;
    S15. Defining an svc.yaml file in the high-availability distributed cache architecture and setting, in the svc.yaml file, the Persistent Volume corresponding to each memcached pod;
    S2. Data that the RDS instance layer needs to write to the storage layer is first written into the high-availability distributed cache architecture for persistent storage and is then flushed to the storage layer by the high-availability distributed cache architecture;
    S3. Hot data of the RDS instance layer is cached by the high-availability distributed cache architecture.
  2. The method for effectively reducing the I/O consumption of a containerized relational database according to claim 1, characterized in that the data access mode among the RDS instance layer, the high-availability distributed cache architecture, and the storage layer is a serial mode, and the RDS instance layer performs read and write operations directly on the high-availability distributed cache architecture.
  3. The method for effectively reducing the I/O consumption of a containerized relational database according to claim 1, characterized in that the high-availability distributed architecture flushes data through the Persistent Volume according to a specified period.
PCT/CN2019/092672 2019-03-25 2019-06-25 一种有效降低容器化关系型数据库i/o消耗的方法 WO2020191930A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP2021522369A JP2022505720A (ja) 2019-03-25 2019-06-25 コンテナ化されたリレーショナルデータベースのi/o消費を効果的に減少させる方法

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910235720.9 2019-03-25
CN201910235720.9A CN109933312B (zh) 2019-03-25 2019-03-25 一种有效降低容器化关系型数据库i/o消耗的方法

Publications (1)

Publication Number Publication Date
WO2020191930A1 (zh)

Family

ID=66988465

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/092672 WO2020191930A1 (zh) 2019-03-25 2019-06-25 一种有效降低容器化关系型数据库i/o消耗的方法

Country Status (3)

Country Link
JP (1) JP2022505720A (zh)
CN (1) CN109933312B (zh)
WO (1) WO2020191930A1 (zh)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113239118A (zh) * 2021-05-31 2021-08-10 广州宏算信息科技有限公司 一种区块链实训系统和方法
CN113296711A (zh) * 2021-06-11 2021-08-24 中国科学技术大学 一种数据库场景中优化分布式存储延迟的方法

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110825705A (zh) * 2019-11-22 2020-02-21 广东浪潮大数据研究有限公司 一种数据集缓存方法及相关装置
CN111176664A (zh) * 2019-12-26 2020-05-19 中国电子科技网络信息安全有限公司 一种存储集群设置方法、装置、介质及设备
CN111597192B (zh) * 2020-04-10 2023-10-03 北京百度网讯科技有限公司 数据库的切换控制方法、装置及电子设备
CN115941686A (zh) * 2022-11-15 2023-04-07 浪潮云信息技术股份公司 一种实现云原生应用高可用服务的方法及系统

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106843837A (zh) * 2016-12-21 2017-06-13 中电科华云信息技术有限公司 openstack组件容器化的构建方法
CN107797767A (zh) * 2017-09-30 2018-03-13 南京卓盛云信息科技有限公司 一种基于容器技术部署分布式存储系统及其存储方法
CN109491859A (zh) * 2018-10-16 2019-03-19 华南理工大学 针对Kubernetes集群中容器日志的收集方法

Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8229945B2 (en) * 2008-03-20 2012-07-24 Schooner Information Technology, Inc. Scalable database management software on a cluster of nodes using a shared-distributed flash memory
US10191778B1 (en) * 2015-11-16 2019-01-29 Turbonomic, Inc. Systems, apparatus and methods for management of software containers
US8612688B2 (en) * 2010-12-30 2013-12-17 Facebook, Inc. Distributed cache for graph data
US9984079B1 (en) * 2012-01-13 2018-05-29 Amazon Technologies, Inc. Managing data storage using storage policy specifications
CN103747060B (zh) * 2013-12-26 2017-12-08 惠州华阳通用电子有限公司 一种基于流媒体服务集群的分布式监控系统及方法
CN104504158A (zh) * 2015-01-19 2015-04-08 浪潮(北京)电子信息产业有限公司 一种快速更新业务的内存缓存的方法和设备
JP2018173741A (ja) * 2017-03-31 2018-11-08 富士通株式会社 コンテナ登録プログラム、コンテナ登録装置及びコンテナ登録方法
CN109213571B (zh) * 2018-08-30 2020-12-29 北京百悟科技有限公司 一种内存共享方法、容器管理平台及计算机可读存储介质

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106843837A (zh) * 2016-12-21 2017-06-13 中电科华云信息技术有限公司 openstack组件容器化的构建方法
CN107797767A (zh) * 2017-09-30 2018-03-13 南京卓盛云信息科技有限公司 一种基于容器技术部署分布式存储系统及其存储方法
CN109491859A (zh) * 2018-10-16 2019-03-19 华南理工大学 针对Kubernetes集群中容器日志的收集方法

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113239118A (zh) * 2021-05-31 2021-08-10 广州宏算信息科技有限公司 一种区块链实训系统和方法
CN113296711A (zh) * 2021-06-11 2021-08-24 中国科学技术大学 一种数据库场景中优化分布式存储延迟的方法

Also Published As

Publication number Publication date
CN109933312B (zh) 2021-06-01
JP2022505720A (ja) 2022-01-14
CN109933312A (zh) 2019-06-25

Similar Documents

Publication Publication Date Title
WO2020191930A1 (zh) 一种有效降低容器化关系型数据库i/o消耗的方法
US10929428B1 (en) Adaptive database replication for database copies
LU102666B1 (en) Small-file storage optimization system based on virtual file system in kubernetes user-mode application
CN108604164B (zh) 用于存储区域网络协议存储的同步复制
US10275489B1 (en) Binary encoding-based optimizations at datastore accelerators
US20190370362A1 (en) Multi-protocol cloud storage for big data and analytics
US20190392053A1 (en) Hierarchical namespace with strong consistency and horizontal scalability
US10540119B2 (en) Distributed shared log storage system having an adapter for heterogenous big data workloads
US20190370360A1 (en) Cloud storage distributed file system
US10735369B2 (en) Hierarchical namespace service with distributed name resolution caching and synchronization
WO2018157602A1 (zh) 一种同步活动事务表的方法及装置
CN111078121A (zh) 一种分布式存储系统数据迁移方法、系统、及相关组件
US10852985B2 (en) Persistent hole reservation
US11567680B2 (en) Method and system for dynamic storage scaling
WO2017113962A1 (zh) 访问分布式数据库的方法和分布式数据服务的装置
US11321283B2 (en) Table and index communications channels
CN113032356B (zh) 一种客舱分布式文件存储系统及实现方法
CN107493309B (zh) 一种分布式系统中的文件写入方法及装置
CN110704541A (zh) 一种Redis集群多数据中心高可用的分布式方法及架构
US11196806B2 (en) Method and apparatus for replicating data between storage systems
US10102228B1 (en) Table and index communications channels
US11940972B2 (en) Execution of operations on partitioned tables
KR101335934B1 (ko) 비대칭 클러스터 분산 파일 시스템에서 데이터 복제 및 복구 방법
JP5278254B2 (ja) ストレージシステム、データ記憶方法及びプログラム
US20200342065A1 (en) Replicating user created snapshots

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19921449

Country of ref document: EP

Kind code of ref document: A1

ENP Entry into the national phase

Ref document number: 2021522369

Country of ref document: JP

Kind code of ref document: A

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 19921449

Country of ref document: EP

Kind code of ref document: A1