WO2022126863A1 - Cloud orchestration system and method based on read-write separation and automatic scaling - Google Patents

Cloud orchestration system and method based on read-write separation and automatic scaling

Info

Publication number
WO2022126863A1
WO2022126863A1 (application PCT/CN2021/078499; CN2021078499W)
Authority
WO
WIPO (PCT)
Prior art keywords
read
module
write
automatic scaling
write separation
Prior art date
Application number
PCT/CN2021/078499
Other languages
English (en)
French (fr)
Inventor
占绍雄
冯景华
金荣钏
李扬
韩卿
Original Assignee
跬云(上海)信息科技有限公司
Priority date
Filing date
Publication date
Application filed by 跬云(上海)信息科技有限公司
Priority to US17/611,183 (published as US20230359647A1)
Priority to EP21801819.0 (published as EP4044031A4)
Publication of WO2022126863A1


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005 Allocation of resources to service a request
    • G06F9/5027 Allocation of resources to service a request, the resource being a machine, e.g. CPUs, Servers, Terminals
    • G06F9/5055 Allocation of resources, the resource being a machine, considering software capabilities, i.e. software resources associated or available to the machine
    • G06F9/505 Allocation of resources, the resource being a machine, considering the load
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/2228 Indexing structures
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2458 Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2465 Query processing support for facilitating data mining operations in structured databases
    • G06F16/28 Databases characterised by their database models, e.g. relational or object models
    • G06F16/283 Multi-dimensional databases or data warehouses, e.g. MOLAP or ROLAP
    • G06F9/48 Program initiating; Program switching, e.g. by interrupt
    • G06F9/4806 Task transfer initiation or dispatching
    • G06F9/4843 Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
    • G06F9/4881 Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues
    • G06F9/5061 Partitioning or combining of resources
    • G06F2209/00 Indexing scheme relating to G06F9/00
    • G06F2209/50 Indexing scheme relating to G06F9/50
    • G06F2209/501 Performance criteria
    • G06F2209/5017 Task decomposition
    • G06F2209/505 Clust
    • G06F9/5072 Grid computing

Definitions

  • the present invention relates to the technical field of cloud orchestration, and in particular, to a cloud orchestration system and method based on read-write separation and automatic scaling.
  • the present disclosure provides a cloud orchestration system and method based on read-write separation and automatic scaling which, according to the elasticity of the job, can complete job execution with a reasonable amount of resources and effectively reduce TCO (total cost of ownership).
  • the technical solution is as follows:
  • the present invention provides a cloud orchestration system based on read-write separation and automatic scaling, including a client-side visualization module, a pre-computing engine module, a read-write separation module, and an automatic scaling module.
  • the client-side visualization module is used to visually set the number of task nodes and to submit tasks to the pre-computing engine module.
  • the pre-computing engine module is used to perform multi-dimensional analysis, with OLAP technology, on the tasks submitted by the client-side visualization module according to the business's online analytical processing (OLAP) requirements; it builds the original detailed data in the data warehouse into multi-dimensional data sets and provides the aggregated data required by OLAP queries.
  • the read-write separation module is used to isolate the read and write workloads of the tasks submitted by the client visualization module.
  • the automatic scaling module is used to respond to resource application requests from the pre-computing engine module and to dynamically apply for and destroy cloud resources.
  • the client visualization module is accessed through a browser.
  • the pre-computing engine module adopts the OLAP modeling tool Kylin.
  • in the read-write separation module, the query cluster performs distributed query (read) operations, the build cluster performs index construction (write) operations, and the index data is stored in the corresponding object storage.
  • the automatic scaling module provides cloud server expansion and shrinking functions and two resource expansion strategies, based respectively on time and on the maximum number of waiting tasks; an expansion operation is triggered if either expansion strategy is satisfied, and a shrinking operation is triggered if the cluster is in an idle state.
  • the present invention provides a cloud orchestration method based on read-write separation and automatic scaling.
  • the method is applied to the above-mentioned cloud orchestration system based on read-write separation and automatic scaling, and includes the following steps:
  • Step 1: the client visualization module sets up the task;
  • Step 2: the pre-computing engine module analyzes the resource requirements of the task, and the read-write separation module analyzes its read-write requirements;
  • Step 3: the automatic scaling module creates or recycles resources according to the resource requirements of the task;
  • Step 4: the read-write separation module performs read or write operations according to the read-write requirements of the model task.
  • step 1, setting up the task, includes the following detailed steps:
  • Step 1.1: set the maximum number of task server nodes and the server model;
  • Step 1.2: convert the logical server nodes of Step 1.1 into server entities;
  • Step 1.3: perform model operations on the pre-computing engine module through the client visualization module;
  • Step 1.4: trigger the model-building task.
  • in step 2, when analyzing the task requirements,
  • the pre-computing engine module submits the resources required by the computing task to the automatic scaling module, and
  • the read-write separation module analyzes the read-write requirements of the task in order to perform read-write separation.
  • resources are created by invoking an API corresponding to Terraform, an orchestration tool for infrastructure automation, and resource recovery is performed according to a resource recovery strategy.
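Terraform itself is normally driven through its CLI (`terraform apply`), so "invoking an API corresponding to Terraform" is often implemented by shelling out to that CLI. The sketch below is an illustrative assumption, not the patented implementation; the variable names `node_count` and `instance_type` are hypothetical inputs that the Terraform configuration would have to declare.

```python
import subprocess

def build_apply_command(node_count: int, instance_type: str) -> list[str]:
    # Assemble the Terraform CLI invocation that provisions `node_count`
    # build nodes of the given machine model.
    return [
        "terraform", "apply", "-auto-approve",
        f"-var=node_count={node_count}",
        f"-var=instance_type={instance_type}",
    ]

def apply_resources(workdir: str, node_count: int, instance_type: str) -> None:
    # Runs Terraform in the directory holding the cluster configuration.
    # This actually provisions cloud resources, so it is kept separate
    # from the pure command-building step, which can be tested offline.
    subprocess.run(build_apply_command(node_count, instance_type),
                   cwd=workdir, check=True)
```

Resource recycling would reverse this with `terraform destroy` (or a reduced `node_count`) once the recovery strategy decides a node is idle.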
  • the model operations in step 1.3 include creating and editing a model and setting an index.
  • the present invention provides a cloud orchestration method based on read-write separation and automatic scaling, the method comprising:
  • the pre-computing engine module receives the tasks submitted by the client visualization module and analyzes the task requirements;
  • the read module of the read-write separation module reads the required information from the object storage;
  • the write module of the read-write separation module dynamically creates or destroys resources through the automatic scaling module and stores the result of the write operation.
  • the application for or destruction of resources by the write module of the read-write separation module through the automatic scaling module includes:
  • after calculating the resource requirements of the task, the pre-computing engine module submits a resource requirement application to the automatic scaling module;
  • the automatic scaling module dynamically applies for or destroys cloud resources according to the resource requirements.
  • the automatic scaling module dynamically applying for or destroying cloud resources according to the resource requirements includes:
  • a creation strategy that expands resources according to the resource expansion strategies,
  • where the expansion strategies include a time-based strategy and/or a strategy based on the maximum number of waiting tasks;
  • and resource recycling performed according to a shrinking strategy, which determines whether to trigger resource recycling by checking, through the cluster API, whether the worker node is currently in an idle state.
  • the method also includes:
  • the maximum number of task server nodes and the server model set by the client visualization module are obtained;
  • the pre-computing engine module and the automatic scaling module cooperate to convert the logical server nodes into server entities.
  • the method also includes:
  • the pre-computing engine module adopts the OLAP modeling tool Kylin.
  • the present invention provides a cloud orchestration device based on read-write separation and automatic scaling, characterized in that the device includes:
  • a task receiving unit, used by the pre-computing engine module to receive the tasks submitted by the client visualization module and to analyze the task requirements;
  • a separation unit, used to separate the read and write processing of the submitted tasks through the read-write separation module;
  • a read request processing unit, used by the read module of the read-write separation module to read the required information from the object storage when the request is a read request;
  • a write request processing unit, used by the write module of the read-write separation module to dynamically create or destroy resources through the automatic scaling module and to store the result of the write operation when the request is a write request.
  • the present invention provides a cloud orchestration system based on read-write separation and automatic scaling, characterized in that the system includes a client, a server, and a cloud:
  • the client is used to set the number of server nodes and the server model; it is also used to operate on the model and to trigger the model-building task, obtaining the task that needs to be submitted to the pre-computing engine module of the server;
  • the server is configured to execute the cloud orchestration method based on read-write separation and automatic scaling according to any one of the third aspects;
  • the cloud is used to receive resource applications from the server and provide cloud resources for the server.
  • the present invention provides a cloud orchestration system and method based on read-write separation and automatic scaling, and an automatic scaling scheme for cloud servers when running jobs on the cloud; it improves the utilization of cloud resources, reduces the total cost of ownership (TCO), and reduces manual intervention. The separation of query and computing cluster resources provides higher reliability for the horizontal expansion of query clusters, and the built data is stored in highly reliable object storage, which improves the stability of the system under high concurrency and ensures data security and reliability.
  • an efficient OLAP query execution engine with read-write separation and automatic scaling can be constructed to cope with the complex OLAP queries of various reporting systems.
  • FIG. 1 is a schematic diagram of a cloud orchestration system based on read-write separation and automatic scaling provided by the present invention
  • FIG. 2 is a schematic diagram of a cloud orchestration method based on read-write separation and automatic scaling provided by the present invention
  • FIG. 3 is a schematic flow chart of an overall solution of a specific embodiment of the present invention.
  • FIG. 4 is a schematic diagram of another cloud orchestration method based on read-write separation and automatic scaling provided by the present invention.
  • Embodiment 1 of the present invention provides a cloud orchestration system based on read-write separation and automatic scaling.
  • as shown in FIG. 1, it includes a client visualization module, a pre-computing engine module, a read-write separation module, and an automatic scaling module.
  • the client visualization module is used to visually set the number of task nodes and to submit tasks to the pre-computing engine module.
  • the pre-computing engine module is used to analyze the business's OLAP requirements on its online data.
  • OLAP technology is used to perform multi-dimensional analysis on the tasks submitted by the client-side visualization module.
  • the original detailed data in the data warehouse is built into a cube, which provides the aggregated data required by OLAP queries.
  • the read-write separation module is used to isolate the read and write workloads of the tasks submitted by the client visualization module.
  • the automatic scaling module is used to respond to resource application requests from the pre-computing engine module and to dynamically apply for and destroy cloud resources.
  • the client visualization module is accessed through a browser.
  • the pre-computing engine module adopts the OLAP modeling tool Kylin, which improves the efficiency of aggregation queries.
  • in the read-write separation module, the query cluster performs distributed query (read) operations, the build cluster performs index construction (write) operations, and the index data is stored in the corresponding object storage.
  • the read-write separation architecture makes it possible to expand query nodes arbitrarily to achieve high-performance concurrent queries without worrying about building tasks preempting cluster resources.
  • the automatic scaling module provides cloud server expansion and shrinking functions, with two resource expansion strategies: one based on time and one based on the maximum number of waiting tasks.
  • the automatic scaling module detects the current task waiting state: it compares the time a task has been waiting to be scheduled against the maximum waiting time configured in the module's configuration file, and it compares the current number of waiting tasks against the maximum number of waiting tasks configured there. If either expansion policy is satisfied, the expansion operation is triggered.
  • the auto-scaling module obtains, through an API call, whether the current worker node is in an idle state; if it is idle, a node shrinking operation is triggered.
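The two expansion checks and the idle-based shrink check can be sketched as a small decision function. This is a minimal illustration; the threshold names and the shape of the configuration are assumptions, not the patent's configuration-file format:

```python
from dataclasses import dataclass

@dataclass
class ScalingConfig:
    max_wait_seconds: float  # time-based expansion threshold (from config file)
    max_pending_tasks: int   # queue-length expansion threshold (from config file)

def should_expand(longest_wait_seconds: float, pending_tasks: int,
                  cfg: ScalingConfig) -> bool:
    # Expansion is triggered if EITHER strategy is satisfied.
    return (longest_wait_seconds > cfg.max_wait_seconds
            or pending_tasks > cfg.max_pending_tasks)

def should_shrink(worker_is_idle: bool) -> bool:
    # The idle flag would come from the cluster API; shrinking is
    # triggered only when the worker node is idle.
    return worker_is_idle
```

A scheduler loop would call these periodically and hand the resulting expand/shrink decision to the resource creation and recycling machinery.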
  • the second embodiment of the present invention provides a cloud orchestration method based on read-write separation and automatic scaling.
  • the method is applied to the above-mentioned cloud orchestration system based on read-write separation and automatic scaling. As shown in FIG. 2 , the method includes the following steps :
  • Step 1: the client visualization module sets up the task.
  • Step 1.1: access the corresponding link through the browser to enter the cluster settings page, and set the maximum number of task server nodes and the server model.
  • the server nodes at this point are a logical concept.
  • Step 1.2: the pre-computing task of the pre-computing engine module and the auto-scaling strategy of the auto-scaling module work together to convert the logical server nodes of Step 1.1 into server entities. In a specific implementation, the build tasks are generated and scheduled through the browser or by calling the build-task API; the pre-computing engine module calculates the resources required by the build tasks, and the automatic scaling module obtains those resources by calling the API, which completes the creation of the required resources.
  • Step 1.3: perform model operations on the pre-computing engine module, on the interface, through the client visualization module.
  • the model operations in Step 1.3 may include, but are not limited to, creating and editing models and setting indexes.
  • Step 1.4: trigger the model-building task.
  • Step 2: the pre-computing engine module analyzes the resource requirements of the task, and the read-write separation module analyzes its read-write requirements.
  • when step 2 analyzes the resource requirements of the task,
  • the pre-computing engine module submits the resources required by the computing task, including memory and CPU, to the automatic scaling module, and
  • the read-write separation module analyzes the read-write requirements of the task in order to perform read-write separation.
  • Step 3: the automatic scaling module creates or recycles resources according to the resource requirements of the task.
  • resources are created by calling the API corresponding to Terraform, an orchestration tool for infrastructure automation, to guarantee the running of the build tasks; resources are recycled according to the resource recycling strategy by checking whether the cluster's worker nodes are idle and, if so, triggering a shrinking operation to avoid wasting resources.
  • Step 4: the read-write separation module performs read or write operations according to the read-write requirements of the model task.
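Steps 2-4 above can be tied together in a small driver. The module interfaces used here (`analyze`, `reconcile`, `execute`) are hypothetical names chosen for illustration; the patent does not prescribe an API:

```python
def run_job(task, engine, autoscaler, rw_separator):
    """Orchestrate one submitted task through the three server-side modules."""
    # Step 2: the pre-computing engine derives the resource needs and the
    # read-write needs of the task.
    requirements = engine.analyze(task)
    # Step 3: the automatic scaling module creates or recycles resources
    # to match those needs.
    autoscaler.reconcile(requirements["resources"])
    # Step 4: the read-write separation module performs the actual I/O.
    return rw_separator.execute(requirements["io"])
```

Each module can then evolve independently (e.g. swapping the scaling strategy) as long as it honors this hand-off order.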
  • a specific embodiment of the present invention provides a cloud orchestration system based on read-write separation and automatic scaling.
  • the overall solution process is shown in Figure 3.
  • a read-write request is submitted from the client visualization module to the pre-computing engine module; it may be a read request, a write request, or a read request combined with a write request.
  • the pre-computing engine module analyzes the task requirements, separates the read requests from the write requests, and sends them to the read and write modules of the read-write separation module, respectively, for read-write separation processing.
  • the read operation reads the required information from the object storage; the write operation goes through the automatic scaling module, which creates resources on demand and recycles them according to the resource recycling strategy so that resources are used rationally, and the information that needs to be written is stored in the object storage.
  • the embodiment of the present application provides a cloud orchestration method based on read-write separation and automatic scaling, as shown in FIG. 4 , the method includes the following steps:
  • the server corresponds to the visualization client:
  • the server can be understood as the backend of the management cluster, and
  • the visualization client can be understood as its front end.
  • the server includes a pre-computing engine module, a read-write separation module, and an automatic scaling module.
  • the precomputing engine module receives the task submitted by the client visualization module, and analyzes the task requirements.
  • the user can set the maximum number of task server nodes and the server model through the client visualization module (visualization client); specifically, the corresponding link can be accessed through a browser to enter the visualization client and perform the settings.
  • this application involves the setting of clusters; that is, in practical applications the visualization client provides settings pages for the clusters.
  • the server nodes set here are still a logical concept, and the pre-computing engine module and the automatic scaling module need to cooperate to convert them into server entities. Then the model operations are performed in the visualization client, and the model-building task is triggered to obtain the task that needs to be submitted to the pre-computing engine module of the server. It should be noted that model operations include, but are not limited to, creating and editing models and setting indexes.
  • the task is submitted to the pre-computing engine module, so the pre-computing engine module can receive the task submitted by the client-side visualization module (visualization client).
  • after the pre-computing engine module receives the submitted task, it analyzes the requirements of the task: whether the submitted task is a read request, a write request, or a combined read-write request, and what resources the task requires.
  • the required resources include memory, CPU, and so on.
  • the specific implementation is: if it is a read request, it is assigned to the read module of the read-write separation module for processing; if it is a write request, it is assigned to the write module of the read-write separation module for processing.
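The read/write assignment described in this implementation amounts to a simple dispatcher. The request representation and the module callables below are illustrative assumptions:

```python
from enum import Enum, auto

class RequestType(Enum):
    READ = auto()
    WRITE = auto()

def dispatch(request_type: RequestType, payload, read_module, write_module):
    # Read requests go to the read module (distributed query against
    # object storage); write requests go to the write module (index
    # build, possibly scaling the build cluster first).
    if request_type is RequestType.READ:
        return read_module(payload)
    return write_module(payload)
```

A combined read-write request would simply be split by the engine into one READ and one WRITE dispatch.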
  • the read module of the read-write separation module reads the required information from the object storage.
  • the index data is stored in the corresponding object storage rather than on the local disks of the corresponding machines, so for a read request the required information needs to be read from the object storage.
  • storing index data in object storage ensures data security, redundancy, and unlimited capacity scalability; therefore, when computing is not required, the cluster can even be safely stopped without any risk of data loss.
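The property that index data lives in object storage rather than on node disks — so build nodes can be destroyed and query nodes can still read the segments — can be illustrated with a minimal in-memory stand-in for an S3-style store; the key layout here is an assumption, not the engine's actual layout:

```python
class ObjectStore:
    """In-memory stand-in for cloud object storage (e.g. an S3-style service)."""

    def __init__(self):
        self._objects: dict[str, bytes] = {}

    def put(self, key: str, data: bytes) -> None:
        self._objects[key] = data

    def get(self, key: str) -> bytes:
        return self._objects[key]

def segment_key(model: str, segment_id: str) -> str:
    # Hypothetical key layout for one built index segment.
    return f"{model}/segments/{segment_id}/index"

# A build node writes its result to the store; because the data is not on
# the node's local disk, the node can be destroyed afterwards and a query
# node can still read the segment back by key.
```

In a real deployment the `ObjectStore` role would be played by an S3-compatible service, with the same put/get contract.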
  • the write module of the read-write separation module dynamically creates or destroys resources through the automatic scaling module and stores the result of the write operation.
  • when the write module of the read-write separation module dynamically creates or destroys resources through the automatic scaling module,
  • the pre-computing engine module submits a resource application to the automatic scaling module, and
  • the automatic scaling module dynamically applies for or destroys cloud resources according to the resource requirements.
  • "the automatic scaling module dynamically applies for or destroys cloud resources according to the resource requirements" includes: creating resources by calling the API corresponding to Terraform, an orchestration tool for infrastructure automation, and expanding resources according to the resource expansion strategies, which include a time-based strategy and/or a strategy based on the maximum number of waiting tasks; and recycling resources according to the resource shrinking strategy, which determines whether to trigger resource recycling by checking, through the cluster API, whether the worker node is in an idle state.
  • the creation of resources by calling the API corresponding to Terraform is to ensure the operation of tasks.
  • the creation of resources corresponds to applying for resources from the cloud.
  • Resource recycling is performed according to the resource recycling strategy, so as to avoid waste of resources.
  • the result of the write operation is stored by writing it to the object storage.
  • based on read-write separation and automatic resource scaling on the cloud, the cloud orchestration method of the embodiments of the present application can complete job execution with a reasonable amount of resources and effectively reduce the total cost of ownership (TCO).
  • the separation of query (read) and computing (write) cluster resources provides higher reliability for the horizontal expansion of the query cluster, and can ensure the stability of the system when dealing with high concurrency.
  • the pre-computing engine module in the above embodiment adopts the OLAP modeling tool Kylin.
  • the original detailed data in the data warehouse is built into cubes, and the aggregated data required for OLAP queries is provided.
  • the purpose of using kylin is to improve the query efficiency of aggregated queries.
  • An embodiment of the present application provides a cloud orchestration device based on read-write separation and automatic scaling, and the device includes:
  • a task receiving unit, used by the pre-computing engine module to receive the tasks submitted by the client visualization module and to analyze the task requirements;
  • a separation unit, used to separate the read and write processing of the submitted tasks through the read-write separation module;
  • a read request processing unit, used by the read module of the read-write separation module to read the required information from the object storage when the request is a read request;
  • a write request processing unit, used by the write module of the read-write separation module to dynamically create or destroy resources through the automatic scaling module and to store the result of the write operation when the request is a write request.
  • the cloud orchestration device based on read-write separation and automatic scaling, using read-write separation and automatic resource scaling in cloud orchestration, can complete job execution with a reasonable amount of resources according to the elasticity of the job and effectively reduce the total cost of ownership (TCO).
  • the separation of query (read) and computing (write) cluster resources provides higher reliability for the horizontal expansion of the query cluster, and can ensure the stability of the system when dealing with high concurrency.
  • the embodiment of the present application provides a cloud orchestration system based on read-write separation and automatic scaling, and the system includes a client, a server, and a cloud:
  • the client is used to set the number of server nodes and the model of the server; it is also used to operate the model and trigger the task of building the model to obtain the task of the pre-computing engine module that needs to be submitted to the server;
  • the server is used to execute the cloud orchestration method based on read-write separation and automatic scaling described in the fourth embodiment
  • the cloud is used to receive resource applications from the server and provide cloud resources for the server.
  • the cloud orchestration system based on read-write separation and automatic scaling, using read-write separation and automatic resource scaling in cloud orchestration, can complete job execution with a reasonable amount of resources according to the elasticity of the job and effectively reduce the total cost of ownership (TCO).
  • the separation of query (read) and computing (write) cluster resources provides higher reliability for the horizontal expansion of the query cluster, and can ensure the stability of the system when dealing with high concurrency.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Fuzzy Systems (AREA)
  • Mathematical Physics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Computational Linguistics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The present invention provides a cloud orchestration system and method based on read-write separation and automatic scaling. Read operations and write operations are separated, the cluster can be scaled out and in, and all index data is stored in the corresponding object storage. The pre-computing module submits the resources required by a computing task (memory, CPU) to the automatic scaling module, which creates resources by calling the corresponding API of Terraform, the infrastructure-automation orchestration tool, to guarantee the execution of build tasks. The automatic scaling module provides two resource expansion policies, based respectively on time and on the maximum number of pending tasks. The present invention improves the utilization of cloud resources, reduces cost and manual intervention, provides higher reliability for the horizontal expansion of the query cluster, improves system stability under high concurrency, and guarantees data security and unlimited scalability.

Description

Cloud orchestration system and method based on read-write separation and automatic scaling
Cross-reference to related applications
This application claims priority to Chinese patent application No. 202011492220.2, filed with the China National Intellectual Property Administration on December 16, 2020 and entitled "Cloud orchestration system and method based on read-write separation and automatic scaling", the entire contents of which are incorporated herein by reference.
Technical field
The present invention relates to the field of cloud orchestration technologies, and in particular to a cloud orchestration system and method based on read-write separation and automatic scaling.
Background
At present, reducing cloud costs and responding quickly to business demands have long been focal issues for cloud services. Most cloud services currently support node scaling to run corresponding tasks, but usually require operators to adjust the number of nodes manually, which often wastes resources and increases the TCO (total cost of ownership) as well as operation and maintenance costs.
Most cloud services in the industry expose APIs to Terraform (a cloud orchestration tool), through which a cloud service cluster can be deployed flexibly and quickly; this is the common approach in the industry.
Current cloud orchestration tools provide the ability to create and manage cloud resources, but cannot automatically and flexibly configure resources according to the current workload of jobs on the cloud. Based on the SDKs provided by cloud services, resources likewise cannot be adjusted automatically and flexibly according to cloud jobs, which leads to an increase in TCO.
Traditional computing engines supported on the cloud include Impala, Greenplum, and the like. Under large data volumes, the query performance and concurrency of these traditional MPP engines degrade severely. Apart from the fact that MPP engines compute in real time, the main reason is that if one node of an MPP cluster always executes more slowly than the other nodes in the cluster, the performance of the whole cluster is limited by the execution speed of this faulty node (the so-called short-plank effect of the wooden barrel); no matter how many nodes the cluster has, performance will not improve.
Due to the defects of existing cloud orchestration technologies and the MPP architecture, it is difficult to respond quickly to the workload of jobs on the cloud.
Summary of the invention
In view of this, the present disclosure provides a cloud orchestration system and method based on read-write separation and automatic scaling, which can complete job execution with reasonable resources according to the elasticity of jobs and effectively reduce the TCO (total cost of ownership). The technical solution is as follows:
In a first aspect, the present invention provides a cloud orchestration system based on read-write separation and automatic scaling, including a client visualization module, a pre-computing engine module, a read-write separation module, and an automatic scaling module. The client visualization module is used to visually set the number of task nodes and submit tasks to the pre-computing engine module. The pre-computing engine module is used to perform multidimensional analysis on the tasks submitted by the client visualization module using OLAP technology according to the online analytical processing (OLAP) requirements of the business, build the raw detail data in the data warehouse into multidimensional data sets, and provide the aggregated data required by OLAP queries. The read-write separation module is used to isolate the read and write workloads of the tasks submitted by the client visualization module. The automatic scaling module is used to respond to resource application requests from the pre-computing engine module and to dynamically apply to the cloud for resources and destroy resources.
Further, the client visualization module is accessed through a browser.
Further, the pre-computing engine module adopts the OLAP modeling tool Kylin.
Further, in the read-write separation module, the query cluster performs distributed query (read) operations, the build cluster performs index-building (write) operations, and index data is stored in the corresponding object storage.
Further, the automatic scaling module includes cloud server expansion and shrinking functions and provides two resource expansion policies, based respectively on time and on the maximum number of pending tasks; satisfying either expansion policy triggers an expansion operation. If the cluster is in an idle state, a shrinking operation is triggered.
In a second aspect, the present invention provides a cloud orchestration method based on read-write separation and automatic scaling, applied in the above cloud orchestration system based on read-write separation and automatic scaling, including the following steps:
Step 1: the client visualization module sets a task;
Step 2: the pre-computing engine module analyzes the resource requirements of the task, and the read-write separation module analyzes the read-write requirements of the task;
Step 3: the automatic scaling module creates or reclaims resources according to the resource requirements of the task;
Step 4: the read-write separation module performs read or write operations according to the read-write requirements of the model task.
Further, setting a task in step 1 includes the following detailed steps:
Step 1.1: set the maximum number of task server nodes and the server model;
Step 1.2: convert the logical server-node concept of step 1.1 into server entities;
Step 1.3: perform model operations on the pre-computing engine module through the client visualization module;
Step 1.4: trigger the model building task.
Further, in the task requirement analysis of step 2, the pre-computing engine module submits the resources required by the computing task to the automatic scaling module, and the read-write separation module analyzes the read-write requirements of the task to separate reads from writes.
Further, in step 3, resources are created by calling the corresponding API of Terraform, the infrastructure-automation orchestration tool, and resources are reclaimed according to the resource reclamation policy.
Further, the model operations of step 1.3 include creating and editing models and setting indexes.
In a third aspect, the present invention provides a cloud orchestration method based on read-write separation and automatic scaling, the method including:
the pre-computing engine module receives the tasks submitted by the client visualization module and analyzes the task requirements;
the submitted tasks are processed by the read-write separation module for read-write separation;
if the request is a read request, the read module of the read-write separation module reads the required information from object storage;
if the request is a write request, the write module of the read-write separation module dynamically creates or destroys resources through the automatic scaling module, and stores the result of the write operation.
Further, the write module of the read-write separation module applying for or destroying resources through the automatic scaling module includes:
after computing the resource requirements of the task, the pre-computing engine module submits a resource application to the automatic scaling module;
the automatic scaling module dynamically applies to the cloud for resources or destroys resources according to the resource requirements.
Further, the automatic scaling module dynamically applying to the cloud for resources or destroying resources according to the resource requirements includes:
creating resources by calling the corresponding API of Terraform, the infrastructure-automation orchestration tool, where the creation policy includes expanding resources according to a resource expansion policy, the expansion policy including a time-based policy and/or a policy based on the maximum number of pending tasks;
reclaiming resources according to a shrinking policy, where the shrinking policy determines, via the cluster API, whether the current worker nodes are idle to decide whether to trigger resource reclamation.
Further, the method also includes:
obtaining the maximum number of task server nodes and the server model set by the client visualization module, where the pre-computing engine module and the automatic scaling module cooperate to convert the logical server-node concept into server entities.
Further, all index data is stored in object storage.
Further, the method also includes:
the pre-computing engine module adopts the OLAP modeling tool Kylin.
In a fourth aspect, the present invention provides a cloud orchestration device based on read-write separation and automatic scaling, the device including:
a task receiving unit, used for the pre-computing engine module to receive the tasks submitted by the client visualization module and analyze the task requirements;
a separation unit, used to process the submitted tasks through the read-write separation module for read-write separation;
a read request processing unit, used for: if the request is a read request, the read module of the read-write separation module reads the required information from object storage;
a write request processing unit, used for: if the request is a write request, the write module of the read-write separation module dynamically creates or destroys resources through the automatic scaling module, and stores the result of the write operation.
In a fifth aspect, the present invention provides a cloud orchestration system based on read-write separation and automatic scaling, the system including a client, a server, and a cloud:
the client is used to set the number of server nodes and the server model; it is also used to operate models and trigger model building tasks, obtaining the tasks to be submitted to the pre-computing engine module of the server;
the server is used to execute the cloud orchestration method based on read-write separation and automatic scaling described in any one of the third aspect above;
the cloud is used to receive resource applications from the server and provide cloud resources for the server.
The present invention provides a cloud orchestration system and method based on read-write separation and automatic scaling, offering an automatic cloud-server scaling solution for running jobs on the cloud, which improves the utilization of cloud resources, reduces the total cost of ownership (TCO), and reduces manual intervention. The separation of query and computing cluster resources provides higher reliability for the horizontal expansion of the query cluster; the built data is stored in highly reliable object storage, which improves system stability under high concurrency and guarantees data security and unlimited scalability. Based on the present invention, an efficient OLAP query execution engine with read-write separation and automatic scaling can be built to handle the complex OLAP queries of various reporting systems.
Brief description of the drawings
In order to explain the specific embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings required in the description of the specific embodiments or the prior art are briefly introduced below. Obviously, the drawings described below are only some embodiments of the present invention; for those of ordinary skill in the art, other drawings can be obtained from these drawings without creative effort.
Fig. 1 is a schematic diagram of a cloud orchestration system based on read-write separation and automatic scaling provided by the present invention;
Fig. 2 is a schematic diagram of a cloud orchestration method based on read-write separation and automatic scaling provided by the present invention;
Fig. 3 is a schematic flowchart of the overall solution of a specific embodiment of the present invention;
Fig. 4 is a schematic diagram of another cloud orchestration method based on read-write separation and automatic scaling provided by the present invention.
Detailed description of the embodiments
In order to enable those skilled in the art to better understand the solution of the present application, the technical solutions in the embodiments of the present application will be described clearly and completely below with reference to the drawings in the embodiments of the present application. Obviously, the described embodiments are only a part of the embodiments of the present application, not all of them. Based on the embodiments in the present application, all other embodiments obtained by those of ordinary skill in the art without creative effort shall fall within the protection scope of the present application.
It should be noted that the terms "first", "second", etc. in the specification, claims, and the above drawings of the present application are used to distinguish similar objects and are not necessarily used to describe a specific order or sequence. It should be understood that data used in this way are interchangeable where appropriate, so that the embodiments of the present application described herein can be implemented. In addition, the terms "include" and "have" and any variations thereof are intended to cover non-exclusive inclusion; for example, a process, method, system, product, or device that includes a series of steps or units is not necessarily limited to those steps or units clearly listed, but may include other steps or units not clearly listed or inherent to the process, method, product, or device.
In the present application, orientation or position relationships indicated by terms such as "upper", "lower", "left", "right", "front", "rear", "top", "bottom", "inner", "outer", "middle", "vertical", "horizontal", "transverse", and "longitudinal" are based on the orientation or position relationships shown in the drawings. These terms are mainly used to better describe the present application and its embodiments and are not intended to limit the indicated devices, elements, or components to a specific orientation, or to be constructed and operated in a specific orientation.
In addition to indicating orientation or position relationships, some of the above terms may also be used to express other meanings; for example, the term "upper" may in some cases also express a certain dependency or connection relationship. For those of ordinary skill in the art, the specific meanings of these terms in the present application can be understood according to the specific circumstances.
In addition, the term "multiple" shall mean two or more.
It should be noted that the embodiments in the present application and the features in the embodiments may be combined with each other without conflict. The present application will be described in detail below with reference to the drawings and embodiments.
The embodiments of the present disclosure are described below through specific examples, and those skilled in the art can easily understand other advantages and effects of the present disclosure from the contents disclosed in this specification. Obviously, the described embodiments are only a part of the embodiments of the present disclosure, not all of them. The present disclosure can also be implemented or applied through other different specific embodiments, and the details in this specification can also be modified or changed in various ways based on different viewpoints and applications without departing from the spirit of the present disclosure. It should be noted that the following embodiments and the features in the embodiments may be combined with each other without conflict. Based on the embodiments in the present disclosure, all other embodiments obtained by those of ordinary skill in the art without creative effort fall within the protection scope of the present disclosure.
Embodiment 1
Embodiment 1 of the present invention provides a cloud orchestration system based on read-write separation and automatic scaling. As shown in Fig. 1, it includes a client visualization module, a pre-computing engine module, a read-write separation module, and an automatic scaling module. The client visualization module is used to visually set the number of task nodes and submit tasks to the pre-computing engine module. The pre-computing engine module is used to perform multidimensional analysis on the tasks submitted by the client visualization module using OLAP technology according to the online analytical processing (OLAP) requirements of the business, build the raw detail data in the data warehouse into multidimensional data sets, and provide the aggregated data required by OLAP queries. The read-write separation module is used to isolate the read and write workloads of the tasks submitted by the client visualization module. The automatic scaling module is used to respond to resource application requests from the pre-computing engine module and to dynamically apply to the cloud for resources and destroy resources.
The client visualization module is accessed through a browser.
The pre-computing engine module adopts the OLAP modeling tool Kylin, which improves the efficiency of aggregate queries.
In the read-write separation module, the query cluster performs distributed query (read) operations, the build cluster performs index-building (write) operations, and index data is stored in the corresponding object storage.
Through read-write separation, the read and write workloads are completely isolated, and the cluster can be conveniently scaled out and in. All index data is stored in the corresponding object storage rather than on the local disks of the machines, which guarantees data security, redundancy, and unlimited capacity scalability; therefore, when no computation is needed, the cluster can even be safely stopped without worrying about losing data. The read-write-separated architecture makes it possible to expand query nodes arbitrarily to achieve high-performance concurrent queries, without worrying about build tasks preempting cluster resources.
The automatic scaling module includes cloud server expansion and shrinking functions and provides two resource expansion policies, based respectively on time and on the maximum number of pending tasks. The automatic scaling module detects the current task waiting state, checks whether the time a task has waited for scheduling exceeds the maximum waiting time configured in the module configuration file, and also checks whether the current number of pending tasks is greater than the maximum number of pending tasks configured in the configuration file; satisfying either expansion policy triggers an expansion operation. The automatic scaling module obtains the current worker nodes through an API call to determine whether they are idle; if they are in an idle state, a node shrinking operation is triggered.
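The two expansion triggers and the idle-based shrink check described above can be sketched as follows. This is a minimal illustration, not the patent's actual implementation; the configuration keys `max_wait_seconds` and `max_pending_tasks` and the function names are assumptions.

```python
# Hypothetical sketch of the autoscaler's decision logic: expansion is
# triggered when EITHER policy is satisfied (a task has waited too long,
# or too many tasks are pending); shrinking is triggered when all
# current worker nodes report an idle state.

def should_expand(longest_wait_s: float, pending_tasks: int, config: dict) -> bool:
    """Return True if either expansion policy from the config file is met."""
    time_policy = longest_wait_s > config["max_wait_seconds"]
    queue_policy = pending_tasks > config["max_pending_tasks"]
    return time_policy or queue_policy

def should_shrink(worker_states: list) -> bool:
    """Shrink only when every current worker node is idle."""
    return bool(worker_states) and all(s == "idle" for s in worker_states)

config = {"max_wait_seconds": 300, "max_pending_tasks": 10}
assert should_expand(400, 2, config)        # time policy exceeded
assert should_expand(10, 11, config)        # pending-task policy exceeded
assert not should_expand(10, 2, config)     # neither policy met
assert should_shrink(["idle", "idle"])      # all workers idle -> scale in
assert not should_shrink(["busy", "idle"])  # still working -> keep nodes
```

Note that the two expansion policies are combined with a logical OR, matching the text's "satisfying either expansion policy triggers an expansion operation".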
Embodiment 2
Embodiment 2 of the present invention provides a cloud orchestration method based on read-write separation and automatic scaling, applied in the above cloud orchestration system based on read-write separation and automatic scaling. As shown in Fig. 2, it includes the following steps:
Step 1: the client visualization module sets a task;
This includes the following detailed steps:
Step 1.1: access the corresponding link through a browser to enter the cluster settings page, and set the maximum number of task server nodes and the server model; at this point the server node is a logical concept;
Step 1.2: the pre-computing tasks of the pre-computing engine module and the automatic scaling policy of the automatic scaling module cooperate to convert the logical server-node concept of step 1.1 into server entities. In specific implementations, build tasks are generated and scheduled through the browser or by calling the build-task API; the pre-computing engine module computes the resources required by the build task, the automatic scaling module obtains the resources required by the pre-computing engine module, and, when the automatic scaling expansion policy is satisfied, the required resources are created by calling the Terraform API;
Step 1.3: perform model operations on the pre-computing engine module through the interface of the client visualization module; in specific implementations, the model operations of step 1.3 may include but are not limited to creating and editing models and setting indexes;
Step 1.4: trigger the model building task.
Step 2: the pre-computing engine module analyzes the resource requirements of the task, and the read-write separation module analyzes the read-write requirements of the task;
In specific implementations, in the resource requirement analysis of step 2, the pre-computing engine module submits the resources required by the computing task, including memory and CPU, to the automatic scaling module, and the read-write separation module analyzes the read-write requirements of the task to separate reads from writes.
Step 3: the automatic scaling module creates or reclaims resources according to the resource requirements of the task;
In specific implementations, resources are created by calling the corresponding API of Terraform, the infrastructure-automation orchestration tool, to guarantee the execution of the build task; resources are reclaimed according to the resource reclamation policy by checking whether the cluster worker nodes are idle, and if so, a shrinking operation is triggered to avoid wasting resources.
Step 4: the read-write separation module performs read or write operations according to the read-write requirements of the model task.
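As a rough illustration of step 3, driving Terraform usually amounts to composing an invocation with the desired node count passed as a variable. The sketch below only assembles the command lines rather than executing them; the variable name `worker_count` and the use of the Terraform CLI (rather than a provider-specific API) are assumptions for illustration.

```python
# Hypothetical sketch: build the Terraform commands an autoscaler might run
# to expand the cluster to `target_nodes`, or to reclaim resources once the
# reclamation policy fires. In a real deployment these argument lists would
# be passed to subprocess.run() inside the Terraform working directory that
# holds the cluster's .tf configuration.

def terraform_apply_cmd(target_nodes: int) -> list:
    """Command to create/resize cloud resources for the build cluster."""
    return [
        "terraform", "apply", "-auto-approve",
        f"-var=worker_count={target_nodes}",
    ]

def terraform_destroy_cmd() -> list:
    """Command to reclaim resources when worker nodes are idle."""
    return ["terraform", "destroy", "-auto-approve"]

assert terraform_apply_cmd(5) == [
    "terraform", "apply", "-auto-approve", "-var=worker_count=5",
]
assert terraform_destroy_cmd()[:2] == ["terraform", "destroy"]
```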
Embodiment 3
A specific embodiment of the present invention provides a cloud orchestration system based on read-write separation and automatic scaling; the overall flow is shown in Fig. 3. The client visualization module submits read-write requests to the pre-computing engine module; a request may be a read request, a write request, or a combined read and write request. The pre-computing engine module analyzes the task requirements, separates the read requests from the write requests, and sends them respectively to the read module and the write module of the read-write separation module for separated processing. Read operations read the required information from object storage; write operations go through the automatic scaling module, which creates resources on demand and reclaims resources according to the resource reclamation policy so that resources are used reasonably, and the information to be written is stored in object storage.
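The dispatch flow of Fig. 3 — splitting submitted requests into reads and writes and routing them to the read and write modules — can be sketched as follows. This uses an in-memory dictionary as a stand-in for object storage, and all names are illustrative assumptions rather than the patent's implementation.

```python
# Hypothetical sketch of read-write separation dispatch: reads are served
# from object storage, writes store the result of the operation back into it.

object_storage = {}  # stand-in for a cloud object store holding index data

def handle(request: dict):
    if request["type"] == "read":
        # read module: fetch the required information from object storage
        return object_storage.get(request["key"])
    elif request["type"] == "write":
        # write module: in the full system, resources would be created on
        # demand via the automatic scaling module before the build runs;
        # the result of the write operation is then stored in object storage
        object_storage[request["key"]] = request["value"]
        return "stored"
    raise ValueError("unknown request type")

assert handle({"type": "write", "key": "index/cube1", "value": b"segment"}) == "stored"
assert handle({"type": "read", "key": "index/cube1"}) == b"segment"
assert handle({"type": "read", "key": "missing"}) is None
```

Because the two paths touch disjoint resources (the query cluster never runs builds), the read path can be scaled horizontally without being affected by write traffic.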
Embodiment 4
This embodiment of the present application provides a cloud orchestration method based on read-write separation and automatic scaling. As shown in Fig. 4, the method includes the following steps:
First, it should be noted that the method is applied on the server side, which corresponds to the visualization client: the server side can be understood as the back end of cluster management, and the visualization client as its front end. The server side includes the pre-computing engine module, the read-write separation module, and the automatic scaling module.
S401. The pre-computing engine module receives the tasks submitted by the client visualization module and analyzes the task requirements.
Before this step, the user can set the maximum number of task server nodes and the server model through the client visualization module (visualization client). Specifically, the visualization client can be accessed through the corresponding link in a browser to perform the settings. In addition, it should be noted that the present application concerns cluster settings; that is, in practical applications the visualization client can be a cluster settings page.
The server node set here is still a logical concept; the pre-computing engine module and the automatic scaling module need to cooperate to convert the logical server-node concept into server entities. Then model operations are performed in the visualization client and the model building task is triggered, obtaining the task to be submitted to the pre-computing engine module of the server side. It should be noted that model operations include but are not limited to creating and editing models and setting indexes.
After the task is obtained, it is submitted to the pre-computing engine module, so the pre-computing engine module can receive the task submitted by the client visualization module (visualization client).
After receiving the submitted task, the pre-computing engine module analyzes its requirements, including analyzing whether the submitted task is a read request, a write request, or both, and analyzing the resources required by the task, such as memory and CPU.
S402. The submitted task is processed by the read-write separation module for read-write separation.
The specific implementation is: if it is a read request, it is dispatched to the read module of the read-write separation module for processing; if it is a write request, it is dispatched to the write module of the read-write separation module for processing.
S403. If it is a read request, the read module of the read-write separation module reads the required information from object storage.
It should be noted that in the present application, index data is stored in the corresponding object storage rather than on the local disks of the machines, so for a read request the required information needs to be read from object storage. Storing index data in object storage guarantees data security, redundancy, and unlimited capacity scalability; therefore, when no computation is needed, the cluster can even be safely stopped without worrying about losing data.
S404. If it is a write request, the write module of the read-write separation module dynamically creates or destroys resources through the automatic scaling module, and stores the result of the write operation.
The specific implementation of "the write module of the read-write separation module dynamically creates or destroys resources through the automatic scaling module" can be: after computing the resource requirements of the task, the pre-computing engine module submits a resource application to the automatic scaling module; the automatic scaling module dynamically applies to the cloud for resources or destroys resources according to the resource requirements.
Here, "the automatic scaling module dynamically applies to the cloud for resources or destroys resources according to the resource requirements" includes: creating resources by calling the corresponding API of Terraform, the infrastructure-automation orchestration tool, and expanding resources according to a resource expansion policy, where the resource expansion policy includes a time-based expansion policy and/or a policy based on the maximum number of pending tasks; and reclaiming resources according to a resource shrinking policy, where the resource shrinking policy determines, via the cluster API, whether the worker nodes are idle to decide whether to trigger resource reclamation.
It should be noted that creating resources by calling the corresponding API of Terraform, the infrastructure-automation orchestration tool, is meant to guarantee the execution of tasks. Creating resources corresponds to applying to the cloud for resources.
Resources are reclaimed according to the resource reclamation policy, which avoids wasting resources.
The result of the write operation is specifically written into object storage.
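Under the policies of S404, a single write request drives a create → build → reclaim cycle, so compute nodes exist only while the build runs while the result persists in object storage. The toy simulation below illustrates this lifecycle; the class, node counts, and function names are invented for illustration and are not the patent's implementation.

```python
# Hypothetical sketch of the resource lifecycle around one write request:
# nodes are created for the build, the write result lands in object
# storage, and idle nodes are reclaimed so nothing is left running.

class ToyAutoscaler:
    def __init__(self):
        self.active_nodes = 0

    def expand(self, n: int):
        # corresponds to the Terraform-driven resource creation step
        self.active_nodes += n

    def reclaim_if_idle(self, busy: bool):
        # corresponds to the shrinking policy: reclaim idle workers
        if not busy:
            self.active_nodes = 0

object_storage = {}

def process_write(scaler: ToyAutoscaler, key: str, payload: bytes, nodes_needed: int):
    scaler.expand(nodes_needed)         # create resources on demand
    object_storage[key] = payload       # store the write-operation result
    scaler.reclaim_if_idle(busy=False)  # build finished -> reclaim nodes

scaler = ToyAutoscaler()
process_write(scaler, "index/model_a", b"built-index", nodes_needed=3)
assert object_storage["index/model_a"] == b"built-index"
assert scaler.active_nodes == 0  # nothing left running after the job
```

The point of the cycle is the TCO claim: because the build data survives in object storage, the compute nodes carry no state and can be destroyed the moment they go idle.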
As can be seen from the above description, the cloud orchestration method based on read-write separation and automatic scaling of this embodiment of the present application, based on read-write separation in cloud orchestration and automatic scaling of resources, can complete job execution with reasonable resources according to the elasticity of jobs and effectively reduce the total cost of ownership (TCO). Moreover, the separation of query (read) and computing (write) cluster resources provides higher reliability for the horizontal expansion of the query cluster and ensures system stability under high concurrency.
Further, it should be added that the pre-computing engine module in the above embodiments adopts the OLAP modeling tool Kylin. Specifically, according to the OLAP analysis requirements of the business (which dimensions and measures need to be analyzed), the raw detail data in the data warehouse is built into cubes to provide the aggregated data required by OLAP queries; Kylin is used to improve the efficiency of aggregate queries.
Embodiment 5
This embodiment of the present application provides a cloud orchestration device based on read-write separation and automatic scaling, the device including:
a task receiving unit, used for the pre-computing engine module to receive the tasks submitted by the client visualization module and analyze the task requirements;
a separation unit, used to process the submitted tasks through the read-write separation module for read-write separation;
a read request processing unit, used for: if the request is a read request, the read module of the read-write separation module reads the required information from object storage;
a write request processing unit, used for: if the request is a write request, the write module of the read-write separation module dynamically creates or destroys resources through the automatic scaling module, and stores the result of the write operation.
For the specific implementation of each unit module in this embodiment, refer to the corresponding descriptions in the foregoing embodiments, which are not repeated here.
The cloud orchestration device based on read-write separation and automatic scaling of this embodiment of the present application, based on read-write separation in cloud orchestration and automatic scaling of resources, can complete job execution with reasonable resources according to the elasticity of jobs and effectively reduce the total cost of ownership (TCO). Moreover, the separation of query (read) and computing (write) cluster resources provides higher reliability for the horizontal expansion of the query cluster and ensures system stability under high concurrency.
Embodiment 6
This embodiment of the present application provides a cloud orchestration system based on read-write separation and automatic scaling, the system including a client, a server, and a cloud:
the client is used to set the number of server nodes and the server model; it is also used to operate models and trigger model building tasks, obtaining the tasks to be submitted to the pre-computing engine module of the server;
the server is used to execute the cloud orchestration method based on read-write separation and automatic scaling described in Embodiment 4 above;
the cloud is used to receive resource applications from the server and provide cloud resources for the server.
For the specific implementation of each module in this embodiment, refer to the corresponding descriptions in the foregoing embodiments, which are not repeated here.
The cloud orchestration system based on read-write separation and automatic scaling of this embodiment of the present application, based on read-write separation in cloud orchestration and automatic scaling of resources, can complete job execution with reasonable resources according to the elasticity of jobs and effectively reduce the total cost of ownership (TCO). Moreover, the separation of query (read) and computing (write) cluster resources provides higher reliability for the horizontal expansion of the query cluster and ensures system stability under high concurrency.
The above are only preferred embodiments of the present application and are not intended to limit the present application; for those skilled in the art, the present application may have various modifications and changes. Any modification, equivalent replacement, improvement, etc. made within the spirit and principles of the present application shall be included in the protection scope of the present application.
Although the embodiments of the present invention have been described with reference to the drawings, those skilled in the art can make various modifications and variations without departing from the spirit and scope of the present invention, and such modifications and variations all fall within the scope defined by the appended claims.

Claims (18)

  1. A cloud orchestration system based on read-write separation and automatic scaling, characterized by including a client visualization module, a pre-computing engine module, a read-write separation module, and an automatic scaling module, wherein the client visualization module is used to visually set the number of task nodes and submit tasks to the pre-computing engine module; the pre-computing engine module performs multidimensional analysis on the tasks submitted by the client visualization module using OLAP technology, builds the raw detail data in the data warehouse into multidimensional data sets, and provides the aggregated data required by OLAP queries; the read-write separation module is used to isolate the read and write workloads of the tasks submitted by the client visualization module; and the automatic scaling module is used to respond to resource application requests from the pre-computing engine module and to dynamically apply to the cloud for resources and destroy resources.
  2. The cloud orchestration system based on read-write separation and automatic scaling according to claim 1, characterized in that the client visualization module is accessed through a browser.
  3. The cloud orchestration system based on read-write separation and automatic scaling according to claim 1, characterized in that the pre-computing engine module adopts the OLAP modeling tool Kylin.
  4. The cloud orchestration system based on read-write separation and automatic scaling according to claim 1, characterized in that, in the read-write separation module, the query cluster performs distributed query (read) operations, the build cluster performs index-building (write) operations, and index data is stored in the corresponding object storage.
  5. The cloud orchestration system based on read-write separation and automatic scaling according to claim 1, characterized in that the automatic scaling module includes cloud server expansion and shrinking functions and provides two resource expansion policies, based respectively on time and on the maximum number of pending tasks; satisfying either expansion policy triggers an expansion operation; if the cluster is in an idle state, a shrinking operation is triggered.
  6. A cloud orchestration method based on read-write separation and automatic scaling, applied in the cloud orchestration system based on read-write separation and automatic scaling according to any one of claims 1-5, characterized by including the following steps:
    Step 1: the client visualization module sets a task;
    Step 2: the pre-computing engine module analyzes the resource requirements of the task, and the read-write separation module analyzes the read-write requirements of the task;
    Step 3: the automatic scaling module creates or reclaims resources according to the resource requirements of the task;
    Step 4: the read-write separation module performs read or write operations according to the read-write requirements of the task.
  7. The cloud orchestration method based on read-write separation and automatic scaling according to claim 6, characterized in that setting a task in step 1 includes the following steps:
    Step 1.1: set the maximum number of task server nodes and the server model;
    Step 1.2: convert the logical server-node concept of step 1.1 into server entities;
    Step 1.3: perform model operations on the pre-computing engine module through the client visualization module;
    Step 1.4: trigger the model building task.
  8. The cloud orchestration method based on read-write separation and automatic scaling according to claim 6, characterized in that the task requirement analysis in step 2 includes: the pre-computing engine module submits the resources required by the computing task to the automatic scaling module, and the read-write separation module analyzes the read-write requirements of the task to separate reads from writes.
  9. The cloud orchestration method based on read-write separation and automatic scaling according to claim 6, characterized in that step 3 includes: creating resources by calling the corresponding API of Terraform, the infrastructure-automation orchestration tool, and reclaiming resources according to the resource reclamation policy.
  10. The cloud orchestration method based on read-write separation and automatic scaling according to claim 7, characterized in that the model operations of step 1.3 include creating and editing models and setting indexes.
  11. A cloud orchestration method based on read-write separation and automatic scaling, characterized in that the method includes:
    the pre-computing engine module receives the tasks submitted by the client visualization module and analyzes the task requirements;
    the submitted tasks are processed by the read-write separation module for read-write separation;
    if the request is a read request, the read module of the read-write separation module reads the required information from object storage;
    if the request is a write request, the write module of the read-write separation module dynamically creates or destroys resources through the automatic scaling module, and stores the result of the write operation.
  12. The cloud orchestration method based on read-write separation and automatic scaling according to claim 11, characterized in that the write module of the read-write separation module applying for or destroying resources through the automatic scaling module includes:
    after computing the resource requirements of the task, the pre-computing engine module submits a resource application to the automatic scaling module;
    the automatic scaling module dynamically applies to the cloud for resources or destroys resources according to the resource requirements.
  13. The cloud orchestration method based on read-write separation and automatic scaling according to claim 12, characterized in that the automatic scaling module dynamically applying to the cloud for resources or destroying resources according to the resource requirements includes:
    creating resources by calling the corresponding API of Terraform, the infrastructure-automation orchestration tool, where the creation policy includes expanding resources according to a resource expansion policy, the resource expansion policy including a time-based expansion policy and/or a policy based on the maximum number of pending tasks; and reclaiming resources according to a shrinking policy, where the shrinking policy determines, via the cluster API, whether the worker nodes are idle to decide whether to trigger resource reclamation.
  14. The cloud orchestration method based on read-write separation and automatic scaling according to claim 11, characterized in that the method further includes:
    obtaining the maximum number of task server nodes and the server model set by the client visualization module, where the pre-computing engine module and the automatic scaling module cooperate to convert the logical server-node concept into server entities.
  15. The cloud orchestration method based on read-write separation and automatic scaling according to claim 11, characterized in that all index data is stored in object storage.
  16. The cloud orchestration method based on read-write separation and automatic scaling according to claim 11, characterized in that the method further includes:
    the pre-computing engine module adopts the OLAP modeling tool Kylin.
  17. A cloud orchestration device based on read-write separation and automatic scaling, characterized in that the device includes:
    a task receiving unit, used for the pre-computing engine module to receive the tasks submitted by the client visualization module and analyze the task requirements;
    a separation unit, used to process the submitted tasks through the read-write separation module for read-write separation;
    a read request processing unit, used for: if the request is a read request, the read module of the read-write separation module reads the required information from object storage;
    a write request processing unit, used for: if the request is a write request, the write module of the read-write separation module dynamically creates or destroys resources through the automatic scaling module, and stores the result of the write operation.
  18. A cloud orchestration system based on read-write separation and automatic scaling, characterized in that the system includes a client, a server, and a cloud:
    the client is used to set the number of server nodes and the server model; it is also used to operate models and trigger model building tasks, obtaining the tasks to be submitted to the pre-computing engine module of the server;
    the server is used to execute the cloud orchestration method based on read-write separation and automatic scaling according to any one of claims 11-16;
    the cloud is used to receive resource applications from the server and provide cloud resources for the server.
PCT/CN2021/078499 2020-12-16 2021-03-01 一种基于读写分离及自动伸缩的云编排系统及方法 WO2022126863A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
US17/611,183 US20230359647A1 (en) 2020-12-16 2021-03-01 Read-Write Separation and Automatic Scaling-Based Cloud Arrangement System and Method
EP21801819.0A EP4044031A4 (en) 2020-12-16 2021-03-01 CLOUD ORCHESTRATION SYSTEM AND METHOD BASED ON READ-WRITE SEPARATION AND AUTOMATIC SCALING

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202011492220.2 2020-12-16
CN202011492220.2A CN112579287A (zh) 2020-12-16 2020-12-16 一种基于读写分离及自动伸缩的云编排系统及方法

Publications (1)

Publication Number Publication Date
WO2022126863A1 true WO2022126863A1 (zh) 2022-06-23

Family

ID=75135654

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/078499 WO2022126863A1 (zh) 2020-12-16 2021-03-01 一种基于读写分离及自动伸缩的云编排系统及方法

Country Status (4)

Country Link
US (1) US20230359647A1 (zh)
EP (1) EP4044031A4 (zh)
CN (1) CN112579287A (zh)
WO (1) WO2022126863A1 (zh)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115544025A (zh) * 2022-11-30 2022-12-30 阿里云计算有限公司 数据处理方法和数据处理系统
CN116938724A (zh) * 2023-09-19 2023-10-24 广东保伦电子股份有限公司 音视频会议中服务器的扩容与缩容方法
WO2024066597A1 (zh) * 2022-09-29 2024-04-04 华为云计算技术有限公司 一种数据存储方法及装置

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103248656A (zh) * 2012-02-10 2013-08-14 联想(北京)有限公司 一种实现数据读写的方法以及分布式文件系统、客户端
US20140317223A1 (en) * 2013-04-19 2014-10-23 Electronics And Telecommunications Research Institute System and method for providing virtual desktop service using cache server
CN104504145A (zh) * 2015-01-05 2015-04-08 浪潮(北京)电子信息产业有限公司 一种实现数据库读写分离的方法和设备

Family Cites Families (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20020029207A1 (en) * 2000-02-28 2002-03-07 Hyperroll, Inc. Data aggregation server for managing a multi-dimensional database and database management system having data aggregation server integrated therein
US20130205028A1 (en) * 2012-02-07 2013-08-08 Rackspace Us, Inc. Elastic, Massively Parallel Processing Data Warehouse
CA2860322C (en) * 2011-12-23 2017-06-27 Amiato, Inc. Scalable analysis platform for semi-structured data
CA2906816C (en) * 2013-03-15 2020-06-30 Amazon Technologies, Inc. Scalable analysis platform for semi-structured data
CN103235793A (zh) * 2013-04-01 2013-08-07 华为技术有限公司 联机处理数据的方法、设备及系统
CN103442049B (zh) * 2013-08-22 2016-08-31 浪潮电子信息产业股份有限公司 一种面向构件的混合型云操作系统体系结构及其通信方法
WO2015131961A1 (en) * 2014-03-07 2015-09-11 Systema Systementwicklung Dip.-Inf. Manfred Austen Gmbh Real-time information systems and methodology based on continuous homomorphic processing in linear information spaces
WO2016054605A2 (en) * 2014-10-02 2016-04-07 Reylabs Inc. Systems and methods involving diagnostic monitoring, aggregation, classification, analysis and visual insights
US10902022B2 (en) * 2017-03-28 2021-01-26 Shanghai Kyligence Information Technology Co., Ltd OLAP pre-calculation model, automatic modeling method, and automatic modeling system
US11157478B2 (en) * 2018-12-28 2021-10-26 Oracle International Corporation Technique of comprehensively support autonomous JSON document object (AJD) cloud service
US20200394455A1 (en) * 2019-06-15 2020-12-17 Paul Lee Data analytics engine for dynamic network-based resource-sharing
WO2022091203A1 (ja) * 2020-10-27 2022-05-05 日本電信電話株式会社 データ分析処理装置、データ分析処理方法、およびプログラム

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103248656A (zh) * 2012-02-10 2013-08-14 联想(北京)有限公司 一种实现数据读写的方法以及分布式文件系统、客户端
US20140317223A1 (en) * 2013-04-19 2014-10-23 Electronics And Telecommunications Research Institute System and method for providing virtual desktop service using cache server
CN104504145A (zh) * 2015-01-05 2015-04-08 浪潮(北京)电子信息产业有限公司 一种实现数据库读写分离的方法和设备

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
HU XINPING, ET AL.: "The Study of Library Information System Based on Muti-Tenant", LIBRARY AND INFORMATION SERVICE, vol. 55, no. 11, 30 June 2011 (2011-06-30), XP055943699, ISSN: 0252-3116 *
See also references of EP4044031A4 *
ZHANG CHENGCHENG: "Research and Implementation of Container Cluster Management Platform Based on Docker", MASTER THESIS, 3 June 2019 (2019-06-03), pages 1 - 73, XP055872861 *

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2024066597A1 (zh) * 2022-09-29 2024-04-04 华为云计算技术有限公司 一种数据存储方法及装置
CN115544025A (zh) * 2022-11-30 2022-12-30 阿里云计算有限公司 数据处理方法和数据处理系统
CN115544025B (zh) * 2022-11-30 2023-03-24 阿里云计算有限公司 数据处理方法和数据处理系统
CN116938724A (zh) * 2023-09-19 2023-10-24 广东保伦电子股份有限公司 音视频会议中服务器的扩容与缩容方法
CN116938724B (zh) * 2023-09-19 2024-01-30 广东保伦电子股份有限公司 音视频会议中服务器的扩容与缩容方法

Also Published As

Publication number Publication date
US20230359647A1 (en) 2023-11-09
CN112579287A (zh) 2021-03-30
EP4044031A4 (en) 2023-12-13
EP4044031A1 (en) 2022-08-17

Similar Documents

Publication Publication Date Title
WO2022126863A1 (zh) 一种基于读写分离及自动伸缩的云编排系统及方法
US11275622B2 (en) Utilizing accelerators to accelerate data analytic workloads in disaggregated systems
US9336288B2 (en) Workflow controller compatibility
CN106126601A (zh) 一种社保大数据分布式预处理方法及系统
WO2022083197A1 (zh) 数据处理方法、装置、电子设备和存储介质
US10185743B2 (en) Method and system for optimizing reduce-side join operation in a map-reduce framework
CN103677759A (zh) 一种用于信息系统性能提升的对象化并行计算方法及系统
Senthilkumar et al. A survey on job scheduling in big data
CN116302574B (zh) 一种基于MapReduce的并发处理方法
Aji et al. Haggis: turbocharge a MapReduce based spatial data warehousing system with GPU engine
WO2023231145A1 (zh) 基于云平台的数据处理方法、系统、电子设备及存储介质
CN114691050B (zh) 基于kubernetes的云原生存储方法、装置、设备及介质
Sharma et al. Open source big data analytics technique
US20190391847A1 (en) Resource Scheduling Method and Related Apparatus
US10552419B2 (en) Method and system for performing an operation using map reduce
Salehian et al. Comparison of spark resource managers and distributed file systems
Senthilkumar et al. An efficient FP-Growth based association rule mining algorithm using Hadoop MapReduce
CN115083538B (zh) 一种药物数据的处理系统、运行方法及数据处理方法
WO2017050177A1 (zh) 一种数据同步方法和装置
Thanekar et al. A study on MapReduce: Challenges and Trends
WO2020192225A1 (zh) 一种面向Spark的遥感数据索引方法、系统及电子设备
US10311019B1 (en) Distributed architecture model and management
Junwei et al. Architecture for component library retrieval on the cloud
Bhattu et al. Generalized communication cost efficient multi-way spatial join: revisiting the curse of the last reducer
Xu et al. Parallel implementation of K-Means clustering algorithm based on mapReduce computing model of hadoop.

Legal Events

Date Code Title Description
ENP Entry into the national phase

Ref document number: 2021801819

Country of ref document: EP

Effective date: 20211118

NENP Non-entry into the national phase

Ref country code: DE