CN114816728A - Elastic expansion method and system for cloud environment MongoDB database cluster instance node - Google Patents

Elastic expansion method and system for cloud environment MongoDB database cluster instance node

Info

Publication number
CN114816728A
Authority
CN
China
Prior art keywords
node
instance
database
nodes
cluster
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210222037.3A
Other languages
Chinese (zh)
Inventor
厉颖
赵山
王阳
孙思清
肖雪
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Inspur Cloud Information Technology Co Ltd
Original Assignee
Inspur Cloud Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Inspur Cloud Information Technology Co Ltd filed Critical Inspur Cloud Information Technology Co Ltd
Priority to CN202210222037.3A priority Critical patent/CN114816728A/en
Publication of CN114816728A publication Critical patent/CN114816728A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5083 Techniques for rebalancing the load in a distributed system
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/3003 Monitoring arrangements specially adapted to the computing system or computing system component being monitored
    • G06F11/3006 Monitoring arrangements specially adapted to the computing system or computing system component being monitored where the computing system is distributed, e.g. networked systems, clusters, multiprocessor systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/34 Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment
    • G06F11/3409 Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment for performance assessment
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5061 Partitioning or combining of resources
    • G06F9/5072 Grid computing
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/12 Protocols specially adapted for proprietary or special-purpose networking environments, e.g. medical networks, sensor networks, networks in vehicles or remote metering networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • General Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • Quality & Reliability (AREA)
  • Mathematical Physics (AREA)
  • Computer Hardware Design (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention provides a method and a system for the elastic expansion and contraction of MongoDB database cluster instance nodes in a cloud environment. The method consists of a process of deploying a MongoDB database in a public cloud environment, a process of training and predicting on database monitoring data, and a process of elastically expanding and contracting database nodes. The system automatically determines the appropriate number of nodes for the MongoDB replica set instance in the current time period and elastically scales the read-only nodes of the database cluster, optimizing the number of instance node resources while ensuring that the database service does not go down. A monitoring data training module is built from the large volume of monitoring metrics collected over past time periods in the cloud environment. According to the appropriate number of read-only nodes, read-only instance nodes are automatically created or deleted and the MongoDB replica set cluster is automatically configured, which saves the user's resource cost and reduces the operation and maintenance cost of constantly monitoring database performance.

Description

Elastic expansion method and system for cloud environment MongoDB database cluster instance node
Technical Field
The invention relates to the technical field of cloud databases, and in particular to a method and a system for elastic expansion and contraction of cloud environment MongoDB database cluster instance nodes.
Background
In Internet application scenarios, the volume of database read requests is large and the concurrency requirements are high, so a database architecture with read-write separation is needed to meet the demand. A read-only instance group is one implementation of read-write separation.
In scenarios where a database receives few write requests but many read requests, the read requests become the bottleneck, and a conventional database instance cannot respond quickly to read and write operations at the same time. To improve performance and share the load, the number of nodes in the database cluster is increased and read requests are distributed to the new nodes, which reduces the pressure on the existing nodes. One or more database nodes may be created to form a node group; the new node group provides read-only functionality to share the read load and is also referred to as a read-only instance group. The primary instance address and the read-only instance address are configured in the application, so that write requests are forwarded to the primary instance and read requests are forwarded to the read-only instances, which makes read-write separation more convenient.
MongoDB is a product that sits between relational and non-relational databases; among non-relational databases it has the richest feature set and most closely resembles a relational database. Its replica set clustering mode provides redundancy and high availability and is the basis for all production deployments. MongoDB replication requires at least two nodes: one is the primary node, which is responsible for processing client requests, and the others are secondary nodes, which are responsible for replicating the data on the primary node. A replica set may have one or more secondary nodes. The primary node receives all write operations, and all members of the replica set can accept read operations. By default, the application directs read operations to the primary node, so when the volume of read requests is large the primary node becomes the bottleneck for database access.
Disclosure of Invention
The embodiment of the invention provides a method and a system for elastic expansion and contraction of cloud environment MongoDB database cluster instance nodes, which can improve the security of server access.
A cloud environment MongoDB database cluster instance node elastic expansion and contraction method comprises the following processes:
a process of deploying a MongoDB database in a public cloud environment;
a database monitoring data training and prediction process, which specifically comprises a model training process and a node number prediction process;
a database node elastic expansion and contraction process, which specifically comprises a node capacity expansion process and a node capacity reduction process.
Optionally,
the process of deploying the MongoDB database in the public cloud environment specifically comprises the following steps:
step one: prompting the user whether to enable elastic expansion and contraction of the MongoDB database cluster instance nodes; if it is enabled, entering step two; if it is not enabled, ending the process, in which case the user decides on their own whether to create or delete database instance nodes;
step two: acquiring the monitored performance metrics of the MongoDB database, including CPU, memory, network, the number of read and write requests, the read-write ratio, and the average read operation response time in different time periods;
step three: based on the MongoDB monitoring data, setting a threshold for the average read operation response time within 5 minutes, building a multiple linear regression model to learn from the monitoring metrics, and predicting the appropriate number of read-only nodes for which the average read operation response time falls within the set threshold range;
step four: determining whether the current number of nodes is within the appropriate range of the predicted number of nodes; if so, the current number of nodes is appropriate and the process ends; otherwise, entering step five;
step five: if the current number of nodes is less than the predicted number of nodes, performing node capacity expansion; otherwise, performing node capacity reduction.
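Steps four and five amount to a small comparison, which can be sketched as follows. This is a minimal illustration rather than the patented implementation; the function name `scaling_decision` and the `tolerance` parameter (standing in for the unspecified "appropriate range" around the predicted node number) are assumptions.

```python
def scaling_decision(current_nodes, predicted_nodes, tolerance=0):
    """Return (action, node_delta) for steps four and five.

    tolerance is a hypothetical stand-in for the "appropriate range"
    around the predicted node number; the source does not give its width.
    """
    if abs(current_nodes - predicted_nodes) <= tolerance:
        # Step four: the current node number is appropriate, nothing to do.
        return ("none", 0)
    if current_nodes < predicted_nodes:
        # Step five: too few nodes, expand capacity.
        return ("expand", predicted_nodes - current_nodes)
    # Step five: too many nodes, reduce capacity.
    return ("reduce", current_nodes - predicted_nodes)
```

Returning the node difference together with the action lets the caller know how many read-only nodes to create or delete in one pass.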
Optionally,
the model training process comprises the following steps:
step six: constructing a training set and a test set from the historical database monitoring data, where the labeled data in both sets mainly comprises the number of nodes of the current database instance and the CPU, memory, network and read load of the database instance;
step seven: normalizing the training data and the test data to remove the inconsistent units of the metrics, where x denotes the current metric value and y denotes the normalized value of the corresponding metric, using the following formula:
[Formula image BDA0003533957240000031: the normalization formula mapping x to y is not reproduced in the text.]
step eight: constructing a multiple linear regression model with the formula $f(x) = w_1 x_1 + w_2 x_2 + \dots + w_d x_d + b$, where $w_1, w_2, \dots, w_d$ are the weights, $b$ is the bias, $x_1, x_2, \dots, x_d$ are the influencing factors (such as the number of nodes of the current database instance and the CPU, memory, network and read load of the database instance), and $f(x)$ is the predicted value; the loss function is the mean squared error, and the minimum is found with the stochastic gradient descent algorithm;
step nine: training the constructed multiple linear regression model on the training data for multiple rounds, with the number of training iterations set to 1000;
step ten: after training, validating the model parameters on the test set; if the deviation is too large, the number of training iterations can be increased and the model further optimized until the error is within a reasonable range.
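Steps seven to ten can be illustrated with a small pure-Python sketch. It is not the patent's implementation: the normalization formula is not reproduced in the text, so min-max scaling is used here purely as an assumption, and the function names, the learning rate and the random seed are hypothetical. The model form, the mean squared error loss, stochastic gradient descent and the 1000 training rounds follow the steps above.

```python
import random


def min_max_normalize(column):
    """Scale values to [0, 1]. Min-max scaling is an assumed choice;
    the source's normalization formula is not available in the text."""
    lo, hi = min(column), max(column)
    if hi == lo:
        return [0.0 for _ in column]
    return [(v - lo) / (hi - lo) for v in column]


def predict(weights, bias, features):
    """f(x) = w1*x1 + w2*x2 + ... + wd*xd + b (step eight)."""
    return sum(w * x for w, x in zip(weights, features)) + bias


def mse(weights, bias, rows, targets):
    """Mean squared error over a data set, usable on the test set (step ten)."""
    errors = [(predict(weights, bias, r) - t) ** 2 for r, t in zip(rows, targets)]
    return sum(errors) / len(errors)


def train_sgd(rows, targets, epochs=1000, learning_rate=0.05, seed=0):
    """Fit weights and bias with stochastic gradient descent (steps eight and nine)."""
    rng = random.Random(seed)
    weights = [0.0] * len(rows[0])
    bias = 0.0
    order = list(range(len(rows)))
    for _ in range(epochs):
        rng.shuffle(order)  # visit the samples in a new random order each round
        for i in order:
            error = predict(weights, bias, rows[i]) - targets[i]
            # Gradient of the squared error of one sample with respect to w_j and b.
            for j, x in enumerate(rows[i]):
                weights[j] -= learning_rate * 2 * error * x
            bias -= learning_rate * 2 * error
    return weights, bias
```

In use, each metric column would be normalized first, `train_sgd` run on the training rows, and `mse` evaluated on the held-out test rows; if the error is too large, `epochs` can be raised as step ten suggests.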
Optionally,
the node number prediction process comprises the following steps:
step eleven: identifying the factors that influence the average read operation response time, such as the number of nodes of the current database instance, CPU, memory, network and read load;
step twelve: acquiring performance data of the MongoDB database, including monitoring data such as CPU, memory, network, the number of read and write requests, the read-write ratio and the average read operation response time of the database in different time periods, and normalizing the data;
step thirteen: if the average read operation response time is greater than the maximum response time threshold or less than the minimum response time threshold, performing the node number prediction operation; the thresholds are set with the characteristics of the MongoDB database in mind: because elastic scaling of database nodes must allow for database data synchronization, capacity expansion has to start early enough to leave time for MongoDB data restoration; the node number prediction is therefore performed once the read response time has kept rising and has reached the maximum response time threshold 3 consecutive times, and the maximum response time threshold is accordingly set lower than the normal response time threshold;
step fourteen: the node number prediction uses the trained model and the collected values: the average read operation response time is set to a specified value, and the number of nodes is predicted;
step fifteen: comparing the predicted appropriate number of nodes with the number of nodes of the current instance, and informing the program whether to elastically scale the nodes and by how many nodes.
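One way to read steps thirteen and fourteen is sketched below. Everything named here is hypothetical (`PredictionTrigger`, `predict_node_count`, the millisecond units). Several assumptions are made that the source does not confirm: a reading at or above the maximum threshold counts as "reaching" it, a reading below the minimum threshold triggers prediction immediately, the regression output f(x) is the average read response time, the node-count feature has a negative weight, and raw (not normalized) feature values are used so that the result is a node count.

```python
import math


class PredictionTrigger:
    """Decide when the node number prediction should run (step thirteen)."""

    def __init__(self, min_threshold_ms, max_threshold_ms, required_breaches=3):
        self.min_threshold_ms = min_threshold_ms
        self.max_threshold_ms = max_threshold_ms
        self.required_breaches = required_breaches
        self.consecutive_high = 0

    def observe(self, avg_read_ms):
        """Record one average read response time; return True to run prediction."""
        if avg_read_ms >= self.max_threshold_ms:
            self.consecutive_high += 1
        else:
            self.consecutive_high = 0
        if self.consecutive_high >= self.required_breaches:
            self.consecutive_high = 0
            return True
        # Assumption: a reading below the minimum threshold triggers immediately.
        return avg_read_ms < self.min_threshold_ms


def predict_node_count(weights, bias, metrics, node_index, target_read_ms):
    """Solve f(x) = target_read_ms for the node-count feature (step fourteen).

    Assumes f(x) models the average read response time and that the weight
    of the node-count feature is negative (more nodes, faster reads).
    """
    node_weight = weights[node_index]
    if node_weight >= 0:
        raise ValueError("node-count weight must be negative under this assumption")
    others = sum(
        w * x for i, (w, x) in enumerate(zip(weights, metrics)) if i != node_index
    )
    nodes = (target_read_ms - bias - others) / node_weight
    # Round up so the predicted response time does not exceed the target.
    return max(1, math.ceil(nodes))
```

Counting consecutive breaches rather than reacting to a single spike reflects the source's concern that expansion is slow (data must be restored first), so it should be started early but not on noise.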
Optionally,
the node capacity expansion process comprises the following steps:
step sixteen: creating the virtual machine for the expansion instance node; it is recommended that the expansion node have the same specification as the primary instance, that is, the same CPU and memory specification, which facilitates subsequent monitoring of performance data;
step seventeen: restoring the data of the MongoDB primary instance cluster node to the newly expanded node so that the expansion instance node's data stays synchronized with the primary instance; this avoids the long data synchronization time that a new replica added to the primary instance cluster would otherwise need when the data volume is large;
step eighteen: synchronizing the configuration file of the cluster primary instance node to the expansion instance so that the expansion replica's configuration is consistent with the primary instance;
step nineteen: adding the new instance as a replica set member on the cluster primary instance node, where the MongoDB database executes the rs.add command; the priority of the new member is configured to be lower than that of the primary instance, which ensures that nodes of the primary instance cluster are preferred when the replica set fails over and that the read-only instance node remains a secondary node;
step twenty: checking the cluster state on the cluster primary instance node, where the MongoDB database executes the rs.status command to view the state of the current instance's replica set cluster; if the state is normal, entering step twenty-one; otherwise, entering step twenty-three;
step twenty-one: adding the new expansion instance to the read-only load balancing configuration, that is, adding the information of the new expansion instance node to the configuration of the load balancer that distributes the read-only load;
step twenty-two: reloading the load balancing configuration so that the new expansion instance node takes effect and read load is distributed to it;
step twenty-three: if there is a problem with the replica set state, rolling back and deleting the expansion node;
step twenty-four: recording the capacity expansion state and returning the number of MongoDB instance nodes after the expansion instance node is created.
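The ordering of steps sixteen to twenty-four, including the rollback branch, can be sketched as below. This is an outline under stated assumptions, not the patent's code: the `steps` dictionary of callables and all of its keys are hypothetical placeholders for the real cloud, MongoDB and load balancer operations, and the rollback is assumed to consist of removing the member and deleting its virtual machine. The member document passed to `rs.add` uses the `host` and `priority` fields that MongoDB accepts; a priority of 0 (never eligible to become primary) is one assumed way to make the new member's priority lower than the primary instance's.

```python
def build_member_document(host, priority=0):
    """Member document for rs.add (step nineteen); priority 0 is an assumed choice."""
    return {"host": host, "priority": priority}


def expand_node(steps, host):
    """Run steps sixteen to twenty-four in order.

    steps is a hypothetical dict of callables wrapping the real operations.
    Returns (state, node_count).
    """
    steps["create_vm"](host)                             # step sixteen
    steps["restore_data"](host)                          # step seventeen
    steps["sync_config"](host)                           # step eighteen
    steps["add_member"](build_member_document(host))     # step nineteen: rs.add
    if steps["replica_set_healthy"]():                   # step twenty: rs.status
        steps["add_to_load_balancer"](host)              # step twenty-one
        steps["reload_load_balancer"]()                  # step twenty-two
        state = "expanded"
    else:
        # Step twenty-three: assumed rollback, undo the member and its VM.
        steps["remove_member"](host)
        steps["delete_vm"](host)
        state = "rolled_back"
    return state, steps["node_count"]()                  # step twenty-four
```

The new node is added to the load balancer only after the replica set reports a normal state, so read traffic never reaches a member that failed to join.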
Optionally,
the node capacity reduction process comprises the following steps:
step twenty-five: deleting the instance to be removed from the read-only load balancing configuration, which prevents read-only load from being distributed to the instance node to be deleted;
step twenty-six: reloading the load balancing configuration so that the removal of the instance node takes effect;
step twenty-seven: removing the replica set member to be deleted on the cluster primary instance, where the MongoDB database executes the rs.remove command;
step twenty-eight: deleting the virtual machine of the instance node;
step twenty-nine: recording the capacity reduction state and returning the number of MongoDB instance nodes after the read-only instance node is deleted.
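Capacity reduction runs the same kinds of operations in the opposite order: traffic is drained first, and the member and its virtual machine are removed afterwards. A minimal sketch, again with a hypothetical `steps` dictionary of callables standing in for the real operations:

```python
def reduce_node(steps, host):
    """Run steps twenty-five to twenty-nine in order; returns (state, node_count).

    steps is a hypothetical dict of callables wrapping the real operations.
    """
    steps["remove_from_load_balancer"](host)   # step twenty-five: stop routing reads
    steps["reload_load_balancer"]()            # step twenty-six
    steps["remove_member"](host)               # step twenty-seven: rs.remove
    steps["delete_vm"](host)                   # step twenty-eight
    return "reduced", steps["node_count"]()    # step twenty-nine
```

Removing the host from the load balancer before calling `rs.remove` means no read request is sent to a node that is about to leave the replica set.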
A cloud environment MongoDB database cluster instance node elastic expansion and contraction system, characterized in that:
the cloud environment MongoDB database cluster instance node elastic expansion and contraction system comprises:
a MongoDB database monitoring module;
a monitoring data training and prediction module;
a node elastic expansion and contraction module, which comprises a node automatic capacity expansion module and a node capacity reduction module.
Optionally,
the MongoDB database monitoring module prompts the user whether to enable the elastic expansion and contraction module for the MongoDB database cluster instance nodes; if it is enabled, the MongoDB database monitoring module is started; if it is not enabled, the process ends and the user decides on their own whether to create or delete database instance nodes;
the MongoDB database monitoring module is used to acquire the monitored performance metrics of the MongoDB database, including CPU, memory, network, the number of read and write requests, the read-write ratio, and the average read operation response time in different time periods;
the monitoring data training and prediction module sets a threshold for the average read operation response time within 5 minutes, builds a multiple linear regression model to learn from the monitoring metrics, and predicts the appropriate number of read-only nodes for which the average read operation response time falls within the set threshold range;
it is then determined whether the current number of nodes is within the appropriate range of the predicted number of nodes; if so, the current number of nodes is appropriate and the process ends; otherwise, the node elastic expansion and contraction module is entered;
the node elastic expansion and contraction module expands node capacity if the current number of nodes is less than the predicted number of nodes, and otherwise reduces node capacity.
Optionally,
the node automatic capacity expansion module first creates the virtual machine for the expansion instance node; it is recommended that the expansion node have the same specification as the primary instance, that is, the same CPU and memory specification, which facilitates subsequent monitoring of performance data;
the data of the MongoDB primary instance cluster node is restored to the newly expanded node so that the expansion instance node's data stays synchronized with the primary instance;
the configuration file of the cluster primary instance node is synchronized to the expansion instance so that the expansion replica's configuration is consistent with the primary instance;
the new instance is added as a replica set member on the cluster primary instance node, where the MongoDB database module executes the rs.add command; the priority of the new member is configured to be lower than that of the primary instance, which ensures that nodes of the primary instance cluster are preferred when the replica set fails over and that the read-only instance node remains a secondary node;
the cluster primary instance node checks the cluster state, where the MongoDB database module executes the rs.status command to view the state of the current instance's replica set cluster; if the state is normal, the new expansion instance is added to the read-only load balancing configuration, that is, the information of the new expansion instance node is added to the configuration of the load balancer that distributes the read-only instance load; otherwise, there is a problem with the replica set state, and the expansion node is rolled back and deleted;
the load balancing configuration is reloaded so that the new expansion instance node takes effect and read load is distributed to it;
the node automatic capacity expansion module records the capacity expansion state and returns the number of MongoDB instance nodes after the expansion instance node is created.
Optionally,
the node capacity reduction module first deletes the instance to be removed from the read-only load balancing configuration, which prevents read-only load from being distributed to the instance node to be deleted;
the node capacity reduction module reloads the load balancing configuration so that the removal of the instance node takes effect;
the cluster primary instance removes the replica set member to be deleted, where the MongoDB database executes the rs.remove command;
the virtual machine of the instance node is deleted;
the node capacity reduction module records the capacity reduction state and returns the number of MongoDB instance nodes after the read-only instance node is deleted.
Compared with the prior art, the invention has the following beneficial effects:
In the embodiments of the invention, an elastic expansion and contraction method for cloud environment MongoDB database cluster instance nodes is provided, consisting of a process of deploying a MongoDB database in a public cloud environment, a database monitoring data training and prediction process, and a database node elastic expansion and contraction process. The system automatically determines the appropriate number of nodes for the MongoDB replica set instance in the current time period and elastically scales the read-only nodes of the database cluster, optimizing the number of instance node resources while ensuring that the database service does not go down. A monitoring data training module is built from the large volume of monitoring metrics collected over past time periods in the cloud environment; a threshold is set for the average read operation response time within a period of time, a multiple linear regression model learns from the monitoring metrics, and the appropriate number of read-only nodes for which the average read operation response time falls within the set threshold range is predicted. According to the appropriate number of read-only nodes, read-only instance nodes are automatically created or deleted and the MongoDB replica set cluster is automatically configured, which saves the user's resource cost and reduces the operation and maintenance cost of constantly monitoring database performance.
Drawings
To illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings used in the description of the embodiments or the prior art are briefly introduced below. The drawings described below show only some embodiments of the present invention; those skilled in the art can obtain other drawings from them without creative effort.
FIG. 1 is an overall flow chart of an embodiment of the present invention;
FIG. 2 is a flow chart of the model training process according to an embodiment of the present invention;
FIG. 3 is a flow chart of predicting the number of nodes according to an embodiment of the present invention;
FIG. 4 is a flowchart illustrating a node capacity expansion method according to an embodiment of the present invention;
FIG. 5 is a flowchart of a node capacity reduction process according to an embodiment of the present invention.
Detailed Description
To make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments are described below clearly and completely with reference to the drawings. The described embodiments are some, but not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on these embodiments without creative effort fall within the protection scope of the present invention.
Referring to FIGS. 1-5, the present invention provides a technical solution: a cloud environment MongoDB database cluster instance node elastic expansion and contraction method and system, where the method comprises the following processes:
a process of deploying a MongoDB database in a public cloud environment;
a database monitoring data training and prediction process, which specifically comprises a model training process and a node number prediction process;
a database node elastic expansion and contraction process, which specifically comprises a node capacity expansion process and a node capacity reduction process.
The process of deploying the MongoDB database in the public cloud environment specifically comprises the following steps:
step one: prompting the user whether to enable elastic expansion and contraction of the MongoDB database cluster instance nodes; if it is enabled, entering step two; if it is not enabled, ending the process, in which case the user decides on their own whether to create or delete database instance nodes;
step two: acquiring the monitored performance metrics of the MongoDB database, including CPU, memory, network, the number of read and write requests, the read-write ratio, and the average read operation response time in different time periods;
step three: based on the MongoDB monitoring data, setting a threshold for the average read operation response time within 5 minutes, building a multiple linear regression model to learn from the monitoring metrics, and predicting the appropriate number of read-only nodes for which the average read operation response time falls within the set threshold range;
step four: determining whether the current number of nodes is within the appropriate range of the predicted number of nodes; if so, the current number of nodes is appropriate and the process ends; otherwise, entering step five;
step five: if the current number of nodes is less than the predicted number of nodes, performing node capacity expansion; otherwise, performing node capacity reduction.
The model training process comprises the following steps:
step six: constructing a training set and a test set from the historical database monitoring data, where the labeled data in both sets mainly comprises the number of nodes of the current database instance and the CPU, memory, network and read load of the database instance;
step seven: normalizing the training data and the test data to remove the inconsistent units of the metrics, where x denotes the current metric value and y denotes the normalized value of the corresponding metric, using the following formula:
[Formula image BDA0003533957240000091: the normalization formula mapping x to y is not reproduced in the text.]
step eight: constructing a multiple linear regression model with the formula $f(x) = w_1 x_1 + w_2 x_2 + \dots + w_d x_d + b$, where $w_1, w_2, \dots, w_d$ are the weights, $b$ is the bias, $x_1, x_2, \dots, x_d$ are the influencing factors (such as the number of nodes of the current database instance and the CPU, memory, network and read load of the database instance), and $f(x)$ is the predicted value; the loss function is the mean squared error, and the minimum is found with the stochastic gradient descent algorithm;
step nine: training the constructed multiple linear regression model on the training data for multiple rounds, with the number of training iterations set to 1000;
step ten: after training, validating the model parameters on the test set; if the deviation is too large, the number of training iterations can be increased and the model further optimized until the error is within a reasonable range.
The process of predicting the number of nodes includes:
step eleven: identifying the factors that influence the average response time of read operations, such as the number of nodes of the current database instance, CPU, memory, network, and read load;
step twelve: acquiring performance data of the MongoDB database, including monitoring data such as the CPU, memory, network, number of read and write requests, read/write ratio, and average read response time of the database in different time periods, and normalizing the data;
step thirteen: if the average read response time is greater than the maximum response time threshold or less than the minimum response time threshold, executing the node number prediction operation. The thresholds are set with the characteristics of the MongoDB database in mind: elastic scaling of database nodes must account for data synchronization, so capacity expansion has to start in advance to leave time for MongoDB data recovery. Node number prediction is therefore performed when the read response time keeps rising and reaches the maximum response time threshold 3 times in a row, and the maximum average response time threshold is accordingly set lower than the normal response time threshold;
step fourteen: performing the node prediction operation with the trained model and the collected values, in which the average read response time is set to a specified value and the model predicts the corresponding number of nodes;
step fifteen: comparing the predicted suitable node number with the node number of the current instance, and informing the program whether to scale the nodes elastically and by how many nodes.
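The trigger in step thirteen and the prediction in step fourteen can be sketched as below. This is one possible reading, not the disclosed implementation: it assumes the minimum threshold uses the same "3 consecutive samples" rule as the maximum threshold, that the node-count feature is kept in raw (unnormalized) units, and that the node number is obtained by solving the trained linear model for that feature at the target response time. All names are hypothetical.

```python
import math


def should_predict(samples, max_threshold, min_threshold, consecutive=3):
    """True when the last `consecutive` samples all breach the same threshold."""
    if len(samples) < consecutive:
        return False
    recent = samples[-consecutive:]
    too_slow = all(s > max_threshold for s in recent)
    too_fast = all(s < min_threshold for s in recent)
    return too_slow or too_fast


def predict_node_count(weights, bias, other_features, target_response,
                       node_index=0):
    """Solve f(x) = target_response for the node-count feature.

    `weights[node_index]` is the weight of the node-count feature;
    `other_features` holds the remaining feature values in weight order.
    """
    w_nodes = weights[node_index]
    if w_nodes == 0:
        raise ValueError("node count has no effect on the predicted response time")
    other_weights = weights[:node_index] + weights[node_index + 1:]
    rest = sum(w * x for w, x in zip(other_weights, other_features)) + bias
    raw = (target_response - rest) / w_nodes
    # Round up so the target is met when more nodes lower the response time,
    # and never go below one node.
    return max(1, math.ceil(raw))
```

Rounding up is a conservative choice for the common case where the node-count weight is negative (more read-only nodes, lower response time).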
The node capacity expansion process comprises the following steps:
step sixteen: creating the virtual machine of the capacity expansion instance node; it is recommended that the expansion node match the specification of the main instance, that is, have the same CPU and memory specification, which facilitates subsequent monitoring of performance data;
step seventeen: restoring the data of the MongoDB main instance cluster node to the newly expanded node so that the expansion instance node data stays synchronized with the main instance; this avoids the long replica synchronization time that would occur if the main instance cluster added an empty new replica while the data volume is large;
step eighteen: synchronizing the configuration file of the cluster main instance node to the expansion instance so that the configuration of the expansion replica is consistent with that of the main instance;
step nineteen: adding the new instance as a replica set member on the cluster main instance node by executing the rs.add command in the MongoDB database, and configuring the priority of the new member to be lower than that of the main instance, which ensures that nodes of the main instance cluster are preferentially elected when the replica set cluster fails over and that the read-only instance node serves as a secondary node;
step twenty: checking the cluster state on the cluster main instance node by executing the rs.status command in the MongoDB database to view the state of the current instance replica set cluster; if the state is normal, going to step twenty-one; otherwise, going to step twenty-three;
step twenty-one: adding the new expansion instance to the read-only load balancing configuration, that is, adding the information of the new expansion instance node to the configuration of the load balancer that distributes the read-only load;
step twenty-two: reloading the load balancing configuration so that the new expansion instance node takes effect and read load is distributed to it;
step twenty-three: if the replica set state is abnormal, rolling back and deleting the expansion node;
step twenty-four: recording the capacity expansion state, and returning the number of MongoDB instance nodes after the expansion instance node is created.
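The ordering of steps sixteen to twenty-four, including the rollback branch, can be sketched as a small orchestration function. The sketch assumes that each infrastructure action is supplied by the caller as a callable; `ScaleOutOps`, its field names, and the log tuple format are hypothetical and only illustrate the control flow.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class ScaleOutOps:
    """Caller-supplied actions; each one wraps a real infrastructure call."""
    create_vm: Callable[[], str]              # step 16, returns the node address
    restore_data: Callable[[str], None]       # step 17
    sync_config: Callable[[str], None]        # step 18
    add_member: Callable[[str], None]         # step 19, e.g. wraps rs.add
    replica_set_healthy: Callable[[], bool]   # step 20, e.g. wraps rs.status
    add_to_balancer: Callable[[str], None]    # step 21
    reload_balancer: Callable[[], None]       # step 22
    remove_member: Callable[[str], None]      # step 23 rollback
    delete_vm: Callable[[str], None]          # step 23 rollback


def scale_out(ops: ScaleOutOps, current_nodes: int,
              log: List[Tuple[str, str, str]]) -> int:
    """Add one read-only node and return the resulting node count."""
    node = ops.create_vm()
    ops.restore_data(node)
    ops.sync_config(node)
    ops.add_member(node)
    if ops.replica_set_healthy():
        ops.add_to_balancer(node)
        ops.reload_balancer()
        log.append(("expand", node, "ok"))        # step 24
        return current_nodes + 1
    # Replica set is unhealthy: undo the membership change and drop the VM.
    ops.remove_member(node)
    ops.delete_vm(node)
    log.append(("expand", node, "rolled_back"))   # step 24
    return current_nodes
```

Passing the actions in as callables keeps the sketch independent of any particular cloud API or load balancer, neither of which is named in the text.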
The node capacity reduction process comprises the following steps:
step twenty-five: deleting the capacity reduction instance from the read-only load balancing configuration to prevent read-only load from being distributed to the instance node to be deleted;
step twenty-six: reloading the load balancing configuration so that the removal of the instance node takes effect;
step twenty-seven: removing the replica set member to be deleted from the cluster main instance by executing the rs.remove command in the MongoDB database;
step twenty-eight: deleting the virtual machine of the instance node;
step twenty-nine: recording the capacity reduction state, and returning the number of MongoDB instance nodes after the read-only instance node is deleted.
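The capacity reduction steps follow a fixed order: traffic is drained first, then the member leaves the replica set, and only then is the virtual machine deleted. A minimal sketch of that order is shown below; the function name, the callables passed in, and the log tuple format are hypothetical.

```python
def scale_in(node, current_nodes, remove_from_balancer, reload_balancer,
             remove_member, delete_vm, log):
    """Remove one read-only node and return the resulting node count."""
    remove_from_balancer(node)   # step 25: stop routing reads to the node
    reload_balancer()            # step 26: make the removal take effect
    remove_member(node)          # step 27: e.g. wraps rs.remove
    delete_vm(node)              # step 28
    log.append(("reduce", node, "ok"))  # step 29
    return current_nodes - 1
```

Draining the load balancer before calling `remove_member` means no client read is sent to a node that is about to leave the replica set.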
The elastic scaling system for cloud environment MongoDB database cluster instance nodes comprises:
a MongoDB database monitoring module;
a monitoring data training prediction module;
a node elastic scaling module, which comprises a node automatic capacity expansion module and a node capacity reduction module.
The MongoDB database monitoring module prompts the user whether to enable the elastic scaling module for MongoDB database cluster instance nodes; if it is enabled, the MongoDB database monitoring module is started; if it is not enabled, the process ends and the user decides on their own when to create or delete database instance nodes;
the MongoDB database monitoring module acquires the monitoring performance indexes of the MongoDB database, including CPU, memory, network, number of read and write requests, read/write ratio, and average read response time in different time periods;
the monitoring data training prediction module sets a threshold for the average read response time within 5 minutes and, when the average read response time meets the set threshold condition, builds a multiple linear regression model to learn from the monitoring indexes and predict the suitable number of read-only nodes;
the module then checks whether the current node number is within the suitable range of the predicted node number; if so, the current node number is appropriate and the process ends; otherwise, the node elastic scaling module is invoked.
The node elastic scaling module expands node capacity if the current node number is less than the predicted node number; otherwise, it reduces node capacity.
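The comparison between the current and the predicted node number can be sketched as a small decision function. The `tolerance` parameter is an assumption standing in for the "suitable range" mentioned in the text, whose width is not specified; the function name and return format are hypothetical.

```python
def plan_scaling(current_nodes, predicted_nodes, tolerance=0):
    """Return (action, count); action is 'none', 'expand', or 'reduce'."""
    if current_nodes < 1 or predicted_nodes < 1:
        raise ValueError("node counts must be at least 1")
    diff = predicted_nodes - current_nodes
    if abs(diff) <= tolerance:
        return ("none", 0)        # current node number is already suitable
    if diff > 0:
        return ("expand", diff)   # fewer nodes than predicted
    return ("reduce", -diff)      # more nodes than predicted
```

For example, `plan_scaling(3, 5)` asks for two more read-only nodes, while `plan_scaling(4, 5, tolerance=1)` leaves the cluster unchanged.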
The node automatic capacity expansion module first creates the virtual machine of the capacity expansion instance node; it is recommended that the expansion node match the specification of the main instance, that is, have the same CPU and memory specification, which facilitates subsequent monitoring of performance data;
the data of the MongoDB main instance cluster node is restored to the newly expanded node so that the expansion instance node data stays synchronized with the main instance;
the configuration file of the cluster main instance node is synchronized to the expansion instance so that the configuration of the expansion replica is consistent with that of the main instance;
the new instance is added as a replica set member on the cluster main instance node by executing the rs.add command in the MongoDB database, and the priority of the new member is configured to be lower than that of the main instance, which ensures that nodes of the main instance cluster are preferentially elected when the replica set cluster fails over and that the read-only instance node serves as a secondary node;
the cluster main instance node checks the cluster state by executing the rs.status command in the MongoDB database to view the state of the current instance replica set cluster; if the state is normal, the new expansion instance is added to the read-only load balancing configuration, that is, the information of the new expansion instance node is added to the configuration of the load balancer that distributes the read-only load; otherwise, the replica set state is abnormal, and the expansion node is rolled back and deleted;
the load balancing configuration is reloaded so that the new expansion instance node takes effect and read load is distributed to it;
the node automatic capacity expansion module records the capacity expansion state and returns the number of MongoDB instance nodes after the expansion instance node is created.
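As an illustration of the rs.add and rs.remove calls mentioned above, the sketch below builds the shell expressions as strings that could be passed to the MongoDB shell. The helper names, the example priority of 0.5, and the assumption that the main instance members use priority 1.0 are all hypothetical; the text only requires that the new member's priority be lower than that of the main instance.

```python
import json


def build_rs_add_command(host, priority=0.5, primary_priority=1.0):
    """Build an rs.add(...) expression for a lower-priority member."""
    if not 0 <= priority < primary_priority:
        raise ValueError("new member priority must be lower than the main instance")
    member = {"host": host, "priority": priority}
    # A JSON object is also a valid JavaScript object literal for the shell.
    return f"rs.add({json.dumps(member)})"


def build_rs_remove_command(host):
    """Build an rs.remove(...) expression for the member at `host`."""
    return f"rs.remove({json.dumps(host)})"
```

Building the command as a string keeps the sketch free of any driver dependency; how the string is executed (shell invocation, driver call) is left to the deployment.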
The node capacity reduction module first deletes the capacity reduction instance from the read-only load balancing configuration to prevent read-only load from being distributed to the instance node to be deleted;
the node capacity reduction module reloads the load balancing configuration so that the removal of the instance node takes effect;
the cluster main instance removes the replica set member to be deleted by executing the rs.remove command in the MongoDB database;
the virtual machine of the instance node is deleted;
the node capacity reduction module records the capacity reduction state and returns the number of MongoDB instance nodes after the read-only instance node is deleted.
According to this scheme, the elastic scaling method for cloud environment MongoDB database cluster instance nodes consists of a MongoDB database deployment process in a public cloud environment, a database monitoring data training and prediction process, and a database node elastic scaling process. The system automatically determines the number of nodes suitable for the MongoDB replica set instance in the current time period and elastically scales the read-only nodes of the database cluster, optimizing the number of instance node resources while ensuring that the database service does not go down. A monitoring data training module is built from the large volume of monitoring indexes collected over past time periods in the cloud environment. A threshold is set for the average read response time over a period of time; when the average read response time meets the set threshold condition, the monitoring indexes are learned with a multiple linear regression model and the suitable number of read-only nodes is predicted. Based on the predicted number, read-only instance nodes are automatically created or deleted and the MongoDB replica set cluster is automatically configured, which saves the user's resource cost and reduces the operation and maintenance cost of constantly monitoring database performance.
Because the information interaction, execution process, and other contents between the units in the device are based on the same concept as the method embodiment of the present invention, specific contents may refer to the description in the method embodiment of the present invention, and are not described herein again.
The invention also provides a cloud environment MongoDB database cluster instance node elastic scaling device storing instructions for causing a computer to perform the elastic scaling method for cloud environment MongoDB database cluster instance nodes as described herein. Specifically, a system or an apparatus equipped with a storage medium may be provided, the storage medium storing software program code that realizes the functions of any of the above-described embodiments, and a computer (or a CPU or MPU) of the system or the apparatus is caused to read out and execute the program code stored in the storage medium.
In this case, the program code itself read from the storage medium can realize the functions of any of the above-described embodiments, and thus the program code and the storage medium storing the program code constitute a part of the present invention.
Examples of the storage medium for supplying the program code include a floppy disk, a hard disk, a magneto-optical disk, an optical disk (e.g., CD-ROM, CD-R, CD-RW, DVD-ROM, DVD-RAM, DVD-RW, DVD+RW), a magnetic tape, a nonvolatile memory card, and a ROM. Alternatively, the program code may be downloaded from a server computer via a communications network.
Further, it should be clear that the functions of any one of the above-described embodiments may be implemented not only by executing the program code read out by the computer, but also by causing an operating system or the like operating on the computer to perform a part or all of the actual operations based on instructions of the program code.
Further, it is to be understood that the program code read out from the storage medium is written to a memory provided in an expansion board inserted into the computer or to a memory provided in an expansion unit connected to the computer, and then causes a CPU or the like mounted on the expansion board or the expansion unit to perform part or all of the actual operations based on instructions of the program code, thereby realizing the functions of any of the above-described embodiments.
It should be noted that not all steps and modules in the above flows and system structure diagrams are necessary, and some steps or modules may be omitted according to actual needs. The execution order of the steps is not fixed and can be adjusted as required. The system structure described in the above embodiments may be a physical structure or a logical structure, that is, some modules may be implemented by the same physical entity, or some modules may be implemented by a plurality of physical entities, or some components in a plurality of independent devices may be implemented together.
In the above embodiments, the hardware unit may be implemented mechanically or electrically. For example, a hardware element may comprise permanently dedicated circuitry or logic (such as a dedicated processor, FPGA or ASIC) to perform the corresponding operations. The hardware elements may also comprise programmable logic or circuitry, such as a general purpose processor or other programmable processor, that may be temporarily configured by software to perform the corresponding operations. The specific implementation (mechanical, or dedicated permanent circuit, or temporarily configured circuit) may be determined based on cost and time considerations.
While the invention has been shown and described in detail in the drawings and in the preferred embodiments, it is not intended to limit the invention to the embodiments disclosed, and it will be apparent to those skilled in the art that the means in the various embodiments described above may be combined to obtain further embodiments of the invention, which are also within the scope of the invention.

Claims (10)

1. An elastic scaling method for cloud environment MongoDB database cluster instance nodes, characterized in that the method comprises the following processes:
a process of deploying the MongoDB database in a public cloud environment;
a database monitoring data training and prediction process, which specifically comprises a model training process and a node number prediction process;
a database node elastic scaling process, which specifically comprises a node capacity expansion process and a node capacity reduction process.
2. The method for elastic expansion of cloud environment MongoDB database cluster instance nodes according to claim 1, characterized in that:
the process of deploying the MongoDB database in the public cloud environment specifically comprises the following steps:
step one: prompting the user whether to enable elastic scaling of MongoDB database cluster instance nodes; if it is enabled, going to step two; if it is not enabled, ending the process, in which case the user decides on their own when to create or delete database instance nodes;
step two: acquiring the monitoring performance indexes of the MongoDB database, including CPU, memory, network, number of read and write requests, read/write ratio, and average read response time in different time periods;
step three: based on the monitoring data of the MongoDB database, setting a threshold for the average read response time within 5 minutes and, when the average read response time meets the set threshold condition, building a multiple linear regression model to learn from the monitoring indexes and predict the suitable number of read-only nodes;
step four: checking whether the current node number is within the suitable range of the predicted node number; if so, the current node number is appropriate and the process ends; otherwise, going to step five.
Step five: if the current node number is less than the predicted node number, performing node capacity expansion; otherwise, performing node capacity reduction.
3. The method for elastic scaling of cloud environment MongoDB database cluster instance nodes according to claim 2, characterized in that:
the training process of the model comprises the following steps:
step six: constructing a training set and a test set from the historical database monitoring data, wherein the labeled data in both sets mainly comprises the node number of the current database instance and the CPU, memory, network, and read load of the database instance;
step seven: normalizing the training data and the test data to remove the inconsistent units of the indexes during computation, where x denotes the current index value and y denotes the normalized value of the corresponding index, using the following formula:
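[Formula image RE-FDA0003639937270000021: normalization formula mapping the index value x to the normalized value y; not reproduced in the text]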
step eight: constructing a multiple linear regression model with the formula $f(x) = w_1 x_1 + w_2 x_2 + \dots + w_d x_d + b$, where $w_1, w_2, \dots, w_d$ are the weights, $b$ is the bias, $x_1, x_2, \dots, x_d$ are the influencing factors (such as the node number of the current database instance and the CPU, memory, network, and read load of the database instance), and $f(x)$ is the predicted value; the loss function is the mean squared error, and the minimum is found with the stochastic gradient descent algorithm;
step nine: training the constructed multiple linear regression model on the training data for multiple rounds, with the number of training rounds set to 1000;
step ten: after the model is trained, verifying its parameters against the test set; if the deviation is too large, the number of training rounds can be increased and the model further optimized until the error falls within a reasonable range.
4. The method for elastic scaling of cloud environment MongoDB database cluster instance nodes according to claim 3, characterized in that:
the process of predicting the number of nodes comprises the following steps:
step eleven: identifying the factors that influence the average response time of read operations, such as the number of nodes of the current database instance, CPU, memory, network, and read load;
step twelve: acquiring performance data of the MongoDB database, including monitoring data such as the CPU, memory, network, number of read and write requests, read/write ratio, and average read response time of the database in different time periods, and normalizing the data;
step thirteen: if the average read response time is greater than the maximum response time threshold or less than the minimum response time threshold, executing the node number prediction operation, wherein the thresholds are set with the characteristics of the MongoDB database in mind: elastic scaling of database nodes must account for data synchronization, so capacity expansion has to start in advance to leave time for MongoDB data recovery; node number prediction is therefore performed when the read response time keeps rising and reaches the maximum response time threshold 3 times in a row, and the maximum average response time threshold is accordingly set lower than the normal response time threshold;
step fourteen: performing the node prediction operation with the trained model and the collected values, in which the average read response time is set to a specified value and the model predicts the corresponding number of nodes;
step fifteen: comparing the predicted suitable node number with the node number of the current instance, and informing the program whether to scale the nodes elastically and by how many nodes.
5. The method for elastic scaling of cloud environment MongoDB database cluster instance nodes according to claim 4, characterized in that:
the node capacity expansion process comprises the following steps:
step sixteen: creating the virtual machine of the capacity expansion instance node, wherein it is recommended that the expansion node match the specification of the main instance, that is, have the same CPU and memory specification, which facilitates subsequent monitoring of performance data;
step seventeen: restoring the data of the MongoDB main instance cluster node to the newly expanded node so that the expansion instance node data stays synchronized with the main instance, which avoids the long replica synchronization time that would occur if the main instance cluster added an empty new replica while the data volume is large;
step eighteen: synchronizing the configuration file of the cluster main instance node to the expansion instance so that the configuration of the expansion replica is consistent with that of the main instance;
step nineteen: adding the new instance as a replica set member on the cluster main instance node by executing the rs.add command in the MongoDB database, and configuring the priority of the new member to be lower than that of the main instance, which ensures that nodes of the main instance cluster are preferentially elected when the replica set cluster fails over and that the read-only instance node serves as a secondary node;
step twenty: checking the cluster state on the cluster main instance node by executing the rs.status command in the MongoDB database to view the state of the current instance replica set cluster; if the state is normal, going to step twenty-one; otherwise, going to step twenty-three;
step twenty-one: adding the new expansion instance to the read-only load balancing configuration, that is, adding the information of the new expansion instance node to the configuration of the load balancer that distributes the read-only load;
step twenty-two: reloading the load balancing configuration so that the new expansion instance node takes effect and read load is distributed to it;
step twenty-three: if the replica set state is abnormal, rolling back and deleting the expansion node;
step twenty-four: recording the capacity expansion state, and returning the number of MongoDB instance nodes after the expansion instance node is created.
6. The method for elastic scaling of cloud environment MongoDB database cluster instance nodes according to claim 5, characterized in that:
the node capacity reduction process comprises the following steps:
step twenty-five: deleting the capacity reduction instance from the read-only load balancing configuration to prevent read-only load from being distributed to the instance node to be deleted;
step twenty-six: reloading the load balancing configuration so that the removal of the instance node takes effect;
step twenty-seven: removing the replica set member to be deleted from the cluster main instance by executing the rs.remove command in the MongoDB database;
step twenty-eight: deleting the virtual machine of the instance node;
step twenty-nine: recording the capacity reduction state, and returning the number of MongoDB instance nodes after the read-only instance node is deleted.
7. A cloud environment MongoDB database cluster instance node elastic scaling system, characterized in that:
the cloud environment MongoDB database cluster instance node elastic scaling system comprises:
a MongoDB database monitoring module;
a monitoring data training prediction module;
a node elastic scaling module, which comprises a node automatic capacity expansion module and a node capacity reduction module.
8. The cloud environment MongoDB database cluster instance node elastic scaling system of claim 7, wherein:
the MongoDB database monitoring module prompts the user whether to enable the elastic scaling module for MongoDB database cluster instance nodes; if it is enabled, the MongoDB database monitoring module is started; if it is not enabled, the process ends and the user decides on their own when to create or delete database instance nodes;
the MongoDB database monitoring module acquires the monitoring performance indexes of the MongoDB database, including CPU, memory, network, number of read and write requests, read/write ratio, and average read response time in different time periods;
the monitoring data training prediction module sets a threshold for the average read response time within 5 minutes and, when the average read response time meets the set threshold condition, builds a multiple linear regression model to learn from the monitoring indexes and predict the suitable number of read-only nodes;
the module then checks whether the current node number is within the suitable range of the predicted node number; if so, the current node number is appropriate and the process ends; otherwise, the node elastic scaling module is invoked.
The node elastic scaling module expands node capacity if the current node number is less than the predicted node number; otherwise, it reduces node capacity.
9. The cloud environment MongoDB database cluster instance node elastic scaling system of claim 7, wherein:
the node automatic capacity expansion module first creates the virtual machine of the capacity expansion instance node, wherein it is recommended that the expansion node match the specification of the main instance, that is, have the same CPU and memory specification, which facilitates subsequent monitoring of performance data;
the data of the MongoDB main instance cluster node is restored to the newly expanded node so that the expansion instance node data stays synchronized with the main instance;
the configuration file of the cluster main instance node is synchronized to the expansion instance so that the configuration of the expansion replica is consistent with that of the main instance;
the new instance is added as a replica set member on the cluster main instance node by executing the rs.add command in the MongoDB database, and the priority of the new member is configured to be lower than that of the main instance, which ensures that nodes of the main instance cluster are preferentially elected when the replica set cluster fails over and that the read-only instance node serves as a secondary node;
the cluster main instance node checks the cluster state by executing the rs.status command in the MongoDB database to view the state of the current instance replica set cluster; if the state is normal, the new expansion instance is added to the read-only load balancing configuration, that is, the information of the new expansion instance node is added to the configuration of the load balancer that distributes the read-only load; otherwise, the replica set state is abnormal, and the expansion node is rolled back and deleted;
the load balancing configuration is reloaded so that the new expansion instance node takes effect and read load is distributed to it;
the node automatic capacity expansion module records the capacity expansion state and returns the number of MongoDB instance nodes after the expansion instance node is created.
10. The cloud environment MongoDB database cluster instance node elastic scaling system of claim 7, wherein:
the node capacity reduction module first deletes the capacity reduction instance from the read-only load balancing configuration to prevent read-only load from being distributed to the instance node to be deleted;
the node capacity reduction module reloads the load balancing configuration so that the removal of the instance node takes effect;
the cluster main instance removes the replica set member to be deleted by executing the rs.remove command in the MongoDB database;
the virtual machine of the instance node is deleted;
the node capacity reduction module records the capacity reduction state and returns the number of MongoDB instance nodes after the read-only instance node is deleted.
CN202210222037.3A 2022-03-07 2022-03-07 Elastic expansion method and system for cloud environment MongoDB database cluster instance node Pending CN114816728A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210222037.3A CN114816728A (en) 2022-03-07 2022-03-07 Elastic expansion method and system for cloud environment MongoDB database cluster instance node

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210222037.3A CN114816728A (en) 2022-03-07 2022-03-07 Elastic expansion method and system for cloud environment MongoDB database cluster instance node

Publications (1)

Publication Number Publication Date
CN114816728A true CN114816728A (en) 2022-07-29

Family

ID=82528884

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210222037.3A Pending CN114816728A (en) 2022-03-07 2022-03-07 Elastic expansion method and system for cloud environment MongoDB database cluster instance node

Country Status (1)

Country Link
CN (1) CN114816728A (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115237610A (en) * 2022-09-26 2022-10-25 城云科技(中国)有限公司 Elastic expansion method and device based on Kubernetes container cloud platform and application
CN115237610B (en) * 2022-09-26 2023-03-21 城云科技(中国)有限公司 Elastic expansion method and device based on Kubernetes container cloud platform and application
CN115629879A (en) * 2022-10-25 2023-01-20 北京百度网讯科技有限公司 Load balancing method and device for distributed model training
CN115629879B (en) * 2022-10-25 2023-10-10 北京百度网讯科技有限公司 Load balancing method and device for distributed model training
CN117951119A (en) * 2024-02-21 2024-04-30 重庆邮电大学 Database performance optimization method based on cloud computing

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination