CN117472907A - Multi-cluster index management method, system, device and storage medium - Google Patents

Multi-cluster index management method, system, device and storage medium

Publication number
CN117472907A
Authority
CN
China
Prior art keywords
index
cluster
target
management
clusters
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311483200.2A
Other languages
Chinese (zh)
Inventor
王士强
刘伟
李�根
王茜
姜亮
周进
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Construction Bank Corp
Original Assignee
China Construction Bank Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Construction Bank Corp
Priority to CN202311483200.2A
Publication of CN117472907A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/2228 Indexing structures
    • G06F16/2272 Management thereof
    • G06F16/2264 Multidimensional index structures
    • G06F16/2282 Tablespace storage structures; Management thereof

Abstract

Embodiments of the invention provide a method, a system, a device, a storage medium and an electronic device for managing multi-cluster indexes. The method comprises: acquiring, from a database, an index list of a plurality of clusters and configuration information of each index included in the index list; determining, based on the configuration information, a target cluster to be managed among the plurality of clusters; and managing a target index of the target cluster based on a management requirement included in the configuration information. The invention addresses the low efficiency of managing cluster indexes in the related art and improves that efficiency.

Description

Multi-cluster index management method, system, device and storage medium
Technical Field
Embodiments of the invention relate to the field of communication, and in particular to a method, a system, a device, a storage medium and an electronic device for managing multi-cluster indexes.
Background
In the related art, existing index lifecycle management techniques for multiple Elasticsearch (ES) clusters have the following drawbacks when faced with large-scale data sets:
1. Lack of unified management: the prior art generally provides no unified method for managing index lifecycles across multiple ES clusters, so operators must configure and manage each cluster manually, which is cumbersome.
2. Performance limitations: when processing large-scale data sets, the prior art may hit performance bottlenecks, and processing becomes too slow to meet service requirements.
3. Complex configuration: configuration in the prior art is complex. Operators have to write complicated rules and scripts that are hard to maintain and extend, which reduces the usability and scalability of the system.
As can be seen from the above, the related art suffers from low efficiency in managing multi-cluster indexes.
No effective solution to these problems in the related art has been proposed so far.
Disclosure of Invention
The embodiment of the invention provides a method, a system, a device, a storage medium and an electronic device for managing multi-cluster indexes, which are used for at least solving the problem of low efficiency of managing the multi-cluster indexes in the related technology.
According to an embodiment of the present invention, there is provided a method for managing multi-cluster indexes, including: acquiring an index list of a plurality of clusters and configuration information of each index included in the index list from a database; determining target clusters to be managed, which are included in a plurality of clusters, based on the configuration information; and managing the target index of the target cluster based on the management requirement included in the configuration information.
According to another embodiment of the present invention, there is provided a management system for multi-cluster indexes, including: a plurality of clusters, each of which includes indexes and stores the document data corresponding to those indexes; a database for storing an index list of the indexes of the plurality of clusters and configuration information of the indexes; and a scheduler connected to the plurality of clusters and to the database, configured to acquire, from the database, the index list of the plurality of clusters and the configuration information of each index included in the index list, to determine, based on the configuration information, a target cluster to be managed among the plurality of clusters, and to manage a target index of the target cluster based on a management requirement included in the configuration information.
According to still another embodiment of the present invention, there is provided a management apparatus for multi-cluster index, including: the acquisition module is used for acquiring index lists of a plurality of clusters and configuration information of each index included in the index lists from a database; a determining module, configured to determine target clusters to be managed, where the target clusters are included in a plurality of clusters, based on the configuration information; and the management module is used for managing the target index of the target cluster based on the management requirement included in the configuration information.
According to a further embodiment of the invention, there is also provided a computer readable storage medium having stored therein a computer program, wherein the computer program is arranged to perform the steps of any of the method embodiments described above when run.
According to a further embodiment of the invention, there is also provided an electronic device comprising a memory having stored therein a computer program and a processor arranged to run the computer program to perform the steps of any of the method embodiments described above.
According to the invention, an index list of a plurality of clusters and configuration information of each index included in the index list are acquired from a database; a target cluster to be managed among the plurality of clusters is determined according to the configuration information; and a target index of the target cluster is managed according to the management requirement included in the configuration information. Because the index list of the plurality of clusters and the configuration information of each index can be stored in the database, the clusters are managed in a unified way; and because the configuration information includes the management requirement, the indexes of the target cluster can be managed according to that requirement, which improves management efficiency. The invention therefore solves the problem of low efficiency in managing cluster indexes in the related art and improves the efficiency of managing cluster indexes.
Drawings
FIG. 1 is a block diagram of the hardware structure of a mobile terminal for a method of managing multi-cluster indexes according to an embodiment of the present invention;
FIG. 2 is a flowchart of a method of managing multi-cluster indexes according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of a multi-cluster index management system according to an embodiment of the present invention;
FIG. 4 is a block diagram of a management apparatus for multi-cluster indexes according to an embodiment of the present invention.
Detailed Description
Embodiments of the present invention will be described in detail below with reference to the accompanying drawings in conjunction with the embodiments.
It should be noted that the terms "first," "second," and the like in the description and the claims of the present invention and the above figures are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order.
Elasticsearch (ES) systems are used for back-end data processing. In these applications, the ever-increasing volume of data often amounts to hundreds of millions of records. Such massive data consumes a great deal of disk space and system resources, seriously degrades cluster performance, and can even bring the system down. ES systems therefore often need to manage data storage and periodically delete old data to keep the system performant and stable. With the spread of big data applications, enterprises increasingly face the challenge of processing massive amounts of data, and using multiple ES clusters for index management has become a common solution. Lifecycle management of ES indexes can use the ES lifecycle management API and the Curator tool.
The ES lifecycle management API allows a user to define index lifecycle policies, such as time-based expiration and deletion. By configuring a lifecycle policy, the lifecycle of an index can be managed automatically, which reduces the workload of operators. However, the current lifecycle management API mainly targets index management within a single ES cluster and provides no unified method for managing indexes across multiple ES clusters.
Curator is a popular open-source tool for managing and maintaining ES clusters. It provides functions such as index snapshots, index cleanup and shard management. However, Curator focuses mainly on managing a single ES cluster and has limitations when applied to index lifecycle management across multiple ES clusters.
In the related art, there are the following drawbacks in multi-ES cluster index lifecycle management:
1. Lack of unified management: the prior art generally provides no unified method for managing index lifecycles across multiple ES clusters. Operators have to configure and manage each cluster manually, which is complex, error-prone and labor-intensive.
2. Limited performance: the prior art may hit performance bottlenecks on large-scale data sets. Without concurrent processing or optimization, processing is slow and real-time service requirements cannot be met.
3. Complex configuration: the configuration process in the prior art is complex, and an operator is required to write complex rules and scripts. This results in an increased risk of configuration errors and also increases the difficulty of system management and maintenance.
4. Lack of intelligence and automation: most of the prior art require an operator to manually perform the index lifecycle management task. The lack of an intelligent decision mechanism does not allow for intelligent automatic execution of indexing operations based on specific conditions or rules.
5. Lack of scalability: the prior art may lack good scalability in processing large-scale data sets. As the amount of data increases, the prior art may not be able to efficiently process and manage a large number of indexes.
In summary, in the related art, there are drawbacks in managing the lifecycle of the multiple ES cluster index, such as lack of unified management, limited performance, complex configuration, lack of intelligence and automation, and poor scalability.
The following embodiments are proposed in view of the above problems existing in the related art.
The method embodiments provided in the embodiments of the present application may be performed in a mobile terminal, a computer terminal or similar computing device. Taking the mobile terminal as an example, fig. 1 is a block diagram of a hardware structure of a mobile terminal of a multi-cluster index management method according to an embodiment of the present invention. As shown in fig. 1, a mobile terminal may include one or more (only one is shown in fig. 1) processors 102 (the processor 102 may include, but is not limited to, a microprocessor MCU or a processing device such as a programmable logic device FPGA) and a memory 104 for storing data, wherein the mobile terminal may also include a transmission device 106 for communication functions and an input-output device 108. It will be appreciated by those skilled in the art that the structure shown in fig. 1 is merely illustrative and not limiting of the structure of the mobile terminal described above. For example, the mobile terminal may also include more or fewer components than shown in fig. 1, or have a different configuration than shown in fig. 1.
The memory 104 may be used to store a computer program, for example, a software program of application software and a module, such as a computer program corresponding to a method for managing multiple cluster indexes in an embodiment of the present invention, and the processor 102 executes the computer program stored in the memory 104 to perform various functional applications and data processing, that is, implement the above-mentioned method. Memory 104 may include high-speed random access memory, and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory. In some examples, the memory 104 may further include memory remotely located relative to the processor 102, which may be connected to the mobile terminal via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The transmission means 106 is arranged to receive or transmit data via a network. Specific examples of the network described above may include a wireless network provided by a communication provider of the mobile terminal. In one example, the transmission device 106 includes a network adapter (Network Interface Controller, simply referred to as NIC) that can connect to other network devices through a base station to communicate with the internet. In one example, the transmission device 106 may be a Radio Frequency (RF) module, which is used to communicate with the internet wirelessly.
This embodiment provides a method for managing multi-cluster indexes. FIG. 2 is a flowchart of the method according to an embodiment of the present invention. As shown in FIG. 2, the flow includes the following steps:
Step S202: acquire, from a database, an index list of a plurality of clusters and configuration information of each index included in the index list;
Step S204: determine, based on the configuration information, a target cluster to be managed among the plurality of clusters;
Step S206: manage a target index of the target cluster based on the management requirement included in the configuration information.
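The three steps can be sketched in Python as follows. This is a minimal illustration rather than the implementation described by the patent: the `IndexConfig` fields, the `is_due` predicate and the in-memory dictionaries standing in for the database and the clusters are all assumptions made for the example, and only the "clear" requirement is handled.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List, Tuple


@dataclass(frozen=True)
class IndexConfig:
    """Configuration information of one index (field names are hypothetical)."""
    cluster: str       # cluster that holds the index
    index: str         # index name
    requirement: str   # management requirement, e.g. "clear"
    due: bool          # stand-in for "retention expired / management time reached"


def acquire(database: Iterable[IndexConfig]) -> List[IndexConfig]:
    """Step S202: read the index list and the per-index configuration."""
    return list(database)


def determine_targets(configs: List[IndexConfig],
                      is_due: Callable[[IndexConfig], bool]) -> Dict[str, List[IndexConfig]]:
    """Step S204: clusters that hold at least one index to be managed."""
    targets: Dict[str, List[IndexConfig]] = {}
    for cfg in configs:
        if is_due(cfg):
            targets.setdefault(cfg.cluster, []).append(cfg)
    return targets


def manage(targets, clusters) -> List[Tuple[str, str]]:
    """Step S206: apply each target index's management requirement."""
    handled = []
    for name, cfgs in targets.items():
        for cfg in cfgs:
            if cfg.requirement == "clear":
                # Remove the index together with its document data.
                clusters[name].pop(cfg.index, None)
                handled.append((name, cfg.index))
    return handled


# Toy data: each cluster maps an index name to its list of documents.
clusters = {"es-a": {"logs-old": [{"msg": "x"}]}, "es-b": {"logs-new": [{"msg": "y"}]}}
database = [
    IndexConfig("es-a", "logs-old", "clear", due=True),
    IndexConfig("es-b", "logs-new", "clear", due=False),
]
targets = determine_targets(acquire(database), lambda cfg: cfg.due)
handled = manage(targets, clusters)
```

Passing the due-check as a predicate keeps step S204 independent of whether the trigger is a retention period or a management time.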
In the above embodiment, the cluster may be an Elasticsearch cluster. Elasticsearch is an open-source distributed search and analytics engine built on top of the Apache Lucene library. It is widely used for real-time data indexing, search and analysis, and is suited to processing large and varied data. In Elasticsearch, an index is a collection of logically related documents. It is similar to a table in a traditional database but more flexible: fields can be created dynamically and full-text search can be performed.
In the above embodiment, the method for managing multi-cluster indexes may be executed by a scheduler. The method can be applied to a multi-cluster index management system, which may comprise a plurality of clusters, a scheduler and a database. The scheduler is connected to the database and can acquire from it the index list of the plurality of clusters and the configuration information of each index included in the index list. The configuration information may include the index creation date, the management requirement of the index, the timing for managing the index, and the like. The management requirement may include cross-cluster replication information for the index, modification information for the index, and the like. The timing for managing the index may include an index retention period, an index management time, and the like. When the timing for managing the index is reached, the index of the target cluster can be managed according to the management requirement included in the configuration information.
In the above embodiment, the structure of the multi-cluster index management system may be as shown in FIG. 3. As shown in FIG. 3, the system may further include an index management unit, which corresponds to "index definition/maintenance/deletion" in FIG. 3. The index management unit lets a cluster administrator carry out the day-to-day management of indexes on multiple ES clusters in one place through the multi-cluster index management system, including index creation, index template definition, index retention period definition, index modification, index deletion and cross-cluster replication of indexes.
For ES clusters distributed across data centers in different regions, the multi-cluster index management system makes management convenient. The system can also support Elasticsearch clusters of different versions and shields the administrator from the differences between versions, which makes management simpler. That is, the clusters included in the plurality of clusters may be of the same version or of different versions.
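One way such version shielding could look is an adapter that builds the index-creation request body according to the cluster's major version. The sketch below relies on one known difference: Elasticsearch 6.x and earlier nest field mappings under a mapping type name, whereas 7.x and later use typeless mappings. The function name and the choice of `_doc` as the type name are illustrative; the patent does not describe how version differences are handled.

```python
from typing import Any, Dict


def build_create_body(major_version: int, properties: Dict[str, Any],
                      shards: int = 1, replicas: int = 1) -> Dict[str, Any]:
    """Return an index-creation body suited to the cluster's major version."""
    settings = {"number_of_shards": shards, "number_of_replicas": replicas}
    if major_version >= 7:
        # 7.x and later: typeless mappings.
        mappings: Dict[str, Any] = {"properties": properties}
    else:
        # 6.x and earlier: mappings are nested under a mapping type name.
        mappings = {"_doc": {"properties": properties}}
    return {"settings": settings, "mappings": mappings}


fields = {"message": {"type": "text"}, "ts": {"type": "date"}}
body_v6 = build_create_body(6, fields)
body_v7 = build_create_body(7, fields)
```

The caller supplies the same field definitions for every cluster, and only the adapter needs to know which layout each version expects.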
The scheduler unit connects to the ES clusters and acquires ES indexes and document data through a network protocol. It acquires the cluster index list and the index retention periods from the database, periodically polls for indexes that meet the cleanup condition through a timed task set in the scheduler, and cleans up the indexes that meet the condition.
The scheduler also supports cross-cluster replication of indexes. For an index with a replication task set, it automatically initiates data replication from the source cluster to the target cluster when the condition is satisfied. During replication, the scheduler can also detect the load on the cluster and dynamically adjust the replication rate according to that load, so that other services remain stable.
The database stores the metadata, operation data and configuration data generated during index lifecycle management. The configuration data covers data centers, ES clusters, indexes and the like; the metadata covers scheduling job history, administrator login and operation records, differences between ES cluster versions and the like.
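A possible layout for part of this data is sketched below with SQLite. All table and column names are hypothetical, since the patent names only the kinds of data stored; administrator records and version-difference metadata are omitted for brevity.

```python
import sqlite3

SCHEMA = """
CREATE TABLE data_center (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);
CREATE TABLE es_cluster (
    id INTEGER PRIMARY KEY,
    data_center_id INTEGER NOT NULL REFERENCES data_center(id),
    name TEXT NOT NULL UNIQUE,
    version TEXT NOT NULL
);
CREATE TABLE es_index (
    id INTEGER PRIMARY KEY,
    cluster_id INTEGER NOT NULL REFERENCES es_cluster(id),
    name TEXT NOT NULL,
    created_on TEXT NOT NULL,        -- ISO date
    retention_days INTEGER NOT NULL,
    UNIQUE (cluster_id, name)
);
CREATE TABLE job_history (
    id INTEGER PRIMARY KEY,
    index_id INTEGER NOT NULL REFERENCES es_index(id),
    action TEXT NOT NULL,
    ran_at TEXT NOT NULL,
    status TEXT NOT NULL
);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
conn.execute("INSERT INTO data_center (id, name) VALUES (1, 'dc-north')")
conn.execute("INSERT INTO es_cluster VALUES (1, 1, 'es-a', '7.17')")
conn.execute("INSERT INTO es_index VALUES (1, 1, 'logs-2023.10', '2023-10-01', 30)")
conn.execute("INSERT INTO job_history VALUES (1, 1, 'clear', '2023-11-08T02:00:00', 'ok')")

# Index list with its cluster and data center, as a scheduler might read it.
rows = conn.execute(
    """SELECT dc.name, c.name, c.version, i.name, i.retention_days
       FROM es_index AS i
       JOIN es_cluster AS c ON c.id = i.cluster_id
       JOIN data_center AS dc ON dc.id = c.data_center_id"""
).fetchall()
```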
The method for managing the index lifecycle of the cross-ES cluster realizes encapsulation and abstraction of the ES cluster operation, can automatically copy indexes and data among a plurality of different ES clusters, and can also periodically clean old index data.
According to the invention, an index list of a plurality of clusters and configuration information of each index included in the index list are acquired from a database; a target cluster to be managed among the plurality of clusters is determined according to the configuration information; and a target index of the target cluster is managed according to the management requirement included in the configuration information. Because the index list of the plurality of clusters and the configuration information of each index can be stored in the database, the clusters are managed in a unified way; and because the configuration information includes the management requirement, the indexes of the target cluster can be managed according to that requirement, which improves management efficiency. The invention therefore solves the problem of low efficiency in managing cluster indexes in the related art and improves the efficiency of managing cluster indexes.
In an exemplary embodiment, determining, based on the configuration information, the target cluster to be managed among the plurality of clusters includes: determining the index retention period included in each piece of configuration information; determining a first index whose index retention period has expired; and determining the cluster corresponding to the first index as the target cluster. In this embodiment, the cluster index list and the index retention periods may be acquired from the database, indexes that meet the cleanup condition may be polled for periodically through a timed task set in the scheduler, and the indexes that meet the condition may be cleaned up. An index meets the cleanup condition when its index retention period has expired; the target cluster corresponding to that index can then be determined, and the index is cleaned up in the target cluster.
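The expiry check can be illustrated as below. The sketch assumes the retention period is counted in days from the index creation date and that an index is treated as expired on the day the period ends; neither detail is specified in the text, and the record fields are hypothetical.

```python
from datetime import date, timedelta
from typing import Dict, List, NamedTuple


class IndexRecord(NamedTuple):
    cluster: str
    index: str
    created: date
    retention_days: int


def expired_by_cluster(records: List[IndexRecord], today: date) -> Dict[str, List[str]]:
    """Map each target cluster to its indexes whose retention period has expired."""
    result: Dict[str, List[str]] = {}
    for rec in records:
        expires_on = rec.created + timedelta(days=rec.retention_days)
        if today >= expires_on:  # boundary rule is an assumption
            result.setdefault(rec.cluster, []).append(rec.index)
    return result


records = [
    IndexRecord("es-a", "logs-2023.09", date(2023, 9, 1), 30),
    IndexRecord("es-a", "logs-2023.11", date(2023, 11, 1), 30),
    IndexRecord("es-b", "audit-2023.10", date(2023, 10, 9), 30),
]
targets = expired_by_cluster(records, today=date(2023, 11, 8))
```

Here `es-a` and `es-b` both become target clusters, while the November index on `es-a` is left alone because its period has not yet ended.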
In an exemplary embodiment, determining, based on the configuration information, the target cluster to be managed among the plurality of clusters includes: determining the index management time included in each piece of configuration information; determining a second index whose index management time has been reached; and determining the cluster corresponding to the second index as the target cluster. In this embodiment, the configuration information may further include an index management time, that is, the time at which the index is to be managed. The second index that has reached its index management time is determined, and the cluster corresponding to the second index is determined as the target cluster. When the index management time of the target cluster arrives, the indexes in the target cluster can be managed. Management may include modification, replication and the like.
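One way to find indexes whose management time has been reached is to keep pending tasks in a priority queue ordered by time and pop everything that is due on each poll. The text does not prescribe a data structure, so the heap below is a design choice made for illustration, and the task layout is hypothetical.

```python
import heapq
from datetime import datetime
from typing import List, Set, Tuple

# (management time, cluster, index); the heap is ordered by management time.
Task = Tuple[datetime, str, str]


def pop_due(queue: List[Task], now: datetime) -> Tuple[List[Task], Set[str]]:
    """Pop every task whose management time has been reached.

    Returns the due tasks (the "second indexes") and their clusters
    (the target clusters). Tasks that are not yet due stay queued.
    """
    due: List[Task] = []
    while queue and queue[0][0] <= now:
        due.append(heapq.heappop(queue))
    return due, {cluster for _, cluster, _ in due}


queue: List[Task] = []
heapq.heappush(queue, (datetime(2023, 11, 8, 2, 0), "es-a", "orders"))
heapq.heappush(queue, (datetime(2023, 11, 9, 2, 0), "es-b", "orders"))
heapq.heappush(queue, (datetime(2023, 11, 8, 1, 0), "es-c", "users"))

due, target_clusters = pop_due(queue, now=datetime(2023, 11, 8, 3, 0))
```

Because the earliest task is always at the front of the heap, a poll stops as soon as it meets a task that is not yet due.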
In one exemplary embodiment, managing the target index of the target cluster based on the management requirement included in the configuration information includes at least one of: when the management requirement includes an index clearing operation, clearing the target index included in the target cluster and the document data corresponding to the target index; when the management requirement includes an index modification operation, modifying the target index and the document data corresponding to the target index based on modification information included in the configuration information; and when the management requirement includes an index copy operation, copying the target index and the document data corresponding to the target index based on copy information included in the configuration information. In this embodiment, when the management requirement is to clear the index, the target index included in the target cluster and its document data may be cleared. When the management requirement is an index modification operation, the modification information can be determined from the configuration information, and the update content in the modification information replaces the content to be replaced in the document data corresponding to the target index. When the management requirement is an index copy operation, the copy information included in the configuration information may be determined. The copy information may include the receiving cluster and the data to be copied, and the index may be copied according to the copy information.
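The three operations can be sketched as a dispatch table keyed by the operation named in the management requirement. Clusters are modelled as nested dictionaries, and the configuration keys (`requirement`, `modification`, `copy`, `receiving_cluster`, `filter`) are hypothetical names chosen for this example.

```python
from typing import Any, Callable, Dict, List

Doc = Dict[str, Any]
Clusters = Dict[str, Dict[str, List[Doc]]]  # cluster -> index -> documents


def clear_index(clusters: Clusters, cluster: str, index: str, cfg: Dict[str, Any]) -> None:
    """Remove the target index and its document data."""
    clusters[cluster].pop(index, None)


def modify_index(clusters: Clusters, cluster: str, index: str, cfg: Dict[str, Any]) -> None:
    """Replace fields of every document with the update content."""
    for doc in clusters[cluster][index]:
        doc.update(cfg["modification"])


def copy_index(clusters: Clusters, cluster: str, index: str, cfg: Dict[str, Any]) -> None:
    """Copy the index and all or part of its documents to the receiving cluster."""
    info = cfg["copy"]
    keep = info.get("filter", lambda doc: True)
    copied = [dict(doc) for doc in clusters[cluster][index] if keep(doc)]
    clusters[info["receiving_cluster"]][index] = copied


HANDLERS: Dict[str, Callable[..., None]] = {
    "clear": clear_index,
    "modify": modify_index,
    "copy": copy_index,
}


def manage(clusters: Clusters, cluster: str, index: str, cfg: Dict[str, Any]) -> None:
    """Run every operation listed in the management requirement, in order."""
    for operation in cfg["requirement"]:
        HANDLERS[operation](clusters, cluster, index, cfg)


clusters: Clusters = {
    "es-a": {"orders": [{"id": 1, "status": "open"}, {"id": 2, "status": "open"}]},
    "es-b": {},
}
manage(clusters, "es-a", "orders", {
    "requirement": ["modify", "copy", "clear"],
    "modification": {"status": "archived"},
    "copy": {"receiving_cluster": "es-b", "filter": lambda doc: doc["id"] > 1},
})
```

In this run the documents are first modified, a filtered subset is copied to `es-b`, and the source index is then cleared from `es-a`.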
In one exemplary embodiment, copying the target index and the document data corresponding to the target index based on the copy information included in the configuration information includes: determining, based on the copy information, the copy data from the document data and a receiving cluster that is to receive the copy data; and copying the copy data and the target index into the receiving cluster. In this embodiment, the copy information may indicate the copy data and the receiving cluster, and the copy data and the target index are copied into the receiving cluster to back up the target index. The copy data may be all of the document data or only part of it.
In one exemplary embodiment, copying the copy data and the target index into the receiving cluster includes: determining load information of the receiving cluster; determining, based on the load information, a replication rate at which the copy data is copied; and copying the copy data and the target index into the receiving cluster at that replication rate. In this embodiment, the load on the cluster can be detected during replication, and the replication rate is adjusted dynamically according to that load, so that other services remain stable.
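A minimal sketch of load-based rate adjustment follows. The text does not say how load maps to rate, so the linear back-off, the `max_rate` and `min_rate` bounds, the load expressed as an integer percentage, and the interpretation of "rate" as documents per batch are all assumptions. Pacing between batches is omitted.

```python
from typing import Any, Dict, Iterator, List


def replication_rate(load_pct: int, max_rate: int = 1000, min_rate: int = 50) -> int:
    """Documents per batch for a receiving-cluster load of 0 to 100 percent.

    Linear back-off: an idle cluster gets max_rate, a saturated one min_rate.
    """
    load_pct = min(max(load_pct, 0), 100)
    return max(min_rate, max_rate * (100 - load_pct) // 100)


def copy_in_batches(docs: List[Dict[str, Any]], receiver: List[Dict[str, Any]],
                    loads: Iterator[int]) -> List[int]:
    """Copy docs to receiver, re-reading the load before every batch."""
    batch_sizes = []
    pos = 0
    while pos < len(docs):
        rate = replication_rate(next(loads))
        batch = docs[pos:pos + rate]
        receiver.extend(dict(doc) for doc in batch)
        batch_sizes.append(len(batch))
        pos += len(batch)
    return batch_sizes


docs = [{"id": i} for i in range(1500)]
receiver: List[Dict[str, Any]] = []
# Hypothetical load samples: light, heavy, heavier, then idle.
sizes = copy_in_batches(docs, receiver, iter([20, 90, 95, 0]))
```

Re-reading the load before each batch lets the copy slow down as soon as the receiving cluster comes under pressure and speed up again once it recovers.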
In an exemplary embodiment, the method further comprises: acquiring index creation information from the database, wherein the index creation information includes a creation cluster in which an index is to be created; and creating the index in the creation cluster according to the index creation information. In this embodiment, the index management unit of the multi-cluster index management system may also create an index in a cluster. When an index is to be created, the index creation information may be stored in the database, and the scheduler may acquire it from the database. The index creation information may include the creation cluster in which the index is to be created. When the creation time included in the creation information is reached, the scheduler may create the index in the creation cluster according to the creation information. The creation information may include the configuration information of the index.
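The creation step might be sketched as follows, with clusters again modelled as dictionaries. The keys `creation_cluster`, `index`, `create_at` and `config` are hypothetical, and skipping an index that already exists is an added safeguard rather than something the text states.

```python
from datetime import datetime
from typing import Any, Dict, List


def create_due_indexes(creation_infos: List[Dict[str, Any]],
                       clusters: Dict[str, Dict[str, Dict[str, Any]]],
                       now: datetime) -> List[str]:
    """Create each index whose creation time has been reached.

    Returns "<cluster>/<index>" for every index created in this pass.
    Existing indexes are left untouched, so the pass can be repeated safely.
    """
    created = []
    for info in creation_infos:
        if info["create_at"] > now:
            continue  # creation time not reached yet
        target = clusters[info["creation_cluster"]]
        if info["index"] in target:
            continue  # already created on an earlier pass
        target[info["index"]] = {"config": info["config"], "docs": []}
        created.append(f'{info["creation_cluster"]}/{info["index"]}')
    return created


clusters = {"es-a": {}, "es-b": {}}
creation_infos = [
    {"creation_cluster": "es-a", "index": "logs-2023.11",
     "create_at": datetime(2023, 11, 1), "config": {"retention_days": 30}},
    {"creation_cluster": "es-b", "index": "logs-2023.12",
     "create_at": datetime(2023, 12, 1), "config": {"retention_days": 30}},
]
first = create_due_indexes(creation_infos, clusters, now=datetime(2023, 11, 8))
second = create_due_indexes(creation_infos, clusters, now=datetime(2023, 11, 8))
```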
In the foregoing embodiment, the management method for multi-cluster indexes is suitable for index lifecycle management of multi-ES clusters, and can uniformly manage and monitor indexes on the multi-ES clusters. By defining flexible strategies and task scheduling, automatic index management is realized. And concurrent processing and performance optimization mechanisms are introduced, so that the efficiency and throughput of index lifecycle management are improved.
In the foregoing embodiment, the present invention can achieve the following effects:
1. Unified, centralized control of index lifecycle management across multiple ES clusters is achieved.
2. An automatic index management strategy and task scheduling are introduced, and the workload of operators is reduced.
3. Support the processing of large-scale data sets and improve performance through concurrent processing.
4. The expandability is strong, and the configuration and the use are easy.
This embodiment also provides a management system for multi-cluster indexes, including: a plurality of clusters, each of which includes indexes and stores the document data corresponding to those indexes;
a database for storing an index list of the indexes of the plurality of clusters and configuration information of the indexes;
and a scheduler connected to the plurality of clusters and to the database, configured to acquire, from the database, the index list of the plurality of clusters and the configuration information of each index included in the index list, to determine, based on the configuration information, a target cluster to be managed among the plurality of clusters, and to manage a target index of the target cluster based on a management requirement included in the configuration information.
In the above embodiment, the cluster may be an elastomer search cluster, which is an open source based distributed search and analysis engine that is built on top of the Apache Lucene library. It is widely used for real-time data indexing, searching and analysis, and is suitable for processing large and varied data. In the elastic search, the Index (Index) refers to a collection of logically related documents. It is similar to tables in traditional databases, but in elastic search, indexing is more flexible and free, fields can be created dynamically and full text searches can be performed.
In the above embodiment, the execution subject of the management method of the multi-cluster index may be a scheduler. The management method of the multi-cluster index can be applied to a management system of the multi-cluster index, and the management system of the multi-cluster index can comprise a plurality of clusters, a scheduler and a database. The scheduler is connected with the database, and can acquire index lists of a plurality of clusters and configuration information of each index included in the index lists from the database. The configuration information may include a date of index creation, a management requirement of the index, a timing of managing the index, and the like, and the management requirement of the index may include index cross-level group copy information, modification information of the index, and the like. The opportunities to manage the index may include index reservation periods, index management opportunities, and the like. The index of the target cluster may be managed when the timing to manage the index is reached according to the management requirement included in the configuration information.
In the above embodiment, the structure of the multi-cluster index management system may be as shown in Fig. 3. As shown in Fig. 3, the system may further include an index management unit, which corresponds to the index definition/maintenance/deletion block in Fig. 3. The index management unit lets a cluster administrator carry out the day-to-day management of indexes across multiple ES clusters in one place through the multi-cluster index management system, including index creation, index template definition, index retention period definition, index modification, index deletion, and cross-cluster replication of indexes.
For a plurality of ES clusters distributed across data centers in different regions, the multi-cluster index management system makes management convenient. It can also support Elasticsearch clusters of different versions and shields the administrator from the differences between versions, which makes management simpler. That is, the versions of the clusters included in the plurality of clusters may be the same or different.
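One way to shield the administrator from version differences is to select a version-specific request builder behind a single entry point. The sketch below is only an illustration under stated assumptions: the function names are hypothetical, and the difference shown (mappings nested under a document type name in 6.x, no type name from 7.x onward) is used as an example of a version difference, not as the difference the embodiment handles.

```python
from typing import Callable, Dict


def build_body_v6(fields: Dict[str, str]) -> dict:
    # Assumed 6.x style: mappings are nested under a document type name.
    return {"mappings": {"_doc": {"properties": {n: {"type": t} for n, t in fields.items()}}}}


def build_body_v7(fields: Dict[str, str]) -> dict:
    # Assumed 7.x-and-later style: mappings carry no type name.
    return {"mappings": {"properties": {n: {"type": t} for n, t in fields.items()}}}


# Major version -> builder; newer majors reuse the newest builder registered at or below them.
BUILDERS: Dict[int, Callable[[Dict[str, str]], dict]] = {6: build_body_v6, 7: build_body_v7}


def create_index_body(cluster_version: str, fields: Dict[str, str]) -> dict:
    """Build a create-index request body from a version-neutral field map."""
    major = int(cluster_version.split(".")[0])
    candidates = [v for v in BUILDERS if v <= major]
    if not candidates:
        raise ValueError(f"unsupported cluster version: {cluster_version}")
    return BUILDERS[max(candidates)](fields)


print(create_index_body("6.8.23", {"message": "text"}))
print(create_index_body("8.11.0", {"message": "text"}))
```

The caller supplies the same field map for every cluster; only the lookup by major version decides which request shape is produced.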
The scheduler unit is used to connect to the ES clusters and to acquire ES indexes and document data through a network protocol. It acquires the cluster index list and the retention period of each index from the database, periodically polls for indexes that meet the cleaning condition through a timed task set in the scheduler, and cleans the indexes that meet the condition.
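A single polling pass of such a cleanup task might look like the sketch below. It assumes in-memory stand-ins for the clusters and for the rows read from the database; the names `clusters`, `index_list`, and `clean_expired` are hypothetical, and a real scheduler would call the pass from a timed task and issue delete requests to the clusters instead of editing dictionaries.

```python
from datetime import date, timedelta
from typing import Dict, List, Tuple

# Hypothetical stand-ins: each cluster maps index name -> list of documents.
clusters: Dict[str, Dict[str, list]] = {
    "es-a": {"logs-2023.09.01": [{"msg": "old"}], "logs-2023.11.01": [{"msg": "new"}]},
    "es-b": {"audit-2023.08.15": [{"msg": "old"}]},
}

# Rows as they might come back from the database: (cluster, index, created_on, retention_days).
index_list: List[Tuple[str, str, date, int]] = [
    ("es-a", "logs-2023.09.01", date(2023, 9, 1), 30),
    ("es-a", "logs-2023.11.01", date(2023, 11, 1), 30),
    ("es-b", "audit-2023.08.15", date(2023, 8, 15), 90),
]


def clean_expired(today: date) -> List[Tuple[str, str]]:
    """One polling pass: delete every index whose retention period has expired."""
    removed = []
    for cluster, index, created_on, retention_days in index_list:
        expired = today >= created_on + timedelta(days=retention_days)
        if expired and index in clusters.get(cluster, {}):
            del clusters[cluster][index]  # drops the index together with its documents
            removed.append((cluster, index))
    return removed


removed = clean_expired(date(2023, 11, 8))
print(removed)
```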
The scheduler also supports cross-cluster replication of indexes. For an index on which a replication task is set, it automatically initiates data replication from the source cluster to the target cluster when the condition is satisfied. During replication, it also supports detecting the load (pressure) on the cluster and dynamically adjusting the replication rate according to that load, so that the stability of other services is ensured.
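The load-dependent replication rate can be sketched as follows. The embodiment does not specify how load is measured or how it maps to a rate, so the linear mapping, the rate bounds, and the names `replication_rate` and `copy_in_batches` are all assumptions made for illustration.

```python
from typing import Callable, Iterator, List, Tuple


def replication_rate(load: float, max_docs_per_sec: int = 5000, min_docs_per_sec: int = 100) -> int:
    """Map a load reading in [0.0, 1.0] to a copy rate; higher load yields a lower rate."""
    load = min(max(load, 0.0), 1.0)            # clamp out-of-range readings
    rate = int(max_docs_per_sec * (1.0 - load))
    return max(rate, min_docs_per_sec)          # never stall completely


def copy_in_batches(
    docs: List[dict], read_load: Callable[[], float], batch_seconds: float = 1.0
) -> Iterator[Tuple[List[dict], int]]:
    """Yield (batch, rate) pairs, re-reading the cluster load before each batch."""
    position = 0
    while position < len(docs):
        rate = replication_rate(read_load())
        size = max(1, int(rate * batch_seconds))
        yield docs[position:position + size], rate
        position += size


documents = [{"id": i} for i in range(250)]
for batch, rate in copy_in_batches(documents, read_load=lambda: 1.0):
    print(len(batch), rate)
```

Re-reading the load before every batch is what makes the adjustment dynamic: a cluster that becomes busy halfway through a copy slows the remaining batches rather than the whole job being planned up front.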
The database is used to store the metadata, operation data, and configuration data generated during index lifecycle management, including configuration data such as data centers, ES clusters, and indexes, and metadata such as scheduling job history, administrator login and operation records, and ES cluster version differences.
The cross-ES-cluster index lifecycle management method encapsulates and abstracts ES cluster operations, can automatically copy indexes and data among a plurality of different ES clusters, and can also periodically clean up old index data.
According to the invention, an index list of a plurality of clusters and configuration information of each index included in the index list are acquired from a database; a target cluster to be managed among the plurality of clusters is determined according to the configuration information; and a target index of the target cluster is managed according to the management requirement included in the configuration information. Because the index list of the plurality of clusters and the configuration information of each index can be stored in the database, unified management of the plurality of clusters is realized; and because the configuration information includes the management requirement, the index of the target cluster can be managed according to that requirement, which improves management efficiency. Therefore, the problem in the related art of low efficiency in managing cluster indexes can be solved, and the effect of improving the efficiency of managing cluster indexes is achieved.
In an exemplary embodiment, the multi-cluster index management system further includes an index management unit, which is connected to the database and is used for converting the management information included in a received index management instruction into configuration information and storing the configuration information in the cluster information, held in the database, of the cluster corresponding to the index management instruction. In this embodiment, the user may input the index management instruction through the control interface of the multi-cluster index management system. The index management instruction includes management information. The management information comprises the target cluster to be managed, the index of the target cluster, and the management requirement. The management requirement may include modification, creation, deletion, and the like. The management information may further include a management timing, which includes an index retention period, an index management timing, an index creation timing, and the like. When the management requirement is an index clearing operation, the target index included in the target cluster and the document data corresponding to the target index can be cleared. When the management requirement is an index modification operation, the management information may also include modification information, such as the update content and the content to be replaced. The modification information can be determined from the configuration information, and the update content in the modification information is used to replace the content to be replaced in the document data corresponding to the target index. When the management requirement is an index copy operation, the management information may include copy information, which may include a receiving cluster and the data to be copied. The index may be copied according to the copy information.
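The conversion performed by the index management unit can be sketched as below. The instruction layout, the table name `index_config`, its columns, and the use of an in-memory SQLite database are assumptions made so the example runs on its own; the embodiment does not name a database product or a schema.

```python
import json
import sqlite3


def instruction_to_config(instruction: dict) -> dict:
    """Convert the management information of an instruction into a configuration record."""
    info = instruction["management_information"]
    requirement = info["requirement"]
    if requirement not in {"create", "modify", "clear", "copy"}:
        raise ValueError(f"unsupported management requirement: {requirement}")
    return {
        "cluster": info["target_cluster"],
        "index_name": info["index"],
        "requirement": requirement,
        "timing": json.dumps(info.get("timing", {})),    # retention period, management timing, ...
        "details": json.dumps(info.get("details", {})),  # copy or modification information
    }


def store_config(db: sqlite3.Connection, config: dict) -> None:
    """Store the record under the cluster and index it belongs to (hypothetical schema)."""
    db.execute(
        "CREATE TABLE IF NOT EXISTS index_config ("
        "cluster TEXT, index_name TEXT, requirement TEXT, timing TEXT, details TEXT, "
        "PRIMARY KEY (cluster, index_name))"
    )
    db.execute(
        "INSERT OR REPLACE INTO index_config VALUES "
        "(:cluster, :index_name, :requirement, :timing, :details)",
        config,
    )
    db.commit()


db = sqlite3.connect(":memory:")
store_config(db, instruction_to_config({
    "management_information": {
        "target_cluster": "es-a",
        "index": "orders",
        "requirement": "copy",
        "timing": {"manage_on": "2023-12-01"},
        "details": {"receiving_cluster": "es-b"},
    }
}))
print(db.execute("SELECT cluster, index_name, requirement FROM index_config").fetchall())
```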
From the description of the above embodiments, it will be clear to a person skilled in the art that the method according to the above embodiments may be implemented by means of software plus the necessary general hardware platform, but of course also by means of hardware, but in many cases the former is a preferred embodiment. Based on such understanding, the technical solution of the present invention may be embodied essentially or in a part contributing to the prior art in the form of a software product stored in a storage medium (e.g. ROM/RAM, magnetic disk, optical disk) comprising instructions for causing a terminal device (which may be a mobile phone, a computer, a server, or a network device, etc.) to perform the method according to the embodiments of the present invention.
This embodiment also provides a multi-cluster index management apparatus, which is used to implement the foregoing embodiments and preferred implementations; what has already been described is not repeated. As used below, the term "module" may be a combination of software and/or hardware that implements a predetermined function. While the apparatus described in the following embodiments is preferably implemented in software, implementation in hardware, or in a combination of software and hardware, is also possible and contemplated.
Fig. 4 is a block diagram of a multi-cluster index management apparatus according to an embodiment of the present invention. As shown in Fig. 4, the apparatus includes:
an obtaining module 42, configured to obtain an index list of a plurality of clusters and configuration information of each index included in the index list from a database;
a determining module 44, configured to determine a target cluster to be managed included in a plurality of the clusters based on the configuration information;
and a management module 46, configured to manage a target index of the target cluster based on a management requirement included in the configuration information.
In an exemplary embodiment, the determining module 44 may determine the target cluster to be managed among the plurality of clusters based on the configuration information by: determining the index retention period included in each piece of configuration information; determining a first index whose index retention period has expired; and determining the cluster corresponding to the first index as the target cluster.
In an exemplary embodiment, the determining module 44 may determine the target cluster to be managed among the plurality of clusters based on the configuration information by: determining the index management timing included in each piece of configuration information; determining a second index whose index management timing has been reached; and determining the cluster corresponding to the second index as the target cluster.
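The timing-based determination can be sketched as a filter followed by a grouping step. The record keys (`cluster`, `index`, `manage_at`) and the function name are hypothetical; the sketch only shows that the target clusters are those owning at least one index whose management timing has been reached.

```python
from datetime import datetime
from typing import Dict, List


def target_clusters_by_timing(configs: List[dict], now: datetime) -> Dict[str, List[str]]:
    """Group the indexes whose management timing has been reached by their owning cluster."""
    targets: Dict[str, List[str]] = {}
    for cfg in configs:
        if cfg["manage_at"] <= now:  # the index management timing has been reached
            targets.setdefault(cfg["cluster"], []).append(cfg["index"])
    return targets


configs = [
    {"cluster": "es-a", "index": "orders", "manage_at": datetime(2023, 11, 8, 2, 0)},
    {"cluster": "es-a", "index": "users", "manage_at": datetime(2023, 12, 1, 2, 0)},
    {"cluster": "es-b", "index": "audit", "manage_at": datetime(2023, 11, 1, 2, 0)},
]
print(target_clusters_by_timing(configs, datetime(2023, 11, 8, 3, 0)))
```

The retention-period variant described in the preceding paragraph would differ only in the condition used by the filter.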
In one exemplary embodiment, the management module 46 may implement management of the target index of the target cluster based on the management requirements included in the configuration information by at least one of: under the condition that the management requirement comprises an index clearing operation, clearing the target index included in the target cluster and document data corresponding to the target index; modifying the target index and the document data corresponding to the target index based on modification information included in the configuration information when the modification index operation is included in the management requirement; and when the management requirement includes a copy index operation, copying the target index and document data corresponding to the target index based on copy information included in the configuration information.
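The three management operations above lend themselves to a dispatch table keyed by the management requirement. The sketch below runs against in-memory dictionaries standing in for clusters; the handler names, the configuration keys, and the way modification is expressed (replacing one field value with another) are assumptions for illustration only.

```python
from typing import Callable, Dict, List

Clusters = Dict[str, Dict[str, List[dict]]]  # cluster -> index name -> documents (stand-in)


def clear_index(clusters: Clusters, cfg: dict) -> None:
    # Clearing removes the target index together with its document data.
    clusters[cfg["cluster"]].pop(cfg["index"], None)


def modify_index(clusters: Clusters, cfg: dict) -> None:
    # Replace the content to be replaced with the update content in every document.
    mod = cfg["modification"]
    for doc in clusters[cfg["cluster"]][cfg["index"]]:
        for key, value in doc.items():
            if value == mod["to_replace"]:
                doc[key] = mod["update"]


def copy_index(clusters: Clusters, cfg: dict) -> None:
    # Copy the target index and its documents into the receiving cluster.
    docs = clusters[cfg["cluster"]][cfg["index"]]
    receiving = clusters.setdefault(cfg["copy"]["receiving_cluster"], {})
    receiving[cfg["index"]] = [dict(doc) for doc in docs]


HANDLERS: Dict[str, Callable[[Clusters, dict], None]] = {
    "clear": clear_index,
    "modify": modify_index,
    "copy": copy_index,
}


def manage(clusters: Clusters, cfg: dict) -> None:
    """Run the operation named by the management requirement in the configuration."""
    handler = HANDLERS.get(cfg["requirement"])
    if handler is None:
        raise ValueError(f"unknown management requirement: {cfg['requirement']}")
    handler(clusters, cfg)


clusters: Clusters = {"es-a": {"orders": [{"status": "new"}]}, "es-b": {}}
manage(clusters, {"cluster": "es-a", "index": "orders", "requirement": "copy",
                  "copy": {"receiving_cluster": "es-b"}})
print(clusters["es-b"])
```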
In one exemplary embodiment, the management module 46 may copy the target index and the document data corresponding to the target index based on the copy information included in the configuration information by: determining, based on the copy information, the copy data from the document data and the receiving cluster that is to receive the copy data; and copying the copy data and the target index into the receiving cluster.
In one exemplary embodiment, management module 46 may implement copying the copy data and the target index into the receiving cluster by: determining load information of the receiving cluster; determining a replication rate at which the replicated data is replicated based on the load information; and copying the copy data and the target index into the receiving cluster according to the copy rate.
In an exemplary embodiment, the apparatus may further be configured to obtain index creation information from the database, where the index creation information includes a creation cluster in which an index is to be created, and to create an index in the creation cluster according to the index creation information.
It should be noted that each of the above modules may be implemented by software or hardware, and for the latter, it may be implemented by, but not limited to: the modules are all located in the same processor; alternatively, the above modules may be located in different processors in any combination.
Embodiments of the present invention also provide a computer readable storage medium having a computer program stored therein, wherein the computer program is arranged to perform the steps of any of the method embodiments described above when run.
In one exemplary embodiment, the computer readable storage medium may include, but is not limited to: a USB flash drive, a read-only memory (ROM), a random access memory (RAM), a removable hard disk, a magnetic disk, an optical disk, or any other medium capable of storing a computer program.
An embodiment of the invention also provides an electronic device comprising a memory having stored therein a computer program and a processor arranged to run the computer program to perform the steps of any of the method embodiments described above.
In an exemplary embodiment, the electronic device may further include a transmission device and an input/output device, each connected to the processor.
For specific examples in this embodiment, reference may be made to the examples described in the foregoing embodiments and exemplary implementations, which are not repeated here.
It will be appreciated by those skilled in the art that the modules or steps of the invention described above may be implemented on a general-purpose computing device. They may be concentrated on a single computing device or distributed across a network of computing devices, and they may be implemented as program code executable by a computing device, so that they can be stored in a storage device and executed by the computing device. In some cases, the steps shown or described may be performed in an order different from that shown or described herein; alternatively, the modules or steps may be fabricated as individual integrated circuit modules, or several of them may be fabricated as a single integrated circuit module. Thus, the present invention is not limited to any specific combination of hardware and software.
The above description is only of the preferred embodiments of the present invention and is not intended to limit the present invention, but various modifications and variations can be made to the present invention by those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the principle of the present invention should be included in the protection scope of the present invention.

Claims (12)

1. A method for managing multiple cluster indexes, comprising:
acquiring an index list of a plurality of clusters and configuration information of each index included in the index list from a database;
determining, based on the configuration information, a target cluster to be managed among the plurality of clusters;
and managing the target index of the target cluster based on the management requirement included in the configuration information.
2. The method of managing a multi-cluster index according to claim 1, wherein determining a target cluster to be managed included in a plurality of the clusters based on the configuration information includes:
determining an index retention period included in each of the configuration information;
determining a first index whose index retention period has expired;
and determining the cluster corresponding to the first index as the target cluster.
3. The method of managing a multi-cluster index according to claim 1, wherein determining a target cluster to be managed included in a plurality of the clusters based on the configuration information includes:
determining index management opportunities included in each configuration information;
determining a second index whose index management timing has been reached;
and determining the cluster corresponding to the second index as the target cluster.
4. The method of managing a multi-cluster index according to claim 1, wherein managing the target index of the target cluster based on the management requirement included in the configuration information includes at least one of:
under the condition that the management requirement comprises an index clearing operation, clearing the target index included in the target cluster and document data corresponding to the target index;
modifying the target index and the document data corresponding to the target index based on modification information included in the configuration information when the modification index operation is included in the management requirement;
and when the management requirement includes a copy index operation, copying the target index and document data corresponding to the target index based on copy information included in the configuration information.
5. The method of managing a multi-cluster index according to claim 4, wherein copying the target index and document data corresponding to the target index based on copy information included in the configuration information includes:
determining, based on the copy information, copy data from the document data and a receiving cluster that receives the copy data;
and copying the copy data and the target index into the receiving cluster.
6. The method of claim 5, wherein copying the replication data and the target index into the receiving cluster comprises:
determining load information of the receiving cluster;
determining a replication rate at which the replicated data is replicated based on the load information;
and copying the copy data and the target index into the receiving cluster according to the copy rate.
7. The method of managing a multi-cluster index of claim 1, further comprising:
obtaining index creation information from the database, wherein the index creation information comprises a creation cluster for creating an index;
and creating an index according to the index creation information in the creation cluster.
8. A system for managing multi-cluster indexes, comprising:
a plurality of clusters, wherein each cluster stores an index and document data corresponding to the index;
a database, configured to store an index list of the indexes of the plurality of clusters and configuration information of the indexes;
a scheduler, connected to the plurality of clusters and the database, and configured to acquire, from the database, the index list of the plurality of clusters and the configuration information of each index included in the index list, to determine, based on the configuration information, a target cluster to be managed among the plurality of clusters, and to manage a target index of the target cluster based on a management requirement included in the configuration information.
9. The multi-cluster index management system of claim 8, further comprising: the index management unit is connected with the database and used for converting management information included in the received index management instruction into configuration information and storing the configuration information into cluster information of clusters corresponding to the index management instruction, wherein the cluster information is included in the database.
10. A multi-cluster index management apparatus, comprising:
the acquisition module is used for acquiring index lists of a plurality of clusters and configuration information of each index included in the index lists from a database;
a determining module, configured to determine, based on the configuration information, a target cluster to be managed among the plurality of clusters;
and the management module is used for managing the target index of the target cluster based on the management requirement included in the configuration information.
11. A computer readable storage medium, characterized in that the computer readable storage medium has stored therein a computer program, wherein the computer program is arranged to execute the method of any of the claims 1 to 7 when run.
12. An electronic device comprising a memory and a processor, characterized in that the memory has stored therein a computer program, the processor being arranged to run the computer program to perform the method of any of the claims 1 to 7.
CN202311483200.2A 2023-11-08 2023-11-08 Multi-cluster index management method, system, device and storage medium Pending CN117472907A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311483200.2A CN117472907A (en) 2023-11-08 2023-11-08 Multi-cluster index management method, system, device and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311483200.2A CN117472907A (en) 2023-11-08 2023-11-08 Multi-cluster index management method, system, device and storage medium

Publications (1)

Publication Number Publication Date
CN117472907A true CN117472907A (en) 2024-01-30

Family

ID=89625305

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311483200.2A Pending CN117472907A (en) 2023-11-08 2023-11-08 Multi-cluster index management method, system, device and storage medium

Country Status (1)

Country Link
CN (1) CN117472907A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination