WO2024113898A1 - Metadata reporting method and apparatus, and device and storage medium - Google Patents

Metadata reporting method and apparatus, and device and storage medium

Info

Publication number
WO2024113898A1
Authority
WO
WIPO (PCT)
Prior art keywords
mds
services
service
target
metadata
Prior art date
Application number
PCT/CN2023/108423
Other languages
French (fr)
Chinese (zh)
Inventor
孙业宽
张在贵
Original Assignee
苏州元脑智能科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 苏州元脑智能科技有限公司
Publication of WO2024113898A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/10 Protocols in which an application is distributed across nodes in the network
    • H04L67/1001 Protocols in which an application is distributed across nodes in the network for accessing one among a plurality of replicated servers
    • H04L67/1004 Server selection for load balancing

Definitions

  • the present application relates to the field of data processing technology, and in particular to a metadata reporting method, device, equipment and storage medium.
  • the Elasticsearch-based metadata retrieval function of a distributed file system can retrieve tens of billions of files in under a minute, lets users configure both simple and complex retrieval conditions, and efficiently assists users in data management.
  • Elasticsearch (ES) is an open-source, distributed, RESTful search and data analysis engine.
  • files are handled through the MDS service (metadata service). The files in the distributed file system are divided among different MDSs. Each MDS reports the latest metadata to ES whenever the metadata of a file it is responsible for is updated, and different MDSs may report to different ES services. Users configure retrieval conditions to retrieve a list of matching files from the ES cluster.
  • the distributed file system is managed by multiple MDS services, and the ES cluster likewise consists of multiple ES services. If all MDS reporting load falls on the same ES service during metadata reporting, the performance of that ES service degrades.
  • the purpose of this application is to provide a metadata reporting method, device, equipment and medium that balance the metadata reporting load and maximize the reporting performance of the program.
  • the specific scheme is as follows:
  • the present application discloses a metadata reporting method, comprising:
  • An ES service is allocated to each MDS service in a load balancing manner, and configuration information is generated based on the allocation result, so that the MDS service reports metadata to an allocated candidate ES service according to the configuration information.
  • a corresponding load balancing distribution method is determined, including:
  • if the total number of ES services is less than the total number of MDS services, the load balancing distribution method is the first target load balancing distribution method.
  • the initial allocation is completed by sorting all MDS services and sorting all ES services, and allocating ES services to MDS services with the same sorting position according to the sorting position of the ES services;
  • one ES service is selected from the beginning and allocated to the MDS service that has not been allocated an ES service after the initial allocation, so that each MDS service corresponds to at least one ES service.
  • a corresponding load balancing distribution method is determined, including:
  • if the total number of ES services is equal to the total number of MDS services, the load balancing distribution method is the second target load balancing distribution method.
  • the ES services are allocated to the MDS services with the same sorting position according to the sorting position of the ES services, so that there is a one-to-one correspondence between the MDS services and the ES services.
  • a corresponding load balancing distribution method is determined, including:
  • if the total number of ES services is greater than the total number of MDS services, the load balancing distribution method is the third target load balancing distribution method.
  • the initial allocation is completed by sorting all MDS services and sorting all ES services, and allocating ES services to MDS services with the same sorting position according to the sorting position of the ES services;
  • the ES services that are not allocated after the initial allocation are reallocated according to the order of the MDS services until all ES services are allocated so that each ES service corresponds to at least one MDS service.
  • the communication address is configured for each ES service in the ES cluster through the distributed file system, so that the MDS service can report data to the corresponding ES service according to the communication address of the ES service.
  • the metadata reporting method also includes:
  • the target MDS is switched to the normally operating ES service so that the target MDS reports metadata to the normally operating ES service.
  • the method further includes:
  • the target MDS reports data to a corresponding candidate ES service according to the configuration information, and determines whether the candidate ES service has a fault according to the data reporting result fed back by the candidate ES service.
  • switch the target MDS to a running ES service including:
  • the reporting object switching mode for the target MDS is determined according to the judgment result, and the target MDS is switched to the normally operating ES service according to the reporting object switching mode.
  • determining a reporting object switching method for the target MDS according to the judgment result, and switching the target MDS to a normally operating ES service according to the reporting object switching method including:
  • the target MDS is switched to the first target ES service.
  • the metadata reporting method also includes:
  • if the normally operating ES service does not hit a candidate ES service corresponding to the target MDS, the status of all candidate ES services corresponding to the target MDS is monitored periodically, and when a normally operating second target ES service exists among them, the target MDS is switched back to the second target ES service.
  • the present application discloses a metadata reporting system, including an ES cluster and a distributed file system;
  • the ES service in the ES cluster and the MDS service in the distributed file system are services allocated by a load balancing allocation method; the load balancing allocation method is determined according to the size relationship between the total number of ES services in the ES cluster and the total number of MDS services in the distributed file system;
  • the MDS service is used to report metadata to a corresponding ES service according to configuration information; the configuration information is configuration information generated based on the allocation result.
  • a metadata reporting device comprising:
  • the load balancing distribution mode determination module is used to determine the corresponding load balancing distribution mode according to the size relationship between the total number of ES services in the ES cluster and the total number of MDS services in the distributed file system;
  • the allocation module is used to allocate ES services to each MDS service in a load balancing manner, and generate configuration information based on the allocation result so that the MDS service reports data to the corresponding ES service according to the configuration information.
  • an electronic device comprising:
  • a memory for storing a computer program;
  • a processor for executing the computer program to implement the aforementioned metadata reporting method.
  • the present application discloses a non-volatile readable storage medium for storing a computer program; wherein the computer program implements the aforementioned metadata reporting method when executed by a processor.
  • the corresponding load balancing distribution method is determined according to the size relationship between the total number of ES services in the ES cluster and the total number of MDS services in the distributed file system; ES services are allocated to each MDS service according to the load balancing distribution method, and configuration information is generated based on the allocation result, so that the MDS service reports metadata to an allocated candidate ES service according to the configuration information. It can be seen that ES services are allocated using the load balancing distribution method suited to the current size relationship between the total number of ES services in the ES cluster and the total number of MDS services in the distributed file system.
  • through load balancing, the MDSs balance the metadata reporting load according to the number of ESs and other conditions, that is, ES load balancing is achieved, which maximizes the reporting performance of the program.
  • FIG. 1 is a schematic structural diagram of a metadata reporting system in the prior art.
  • FIG. 2 is a flowchart of a metadata reporting method provided by the present application.
  • FIG. 3 is a schematic diagram of a load balancing distribution method provided by the present application.
  • FIG. 4 is a schematic diagram of another load balancing distribution method provided by the present application.
  • FIG. 5 is a schematic diagram of yet another load balancing distribution method provided by the present application.
  • FIG. 6 is a flowchart of a specific metadata reporting method provided by the present application.
  • FIG. 7 is a schematic diagram of a specific ES service fault switching provided by the present application.
  • FIG. 8 is a flowchart of another specific metadata reporting method provided by the present application.
  • FIG. 9 is a schematic diagram of a specific ES service switchback provided by the present application.
  • FIG. 10 is a schematic structural diagram of a metadata reporting device provided by the present application.
  • FIG. 11 is a structural diagram of an electronic device provided by the present application.
  • this application proposes a metadata reporting method that can achieve a balanced metadata reporting pressure and maximize the reporting performance.
  • Some embodiments of the present application disclose a metadata reporting method. Referring to FIG. 2, the method may include the following steps:
  • Step S11: Determine a corresponding load balancing distribution method according to the size relationship between the total number of ES services in the ES cluster and the total number of MDS services in the distributed file system.
  • the total number of ES services in the ES cluster and the total number of MDS services in the distributed file system are first counted, and the corresponding load balancing distribution method is then determined based on the size relationship between the two totals; in every case, each MDS corresponds to at least one ES.
  • the size relationship between the total number of ES services and the total number of MDS services falls into three cases: the total number of ES services is less than the total number of MDS services, the total number of ES services is equal to the total number of MDS services, or the total number of ES services is greater than the total number of MDS services. Each case corresponds to a different load balancing distribution method.
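  • As a minimal sketch of step S11, the three cases can be expressed as a small selection function. The names `Strategy` and `choose_strategy` are hypothetical and are not taken from the application; only the three-way comparison comes from the text.

```python
from enum import Enum


class Strategy(Enum):
    """Hypothetical labels for the three distribution methods in the text."""
    FIRST = "ES count < MDS count"
    SECOND = "ES count == MDS count"
    THIRD = "ES count > MDS count"


def choose_strategy(num_es: int, num_mds: int) -> Strategy:
    """Pick a distribution method from the size relationship of the two totals."""
    if num_es <= 0 or num_mds <= 0:
        # Assumption: both clusters must contain at least one service.
        raise ValueError("both counts must be positive")
    if num_es < num_mds:
        return Strategy.FIRST
    if num_es == num_mds:
        return Strategy.SECOND
    return Strategy.THIRD
```

For instance, `choose_strategy(5, 3)` selects the third method, since there are more ES services than MDS services.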
  • ES is an open source distributed, RESTful-style search and data analysis engine that encapsulates Lucene. It provides a set of simple and consistent RESTful APIs to help implement storage and retrieval. Multiple ESs construct an ES distributed search cluster. After an ES service fails, the ES cluster can still provide services normally.
  • ES is a distributed search engine and also a distributed database.
  • a distributed file system refers to a cluster composed of multiple file storage node servers. Files are stored in blocks, with objects as the basic unit. It supports storing a copy of data on multiple nodes. Each node can obtain complete data through inter-node communication. When a node goes down, complete data can be restored according to the configured policy. It has the characteristics of high availability, high performance, and high scalability.
  • Each node provides metadata services, namely MDS, for various metadata access operations to balance business pressure.
  • the MDS service, or metadata service, maintains file metadata and processes the various metadata requests from clients.
  • Multiple MDSs construct a metadata service cluster, and each MDS is responsible for different subtrees of the entire system file tree to form a distributed metadata service cluster.
  • Step S12: Allocate an ES service to each MDS service in a load balancing manner, and generate configuration information based on the allocation result, so that the MDS service reports metadata to an allocated candidate ES service according to the configuration information.
  • ES services are allocated to each MDS service in this manner, and the allocation result is saved as configuration information, so that the MDS service reports metadata to an allocated candidate ES service according to the configuration information. That is, an MDS service may be allocated multiple ES services, but at any given time it reports metadata to only one of its allocated candidate ES services.
  • the corresponding load balancing distribution method is determined according to the size relationship between the total number of ES services in the ES cluster and the total number of MDS services in the distributed file system, which may include: if the total number of ES services is less than the total number of MDS services, the load balancing distribution method is the first target load balancing distribution method.
  • allocating ES services to each MDS service according to the load balancing distribution method may include: sorting all MDS services, and sorting all ES services, and allocating ES services to MDS services with the same sorting position according to the sorting position of the ES services to complete the initial allocation; according to the sorting of the ES services, selecting one ES service from the beginning in turn and assigning it to the MDS services that have not been assigned ES services after the initial allocation, so that each MDS service corresponds to at least one ES service.
  • because the distributed file system and the ES cluster are two separately configured systems with no dependency between them, the number of ESs and the number of MDSs may differ. ES services can be assigned according to MDS number, where MDS numbers are assigned uniformly by the MDS cluster and increment from 0 (0, 1, 2, and so on).
  • the ES services are first allocated to the MDS services with the same sorting position according to the sorting position of the ES services to complete the initial allocation. Then, following the order of the ES services, one ES service at a time is selected from the beginning and allocated to the MDS services that were not allocated an ES service in the initial allocation, so that each MDS service corresponds to at least one ES service.
  • one MDS service may correspond to multiple ES services.
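  • The first target distribution method (fewer ES services than MDS services) might be sketched as follows. This is an illustration under assumptions: services are identified by integer numbers starting from 0, the function name `allocate_fewer_es` is hypothetical, and "selecting one ES service from the beginning in turn" is interpreted as wrapping around the sorted ES list.

```python
def allocate_fewer_es(mds_ids, es_ids):
    """Assign one ES to every MDS when there are fewer ES than MDS services."""
    mds_sorted = sorted(mds_ids)
    es_sorted = sorted(es_ids)
    if not es_sorted or len(es_sorted) >= len(mds_sorted):
        raise ValueError("this method expects 0 < number of ES < number of MDS")

    allocation = {}
    # Initial allocation: pair services that share the same sorting position.
    for pos, es in enumerate(es_sorted):
        allocation[mds_sorted[pos]] = [es]
    # Remaining MDS services: take ES services from the beginning in turn
    # (assumed to wrap around when the ES list is exhausted).
    leftover_mds = mds_sorted[len(es_sorted):]
    for offset, mds in enumerate(leftover_mds):
        allocation[mds] = [es_sorted[offset % len(es_sorted)]]
    return allocation
```

With five MDS services (0 to 4) and two ES services (0 and 1), MDS 0, 2 and 4 are assigned ES 0, and MDS 1 and 3 are assigned ES 1.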
  • the corresponding load balancing distribution method is determined according to the size relationship between the total number of ES services in the ES cluster and the total number of MDS services in the distributed file system, which may include: if the total number of ES services is equal to the total number of MDS services, then the load balancing distribution method is the second target load balancing distribution method.
  • allocating ES services to each MDS service according to the load balancing distribution method may include: sorting all MDS services, and sorting all ES services, and allocating ES services to MDS services with the same sorting position according to the sorting position of the ES services, so that there is a one-to-one correspondence between MDS services and ES services. For example, as shown in Figure 4, when the number of ES is equal to the number of MDS, a one-to-one correspondence is sufficient.
  • the corresponding load balancing distribution method is determined, which may include: if the total number of ES services is greater than the total number of MDS services, the load balancing distribution method is the third target load balancing distribution method.
  • allocating ES services to each MDS service according to the load balancing distribution method may include: sorting all MDS services, and sorting all ES services, and allocating ES services to MDS services with the same sorting position according to the sorting position of the ES services to complete the initial allocation; and re-allocating the ES services that are not allocated after the initial allocation according to the sorting of the MDS services until all ES services are allocated, so that each ES service corresponds to at least one MDS service.
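  • The third target distribution method (more ES services than MDS services) might be sketched as below. The function name `allocate_more_es` and the integer service numbers are assumptions for illustration; the single loop covers both the initial allocation and the reallocation of the leftover ES services in MDS order.

```python
def allocate_more_es(mds_ids, es_ids):
    """Spread all ES services over the MDS services when ES outnumber MDS."""
    mds_sorted = sorted(mds_ids)
    es_sorted = sorted(es_ids)
    if not mds_sorted or len(es_sorted) <= len(mds_sorted):
        raise ValueError("this method expects number of ES > number of MDS > 0")

    allocation = {mds: [] for mds in mds_sorted}
    for pos, es in enumerate(es_sorted):
        # Positions below len(mds_sorted) form the initial allocation;
        # later positions cycle through the MDS services again in order.
        target_mds = mds_sorted[pos % len(mds_sorted)]
        allocation[target_mds].append(es)
    return allocation
```

With three MDS services and five ES services this yields MDS0: ES0, ES3; MDS1: ES1, ES4; MDS2: ES2, which is the same candidate assignment the description lists for its Figure 7 example.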
  • in the present application, before ES services are allocated to each MDS service according to the load balancing distribution method, the method may also include: configuring, through the distributed file system, a communication address for each ES service in the ES cluster, so that each MDS service reports data to its corresponding ES service according to that ES service's communication address. That is, the distributed storage cluster is configured with the address of every ES in the ES cluster, and each MDS connects to and communicates with ES through these addresses to report metadata.
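  • One possible shape for the generated configuration information is sketched below: each MDS is mapped to its candidate ES services together with their communication addresses. The application does not specify a format, so the dictionary layout, the function name `build_config`, and the example addresses are all hypothetical.

```python
def build_config(allocation, es_addresses):
    """Combine an allocation result with ES addresses into per-MDS config.

    allocation:   {mds_number: [es_number, ...]}
    es_addresses: {es_number: "http://host:port"} (hypothetical addresses)
    """
    config = {}
    for mds, es_list in allocation.items():
        config[f"mds{mds}"] = [
            {"es": f"es{es}", "address": es_addresses[es]} for es in es_list
        ]
    return config


# Hypothetical example values, not taken from the application.
example_allocation = {0: [0, 3], 1: [1, 4], 2: [2]}
example_addresses = {n: f"http://10.0.0.{10 + n}:9200" for n in range(5)}
example_config = build_config(example_allocation, example_addresses)
```

Each MDS would then read only its own entry and connect to one of the listed addresses at a time.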
  • some embodiments of the present application do not rely on any service components, will not affect the business system, and are also of reference value in various advanced functions of distributed storage.
  • a corresponding load balancing distribution method is determined based on the size relationship between the total number of ES services in the ES cluster and the total number of MDS services in the distributed file system; ES services are allocated to each MDS service according to the load balancing distribution method, and configuration information is generated based on the allocation result, so that the MDS service reports metadata to an allocated candidate ES service according to the configuration information.
  • Some embodiments of the present application disclose a specific metadata reporting method. Referring to FIG. 6, the method may include the following steps:
  • Step S21: Determine a corresponding load balancing distribution method according to the size relationship between the total number of ES services in the ES cluster and the total number of MDS services in the distributed file system.
  • Step S22: Allocate an ES service to each MDS service in a load balancing manner, and generate configuration information based on the allocation result, so that the MDS service reports metadata to an allocated candidate ES service according to the configuration information.
  • Step S23: If the candidate ES service for receiving metadata reporting currently corresponding to the target MDS service in the distributed file system fails, the target MDS is switched to the normally operating ES service so that the target MDS reports metadata to the normally operating ES service.
  • failures and service anomalies are the norm. In practice, the normal path accounts for only about 20% of a feature's implementation, while failure and exception handling accounts for about 80%, which is the 80/20 rule. Therefore, in metadata reporting for the Elasticsearch-based metadata retrieval function of a distributed file system, that is, while an MDS uploads data to ES, ES failures are to be expected. Left unhandled, they disrupt metadata reporting and ultimately affect the user's search results: if file metadata is not updated in time, a search may return files that do not meet the user's conditions, and if the user then deletes those files, data may be deleted by mistake, causing irreparable loss.
  • the target MDS is switched to the normally operating ES service, that is, when the ES fails, the MDS actively switches to the normal ES service to implement the fault switching function, so that the target MDS reports metadata to the normally operating ES service.
  • the process may further include: the target MDS reports data to a corresponding candidate ES service according to configuration information, and determines whether the candidate ES service has a fault according to the data reporting result fed back by the candidate ES service. Specifically, determining whether the ES is normal may be performed by sending an HTTP GET message to the ES and determining based on the response.
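  • A health check based on an HTTP GET could look like the sketch below, using only the Python standard library. The function names, the timeout, and the rule that any 2xx status counts as "normal" are assumptions; the application only states that the decision is based on the response to an HTTP GET.

```python
import urllib.request


def http_get_status(url, timeout=2.0):
    """Send an HTTP GET and return the response status code."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.status


def is_es_healthy(address, get_status=http_get_status):
    """Treat an ES service as normal if the GET returns a 2xx status.

    Connection errors, timeouts and HTTP error responses all derive from
    OSError in the standard library, so they are reported as unhealthy.
    """
    try:
        return 200 <= get_status(address) < 300
    except OSError:
        return False
```

The `get_status` parameter exists only so the check can be exercised without a live ES service, for example by passing a stub that returns a fixed status code.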
  • switching the target MDS to a normally operating ES service may include: judging whether there is a normally operating first target ES service among all candidate ES services corresponding to the target MDS according to the configuration information; determining a reporting object switching method for the target MDS according to the judgment result, and switching the target MDS to a normally operating ES service according to the reporting object switching method. That is, in some embodiments of the present application, when switching services, it is necessary to first judge whether there is a normally operating ES service among all candidate ES services corresponding to the target MDS, and then select a specific service switching object.
  • determining the reporting object switching method for the target MDS according to the judgment result, and switching the target MDS to a normally operating ES service according to the reporting object switching method, may include: if there is no normally operating first target ES service among all candidate ES services corresponding to the target MDS in the configuration information, polling all ES services in the ES cluster until a normally operating ES service is found, and switching the target MDS to that ES service.
  • a reporting object switching method for the target MDS is determined based on a judgment result, and the target MDS is switched to a normally operating ES service based on the reporting object switching method, which may include: if there is a first target ES service that is normally operating among all candidate ES services corresponding to the target MDS in the configuration information, the target MDS is switched to the first target ES service.
  • the MDS first tries to select a normal ES from its candidate ESs. If none of the candidate ESs is normal, it selects from all ESs in turn. If none of the ESs is normal, it traverses all ESs again until a normal ES is selected. Metadata reporting is not performed while an ES is being selected. As shown in Figure 7, according to the load balancing allocation algorithm, the candidate ESs of each MDS are: MDS0: ES0, ES3; MDS1: ES1, ES4; MDS2: ES2.
  • when ES0 fails, MDS0 first selects its other candidate ES, that is, ES3. If ES3 is not normal either, it tries the ESs one by one from ES0 to ES4. If none of ES0 to ES4 is normal, it traverses ES0 to ES4 again until a normal ES is selected.
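  • The selection order described above (candidates first, then every ES in the cluster, repeated until one is normal) can be sketched as follows. The function name `select_healthy_es` is hypothetical, and the `max_rounds` limit is an assumption added so the sketch can terminate; the description itself keeps traversing until a normal ES is found.

```python
from typing import Callable, Optional, Sequence


def select_healthy_es(
    candidates: Sequence[str],
    all_es: Sequence[str],
    is_healthy: Callable[[str], bool],
    max_rounds: Optional[int] = None,
) -> Optional[str]:
    """Return the first normal ES, preferring the MDS's own candidates."""
    # Step 1: try the candidate ES services from the configuration.
    for es in candidates:
        if is_healthy(es):
            return es
    # Step 2: poll every ES in the cluster in order, round after round.
    rounds = 0
    while max_rounds is None or rounds < max_rounds:
        for es in all_es:
            if is_healthy(es):
                return es
        rounds += 1
    # Only reachable when max_rounds is set (an assumption of this sketch).
    return None
```

A real reporter would probably pause between rounds rather than poll in a tight loop; that delay is left out here because the application does not specify one.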
  • while an MDS is reporting metadata, the ES service on a node may become abnormal, or the node may lose power; however, ES follows its own redundancy rules and can still provide service normally when one ES service hangs or one ES node goes down.
  • MDS should actively select an available ES in the ES cluster to continue metadata reporting, rather than waiting for the ES assigned to this MDS to become normal before continuing to report, thereby achieving business continuity and high availability.
  • some embodiments of the present application do not rely on any service components, will not affect the business system, and are also of reference value in various advanced functions of distributed storage.
  • MDS metadata reporting implements automatic fault switching, achieves global load balancing and high availability of distributed systems, and does not require manual intervention, thereby improving system stability, better assisting user data management, improving product competitiveness, and improving user satisfaction.
  • some embodiments of the present application further disclose a specific metadata reporting method. As shown in FIG. 8, the method may include the following steps:
  • Step S31: Determine a corresponding load balancing distribution method according to the size relationship between the total number of ES services in the ES cluster and the total number of MDS services in the distributed file system.
  • Step S32: Allocate an ES service to each MDS service in a load balancing manner, and generate configuration information based on the allocation result, so that the MDS service reports metadata to an allocated candidate ES service according to the configuration information.
  • Step S33: If the candidate ES service for receiving metadata reporting currently corresponding to the target MDS service in the distributed file system fails, the target MDS is switched to the normally operating ES service so that the target MDS reports metadata to the normally operating ES service.
  • Step S34: Determine whether the normally running ES service hits the candidate ES service corresponding to the target MDS according to the configuration information.
  • Step S35: According to the hit result, determine whether to perform report object switchback processing on the normally running ES service currently corresponding to the target MDS.
  • the above service switching may switch to the candidate ES service corresponding to the target MDS, or it may switch to a normal ES service that is not a candidate. If the service is switched to a candidate ES service, the current state after the switch is also in line with the load balancing. However, if the service is switched to a normal ES service that is not a candidate, the current distribution does not conform to the initial load balancing distribution. Therefore, service switching back is required to achieve load balancing on the basis of normal operation.
  • it is determined whether to perform report object switching back processing on the normally operating ES service currently corresponding to the target MDS which may include: if the normally operating ES service does not hit the candidate ES service corresponding to the target MDS, then periodically monitoring the status of all candidate ES services corresponding to the target MDS, and when there is a normally operating second target ES service among all the candidate ES services corresponding to the target MDS, switching the target MDS back to the second target ES service.
  • determining whether to perform reporting object switchback processing on the normally operating ES service currently corresponding to the target MDS may also include: if the normally operating ES service hits a candidate ES service corresponding to the target MDS, the current reporting connection is retained. That is, after an ES failure, the MDS switches to another ES and continues to report metadata normally; during this period it periodically checks whether the currently connected ES is one of its candidate ESs. If it is not, the MDS periodically sends a message to its candidate ESs to check whether any of them has returned to normal, and switches back to a candidate ES once one has recovered. For example, as shown in Figure 9, ES0 fails and MDS0 switches to ES1. While reporting metadata through ES1, MDS0 periodically checks whether ES0 is normal, and switches back to ES0 once it is, thereby achieving fault switchback and load balancing.
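  • One periodic switchback check might be sketched as follows. The function name `next_report_target` is hypothetical, and the caller is assumed to invoke it on a timer of its own choosing; the logic follows the text: keep the connection if the current ES is a candidate, otherwise switch back as soon as a candidate is normal again.

```python
from typing import Callable, Sequence


def next_report_target(
    current: str,
    candidates: Sequence[str],
    is_healthy: Callable[[str], bool],
) -> str:
    """Decide which ES the MDS should report to after one periodic check."""
    if current in candidates:
        # The current ES hits a candidate: retain the reporting connection.
        return current
    # Currently on a non-candidate ES: probe the candidates for recovery.
    for es in candidates:
        if is_healthy(es):
            return es  # switch back to the recovered candidate
    # No candidate has recovered yet: keep reporting through the current ES.
    return current
```

In the Figure 9 scenario from the description, MDS0 reports through ES1 after ES0 fails; once ES0 answers the check again, the function returns ES0 and the original load-balanced assignment is restored.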
  • the load balancing and high availability implementation of the distributed file system metadata retrieval function disclosed in some embodiments of the present application works as follows:
      1. The distributed file system configures the address of each ES.
      2. The MDSs distribute the ESs evenly among themselves according to the load balancing algorithm.
      3. During MDS reporting, if the ES fails, the MDS first tries to select an ES from its candidate ESs, and if one is normal, that candidate ES is used.
      4. If none of the candidate ESs is normal, the MDS selects from all ESs in turn, and if a normal ES is selected, uses it to continue metadata reporting. If none of the ESs is normal, the MDS keeps traversing all ESs until a normal ES is selected.
  • MDS independently implements a set of load balancing and high availability functions, which does not rely on any components and is not coupled with distributed system functions.
  • the independent load balancing and high availability functions realize continuous, efficient and high availability of metadata reporting services, ensure the stable operation of the system, better assist users in data management, and efficiently develop data value.
  • whether the normally operating ES service hits a candidate ES service corresponding to the target MDS is determined based on the configuration information; based on the hit result, it is determined whether to perform reporting object switchback processing on the normally operating ES service currently corresponding to the target MDS.
  • service switchback is carried out through the reporting object switchback processing, so that load balancing is preserved as far as possible while normal operation is maintained.
  • some embodiments of the present application also disclose a metadata reporting system, including an ES cluster and a distributed file system; wherein the ES service in the ES cluster and the MDS service in the distributed file system are services allocated through a load balancing distribution method; wherein the load balancing distribution method is determined according to the size relationship between the total number of ES services in the ES cluster and the total number of MDS services in the distributed file system; the MDS service is used to report metadata to a corresponding ES service according to configuration information; the configuration information is configuration information generated based on the allocation result.
  • some embodiments of the present application further disclose a metadata reporting device. As shown in FIG. 10, the device includes:
  • the load balancing distribution mode determination module 11 is used to determine the corresponding load balancing distribution mode according to the size relationship between the total number of ES services in the ES cluster and the total number of MDS services in the distributed file system;
  • the allocation module 12 is used to allocate an ES service to each MDS service in a load balancing allocation manner, and generate configuration information based on the allocation result, so that the MDS service reports data to the corresponding ES service according to the configuration information.
  • the corresponding load balancing distribution method is determined according to the size relationship between the total number of ES services in the ES cluster and the total number of MDS services in the distributed file system;
  • ES services are allocated to each MDS service according to the load balancing allocation mode, and configuration information is generated based on the allocation result, so that the MDS service reports metadata to an allocated candidate ES service according to the configuration information. It can be seen that ES services are allocated using the load balancing allocation mode that suits the current size relationship between the total number of ES services in the ES cluster and the total number of MDS services in the distributed file system.
  • by connecting the MDS services evenly to different ES services, the metadata reporting pressure is balanced according to conditions such as the number of ES services, that is, ES load balancing is achieved, thereby maximizing reporting performance.
  • the load balancing distribution mode determination module 11 may specifically include:
  • the first load balancing allocation mode determining unit is configured to determine that if the total number of ES services is less than the total number of MDS services, the load balancing allocation mode is the first target load balancing allocation mode.
  • the allocation module 12 may specifically include:
  • the primary allocation unit is used to sort all the MDS services and all the ES services, and to allocate the ES services to the MDS services with the same sorting position according to the sorting position of the ES services, so as to complete the primary allocation;
  • the reallocation unit is used to select an ES service from the beginning in sequence according to the order of the ES services and allocate it to the MDS services that have not been allocated ES services after the initial allocation, so that each MDS service corresponds to at least one ES service.
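The two units above can be sketched as a single function. This is a minimal illustration, not the applicant's implementation: the function name, the use of plain string names sorted lexicographically, and the wrap-around back to the first ES service once the ES order is exhausted are all assumptions.

```python
def allocate_when_es_fewer(mds_services, es_services):
    """Assign one ES to every MDS when there are fewer ES than MDS services."""
    mds_sorted = sorted(mds_services)
    es_sorted = sorted(es_services)
    if not es_sorted or len(es_sorted) >= len(mds_sorted):
        raise ValueError("expects 0 < number of ES < number of MDS")

    allocation = {}
    # Primary allocation: pair services that share the same sorting position.
    for position, es in enumerate(es_sorted):
        allocation[mds_sorted[position]] = [es]
    # Reallocation: walk the ES order from the beginning for the MDS services
    # that are still unassigned. Wrapping around is an assumption.
    leftover_mds = mds_sorted[len(es_sorted):]
    for offset, mds in enumerate(leftover_mds):
        allocation[mds] = [es_sorted[offset % len(es_sorted)]]
    return allocation
```

With five MDS services and two ES services, the MDS services alternate between the two ES services, so no ES service receives more than one reporter above its fair share.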
  • the load balancing distribution mode determination module 11 may specifically include:
  • the second load balancing allocation mode determining unit is configured to determine that if the total number of ES services is equal to the total number of MDS services, the load balancing allocation mode is the second target load balancing allocation mode.
  • the allocation module 12 may specifically include:
  • the allocation unit is used to sort all MDS services and all ES services, and allocate ES services to MDS services with the same sorting position according to the sorting position of the ES services, so that there is a one-to-one correspondence between the MDS services and the ES services.
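For the equal-count case, the pairing by sorting position reduces to zipping the two sorted lists. The sketch below assumes services are identified by sortable names; the function name is hypothetical.

```python
def allocate_one_to_one(mds_services, es_services):
    """Pair MDS and ES services that share the same sorting position."""
    mds_sorted = sorted(mds_services)
    es_sorted = sorted(es_services)
    if len(mds_sorted) != len(es_sorted):
        raise ValueError("expects the same number of MDS and ES services")
    # zip pairs position 0 with position 0, position 1 with position 1, ...
    return dict(zip(mds_sorted, es_sorted))
```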
  • the load balancing distribution mode determination module 11 may specifically include:
  • the third load balancing allocation mode determining unit is configured to determine that if the total number of ES services is greater than the total number of MDS services, the load balancing allocation mode is a third target load balancing allocation mode.
  • the allocation module 12 may specifically include:
  • the primary allocation unit is used to sort all the MDS services and all the ES services, and to allocate the ES services to the MDS services with the same sorting position according to the sorting position of the ES services, so as to complete the primary allocation;
  • the reallocation unit is used to reallocate the ES services that are not allocated after the initial allocation according to the order of the MDS services until all ES services are allocated, so that each ES service corresponds to at least one MDS service.
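When ES services outnumber MDS services, each MDS ends up with a list of candidate ES services. The following is a rough sketch under the same assumptions as before (hypothetical function name, sortable string names, wrap-around over the MDS order when more than one pass is needed).

```python
def allocate_when_es_more(mds_services, es_services):
    """Give every ES to some MDS when there are more ES than MDS services."""
    mds_sorted = sorted(mds_services)
    es_sorted = sorted(es_services)
    if not mds_sorted or len(es_sorted) <= len(mds_sorted):
        raise ValueError("expects 0 < number of MDS < number of ES")

    # Primary allocation: pair services that share the same sorting position.
    allocation = {mds: [es_sorted[i]] for i, mds in enumerate(mds_sorted)}
    # Reallocation: hand the leftover ES services out in MDS order until none
    # remain. Wrapping around the MDS order is an assumption.
    leftover_es = es_sorted[len(mds_sorted):]
    for offset, es in enumerate(leftover_es):
        allocation[mds_sorted[offset % len(mds_sorted)]].append(es)
    return allocation
```

The first entry of each list is the ES service from the primary allocation; the remaining entries are additional candidates that become relevant if that ES service fails.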
  • the metadata reporting device may specifically include:
  • the address sending unit is used to configure a communication address for each ES service in the ES cluster through a distributed file system, so that the MDS service reports data to the corresponding ES service according to the communication address of the ES service.
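One way to picture how the communication addresses and the allocation result could come together is shown below. The structure of the configuration information is not specified in the text, so the dictionary layout, the function name, and the addresses (taken from a documentation-only IP range) are all illustrative assumptions.

```python
def build_config(allocation, es_addresses):
    """Attach a communication address to every allocated candidate ES."""
    config = {}
    for mds, candidates in sorted(allocation.items()):
        entries = []
        for es in candidates:
            if es not in es_addresses:
                raise KeyError(f"no communication address configured for {es}")
            entries.append({"es": es, "address": es_addresses[es]})
        config[mds] = entries
    return config


# Placeholder names and addresses, for illustration only.
addresses = {"es1": "192.0.2.11:9200", "es2": "192.0.2.12:9200"}
config = build_config({"mds1": ["es1", "es2"], "mds2": ["es2"]}, addresses)
```

Failing early when an ES service has no address keeps an MDS from being configured with a candidate it cannot reach.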
  • the metadata reporting device may specifically include:
  • the service switching unit is used to switch the target MDS to the normally operating ES service if the candidate ES service currently corresponding to the target MDS service in the distributed file system for receiving metadata reporting fails, so that the target MDS reports metadata to the normally operating ES service.
  • the metadata reporting device may specifically include:
  • the fault judgment unit is used for the target MDS to report data to a corresponding candidate ES service according to the configuration information, and to judge whether the candidate ES service has a fault according to the data reporting result fed back by the candidate ES service.
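The fault judgment can be sketched as a small wrapper around one reporting attempt. The text does not say what the reporting result looks like, so this sketch assumes an HTTP-like status code and treats a missing response or any status outside the 2xx range as a fault; `send_report` is a hypothetical callable standing in for the real transport.

```python
def candidate_has_fault(send_report, address, metadata):
    """Report once and judge from the outcome whether the ES service is faulty."""
    try:
        status = send_report(address, metadata)
    except (ConnectionError, TimeoutError):
        # No result was fed back at all: treat the candidate as faulty.
        return True
    # Assumption: any status outside the 2xx range counts as a fault.
    return not (200 <= status < 300)
```

Passing the transport in as a parameter keeps the judgment logic testable without a running ES cluster.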
  • the service switching unit may specifically include:
  • a candidate service determination unit configured to determine whether there is a first target ES service that is operating normally among all candidate ES services corresponding to the target MDS according to the configuration information
  • the ES service switching unit is used to determine a reporting object switching mode for the target MDS according to the judgment result, and switch the target MDS to a normally operating ES service according to the reporting object switching mode.
  • the ES service switching unit may specifically include:
  • the first switching unit is used to, if there is no normally operating first target ES service among all candidate ES services corresponding to the target MDS in the configuration information, poll all ES services in the ES cluster until a normally operating ES service is found, and switch the target MDS to that ES service.
  • the ES service switching unit may specifically include:
  • the second switching unit is configured to switch the target MDS to the first target ES service if there is a first target ES service that is operating normally among all candidate ES services corresponding to the target MDS in the configuration information.
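The two switching units amount to a two-stage search, sketched below. The health probe `is_running` is a hypothetical callable, and returning `None` when the whole cluster is down is an assumption; the text only describes polling until a normally operating ES service is found.

```python
def choose_failover_target(candidates, all_es, is_running):
    """Return a normally operating ES for the target MDS, or None if none exists."""
    # Second switching unit: prefer a healthy ES among the MDS's own candidates.
    for es in candidates:
        if is_running(es):
            return es
    # First switching unit: otherwise poll the whole ES cluster in order.
    for es in all_es:
        if is_running(es):
            return es
    return None
```

Trying the configured candidates first keeps the MDS on an ES service it was originally allocated whenever possible, which limits how far the load drifts from the balanced allocation.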
  • the metadata reporting device may specifically include:
  • a hit judgment unit used to judge whether the normally running ES service hits the candidate ES service corresponding to the target MDS according to the configuration information
  • the switchback judgment unit is used to judge whether to perform a report object switchback process on the normally running ES service currently corresponding to the target MDS according to the hit result.
  • the switchback determination unit may specifically include:
  • the service switching back unit is used to periodically monitor the status of all candidate ES services corresponding to the target MDS if the normally operating ES service does not hit the candidate ES service corresponding to the target MDS, and when there is a normally operating second target ES service among all the candidate ES services corresponding to the target MDS, switch the target MDS back to the second target ES service.
  • the switchback determination unit may specifically include:
  • the service reservation unit is used to retain the current reporting connection if the normally operating ES service hits the candidate ES service corresponding to the target MDS.
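The hit check and the switchback decision can be sketched as a function that a periodic monitor would call on each tick. The function name and the `is_running` probe are hypothetical, and the monitoring interval is left to the caller because the text does not specify one.

```python
def switchback_target(current_es, candidates, is_running):
    """Return the candidate ES to switch back to, or None to keep the connection."""
    if current_es in candidates:
        # Hit: the MDS already reports to one of its configured candidates,
        # so the current reporting connection is retained.
        return None
    # Miss: the MDS is on a non-candidate ES. Switch back as soon as one of
    # its configured candidates is operating normally again.
    for es in candidates:
        if is_running(es):
            return es
    return None
```

A `None` result covers two situations: a hit, where nothing needs to change, and a miss where no candidate has recovered yet, in which case the monitor simply checks again on the next tick.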
  • some embodiments of the present application also disclose an electronic device, as shown in FIG11 .
  • the content in the figure cannot be regarded as any limitation on the scope of use of the present application.
  • FIG11 is a schematic diagram of the structure of an electronic device 20 provided in some embodiments of the present application.
  • the electronic device 20 may specifically include: at least one processor 21, at least one memory 22, a power supply 23, a communication interface 24, an input/output interface 25, and a communication bus 26.
  • the memory 22 is used to store a computer program, which is loaded and executed by the processor 21 to implement the relevant steps in the metadata reporting method disclosed in any of the aforementioned embodiments.
  • the power supply 23 is used to provide working voltage for each hardware device on the electronic device 20;
  • the communication interface 24 can create a data transmission channel between the electronic device 20 and external devices, and the communication protocol it follows is any communication protocol that can be applied to the technical solution of the present application, and is not specifically limited here;
  • the input and output interface 25 is used to obtain external input data or output data to the outside world, and its specific interface type can be selected according to specific application needs and is not specifically limited here.
  • the memory 22, as a carrier for storing resources, can be a read-only memory, a random access memory, a magnetic disk, an optical disk, etc.
  • the resources stored thereon include an operating system 221, a computer program 222, and data 223 including configuration information, etc.
  • the storage method can be temporary storage or permanent storage.
  • the operating system 221 is used to manage and control the hardware devices and the computer program 222 on the electronic device 20, so that the processor 21 can operate on and process the massive data 223 in the memory 22; it can be Windows Server, NetWare, Unix, Linux, etc.
  • the computer program 222 can further include computer programs that can be used to complete other specific tasks.
  • some embodiments of the present application also disclose a non-volatile readable storage medium, in which computer executable instructions are stored.
  • when the computer executable instructions are loaded and executed by a processor, the steps of the metadata reporting method disclosed in any of the aforementioned embodiments are implemented.
  • each embodiment is described in a progressive manner. Each embodiment focuses on the differences from other embodiments. The same or similar parts between the embodiments can be referred to each other.
  • the description is relatively simple; for the relevant parts, please refer to the description of the method part.
  • the steps of the method or algorithm described in conjunction with the embodiments disclosed herein may be implemented directly using hardware, a software module executed by a processor, or a combination of the two.
  • the software module may be placed in a random access memory (RAM), a memory, a read-only memory (ROM), an electrically programmable ROM, an electrically erasable programmable ROM, a register, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art.

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present application relates to the technical field of data processing. Provided are a metadata reporting method and apparatus, and a device and a storage medium. The method comprises: determining a corresponding load balanced allocation mode according to a magnitude relationship between the total number of ES services in an ES cluster and the total number of MDSs in a distributed file system; and allocating an ES service to each of the MDSs according to the load balanced allocation mode, and generating configuration information on the basis of the allocation result, such that the MDS performs metadata reporting for the allocated candidate ES service according to the configuration information. By means of balancing, MDSs are connected to different ESs, such that the MDSs realize, by means of load balancing, balance of pressure of metadata reporting according to conditions such as the number of ESs, that is, ES load balancing is realized, thereby exerting reporting performance to the maximum extent.

Description

A metadata reporting method, device, equipment and storage medium
CROSS-REFERENCE TO RELATED APPLICATIONS
This application claims priority to Chinese patent application No. 202211518325.X, filed with the China Patent Office on November 30, 2022 and entitled "A metadata reporting method, device, equipment and storage medium", the entire contents of which are incorporated herein by reference.
Technical Field
The present application relates to the field of data processing technology, and in particular to a metadata reporting method, device, equipment and storage medium.
Background
At present, the metadata retrieval function of a distributed file system based on Elasticsearch (ES, an open source, distributed, RESTful search and data analysis engine) can retrieve tens of billions of files in under a minute, lets users configure both simple and complex retrieval conditions, and efficiently assists users in data management. For example, as shown in FIG. 1, metadata access to a file goes through the MDS service (metadata service). Each of the different MDSs is responsible for a portion of all the files in the distributed file system, and whenever the metadata of a file it is responsible for is updated, the MDS reports the latest metadata to an ES. Users configure different retrieval conditions to retrieve a list of matching files from the ES cluster. The distributed file system is served by multiple MDS services, and the ES cluster likewise consists of multiple ES services. If the reporting pressure of all MDSs falls on the same ES during metadata reporting, the performance of that ES is reduced.
Summary
In view of this, the purpose of the present application is to provide a metadata reporting method, device, equipment and medium that can balance the metadata reporting pressure and maximize reporting performance. The specific solution is as follows:
In a first aspect, the present application discloses a metadata reporting method, comprising:
determining a corresponding load balancing allocation mode according to the size relationship between the total number of ES services in the ES cluster and the total number of MDS services in the distributed file system;
allocating ES services to each MDS service according to the load balancing allocation mode, and generating configuration information based on the allocation result, so that the MDS service reports metadata to an allocated candidate ES service according to the configuration information.
Optionally, determining the corresponding load balancing allocation mode according to the size relationship between the total number of ES services in the ES cluster and the total number of MDS services in the distributed file system comprises:
if the total number of ES services is less than the total number of MDS services, the load balancing allocation mode is a first target load balancing allocation mode.
Optionally, allocating ES services to each MDS service according to the load balancing allocation mode comprises:
sorting all MDS services and sorting all ES services, and allocating each ES service to the MDS service at the same sorting position, so as to complete an initial allocation;
following the order of the ES services, selecting ES services one at a time from the beginning and allocating them to the MDS services that were not allocated an ES service in the initial allocation, so that each MDS service corresponds to at least one ES service.
Optionally, determining the corresponding load balancing allocation mode according to the size relationship between the total number of ES services in the ES cluster and the total number of MDS services in the distributed file system comprises:
if the total number of ES services is equal to the total number of MDS services, the load balancing allocation mode is a second target load balancing allocation mode.
Optionally, allocating ES services to each MDS service according to the load balancing allocation mode comprises:
sorting all MDS services and sorting all ES services, and allocating each ES service to the MDS service at the same sorting position, so that the MDS services and the ES services are in one-to-one correspondence.
Optionally, determining the corresponding load balancing allocation mode according to the size relationship between the total number of ES services in the ES cluster and the total number of MDS services in the distributed file system comprises:
if the total number of ES services is greater than the total number of MDS services, the load balancing allocation mode is a third target load balancing allocation mode.
Optionally, allocating ES services to each MDS service according to the load balancing allocation mode comprises:
sorting all MDS services and sorting all ES services, and allocating each ES service to the MDS service at the same sorting position, so as to complete an initial allocation;
allocating the ES services left unallocated after the initial allocation again, following the order of the MDS services, until all ES services have been allocated, so that each ES service corresponds to at least one MDS service.
Optionally, before allocating ES services to each MDS service according to the load balancing allocation mode, the method further comprises:
configuring, through the distributed file system, a communication address for each ES service in the ES cluster, so that the MDS service reports data to the corresponding ES service according to the communication address of that ES service.
Optionally, the metadata reporting method further comprises:
if the candidate ES service that currently corresponds to a target MDS service in the distributed file system and is used to receive metadata reports fails, switching the target MDS to a normally operating ES service, so that the target MDS reports metadata to the normally operating ES service.
Optionally, before the candidate ES service that corresponds to the target MDS service in the distributed file system and is used to receive metadata reports fails, the method further comprises:
the target MDS reporting data to a corresponding candidate ES service according to the configuration information, and determining whether the candidate ES service is faulty according to the data reporting result fed back by the candidate ES service.
Optionally, switching the target MDS to a normally operating ES service comprises:
determining, according to the configuration information, whether a normally operating first target ES service exists among all candidate ES services corresponding to the target MDS;
determining a reporting object switching mode for the target MDS according to the determination result, and switching the target MDS to a normally operating ES service according to the reporting object switching mode.
Optionally, determining the reporting object switching mode for the target MDS according to the determination result, and switching the target MDS to a normally operating ES service according to the reporting object switching mode, comprises:
if no normally operating first target ES service exists among all candidate ES services corresponding to the target MDS in the configuration information, polling all ES services in the ES cluster until a normally operating ES service is found, and switching the target MDS to that ES service.
Optionally, determining the reporting object switching mode for the target MDS according to the determination result, and switching the target MDS to a normally operating ES service according to the reporting object switching mode, comprises:
if a normally operating first target ES service exists among all candidate ES services corresponding to the target MDS in the configuration information, switching the target MDS to the first target ES service.
Optionally, the metadata reporting method further comprises:
determining, according to the configuration information, whether the normally operating ES service hits a candidate ES service corresponding to the target MDS;
determining, according to the hit result, whether to perform reporting object switchback processing for the normally operating ES service currently corresponding to the target MDS.
Optionally, determining, according to the hit result, whether to perform reporting object switchback processing for the normally operating ES service currently corresponding to the target MDS comprises:
if the normally operating ES service does not hit a candidate ES service corresponding to the target MDS, periodically monitoring the status of all candidate ES services corresponding to the target MDS, and once a normally operating second target ES service exists among them, switching the target MDS back to the second target ES service.
Optionally, determining, according to the hit result, whether to perform reporting object switchback processing for the normally operating ES service currently corresponding to the target MDS comprises:
if the normally operating ES service hits a candidate ES service corresponding to the target MDS, retaining the current reporting connection.
In another aspect, the present application discloses a metadata reporting system, comprising an ES cluster and a distributed file system;
wherein the ES services in the ES cluster and the MDS services in the distributed file system are services allocated according to a load balancing allocation mode, and the load balancing allocation mode is determined according to the size relationship between the total number of ES services in the ES cluster and the total number of MDS services in the distributed file system;
the MDS service is used to report metadata to a corresponding ES service according to configuration information, the configuration information being generated based on the allocation result.
In another aspect, the present application discloses a metadata reporting device, comprising:
a load balancing allocation mode determination module, used to determine a corresponding load balancing allocation mode according to the size relationship between the total number of ES services in the ES cluster and the total number of MDS services in the distributed file system;
an allocation module, used to allocate ES services to each MDS service according to the load balancing allocation mode, and to generate configuration information based on the allocation result, so that the MDS service reports data to the corresponding ES service according to the configuration information.
In another aspect, the present application discloses an electronic device, comprising:
a memory, used to store a computer program;
a processor, used to execute the computer program to implement the aforementioned metadata reporting method.
In another aspect, the present application discloses a non-volatile readable storage medium for storing a computer program, wherein the computer program, when executed by a processor, implements the aforementioned metadata reporting method.
In the present application, a corresponding load balancing allocation mode is determined according to the size relationship between the total number of ES services in the ES cluster and the total number of MDS services in the distributed file system; ES services are allocated to each MDS service according to the load balancing allocation mode, and configuration information is generated based on the allocation result, so that the MDS service reports metadata to an allocated candidate ES service according to the configuration information. It can be seen that ES services are allocated using the load balancing allocation mode that suits the current size relationship between the two totals. For the Elasticsearch-based metadata retrieval function of the distributed file system, connecting the MDSs evenly to different ESs balances the metadata reporting pressure according to conditions such as the number of ESs, that is, ES load balancing is achieved, thereby maximizing reporting performance.
Brief Description of the Drawings
In order to explain the embodiments of the present application or the technical solutions in the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below are merely embodiments of the present application, and a person of ordinary skill in the art can obtain other drawings from the provided drawings without creative effort.
FIG. 1 is a schematic diagram of the structure of a metadata reporting system in the prior art;
FIG. 2 is a flowchart of a metadata reporting method provided by the present application;
FIG. 3 is a schematic diagram of a load balancing allocation mode provided by the present application;
FIG. 4 is a schematic diagram of another load balancing allocation mode provided by the present application;
FIG. 5 is a schematic diagram of yet another load balancing allocation mode provided by the present application;
FIG. 6 is a flowchart of a specific metadata reporting method provided by the present application;
FIG. 7 is a schematic diagram of a specific ES service failover provided by the present application;
FIG. 8 is a flowchart of a specific metadata reporting method provided by the present application;
FIG. 9 is a schematic diagram of a specific ES service switchback provided by the present application;
FIG. 10 is a schematic diagram of the structure of a metadata reporting device provided by the present application;
FIG. 11 is a structural diagram of an electronic device provided by the present application.
Detailed Description
In order to make the purpose, technical solutions and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments of the present application are described clearly and completely below with reference to the drawings in the embodiments of the present application. Obviously, the described embodiments are only some of the embodiments of the present application, not all of them. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments in the present application without creative effort fall within the scope of protection of the present application.
In the prior art, the distributed file system is served by multiple MDS services, and the ES cluster likewise consists of multiple ES services. If the reporting pressure of all MDSs falls on the same ES during metadata reporting, the performance of that ES is reduced. To overcome this technical problem, the present application proposes a metadata reporting method that can balance the metadata reporting pressure and maximize reporting performance.
Some embodiments of the present application disclose a metadata reporting method. Referring to FIG. 2, the method may include the following steps:
Step S11: Determine a corresponding load balancing allocation mode according to the size relationship between the total number of ES services in the ES cluster and the total number of MDS services in the distributed file system.
In some embodiments of the present application, the total number of ES services in the ES cluster and the total number of MDS services in the distributed file system are first counted, and the corresponding load balancing allocation mode is then determined according to the size relationship between the two totals, while ensuring that each MDS corresponds to at least one ES. The size relationship falls into three cases: the total number of ES services is less than, equal to, or greater than the total number of MDS services. Each case corresponds to a different load balancing allocation mode.
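The three-way decision in step S11 can be sketched as follows. The enum, its member names, and the function name are hypothetical labels chosen for illustration; only the comparison of the two totals comes from the text.

```python
from enum import Enum


class AllocationMode(Enum):
    """Hypothetical labels for the three load balancing allocation modes."""
    FIRST = "es_fewer_than_mds"
    SECOND = "es_equal_to_mds"
    THIRD = "es_more_than_mds"


def determine_mode(num_es: int, num_mds: int) -> AllocationMode:
    """Pick the allocation mode from the size relationship of the two totals."""
    if num_es <= 0 or num_mds <= 0:
        # Assumption: an empty ES cluster or file system cannot be allocated.
        raise ValueError("both totals must be positive")
    if num_es < num_mds:
        return AllocationMode.FIRST
    if num_es == num_mds:
        return AllocationMode.SECOND
    return AllocationMode.THIRD
```

For example, `determine_mode(2, 5)` selects the first mode, in which several MDS services share one ES service.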
ES (Elasticsearch) is an open-source, distributed, RESTful search and data analytics engine built as a wrapper around Lucene. It provides a simple, consistent RESTful API for storage and retrieval. Multiple ES services form a distributed ES search cluster, and the cluster can continue to provide service after one ES service fails. ES is both a distributed search engine and a distributed database.
A distributed file system is a cluster of multiple file storage node servers. Files are split into blocks and stored with the object as the basic unit, and one piece of data can be stored on multiple nodes. Each node can obtain the complete data through inter-node communication, and when a node goes down the complete data can be recovered according to the configured policy. Such a system offers high availability, high performance, and high scalability. Each node provides a metadata service (MDS) that handles the various metadata access operations and balances the business load. The MDS service maintains file metadata and processes the different metadata requests from clients. Multiple MDS services form a metadata service cluster in which each MDS is responsible for a different subtree of the overall file tree, making up a distributed metadata service cluster.
Step S12: Allocate ES services to each MDS service according to the load balancing allocation method, and generate configuration information based on the allocation result, so that each MDS service reports metadata to one of its allocated candidate ES services according to the configuration information.
In some embodiments of the present application, after the load balancing allocation method is determined, ES services are allocated to each MDS service according to that method, and the allocation result is saved as configuration information, so that each MDS service reports metadata to one of its allocated candidate ES services according to the configuration information. In other words, an MDS service may be allocated multiple ES services, but at any given time it reports metadata to only one of its candidate ES services.
In some embodiments of the present application, determining the corresponding load balancing allocation method according to the size relationship between the total number of ES services in the ES cluster and the total number of MDS services in the distributed file system may include: if the total number of ES services is less than the total number of MDS services, the load balancing allocation method is a first target load balancing allocation method. In some embodiments of the present application, allocating ES services to each MDS service according to the load balancing allocation method may include: sorting all MDS services and sorting all ES services, and allocating each ES service to the MDS service at the same sort position to complete an initial allocation; then, following the order of the ES services, selecting one ES service at a time from the beginning and allocating it to each MDS service that received no ES service in the initial allocation, so that each MDS service corresponds to at least one ES service.
That is, because the distributed file system and the ES cluster are two separate systems that are configured independently and have no dependency on each other, the number of ES services and the number of MDS services may differ. The MDS services and ES services can each be identified by number; the MDS numbers may be assigned uniformly by the MDS cluster, starting at 0 and increasing in sequence (0, 1, 2, and so on). For example, as shown in FIG. 3, when the number of ES services is less than the number of MDS services, each ES service is first allocated to the MDS service at the same sort position to complete the initial allocation. Then, following the order of the ES services, one ES service at a time is selected from the beginning and allocated to each MDS service that received no ES service in the initial allocation, so that each MDS service corresponds to at least one ES service. In this case, one MDS service may correspond to multiple ES services.
In some embodiments of the present application, determining the corresponding load balancing allocation method according to the size relationship between the total number of ES services in the ES cluster and the total number of MDS services in the distributed file system may include: if the total number of ES services is equal to the total number of MDS services, the load balancing allocation method is a second target load balancing allocation method. In some embodiments of the present application, allocating ES services to each MDS service according to the load balancing allocation method may include: sorting all MDS services and sorting all ES services, and allocating each ES service to the MDS service at the same sort position, so that the MDS services and ES services correspond one to one. For example, as shown in FIG. 4, when the number of ES services equals the number of MDS services, a one-to-one correspondence is sufficient.
In some embodiments of the present application, determining the corresponding load balancing allocation method according to the size relationship between the total number of ES services in the ES cluster and the total number of MDS services in the distributed file system may include: if the total number of ES services is greater than the total number of MDS services, the load balancing allocation method is a third target load balancing allocation method. In some embodiments of the present application, allocating ES services to each MDS service according to the load balancing allocation method may include: sorting all MDS services and sorting all ES services, and allocating each ES service to the MDS service at the same sort position to complete an initial allocation; then allocating the ES services left over after the initial allocation again in MDS order, until all ES services have been allocated, so that each ES service corresponds to at least one MDS service. For example, as shown in FIG. 5, when the number of ES services is greater than the number of MDS services, the surplus ES services are also distributed evenly across the MDS services: ES0 and ES3 are the candidate ES services of MDS0, and ES1 and ES4 are the candidate ES services of MDS1, but initially MDS0 establishes a connection only with ES0. Load balancing of metadata reporting is thus achieved through the above three allocation methods.
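The three allocation methods can be illustrated with a minimal Python sketch. It assumes that MDS and ES services are both identified by zero-based integers in sort order; the function name `allocate_candidates` and the dictionary layout of the result are hypothetical and are not taken from the application.

```python
from typing import Dict, List


def allocate_candidates(num_mds: int, num_es: int) -> Dict[int, List[int]]:
    """Map each MDS index to its ordered list of candidate ES indices."""
    if num_mds <= 0 or num_es <= 0:
        raise ValueError("both counts must be positive")

    candidates: Dict[int, List[int]] = {mds: [] for mds in range(num_mds)}
    if num_es <= num_mds:
        # First and second methods: ES i goes to MDS i; any MDS left over
        # takes an ES from the start of the ES ordering, in turn.
        for mds in range(num_mds):
            candidates[mds].append(mds % num_es)
    else:
        # Third method: ES i goes to MDS i; leftover ES services are dealt
        # out again in MDS order until none remain.
        for es in range(num_es):
            candidates[es % num_mds].append(es)
    return candidates
```

With three MDS services and five ES services, this sketch gives MDS0 the candidates ES0 and ES3, MDS1 the candidates ES1 and ES4, and MDS2 the candidate ES2, which matches the FIG. 5 example. The first entry of each list plays the role of the ES service that the MDS connects to initially.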
In some embodiments of the present application, before ES services are allocated to each MDS service according to the load balancing allocation method, the method may further include: configuring, through the distributed file system, a communication address for each ES service in the ES cluster, so that each MDS service reports data to the corresponding ES service according to that ES service's communication address. That is, the distributed storage cluster is configured with the address of every ES service in the ES cluster, and the MDS services use these addresses to connect to and communicate with the ES services for metadata reporting. Furthermore, some embodiments of the present application do not depend on any service component, do not affect the business system, and are also of reference value for the various advanced functions of distributed storage.
As can be seen from the above, in some embodiments of the present application, the corresponding load balancing allocation method is determined according to the size relationship between the total number of ES services in the ES cluster and the total number of MDS services in the distributed file system; ES services are allocated to each MDS service according to the load balancing allocation method, and configuration information is generated based on the allocation result, so that each MDS service reports metadata to one of its allocated candidate ES services according to the configuration information. ES services are thus allocated using the load balancing allocation method that fits the current size relationship between the two totals. For the Elasticsearch-based metadata retrieval function of the distributed file system, connecting the MDS services evenly to different ES services balances the metadata reporting load according to conditions such as the number of ES services. That is, ES load balancing is achieved, which maximizes reporting performance.
Some embodiments of the present application disclose a specific metadata reporting method. Referring to FIG. 6, the method may include the following steps:
Step S21: Determine the corresponding load balancing allocation method according to the size relationship between the total number of ES services in the ES cluster and the total number of MDS services in the distributed file system.
Step S22: Allocate ES services to each MDS service according to the load balancing allocation method, and generate configuration information based on the allocation result, so that each MDS service reports metadata to one of its allocated candidate ES services according to the configuration information.
Step S23: If the candidate ES service that currently receives metadata reports from a target MDS service in the distributed file system fails, switch the target MDS to a normally operating ES service, so that the target MDS reports metadata to the normally operating ES service.
In any running system, faults and service anomalies are the norm. In practice, the normal flow accounts for only about 20% of the implementation of a feature, while fault and exception handling accounts for about 80% (the 80/20 rule). Therefore, in metadata reporting for the Elasticsearch-based metadata retrieval function of the distributed file system, that is, while the MDS uploads data to ES, ES faults and anomalies are to be expected. If they are not handled, the metadata reporting service is affected and, ultimately, so are the user's retrieval results: files are not updated in time, and the user retrieves files that do not meet the user's requirements. If the user is performing a delete operation, data may be deleted by mistake, causing irreparable loss. Therefore, in some embodiments of the present application, if the candidate ES service that currently receives metadata reports from an MDS service in the distributed file system fails, the target MDS is switched to a normally operating ES service. That is, when an ES service fails, the MDS actively switches to a normal ES service, implementing failover, so that the target MDS reports metadata to the normally operating ES service.
In some embodiments of the present application, before the candidate ES service that receives metadata reports from the target MDS service in the distributed file system fails, the method may further include: the target MDS reports data to a corresponding candidate ES service according to the configuration information, and determines whether that candidate ES service is faulty from the data reporting result it returns. Specifically, whether an ES service is normal can be determined by sending it an HTTP GET message and examining the response.
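The HTTP GET check can be sketched in Python using only the standard library. The application does not specify the URL, the timeout, or which responses count as normal, so the sketch assumes that any 2xx status within the timeout means the ES service is normal; the function name `is_es_healthy`, the `opener` parameter (included only so the check can be exercised without a live cluster), and the example address are hypothetical.

```python
import http.client
import urllib.request


def is_es_healthy(base_url: str, timeout: float = 2.0,
                  opener=urllib.request.urlopen) -> bool:
    """Return True if an HTTP GET to base_url answers with a 2xx status.

    Assumption: a 2xx response within `timeout` seconds means "normal".
    Connection errors, timeouts, and HTTP error statuses mean "faulty".
    """
    try:
        with opener(base_url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (OSError, ValueError, http.client.HTTPException):
        # urllib.error.URLError and HTTPError are subclasses of OSError;
        # ValueError covers a malformed URL.
        return False
```

A call might look like `is_es_healthy("http://192.0.2.10:9200/")`, where the address stands in for one of the ES communication addresses configured through the distributed file system and 9200 is the default Elasticsearch HTTP port.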
In some embodiments of the present application, switching the target MDS to a normally operating ES service may include: determining, according to the configuration information, whether a normally operating first target ES service exists among all candidate ES services corresponding to the target MDS; determining a reporting-target switching method for the target MDS according to the determination result; and switching the target MDS to a normally operating ES service according to the reporting-target switching method. That is, in some embodiments of the present application, when switching services it is first determined whether any of the candidate ES services corresponding to the target MDS is operating normally, and the specific switching target is selected accordingly.
In some embodiments of the present application, determining the reporting-target switching method for the target MDS according to the determination result, and switching the target MDS to a normally operating ES service according to the reporting-target switching method, may include: if no normally operating first target ES service exists among all candidate ES services corresponding to the target MDS in the configuration information, polling all ES services in the ES cluster until a normally operating ES service is found, and switching the target MDS to that ES service.
In some embodiments of the present application, determining the reporting-target switching method for the target MDS according to the determination result, and switching the target MDS to a normally operating ES service according to the reporting-target switching method, may include: if a normally operating first target ES service exists among all candidate ES services corresponding to the target MDS in the configuration information, switching the target MDS to the first target ES service.
That is, when an ES service fails, reports from its corresponding MDS return errors, and the ES fault is thereby detected. The MDS first tries to select a normal ES service from its candidate ES services. If none of the candidates is normal, it selects from all ES services in order. If none of the ES services is normal, it selects from all ES services again, repeating until a normal ES service is found. No metadata is reported while the selection is in progress. As shown in FIG. 7, according to the load balancing allocation algorithm, the candidate ES services of each MDS are: MDS0: ES0, ES3; MDS1: ES1, ES4; MDS2: ES2. When ES0 fails, MDS0 first selects its other candidate, ES3. If ES3 is also abnormal, MDS0 tries every ES service from ES0 to ES4 one by one. If ES0 to ES4 are all abnormal, it traverses ES0 to ES4 again until a normal ES service is selected.
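The selection order described above can be sketched as follows. The health check is passed in as a callable so that the sketch stays independent of how ES status is actually probed. One deliberate difference from the text: the application describes traversing all ES services repeatedly until a normal one is found, whereas this sketch caps the number of full traversals with a hypothetical `max_rounds` parameter so that it always terminates. The function name `pick_healthy_es` is also hypothetical.

```python
from typing import Callable, Optional, Sequence


def pick_healthy_es(candidates: Sequence[int],
                    all_es: Sequence[int],
                    is_healthy: Callable[[int], bool],
                    max_rounds: int = 3) -> Optional[int]:
    """Choose the ES service an MDS should report to after a fault.

    Candidate ES services are tried first, in order. If none is normal,
    every ES service in the cluster is tried in order, for up to
    `max_rounds` full traversals. Returns None if nothing is normal.
    """
    for es in candidates:
        if is_healthy(es):
            return es
    for _ in range(max_rounds):
        for es in all_es:
            if is_healthy(es):
                return es
    return None
```

A caller that wants the unbounded behaviour described in the text could invoke the function in a loop, pausing between attempts, and resume metadata reporting only once it returns an ES index.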
It can be seen that, while an MDS is reporting metadata, the ES service on a node may become abnormal or the node may lose power. Because ES follows its own redundancy rules, the cluster can still provide service normally when one ES service fails or one ES node goes down. In that situation the MDS should actively select an available ES service in the ES cluster and continue reporting metadata, rather than waiting indefinitely for its assigned ES service to recover, thereby achieving business continuity and high availability. Furthermore, some embodiments of the present application do not depend on any service component, do not affect the business system, and are also of reference value for the various advanced functions of distributed storage.
For the specific processes of steps S21 and S22 above, reference may be made to the corresponding content disclosed in the foregoing embodiments, which is not repeated here.
As can be seen from the above, in some embodiments of the present application, if the candidate ES service that currently receives metadata reports from the target MDS service in the distributed file system fails, the target MDS is switched to a normally operating ES service, so that the target MDS reports metadata to the normally operating ES service. An ES fault affects the MDS metadata reporting process. To keep metadata reporting continuous and timely, and to prevent users from retrieving files that do not meet their requirements, MDS metadata reporting implements automatic failover. This achieves global load balancing and high availability of the distributed system without manual intervention, improves system stability, better assists users with data management, and improves product competitiveness and user satisfaction.
On the basis of the above embodiments, some embodiments of the present application further disclose a specific metadata reporting method. Referring to FIG. 8, the method may include the following steps:
Step S31: Determine the corresponding load balancing allocation method according to the size relationship between the total number of ES services in the ES cluster and the total number of MDS services in the distributed file system.
Step S32: Allocate ES services to each MDS service according to the load balancing allocation method, and generate configuration information based on the allocation result, so that each MDS service reports metadata to one of its allocated candidate ES services according to the configuration information.
Step S33: If the candidate ES service that currently receives metadata reports from a target MDS service in the distributed file system fails, switch the target MDS to a normally operating ES service, so that the target MDS reports metadata to the normally operating ES service.
Step S34: Determine, according to the configuration information, whether the normally operating ES service hits (that is, is one of) the candidate ES services corresponding to the target MDS.
Step S35: According to the hit result, determine whether to perform reporting-target switchback for the normally operating ES service to which the target MDS is currently connected.
That is, the service switch described above may land on one of the candidate ES services corresponding to the target MDS, or on a normal ES service that is not a candidate. If it lands on a candidate ES service, the state after the switch still conforms to the load balancing allocation. If it lands on a normal ES service that is not a candidate, the current assignment no longer conforms to the original load balancing allocation, so a service switchback is required in order to restore load balancing while keeping the system running normally.
In some embodiments of the present application, determining, according to the hit result, whether to perform reporting-target switchback for the normally operating ES service to which the target MDS is currently connected may include: if the normally operating ES service does not hit any candidate ES service corresponding to the target MDS, periodically monitoring the status of all candidate ES services corresponding to the target MDS, and, once a normally operating second target ES service exists among those candidates, switching the target MDS back to the second target ES service.
In some embodiments of the present application, determining, according to the hit result, whether to perform reporting-target switchback for the normally operating ES service to which the target MDS is currently connected may include: if the normally operating ES service hits a candidate ES service corresponding to the target MDS, keeping the current reporting connection. That is, if the MDS has switched to another ES service after an ES fault and is reporting metadata normally, it periodically checks whether the currently connected ES service is one of its candidates. If it is not, the MDS periodically sends messages to its candidate ES services to check whether they have recovered, and switches back to a candidate once it has recovered. For example, as shown in FIG. 9, ES0 fails and MDS0 switches its connection to ES1. While reporting metadata through ES1, MDS0 periodically checks whether ES0 is normal; if ES0 is normal, MDS0 switches back to ES0, thereby achieving failback and load balancing.
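The switchback decision made on each periodic check can be sketched as a small pure function. It assumes the same integer ES identifiers and injected health-check callable as a caller would use for failover; the function name `next_reporting_target` is hypothetical, and the timer that drives the periodic check is left to the caller.

```python
from typing import Callable, Sequence


def next_reporting_target(current: int,
                          candidates: Sequence[int],
                          is_healthy: Callable[[int], bool]) -> int:
    """Decide which ES service the MDS should report to after one check.

    If the current ES service is already a candidate, the connection is
    kept. Otherwise the candidates are probed in order and the first one
    that has recovered is returned; if none has recovered, the current
    (non-candidate) ES service is kept.
    """
    if current in candidates:
        return current
    for es in candidates:
        if is_healthy(es):
            return es
    return current
```

In the FIG. 9 scenario, MDS0 has candidates ES0 and ES3 and is temporarily connected to ES1. Once the check finds ES0 normal again, the function returns ES0 and the MDS switches back.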
It can be seen that the load-balanced, highly available implementation of the distributed file system metadata retrieval function disclosed in some embodiments of the present application proceeds as follows: (1) the distributed file system configures the address of each ES service; (2) the MDS distributes the ES services evenly across the MDS services according to the load balancing algorithm; (3) during MDS reporting, if an ES service fails, the MDS first tries to select an ES service from its candidates and, if that ES service is normal, uses it; (4) if none of the candidate ES services is normal, the MDS selects from all ES services in order, and if a normal ES service is found, uses it to continue metadata reporting; if no ES service is normal, the MDS keeps traversing all ES services until a normal one is selected; (5) if the ES service currently used by the MDS is not a candidate, the MDS periodically checks whether a candidate ES service has returned to normal and, if so, switches back to it. When an ES service fails, preferring the candidate ES services and then falling back to all ES services implements automatic failover and keeps the reporting service continuously available, while periodically checking the status of the candidate ES services implements automatic failback after the fault, thereby achieving dynamic load balancing and high availability. Thus, for the Elasticsearch-based metadata retrieval function of distributed unstructured storage, the MDS implements its own load balancing and high availability functions, which do not depend on any component and are not coupled to the functions of the distributed system. These independent functions keep the metadata reporting service continuous, efficient, and highly available, ensure stable operation of the system, better assist users with data management, and help users extract value from their data efficiently.
For the specific processes of steps S31, S32, and S33 above, reference may be made to the corresponding content disclosed in the foregoing embodiments, which is not repeated here.
As can be seen from the above, in some embodiments of the present application, it is determined according to the configuration information whether the normally operating ES service hits a candidate ES service corresponding to the target MDS, and, according to the hit result, whether to perform reporting-target switchback for the normally operating ES service to which the target MDS is currently connected. The switchback restores load balancing as far as possible while keeping the system running normally. Based on the MDS-to-ES correspondence originally assigned by load balancing, if after an ES fault the MDS has selected an ES service that was not assigned to it by the load balancing algorithm, the MDS monitors in real time when an ES service assigned by load balancing returns to normal and, when it does, switches back to that ES service, thereby achieving global dynamic load balancing.
Correspondingly, some embodiments of the present application further disclose a metadata reporting system, including an ES cluster and a distributed file system, wherein the ES services in the ES cluster and the MDS services in the distributed file system are matched according to a load balancing allocation method; the load balancing allocation method is determined according to the size relationship between the total number of ES services in the ES cluster and the total number of MDS services in the distributed file system; each MDS service is configured to report metadata to a corresponding ES service according to configuration information; and the configuration information is generated based on the allocation result.
Correspondingly, some embodiments of the present application further disclose a metadata reporting apparatus. Referring to FIG. 10, the apparatus includes:
a load balancing allocation method determination module 11, configured to determine the corresponding load balancing allocation method according to the size relationship between the total number of ES services in the ES cluster and the total number of MDS services in the distributed file system;
an allocation module 12, configured to allocate ES services to each MDS service according to the load balancing allocation method, and to generate configuration information based on the allocation result, so that each MDS service reports data to the corresponding ES service according to the configuration information.
As can be seen from the above, in some embodiments of the present application, the corresponding load balancing allocation method is determined according to the size relationship between the total number of ES services in the ES cluster and the total number of MDS services in the distributed file system; ES services are allocated to each MDS service according to the load balancing allocation method, and configuration information is generated based on the allocation result, so that each MDS service reports metadata to one of its allocated candidate ES services according to the configuration information. ES services are thus allocated using the load balancing allocation method that fits the current size relationship between the two totals. For the Elasticsearch-based metadata retrieval function of the distributed file system, connecting the MDS services evenly to different ES services balances the metadata reporting load according to conditions such as the number of ES services. That is, ES load balancing is achieved, which maximizes reporting performance.
In some specific embodiments, the load balancing allocation mode determination module 11 may include:
A first load balancing allocation mode determination unit, configured to determine that the load balancing allocation mode is a first target load balancing allocation mode if the total number of ES services is less than the total number of MDS services.
In some specific embodiments, the allocation module 12 may include:
An initial allocation unit, configured to sort all MDS services and all ES services, and to allocate each ES service to the MDS service at the same sorted position, so as to complete the initial allocation;
A reallocation unit, configured to select ES services one at a time, starting from the beginning of the ES service order, and allocate them to the MDS services that were not allocated an ES service in the initial allocation, so that each MDS service corresponds to at least one ES service.
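The two units above (initial allocation, then reallocation when ES services are fewer than MDS services) can be sketched as follows. The function name `allocate_fewer_es`, the string identifiers, and lexicographic sorting are illustrative assumptions. The text only says ES services are selected "from the beginning in sequence"; wrapping around to the first ES service once the list is exhausted is an assumption of this sketch.

```python
from typing import Dict, List


def allocate_fewer_es(mds_services: List[str], es_services: List[str]) -> Dict[str, List[str]]:
    """Sketch of the first target mode: fewer ES services than MDS services."""
    if not es_services or len(es_services) >= len(mds_services):
        raise ValueError("expects at least one ES service and fewer ES than MDS services")

    mds_sorted = sorted(mds_services)
    es_sorted = sorted(es_services)
    allocation: Dict[str, List[str]] = {}

    # Initial allocation: the ES service at position i goes to the MDS service at position i.
    for position, es in enumerate(es_sorted):
        allocation[mds_sorted[position]] = [es]

    # Reallocation: remaining MDS services take ES services from the head of the
    # ES order, one at a time (wrap-around is an assumption of this sketch).
    for offset, mds in enumerate(mds_sorted[len(es_sorted):]):
        allocation[mds] = [es_sorted[offset % len(es_sorted)]]

    return allocation
```

With two ES services and five MDS services, this sketch leaves every MDS service with exactly one ES service, and no ES service serves more than three MDS services.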
In some specific embodiments, the load balancing allocation mode determination module 11 may include:
A second load balancing allocation mode determination unit, configured to determine that the load balancing allocation mode is a second target load balancing allocation mode if the total number of ES services is equal to the total number of MDS services.
In some specific embodiments, the allocation module 12 may include:
An allocation unit, configured to sort all MDS services and all ES services, and to allocate each ES service to the MDS service at the same sorted position, so that the MDS services and the ES services are in one-to-one correspondence.
In some specific embodiments, the load balancing allocation mode determination module 11 may include:
A third load balancing allocation mode determination unit, configured to determine that the load balancing allocation mode is a third target load balancing allocation mode if the total number of ES services is greater than the total number of MDS services.
In some specific embodiments, the allocation module 12 may include:
An initial allocation unit, configured to sort all MDS services and all ES services, and to allocate each ES service to the MDS service at the same sorted position, so as to complete the initial allocation;
A reallocation unit, configured to allocate the ES services left unallocated after the initial allocation, following the order of the MDS services, until all ES services have been allocated, so that each ES service corresponds to at least one MDS service.
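The third target mode (more ES services than MDS services) can be sketched in the same way. Again, `allocate_more_es` and the identifiers are hypothetical, and cycling through the MDS order more than once when many ES services remain is an assumption; the text only states that the leftover ES services are allocated following the MDS order until none remain.

```python
from typing import Dict, List


def allocate_more_es(mds_services: List[str], es_services: List[str]) -> Dict[str, List[str]]:
    """Sketch of the third target mode: more ES services than MDS services."""
    if not mds_services or len(es_services) <= len(mds_services):
        raise ValueError("expects at least one MDS service and more ES than MDS services")

    mds_sorted = sorted(mds_services)
    es_sorted = sorted(es_services)
    allocation: Dict[str, List[str]] = {mds: [] for mds in mds_sorted}

    # Positions below len(mds_sorted) form the initial, position-matched allocation.
    # Later positions are the reallocation pass, which walks the MDS order again
    # (repeated cycling is an assumption of this sketch).
    for position, es in enumerate(es_sorted):
        allocation[mds_sorted[position % len(mds_sorted)]].append(es)

    return allocation
```

Each list in the result holds the candidate ES services for one MDS service, which is the kind of mapping the configuration information would need to carry.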
In some specific embodiments, the metadata reporting apparatus may include:
An address sending unit, configured to configure, through the distributed file system, a communication address for each ES service in the ES cluster, so that the MDS service reports data to the corresponding ES service according to that ES service's communication address.
In some specific embodiments, the metadata reporting apparatus may include:
A service switching unit, configured to switch the target MDS to a normally operating ES service if the candidate ES service that currently receives metadata reports from a target MDS service in the distributed file system fails, so that the target MDS reports metadata to the normally operating ES service.
In some specific embodiments, the metadata reporting apparatus may include:
A fault determination unit, configured for the target MDS to report data to a corresponding candidate ES service according to the configuration information, and to determine whether that candidate ES service is faulty according to the data reporting result it returns.
In some specific embodiments, the service switching unit may include:
A candidate service determination unit, configured to determine, according to the configuration information, whether a normally operating first target ES service exists among all candidate ES services corresponding to the target MDS;
An ES service switching unit, configured to determine a reporting target switching mode for the target MDS according to the determination result, and to switch the target MDS to a normally operating ES service according to that switching mode.
In some specific embodiments, the ES service switching unit may include:
A first switching unit, configured to, if no normally operating first target ES service exists among the candidate ES services corresponding to the target MDS in the configuration information, poll all ES services in the ES cluster until a normally operating ES service is found, and switch the target MDS to that ES service.
In some specific embodiments, the ES service switching unit may include:
A second switching unit, configured to switch the target MDS to the first target ES service if a normally operating first target ES service exists among the candidate ES services corresponding to the target MDS in the configuration information.
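The two switching units can be read as a two-stage lookup: prefer a healthy candidate from the configuration information, and fall back to polling the whole cluster only when no candidate is healthy. The sketch below assumes a caller-supplied `is_healthy` predicate, since the text does not say how health is probed; the name `choose_failover_target` is hypothetical. The text polls the cluster "until" a healthy ES service is found, whereas this sketch makes a single pass and returns `None` if nothing is healthy, leaving any retry loop to the caller.

```python
from typing import Callable, Optional, Sequence


def choose_failover_target(
    candidates: Sequence[str],
    cluster: Sequence[str],
    is_healthy: Callable[[str], bool],
) -> Optional[str]:
    """Return the ES service a target MDS should switch to, or None if none is healthy."""
    # Second switching unit: a normally operating candidate from the configuration wins.
    for es in candidates:
        if is_healthy(es):
            return es
    # First switching unit: otherwise poll every ES service in the cluster in order.
    for es in cluster:
        if is_healthy(es):
            return es
    return None
```

For instance, if an MDS has candidates `es1` and `es2`, both down, and `es3` is the only healthy service in the cluster, the sketch returns `es3`.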
In some specific embodiments, the metadata reporting apparatus may include:
A hit determination unit, configured to determine, according to the configuration information, whether the normally operating ES service hits a candidate ES service corresponding to the target MDS;
A switch-back determination unit, configured to determine, according to the hit result, whether to switch the reporting target of the target MDS back from the normally operating ES service it currently reports to.
In some specific embodiments, the switch-back determination unit may include:
A service switch-back unit, configured to, if the normally operating ES service does not hit a candidate ES service corresponding to the target MDS, periodically monitor the status of all candidate ES services corresponding to the target MDS and, once a normally operating second target ES service exists among them, switch the target MDS back to the second target ES service.
In some specific embodiments, the switch-back determination unit may include:
A service retention unit, configured to retain the current reporting connection if the normally operating ES service hits a candidate ES service corresponding to the target MDS.
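The hit check and the switch-back behaviour can be sketched as one step of a periodic monitor. This is an illustration under assumptions: `switchback_step` is a hypothetical name, "hit" is interpreted as the current ES service being one of the configured candidates, and the health probe is again a caller-supplied predicate. The timer that would call this step at intervals is omitted.

```python
from typing import Callable, Sequence


def switchback_step(
    current: str,
    candidates: Sequence[str],
    is_healthy: Callable[[str], bool],
) -> str:
    """Return the ES service the target MDS should report to after one monitoring tick."""
    # Service retention unit: the current ES service is a configured candidate (a hit),
    # so the existing reporting connection is kept.
    if current in candidates:
        return current
    # Service switch-back unit: the MDS is on a non-candidate ES service, so switch
    # back as soon as any configured candidate is operating normally again.
    for es in candidates:
        if is_healthy(es):
            return es
    # No candidate has recovered yet; stay on the current ES service until the next tick.
    return current
```

Calling this repeatedly on a timer moves a displaced MDS back to its configured ES service once that service recovers, which restores the balanced allocation.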
Further, some embodiments of the present application also disclose an electronic device, shown in FIG. 11. The content of the figure shall not be regarded as limiting the scope of use of the present application in any way.
FIG. 11 is a schematic structural diagram of an electronic device 20 provided in some embodiments of the present application. The electronic device 20 may include at least one processor 21, at least one memory 22, a power supply 23, a communication interface 24, an input/output interface 25, and a communication bus 26. The memory 22 stores a computer program, which is loaded and executed by the processor 21 to implement the relevant steps of the metadata reporting method disclosed in any of the foregoing embodiments.
In some embodiments of the present application, the power supply 23 provides the operating voltage for each hardware device on the electronic device 20; the communication interface 24 creates a data transmission channel between the electronic device 20 and external devices, and may follow any communication protocol applicable to the technical solution of the present application, which is not specifically limited here; the input/output interface 25 obtains external input data or outputs data externally, and its specific interface type may be selected according to the needs of the specific application, which is not specifically limited here.
In addition, the memory 22, as a carrier for resource storage, may be a read-only memory, a random access memory, a magnetic disk, an optical disc, or the like. The resources stored on it include an operating system 221, a computer program 222, and data 223 including the configuration information, and the storage may be transient or permanent.
The operating system 221 manages and controls the hardware devices on the electronic device 20 and the computer program 222, so that the processor 21 can operate on and process the massive data 223 in the memory 22; it may be Windows Server, Netware, Unix, Linux, or the like. In addition to the computer program that carries out the metadata reporting method performed by the electronic device 20 as disclosed in any of the foregoing embodiments, the computer program 222 may further include computer programs for other specific tasks.
Further, some embodiments of the present application also disclose a non-volatile readable storage medium storing computer-executable instructions which, when loaded and executed by a processor, implement the steps of the metadata reporting method disclosed in any of the foregoing embodiments.
The embodiments in this specification are described in a progressive manner. Each embodiment focuses on its differences from the other embodiments, and the same or similar parts of the embodiments may be cross-referenced. Because the apparatus disclosed in the embodiments corresponds to the method disclosed in the embodiments, its description is relatively brief; for the relevant details, refer to the description of the method.
The steps of the method or algorithm described in connection with the embodiments disclosed herein may be implemented directly in hardware, in a software module executed by a processor, or in a combination of the two. The software module may reside in random access memory (RAM), internal memory, read-only memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, a register, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art.
Finally, it should be noted that, in this document, relational terms such as "first" and "second" are used only to distinguish one entity or operation from another, and do not necessarily require or imply any actual relationship or order between those entities or operations. Moreover, the terms "include", "comprise", and any variants thereof are intended to cover non-exclusive inclusion, so that a process, method, article, or device that includes a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, method, article, or device. Without further limitation, an element defined by the phrase "including a ..." does not exclude the presence of additional identical elements in the process, method, article, or device that includes the element.
The metadata reporting method, apparatus, device, and medium provided by the present application have been described in detail above. Specific examples are used herein to explain the principles and implementations of the present application, and the description of the above embodiments is intended only to help in understanding the method of the present application and its core idea. A person of ordinary skill in the art may, following the idea of the present application, make changes to the specific implementations and the scope of application. In summary, the content of this specification shall not be construed as limiting the present application.

Claims (20)

  1. A metadata reporting method, applied to an ES-based metadata retrieval architecture, comprising:
    determining a load balancing allocation mode according to the size relationship between the total number of ES services in an ES cluster and the total number of MDS services in a distributed file system; and
    allocating ES services to each of the MDS services according to the load balancing allocation mode, and generating configuration information based on the allocation result, so that the MDS service reports metadata to one allocated candidate ES service according to the configuration information.
  2. The metadata reporting method according to claim 1, wherein determining the load balancing allocation mode according to the size relationship between the total number of ES services in the ES cluster and the total number of MDS services in the distributed file system comprises:
    if the total number of ES services is less than the total number of MDS services, the load balancing allocation mode is a first target load balancing allocation mode.
  3. The metadata reporting method according to claim 2, wherein allocating ES services to each of the MDS services according to the load balancing allocation mode comprises:
    sorting all the MDS services and all the ES services, and allocating each ES service to the MDS service at the same sorted position, so as to complete an initial allocation; and
    selecting ES services one at a time, starting from the beginning of the ES service order, and allocating them to the MDS services that were not allocated an ES service in the initial allocation, so that each of the MDS services corresponds to at least one of the ES services.
  4. The metadata reporting method according to claim 1, wherein determining the load balancing allocation mode according to the size relationship between the total number of ES services in the ES cluster and the total number of MDS services in the distributed file system comprises:
    if the total number of ES services is equal to the total number of MDS services, the load balancing allocation mode is a second target load balancing allocation mode.
  5. The metadata reporting method according to claim 4, wherein allocating ES services to each of the MDS services according to the load balancing allocation mode comprises:
    sorting all the MDS services and all the ES services, and allocating each ES service to the MDS service at the same sorted position, so that the MDS services and the ES services are in one-to-one correspondence.
  6. The metadata reporting method according to claim 1, wherein determining the load balancing allocation mode according to the size relationship between the total number of ES services in the ES cluster and the total number of MDS services in the distributed file system comprises:
    if the total number of ES services is greater than the total number of MDS services, the load balancing allocation mode is a third target load balancing allocation mode.
  7. The metadata reporting method according to claim 6, wherein allocating ES services to each of the MDS services according to the load balancing allocation mode comprises:
    sorting all the MDS services and all the ES services, and allocating each ES service to the MDS service at the same sorted position, so as to complete an initial allocation; and
    allocating the ES services left unallocated after the initial allocation, following the order of the MDS services, until all the ES services have been allocated, so that each of the ES services corresponds to at least one of the MDS services.
  8. The metadata reporting method according to claim 1, wherein, before allocating ES services to each of the MDS services according to the load balancing allocation mode, the method further comprises:
    configuring, through the distributed file system, a communication address for each ES service in the ES cluster, so that the MDS service reports data to the corresponding ES service according to the communication address of that ES service.
  9. The metadata reporting method according to claim 1, further comprising:
    if the candidate ES service that currently receives metadata reports from a target MDS service in the distributed file system fails, switching the target MDS to a normally operating ES service, so that the target MDS reports metadata to the normally operating ES service.
  10. The metadata reporting method according to claim 9, wherein, before the candidate ES service that receives metadata reports from the target MDS service in the distributed file system fails, the method further comprises:
    reporting, by the target MDS, data to a corresponding candidate ES service according to the configuration information, and determining whether that candidate ES service is faulty according to the data reporting result it returns.
  11. The metadata reporting method according to claim 9, wherein switching the target MDS to a normally operating ES service comprises:
    determining, according to the configuration information, whether a normally operating first target ES service exists among all the candidate ES services corresponding to the target MDS; and
    determining a reporting target switching mode for the target MDS according to the determination result, and switching the target MDS to a normally operating ES service according to the reporting target switching mode.
  12. The metadata reporting method according to claim 11, wherein determining the reporting target switching mode for the target MDS according to the determination result, and switching the target MDS to a normally operating ES service according to the reporting target switching mode, comprises:
    if no normally operating first target ES service exists among the candidate ES services corresponding to the target MDS in the configuration information, polling all ES services in the ES cluster until a normally operating ES service is found, and switching the target MDS to that normally operating ES service.
  13. The metadata reporting method according to claim 11, wherein determining the reporting target switching mode for the target MDS according to the determination result, and switching the target MDS to a normally operating ES service according to the reporting target switching mode, comprises:
    if a normally operating first target ES service exists among the candidate ES services corresponding to the target MDS in the configuration information, switching the target MDS to the first target ES service.
  14. The metadata reporting method according to claim 9, further comprising:
    determining, according to the configuration information, whether the normally operating ES service hits a candidate ES service corresponding to the target MDS; and
    determining, according to the hit result, whether to switch the reporting target of the target MDS back from the normally operating ES service it currently reports to.
  15. The metadata reporting method according to claim 14, wherein determining, according to the hit result, whether to switch the reporting target of the target MDS back from the normally operating ES service it currently reports to comprises:
    if the normally operating ES service does not hit a candidate ES service corresponding to the target MDS, periodically monitoring the status of all the candidate ES services corresponding to the target MDS and, once a normally operating second target ES service exists among them, switching the target MDS back to the second target ES service.
  16. The metadata reporting method according to claim 14, wherein determining, according to the hit result, whether to switch the reporting target of the target MDS back from the normally operating ES service it currently reports to comprises:
    if the normally operating ES service hits a candidate ES service corresponding to the target MDS, retaining the current reporting connection.
  17. A metadata reporting system, comprising an ES cluster and a distributed file system;
    wherein the ES services in the ES cluster are allocated to the MDS services in the distributed file system according to a load balancing allocation mode, the load balancing allocation mode being determined according to the size relationship between the total number of ES services in the ES cluster and the total number of MDS services in the distributed file system; and
    the MDS service is configured to report metadata to a corresponding ES service according to configuration information, the configuration information being generated based on the allocation result.
  18. A metadata reporting apparatus, comprising:
    a load balancing allocation mode determination module, configured to determine a load balancing allocation mode according to the size relationship between the total number of ES services in an ES cluster and the total number of MDS services in a distributed file system; and
    an allocation module, configured to allocate ES services to each of the MDS services according to the load balancing allocation mode, and to generate configuration information based on the allocation result, so that the MDS service reports data to the corresponding ES service according to the configuration information.
  19. An electronic device, comprising:
    a memory, configured to store a computer program; and
    a processor, configured to execute the computer program to implement the metadata reporting method according to any one of claims 1 to 16.
  20. A non-volatile readable storage medium, configured to store a computer program, wherein the computer program, when executed by a processor, implements the metadata reporting method according to any one of claims 1 to 16.
PCT/CN2023/108423 2022-11-30 2023-07-20 Metadata reporting method and apparatus, and device and storage medium WO2024113898A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202211518325.XA CN115550368B (en) 2022-11-30 2022-11-30 Metadata reporting method, device, equipment and storage medium
CN202211518325.X 2022-11-30

Publications (1)

Publication Number Publication Date
WO2024113898A1 true WO2024113898A1 (en) 2024-06-06

Family

ID=84722242

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2023/108423 WO2024113898A1 (en) 2022-11-30 2023-07-20 Metadata reporting method and apparatus, and device and storage medium

Country Status (2)

Country Link
CN (1) CN115550368B (en)
WO (1) WO2024113898A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115550368B (en) * 2022-11-30 2023-03-10 苏州浪潮智能科技有限公司 Metadata reporting method, device, equipment and storage medium

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101729357A (en) * 2008-10-14 2010-06-09 华为技术有限公司 Method and device for storage processing and service processing of media files and server cluster
US20120131093A1 (en) * 2010-11-22 2012-05-24 International Business Machines Corporation Connection distribution for load balancing in a distributed database
CN114443573A (en) * 2022-01-17 2022-05-06 苏州浪潮智能科技有限公司 Metadata retrieval method and device, electronic equipment and medium
US20220156262A1 (en) * 2020-11-17 2022-05-19 Microstrategy Incorporated Enahanced data indexing and searching
CN115550368A (en) * 2022-11-30 2022-12-30 苏州浪潮智能科技有限公司 Metadata reporting method, device, equipment and storage medium

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105335250B (en) * 2014-07-28 2018-09-28 浙江大华技术股份有限公司 A kind of data reconstruction method and device based on distributed file system
CN106375420B (en) * 2016-08-31 2020-01-10 宝信软件(武汉)有限公司 Server cluster intelligent monitoring system and method based on load balancing
CN109218355B (en) * 2017-06-30 2021-06-15 华为技术有限公司 Load balancing engine, client, distributed computing system and load balancing method
CN107590249A (en) * 2017-09-18 2018-01-16 郑州云海信息技术有限公司 A kind of balancing method of loads of distributed file system, device and equipment
CN113886841A (en) * 2021-10-27 2022-01-04 中国人民解放军战略支援部队信息工程大学 Credible tracing method for cloud data operation behaviors

Also Published As

Publication number Publication date
CN115550368B (en) 2023-03-10
CN115550368A (en) 2022-12-30
