CN116737486A - Method, device, equipment and medium for determining running state of distributed storage system - Google Patents

Publication number
CN116737486A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210194600.0A
Other languages
Chinese (zh)
Inventor
葛凯凯
陈鹏
罗韩梅
张智
罗维
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/3003 Monitoring arrangements specially adapted to the computing system or computing system component being monitored
    • G06F11/3006 Monitoring arrangements specially adapted to the computing system or computing system component being monitored where the computing system is distributed, e.g. networked systems, clusters, multiprocessor systems
    • G06F11/3055 Monitoring arrangements for monitoring the status of the computing system or of the computing system component, e.g. monitoring if the computing system is on, off, available, not available
    • G06F11/3089 Monitoring arrangements determined by the means or processing involved in sensing the monitored data, e.g. interfaces, connectors, sensors, probes, agents
    • G06F11/3093 Configuration details thereof, e.g. installation, enabling, spatial arrangement of the probes

Abstract

The application discloses a method, an apparatus, a device, and a medium for determining the running state of a distributed storage system, and belongs to the technical field of data storage. The method comprises: determining the existing storage services in a first logical pool according to first management information, wherein the first management information is used for indicating a storage data distribution strategy of the first logical pool where a first storage service is located; determining a second storage service among the existing storage services according to a selection condition, wherein the first storage service is used for detecting the running state of the second storage service; and determining the running state of the second storage service through a status message of the second storage service. Because the second storage service whose running state the first storage service detects is confined to the first logical pool, the network load between different logical pools is effectively reduced; this also avoids receiving, from the first storage service, fault information about storage services outside the logical pool where it is located, and ensures the accuracy of storage service fault information when logical pools are divided.

Description

Method, device, equipment and medium for determining running state of distributed storage system
Technical Field
The present application relates to the field of data storage technologies, and in particular, to a method, an apparatus, a device, and a medium for determining an operating state of a distributed storage system.
Background
With the development of computer technology, the demand for data storage is increasing. Large amounts of computer data need to be stored securely, which places high demands on the reliability of data storage.
In the related art, a distributed storage system generally uses a first storage service to detect the running state of a second storage service through heartbeat signals and to report a failed second storage service to a monitoring node, so that the monitoring node learns the running state of the second storage service, redistributes the stored data accordingly, and ensures the security of the stored computer data.
However, when the monitoring node in the distributed storage system obtains the running state of the second storage service from the first storage service, the first storage service may make erroneous reports; how to ensure that the running state of the distributed storage system is determined accurately is therefore a problem to be solved.
Disclosure of Invention
The application provides a method, a device, equipment and a medium for determining the running state of a distributed storage system, wherein the technical scheme is as follows:
According to an aspect of the present application, there is provided an operation state determining method of a distributed storage system, where the distributed storage system includes a first logical pool, the first logical pool includes a first storage service, the first storage service corresponds to a first storage unit, and the first storage service is used for performing storage management on the first storage unit in the distributed storage system; the method is performed by a first storage service, the method comprising:
determining the existing storage services in the first logical pool according to first management information, wherein the first management information is used for indicating a storage data distribution strategy of the first logical pool where the first storage service is located, and the first management information carries information of all storage services of the first logical pool;
determining a second storage service among the existing storage services according to a selection condition, wherein the first storage service is used for detecting the running state of the second storage service;
and determining the running state of the second storage service through a status message of the second storage service, wherein the running state is used for indicating whether the second storage service has failed.
According to another aspect of the present application, there is provided an operation state determining apparatus of a distributed storage system, the distributed storage system including a first logical pool including a first storage service corresponding to a first storage unit, the first storage service being used for performing storage management on the first storage unit in the distributed storage system; the apparatus is executed by a first storage service, the apparatus comprising:
the processing module is used for determining the existing storage services in the first logical pool according to first management information, wherein the first management information is used for indicating a storage data distribution strategy of the first logical pool where the first storage service is located, and the first management information carries information of all storage services of the first logical pool;
the selection module is used for determining a second storage service among the existing storage services according to a selection condition, wherein the first storage service is used for detecting the running state of the second storage service;
the detection module is used for determining the running state of the second storage service through a status message of the second storage service, wherein the running state is used for indicating whether the second storage service has failed.
In an alternative design of the application, the storage services of the first logical pool have sequence numbers for ordering;
the selection module is further configured to:
determining an existing storage service satisfying the selection condition as the second storage service;
wherein the selection condition includes: the sequence number of the existing storage service is among the a sequence numbers preceding the sequence number of the first storage service, and/or the sequence number of the existing storage service is among the b sequence numbers following the sequence number of the first storage service; a and b are both positive integers and are preconfigured.
In an alternative design of the application, the first storage service belongs to a first storage node in the first logical pool, and the storage services of the first logical pool have sequence numbers for ordering;
the selection module is further configured to:
determining, according to a node selection condition, a second storage node among the existing storage nodes to which the existing storage services belong, wherein the existing storage nodes are determined according to the node to which each existing storage service in the first logical pool belongs;
and determining the second storage service in the second storage node according to a service selection condition.
In an alternative design of the application, the selection module is further configured to:
determining, in the second storage node, a storage service that satisfies the service selection condition as the second storage service;
wherein the service selection condition includes: the second storage service has the same relative location in the second storage node as the first storage service in the first storage node.
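The service selection condition above can be illustrated with a minimal Python sketch. It assumes that "relative location" means the index of a storage service within its node's ordered service list; the source does not define the term further, and the mapping `services_by_node`, the function name, and the identifiers are hypothetical.

```python
from typing import Dict, List, Optional


def pick_peer_service(
    services_by_node: Dict[str, List[str]],
    first_node: str,
    first_service: str,
    second_node: str,
) -> Optional[str]:
    """Return the service in second_node that sits at the same index as
    first_service does in first_node, or None if that slot does not exist."""
    # Assumption: relative location == list index inside the node.
    position = services_by_node[first_node].index(first_service)
    candidates = services_by_node[second_node]
    if position < len(candidates):
        return candidates[position]
    return None


# Hypothetical identifiers loosely following the reference numerals of FIG. 1.
services_by_node = {
    "node-110": ["svc-111a", "svc-112a"],
    "node-120": ["svc-121a", "svc-122a"],
}
peer = pick_peer_service(services_by_node, "node-110", "svc-112a", "node-120")
print(peer)
```

With this layout, the second service of the first node is paired with the second service of the second node; when the second node has fewer services, the sketch simply returns no peer.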
In an alternative design of the application, the device further comprises:
and the sending module is used for sending the fault information of the second storage service to a monitoring node in the distributed storage system under the condition that the running state of the second storage service is a fault state, wherein the monitoring node is used for managing the running state of the storage service in the distributed storage system.
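As a rough illustration of the sending module, the following Python sketch forwards fault information to a monitoring node only when the detected running state is a fault state. The `MonitorNode` class, its `receive_failure` method, and the string state values are hypothetical stand-ins and do not represent any real monitoring interface.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class MonitorNode:
    """Hypothetical stand-in for the monitoring node; it only records reports."""
    failure_reports: List[Dict[str, str]] = field(default_factory=list)

    def receive_failure(self, reporter: str, failed: str) -> None:
        self.failure_reports.append({"reporter": reporter, "failed": failed})


def report_if_failed(monitor: MonitorNode, reporter: str, peer: str, state: str) -> bool:
    """Send fault information only when the peer's running state is a fault state."""
    if state != "fault":
        return False
    monitor.receive_failure(reporter, peer)
    return True


monitor = MonitorNode()
report_if_failed(monitor, "svc-111a", "svc-121a", "fault")   # reported
report_if_failed(monitor, "svc-111a", "svc-122a", "normal")  # not reported
print(monitor.failure_reports)
```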
In an alternative design of the application, the status message is sent by the second storage service upon receipt of a request message sent by the first storage service;
the detection module is also used for:
determining that the running state of the second storage service is a normal state in the case that the time interval between the status message and the request message is less than a target threshold;
and determining that the running state of the second storage service is a fault state in the case that the time interval between the status message and the request message is greater than the target threshold.
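The threshold comparison performed by the detection module can be sketched in Python as follows. The text specifies only the "less than" and "greater than" cases; treating an interval exactly equal to the threshold, or a missing reply, as a fault is an assumption of this sketch, and all names are hypothetical.

```python
from enum import Enum
from typing import Optional


class RunState(Enum):
    NORMAL = "normal"
    FAULT = "fault"


def classify_peer(
    request_sent_at: float,
    status_received_at: Optional[float],
    target_threshold: float,
) -> RunState:
    """Classify the peer from the delay between request and status message."""
    if status_received_at is None:
        # Assumption: no status message at all counts as a fault.
        return RunState.FAULT
    interval = status_received_at - request_sent_at
    # Assumption: an interval exactly equal to the threshold counts as a fault.
    return RunState.NORMAL if interval < target_threshold else RunState.FAULT


print(classify_peer(0.0, 0.5, 1.0))
print(classify_peer(0.0, 2.0, 1.0))
```

Timestamps here are plain floats in seconds; a real implementation would more likely use a monotonic clock so that wall-clock adjustments do not distort the interval.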
In an alternative design of the application, the processing module is further configured to:
determining a corresponding first root node according to the first storage service, wherein the first root node is the root node of the first logical pool where the first storage service is located;
and determining the first management information according to the first root node, wherein the first root node is the starting node indicated by the first management information.
In an alternative design of the application, the processing module is further configured to:
performing reverse retrieval on the first storage service to determine the first storage node to which the first storage service belongs;
and performing reverse retrieval on the first storage node to determine the first root node corresponding to the first storage node, wherein the first storage node belongs to the first logical pool.
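The two reverse-retrieval steps above can be pictured as walking a child-to-parent map upward. The Python sketch below assumes the storage hierarchy is available as such a map; `parent_of` and the identifiers in it are hypothetical, and the loop also tolerates intermediate levels between the storage node and the root, which the text does not mention.

```python
from typing import Dict, Optional, Tuple

# Hypothetical child -> parent map; the root of a logical pool has no parent.
parent_of: Dict[str, Optional[str]] = {
    "svc-111a": "node-110",
    "svc-112a": "node-110",
    "svc-121a": "node-120",
    "node-110": "pool-100-root",
    "node-120": "pool-100-root",
    "pool-100-root": None,
}


def find_node_and_root(
    parent_of: Dict[str, Optional[str]], storage_service: str
) -> Tuple[str, str]:
    """Reverse-retrieve the storage node of a service, then the pool root."""
    storage_node = parent_of[storage_service]
    if storage_node is None:
        raise ValueError(f"{storage_service} has no parent storage node")
    # Climb from the storage node until an item without a parent is reached.
    root = storage_node
    while parent_of.get(root) is not None:
        root = parent_of[root]
    return storage_node, root


print(find_node_and_root(parent_of, "svc-111a"))
```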
In an alternative design of the application, the device further comprises:
and the acquisition module is used for acquiring the first management information from the distributed storage system, wherein the first management information carries information of all storage services of the first logical pool.
In an alternative design of the application, the processing module is further configured to:
determining all storage services of the first logical pool according to the storage data distribution strategy of the first logical pool indicated by the first management information;
and determining the existing storage services according to all the storage services of the first logical pool.
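One way to picture these two steps is to collect every leaf beneath the pool's root and then filter that list. The Python sketch below assumes the distribution strategy can be read as a parent-to-children map (`children_of`) and that the existing storage services are those found in a hypothetical `present` set; the source does not state how existing services are derived from all services, so the filter is purely illustrative.

```python
from typing import Dict, List, Set

# Hypothetical parent -> children map for one logical pool.
children_of: Dict[str, List[str]] = {
    "pool-100-root": ["node-110", "node-120"],
    "node-110": ["svc-111a", "svc-112a"],
    "node-120": ["svc-121a", "svc-122a"],
}


def all_services(children_of: Dict[str, List[str]], root: str) -> List[str]:
    """Collect the leaves under root in left-to-right order.
    Anything without listed children is treated as a storage service."""
    leaves: List[str] = []
    stack = [root]
    while stack:
        item = stack.pop()
        kids = children_of.get(item)
        if kids:
            stack.extend(reversed(kids))  # keep left-to-right visiting order
        else:
            leaves.append(item)
    return leaves


def existing_services(services: List[str], present: Set[str]) -> List[str]:
    """Assumption: a service is 'existing' when it appears in the present set."""
    return [s for s in services if s in present]


services = all_services(children_of, "pool-100-root")
print(services)
print(existing_services(services, {"svc-111a", "svc-121a", "svc-122a"}))
```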
According to another aspect of the present application, there is provided a computer device comprising a processor and a memory having stored therein at least one instruction, at least one program, a set of codes or a set of instructions, the at least one instruction, the at least one program, the set of codes or the set of instructions being loaded and executed by the processor to implement the method of operating state determination of a distributed storage system as described in the above aspect.
According to another aspect of the present application, there is provided a computer readable storage medium having stored therein at least one instruction, at least one program, a set of codes or a set of instructions, the at least one instruction, the at least one program, the set of codes or the set of instructions being loaded and executed by a processor to implement the method of determining an operating state of a distributed storage system as described in the above aspect.
According to another aspect of the present application, there is provided a computer program product or computer program comprising computer instructions stored in a computer readable storage medium, from which a processor reads and executes the computer instructions to implement the method of determining the operating state of a distributed storage system as described in the above aspects.
The technical solutions provided by the application have at least the following beneficial effects:
the second storage service is determined within the first logical pool where the first storage service is located, so that the second storage service whose running state the first storage service detects is confined to the first logical pool, which effectively reduces the network load between different logical pools; the monitoring node no longer receives, from the first storage service, fault information about storage services outside its logical pool, and obtains through the first storage service only the running states of storage services within that logical pool; erroneous reports by the first storage service are thereby avoided, and the accuracy of storage service fault information when logical pools are divided is ensured.
Drawings
In order to illustrate the technical solutions of the embodiments of the present application more clearly, the drawings required for describing the embodiments are briefly introduced below. The drawings described below show only some embodiments of the present application; a person skilled in the art may obtain other drawings from them without inventive effort.
FIG. 1 is a schematic diagram of a computer system provided in accordance with an exemplary embodiment of the present application;
FIG. 2 is a flow chart of a method of determining an operational status of a distributed storage system according to an exemplary embodiment of the present application;
FIG. 3 is a flow chart of a method of determining an operational status of a distributed storage system according to an exemplary embodiment of the present application;
FIG. 4 is a flow chart of a method of determining an operational status of a distributed storage system according to an exemplary embodiment of the present application;
FIG. 5 is a schematic diagram of a distributed storage system provided by an exemplary embodiment of the present application;
FIG. 6 is a schematic diagram of a distributed storage system provided by an exemplary embodiment of the present application;
FIG. 7 is a schematic diagram of a distributed storage system provided by an exemplary embodiment of the present application;
FIG. 8 is a schematic diagram of a distributed storage system provided by an exemplary embodiment of the present application;
FIG. 9 is a schematic diagram of a distributed storage system provided by an exemplary embodiment of the present application;
FIG. 10 is a schematic diagram of a storage service tree structure of a distributed storage system according to an exemplary embodiment of the present application;
FIG. 11 is a schematic diagram of a distributed storage system provided by an exemplary embodiment of the present application;
FIG. 12 is a flowchart of a method for determining an operational status of a distributed storage system according to an exemplary embodiment of the present application;
FIG. 13 is a schematic view of a storage service tree structure of a distributed storage system according to an exemplary embodiment of the present application;
FIG. 14 is a flowchart of a method for determining an operational status of a distributed storage system according to an exemplary embodiment of the present application;
FIG. 15 is a flowchart of a method for determining an operational status of a distributed storage system according to an exemplary embodiment of the present application;
FIG. 16 is a schematic diagram of heartbeat signal detection as provided by an exemplary embodiment of the present application;
FIG. 17 is a flowchart of a method for determining an operational status of a distributed storage system according to an exemplary embodiment of the present application;
FIG. 18 is a block diagram of an operational status determination apparatus of a distributed storage system according to an exemplary embodiment of the present application;
FIG. 19 is a block diagram of a server according to an exemplary embodiment of the present application.
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the application and together with the description, serve to explain the principles of the application.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the present application more apparent, the embodiments of the present application will be described in further detail with reference to the accompanying drawings.
Reference will now be made in detail to exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, the same numbers in different drawings refer to the same or similar elements, unless otherwise indicated. The implementations described in the following exemplary examples do not represent all implementations consistent with the application. Rather, they are merely examples of apparatus and methods consistent with aspects of the application as detailed in the accompanying claims.
The terminology used in the present disclosure is for the purpose of describing particular embodiments only and is not intended to be limiting of the disclosure. As used in this disclosure and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used herein refers to and encompasses any or all possible combinations of one or more of the associated listed items.
It should be noted that, the user information (including but not limited to user equipment information, user personal information, etc.) and the data (including but not limited to data for analysis, stored data, presented data, etc.) related to the present application are information and data authorized by the user or sufficiently authorized by each party, and the collection, use and processing of the related data need to comply with the related laws and regulations and standards of the related country and region. For example, the information such as the first management information in the present application is acquired under a sufficient authorization.
It should be understood that, although the terms first, second, etc. may be used in this disclosure to describe various information, such information should not be limited by these terms. These terms are only used to distinguish one type of information from another. For example, a first parameter may also be referred to as a second parameter, and similarly, a second parameter may also be referred to as a first parameter, without departing from the scope of the present disclosure. The word "if" as used herein may be interpreted as "when", "upon", or "in response to determining", depending on the context.
FIG. 1 illustrates a schematic diagram of a computer system provided by an exemplary embodiment of the present application. The computer system may serve as the system architecture for implementing a method of acquiring storage system failure information. The computer system may be implemented as a distributed storage system and comprises a first logical pool 100 and a second logical pool 200.
A distributed storage system supports horizontal scaling: as more and more data is stored during use, storage units and/or storage nodes can be added continually to expand capacity. The Ceph file system is a typical distributed storage system, but this does not exclude the case where the distributed storage system is the Hadoop Distributed File System (HDFS), the Gluster file system, or another file system.
The first logical pool 100 includes at least one storage node. A storage node may be an independent physical server, a server cluster or distributed system formed by multiple physical servers, a disk cabinet, or another computer device capable of implementing a storage function. FIG. 1 shows, as an example, the case in which the first logical pool 100 includes a first storage node 110 and a second storage node 120, and the second logical pool 200 includes a third storage node 210. Those skilled in the art will appreciate that only two logical pools are shown in FIG. 1, but in different embodiments there may be further logical pools depending on the functional or logical requirements of the computer system; like the first logical pool 100, the second logical pool 200 and the other logical pools each include at least one storage node.
A storage node comprises at least one storage service. A storage service manages storage units in the computer system, performing at least one of allocating, writing, and releasing storage space. Each storage service corresponds to a storage unit; storage units and storage services generally correspond one to one, but the cases where one storage service manages a plurality of storage units, or one storage unit is managed by a plurality of storage services, are not excluded. A storage unit may be a memory and/or a magnetic disk, where the magnetic disk includes but is not limited to at least one of a hard disk, a floppy disk, and an optical disk; a storage unit may also be another computer device capable of implementing a storage function.
FIG. 1 shows that the first storage node 110 includes a first storage service 111a and a second storage service 112a, which correspond to the first storage unit 111 and the second storage unit 112, respectively; the first storage service 111a and the second storage service 112a provide storage management services for the first storage unit 111 and the second storage unit 112, respectively.
The second storage node 120 includes a third storage service 121a and a fourth storage service 122a, and the third storage service 121a and the fourth storage service 122a correspond to a third storage unit 121 and a fourth storage unit 122, respectively; the third storage service 121a and the fourth storage service 122a provide management services for the third storage unit 121 and the fourth storage unit 122, respectively.
The third storage node 210 includes a fifth storage service 211a and a sixth storage service 212a, and the fifth storage service 211a and the sixth storage service 212a correspond to the fifth storage unit 211 and the sixth storage unit 212, respectively; the fifth storage service 211a and the sixth storage service 212a provide management services for the fifth storage unit 211 and the sixth storage unit 212, respectively.
It should be noted that the ordinal numbers preceding the logical pools, storage nodes, storage units, and storage services in the above description are used only to distinguish them and may be interchanged; in a particular example, any of the storage units described above may be referred to as the first storage unit.
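The example layout of FIG. 1 described above can be written down as a nested mapping, which makes the pool, node, service, and unit relationships explicit. This Python sketch is only an illustration; the string identifiers are hypothetical labels derived from the reference numerals, and the one-to-one mapping of services to units follows the example rather than the general case.

```python
from typing import Dict, List, Optional

# pool -> node -> {storage service: storage unit it manages}
Topology = Dict[str, Dict[str, Dict[str, str]]]

topology: Topology = {
    "pool-100": {
        "node-110": {"svc-111a": "unit-111", "svc-112a": "unit-112"},
        "node-120": {"svc-121a": "unit-121", "svc-122a": "unit-122"},
    },
    "pool-200": {
        "node-210": {"svc-211a": "unit-211", "svc-212a": "unit-212"},
    },
}


def services_in_pool(topology: Topology, pool: str) -> List[str]:
    """List every storage service that belongs to the given logical pool."""
    return [svc for node in topology[pool].values() for svc in node]


def pool_of(topology: Topology, service: str) -> Optional[str]:
    """Find the logical pool that contains the given storage service."""
    for pool, nodes in topology.items():
        if any(service in services for services in nodes.values()):
            return pool
    return None


print(services_in_pool(topology, "pool-100"))
print(pool_of(topology, "svc-211a"))
```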
The computer system also comprises a monitoring node 10, which is used for managing the state information of the computer system; the monitoring node 10 stores the distribution of the storage services in the computer system and manages the running states of the storage services. In an alternative implementation, the monitoring node 10 may be a storage node in the computer system, that is, the storage node may implement both a data storage function and a monitoring function for the computer system; the monitoring node 10 may also be a node other than a storage node in the computer system. Those skilled in the art will appreciate that there may be one or more monitoring nodes 10 in a computer system, and the present application is not limited in this regard.
FIG. 2 is a flowchart illustrating a method for determining an operation state of a distributed storage system according to an exemplary embodiment of the present application. The method may be performed by a first storage service. The method comprises the following steps:
Step 510: determining the existing storage services in the first logical pool according to the first management information;
in the distributed storage system, the first storage service corresponds to a first storage unit and is used for performing storage management on the first storage unit. The first storage service may be any one or more storage services in the distributed storage system; that is, any storage service in the distributed storage system may have the capability of determining the running state of a second storage service, and a plurality of storage services may have this capability at the same time. Further, there may be one or more second storage services.
The first management information is used for indicating the storage data distribution strategy of the first logical pool where the first storage service is located, and carries information of all storage services of the first logical pool. The first management information carries this information directly or indirectly through the indicated data distribution strategy; the information of all storage services of the first logical pool may be obtained from the first management information by calculation, by reverse retrieval, or directly. Illustratively, the management information includes the specific rules of the CRUSH algorithm, namely CRUSH rules (Rule).
Step 520: determining a second storage service in the existing storage services according to the selection condition;
the first storage service is used for detecting the running state of the second storage service; the first storage service usually detects the running state of the second storage service through heartbeat signals, but other modes are not excluded, for example, detecting through a status message periodically reported by the second storage service, or through the first storage service periodically attempting to establish a data transmission channel with the second storage service; this embodiment places no limitation on this.
For example, the first storage service may determine the second storage service directly among the existing storage services according to the selection condition; it may also determine the second storage service indirectly, within an existing storage node that contains existing storage services, according to the node to which each existing storage service belongs; this embodiment places no limitation on this.
Step 530: determining the running state of the second storage service through the status message of the second storage service;
the running state is used for indicating whether the second storage service has failed. For example, in the case that the first storage service detects the running state of the second storage service through heartbeat signals, the status message is the heartbeat reply signal to a heartbeat transmission signal sent by the first storage service to the second storage service, and the first storage service determines that the running state of the second storage service is a failure state in the case that the time interval between the heartbeat reply signal and the heartbeat transmission signal exceeds the heartbeat timeout. As another example, when detecting the running state of the second storage service, the first storage service determines that the running state of the second storage service is a failure state if the status message carries failure state information or satisfies a failure state condition.
The running state of the second storage service may include a normal state and a failure state, where the normal state indicates that the second storage service can provide the management service for the corresponding second storage unit, and the failure state indicates that the second storage service cannot provide the management service for the corresponding second storage unit.
In summary, in the method provided in this embodiment, the second storage service is determined within the first logical pool where the first storage service is located, so that the second storage service whose running state the first storage service detects is confined to the first logical pool, which effectively reduces the network load between different logical pools. The monitoring node no longer receives, from the first storage service, fault information about storage services outside its logical pool, and obtains through the first storage service only the running states of storage services within that logical pool; erroneous reports by the first storage service are thereby avoided, and the accuracy of storage service fault information when logical pools are divided is ensured.
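The flow of steps 510 to 530 can be sketched end to end in Python as follows. This is a minimal sketch under stated assumptions, not the implementation described by the application: the first management information is reduced to a plain list of the pool's service identifiers, the existing storage services are taken to be every listed service except the first storage service itself, and the `select` and `probe` callables are hypothetical placeholders for the selection condition and the status message check.

```python
from typing import Callable, Dict, List


def determine_running_states(
    first_service: str,
    pool_services: List[str],
    select: Callable[[str, List[str]], List[str]],
    probe: Callable[[str], str],
) -> Dict[str, str]:
    """Run steps 510 to 530 for one first storage service."""
    # Step 510: existing storage services of the first logical pool
    # (assumption: every listed service other than the caller itself).
    existing = [s for s in pool_services if s != first_service]
    # Step 520: apply the selection condition to choose the second service(s).
    targets = select(first_service, existing)
    # Step 530: derive each running state from the peer's status message.
    return {target: probe(target) for target in targets}


pool_services = ["svc-111a", "svc-112a", "svc-121a", "svc-122a"]
states = determine_running_states(
    "svc-111a",
    pool_services,
    select=lambda me, existing: existing[:2],  # hypothetical condition
    probe=lambda svc: "failure" if svc == "svc-121a" else "normal",
)
print(states)
```

Because the candidate list is built only from the first logical pool, services in other pools can never be selected or reported on, which is the property the summary above relies on.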
Next, the process of determining the second storage service among the existing storage services according to the selection condition is described in detail. In the exemplary embodiments provided by the present application, there are at least two implementations of this process, that is, at least two implementations of step 520 in the embodiment illustrated in FIG. 2:
Implementation one: directly determining the second storage service among the existing storage services;
Implementation two: determining a second storage node among the existing storage nodes to which the existing storage services belong, and then determining the second storage service in the second storage node;
Specifically, the two implementations are described in detail with reference to the flowcharts of the running state determining method of the distributed storage system shown in FIG. 3 and FIG. 4.
Implementation one: directly determining the second storage service among the existing storage services;
fig. 3 is a flowchart illustrating a method for determining an operation state of a distributed storage system according to an exemplary embodiment of the present application. The method may be performed by a first storage service. That is, based on the embodiment shown in fig. 2, step 520 may be implemented as the following steps:
step 522: determining an existing storage service satisfying the selection condition as a second storage service;
the storage services of the first logical pool have sequence numbers used for ordering; for example, the sequence number of a storage service may be generated based on at least one of the storage unit space, the storage service state, the online time, the storage data attribute, or other dimensions corresponding to the storage service, and this embodiment does not limit the basis on which the sequence numbers are generated.
The selection condition includes: the sequence number of the existing storage service is among the a sequence numbers before the sequence number of the first storage service, and/or the sequence number of the existing storage service is among the b sequence numbers after the sequence number of the first storage service; a and b are preconfigured positive integers; optionally, a is less than the minimum value of the sequence number of the first storage service minus the sequence number of the existing storage service; optionally, b is less than the maximum value of the sequence number of the existing storage service minus the sequence number of the first storage service.
For example, the sequence number of the first storage service is 5 and a is 2; the two existing storage services with sequence numbers 3 and 4, which are the two sequence numbers before sequence number 5, are determined as second storage services. When fewer than a existing storage services have sequence numbers before that of the first storage service, the shortfall is made up with existing storage services whose sequence numbers come after that of the first storage service. For example, the first logical pool includes six storage services with sequence numbers 1 to 6, the sequence number of the first storage service is 3, and the existing storage services corresponding to the three sequence numbers before sequence number 3 are to be determined as second storage services. In this case only two existing storage services, those with sequence numbers 1 and 2, precede sequence number 3, so one existing storage service with a later sequence number is added. The existing storage service with the next sequence number after that of the first storage service, namely sequence number 4, may be determined as the second storage service; alternatively, the sequence numbers may be treated as cyclic, so that sequence number 6 is regarded as preceding sequence number 1, and the existing storage service with sequence number 6 is determined as the second storage service. This embodiment does not limit this.
It should be noted that the above is merely an exemplary description of the case in which the selection condition is that the sequence number of the existing storage service is among the a sequence numbers before the sequence number of the first storage service. Those skilled in the art will understand that similar examples can be obtained for the case in which the selection condition is that the sequence number of the existing storage service is among the b sequence numbers after the sequence number of the first storage service, and for the case in which the selection condition includes both.
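The selection by sequence number described above can be sketched as follows. This is a minimal illustration, not the claimed implementation: the function name `select_by_sequence` and its parameters are hypothetical, and the sketch assumes the cyclic replenishing option (the largest sequence number is treated as preceding the smallest), which is only one of the options the embodiment allows.

```python
def select_by_sequence(first_seq, existing_seqs, a=0, b=0):
    """Pick up to `a` predecessors and `b` successors of `first_seq`.

    `existing_seqs` holds the sequence numbers of the existing storage
    services in the same logical pool. Assumption: the numbers form a
    ring, so counting wraps around at both ends.
    """
    ring = sorted(set(existing_seqs) | {first_seq})
    pos = ring.index(first_seq)
    n = len(ring)
    selected = []
    # Walk backwards for predecessors, wrapping past the smallest number.
    for step in range(1, a + 1):
        candidate = ring[(pos - step) % n]
        if candidate != first_seq and candidate not in selected:
            selected.append(candidate)
    # Walk forwards for successors, wrapping past the largest number.
    for step in range(1, b + 1):
        candidate = ring[(pos + step) % n]
        if candidate != first_seq and candidate not in selected:
            selected.append(candidate)
    return sorted(selected)


# First example of the text: sequence number 5, two predecessors.
print(select_by_sequence(5, [1, 2, 3, 4, 6], a=2))
# Second example: sequence number 3, three predecessors, cyclic option.
print(select_by_sequence(3, [1, 2, 4, 5, 6], a=3))
```

Under the cyclic assumption the second call replenishes with sequence number 6; choosing sequence number 4 instead would require walking forward from the first storage service once the predecessors are exhausted.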
In summary, in the method provided in this embodiment, the second storage service is determined directly among the existing storage services of the first logical pool where the first storage service is located, so that the second storage services whose running state the first storage service detects are confined to the first logical pool, which effectively reduces the network load between different logical pools. The monitoring node is thus prevented from receiving, from the first storage service, failure information about storage services outside the logical pool; it obtains through the first storage service only the running state of storage services within the logical pool, which avoids erroneous reports by the first storage service and ensures the accuracy of storage service failure information when logical pools are divided.
Implementation two: determining a second storage node among the existing storage nodes to which the existing storage services belong, and then determining the second storage service;
fig. 4 is a flowchart illustrating a method for determining an operation state of a distributed storage system according to an exemplary embodiment of the present application. The method may be performed by a first storage service. That is, based on the embodiment shown in fig. 2, step 520 may be implemented as the following steps:
step 524: determining a second storage node in the existing storage nodes to which the existing storage service belongs according to the node selection condition;
The first storage service belongs to a first storage node in the first logical pool; illustratively, the first storage node includes at least one storage service. The storage services of the first logical pool have sequence numbers used for ordering; as with step 522 in the embodiment shown in fig. 3 above, this embodiment does not limit the basis on which the sequence numbers are generated.
The existing storage nodes are determined according to which nodes the existing storage services of the first logical pool belong to; it should be noted that, in this embodiment, the existing storage nodes may be determined according to the first management information, or may be obtained by reverse retrieval on the existing storage services, which is not limited in this embodiment;
in this embodiment, the second storage nodes selected by the node selection condition may be some or all of the existing storage nodes; a second storage node may be determined randomly among the existing storage nodes, or may be determined by at least one of the storage space, the storage service state, the number of storage services, the online time, the storage data attribute, or other dimensions of the storage node.
Step 526: determining a second storage service in the second storage node according to the service selection condition.
In this embodiment, the second storage service selected by the service selection condition may be part or all of the storage services in the second storage node; the second storage service may be randomly determined in the second storage node, or may be determined by at least one of a storage unit space, a storage service state, a number of storage services, an online time, a storage data attribute, or other dimensions corresponding to the storage service.
In the second storage node, a storage service satisfying the service selection condition is determined as the second storage service;
for example, the service selection condition includes: the relative position of the second storage service in the second storage node is the same as the relative position of the first storage service in the first storage node. For example, if the first storage service occupies the first position among the storage services of the first storage node, the storage service occupying the first position in the second storage node is determined as the second storage service.
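Steps 524 and 526 can be sketched as follows, as a rough illustration only. The function name `select_by_node`, the `node_count` parameter and the node layout in `NODES` are hypothetical; the sketch assumes a random node selection condition and the same-relative-position service selection condition, which are just two of the options named above.

```python
import random


def select_by_node(first_service, nodes, node_count, rng=random):
    """Return second storage services chosen via second storage nodes.

    `nodes` maps a storage node id to the ordered list of storage
    services it hosts; all nodes belong to the first logical pool.
    """
    # Locate the first storage node and the relative position within it.
    first_node = next(n for n, svcs in nodes.items() if first_service in svcs)
    position = nodes[first_node].index(first_service)
    # Step 524 (assumed condition): a random subset of the other nodes.
    candidates = [n for n in nodes if n != first_node]
    second_nodes = rng.sample(candidates, min(node_count, len(candidates)))
    # Step 526 (assumed condition): same relative position in each node.
    return sorted(nodes[n][position] for n in second_nodes
                  if position < len(nodes[n]))


# Illustrative layout: five storage nodes, two storage services each.
NODES = {1: [0, 7], 2: [1, 8], 3: [2, 9], 4: [3, 10], 5: [4, 11]}
print(select_by_node(1, NODES, node_count=2))
```

Because the node choice is random, repeated calls may return different pairs; passing a seeded `random.Random` instance as `rng` makes the choice reproducible.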
In summary, in the method provided in this embodiment, the second storage service is determined indirectly, through the existing storage nodes of the first logical pool where the first storage service is located, so that the second storage services whose running state the first storage service detects are confined to the first logical pool, which effectively reduces the network load between different logical pools. The monitoring node is thus prevented from receiving, from the first storage service, failure information about storage services outside the logical pool; it obtains through the first storage service only the running state of storage services within the logical pool, which avoids erroneous reports by the first storage service and ensures the accuracy of storage service failure information when logical pools are divided.
Specifically, a procedure of determining a second storage service among existing storage services according to a selection condition will be described next by way of one embodiment:
FIG. 5 illustrates a schematic diagram of a distributed storage system provided by an exemplary embodiment of the present application. The distributed storage system comprises a first logic pool and a second logic pool, wherein the first logic pool comprises five storage nodes, and the sequence numbers of the five storage nodes are sequentially 1 to 5; the second logical pool comprises two storage nodes, and the sequence numbers of the two storage nodes are 6 to 7 in sequence.
The distributed storage system comprises 14 storage services, each storage node comprises two storage services, and the sequence numbers of the 14 storage services are sequentially 0 to 13; wherein, the storage node with the sequence number of 1 comprises the storage services with the sequence numbers of 0 and 7; the storage node with the sequence number of 2 comprises storage services with the sequence numbers of 1 and 8; the storage node with the sequence number 3 comprises storage services with the sequence numbers of 2 and 9; the storage node with the sequence number 4 comprises storage services with the sequence numbers of 3 and 10; the storage node with the sequence number of 5 comprises storage services with the sequence numbers of 4 and 11; the storage node with the sequence number of 6 comprises storage services with the sequence numbers of 5 and 12; the storage node with sequence number 7 comprises storage services with sequence numbers 6, 13.
Illustratively, as shown in FIG. 6, the first storage service is the storage service with sequence number 1. According to the first management information, the first storage service determines the existing storage services of the first logical pool from the distributed storage system; that is, the storage services with sequence numbers 0 to 4 and 7 to 11 are determined as the existing storage services.
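The determination of the existing storage services in this example can be reproduced with a short sketch. The dictionary `POOLS` encodes the layout described for FIG. 5; the names `POOLS`, `pool1`, `pool2` and `services_in_same_pool` are hypothetical and chosen only for illustration.

```python
# Layout described for FIG. 5: pool -> storage node -> storage services.
POOLS = {
    "pool1": {1: [0, 7], 2: [1, 8], 3: [2, 9], 4: [3, 10], 5: [4, 11]},
    "pool2": {6: [5, 12], 7: [6, 13]},
}


def services_in_same_pool(pools, first_service):
    """Return the storage services of the pool holding `first_service`."""
    for nodes in pools.values():
        services = [s for svcs in nodes.values() for s in svcs]
        if first_service in services:
            return sorted(services)
    raise KeyError(f"storage service {first_service} is in no pool")


# Existing storage services seen by the storage service with sequence number 1.
print(services_in_same_pool(POOLS, 1))
```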
The first storage service determines second storage services from the existing storage services according to the selection condition. First, as shown in fig. 7, the first storage service determines the storage services corresponding to the sequence numbers immediately before and after its own as second storage services; that is, the storage services with sequence numbers 0 and 2 are determined as second storage services, the storage service with sequence number 0 being designated the previous one (Pre) and the storage service with sequence number 2 the next one (Next). Next, as shown in fig. 8, second storage nodes are determined among the existing storage nodes to which the existing storage services belong, here by randomly selecting the existing storage nodes with sequence numbers 4 and 5; in each second storage node, the storage service whose relative position is the same as that of the first storage service in the first storage node is determined as a second storage service. The first storage service occupies the first position in the storage node with sequence number 2; accordingly, the storage services occupying the first position in the second storage nodes with sequence numbers 4 and 5, that is, the storage services with sequence numbers 3 and 4, are determined as second storage services. Finally, as shown in fig. 9, the storage services corresponding to five sequence numbers after the sequence number of the first storage service, that is, the storage services with sequence numbers 7 to 11, are determined as second storage services. The first storage service has thus determined the storage services with sequence numbers 0, 2 to 4, and 7 to 11 as second storage services.
In summary, in the method provided in this embodiment, the existing storage services that satisfy the selection conditions are determined in turn as second storage services within the first logical pool where the first storage service is located, so that the second storage services whose running state the first storage service detects are confined to the first logical pool, which effectively reduces the network load between different logical pools. The monitoring node is thus prevented from receiving, from the first storage service, failure information about storage services outside the logical pool; it obtains through the first storage service only the running state of storage services within the logical pool, which avoids erroneous reports by the first storage service and ensures the accuracy of storage service failure information when logical pools are divided.
Next, the first management information will be described by the following embodiments:
when logical pools are divided in the distributed storage system, corresponding management information is specified at the time each logical pool is created; the management information acts as a bridge that links the logical pool with its storage services.
Fig. 10 is a schematic view of a storage service tree structure of a distributed storage system according to an exemplary embodiment of the present application. The management information acts as a bridge between the logical pools and the storage services; when logical pools are divided, the isolation between logical pools is realized by dividing the management information; the management information includes first management information and second management information;
The first management information designates a first root node of the tree structure, wherein the child nodes of the first root node are two storage nodes, and the sequence numbers of the storage nodes are 1 and 2; the storage node with the sequence number of 1 comprises two storage services, and the sequence numbers of the storage services are 0 and 4; the storage node with the sequence number of 2 comprises two storage services, and the sequence numbers of the storage services are 1 and 5. The second management information designates a second root node of the tree structure, the child nodes of the second root node are two storage nodes, and the sequence numbers of the storage nodes are 3 and 4; the storage node with the sequence number of 3 comprises two storage services, and the sequence numbers of the storage services are 2 and 6; the storage node with the sequence number of 4 comprises two storage services, and the sequence numbers of the storage services are 3 and 7.
FIG. 11 illustrates a schematic diagram of a distributed storage system provided by an exemplary embodiment of the present application. The distributed storage system has the same storage service tree structure as shown in fig. 10; the first logical pool includes the storage nodes with sequence numbers 1 and 2; the second logical pool includes the storage nodes with sequence numbers 3 and 4. The first storage service, that is, the storage service with sequence number 1, determines its corresponding second storage services according to the first management information; according to the first management information, the second storage services are distributed only in the first logical pool.
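The way the tree structure confines the candidates to one logical pool can be sketched as follows. The class `TreeNode`, the helper `leaves` and the labels such as `root1` and `svc1` are hypothetical; only the shape of the tree is taken from the description of the two root nodes above.

```python
from dataclasses import dataclass, field


@dataclass
class TreeNode:
    name: str
    children: list = field(default_factory=list)


def leaves(node):
    """Collect the names of all leaf nodes (storage services) under `node`."""
    if not node.children:
        return [node.name]
    return [leaf for child in node.children for leaf in leaves(child)]


# Tree described above: root -> storage nodes -> storage services.
first_root = TreeNode("root1", [
    TreeNode("node1", [TreeNode("svc0"), TreeNode("svc4")]),
    TreeNode("node2", [TreeNode("svc1"), TreeNode("svc5")]),
])
second_root = TreeNode("root2", [
    TreeNode("node3", [TreeNode("svc2"), TreeNode("svc6")]),
    TreeNode("node4", [TreeNode("svc3"), TreeNode("svc7")]),
])

# Candidates that storage service 1 may monitor: leaves of its own root only.
candidates = [s for s in leaves(first_root) if s != "svc1"]
print(candidates)
```

Since the traversal starts at the first root node, no storage service under the second root node can ever appear among the candidates.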
In one example, the first management information indicates the storage data distribution policy of the first logical pool; specifically, the first management information indicates that storage data received between a first timestamp and a second timestamp is stored in the storage units corresponding to the storage services with sequence numbers 0 and 4, and that storage data received between a third timestamp and a fourth timestamp is stored in the storage units corresponding to the storage services with sequence numbers 1 and 5. Those skilled in the art will appreciate that the first management information may assign received storage data to storage units according to other criteria. That is, the storage location of storage data stored in the first logical pool is obtained according to the indication of the first management information.
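A timestamp-based distribution policy of this kind might be represented as in the sketch below. The concrete timestamp values, the half-open interval convention and the names `POLICY` and `target_services` are all assumptions made for illustration; the embodiment does not specify them.

```python
# Hypothetical windows: (start, end, target storage services).
POLICY = [
    (0, 100, [0, 4]),     # between the first and second timestamps
    (150, 250, [1, 5]),   # between the third and fourth timestamps
]


def target_services(received_at, policy=POLICY):
    """Return the storage services whose units should hold the data."""
    for start, end, services in policy:
        # Assumption: a window includes its start and excludes its end.
        if start <= received_at < end:
            return services
    return []  # no window matches; handling is left open in this sketch


print(target_services(50))
print(target_services(175))
```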
Optionally, the first management information may indicate at least one of a backup number of the storage data, a backup policy of the storage data, a type of a storage unit corresponding to the storage data, and a failure domain in the distributed storage system.
It should be noted that, in the embodiment of the present application, the first management information may be stored by the first storage service, or may be sent to the first storage service by another storage service or a storage node in the distributed storage system; the present application does not provide any limiting provision for the source of the first management information.
In the case where the first management information is stored by the first storage service, the first storage service may store one or more pieces of management information; in the case where the first storage service stores a plurality of pieces of management information, the first management information needs to be determined among them.
Similarly, in the case where the first management information is sent to the first storage service by another storage service or a storage node in the distributed storage system and the first storage service receives a plurality of pieces of management information, the first management information needs to be determined among them; in one exemplary embodiment, in the case where the first storage service receives only one piece of management information, the other storage service or storage node that sends the first management information to the first storage service needs to determine the first management information.
Fig. 12 is a flowchart illustrating a method for determining an operation state of a distributed storage system according to an exemplary embodiment of the present application. The method may be performed by a first storage service. I.e. on the basis of the embodiment shown in fig. 2, the method further comprises the following steps:
step 502: determining a corresponding first root node according to a first storage service;
The first root node is the root node of a first logical pool in which the first storage service is located; the first root node may be obtained directly from the first storage service or indirectly through the first storage node to which the first storage service belongs; in an alternative design of this embodiment, step 502 may be implemented as the following sub-steps:
performing reverse retrieval on the first storage service to determine a first storage node to which the first storage service belongs;
performing reverse retrieval on the first storage node, and determining a first root node corresponding to the first storage node, wherein the first storage node belongs to a first logic pool;
the reverse retrieval may be performed directly, through the membership relations among the first storage service, the first storage node, and the first logical pool, or indirectly, through the distribution of the storage data in the first storage unit corresponding to the first storage service.
Fig. 13 is a schematic view of a storage service tree structure of a distributed storage system according to an exemplary embodiment of the present application. The first management information designates a first root node of the tree structure, wherein the child nodes of the first root node are two storage nodes, and the sequence numbers of the storage nodes are 1 and 2; the storage node with the sequence number of 1 comprises two storage services, and the sequence numbers of the storage services are 0 and 4; the storage node with the sequence number of 2 comprises two storage services, and the sequence numbers of the storage services are 1 and 5. The second management information designates a second root node of the tree structure, the child nodes of the second root node are two storage nodes, and the sequence numbers of the storage nodes are 3 and 4; the storage node with the sequence number of 3 comprises two storage services, and the sequence numbers of the storage services are 2 and 6; the storage node with the sequence number of 4 comprises two storage services, and the sequence numbers of the storage services are 3 and 7.
Illustratively, reverse retrieval is performed on the storage service with sequence number 4 to determine the storage node with sequence number 1, and reverse retrieval is performed on the storage node with sequence number 1 to determine the first root node; illustratively, the first root node is a virtual node corresponding to the first logical pool and is the start node indicated by the first management information. Likewise, reverse retrieval is performed on the storage service with sequence number 6 to determine the storage node with sequence number 3, and reverse retrieval is performed on the storage node with sequence number 3 to determine the second root node; illustratively, the second root node is a virtual node corresponding to the second logical pool and is the start node indicated by the management information corresponding to the second logical pool.
Step 504: determining first management information according to a first root node;
the first root node is the start node indicated by the first management information. Each piece of management information specifies the start node of the tree topology of the storage services, so the corresponding first management information can be determined from the first root node.
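Steps 502 and 504 amount to walking the tree upwards from a storage service. A minimal sketch follows, assuming the parent links of the tree described for fig. 13 are available as lookup tables; the table names, the root labels and the `MANAGEMENT_INFO` registry are hypothetical.

```python
# Parent links for the tree described for fig. 13 (child -> parent).
SERVICE_TO_NODE = {0: 1, 4: 1, 1: 2, 5: 2, 2: 3, 6: 3, 3: 4, 7: 4}
NODE_TO_ROOT = {1: "root1", 2: "root1", 3: "root2", 4: "root2"}
# Hypothetical registry: management information keyed by its start node.
MANAGEMENT_INFO = {
    "root1": "first management information",
    "root2": "second management information",
}


def find_management_info(service):
    """Resolve a storage service to the management information of its pool."""
    node = SERVICE_TO_NODE[service]   # step 502, first sub-step
    root = NODE_TO_ROOT[node]         # step 502, second sub-step
    return MANAGEMENT_INFO[root]      # step 504


print(find_management_info(4))
print(find_management_info(6))
```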
In summary, in the method provided in this embodiment, reverse retrieval is performed on the first storage service to determine the corresponding first root node and, from it, the first management information; the storage data distribution policy of the first logical pool indicated by the first management information lays a foundation for effectively reducing the network load between different logical pools. The monitoring node is thus prevented from receiving, from the first storage service, failure information about storage services outside the logical pool; it obtains through the first storage service only the running state of storage services within the logical pool, which avoids erroneous reports by the first storage service and ensures the accuracy of storage service failure information when logical pools are divided.
Fig. 14 is a flowchart illustrating a method for determining an operation state of a distributed storage system according to an exemplary embodiment of the present application. The method may be performed by a first storage service. I.e. on the basis of the embodiment shown in fig. 2, the method further comprises the following steps:
step 506: acquiring first management information from a distributed storage system;
the first management information carries information of all storage services of the first logical pool. For example, the first storage service may obtain the first management information from other storage services or storage nodes in the distributed storage system; in the case where the management information acquired by the first storage service includes a plurality of management information, the first management information may or may not be acquired simultaneously with other management information; similarly, the first management information may have the same information source as other management information, or may have a different information source.
In an alternative design of the present embodiment, the first storage service obtains the first management information from a monitoring node of the distributed storage system; the monitoring node manages the state information of the distributed storage system, stores the distribution of the storage services in the distributed storage system, and manages the running state of the storage services; in an alternative implementation, the monitoring node may be implemented as a storage node, that is, the storage node may provide both a data storage function and a monitoring function for the distributed storage system; the monitoring node may also be implemented as a node other than a storage node in the distributed storage system.
In summary, in the method provided in this embodiment, the first management information is obtained from the distributed storage system, and the storage data distribution policy of the first logical pool indicated by the first management information lays a foundation for effectively reducing the network load between different logical pools. The monitoring node is thus prevented from receiving, from the first storage service, failure information about storage services outside the logical pool; it obtains through the first storage service only the running state of storage services within the logical pool, which avoids erroneous reports by the first storage service and ensures the accuracy of storage service failure information when logical pools are divided.
Fig. 15 is a flowchart illustrating a method for determining an operation state of a distributed storage system according to an exemplary embodiment of the present application. The method may be performed by a first storage service. That is, based on the embodiment shown in fig. 2, step 510 may be implemented as the following sub-steps:
step 512: determining all storage services of the first logic pool according to the storage data distribution strategy of the first logic pool indicated by the first management information;
the first management information indicates the storage data distribution policy of the first logical pool where the first storage service is located, and carries information on all storage services of the first logical pool; for example, the distribution policy indicated by the first management information specifies that text data in txt format is stored in the first storage unit corresponding to the first storage service, and that document data in doc format is stored in the second storage unit corresponding to the second storage service; those skilled in the art will understand that the management information may define the data distribution policy according to at least one of the source, time, use, format, importance, transmission speed requirement, redundancy backup requirement, and the like of the data; this embodiment does not limit this.
In an exemplary embodiment, all storage services of the first logical pool are indirectly carried in the first management information, and according to the storage data distribution policy of the first logical pool indicated by the first management information, all storage services of the first logical pool are determined by calculating the first management information or reversely retrieving the storage data distribution policy indicated by the first management information.
Step 514: determining existing storage services according to all the storage services of the first logic pool;
illustratively, the existing storage service is all or part of all storage services of the first logical pool; in an alternative design, in the case where the existing storage service is a part of all storage services in the first logical pool, the existing storage service is a storage service whose operation state is a normal state among all storage services in the first logical pool; those skilled in the art will appreciate that the existing storage service may be determined from all storage services of the first logical pool from other dimensions, which the present embodiment does not provide for any limiting provision.
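Steps 512 and 514 can be sketched as a flatten-then-filter operation. The dictionary layout of `info`, the state labels and the function names are assumptions for illustration; the sketch implements the optional design in which only storage services in the normal state count as existing storage services.

```python
def all_services(management_info):
    """Step 512: flatten node -> services into one sorted list."""
    return sorted(s for svcs in management_info["nodes"].values() for s in svcs)


def existing_services(management_info, states):
    """Step 514: keep services whose last known state is normal."""
    # Assumption: a service with no recorded state is left out.
    return [s for s in all_services(management_info)
            if states.get(s) == "normal"]


# Hypothetical management information and last known running states.
info = {"pool": "pool1", "nodes": {1: [0, 4], 2: [1, 5]}}
states = {0: "normal", 1: "normal", 4: "fault", 5: "normal"}
print(existing_services(info, states))
```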
In summary, in the method provided in this embodiment, the existing storage services are determined according to the storage data distribution policy of the first logical pool indicated by the first management information, so that the second storage services whose running state the first storage service detects are confined to the first logical pool, which effectively reduces the network load between different logical pools. The monitoring node is thus prevented from receiving, from the first storage service, failure information about storage services outside the logical pool; it obtains through the first storage service only the running state of storage services within the logical pool, which avoids erroneous reports by the first storage service and ensures the accuracy of storage service failure information when logical pools are divided.
Next, the way in which the first storage service detects the running state of the second storage service through heartbeat signals, as exemplified in the above embodiments, is described in detail;
fig. 16 shows a schematic diagram of heartbeat signal detection provided by an exemplary embodiment of the present application. The first storage service sends a heartbeat sending signal to the second storage service at a first time stamp t1, the second storage service sends a heartbeat reply signal to the first storage service under the condition that the second storage service receives the heartbeat sending signal, and the time for the second storage service to send the heartbeat reply signal to the first storage service is the second time stamp t2;
optionally, once the heartbeat interval time has elapsed after the first storage service receives the heartbeat reply signal, the first storage service sends a heartbeat sending signal to the second storage service again at a third timestamp t3; illustratively, the heartbeat interval time is 5 seconds, that is, the first storage service sends a heartbeat sending signal to the second storage service again 5 seconds after receiving the heartbeat reply signal.
For example, in the case that a time interval between the first storage service sending the heartbeat sending signal and the first storage service receiving the corresponding heartbeat reply signal sent by the second storage service is greater than the heartbeat timeout time, the first storage service determines that the running state of the second storage service is a fault state; for example, in the case that the time interval between the third timestamp t3 and the fourth timestamp t4 is greater than the heartbeat timeout time, the first storage service determines that the operation state of the second storage service is a failure state; illustratively, the heartbeat timeout time is 20 seconds.
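The heartbeat timing described above can be sketched from the point of view of the first storage service. The class `HeartbeatTracker` and its methods are hypothetical; the 5-second interval and 20-second timeout are the example values from the text. The sketch additionally assumes that a reply which never arrives is treated as a failure once the timeout has elapsed, which the text does not state explicitly.

```python
HEARTBEAT_INTERVAL = 5.0   # seconds, example value from the text
HEARTBEAT_TIMEOUT = 20.0   # seconds, example value from the text


class HeartbeatTracker:
    """Tracks one second storage service from the sender's side."""

    def __init__(self):
        self.sent_at = None    # time of the outstanding heartbeat, if any
        self.next_send = 0.0   # earliest time the next heartbeat is due
        self.failed = False

    def tick(self, now):
        """Return True if a heartbeat should be sent at time `now`."""
        if self.sent_at is not None:
            # A heartbeat is outstanding: check it against the timeout.
            if now - self.sent_at > HEARTBEAT_TIMEOUT:
                self.failed = True
            return False
        if now >= self.next_send:
            self.sent_at = now
            return True
        return False

    def on_reply(self, now):
        """Record a heartbeat reply received at time `now`."""
        if self.sent_at is None:
            return
        self.failed = now - self.sent_at > HEARTBEAT_TIMEOUT
        self.sent_at = None
        self.next_send = now + HEARTBEAT_INTERVAL


tracker = HeartbeatTracker()
tracker.tick(0.0)        # heartbeat sent at t1 = 0
tracker.on_reply(1.0)    # reply received one second later
print(tracker.failed)
```

A caller would invoke `tick` periodically with the current time and send a heartbeat whenever it returns True.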
Fig. 17 is a flowchart illustrating a method for determining an operation state of a distributed storage system according to an exemplary embodiment of the present application. The method may be performed by a first storage service. That is, based on the embodiment shown in Fig. 2, step 530 may be implemented as step 532 and step 534, and the embodiment further includes step 540:
Step 532: Determine that the running state of the second storage service is a normal state if the time interval between the status message and the request message is less than the target threshold.
The status message is sent by the second storage service upon receipt of the request message sent by the first storage service. For example, in the case where the first storage service detects the running state of the second storage service through heartbeat signals, the request message is the heartbeat sending signal and the status message is the heartbeat reply signal. The normal state indicates that the second storage service can provide management services for the corresponding second storage unit.
Step 534: Determine that the running state of the second storage service is a fault state if the time interval between the status message and the request message is greater than the target threshold.
The fault state indicates that the second storage service cannot provide management services for the corresponding second storage unit. Exemplary reasons for the running state of the second storage service being a fault state include, but are not limited to, at least one of the following: the second storage unit corresponding to the second storage service is damaged; the second storage service is offline in the distributed storage system.
Step 540: If the running state of the second storage service is a fault state, send fault information of the second storage service to a monitoring node in the distributed storage system.
The monitoring node manages the state information of the distributed storage system: it stores the distribution of the storage services in the distributed storage system and manages the running state of the storage services.
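Steps 532, 534 and 540 can be sketched as below. The sketch is illustrative only: `RunState`, `classify`, `detect_and_report` and the `report_fault` callback (standing in for whatever channel reaches the monitoring node) are hypothetical names, and the shape of the fault information is an assumption. The text does not say what happens when the interval equals the target threshold, so the sketch leaves that case undecided.

```python
from enum import Enum
from typing import Callable, Optional


class RunState(Enum):
    NORMAL = "normal"
    FAULT = "fault"


def classify(request_sent_at: float, status_received_at: float,
             target_threshold_s: float) -> Optional[RunState]:
    """Map the request-to-status interval to a running state."""
    interval = status_received_at - request_sent_at
    if interval < target_threshold_s:
        return RunState.NORMAL  # step 532
    if interval > target_threshold_s:
        return RunState.FAULT   # step 534
    return None  # equality is not specified in the text; left undecided here


def detect_and_report(service_id: str, request_sent_at: float,
                      status_received_at: float, target_threshold_s: float,
                      report_fault: Callable[[dict], None]) -> Optional[RunState]:
    """Classify the second storage service and report faults (step 540)."""
    state = classify(request_sent_at, status_received_at, target_threshold_s)
    if state is RunState.FAULT:
        # Hypothetical fault-information payload sent to the monitoring node.
        report_fault({"service": service_id, "state": state.value})
    return state
```

Only the fault state triggers a report; a normal result is returned to the caller without contacting the monitoring node.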
It should be noted that the above embodiment is only an exemplary illustration; step 540 of this embodiment may be combined with the embodiment shown in Fig. 2 to obtain a new embodiment. Similarly, the two sub-steps of step 530 may be combined with steps 510 and 520 of the embodiment shown in Fig. 2 to obtain a new embodiment.
In summary, in the method provided in this embodiment, fault information of the second storage service is sent to the monitoring node in the distributed storage system. The monitoring node obtains through the first storage service only the running state of storage services inside the logical pool, so it does not receive fault information about storage services outside the logical pool from the first storage service, which ensures the accuracy of storage service fault information when logical pools are divided.
It will be appreciated by those skilled in the art that the above embodiments may be implemented independently, or the above embodiments may be combined freely to form a new embodiment to implement the method for determining the operation state of the distributed storage system of the present application.
Fig. 18 is a block diagram illustrating an operation state determining apparatus of a distributed storage system according to an exemplary embodiment of the present application.
The distributed storage system comprises a first logical pool; the first logical pool comprises a first storage service; the first storage service corresponds to a first storage unit and is used for performing storage management on the first storage unit in the distributed storage system. The apparatus is executed by the first storage service and comprises:
a processing module 810, configured to determine the existing storage services in the first logical pool according to first management information, where the first management information is used to indicate a storage data distribution policy of the first logical pool where the first storage service is located, and the first management information carries information of all storage services in the first logical pool;
a selection module 820, configured to determine a second storage service from the existing storage services according to a selection condition, where the first storage service is configured to detect the running state of the second storage service;
a detection module 830, configured to determine the running state of the second storage service according to a status message of the second storage service, where the running state indicates whether the second storage service has failed.
In an alternative design of this embodiment, the storage services of the first logical pool have sequence numbers used for ordering;
the selection module 820 is further configured to:
determine an existing storage service satisfying the selection condition as the second storage service;
wherein the selection condition includes: the sequence number of the existing storage service is one of the a sequence numbers preceding the sequence number of the first storage service, and/or the sequence number of the existing storage service is one of the b sequence numbers following the sequence number of the first storage service; a and b are both preconfigured positive integers.
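The sequence-number selection condition can be illustrated with a short sketch. The function name `select_by_sequence` is hypothetical, the sketch implements the "and" variant (both the a preceding and the b following sequence numbers), and it assumes no wrap-around at the ends of the ordering, which the text does not specify.

```python
from typing import List


def select_by_sequence(existing: List[int], own: int, a: int, b: int) -> List[int]:
    """Pick the sequence numbers of the second storage services.

    existing: sequence numbers of the existing storage services in the
              first logical pool (may or may not contain `own`).
    own:      sequence number of the first storage service.
    a, b:     preconfigured positive integers.
    """
    if a < 1 or b < 1:
        raise ValueError("a and b must be positive integers")
    ordered = sorted(set(existing) - {own})
    before = [n for n in ordered if n < own][-a:]  # the a nearest preceding numbers
    after = [n for n in ordered if n > own][:b]    # the b nearest following numbers
    return before + after
```

For existing sequence numbers 0 to 6, a first storage service numbered 3, a = 2 and b = 1, the sketch selects the services numbered 1, 2 and 4. A service at the start of the ordering simply has no preceding peers under the no-wrap-around assumption.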
In an optional design of this embodiment, the first storage service belongs to a first storage node in the first logical pool, and the storage services of the first logical pool have sequence numbers used for ordering;
the selection module 820 is further configured to:
determine, according to a node selection condition, a second storage node among the existing storage nodes to which the existing storage services belong, wherein the existing storage nodes are determined according to the nodes to which the existing storage services in the first logical pool belong;
and determine the second storage service in the second storage node according to a service selection condition.
In an alternative design of this embodiment, the selection module 820 is further configured to:
determine, in the second storage node, a storage service that satisfies the service selection condition as the second storage service;
wherein the service selection condition includes: the relative position of the second storage service in the second storage node is the same as the relative position of the first storage service in the first storage node.
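The service selection condition can be sketched as follows, under stated assumptions: each storage node is modelled as an ordered list of its storage services, "relative position" is taken to mean the index in that list, and the second storage node is passed in as already chosen, because the node selection condition itself is not detailed in this passage. The name `select_peer_on_node` and the service identifiers are hypothetical.

```python
from typing import Dict, List, Optional


def select_peer_on_node(nodes: Dict[str, List[str]], first_node: str,
                        first_service: str, second_node: str) -> Optional[str]:
    """Return the service on `second_node` at the same relative position
    that `first_service` occupies on `first_node`."""
    position = nodes[first_node].index(first_service)
    candidates = nodes[second_node]
    if position >= len(candidates):
        # No service at that position; this case is not covered by the text.
        return None
    return candidates[position]
```

With `{"node-a": ["svc-0", "svc-1", "svc-2"], "node-b": ["svc-3", "svc-4"]}`, the second service on node-a (`svc-1`) is paired with the second service on node-b (`svc-4`).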
In an alternative design of this embodiment, the apparatus further comprises:
a sending module 840, configured to send, when the running state of the second storage service is a fault state, fault information of the second storage service to a monitoring node in the distributed storage system, where the monitoring node is configured to manage the running state of the storage services in the distributed storage system.
In an alternative design of this embodiment, the status message is sent by the second storage service upon receipt of a request message sent by the first storage service;
the detection module 830 is further configured to:
determine that the running state of the second storage service is a normal state if the time interval between the status message and the request message is less than a target threshold;
and determine that the running state of the second storage service is a fault state if the time interval between the status message and the request message is greater than the target threshold.
In an alternative design of the present embodiment, the processing module 810 is further configured to:
determine a corresponding first root node according to the first storage service, wherein the first root node is the root node of the first logical pool where the first storage service is located;
and determine the first management information according to the first root node, wherein the first root node is the starting node indicated by the first management information.
In an alternative design of the present embodiment, the processing module 810 is further configured to:
perform reverse retrieval on the first storage service to determine the first storage node to which the first storage service belongs;
and perform reverse retrieval on the first storage node to determine the first root node corresponding to the first storage node, wherein the first storage node belongs to the first logical pool.
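The two reverse-retrieval steps can be sketched as a walk up a parent map. This is only an illustration: representing the hierarchy as a `parent_of` dictionary, the function name `find_storage_node_and_root`, and walking through any intermediate levels until an item with no parent is reached are all assumptions.

```python
from typing import Dict, Tuple


def find_storage_node_and_root(parent_of: Dict[str, str],
                               service: str) -> Tuple[str, str]:
    """Reverse-retrieve the storage node of `service`, then its root node."""
    storage_node = parent_of[service]  # first reverse retrieval: service -> node
    current = storage_node
    seen = {service, current}
    while current in parent_of:        # second reverse retrieval: node -> root
        current = parent_of[current]
        if current in seen:
            raise ValueError("cycle in hierarchy")
        seen.add(current)
    return storage_node, current
```

Given `{"svc-1": "node-a", "node-a": "pool-1-root"}`, the call `find_storage_node_and_root(parent_of, "svc-1")` yields `("node-a", "pool-1-root")`; services of another logical pool resolve to that pool's own root.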
In an alternative design of this embodiment, the apparatus further comprises:
an obtaining module 850, configured to obtain the first management information from the distributed storage system, where the first management information carries information of all storage services of the first logical pool.
In an alternative design of the present embodiment, the processing module 810 is further configured to:
determine all storage services of the first logical pool according to the storage data distribution policy of the first logical pool indicated by the first management information;
and determine the existing storage services from all the storage services of the first logical pool.
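These two determinations can be sketched as a traversal followed by a filter. The sketch assumes that the first management information can be viewed as a hierarchy starting at the first root node, modelled here as a `children_of` dictionary whose leaves are storage services; the predicate `is_existing`, which decides whether a storage service counts as existing, is hypothetical because the text does not define the criterion in this passage.

```python
from typing import Callable, Dict, List


def all_services(children_of: Dict[str, List[str]], root: str) -> List[str]:
    """Collect every leaf (storage service) reachable from the root node."""
    found: List[str] = []
    stack = [root]
    while stack:
        item = stack.pop()
        kids = children_of.get(item)
        if kids is None:
            found.append(item)            # no children entry: a storage service
        else:
            stack.extend(reversed(kids))  # keep left-to-right order
    return found


def existing_services(children_of: Dict[str, List[str]], root: str,
                      is_existing: Callable[[str], bool]) -> List[str]:
    """Filter all storage services of the pool down to the existing ones."""
    return [s for s in all_services(children_of, root) if is_existing(s)]
```

Because the traversal starts at the root of the first logical pool, storage services that hang under another pool's root are never visited, which matches the intent of keeping detection inside one logical pool.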
It should be noted that, when the apparatus provided in the foregoing embodiment performs its functions, the division into the functional modules described above is only an example. In practical applications, the functions may be allocated to different functional modules as needed; that is, the internal structure of the device may be divided into different functional modules to perform all or part of the functions described above.
With respect to the apparatus in the above embodiments, the specific manner in which the respective modules perform the operations has been described in detail in the embodiments regarding the method; the technical effects achieved by the execution of the operations by the respective modules are the same as those in the embodiments related to the method, and will not be described in detail herein.
The embodiment of the application also provides a computer device, which comprises: a processor and a memory, the memory storing a computer program; the processor is configured to execute the computer program in the memory to implement the method for determining an operation state of the distributed storage system provided by the foregoing method embodiments.
Optionally, the computer device is a server. Illustratively, Fig. 19 is a block diagram of a server provided by an exemplary embodiment of the present application.
In general, the server 2300 includes: a processor 2301 and a memory 2302.
The processor 2301 may include one or more processing cores, such as a 4-core processor or an 8-core processor. The processor 2301 may be implemented in at least one hardware form of digital signal processing (Digital Signal Processing, DSP), field-programmable gate array (Field-Programmable Gate Array, FPGA), and programmable logic array (Programmable Logic Array, PLA). The processor 2301 may also include a main processor and a coprocessor: the main processor, also referred to as a central processing unit (Central Processing Unit, CPU), processes data in the awake state; the coprocessor is a low-power processor that processes data in the standby state. In some embodiments, the processor 2301 may be integrated with a graphics processor (Graphics Processing Unit, GPU), which is responsible for rendering and drawing the content to be displayed on the display screen. In some embodiments, the processor 2301 may also include an artificial intelligence (Artificial Intelligence, AI) processor for handling computing operations related to machine learning.
Memory 2302 may include one or more computer-readable storage media, which may be non-transitory. Memory 2302 may also include high-speed random access memory, as well as non-volatile memory, such as one or more magnetic disk storage devices, flash memory storage devices. In some embodiments, a non-transitory computer readable storage medium in memory 2302 is used to store at least one instruction for execution by processor 2301 to implement the method of determining an operational state of a distributed storage system provided by a method embodiment of the present application.
In some embodiments, server 2300 may further optionally include: an input interface 2303 and an output interface 2304. The processor 2301 and the memory 2302 may be connected to the input interface 2303 and the output interface 2304 through buses or signal lines. The respective peripheral devices may be connected to the input interface 2303 and the output interface 2304 through buses, signal lines, or a circuit board. Input interface 2303, output interface 2304 may be used to connect at least one Input/Output (I/O) related peripheral device to processor 2301 and memory 2302. In some embodiments, the processor 2301, memory 2302, and input interface 2303, output interface 2304 are integrated on the same chip or circuit board; in some other embodiments, the processor 2301, the memory 2302, and either or both of the input interface 2303 and the output interface 2304 may be implemented on separate chips or circuit boards, as embodiments of the application are not limited in this respect.
Those skilled in the art will appreciate that the structure shown above does not limit the server 2300, which may include more or fewer components than shown, combine certain components, or employ a different arrangement of components.
In an exemplary embodiment, a chip is also provided, the chip including programmable logic circuits and/or program instructions for implementing the method of determining an operating state of a distributed storage system according to the above aspect when the chip is run on a computer device.
In an exemplary embodiment, a computer program product or a computer program is also provided, the computer program product or computer program comprising computer instructions stored in a computer readable storage medium. The processor of the computer device reads the computer instructions from the computer readable storage medium and executes them to implement the method for determining the operation state of the distributed storage system provided by the above method embodiments.
In an exemplary embodiment, there is also provided a computer readable storage medium having stored therein a computer program that is loaded and executed by a processor to implement the method for determining an operation state of a distributed storage system provided by the above-described method embodiments.
It will be understood by those skilled in the art that all or part of the steps for implementing the above embodiments may be implemented by hardware, or may be implemented by a program for instructing relevant hardware, where the program may be stored in a computer readable storage medium, and the storage medium may be a read-only memory, a magnetic disk or an optical disk, etc.
Those skilled in the art will appreciate that in one or more of the examples described above, the functions described in the embodiments of the present application may be implemented in hardware, software, firmware, or any combination thereof. When implemented in software, these functions may be stored on, or transmitted over, a computer-readable medium as one or more instructions or code. Computer-readable media include both computer storage media and communication media, the latter including any medium that facilitates transfer of a computer program from one place to another. A storage medium may be any available medium that can be accessed by a general purpose or special purpose computer.
The foregoing description of the preferred embodiments of the present application is not intended to limit the application, but rather, the application is to be construed as limited to the appended claims.

Claims (14)

1. A method for determining a running state of a distributed storage system, characterized in that the distributed storage system comprises a first logical pool, the first logical pool comprises a first storage service, the first storage service corresponds to a first storage unit, and the first storage service is used for performing storage management on the first storage unit in the distributed storage system; the method is performed by the first storage service, and the method comprises:
determining the existing storage services in the first logical pool according to first management information, wherein the first management information is used for indicating a storage data distribution policy of the first logical pool where the first storage service is located, and the first management information carries information of all storage services of the first logical pool;
determining a second storage service among the existing storage services according to a selection condition, wherein the first storage service is used for detecting the running state of the second storage service;
and determining the running state of the second storage service through a status message of the second storage service, wherein the running state is used for indicating whether the second storage service has failed.
2. The method of claim 1, wherein the storage services of the first logical pool have sequence numbers used for ordering;
said determining a second storage service among said existing storage services according to a selection condition comprises:
determining an existing storage service satisfying the selection condition as the second storage service;
wherein the selection condition includes: the sequence number of the existing storage service is one of the a sequence numbers preceding the sequence number of the first storage service, and/or the sequence number of the existing storage service is one of the b sequence numbers following the sequence number of the first storage service; a and b are both preconfigured positive integers.
3. The method of claim 1, wherein the first storage service belongs to a first storage node in the first logical pool, the storage services of the first logical pool having a sequence number for ordering;
said determining a second storage service among said existing storage services according to a selection condition, comprising:
determining, according to a node selection condition, a second storage node among the existing storage nodes to which the existing storage services belong, wherein the existing storage nodes are determined according to the nodes to which the existing storage services in the first logical pool belong;
and determining the second storage service in the second storage node according to a service selection condition.
4. A method according to claim 3, wherein said determining said second storage service in said second storage node according to a service selection condition comprises:
determining, in the second storage node, a storage service that satisfies the service selection condition as the second storage service;
wherein the service selection condition includes: the second storage service has the same relative location in the second storage node as the first storage service in the first storage node.
5. The method according to any one of claims 1 to 4, further comprising:
and under the condition that the running state of the second storage service is a fault state, sending fault information of the second storage service to a monitoring node in the distributed storage system, wherein the monitoring node is used for managing the running state of the storage service in the distributed storage system.
6. The method according to any of claims 1 to 4, wherein the status message is sent by the second storage service upon receipt of a request message sent by the first storage service;
the determining, by the status message of the second storage service, an operation status of the second storage service includes:
determining that the running state of the second storage service is a normal state if the time interval between the status message and the request message is less than a target threshold;
and determining that the running state of the second storage service is a fault state if the time interval between the status message and the request message is greater than the target threshold.
7. The method according to any one of claims 1 to 4, further comprising:
determining a corresponding first root node according to the first storage service, wherein the first root node is the root node of the first logical pool where the first storage service is located;
and determining the first management information according to the first root node, wherein the first root node is a starting node indicated by the first management information.
8. The method of claim 7, wherein the determining the corresponding first root node from the first storage service comprises:
performing reverse retrieval on the first storage service, and determining a first storage node to which the first storage service belongs;
and performing reverse retrieval on the first storage node, and determining a first root node corresponding to the first storage node, wherein the first storage node belongs to the first logical pool.
9. The method according to any one of claims 1 to 4, further comprising:
and acquiring the first management information from the distributed storage system, wherein the first management information carries information of all storage services of the first logical pool.
10. The method of any of claims 1 to 4, wherein determining existing storage services in the first logical pool based on the first management information comprises:
determining all storage services of the first logical pool according to the storage data distribution policy of the first logical pool indicated by the first management information;
and determining the existing storage services from all the storage services of the first logical pool.
11. An apparatus for determining a running state of a distributed storage system, characterized in that the distributed storage system comprises a first logical pool, the first logical pool comprises a first storage service, the first storage service corresponds to a first storage unit, and the first storage service is used for performing storage management on the first storage unit in the distributed storage system; the apparatus is executed by the first storage service, and the apparatus comprises:
a processing module, used for determining the existing storage services in the first logical pool according to first management information, wherein the first management information is used for indicating a storage data distribution policy of the first logical pool where the first storage service is located, and the first management information carries information of all storage services of the first logical pool;
a selection module, used for determining a second storage service among the existing storage services according to a selection condition, wherein the first storage service is used for detecting the running state of the second storage service;
a detection module, used for determining the running state of the second storage service through a status message of the second storage service, wherein the running state is used for indicating whether the second storage service has failed.
12. A computer device, the computer device comprising: a processor and a memory, wherein at least one program is stored in the memory; the processor is configured to execute the at least one program in the memory to implement the method for determining an operating state of the distributed storage system according to any one of claims 1 to 10.
13. A computer readable storage medium having stored therein executable instructions that are loaded and executed by a processor to implement the method of operating state determination of a distributed storage system as claimed in any one of claims 1 to 10.
14. A computer program product or computer program, characterized in that it comprises computer instructions stored in a computer readable storage medium, from which a processor reads and executes the computer instructions to implement the method of determining the operational state of a distributed storage system according to any of the preceding claims 1 to 10.
CN202210194600.0A 2022-03-01 2022-03-01 Method, device, equipment and medium for determining running state of distributed storage system Pending CN116737486A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210194600.0A CN116737486A (en) 2022-03-01 2022-03-01 Method, device, equipment and medium for determining running state of distributed storage system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210194600.0A CN116737486A (en) 2022-03-01 2022-03-01 Method, device, equipment and medium for determining running state of distributed storage system

Publications (1)

Publication Number Publication Date
CN116737486A true CN116737486A (en) 2023-09-12

Family

ID=87903072

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210194600.0A Pending CN116737486A (en) 2022-03-01 2022-03-01 Method, device, equipment and medium for determining running state of distributed storage system

Country Status (1)

Country Link
CN (1) CN116737486A (en)

Similar Documents

Publication Publication Date Title
CN102402395B (en) Quorum disk-based non-interrupted operation method for high availability system
CN103714097A (en) Method and device for accessing database
CN103069752B (en) The method of the agency of collection information and storage management system
CN110096472B (en) Selection of management nodes in a node cluster
CN112836152B (en) Page rendering method, system, computer device and computer readable storage medium
CN111314158B (en) Big data platform monitoring method, device, equipment and medium
CN111104283A (en) Fault detection method, device, equipment and medium of distributed storage system
CN110121694B (en) Log management method, server and database system
CN114816820A (en) Method, device, equipment and storage medium for repairing chproxy cluster fault
CN114003350B (en) Data distribution method and system of super-fusion system
CN111198662A (en) Data storage method and device and computer readable storage medium
CN116226139B (en) Distributed storage and processing method and system suitable for large-scale ocean data
CN109510730A (en) Distributed system and its monitoring method, device, electronic equipment and storage medium
US20080250421A1 (en) Data Processing System And Method
CN116737486A (en) Method, device, equipment and medium for determining running state of distributed storage system
CN112054926B (en) Cluster management method and device, electronic equipment and storage medium
CN103685359A (en) Data processing method and device
CN113407374A (en) Fault processing method and device, fault processing equipment and storage medium
CN111817892A (en) Network management method, system, electronic equipment and storage medium
CN106850715A (en) A kind of primary main frame dynamic selection method of Intrusion Detection based on host state and priority
CN110401582A (en) Detection method, device and the storage medium of cloud computing system storage health distress
CN112306781B (en) Thread fault processing method, device, medium and equipment
CN113568710B (en) High availability realization method, device and equipment for virtual machine
CN115150253B (en) Fault root cause determining method and device and electronic equipment
CN108959170B (en) Virtual device management method, device, stacking system and readable storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
REG Reference to a national code (Ref country code: HK; Ref legal event code: DE; Ref document number: 40094537; Country of ref document: HK)