CN111614701B - Distributed cluster and container state switching method and device - Google Patents


Info

Publication number
CN111614701B
Authority
CN
China
Prior art keywords
container
container engine
state
available state
engine
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910131705.XA
Other languages
Chinese (zh)
Other versions
CN111614701A (en
Inventor
陶琪
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hangzhou Hikvision Digital Technology Co Ltd
Original Assignee
Hangzhou Hikvision Digital Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hangzhou Hikvision Digital Technology Co Ltd filed Critical Hangzhou Hikvision Digital Technology Co Ltd
Priority to CN201910131705.XA priority Critical patent/CN111614701B/en
Publication of CN111614701A publication Critical patent/CN111614701A/en
Application granted granted Critical
Publication of CN111614701B publication Critical patent/CN111614701B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00: Network arrangements or protocols for supporting network services or applications
    • H04L67/01: Protocols
    • H04L67/10: Protocols in which an application is distributed across nodes in the network
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00: Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/06: Management of faults, events, alarms or notifications
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00: Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/06: Management of faults, events, alarms or notifications
    • H04L41/0654: Management of faults, events, alarms or notifications using network fault recovery
    • H04L41/0663: Performing the actions predefined by failover planning, e.g. switching to standby network elements

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

Embodiments of the invention provide a distributed cluster and a container state switching method and device. In the distributed cluster, a container engine is configured in each node, and each container engine is in either an available state or a non-available state. Every container engine in the non-available state follows a similar execution flow. Taking one such engine as an example: it monitors whether a container engine in the available state has failed; if a failure has occurred, it judges, based on a specified record in a preset storage area, whether there is a failed available-state container engine that has not yet been replaced; and if so, it switches its own state to the available state. The scheme therefore needs no manual intervention: the container engine in the non-available state switches its own state automatically, which shortens the switching time.

Description

Distributed cluster and container state switching method and device
Technical Field
The present invention relates to the field of distributed cluster technologies, and in particular, to a distributed cluster and a method and an apparatus for switching a container status.
Background
A related distributed cluster generally includes a master node and a standby node, and the master node may provide services such as data query and data storage. If the master node fails, manual intervention is usually required to switch the service from the master node to the standby node, which then provides the service.
However, because this scheme switches the service by manual intervention, the switching takes a long time.
Disclosure of Invention
Embodiments of the present invention provide a distributed cluster, and a method and an apparatus for switching a container status, so as to shorten time consumed for switching.
In order to achieve the above object, an embodiment of the present invention provides a distributed cluster, where the cluster includes multiple nodes, and each node is configured with a container engine, where the container engine includes a container engine in an available state and a container engine in a non-available state;
the non-available container engine is used for monitoring whether the available container engine fails; if a failure occurs, judging, based on a specified record in a preset storage area, whether there is a failed container engine in the available state that has not been taken over; and if so, switching its own state to the available state.
Optionally, the container engine in the non-available state is further configured to determine whether a container engine identifier exists in the record, and if the container engine identifier does not exist in the record, switch the state of the container engine in the non-available state to an available state, and add the identifier of the container engine switched to the available state to the record.
Optionally, the container engine in the non-available state is further configured to determine whether the number of container engine identifiers in the record is smaller than the number of failed container engines in the available state; and if it is smaller, to switch its own state to the available state and add the identifier of the container engine switched to the available state to the record.
Optionally, the container engine in the available state is configured to perform load balancing processing on the container engines in the available states in the distributed cluster.
Optionally, the container engine in the non-available state is further configured to access the specified record in the preset storage area through a probe in the container engine.
Optionally, the cluster further includes a sharing device, the preset storage area is located in the sharing device, and each container engine is in communication connection with the sharing device;
the non-available container engine is further configured to access a specified record in the shared device when the failure of the available container engine is monitored.
To achieve the above object, an embodiment of the present invention further provides a container state switching method, which is applied to a first node in a distributed cluster, where a first container engine is configured in the first node, and a state of the first container engine is: an available state or a non-available state; the method comprises the following steps:
monitoring whether a container engine in an available state in the cluster fails when the state of the first container engine is in a non-available state;
if a failure occurs, judging, based on a specified record in a preset storage area, whether there is a failed container engine in the available state that has not been taken over;
if so, switching the state of the first container engine to an available state.
Optionally, the judging, based on the specified record in the preset storage area, whether there is a failed container engine in the available state that has not been taken over includes:
judging whether a container engine identifier exists in a specified record in a preset storage area or not;
if not, indicating that there is a failed container engine that is not superseded;
the method further comprises the following steps:
adding the identifier of the first container engine in the record if the container engine identifier does not exist in the record.
Optionally, the judging, based on the specified record in the preset storage area, whether there is a failed container engine in the available state that has not been taken over includes:
judging whether the number of the container engine identifications in the specified record in the preset storage area is less than the number of the container engines in the available state with faults;
if so, indicating that there is a failed container engine that is not superseded;
the method further comprises the following steps:
in case the number of container engine identifications in the record is smaller than the number of container engines in a failed available state, adding the identification of the first container engine in the record.
Optionally, the method further includes:
and when the state of the first container engine is an available state, performing load balancing processing on the container engines in the available states in the distributed cluster.
Optionally, the first container engine includes a first probe therein; the method further comprises the following steps:
and accessing the specified record in the preset storage area through the first probe.
To achieve the above object, an embodiment of the present invention further provides a container state switching apparatus, which is applied to a first node in a distributed cluster, where a first container engine is configured in the first node, and a state of the first container engine is: an available state or a non-available state; the device comprises:
a monitoring module, configured to monitor whether a container engine in an available state in the cluster fails when a state of the first container engine is a non-available state; if the fault occurs, triggering a judgment module;
the judging module is used for judging, based on the specified record in the preset storage area, whether there is a failed container engine in the available state that has not been taken over; if so, triggering a switching module;
and the switching module is used for switching the state of the first container engine into an available state.
Optionally, the determining module is specifically configured to:
judging whether a container engine identifier exists in a specified record in a preset storage area or not; if not, indicating that there is a failed container engine that is not superseded;
the device further comprises:
a first adding module, configured to add, in the record, an identifier of the first container engine if the identifier of the container engine does not exist in the record.
Optionally, the determining module is specifically configured to:
judging whether the number of the container engine identifications in the specified record in the preset storage area is less than the number of the container engines in the available state with faults; if so, indicating that there is a failed container engine that is not superseded;
the device further comprises:
a second adding module, configured to add, in the record, an identifier of the first container engine if the number of container engine identifiers in the record is smaller than the number of container engines in the failed available state.
Optionally, the apparatus further comprises:
and the processing module is used for carrying out load balancing processing on the container engines in the available states in the distributed cluster under the condition that the state of the first container engine is the available state.
Optionally, the first container engine includes a first probe therein; the device further comprises:
and the access module is used for accessing the specified record in the preset storage area through the first probe.
In order to achieve the above object, an embodiment of the present invention further provides an electronic device, including a processor and a memory;
a memory for storing a computer program;
and the processor is used for realizing any container state switching method when executing the program stored in the memory.
In order to achieve the above object, an embodiment of the present invention further provides a computer-readable storage medium, where a computer program is stored in the computer-readable storage medium, and when the computer program is executed by a processor, the computer-readable storage medium implements any one of the above container state switching methods.
In the distributed cluster provided by the embodiment of the invention, a container engine is configured in each node, and each container engine is in either an available state or a non-available state. Every container engine in the non-available state follows a similar execution flow. Taking one such engine as an example: it monitors whether a container engine in the available state has failed; if a failure has occurred, it judges, based on a specified record in a preset storage area, whether there is a failed available-state container engine that has not yet been replaced; and if so, it switches its own state to the available state. The scheme therefore needs no manual intervention: the container engine in the non-available state switches its own state automatically, which shortens the switching time.
Drawings
In order to illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings used in describing the embodiments or the prior art are briefly introduced below. The drawings described below show only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without creative effort.
Fig. 1a is a schematic structural diagram of a distributed cluster according to an embodiment of the present invention;
fig. 1b is a schematic structural diagram of another distributed cluster according to an embodiment of the present invention;
fig. 2 is a schematic flow chart of a container status switching method according to an embodiment of the present invention;
fig. 3 is a schematic structural diagram of a container state switching device according to an embodiment of the present invention;
fig. 4 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
In order to solve the foregoing technical problem, embodiments of the present invention provide a distributed cluster, and a method and an apparatus for switching a container state. The distributed cluster provided in the embodiments of the present invention is described in detail first.
Fig. 1a is a schematic structural diagram of a distributed cluster provided in an embodiment of the present invention, where the distributed cluster includes a plurality of nodes: node 1, node 2 … …, node N. Each node is configured with a container engine, which may also be referred to as a container, which may be understood as a lightweight, stand-alone executable software package containing data, such as code, system tools, system libraries, etc., necessary to run the software. For example, the container engine may be Docker, rkt of CoreOS (an operating system), or the like, and is not limited in particular.
For convenience of description, the container engine configured in node 1 is referred to as container 1, the container engine configured in node 2 is referred to as container 2 … …, and the container engine configured in node N is referred to as container N. The container engines include container engines in an available state and container engines in a non-available state; in other words, a container engine is in one of two states, available or non-available. In one case, a single container engine in the available state exists in the distributed cluster, and that one container engine provides services such as data reading and writing, which reduces repeated data writes and data read-write errors. In another case, a plurality of container engines in the available state exist in the distributed cluster and jointly provide services such as data reading and writing. This improves processing efficiency and also improves the stability of the cluster: for example, when one container engine fails, the other container engines can still provide data read-write services.
In this embodiment, the execution flow of each container engine in the unavailable state is similar, and a container engine in the unavailable state is taken as an example for description below:
a container engine in the non-available state is used for monitoring whether the container engine in the available state fails; if a failure occurs, judging, based on a specified record in a preset storage area, whether there is a failed container engine in the available state that has not been taken over; and if so, switching its own state to the available state.
In one case, only the non-available container engine monitors the available container engine for a failure; in another case, the container engines can monitor each other to monitor whether the other side has a fault.
For example, each container engine may include a probe, and the container engines may monitor one another through the probes. Alternatively, each container engine may include a monitoring service, which may be a process, and the container engines monitor one another through that service. In one case, the container engines configured in the nodes are all the same kind of container engine, which makes mutual monitoring easier. If the container engine in the available state is detected to have failed, the other container engines may enter a contention mode, that is, compete to become the container engine in the available state.
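The probe-based monitoring described above could be sketched as follows. This is only an illustration under stated assumptions: the `Probe` class, the `is_alive` callable, and the identifiers such as `container-2` are hypothetical names, and the text leaves open how liveness is actually checked.

```python
from typing import Callable, Iterable, List


class Probe:
    """Hypothetical probe that checks the liveness of available-state engines."""

    def __init__(self, is_alive: Callable[[str], bool]) -> None:
        # is_alive is an assumed health check (for example a ping or a process check).
        self._is_alive = is_alive

    def failed_engines(self, available_ids: Iterable[str]) -> List[str]:
        """Return the identifiers of available-state engines that fail the check."""
        return [engine_id for engine_id in available_ids if not self._is_alive(engine_id)]


# Usage: container 2 is down, container 4 is healthy.
health = {"container-2": False, "container-4": True}
probe = Probe(lambda engine_id: health.get(engine_id, False))
failed = probe.failed_engines(["container-2", "container-4"])
```

Passing the health check in as a callable keeps the sketch independent of any particular monitoring mechanism.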
As an embodiment, if there is only one available container engine in the distributed cluster, the non-available container engine may determine whether there is a new available container engine based on a specified record in the preset storage area. For example, it may be determined whether a container engine identifier exists in a specified record in the preset storage area, and if not, it indicates that there is no container engine in a new available state, that is, there is a container engine in a failed available state that is not taken over; in this case, the state of the container engine in the non-available state may be switched to the available state, and the self identifier of the container engine switched to the available state may be added to the record.
In one embodiment, the container engine in the non-available state may access a specified record in a preset storage area in the case where it is monitored that the container engine in the available state fails, and determine whether there is a failed container engine in the available state that has not been taken over based on the record.
In another embodiment, each container engine may periodically or aperiodically access a designated record in a preset storage area; in this embodiment, the container engine in the non-available state may determine whether there is a container engine in the failed available state that is not taken over according to the latest record obtained by accessing when the container engine in the available state is monitored to be failed.
Referring to Fig. 1a, assume that container 2 is the container engine in the available state and that the other containers monitor whether container 2 fails. Taking container 1 as an example, suppose container 1 detects that container 2 has failed; container 1 then accesses the specified record in the preset storage area. If the identifier of container 3 already exists in the record, container 3 has already preempted the role and become the container engine in the available state. If no container engine identifier exists in the record, container 1 switches its own state to the available state and adds its identifier to the record, so that other containers accessing the record learn that container 1 has already become the container engine in the available state.
The container engine identifier may be an ID, an IP address, or the like of the container, and is not particularly limited.
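The single-available-engine rule from the example above can be sketched in a few lines. This is a minimal sketch, not the patented implementation: the function name `try_take_over`, the in-memory list standing in for the specified record, and the identifiers are all assumptions made for illustration.

```python
from typing import List


def try_take_over(record: List[str], own_id: str) -> bool:
    """Single-available-engine rule: take over only if the record is empty.

    `record` stands in for the specified record in the preset storage area.
    Returns True if the caller switched itself to the available state.
    """
    if record:
        # Another engine has already registered itself as the new available engine.
        return False
    record.append(own_id)  # register own identifier so other engines can see it
    return True


# Container 2 (available) has failed; containers 1 and 3 both react.
record: List[str] = []
container_1_switched = try_take_over(record, "container-1")
container_3_switched = try_take_over(record, "container-3")
```

In this run container 1 switches and container 3 does not. The check and the append are not atomic in this sketch; how concurrent access to the record is serialized is not specified in the text.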
As another embodiment, if there are multiple container engines in available states in the distributed cluster, the container engine in a non-available state may determine whether the number of container engine identifiers in the designated record of the preset storage area is less than the number of container engines in a failed available state; if so, indicating that there is a failed container engine that is not superseded; in this case, the state of the container engine in the non-available state is switched to the available state, and the self identifier of the container engine switched to the available state is added to the record.
Still referring to Fig. 1a, assume that containers 2 and 4 are the container engines in the available state and that the other containers monitor whether containers 2 and 4 fail. Taking container 1 as an example, suppose container 1 detects that container 2 has failed, so the number of failed container engines in the available state is 1. Container 1 accesses the specified record in the preset storage area. If the identifier of container 3 already exists in the record, the number of container engine identifiers in the record (1) is not less than the number of failed container engines in the available state (1); in other words, container 3 has already taken over from container 2 as a container engine in the available state, and container 1 does not switch.
If no container engine identifier exists in the record, the number of container engine identifiers in the record (0) is less than the number of failed container engines in the available state (1). Container 1 switches its own state to the available state and adds its identifier to the record, so that other containers accessing the record learn that container 1 has already taken over from container 2 as a container engine in the available state.
As another example, assume that containers 2 and 4 are the container engines in the available state and that the other containers monitor whether containers 2 and 4 fail. Taking container 1 as an example, suppose container 1 detects that both container 2 and container 4 have failed, so the number of failed container engines in the available state is 2. Container 1 accesses the specified record in the preset storage area. If only the identifier of container 3 exists in the record, the number of container engine identifiers in the record (1) is less than the number of failed container engines in the available state (2), so there is still a failed container engine in the available state that has not been taken over. Container 1 switches its own state to the available state and adds its identifier to the record. Other containers accessing the record then learn that container 3 and container 1 have taken over from container 2 and container 4 as container engines in the available state.
If the identifiers of container 3 and container 5 exist in the record, the number of container engine identifiers in the record (2) is not less than the number of failed container engines in the available state (2), indicating that container 3 and container 5 have already taken over from container 2 and container 4 as container engines in the available state.
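The counting rule used in the multi-engine examples above can be expressed compactly. The sketch below replays three of those cases; the function names and the list standing in for the specified record are hypothetical and chosen only for illustration.

```python
from typing import List


def should_take_over(record: List[str], failed_count: int) -> bool:
    """Count rule: a failed engine is still unreplaced if fewer identifiers
    are recorded than there are failed available-state engines."""
    return len(record) < failed_count


def take_over_if_needed(record: List[str], failed_count: int, own_id: str) -> bool:
    """Apply the count rule and register own_id in the record on a switch."""
    if should_take_over(record, failed_count):
        record.append(own_id)
        return True
    return False


# Case A: only container 2 failed, container 3 already registered: no switch.
case_a = take_over_if_needed(["container-3"], 1, "container-1")

# Case B: containers 2 and 4 failed, only container 3 registered: container 1 switches.
record_b = ["container-3"]
case_b = take_over_if_needed(record_b, 2, "container-1")

# Case C: containers 2 and 4 failed, containers 3 and 5 registered: no switch.
case_c = take_over_if_needed(["container-3", "container-5"], 2, "container-1")
```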
As described above, in one case, each container engine may include a probe therein, and each container engine may access the specified record in the preset storage area through the probe in its own container engine.
Alternatively, a service for accessing the preset storage area may be included in each container engine, and the service may be a process, and each container engine accesses the specified record in the preset storage area through the service.
The available state container engine may process service requests of the user device. For example, the service request may be a query request, a storage request, and the like, which is not limited in particular. For example, the user equipment may send a query request to the distributed cluster, in which case, the available state container engine queries data corresponding to the query request in the distributed cluster, and returns the found data to the user equipment.
For another example, the user device may send a storage request to the distributed cluster, in which case the available state container engine allocates a storage address for the data to be stored, stores the data to be stored to the storage address, and returns the storage address to the user device.
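As a rough illustration of the two request types described above, the following sketch models an available-state engine with an in-memory store. The class name, the method names, the `addr-N` address format, and querying by storage address are all assumptions of this sketch; the text does not specify them.

```python
from typing import Dict, Optional


class AvailableEngine:
    """Hypothetical available-state engine backed by an in-memory store."""

    def __init__(self) -> None:
        self._store: Dict[str, bytes] = {}
        self._next_slot = 0

    def handle_storage_request(self, data: bytes) -> str:
        """Allocate a storage address, store the data there, and return the address."""
        address = f"addr-{self._next_slot}"  # assumed address format
        self._next_slot += 1
        self._store[address] = data
        return address

    def handle_query_request(self, address: str) -> Optional[bytes]:
        """Return the data found for the query, or None if nothing is stored there."""
        return self._store.get(address)


engine = AvailableEngine()
addr = engine.handle_storage_request(b"frame-001")
found = engine.handle_query_request(addr)
```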
If a plurality of available state container engines are included in the distributed cluster, the available state container engines can also perform load balancing processing on the available state container engines in the distributed cluster. For example, taking an available state container engine as an example, when receiving a service request from a user, it (the available state container engine) can determine the memory occupation amounts of itself and other available state container engines, and if the memory occupation amount of itself is minimum, it processes the service request. Alternatively, it may determine the number of service requests it and other available state container engines are currently processing, and if it is the smallest number of service requests it is currently processing, it processes the service requests.
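The load-balancing decision described above amounts to each available-state engine checking whether it currently carries the smallest load. A minimal sketch follows; the function name, the load figures, and the tie-breaking rule are assumptions, and the same function works whether the load is memory usage or the number of requests in progress.

```python
from typing import Dict


def should_handle(own_id: str, load_by_engine: Dict[str, float]) -> bool:
    """Return True if own_id has the smallest load among available-state engines.

    Ties are broken by identifier order, which is an assumption of this sketch.
    """
    chosen = min(load_by_engine, key=lambda engine_id: (load_by_engine[engine_id], engine_id))
    return chosen == own_id


# Memory usage in MB for the two available-state engines (illustrative values).
memory_mb = {"container-2": 512.0, "container-4": 256.0}
c2_handles = should_handle("container-2", memory_mb)
c4_handles = should_handle("container-4", memory_mb)
```

Here container 4 processes the request because it uses less memory.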
As an implementation manner, referring to fig. 1b, the distributed cluster may further include a shared device, the preset storage area is located in the shared device, and each container engine is communicatively connected to the shared device. For example, the sharing device may be a hard disk and a File System shared by the plurality of nodes, for example, NFS (Network File System), and is not limited specifically.
In this embodiment, when the container engine in the non-available state detects that the container engine in the available state has failed, it may access the specified record in the shared device and determine, based on the record, whether there is a failed container engine in the available state that has not been taken over.
Alternatively, the container engine in the non-available state may access the specified record in the shared device periodically or aperiodically; when it detects that the container engine in the available state has failed, it determines, according to the most recently accessed record, whether there is a failed container engine in the available state that has not been taken over.
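One way to picture a record kept on a shared device is a small text file on storage that every node can reach. The sketch below is an assumption-heavy illustration: the `SharedRecord` class, the one-identifier-per-line layout, and the file name are hypothetical, and a temporary directory stands in for a shared mount such as NFS.

```python
import tempfile
from pathlib import Path
from typing import List


class SharedRecord:
    """Hypothetical record kept as a text file on storage shared by all nodes,
    with one container engine identifier per line."""

    def __init__(self, path: Path) -> None:
        self._path = path

    def read_ids(self) -> List[str]:
        """Return the identifiers currently in the record (empty if no file yet)."""
        if not self._path.exists():
            return []
        return [line for line in self._path.read_text().splitlines() if line]

    def add_id(self, engine_id: str) -> None:
        """Append an identifier to the record."""
        with self._path.open("a") as handle:
            handle.write(engine_id + "\n")


# A temporary directory stands in for the shared mount in this demo.
shared_dir = Path(tempfile.mkdtemp())
shared = SharedRecord(shared_dir / "available_engines.txt")
before = shared.read_ids()
shared.add_id("container-1")
after = shared.read_ids()
```

The sketch does not lock the file; coordinating concurrent writers on a real shared file system is outside what the text describes.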
In one case, for a container engine, it can continuously monitor other container engines, and can continuously access the designated records in the preset storage area, and the order in which the container engine executes the steps is not limited.
In the distributed cluster provided by the embodiment of the invention, a container engine is configured in each node, and each container engine is in either an available state or a non-available state. Every container engine in the non-available state follows a similar execution flow. Taking one such engine as an example: it monitors whether a container engine in the available state has failed; if a failure has occurred, it judges, based on a specified record in a preset storage area, whether there is a failed available-state container engine that has not yet been replaced; and if so, it switches its own state to the available state. In a first aspect, the scheme therefore needs no manual intervention: the container engine in the non-available state switches its own state automatically, the switching time is shortened, and the user is essentially unaware of the switching process.
In a second aspect, if a secondary failure occurs, for example container 2 in the available state fails, container 1 switches its own state to the available state, and container 1 then also fails, the other container engines again compete to become the new container engine in the available state. Manual intervention is still not needed, and the switching time remains short.
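The secondary-failure scenario can be walked through with the counting rule described earlier in the text. The sketch below assumes that the identifier of an engine that took over stays in the record after that engine fails, and that failures are tracked as a set of identifiers; neither detail is stated in the text, and the function name is hypothetical.

```python
from typing import List, Set


def react_to_failures(record: List[str], failed: Set[str], own_id: str) -> bool:
    """Count rule applied by one non-available engine; returns True on a switch."""
    if len(record) < len(failed):
        record.append(own_id)
        return True
    return False


record: List[str] = []
failed: Set[str] = set()

# First failure: container 2 fails and container 1 takes over.
failed.add("container-2")
first = react_to_failures(record, failed, "container-1")

# Secondary failure: container 1 itself fails and container 3 takes over.
failed.add("container-1")
second = react_to_failures(record, failed, "container-3")
```

Both takeovers happen without any operator action, which is the point the second aspect makes.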
In a third aspect, in some related schemes, a master application program is installed in a node, and master-slave switching is realized through the master application program.
Corresponding to the distributed cluster embodiment, the embodiment of the invention also provides a container state switching method and a device. The method and the device are applied to a first node in the distributed cluster, the first node can be any node in the distributed cluster, and for convenience of description, an execution main body is called the first node. A container engine is configured in each node of the distributed cluster, and for convenience of description, the container engine configured in the first node is referred to as a first container engine. The state of the first container engine is: an available state or an unavailable state.
As shown in fig. 2, the method may include the steps of:
S201: monitoring whether a container engine in the available state in the cluster fails, in the case where the state of the first container engine is the non-available state; if a failure occurs, S202 is performed.
The container engine may also be referred to as a container, which may be understood as a lightweight independently executable software package containing data required to run software, such as code, system tools, system libraries, and the like. For example, the container engine may be Docker, rkt of CoreOS (an operating system), or the like, and is not limited in particular.
In one case, only one container engine in the available state may exist in the distributed cluster, and that single container engine provides services such as data reading and writing; this reduces repeated data writes and data read/write errors. In another case, multiple container engines in the available state may exist in the distributed cluster and jointly provide services such as data reading and writing. This improves processing efficiency on the one hand and the stability of the cluster on the other; for example, when one container engine fails, the other container engines can still provide data read/write services.
In one case, only the non-available container engine monitors the available container engine for a failure; in another case, the container engines can monitor each other to monitor whether the other side fails.
For example, each container engine may include a probe, and the container engines may be monitored by the probe. Alternatively, a monitoring service may be included in each container engine, and the monitoring service may be a process, and the container engines monitor each other through the monitoring service. In one case, the container engines configured in each node are the same container engine, so that mutual monitoring among the container engines is facilitated.
If the first container engine detects a failure of the available container engine, S202 is performed.
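The monitoring in S201 can be illustrated with a minimal Python sketch. The text does not specify how a probe or monitoring service decides that an engine has failed; the heartbeat timestamps, the `HEARTBEAT_TIMEOUT` threshold, and the function name `find_failed_engines` are all assumptions made for illustration only.

```python
import time

HEARTBEAT_TIMEOUT = 5.0  # seconds; hypothetical threshold, not taken from the patent


def find_failed_engines(last_heartbeat, available_ids, now=None):
    """Return the available-state engines whose heartbeat is missing or stale.

    last_heartbeat: dict mapping engine id -> time of its last heartbeat (seconds)
    available_ids:  ids of the engines currently in the available state
    """
    now = time.time() if now is None else now
    failed = []
    for engine_id in available_ids:
        seen = last_heartbeat.get(engine_id)
        # An engine with no heartbeat at all, or a stale one, is treated as failed.
        if seen is None or now - seen > HEARTBEAT_TIMEOUT:
            failed.append(engine_id)
    return failed
```

A non-empty result would correspond to the condition under which S202 is performed.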
S202: Based on the specified record in the preset storage area, determine whether there is a failed available-state container engine that has not been taken over. If so, S203 is performed.
In one embodiment, when the first container engine detects that a container engine in the available state has failed, it may access the specified record in the preset storage area and determine, based on the record, whether there is a failed available-state container engine that has not been taken over.
In another embodiment, each container engine may access the specified record in the preset storage area periodically or aperiodically; in this embodiment, when the first container engine detects that a container engine in the available state has failed, it may determine, from the most recently read record, whether there is a failed available-state container engine that has not been taken over.
In one embodiment, the first container engine includes a first probe therein; the first container engine may access the specified records in the preset storage area through the first probe.
Alternatively, a service for accessing the preset storage area may be included in each container engine, and the service may be a process, and each container engine accesses the specified record in the preset storage area through the service.
As an implementation manner, referring to fig. 1b, the cluster may further include a shared device, the preset storage area is located in the shared device, and each container engine is communicatively connected to the shared device. For example, the shared device may be a hard disk and a file system shared by the plurality of nodes, for example, NFS (Network File System), and is not limited specifically.
In this embodiment, when the first container engine detects that a container engine in the available state has failed, it may access the specified record in the shared device and determine, based on the record, whether there is a failed available-state container engine that has not been taken over.
Alternatively, the first container engine may access the specified record in the shared device periodically or aperiodically; when a container engine in the non-available state detects that a container engine in the available state has failed, it may determine, from the most recently read record, whether there is a failed available-state container engine that has not been taken over.
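As a rough illustration of accessing the specified record on a shared device, the sketch below stores the record as a JSON list of engine identifiers in a file. The file format, the function names `read_record` and `write_record`, and the use of a temporary file plus rename are assumptions; the text only states that the record resides in a preset storage area such as a shared file system.

```python
import json
import os
import tempfile


def read_record(path):
    """Return the list of engine identifiers stored in the shared record."""
    try:
        with open(path, "r", encoding="utf-8") as f:
            return json.load(f)
    except FileNotFoundError:
        # No record file yet: no engine has taken over.
        return []


def write_record(path, engine_ids):
    """Write the record to a temporary file, then rename it over the old one."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w", encoding="utf-8") as f:
        json.dump(engine_ids, f)
    # os.replace swaps the file in one step on a local POSIX file system;
    # behaviour on a network file system may differ and should be verified.
    os.replace(tmp_path, path)
```

Here `path` would point at a location on the shared device that every node mounts, which is again a hypothetical deployment detail.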
S203: the state of the first container engine is switched to an available state.
As an embodiment, if there is only one available-state container engine in the distributed cluster, S202 may include: determining, based on the specified record in the preset storage area, whether a new available-state container engine already exists. For example, it may be determined whether a container engine identifier exists in the specified record; if not, there is no new available-state container engine, that is, there is a failed available-state container engine that has not been taken over. In this case, the state of the first container engine may be switched to the available state, and the identifier of the first container engine may be added to the record.
Referring to fig. 1a, assume that container 2 is the container engine in the available state and the other containers monitor whether container 2 fails. Assume that container 1 is the first container engine: container 1 detects that container 2 has failed and accesses the specified record in the preset storage area. If the identifier of container 3 already exists in the record, container 3 has already preempted the role and become the available-state container engine. If no container engine identifier exists in the record, container 1 switches its own state to the available state and adds its identifier to the record, so that other containers accessing the record learn that container 1 has already become the available-state container engine.
The container engine identifier may be an ID, an IP address, or the like of the container, and is not particularly limited.
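The single-available-engine case can be sketched as follows. The record is modelled as an in-memory list, and a `threading.Lock` stands in for whatever mutual exclusion the real storage area provides; both choices, and the name `try_take_over`, are assumptions, since the text does not describe how concurrent access to the record is serialized.

```python
import threading

# Stand-in for the storage area's own mutual exclusion (an assumption).
_record_lock = threading.Lock()


def try_take_over(record, engine_id):
    """Single-available-engine case of S202/S203.

    record: list of identifiers of engines that have newly become available.
    Returns True if this engine switched to the available state.
    """
    with _record_lock:
        if record:
            # An identifier is already present: another engine preempted the role.
            return False
        # No identifier yet: take over and record our own identifier.
        record.append(engine_id)
        return True
```

With an empty record, `try_take_over(record, "container1")` succeeds and a later call for `"container3"` is refused, which mirrors the container 1 and container 3 example in the text.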
As another embodiment, if there are multiple available-state container engines in the distributed cluster, S202 may include: judging whether the number of container engine identifiers in the specified record of the preset storage area is less than the number of failed available-state container engines. If so, there is a failed container engine that has not been taken over; in this case, the state of the first container engine is switched to the available state, and the identifier of the first container engine is added to the record.
Still referring to fig. 1a, assume that containers 2 and 4 are available-state container engines and the other containers monitor whether containers 2 and 4 fail. Assume that container 1 is the first container engine: container 1 detects that container 2 has failed, that is, the number of failed available-state container engines is 1, and container 1 accesses the specified record in the preset storage area. If the identifier of container 3 already exists in the record, the number of container engine identifiers in the record (1) is not less than the number of failed available-state container engines (1); that is, container 3 has already taken over from container 2, the failed available-state container engine.
If no container engine identifier exists in the record, the number of container engine identifiers in the record (0) is less than the number of failed available-state container engines (1). Container 1 then switches its own state to the available state and adds its identifier to the record, so that other containers accessing the record learn that container 1 has already taken over from container 2.
As another example, assume that containers 2 and 4 are available-state container engines and the other containers monitor whether containers 2 and 4 fail. Assume that container 1 is the first container engine: container 1 detects that both container 2 and container 4 have failed, that is, the number of failed available-state container engines is 2, and container 1 accesses the specified record in the preset storage area. If only the identifier of container 3 exists in the record, the number of container engine identifiers in the record (1) is less than the number of failed available-state container engines (2), so there is still a failed available-state container engine that has not been taken over. Container 1 then switches its own state to the available state and adds its identifier to the record. In this way, other containers accessing the record learn that container 3 and container 1 have already taken over from container 2 and container 4.
If the identifiers of container 3 and container 5 exist in the record, the number of container engine identifiers in the record (2) is not less than the number of failed available-state container engines (2), indicating that container 3 and container 5 have already taken over from container 2 and container 4.
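The count comparison used in the multi-engine case can be written down in a few lines. This is a sketch only: the record is modelled as a plain list, and the function names `should_take_over` and `take_over_if_needed` are hypothetical.

```python
def should_take_over(record_ids, failed_count):
    """True if fewer engines have taken over than have failed."""
    return len(record_ids) < failed_count


def take_over_if_needed(record_ids, failed_count, engine_id):
    """Add engine_id to the record and report True if a takeover is still needed."""
    if should_take_over(record_ids, failed_count):
        record_ids.append(engine_id)
        return True
    return False
```

Replaying the examples from the text: with one failure and container 3 already recorded, container 1 stays in the non-available state; with two failures and only container 3 recorded, container 1 takes over and the record then holds both identifiers.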
If the state of the first container engine is the available state, the first container engine may process service requests from user equipment. The service request may be, for example, a query request or a storage request, and is not specifically limited. For example, the user equipment may send a query request to the distributed cluster; in this case, the first container engine looks up the data corresponding to the query request in the distributed cluster and returns the data found to the user equipment.
For another example, the user device may send a storage request to the distributed cluster, in which case the first container engine allocates a storage address for the data to be stored, stores the data to be stored to the storage address, and returns the storage address to the user device.
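The storage and query handling described above might look like the following sketch. The in-memory dictionary, the integer addresses, and the class and method names are illustrative assumptions; the text does not say how storage addresses are allocated or how queries are keyed.

```python
import itertools


class AvailableEngine:
    """Toy model of an available-state engine serving storage and query requests."""

    def __init__(self):
        self._store = {}                          # address -> stored data
        self._next_address = itertools.count(1)   # hypothetical address allocator

    def handle_store(self, data):
        """Allocate an address, store the data there, and return the address."""
        address = next(self._next_address)
        self._store[address] = data
        return address

    def handle_query(self, address):
        """Return the data stored at the address, or None if nothing is stored."""
        return self._store.get(address)
```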
If the state of the first container engine is the available state and the distributed cluster includes multiple container engines in the available state, the first container engine may also perform load balancing across the available-state container engines in the distributed cluster. For example, when a service request from a user is received, the first container engine may determine the memory usage of itself and of the other available-state container engines, and if its own memory usage is the smallest, the first container engine processes the service request. Alternatively, the first container engine may determine the number of service requests that it and the other available-state container engines are currently processing, and if it is processing the fewest, the first container engine processes the service request.
In one case, the first container engine may continuously monitor the other container engines and may continuously access the specified record in the preset storage area; the order in which these steps are performed is not limited.
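The load balancing decision can be sketched as a comparison of one load metric across the available-state engines. The function name `should_handle_request` is hypothetical, and the tie-breaking rule (smaller identifier wins) is an assumption, since the text does not address ties.

```python
def should_handle_request(my_id, load_by_engine):
    """Return True if this engine carries the lowest load.

    load_by_engine: dict mapping engine id -> load metric, for example memory
    in use or the number of service requests currently being processed.
    Ties are broken by the smaller engine id (an assumption).
    """
    least_loaded = min(load_by_engine, key=lambda eid: (load_by_engine[eid], eid))
    return least_loaded == my_id
```

The same function serves both variants in the text; only the metric placed in `load_by_engine` changes.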
By applying the embodiment of the invention, if the state of the first container engine is the non-available state, the first container engine monitors whether a container engine in the available state in the cluster fails; if a failure occurs, it judges, based on a specified record in a preset storage area, whether there is a failed available-state container engine that has not been taken over; and if so, the state of the first container engine is switched to the available state. Therefore, in this scheme, in the first aspect, no manual intervention is needed: the first container engine switches to the available state automatically, the switching time is shortened, and the user is essentially unaware of the switching process.
In the second aspect, if a secondary failure occurs, for example, the first container engine also fails after switching its own state to the available state, the other container engines continue to compete to become the new available-state container engine. Manual intervention is still not needed, and the switching time is relatively short.
In a third aspect, in some related schemes, a master application program is installed in a node, and master-slave switching is realized through the master application program.
Corresponding to the foregoing method embodiment, an embodiment of the present invention further provides a container state switching apparatus, which is applied to a first node in a distributed cluster, where the first node is configured with a first container engine, and a state of the first container engine is: an available state or an unavailable state. As shown in fig. 3, the apparatus includes:
a monitoring module 301, configured to monitor whether a container engine in an available state in the cluster fails when the state of the first container engine is a non-available state; if a fault occurs, the judging module 302 is triggered;
a judging module 302, configured to judge, based on a specified record in a preset storage area, whether there is a failed available-state container engine that has not been taken over; if so, the switching module 303 is triggered;
a switching module 303, configured to switch a state of the first container engine to an available state.
As an embodiment, the determining module 302 may specifically be configured to: judging whether a container engine identifier exists in a specified record in a preset storage area or not; if not, indicating that there is a failed container engine that is not superseded;
in this embodiment, the apparatus may further include: a first adding module (not shown in the figure) for adding the identifier of the first container engine in the record if the identifier of the container engine does not exist in the record.
As an embodiment, the determining module 302 may specifically be configured to: judging whether the number of the container engine identifications in the specified record in the preset storage area is less than the number of the container engines in the available state with faults; if so, indicating that there is a failed container engine that is not superseded;
in this embodiment, the apparatus may further include: a second adding module (not shown in the figure) for adding the identification of the first container engine in the record in case the number of container engine identifications in the record is smaller than the number of container engines in the failed available state.
As an embodiment, the apparatus further comprises:
a processing module (not shown in the figure), configured to perform load balancing processing on the container engines in the available states in the distributed cluster when the state of the first container engine is an available state.
In one embodiment, the first container engine includes a first probe therein; the device further comprises:
and an accessing module (not shown in the figure) for accessing the specified record in the preset storage area through the first probe.
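The way the modules of the apparatus trigger one another can be pictured with a small class. This is a sketch under assumptions: the class name `ContainerStateSwitcher`, the string state values, and the list-based record are illustrative, and the judging step follows the method embodiment, in which the switch happens when a failed available-state engine has not yet been taken over.

```python
class ContainerStateSwitcher:
    """Toy model of the monitoring, judging, switching and adding modules."""

    def __init__(self, engine_id, record):
        self.engine_id = engine_id
        self.record = record          # shared list of engine identifiers
        self.state = "unavailable"

    def monitor(self, failed_ids):
        """Monitoring module: trigger judging only when unavailable and a failure is seen."""
        if self.state == "unavailable" and failed_ids:
            return self.judge(len(failed_ids))
        return False

    def judge(self, failed_count):
        """Judging module: trigger switching if a failure has not been taken over."""
        if len(self.record) < failed_count:
            self.switch()
            return True
        return False

    def switch(self):
        """Switching module plus adding module: change state and record the identifier."""
        self.state = "available"
        self.record.append(self.engine_id)
```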
By applying the embodiment of the invention, if the state of the first container engine is the non-available state, the first container engine monitors whether a container engine in the available state in the cluster fails; if a failure occurs, it judges, based on a specified record in a preset storage area, whether there is a failed available-state container engine that has not been taken over; and if so, the state of the first container engine is switched to the available state. Therefore, in this scheme, in the first aspect, no manual intervention is needed: the first container engine switches to the available state automatically, and the switching time is shortened.
An embodiment of the present invention further provides an electronic device, as shown in fig. 4, including a processor 401 and a memory 402;
a memory 402 for storing a computer program;
the processor 401 is configured to implement any of the above-described container state switching methods when executing the program stored in the memory 402.
The Memory may include a Random Access Memory (RAM) or a Non-Volatile Memory (NVM), such as at least one disk Memory. Optionally, the memory may also be at least one memory device located remotely from the processor.
The processor may be a general-purpose processor, including a Central Processing Unit (CPU), a Network Processor (NP), and the like; it may also be a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
The embodiment of the invention also provides a computer-readable storage medium, wherein a computer program is stored in the computer-readable storage medium, and when the computer program is executed by a processor, the computer program realizes any one of the container state switching methods.
It is noted that, herein, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.
All the embodiments in the present specification are described in a related manner, and the same and similar parts among the embodiments may be referred to each other, and each embodiment focuses on the differences from the other embodiments. In particular, as for the method embodiment, the apparatus embodiment, the device embodiment, and the computer-readable storage medium embodiment, since they are substantially similar to the distributed cluster embodiment, the description is relatively simple, and the relevant points can be referred to the partial description of the distributed cluster embodiment.
The above description is only for the preferred embodiment of the present invention, and is not intended to limit the scope of the present invention. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention shall fall within the protection scope of the present invention.

Claims (18)

1. A distributed cluster is characterized in that the cluster comprises a plurality of nodes, each node is provided with a container engine, and the container engines comprise a container engine in an available state and a container engine in a non-available state;
the container engine in the non-available state is configured to monitor whether the container engine in the available state fails; if a failure occurs, to judge, based on a specified record in a preset storage area, whether there is a failed available-state container engine that has not been taken over; and if so, to switch its own state to the available state;
wherein, the container engine identifier of the container engine of the new available state is recorded in the specified record.
2. The distributed cluster according to claim 1, wherein the container engine in the unavailable state is further configured to determine whether a container engine identifier exists in the record, and if not, switch the state of the container engine in the unavailable state to the available state, and add the identifier of the container engine switched to the available state to the record.
3. The distributed cluster of claim 1, wherein the container engine in the non-available state is further configured to determine whether the number of container engine identifiers in the record is less than the number of failed available-state container engines; and if so, to switch its own state to the available state and add its own identifier to the record.
4. The distributed cluster of claim 3, wherein the available state container engines are configured to load balance the available state container engines in the distributed cluster.
5. The distributed cluster of claim 1, wherein the unavailable container engine is further configured to access the specified records in the predetermined storage area through a probe in its own container engine.
6. The distributed cluster of claim 1, wherein the cluster further comprises a shared device, wherein the predetermined storage area is located in the shared device, and wherein each container engine is communicatively connected to the shared device;
the non-available state container engine is further configured to access a specified record in the shared device when the failure of the available state container engine is monitored.
7. A container state switching method is applied to a first node in a distributed cluster, wherein a first container engine is configured in the first node, and the state of the first container engine is as follows: an available state or a non-available state; the method comprises the following steps:
monitoring whether a container engine in an available state in the cluster fails if the state of the first container engine is in a non-available state;
if a failure occurs, judging, based on a specified record in a preset storage area, whether there is a failed available-state container engine that has not been taken over;
if so, switching the state of the first container engine to an available state;
wherein, the container engine identification of the container engine with the new available state is recorded in the specified record.
8. The method of claim 7, wherein determining whether there is a failed container engine that is not superseded based on a specified record in a preset storage area comprises:
judging whether a container engine identifier exists in a specified record in a preset storage area or not;
if not, indicating that there is a failed container engine that is not superseded;
the method further comprises the following steps:
adding the identity of the first container engine in the record if the container engine identity is not present in the record.
9. The method of claim 7, wherein determining whether there is a failed container engine that is not superseded based on a specified record in a preset storage area comprises:
judging whether the number of the container engine identifications in the specified record in the preset storage area is less than the number of the container engines in the available state with faults;
if so, indicating that there is a failed container engine that is not superseded;
the method further comprises the following steps:
in case the number of container engine identifications in the record is smaller than the number of container engines in a failed available state, adding the identification of the first container engine in the record.
10. The method of claim 9, further comprising:
and when the state of the first container engine is an available state, performing load balancing processing on the container engines in the available states in the distributed cluster.
11. The method of claim 7, wherein the first container engine includes a first probe therein; the method further comprises the following steps:
and accessing the specified record in the preset storage area through the first probe.
12. A container state switching device is applied to a first node in a distributed cluster, wherein a first container engine is configured in the first node, and a state of the first container engine is as follows: an available state or a non-available state; the device comprises:
a monitoring module, configured to monitor whether a container engine in an available state in the cluster fails when a state of the first container engine is a non-available state; if the fault occurs, triggering a judgment module;
the judging module is used for judging, based on the specified record in the preset storage area, whether there is a failed available-state container engine that has not been taken over; if so, triggering a switching module;
a switching module for switching the state of the first container engine to an available state;
wherein, the container engine identification of the container engine with the new available state is recorded in the specified record.
13. The apparatus of claim 12, wherein the determining module is specifically configured to:
judging whether a container engine identifier exists in a specified record in a preset storage area or not; if not, indicating that there is a failed container engine that is not superseded;
the device further comprises:
a first adding module, configured to add, in the record, an identifier of the first container engine if the identifier of the container engine does not exist in the record.
14. The apparatus of claim 12, wherein the determining module is specifically configured to:
judging whether the number of the container engine identifications in the specified record in the preset storage area is less than the number of the container engines in the available state with faults; if so, indicating that there is a failed container engine that is not superseded;
the device further comprises:
a second adding module, configured to add the identifier of the first container engine to the record if the number of container engine identifiers in the record is less than the number of container engines in the failed available state.
15. The apparatus of claim 14, further comprising:
and the processing module is used for carrying out load balancing processing on the container engines in the available states in the distributed cluster under the condition that the state of the first container engine is the available state.
16. The apparatus of claim 12, wherein the first container engine comprises a first probe therein; the device further comprises:
and the access module is used for accessing the specified record in the preset storage area through the first probe.
17. An electronic device comprising a processor and a memory;
a memory for storing a computer program;
a processor for implementing the method steps of any of claims 7 to 11 when executing a program stored in the memory.
18. A computer-readable storage medium, characterized in that a computer program is stored in the computer-readable storage medium, which computer program, when being executed by a processor, carries out the method steps of any of the claims 7-11.
CN201910131705.XA 2019-02-22 2019-02-22 Distributed cluster and container state switching method and device Active CN111614701B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910131705.XA CN111614701B (en) 2019-02-22 2019-02-22 Distributed cluster and container state switching method and device

Publications (2)

Publication Number Publication Date
CN111614701A CN111614701A (en) 2020-09-01
CN111614701B true CN111614701B (en) 2022-09-02

Family

ID=72197509

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910131705.XA Active CN111614701B (en) 2019-02-22 2019-02-22 Distributed cluster and container state switching method and device

Country Status (1)

Country Link
CN (1) CN111614701B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113905048A (en) * 2021-09-30 2022-01-07 北京蓝海医信科技有限公司 Method and device for scheduling engine instance by cluster manager and computer equipment
CN113867649B (en) * 2021-10-20 2024-05-10 上海万向区块链股份公司 System and method for adaptive blockchain data storage plugin

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102394807A (en) * 2011-08-23 2012-03-28 北京京北方信息技术有限公司 System and method for decentralized scheduling of autonomous flow engine load balancing clusters
CN107171874A (en) * 2017-07-21 2017-09-15 维沃移动通信有限公司 A kind of speech engine switching method, mobile terminal and server
CN108052827A (en) * 2017-12-25 2018-05-18 北京天融信网络安全技术有限公司 A kind of switching method with double engines, device and storage medium

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9928148B2 (en) * 2014-08-21 2018-03-27 Netapp, Inc. Configuration of peered cluster storage environment organized as disaster recovery group

Also Published As

Publication number Publication date
CN111614701A (en) 2020-09-01

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant