CN109101196A - Host node switching method, device, electronic equipment and computer storage medium - Google Patents
- Publication number
- CN109101196A (application CN201810925076.3A)
- Authority
- CN
- China
- Prior art keywords
- node
- host node
- service center
- coordination service
- metadata
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/06—Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
- G06F3/0601—Interfaces specially adapted for storage systems
- G06F3/0602—Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
- G06F3/0614—Improving the reliability of storage systems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/06—Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
- G06F3/0601—Interfaces specially adapted for storage systems
- G06F3/0628—Interfaces specially adapted for storage systems making use of a particular technique
- G06F3/0629—Configuration or reconfiguration of storage systems
- G06F3/0631—Configuration or reconfiguration of storage systems by allocating resources to storage systems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/06—Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
- G06F3/0601—Interfaces specially adapted for storage systems
- G06F3/0668—Interfaces specially adapted for storage systems adopting a particular infrastructure
- G06F3/067—Distributed or networked storage systems, e.g. storage area networks [SAN], network attached storage [NAS]
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Data Exchanges In Wide-Area Networks (AREA)
Abstract
This application relates to the field of Internet technology and discloses a host node switching method and apparatus, an electronic device, and a computer-readable storage medium. The host node switching method includes: when a first failover controller detects that the current host node has failed, sending a contention request to the coordination service center, the contention request asking the coordination service center to designate the metadata node corresponding to the first failover controller as the target host node; then, when a confirmation message from the coordination service center is received, switching the metadata node corresponding to the first failover controller to the target host node via a virtual IP address (VIP). The method of the embodiments makes host node access compatible with low-version clients: even when an active/standby switchover of metadata nodes occurs, existing low-version clients can still access the new host node normally.
Description
Technical field
This application relates to the field of Internet technology, and in particular to a host node switching method and apparatus, an electronic device, and a computer storage medium.
Background art

In current large-scale distributed storage systems, centralized metadata management is commonly used to implement unified permission authentication and quota control: the metadata of all data in the system is stored centrally in a number of metadata nodes (NameNodes).

In such an architecture, the metadata node acting as host node provides data query, update, and other services to clients, so the availability of the metadata nodes directly determines the availability of the whole system. The availability of the metadata nodes in a distributed storage system is therefore usually improved through redundancy. The common approach is an HA (High Availability) scheme: when the metadata node acting as host node enters an abnormal state, a standby metadata node replaces it, i.e., the standby metadata node is switched to become the new host node and continues to provide query, update, and other services to clients.

However, this approach only works for high-version clients that have the necessary decision logic: a high-version client can judge whether the current metadata node is the host node and, if not, keep checking the other metadata nodes until it finds the one acting as host node, through which it then queries and updates data. Low-version clients designed earlier have no such decision logic and can only access a host node at a fixed address. When the host node is switched because of a failure, a low-version client cannot obtain the address of the new host node and can no longer query or update data through the host node after the switch. In other words, the approach above cannot be used for low-version clients without decision logic; it is not compatible with them.
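The difference between the two client generations described above can be sketched as a simple probe loop. This is an illustrative sketch only, not code from the application; the node list, the `is_master` flag, and the `find_master` helper are all hypothetical names.

```python
# Hypothetical sketch of a high-version client's decision logic:
# probe each known metadata node until the one acting as host node is found.

def find_master(nodes):
    """Return the first node reporting itself as host node, else None."""
    for node in nodes:
        if node.get("is_master"):      # the client can ask each node its role
            return node
    return None                        # no host node reachable

# A low-version client has no such loop: it can only contact one fixed
# address, so after a switchover it keeps talking to the failed node.

nodes = [
    {"addr": "10.0.0.11", "is_master": False},
    {"addr": "10.0.0.12", "is_master": True},
]
master = find_master(nodes)
```

The application's goal is precisely to make this loop unnecessary for both client generations by putting a fixed VIP in front of whichever node is currently the host node.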
Summary of the invention
The purpose of this application is to solve at least one of the technical deficiencies above, in particular the inability to stay compatible with low-version clients that lack decision logic.

In a first aspect, a host node switching method is provided, comprising:

when a first failover controller detects that the current host node has failed, sending a contention request to the coordination service center, the contention request asking the coordination service center to designate the metadata node corresponding to the first failover controller as the target host node;

when a confirmation message from the coordination service center is received, switching the metadata node corresponding to the first failover controller to the target host node via a virtual IP address (VIP).

Specifically, monitoring whether the current host node has failed comprises:

sending a fault inquiry request to the coordination service center at a preset time interval, the fault inquiry request asking the coordination service center to check whether it holds fault information indicating that the current host node has failed;

if a confirmation message returned by the coordination service center is received, determining that the current host node has failed.

Further, the fault information is the information sent by a second failover controller when it detects that its corresponding metadata node, currently acting as host node, has failed; the first and second failover controllers are both managed by the coordination service center.

Further, switching the metadata node corresponding to the first failover controller to the target host node via the VIP comprises:

unbinding the VIP from the failed host node and binding the VIP to the metadata node corresponding to the first failover controller, so as to switch that metadata node to the target host node.

Further, before the VIP is bound to the metadata node corresponding to the first failover controller, the method further comprises:

isolating the failed host node from the target host node.

Further, after the VIP is bound to the metadata node corresponding to the first failover controller, the method further comprises:

switching the metadata node corresponding to the first failover controller from the inactive state to the active state, thereby switching it to the target host node.
Second aspect provides a kind of host node switching device, comprising:
Sending module, for when Fisrt fault monitoring control devices to current primary node break down, to Distributed Application
Program Coordination service coordination service centre sends contention requests, and contention requests are for requesting coordination service center will be with Fisrt fault
The corresponding metadata node of controller is determined as target host node;
Switching module, for being incited somebody to action by virtual IP VIP when receiving the confirmation message at coordination service center
The corresponding metadata node of Fisrt fault controller is switched to target host node.
Specifically, sending module includes that fault inquiry submodule and failure determine submodule;
Fault inquiry submodule, for sending fault inquiry request, failure to coordination service center with prefixed time interval
Fault message of the inquiry request for breaking down in request detection coordination service center with the presence or absence of current primary node;
Failure determines submodule, for determining current main section when receiving the confirmation message of coordination service center return
Point breaks down.
Further, fault message is that the second failed controller monitors the corresponding first number for being currently at host node
The fault message sent when according to nodes break down;Wherein, Fisrt fault controller and the second failed controller are by coordinating to take
The unified management of business center.
Further, switching module is specifically used for VIP and the unbinding relationship of host node that breaks down, and by VIP
Metadata node corresponding with Fisrt fault controller establishes binding relationship, to be used for the corresponding first number of Fisrt fault controller
Target host node is switched to according to node.
It further, further include isolation module;
The isolation module, for the host node to break down to be isolated with target host node.
It further, further include processing module;
The processing module, for the corresponding metadata node of Fisrt fault controller to be switched to work by inactive state
Dynamic state, is switched to target host node for the corresponding metadata node of Fisrt fault controller.
The third aspect, provides a kind of electronic equipment, including memory, processor and storage on a memory and can located
The computer program run on reason device, processor realize above-mentioned host node switching method when executing described program.
Fourth aspect provides a kind of computer readable storage medium, calculating is stored on computer readable storage medium
Machine program, the program realize above-mentioned host node switching method when being executed by processor.
In the host node switching method provided by embodiments of this application, when the first failover controller detects that the current host node has failed, it sends a contention request to the coordination service center asking it to designate the metadata node corresponding to the first failover controller as the target host node, which lays the groundwork for subsequently switching that metadata node to the target host node via the VIP. When the confirmation message from the coordination service center is received, the metadata node corresponding to the first failover controller is switched to the target host node via the VIP. As a result, neither high-version nor low-version clients need to judge whether the current metadata node is the host node: a client only needs to be configured with a fixed VIP that always points at the host node, and can access data through the host node. Even when an active/standby switchover of metadata nodes occurs, existing low-version clients can still access the new host node normally, with no need to batch-upgrade them, which makes host node access compatible with low-version clients.

Additional aspects and advantages of the application will be set forth in part in the following description; they will become apparent from that description or be learned through practice of the application.
Brief description of the drawings

The above and/or additional aspects and advantages of the application will become apparent and readily understood from the following description of embodiments taken in conjunction with the accompanying drawings, in which:

Fig. 1 is a flow diagram of the host node switching method of an embodiment of the application;

Fig. 2 is a schematic diagram of the host node switching process of an embodiment of the application;

Fig. 3 is a schematic diagram of the basic structure of the host node switching apparatus of an embodiment of the application;

Fig. 4 is a schematic diagram of the detailed structure of the host node switching apparatus of an embodiment of the application;

Fig. 5 is a schematic structural diagram of the electronic device of an embodiment of the application.
Detailed description

Embodiments of the application are described in detail below; examples of the embodiments are shown in the accompanying drawings, in which the same or similar reference numbers denote the same or similar elements, or elements with the same or similar functions, throughout. The embodiments described below with reference to the drawings are exemplary, are only used to explain the application, and are not to be construed as limiting it.

Those skilled in the art will appreciate that, unless expressly stated otherwise, the singular forms "a", "an", "the", and "said" used herein may also include the plural forms. It should be further understood that the word "comprising" used in this description means that the stated features, integers, steps, operations, elements, and/or components are present, but does not exclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof. When an element is said to be "connected" or "coupled" to another element, it may be directly connected or coupled to the other element, or intermediate elements may be present; "connection" or "coupling" as used herein may also include wireless connection or coupling. The word "and/or" used herein includes all or any unit of, and all combinations of, one or more of the associated listed items.

To make the purposes, technical solutions, and advantages of the application clearer, embodiments of the application are described in further detail below in conjunction with the accompanying drawings.
ZKFC (ZooKeeper Failover Controller) is a component of the HA (High Availability) scheme introduced in HDFS (Hadoop Distributed File System) from version 2.0 onward. ZKFC monitors in real time whether the metadata node acting as host node has failed and, when a failure occurs, replaces the failed host node with a standby metadata node, i.e., switches the standby metadata node to become the new host node, which then provides data query, update, and other services to clients matched to HDFS 2.0.

Here, a client matched to HDFS 2.0 is a high-version client (e.g., client 2.0), and a client matched to HDFS 1.0 is a low-version client (e.g., client 1.0). Because high-version clients are matched to HDFS 2.0, which introduces ZKFC, they have decision logic about the host node: a high-version client can judge whether the current metadata node is the host node and, if not, keep checking the other metadata nodes until it finds the one acting as host node, through which it then queries and updates data.

Low-version clients (e.g., client 1.0), however, are matched to HDFS 1.0. Since HDFS 1.0 does not introduce the ZKFC component, low-version clients matched to it have no decision logic and can only access a host node at a fixed address. When the host node is switched because of a failure, a low-version client cannot obtain the address of the new host node and therefore cannot query or update data through the host node after the switch.

The host node switching method, apparatus, electronic device, and computer-readable storage medium provided by this application aim to solve the above technical problems of the prior art.

The technical solutions of the application, and how they solve the above technical problems, are described in detail below with specific embodiments. The specific embodiments below may be combined with each other, and the same or similar concepts or processes may not be repeated in some embodiments. Embodiments of the application are described below in conjunction with the accompanying drawings.
Embodiment one

An embodiment of the application provides a host node switching method, as shown in Fig. 1, comprising:

Step S110: when the first failover controller detects that the current host node has failed, send a contention request to the coordination service center, the contention request asking the coordination service center to designate the metadata node corresponding to the first failover controller as the target host node.

Specifically, the failover controller in this embodiment may be a ZKFC (ZooKeeper Failover Controller), and the coordination service center may be ZooKeeper (the distributed application coordination service). The embodiment is described below taking ZKFC as the failover controller and ZooKeeper as the coordination service center:
With the continuous development and updating of technology, on the basis of the version 1.0 HDFS system (HDFS 1.0), a new version 2.0 HDFS system (HDFS 2.0) has been released; the server-side HDFS cluster contains several (e.g., two) HDFS 2.0 instances. Each HDFS instance contains one metadata node (NameNode), multiple data nodes (DataNodes), and multiple secondary metadata nodes (Secondary NameNodes). Each HDFS instance provides data read, write, and other services to its clients through its metadata node, and the health of each metadata node is monitored in real time by a corresponding ZKFC: for example, the first ZKFC monitors the metadata node of the first HDFS 2.0 instance and the second ZKFC monitors the metadata node of the second instance (the assignment could equally be the other way around; this embodiment does not limit it). Among the several HDFS 2.0 instances in the cluster, only one metadata node can act as host node at a time, providing data read, write, and other services to clients.

The embodiment is illustrated below with the first ZKFC monitoring in real time the health of the metadata node of the first HDFS 2.0 instance (hereafter metadata node 1), and the second ZKFC monitoring in real time the health of the metadata node of the second instance (hereafter metadata node 2):

Specifically, the second ZKFC monitors the health of metadata node 2, which is the host node; that is, the VIP is currently bound to, and points at, metadata node 2. Since only one metadata node can act as host node, the metadata node of the first instance (metadata node 1) is a non-host node (also called a standby host node), and the first ZKFC monitors its health.
Further, when the second ZKFC detects that the current host node (metadata node 2) has failed, it reports the fault information to the distributed application coordination service ZooKeeper. Through ZooKeeper, the first ZKFC can learn whether the current host node has failed; when it detects the failure, it sends ZooKeeper a contention request asking ZooKeeper to designate metadata node 1, the node corresponding to the first ZKFC, as the target host node. In other words, the first ZKFC initiates contention for the role of target host node with ZooKeeper.

Note that the first ZKFC only initiates this contention if it detects the failure of the current host node while metadata node 1 itself is healthy. If the first ZKFC finds metadata node 1 unhealthy, it will not contend for the target host node role with ZooKeeper even if the current host node has failed.
Further, a ZKFC checks the health of its metadata node through a health monitor (HealthMonitor). When the health status of the metadata node changes, a callback notifies the ZKFailoverController, which then reports to ZooKeeper. The HealthMonitor checks health by periodically sending a request packet to the metadata node: if no response to the packet is received, or the response takes longer than a preset duration threshold, the metadata node is determined to be unhealthy.
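The HealthMonitor's rule above (no response, or a response slower than a preset duration threshold, means unhealthy) can be sketched as follows. This is a minimal sketch under stated assumptions: `probe` is a hypothetical callable standing in for the actual request packet exchange, not part of the HDFS API.

```python
import time

def check_health(probe, timeout_threshold):
    """Return True only if the node answers the probe within the threshold.

    `probe` is a hypothetical callable standing in for sending a request
    packet to the metadata node; it returns True on a response and returns
    False (or raises) when the node does not answer.
    """
    start = time.monotonic()
    try:
        responded = probe()
    except Exception:
        return False                      # no response at all -> unhealthy
    elapsed = time.monotonic() - start
    if not responded:
        return False
    return elapsed <= timeout_threshold   # too slow counts as unhealthy too

# A healthy node answers quickly; a node that answers only after the
# threshold is treated the same as one that never answers.
healthy = check_health(lambda: True, timeout_threshold=1.0)
```

In the real component this check runs periodically and a state change triggers the callback into the ZKFailoverController; the sketch shows only a single round of the check.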
Step S120: when the confirmation message from the coordination service center is received, switch the metadata node corresponding to the first failover controller to the target host node via the virtual IP address (VIP).

Specifically, when the first ZKFC detects that the current host node has failed, it initiates contention for the host node role with ZooKeeper by sending the contention request described above, asking ZooKeeper to designate the metadata node corresponding to the first ZKFC as the target host node. The contention request may also carry the health information and address of metadata node 1, which the first ZKFC monitors.

Further, after receiving the contention request from the first ZKFC, ZooKeeper checks metadata node 1 against predefined rules. When metadata node 1 satisfies the rules, ZooKeeper decides to promote the metadata node monitored by the first ZKFC to host node and sends the first ZKFC a corresponding confirmation message. When the first ZKFC receives the confirmation message returned by ZooKeeper, it has won the contention for the host node role. At this point the first ZKFC switches the metadata node corresponding to it to the target host node via the VIP (virtual IP address): the first ZKFC changes the VIP from pointing at metadata node 2 (the failed host node) to pointing at metadata node 1 (the target host node). Host node switching is thus accomplished, so that neither high-version nor low-version clients need to judge whether the current metadata node is the host node: a client only needs to be configured with the fixed VIP, which always points at the host node, and can access data through the host node both before and after the switch.
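The VIP rebinding just described can be sketched with the VIP modeled as a single entry in a routing table that maps the fixed client-facing address to the real address of the current host node. All names here are hypothetical illustration, not the application's implementation.

```python
# Illustrative sketch of the VIP switch: clients always connect to the
# VIP, so only this one mapping has to change during a switchover.

def switch_vip(routing, vip, new_master_addr):
    """Unbind the VIP from the failed host node and bind it to the new one.

    Returns the address the VIP pointed at before the switch.
    """
    old = routing.get(vip)            # address of the failed host node
    routing[vip] = new_master_addr    # rebind: VIP now points at the target
    return old

routing = {"10.0.0.100": "10.0.0.12"}           # VIP -> metadata node 2
previous = switch_vip(routing, "10.0.0.100", "10.0.0.11")  # -> metadata node 1
```

On a real host the same effect would typically be achieved by removing the VIP from the failed machine's network interface and adding it on the new host node (with address-management tooling such as `ip addr`, plus a gratuitous ARP so peers learn the new binding), but those commands are deployment-specific and are not specified by the application.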
Compared with the prior art, in the host node switching method provided by this embodiment, when the first failover controller detects that the current host node has failed, it sends a contention request to the coordination service center asking it to designate the metadata node corresponding to the first failover controller as the target host node, which lays the groundwork for subsequently switching that metadata node to the target host node via the VIP. When the confirmation message from the coordination service center is received, the metadata node corresponding to the first failover controller is switched to the target host node via the VIP. Consequently, neither high-version nor low-version clients need to judge whether the current metadata node is the host node: a client only needs to be configured with a fixed VIP that always points at the host node in order to access data through it. Even when an active/standby switchover of metadata nodes occurs, existing low-version clients can still access the new host node normally, with no need to batch-upgrade them, which makes host node access compatible with low-version clients.
Embodiment two

An embodiment of the application provides another possible implementation: on the basis of embodiment one, the method shown in embodiment two is further included, in which:

before step S110, the method further includes step S100 (not shown): the first failover controller monitors whether the current host node has failed. Step S100 specifically includes step S1001 (not shown) and step S1002 (not shown):

Step S1001: send a fault inquiry request to the coordination service center at a preset time interval, the fault inquiry request asking the coordination service center to check whether it holds fault information indicating that the current host node has failed.

Step S1002: if a confirmation message returned by the coordination service center is received, determine that the current host node has failed.

Here, the fault information is the information sent by the second failover controller when it detects that its corresponding metadata node, currently acting as host node, has failed; the first and second failover controllers are both managed by the coordination service center.
Specifically, the second ZKFC monitors the health of its corresponding metadata node 2, the current host node, at a preset time interval; the interval may be 1, 3, or 5 seconds, or any other value chosen for the actual situation. The second ZKFC checks the host node's health by sending it a request packet at each interval: if no response to the packet is received, or the response takes longer than a preset duration threshold, the host node is determined to have failed, and the second ZKFC sends ZooKeeper a fault message stating that the host node has failed.

Further, after receiving the fault message from the second ZKFC, ZooKeeper saves it. Meanwhile, the first ZKFC sends fault inquiry requests to ZooKeeper at a preset time interval (e.g., 1, 2, or 4 seconds, or any other value chosen for the actual situation), each request asking whether ZooKeeper holds fault information indicating that the current host node has failed. When ZooKeeper holds such fault information, it returns a corresponding confirmation message to the first ZKFC; upon receiving the confirmation message returned by ZooKeeper, the first ZKFC determines that the current host node has failed.
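The report-and-poll exchange above can be sketched with the coordination service center modeled as a tiny shared store. In a real deployment this role is played by ZooKeeper (e.g., watched znodes); the class and method names below are hypothetical stand-ins, not the ZooKeeper API.

```python
class CoordinationCenter:
    """Minimal stand-in for the coordination service center (ZooKeeper):
    it saves fault information reported by one failover controller so
    that the other can discover it by polling."""

    def __init__(self):
        self._fault = None

    def report_fault(self, info):
        # Called by the second failover controller when the host node fails.
        self._fault = info

    def query_fault(self):
        # Polled by the first failover controller at its preset interval;
        # a non-None result plays the role of the confirmation message.
        return self._fault

def master_has_failed(center):
    """One polling round of the first failover controller's fault inquiry."""
    return center.query_fault() is not None

center = CoordinationCenter()
before = master_has_failed(center)            # nothing reported yet
center.report_fault("metadata node 2 down")   # second controller reports
after = master_has_failed(center)             # first controller now sees it
```

The actual ZooKeeper-based exchange would use ephemeral nodes and watches rather than raw polling, but the sketch preserves the embodiment's ordering: report first, then discovery by inquiry.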
Further, the first ZKFC and the second ZKFC are components of different HDFS instances, each existing in its own instance; that is, the first ZKFC and the second ZKFC exist independently of each other. Both, however, communicate with ZooKeeper and contend for the host node role through ZooKeeper, i.e., they are managed collectively by ZooKeeper.

In this embodiment, by sending fault inquiry requests to ZooKeeper, the first ZKFC can promptly learn whether the current host node has failed, ensuring that it can contend for the host node role at the first opportunity when a failure occurs. This effectively avoids the situation in which, because the host node is not switched in time after a failure, clients cannot access the host node.
Embodiment three

An embodiment of the application provides another possible implementation: on the basis of embodiment two, the method shown in embodiment three is further included, in which:

step S120 specifically includes: unbind the VIP from the failed host node and bind the VIP to the metadata node corresponding to the first failover controller, so as to switch that metadata node to the target host node.

Before step S120, the method further includes step S111 (not shown): isolate the failed host node from the target host node.

After step S120, the method further includes step S121 (not shown): switch the metadata node corresponding to the first failover controller from the inactive state to the active state, thereby switching it to the target host node.
Specifically, when the first ZKFC receives the confirmation message returned by ZooKeeper for its contention request, this indicates that the first ZKFC has successfully contended for the primary role, where the contention request is used to request ZooKeeper to determine the metadata node corresponding to the first ZKFC as the target primary node. After the first ZKFC wins the contention, it needs to switch its corresponding metadata node 1 to the primary node, i.e., start the primary node switchover procedure.
Further, in the switchover procedure, the first ZKFC first triggers the isolation (fencing) process, i.e., isolates the failed primary node from the target primary node, to ensure that the monitored failed primary node is no longer in the active state and no longer provides service to clients as the primary node — that is, to ensure that only one metadata node serves as the primary node at any time, preventing the split-brain that would occur if two primary nodes existed simultaneously. Then, the first ZKFC starts the VIP switchover procedure: the VIP that originally pointed to the failed primary node is switched to point to metadata node 1 corresponding to the first ZKFC, i.e., the VIP is unbound from the failed primary node and a binding relationship is established between the VIP and the metadata node corresponding to the first ZKFC, so as to switch that metadata node to the target primary node, so that clients of all versions can access data on HDFS through the primary node in real time via the VIP. Finally, the first ZKFC notifies its corresponding metadata node 1 to switch from the inactive state to the active state, i.e., the first ZKFC switches its corresponding metadata node from the inactive state to the active state, thereby switching it to the target primary node and providing service to clients of all versions.
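The three-step switchover just described (fencing, VIP rebinding, activation) can be modelled as follows. This is a minimal Python sketch under the assumption that each step reduces to a single state change; real fencing would, for example, terminate the old NameNode process or revoke its storage access, and `MetadataNode` is a hypothetical stand-in for an HDFS metadata node.

```python
class MetadataNode:
    def __init__(self, name):
        self.name = name
        self.state = "inactive"   # "active" means it serves as the primary
        self.fenced = False

def failover(vip, failed_node, target_node):
    """Sketch of the switchover flow of Embodiment Three:
    1) fence the failed primary so two primaries never coexist (no split-brain);
    2) unbind the VIP from the failed primary and bind it to the target;
    3) switch the target node from the inactive to the active state."""
    failed_node.fenced = True            # step 1: isolation / fencing
    failed_node.state = "inactive"
    vip["bound_to"] = target_node.name   # step 2: VIP rebinding
    target_node.state = "active"         # step 3: activation
    return vip, target_node

node2 = MetadataNode("metadata-node-2"); node2.state = "active"   # failed primary
node1 = MetadataNode("metadata-node-1")                            # target primary
vip = {"bound_to": node2.name}
vip, new_primary = failover(vip, node2, node1)
assert vip["bound_to"] == "metadata-node-1"
assert new_primary.state == "active" and node2.state == "inactive"
```

On a Linux host, the VIP rebinding of step 2 would typically be performed with `ip addr del`/`ip addr add` followed by a gratuitous ARP announcement so that clients learn the new binding; this deployment detail is an assumption, not something the embodiments specify.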
Further, Fig. 2 is a schematic flowchart of the primary node switchover process of Embodiments One to Three. In Fig. 2, the first ZKFC monitors the health status of metadata node 1 in real time, and the second ZKFC monitors the health status of metadata node 2, which serves as the primary node. When the second ZKFC detects that the primary node has failed, it reports to ZooKeeper in real time; meanwhile, the first ZKFC also sends fault inquiry requests to ZooKeeper in real time to query whether the primary node has failed. Only after the first ZKFC queries and confirms that the primary node has failed does it contend for the lock with ZooKeeper, i.e., compete for the primary role through ZooKeeper. After the first ZKFC wins the primary role, the switchover procedure starts, where the switchover procedure includes steps 1 to 3 in Fig. 2. After steps 1 to 3 are completed, the switchover is complete: metadata node 1 corresponding to the first ZKFC has been switched to the primary node, i.e., the VIP now points to metadata node 1 and no longer points to metadata node 2.
For the embodiment of the present application, isolating the failed primary node from the target primary node ensures the uniqueness of the primary node and effectively prevents split-brain. Unbinding the VIP from the failed primary node and establishing a binding relationship between the VIP and the metadata node corresponding to the first failover controller keeps the VIP always pointing to the primary node, ensuring that even legacy (lower-version) clients can correctly obtain the current primary node and that their access logic remains correct after the switchover, achieving transparent client access to the primary node. The first failover controller switches its corresponding metadata node from the inactive state to the active state, so that the metadata node corresponding to the first failover controller is switched to the target primary node and provides service to clients of all versions.
Embodiment Four
Fig. 3 is a schematic structural diagram of a primary node switching apparatus provided by an embodiment of the present application. As shown in Fig. 3, the apparatus 30 may include a sending module 31 and a switching module 32, wherein:
the sending module 31 is configured to, when the first failover controller detects that the current primary node has failed, send a contention request to the distributed application program coordination service center, the contention request being used to request the coordination service center to determine the metadata node corresponding to the first failover controller as the target primary node; and
the switching module 32 is configured to, when the confirmation message from the coordination service center is received, switch the metadata node corresponding to the first failover controller to the target primary node through a virtual IP address (VIP).
Specifically, as shown in Fig. 4, the sending module 31 includes a fault inquiry submodule 311 and a fault determination submodule 312, wherein:
the fault inquiry submodule 311 is configured to send a fault inquiry request to the coordination service center at a preset time interval, the fault inquiry request being used to request detection of whether the coordination service center holds fault information indicating that the current primary node has failed; and
the fault determination submodule 312 is configured to determine that the current primary node has failed when the confirmation message returned by the coordination service center is received.
Further, the fault information is the fault information sent by the second failover controller when it detects that its corresponding metadata node, which currently serves as the primary node, has failed; wherein the first failover controller and the second failover controller are managed uniformly by the coordination service center.
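The division of labour between the two failover controllers can be sketched as follows: the second controller reports fault information to the coordination service center, and the first controller's inquiries then find it. `CoordinationCenter` is a hypothetical in-memory stand-in for ZooKeeper, assumed for illustration only.

```python
class CoordinationCenter:
    """Minimal stand-in for the coordination service center: the second
    failover controller reports a fault for the primary it monitors, and
    the first controller's fault inquiries subsequently find that record."""
    def __init__(self):
        self.fault_info = None
    def report_fault(self, node, reporter):
        # Sent by the second failover controller when the metadata node
        # currently serving as the primary node fails.
        self.fault_info = {"node": node, "reported_by": reporter}
    def has_fault_info(self):
        # Answered to the first failover controller's fault inquiry request.
        return self.fault_info is not None

center = CoordinationCenter()
assert not center.has_fault_info()
center.report_fault("metadata-node-2", "second-zkfc")
assert center.has_fault_info()
assert center.fault_info["node"] == "metadata-node-2"
```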
Further, the switching module 32 is specifically configured to unbind the VIP from the failed primary node and establish a binding relationship between the VIP and the metadata node corresponding to the first failover controller, so as to switch the metadata node corresponding to the first failover controller to the target primary node.
Further, the apparatus also includes an isolation module 33, as shown in Fig. 4, wherein the isolation module 33 is configured to isolate the failed primary node from the target primary node.
Further, the apparatus also includes a processing module 34, as shown in Fig. 4, wherein the processing module 34 is configured to switch the metadata node corresponding to the first failover controller from the inactive state to the active state, thereby switching the metadata node corresponding to the first failover controller to the target primary node.
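Taken together, the modules above can be sketched as plain classes. This is an illustrative Python model of the apparatus of Figs. 3 and 4, assuming a `FakeCenter` stand-in for the coordination service center; none of these names come from the patent or from Hadoop.

```python
class SendingModule:
    """Module 31: sends the contention request when the first failover
    controller detects that the current primary node has failed."""
    def __init__(self, center):
        self.center = center
    def send_contention_request(self, metadata_node):
        # Request the coordination service center to determine this
        # controller's metadata node as the target primary node.
        return self.center.contend(metadata_node)

class SwitchingModule:
    """Module 32: switches the metadata node to target primary via the VIP."""
    def switch(self, vip, metadata_node):
        vip["bound_to"] = metadata_node
        return vip

class FakeCenter:
    """Hypothetical coordination service center that always confirms."""
    def contend(self, node):
        return {"confirmed": True, "target_primary": node}

center = FakeCenter()
sending, switching = SendingModule(center), SwitchingModule()
reply = sending.send_contention_request("metadata-node-1")
vip = {"bound_to": "metadata-node-2"}
if reply["confirmed"]:                     # confirmation message received
    vip = switching.switch(vip, reply["target_primary"])
assert vip["bound_to"] == "metadata-node-1"
```

The isolation module 33 and processing module 34 would slot in between the confirmation and the VIP switch, in the order given in Embodiment Three.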
Compared with the prior art, with the apparatus provided by the embodiments of the present application, when the first failover controller detects that the current primary node has failed, a contention request is sent to the coordination service center, the contention request being used to request the coordination service center to determine the metadata node corresponding to the first failover controller as the target primary node, laying the foundation for subsequently switching the metadata node corresponding to the first failover controller to the target primary node through the VIP. When the confirmation message from the coordination service center is received, the metadata node corresponding to the first failover controller is switched to the target primary node through the VIP. Thus, neither latest-version nor legacy-version clients need to judge whether the current metadata node is the primary node; a client only needs to be configured with a fixed VIP that always points to the primary node in order to access data through the primary node. Even if an active-standby switchover of the metadata node occurs, existing legacy-version clients can still access the new primary node normally, without batch-upgrading legacy clients, achieving compatibility of legacy-client access to the primary node.
Embodiment Five
An embodiment of the present application provides an electronic device. As shown in Fig. 5, the electronic device 500 includes a processor 501 and a memory 503, where the processor 501 is connected to the memory 503, for example via a bus 502. Further, the electronic device 500 may also include a transceiver 504. It should be noted that in practical applications the number of transceivers 504 is not limited to one, and the structure of the electronic device 500 does not constitute a limitation on the embodiments of the present application.
The processor 501 is applied in the embodiments of the present application to implement the functions of the sending module and the switching module shown in Fig. 3 or Fig. 4, and the functions of the isolation module and the processing module shown in Fig. 4.
The processor 501 may be a CPU, a general-purpose processor, a DSP, an ASIC, an FPGA, or other programmable logic devices, transistor logic devices, hardware components, or any combination thereof. It may implement or execute the various illustrative logical blocks, modules, and circuits described in connection with this disclosure. The processor 501 may also be a combination implementing computing functions, for example a combination of one or more microprocessors, or a combination of a DSP and a microprocessor.
The bus 502 may include a path for transferring information between the above components. The bus 502 may be a PCI bus, an EISA bus, or the like, and may be divided into an address bus, a data bus, a control bus, etc. For ease of illustration, only one thick line is shown in Fig. 5, but this does not mean that there is only one bus or only one type of bus.
The memory 503 may be a ROM or another type of static storage device capable of storing static information and instructions, a RAM or another type of dynamic storage device capable of storing information and instructions, an EEPROM, a CD-ROM or other optical disc storage (including compact discs, laser discs, digital versatile discs, Blu-ray discs, etc.), magnetic disk storage media or other magnetic storage devices, or any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer, but is not limited thereto.
The memory 503 is used to store the application program code for executing the solution of the present application, and execution is controlled by the processor 501. The processor 501 is configured to execute the application program code stored in the memory 503 to implement the actions of the primary node switching apparatus provided by the embodiments shown in Fig. 3 or Fig. 4.
An electronic device provided by an embodiment of the present application includes a memory, a processor, and a computer program stored in the memory and runnable on the processor. Compared with the prior art, when the processor executes the program, the following can be achieved: when the first failover controller detects that the current primary node has failed, a contention request is sent to the distributed application program coordination service center, the contention request being used to request the coordination service center to determine the metadata node corresponding to the first failover controller as the target primary node, laying the foundation for subsequently switching the metadata node corresponding to the first failover controller to the target primary node through the VIP. When the confirmation message from the coordination service center is received, the metadata node corresponding to the first failover controller is switched to the target primary node through the VIP. Thus, neither latest-version nor legacy-version clients need to judge whether the current metadata node is the primary node; a client only needs to be configured with a fixed VIP that always points to the primary node in order to access data through the primary node, so that even if an active-standby switchover of the metadata node occurs, existing legacy-version clients can still access the new primary node normally, without batch-upgrading legacy clients, achieving compatibility of legacy-client access to the primary node.
An embodiment of the present application provides a computer-readable storage medium on which a computer program is stored; when the program is executed by a processor, the method shown in Embodiment One is implemented. Compared with the prior art, when the first failover controller detects that the current primary node has failed, a contention request is sent to the coordination service center, the contention request being used to request the coordination service center to determine the metadata node corresponding to the first failover controller as the target primary node, laying the foundation for subsequently switching the metadata node corresponding to the first failover controller to the target primary node through the VIP. When the confirmation message from the coordination service center is received, the metadata node corresponding to the first failover controller is switched to the target primary node through the VIP. Thus, neither latest-version nor legacy-version clients need to judge whether the current metadata node is the primary node; a client only needs to be configured with a fixed VIP that always points to the primary node in order to access data through the primary node, so that even if an active-standby switchover of the metadata node occurs, existing legacy-version clients can still access the new primary node normally, without batch-upgrading legacy clients, achieving compatibility of legacy-client access to the primary node.
The computer-readable storage medium provided by the embodiments of the present application is applicable to any of the above method embodiments and is not described again here.
It should be understood that, although the steps in the flowcharts of the drawings are shown sequentially as indicated by the arrows, these steps are not necessarily executed in the order indicated by the arrows. Unless explicitly stated herein, the execution of these steps is not strictly ordered, and they may be executed in other orders. Moreover, at least some of the steps in the flowcharts may include multiple sub-steps or stages, which are not necessarily completed at the same moment but may be executed at different times, and their execution order is not necessarily sequential; they may be executed in turn or alternately with at least part of the sub-steps or stages of other steps.
The above are only some embodiments of the present application. It should be noted that those of ordinary skill in the art may make several improvements and modifications without departing from the principles of the present application, and such improvements and modifications should also be regarded as falling within the protection scope of the present application.
Claims (10)
1. A primary node switching method, comprising:
when a first failover controller detects that a current primary node has failed, sending a contention request to a coordination service center, the contention request being used to request the coordination service center to determine a metadata node corresponding to the first failover controller as a target primary node; and
when a confirmation message from the coordination service center is received, switching the metadata node corresponding to the first failover controller to the target primary node through a virtual IP address (VIP).
2. The method according to claim 1, wherein monitoring whether the current primary node has failed comprises:
sending a fault inquiry request to the coordination service center at a preset time interval, the fault inquiry request being used to request detection of whether the coordination service center holds fault information indicating that the current primary node has failed; and
if the confirmation message returned by the coordination service center is received, determining that the current primary node has failed.
3. The method according to claim 2, wherein the fault information is fault information sent by a second failover controller when it detects that its corresponding metadata node, which currently serves as the primary node, has failed; wherein the first failover controller and the second failover controller are managed uniformly by the coordination service center.
4. The method according to claim 1, wherein switching the metadata node corresponding to the first failover controller to the target primary node through the VIP comprises:
unbinding the VIP from the failed primary node, and establishing a binding relationship between the VIP and the metadata node corresponding to the first failover controller, so as to switch the metadata node corresponding to the first failover controller to the target primary node.
5. The method according to claim 4, wherein before establishing the binding relationship between the VIP and the metadata node corresponding to the first failover controller, the method further comprises:
isolating the failed primary node from the target primary node.
6. The method according to claim 4, wherein after establishing the binding relationship between the VIP and the metadata node corresponding to the first failover controller, the method further comprises:
switching the metadata node corresponding to the first failover controller from an inactive state to an active state, so as to switch the metadata node corresponding to the first failover controller to the target primary node.
7. A primary node switching apparatus, comprising:
a sending module, configured to, when a first failover controller detects that a current primary node has failed, send a contention request to a distributed application program coordination service center, the contention request being used to request the coordination service center to determine a metadata node corresponding to the first failover controller as a target primary node; and
a switching module, configured to, when a confirmation message from the coordination service center is received, switch the metadata node corresponding to the first failover controller to the target primary node through a virtual IP address (VIP).
8. The apparatus according to claim 7, wherein the sending module includes a fault inquiry submodule and a fault determination submodule;
the fault inquiry submodule is configured to send a fault inquiry request to the coordination service center at a preset time interval, the fault inquiry request being used to request detection of whether the coordination service center holds fault information indicating that the current primary node has failed; and
the fault determination submodule is configured to determine that the current primary node has failed when the confirmation message returned by the coordination service center is received.
9. An electronic device, comprising a memory, a processor, and a computer program stored in the memory and runnable on the processor, wherein the processor, when executing the program, implements the primary node switching method according to any one of claims 1-6.
10. A computer-readable storage medium on which a computer program is stored, wherein the program, when executed by a processor, implements the primary node switching method according to any one of claims 1-6.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810925076.3A CN109101196A (en) | 2018-08-14 | 2018-08-14 | Host node switching method, device, electronic equipment and computer storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
CN109101196A true CN109101196A (en) | 2018-12-28 |
Family
ID=64849677
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810925076.3A Pending CN109101196A (en) | 2018-08-14 | 2018-08-14 | Host node switching method, device, electronic equipment and computer storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109101196A (en) |
Cited By (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110417600A (en) * | 2019-08-02 | 2019-11-05 | 秒针信息技术有限公司 | Node switching method, device and the computer storage medium of distributed system |
CN110688148A (en) * | 2019-10-08 | 2020-01-14 | 中国建设银行股份有限公司 | Method, device, equipment and storage medium for equipment management |
CN111404647A (en) * | 2019-01-02 | 2020-07-10 | 中兴通讯股份有限公司 | Control method of node cooperative relationship and related equipment |
CN111444062A (en) * | 2020-04-01 | 2020-07-24 | 山东汇贸电子口岸有限公司 | Method and device for managing master node and slave node of cloud database |
CN112087336A (en) * | 2020-09-11 | 2020-12-15 | 杭州海康威视系统技术有限公司 | Deployment and management method and device of virtual IP service system and electronic equipment |
CN113852506A (en) * | 2021-09-27 | 2021-12-28 | 深信服科技股份有限公司 | Fault processing method and device, electronic equipment and storage medium |
CN113949691A (en) * | 2021-10-15 | 2022-01-18 | 湖南麒麟信安科技股份有限公司 | ETCD-based virtual network address high-availability implementation method and system |
CN114338370A (en) * | 2022-01-10 | 2022-04-12 | 北京金山云网络技术有限公司 | Highly available method, system, apparatus, electronic device and storage medium for Ambari |
CN115396296A (en) * | 2022-08-18 | 2022-11-25 | 中电金信软件有限公司 | Service processing method and device, electronic equipment and computer readable storage medium |
CN116781494A (en) * | 2023-08-17 | 2023-09-19 | 天津南大通用数据技术股份有限公司 | Main-standby switching judgment method based on existing network equipment |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101729290A (en) * | 2009-11-04 | 2010-06-09 | 中兴通讯股份有限公司 | Method and device for realizing business system protection |
CN103973424A (en) * | 2014-05-22 | 2014-08-06 | 乐得科技有限公司 | Method and device for removing faults in cache system |
CN205901808U (en) * | 2016-08-05 | 2017-01-18 | 国家电网公司 | Accomplish distributed storage system of first data nodes automatic switch -over |
CN106911728A (en) * | 2015-12-22 | 2017-06-30 | 华为技术服务有限公司 | The choosing method and device of host node in distributed system |
- 2018-08-14: CN application CN201810925076.3A filed; published as CN109101196A; status: Pending
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101729290A (en) * | 2009-11-04 | 2010-06-09 | 中兴通讯股份有限公司 | Method and device for realizing business system protection |
CN103973424A (en) * | 2014-05-22 | 2014-08-06 | 乐得科技有限公司 | Method and device for removing faults in cache system |
CN106911728A (en) * | 2015-12-22 | 2017-06-30 | 华为技术服务有限公司 | The choosing method and device of host node in distributed system |
CN205901808U (en) * | 2016-08-05 | 2017-01-18 | 国家电网公司 | Accomplish distributed storage system of first data nodes automatic switch -over |
Non-Patent Citations (1)
Title |
---|
Deng Peng, "Research on High Availability of Master-Slave Cloud Computing Platforms", China Master's Theses Full-text Database * |
Cited By (15)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111404647A (en) * | 2019-01-02 | 2020-07-10 | 中兴通讯股份有限公司 | Control method of node cooperative relationship and related equipment |
CN111404647B (en) * | 2019-01-02 | 2023-11-28 | 中兴通讯股份有限公司 | Control method of node cooperative relationship and related equipment |
CN110417600A (en) * | 2019-08-02 | 2019-11-05 | 秒针信息技术有限公司 | Node switching method, device and the computer storage medium of distributed system |
CN110688148A (en) * | 2019-10-08 | 2020-01-14 | 中国建设银行股份有限公司 | Method, device, equipment and storage medium for equipment management |
CN111444062B (en) * | 2020-04-01 | 2023-09-19 | 山东汇贸电子口岸有限公司 | Method and device for managing master node and slave node of cloud database |
CN111444062A (en) * | 2020-04-01 | 2020-07-24 | 山东汇贸电子口岸有限公司 | Method and device for managing master node and slave node of cloud database |
CN112087336A (en) * | 2020-09-11 | 2020-12-15 | 杭州海康威视系统技术有限公司 | Deployment and management method and device of virtual IP service system and electronic equipment |
CN112087336B (en) * | 2020-09-11 | 2022-09-02 | 杭州海康威视系统技术有限公司 | Deployment and management method and device of virtual IP service system and electronic equipment |
CN113852506A (en) * | 2021-09-27 | 2021-12-28 | 深信服科技股份有限公司 | Fault processing method and device, electronic equipment and storage medium |
CN113852506B (en) * | 2021-09-27 | 2024-04-09 | 深信服科技股份有限公司 | Fault processing method and device, electronic equipment and storage medium |
CN113949691A (en) * | 2021-10-15 | 2022-01-18 | 湖南麒麟信安科技股份有限公司 | ETCD-based virtual network address high-availability implementation method and system |
CN114338370A (en) * | 2022-01-10 | 2022-04-12 | 北京金山云网络技术有限公司 | Highly available method, system, apparatus, electronic device and storage medium for Ambari |
CN115396296A (en) * | 2022-08-18 | 2022-11-25 | 中电金信软件有限公司 | Service processing method and device, electronic equipment and computer readable storage medium |
CN116781494A (en) * | 2023-08-17 | 2023-09-19 | 天津南大通用数据技术股份有限公司 | Main-standby switching judgment method based on existing network equipment |
CN116781494B (en) * | 2023-08-17 | 2024-03-26 | 天津南大通用数据技术股份有限公司 | Main-standby switching judgment method based on existing network equipment |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109101196A (en) | Host node switching method, device, electronic equipment and computer storage medium | |
US10389824B2 (en) | Service management modes of operation in distributed node service management | |
JP6026705B2 (en) | Update management system and update management method | |
US9749415B2 (en) | Service management roles of processor nodes in distributed node service management | |
EP1643681B1 (en) | Scheduled determination of networks resource availability | |
CN109344014B (en) | Main/standby switching method and device and communication equipment | |
CN110855792B (en) | Message pushing method, device, equipment and medium | |
CN103888277B (en) | A kind of gateway disaster-tolerant backup method, device and system | |
JP2004280738A (en) | Proxy response device | |
CN105141400A (en) | High-availability cluster management method and related equipment | |
US20070270984A1 (en) | Method and Device for Redundancy Control of Electrical Devices | |
CN106230622A (en) | A kind of cluster implementation method and device | |
JPH10312365A (en) | Load decentralization system | |
CN110119314A (en) | A kind of server calls method, apparatus, server and storage medium | |
CN110224872B (en) | Communication method, device and storage medium | |
CA2745824C (en) | Registering an internet protocol phone in a dual-link architecture | |
JP5613119B2 (en) | Master / slave system, control device, master / slave switching method, and master / slave switching program | |
CN110661836B (en) | Message routing method, device and system, and storage medium | |
CN113824595B (en) | Link switching control method and device and gateway equipment | |
CN115484208A (en) | Distributed drainage system and method based on cloud security resource pool | |
EP3435615B1 (en) | Network service implementation method, service controller, and communication system | |
CN112394662A (en) | Transformer substation monitoring system server role determination method and system | |
CN109697126A (en) | A kind of data processing method and device for server | |
CN110890989A (en) | Channel connection method and device | |
JP2004295656A (en) | Communication system, client device, load distribution method of server device by client device |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | ||
RJ01 | Rejection of invention patent application after publication |
Application publication date: 20181228 |