CN108768758A - Online upgrade method, apparatus, device and storage medium for a distributed storage system - Google Patents

Online upgrade method, apparatus, device and storage medium for a distributed storage system

Publication number
CN108768758A
Authority
CN
China
Prior art keywords
upgrading
node
failure
storage system
distributed storage
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201811013865.6A
Other languages
Chinese (zh)
Inventor
刘杰
安祥文
尚付飞
赵赞
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhengzhou Yunhai Information Technology Co Ltd
Original Assignee
Zhengzhou Yunhai Information Technology Co Ltd
Application filed by Zhengzhou Yunhai Information Technology Co Ltd
Priority to CN201811013865.6A
Publication of CN108768758A
Legal status: Pending

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/10 Protocols in which an application is distributed across nodes in the network
    • H04L67/1097 Protocols in which an application is distributed across nodes in the network for distributed storage of data in networks, e.g. transport arrangements for network file system [NFS], storage area networks [SAN] or network attached storage [NAS]
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/08 Configuration management of networks or network elements
    • H04L41/0803 Configuration setting
    • H04L41/0813 Configuration setting characterised by the conditions triggering a change of settings
    • H04L41/082 Configuration setting characterised by the conditions triggering a change of settings, the condition being updates or upgrades of network functionality
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00 Arrangements for monitoring or testing data switching networks
    • H04L43/08 Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters
    • H04L43/0876 Network utilisation, e.g. volume of load or congestion level
    • H04L43/0882 Utilisation of link capacity
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00 Arrangements for monitoring or testing data switching networks
    • H04L43/16 Threshold monitoring

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Environmental & Geological Engineering (AREA)
  • Hardware Redundancy (AREA)

Abstract

The invention discloses an online upgrade method for a distributed storage system, comprising: collecting node information for the obtained IPs of the nodes to be upgraded, and backing up and storing the collected node information; upgrading the version of the nodes to be upgraded one by one, and querying the cluster state after each node is upgraded; and, when an upgrade fails, restoring the node information of the failed node from the node information backed up in advance, recording the upgrade failure information, and terminating the current upgrade. Because the method upgrades the nodes one at a time, the storage service remains continuously available during the upgrade, and the various exceptions that arise during the upgrade are handled automatically, so the distributed storage system can be upgraded smoothly in the application scenario and user experience is improved. The invention also discloses an online upgrade apparatus and device for a distributed storage system and a readable storage medium, which have the same beneficial effects.

Description

Online upgrade method, apparatus, device and storage medium for a distributed storage system
Technical field
The present invention relates to the field of data storage, and in particular to an online upgrade method, apparatus and device for a distributed storage system, and a readable storage medium.
Background
With the continuous development of information technology, data is increasingly valued as a precious resource, and how to process data resources quickly and obtain the expected results has become one of the key problems in converting resources into assets. People's activities in work and daily life all generate data; collecting these data and then analysing and processing them yields a great deal of useful information and realises the conversion from resources to assets, which has catalysed the rapid development of big data and high-performance computing. Data storage, as one of the core elements of data resources, has likewise entered a period of rapid development.
Distributed network storage systems adopt a scalable system structure, which not only improves the reliability, availability and access efficiency of the system but also makes it easy to expand, so they are being adopted by more and more enterprises. A distributed storage system generally consists of 1 to N nodes and provides high-performance storage for massive data.
As technology continues to innovate and advance, and as bugs are fixed, a deployed distributed storage system needs to be upgraded from time to time. There are two ways to upgrade a system. The first is offline upgrade: all cluster services are stopped and the distributed storage system is then upgraded; client services are interrupted during the upgrade, which has a considerable impact on normal business operation. The second is online upgrade: client services are not interrupted during the upgrade, so it is more widely used.
Whether the upgrade is online or offline, exceptions and upgrade failures often occur during the upgrade process. Current upgrade methods have no unified and effective way of handling them; usually technicians discover the upgrade exception and then investigate and reconfigure manually.
In a distributed storage system, the focus of online upgrade is how to keep the storage service continuously available during the upgrade and how to handle exceptions that occur during the upgrade.
Therefore, how to handle abnormal conditions during the upgrade of a distributed storage system completely and effectively, while keeping the storage service continuously available, is a technical problem that those skilled in the art need to solve.
Summary of the invention
The object of the present invention is to provide an online upgrade method for a distributed storage system. The method upgrades the nodes one by one, which keeps the storage service continuously available during the upgrade, and handles the various exceptions that arise during the upgrade automatically, so the distributed storage system can be upgraded smoothly in the application scenario and user experience is improved. Another object of the present invention is to provide an online upgrade apparatus and device for a distributed storage system and a readable storage medium.
To solve the above technical problem, the present invention provides an online upgrade method for a distributed storage system, comprising:
collecting node information for the obtained IPs of the nodes to be upgraded, and backing up and storing the collected node information;
upgrading the version of the nodes to be upgraded one by one, and querying the cluster state after each node is upgraded;
when an upgrade fails, restoring the node information of the failed node from the node information backed up in advance, recording the upgrade failure information, and terminating the current upgrade; wherein the upgrade failure includes installation failure and an abnormal cluster state, and the upgrade failure information includes the failed node, the error reason and the upgrade progress.
Preferably, the IPs of the nodes to be upgraded are obtained by:
obtaining the cluster information in the system;
parsing node IPs from the cluster information to obtain node IPs;
deduplicating the node IPs to obtain the IPs of the nodes to be upgraded.
Preferably, before upgrading the nodes to be upgraded one by one, the method further includes:
detecting the current capacity utilization of the cluster;
if the current capacity utilization is below a threshold, upgrading the nodes to be upgraded one by one;
if the current capacity utilization is not below the threshold, terminating the current upgrade.
Preferably, the online upgrade method for a distributed storage system further includes:
while upgrading the nodes to be upgraded one by one, writing the real-time upgrade progress of the node to a progress file;
obtaining the current upgrade progress information from the progress file, and outputting the obtained current upgrade progress information to the terminal.
Preferably, outputting the obtained current upgrade progress information to the terminal includes:
converting the obtained current upgrade progress information into a progress bar, and outputting the resulting upgrade progress bar to the terminal.
Preferably, the online upgrade method for a distributed storage system further includes:
when an upgrade fails, outputting prompt information corresponding to the upgrade failure.
Preferably, the online upgrade method for a distributed storage system further includes:
when an upgrade fails, obtaining and recording the IPs of the nodes already upgraded in the current upgrade process, to obtain a blacklist;
before the next upgrade, traversing the nodes recorded in the blacklist and removing the blacklisted nodes from the nodes to be upgraded, to obtain a reduced set of nodes to upgrade;
upgrading the nodes to be upgraded one by one is then specifically: upgrading the reduced set of nodes one by one.
The present invention discloses an online upgrade apparatus for a distributed storage system, including:
an information acquisition unit, configured to collect node information for the obtained IPs of the nodes to be upgraded, and to back up and store the collected node information;
a version upgrade unit, configured to upgrade the nodes to be upgraded one by one, and to query the cluster state after each node is upgraded;
a failure handling unit, configured to, when an upgrade fails, restore the node information of the failed node from the node information backed up in advance, record the upgrade failure information, and terminate the current upgrade; wherein the upgrade failure includes installation failure and an abnormal cluster state, and the upgrade failure information includes the failed node, the error reason and the upgrade progress.
The present invention discloses an online upgrade device for a distributed storage system, including:
a memory, configured to store a program;
a processor, configured to implement the steps of the online upgrade method for a distributed storage system when executing the program.
The present invention discloses a readable storage medium on which a program is stored; when the program is executed by a processor, the steps of the online upgrade method for a distributed storage system are implemented.
In the online upgrade method for a distributed storage system provided by the present invention, node information is collected for the obtained IPs of the nodes to be upgraded and the collected node information is backed up and stored, so that the cluster state can be restored quickly if the upgrade fails and normal functional response is guaranteed. The nodes to be upgraded are upgraded one by one, so only a single node in the cluster has its business function suspended at any time. Because of the redundant data distribution of a distributed storage system, the business function of a suspended node can be served by other nodes, so the response of the whole cluster's business functions is essentially unaffected and the storage service remains continuously available during the upgrade. The cluster state is queried after each node is upgraded, to catch the case where the upgrade package installs successfully but the node cannot run normally. When an upgrade fails, the node information of the failed node is restored from the node information backed up in advance so that the node's function is recovered; this keeps the cluster running normally after a failed upgrade, avoids the cluster malfunction that a failed upgrade would otherwise cause, and handles abnormal conditions during the upgrade effectively. The upgrade failure information is also recorded; technicians can inspect the recorded failure information and repair the cluster or the upgrade process according to its content.
The present invention also provides an online upgrade apparatus and device for a distributed storage system and a readable storage medium, which have the above beneficial effects; details are not repeated here.
Brief description of the drawings
To explain the technical solutions in the embodiments of the present invention or in the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below are merely embodiments of the present invention; those of ordinary skill in the art can obtain other drawings from the provided drawings without creative effort.
Fig. 1 is a flowchart of the online upgrade method for a distributed storage system provided by an embodiment of the present invention;
Fig. 2 is a structural block diagram of the online upgrade apparatus for a distributed storage system provided by an embodiment of the present invention;
Fig. 3 is a schematic structural diagram of the online upgrade device for a distributed storage system provided by an embodiment of the present invention.
Detailed description
The core of the present invention is to provide an online upgrade method for a distributed storage system. The method upgrades the nodes one by one, which keeps the storage service continuously available during the upgrade, and handles the various exceptions that arise during the upgrade automatically, so the distributed storage system can be upgraded smoothly in the application scenario and user experience is improved. Another core of the present invention is to provide an online upgrade apparatus and device for a distributed storage system and a readable storage medium.
To make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the drawings. Obviously, the described embodiments are some, not all, of the embodiments of the present invention. All other embodiments obtained by those of ordinary skill in the art from the embodiments of the present invention without creative effort fall within the protection scope of the present invention.
Referring to Fig. 1, Fig. 1 is a flowchart of the online upgrade method for a distributed storage system provided in this embodiment. The method may include:
Step s110: collect node information for the obtained IPs of the nodes to be upgraded, and back up and store the collected node information.
The information acquisition script is run against the IPs of the nodes to be upgraded to collect node information and back up the cluster environment information, which mainly includes the driver version information, service configuration files, hardware information and so on in the cluster. If the upgrade fails, the cluster environment can be restored from the collected information, ensuring that the cluster keeps working normally.
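The backup in step s110 can be sketched as follows. This is a minimal illustration, not the patent's implementation: `collect_node_info` is a hypothetical stand-in for the information acquisition script, the returned fields are placeholders, and storing one JSON file per node is an assumption.

```python
import json
import tempfile
from pathlib import Path


def collect_node_info(node_ip):
    """Hypothetical stand-in for the information acquisition script."""
    # Placeholder fields; a real script would query the node itself.
    return {"ip": node_ip, "driver_version": None,
            "service_config": {}, "hardware": {}}


def backup_node_info(node_ips, backup_dir):
    """Collect info for every node and store one JSON backup file per node."""
    backup_dir = Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)
    paths = {}
    for ip in node_ips:
        path = backup_dir / f"{ip}.json"
        path.write_text(json.dumps(collect_node_info(ip), indent=2),
                        encoding="utf-8")
        paths[ip] = path
    return paths


def load_backup(path):
    """Read a backup back, e.g. when a failed node has to be restored."""
    return json.loads(Path(path).read_text(encoding="utf-8"))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        saved = backup_node_info(["10.0.0.1", "10.0.0.2"], tmp)
        print(load_backup(saved["10.0.0.1"])["ip"])
```

Keeping the backup on disk rather than in memory means it survives a crash of the upgrade script itself, which is when it is most likely to be needed.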
The way the IPs of the nodes to be upgraded are obtained is not limited. One may follow existing acquisition methods: obtain the system configuration information, screen the node information in it, and remove unrelated nodes to obtain the nodes to be upgraded. However, general configuration information usually contains a large amount of irrelevant information, which greatly increases the screening workload during node information collection and also increases the probability of error. Preferably, the IPs of the nodes to be upgraded are obtained as follows:
Step 1: obtain the cluster information in the system;
Step 2: parse node IPs from the cluster information to obtain node IPs;
Step 3: deduplicate the node IPs to obtain the IPs of the nodes to be upgraded.
Cluster information contains little redundant information, so node IPs can be filtered out relatively easily. The obtained node IPs may contain duplicates; after deduplication, the IP information of the nodes to be upgraded is obtained.
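Steps 1 to 3 can be sketched as below, assuming the cluster information is available as plain text containing IPv4 addresses. The input format and the function name `parse_node_ips` are assumptions made for illustration; the source does not specify them.

```python
import ipaddress
import re

# Loose candidate pattern; real validation is delegated to ipaddress.
IPV4_CANDIDATE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def parse_node_ips(cluster_info):
    """Extract valid IPv4 addresses, drop duplicates, sort numerically."""
    unique = set()
    for candidate in IPV4_CANDIDATE.findall(cluster_info):
        try:
            ipaddress.IPv4Address(candidate)
        except ValueError:
            continue  # e.g. 999.1.1.1 matches the pattern but is not an IP
        unique.add(candidate)
    # Sort by numeric value so that 10.0.0.9 comes before 10.0.0.10.
    return sorted(unique, key=ipaddress.IPv4Address)
```

For example, a text listing `10.0.0.2` twice and `10.0.0.1` once yields `["10.0.0.1", "10.0.0.2"]`.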
Step s120: upgrade the nodes to be upgraded one by one, and query the cluster state after each node is upgraded.
Upgrading a node to be upgraded mainly consists of loading the content of the upgrade package onto that node and, once the node has been upgraded, restarting it with reboot. The specific upgrade flow is not limited here; reference may be made to the prior art.
The upgrade process upgrades the nodes one by one: node B is upgraded only after node A has completed its upgrade, and so on. Upgrading node by node ensures that the distributed storage system keeps providing storage service throughout the upgrade without interrupting the client's front-end business. The way in which completion of a node's upgrade is determined is not limited. For example, the cluster state may be checked after the upgrade package is installed on each node; once the distributed storage cluster state has returned to HEALTH_OK, the current update is judged complete and the next node is upgraded. Only this determination method is described here as an example; other determination methods are not described.
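The one-by-one upgrade loop can be sketched as follows. The callables `install` and `wait_healthy` are hypothetical hooks supplied by the caller (installing the upgrade package on one node, and waiting for the cluster to report HEALTH_OK); neither name comes from the source.

```python
def rolling_upgrade(nodes, install, wait_healthy):
    """Upgrade nodes strictly one at a time and stop at the first failure.

    Returns (upgraded_nodes, failed_node, reason); failed_node and reason
    are None when every node was upgraded successfully.
    """
    upgraded = []
    for node in nodes:
        # Node B is only touched after node A has fully completed.
        if not install(node):
            return upgraded, node, "install failed"
        if not wait_healthy():
            return upgraded, node, "cluster state abnormal"
        upgraded.append(node)
    return upgraded, None, None
```

Returning the list of nodes already upgraded makes it straightforward for the caller to record them, for instance so that a later retry can skip them.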
The cluster state is queried after each node is upgraded. For example, after the upgrade package is installed on node A, the running state is queried to catch the case where the upgrade package installs successfully but the node cannot run normally, so that a failed node upgrade is discovered in time and hidden problems are avoided.
Step s130: when an upgrade fails, restore the node information of the failed node from the node information backed up in advance, record the upgrade failure information, and terminate the current upgrade.
Upgrade failure includes cases such as failed installation of the upgrade package and an abnormal cluster state after the upgrade package has been installed successfully. The way an abnormal cluster state after successful installation is identified is not limited. For example, the cluster state may be queried once the upgrade package has been installed on the node to determine whether it is HEALTH_OK; if the cluster has still not returned to HEALTH_OK more than twenty minutes after installation completed, the cluster state is judged abnormal.
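The twenty-minute check can be sketched as a polling loop with a deadline. `query_state` is a hypothetical callable returning the cluster state as a string; the 10-second poll interval is an assumption, and the clock and sleep functions are injectable only so the logic can be exercised without waiting.

```python
import time


def wait_for_health_ok(query_state, timeout_s=20 * 60, poll_s=10.0,
                       clock=time.monotonic, sleep=time.sleep):
    """Poll until the cluster reports HEALTH_OK or the timeout expires.

    Returns True if HEALTH_OK was seen in time, False if the cluster state
    should be judged abnormal.
    """
    deadline = clock() + timeout_s
    while True:
        if query_state() == "HEALTH_OK":
            return True
        if clock() >= deadline:
            return False
        sleep(poll_s)
```

A monotonic clock is used so that adjustments to the system time during the upgrade cannot shorten or lengthen the timeout.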
Whatever the error, the node can no longer fully perform its original function. The node information backed up in advance is then restored to the failed node, the cluster environment is restored, and the current upgrade is terminated. At this point the nodes in the cluster that were upgraded successfully run the new version, while the failed node and the nodes not yet upgraded still run the previous version, which ensures that every node runs normally and that the cluster responds to data requests normally.
The purpose of recording failure information is to make it easier to examine the cause of the failure and make repairs afterwards. The specific types of information included in the upgrade failure information are not limited here and can be configured according to what needs to be examined. For example, the upgrade failure information may include the failed node, the error reason and the upgrade progress. The failed node is recorded so that its details can be examined later to investigate the cause of the failure further; the error reason is a description and analysis of the state in which the error occurred, and examining it helps locate the problem so that it can be repaired. The upgrade failure information may further include the time of failure, information about the nodes already upgraded, and so on; details are not repeated here.
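A failure record with the fields named above could be sketched like this. The class name, field names, the free-text format of `progress`, and the one-JSON-object-per-line log format are all assumptions chosen for illustration.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class UpgradeFailure:
    failed_node: str       # IP of the node whose upgrade failed
    error_reason: str      # description of the error state
    progress: str          # e.g. "2/5 nodes upgraded"
    failed_at: str = ""    # optional time of failure


def format_failure(failure):
    """Serialise one failure record as a single JSON line."""
    return json.dumps(asdict(failure), sort_keys=True)


def record_failure(failure, log_path):
    """Append the record to a log file, one JSON object per line."""
    with open(log_path, "a", encoding="utf-8") as fh:
        fh.write(format_failure(failure) + "\n")
```

A structured record is easier to search than free-form text when a technician later needs to find which node failed and at what point in the upgrade.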
Based on the above, in the online upgrade method for a distributed storage system provided in this embodiment, node information is collected for the obtained IPs of the nodes to be upgraded and the collected node information is backed up and stored, so that the cluster state can be restored quickly if the upgrade fails and normal functional response is guaranteed. The nodes to be upgraded are upgraded one by one, so only a single node in the cluster has its business function suspended at any time. Because of the redundant data distribution of a distributed storage system, the business function of a suspended node can be served by other nodes, so the response of the whole cluster's business functions is essentially unaffected and the storage service remains continuously available during the upgrade. The cluster state is queried after each node is upgraded, to catch the case where the upgrade package installs successfully but the node cannot run normally. When an upgrade fails, the node information of the failed node is restored from the node information backed up in advance so that the node's function is recovered, which handles abnormal conditions during the upgrade completely and effectively. The upgrade failure information is also recorded; technicians can inspect the recorded failure information and repair the cluster or the upgrade process according to its content.
Based on the above embodiment, the version upgrade of a node takes up a certain amount of capacity. To avoid affecting the normal operation of the cluster, preferably, the following may be performed before upgrading the nodes one by one:
detect the current capacity utilization of the cluster;
if the current capacity utilization is below the threshold, upgrade the nodes to be upgraded one by one;
if the current capacity utilization is not below the threshold, terminate the current upgrade.
For example, with the threshold set to a cluster capacity utilization of 80%: when the current cluster capacity utilization is detected to be 70%, the online upgrade can proceed; if it is detected to be 90%, the current upgrade flow is terminated and the upgrade failure is recorded.
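The threshold check is a one-line comparison; the sketch below uses the 80% figure from the example above. The function name and the byte-count parameters are assumptions.

```python
def capacity_allows_upgrade(used_bytes, total_bytes, threshold=0.80):
    """Return True only if utilization is strictly below the threshold."""
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    return used_bytes / total_bytes < threshold
```

With this definition a cluster at 70% utilization may proceed, while one at 90%, or exactly at the 80% threshold, terminates the upgrade, matching the "not below the threshold" condition.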
In addition, to follow the upgrade progress of each node in real time, the real-time upgrade progress may preferably be output and displayed. Specifically, while the nodes to be upgraded are upgraded one by one, the real-time upgrade progress of the node is written to a progress file; the current upgrade progress information is obtained from the progress file and output to the terminal.
The output form of the upgrade progress information is not limited: for example, it may be a progress bar for a single node, a progress bar for the overall cluster upgrade, or an overall cluster task list; the output form can be configured as needed for viewing.
Preferably, the obtained current upgrade progress information may be converted into a progress bar, and the resulting upgrade progress bar output to the terminal. A progress bar shows the current upgrade completion simply and intuitively. The progress bar may show the progress of each node, or may advance in units of nodes: for example, when there are 10 nodes to upgrade, the overall progress is divided into ten parts according to the node upgrades, each part corresponding to the upgrade of one node. The specific arrangement of the progress bar is not limited here and can be set according to the output and viewing needs.
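A node-granular progress bar of the kind described (ten nodes, ten segments) can be rendered as text as sketched below. The bar characters and the output layout are assumptions; in practice the `done_nodes` value would be read from the progress file.

```python
def render_progress_bar(done_nodes, total_nodes, width=20):
    """Render overall upgrade progress as a fixed-width text bar."""
    if total_nodes <= 0:
        raise ValueError("total_nodes must be positive")
    done_nodes = max(0, min(done_nodes, total_nodes))  # clamp to valid range
    filled = width * done_nodes // total_nodes
    percent = 100 * done_nodes // total_nodes
    bar = "#" * filled + "-" * (width - filled)
    return f"[{bar}] {percent}% ({done_nodes}/{total_nodes})"
```

For three of ten nodes upgraded and `width=10`, this returns `[###-------] 30% (3/10)`. Printing the string with a leading carriage return and no newline would redraw the bar in place on most terminals.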
In addition, when an exception during the upgrade causes the upgrade to fail, so that the relevant technicians learn of the failure promptly and make the corresponding adjustments in time, keeping the cluster's upgrade on track as far as possible and preserving the cluster's service capability, prompt information corresponding to the upgrade failure may preferably be output when the upgrade fails. The prompt information may include the current failure information and may also include suggested operations; the categories of information it contains are not limited.
When the upgrade of some node fails, the failed node may not be the first node upgraded; that is, before the failure stops the current upgrade process, several nodes may already have completed their upgrade. Because this upgrade failed, the cluster must be upgraded again after the relevant adjustments have been made, and the nodes that were already upgraded successfully might then be upgraded a second time.
To avoid the time and space wasted by upgrading already upgraded nodes again, preferably, when an upgrade fails, the IPs of the nodes already upgraded in the current upgrade process are obtained and recorded to form a blacklist; before the next upgrade, the nodes recorded in the blacklist are traversed and removed from the nodes to be upgraded, giving a reduced set of nodes to upgrade; upgrading the nodes to be upgraded one by one then becomes upgrading the reduced set of nodes one by one.
The blacklist is mainly used when, while upgrading a cluster of M nodes (M >= 3), the upgrade of the N-th node (N >= 2) fails: after node N has been repaired and the online upgrade resumes, it prevents the already upgraded nodes from being upgraded again. That is, the blacklist is mainly used in the scenario of continuing an upgrade after an upgrade exception has been repaired: the node IPs in the blacklist are checked and updated, and the nodes corresponding to the IPs in the blacklist are skipped during the upgrade. The node information in the blacklist can also be updated manually.
Of course, the above process assumes that the two adjacent upgrade processes use the same upgrade version and that the earlier of the two failed. If the earlier upgrade failed and the next upgrade uses a different version, the version to which each node was updated can be recorded together with the node when it is added to the blacklist. Then, while traversing the nodes in the blacklist, the version recorded for each blacklisted node is compared with the version of the current upgrade: if they are the same, the blacklisted nodes are skipped in the current upgrade process; if they differ, the blacklist can be cleared and all nodes to be upgraded are upgraded directly.
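The version-aware blacklist can be sketched as a mapping from node IP to the version that node was already upgraded to. The dictionary representation and the function name are assumptions; the clear-on-mismatch behaviour follows the paragraph above.

```python
def nodes_to_upgrade(candidates, blacklist, target_version):
    """Return the reduced list of nodes to upgrade.

    `blacklist` maps node IP -> version the node was already upgraded to.
    If any recorded version differs from the target, the blacklist is
    stale: it is cleared and every candidate is upgraded.
    """
    if any(version != target_version for version in blacklist.values()):
        blacklist.clear()
        return list(candidates)
    return [ip for ip in candidates if ip not in blacklist]
```

For example, if nodes 1 and 2 were upgraded to version 2.0 before node 3 failed, a retry with version 2.0 upgrades only node 3 onward, whereas a retry with version 2.1 clears the blacklist and upgrades every node.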
To deepen understanding of the online upgrade method for a distributed storage system provided by the present invention, this embodiment describes the entire upgrade process; other online upgrade methods based on the present invention can refer to this description. The online upgrade method provided in this embodiment mainly includes: checking the upgrade conditions; obtaining the IP information of the nodes to upgrade; skipping the members in the blacklist; copying the upgrade package; backing up the cluster environment information before upgrading the first node; cutting off the business on the node being upgraded before it is upgraded; displaying the upgrade progress bar; synchronizing the upgrade log information to one node; waiting for the cluster state to return to HEALTH_OK after the upgraded node is restarted; and exiting the upgrade flow if the cluster cannot return to HEALTH_OK within the timeout.
Specifically, the upgrade process includes the following steps:
1. Check the upgrade conditions.
Confirm, against the preset upgrade conditions, whether the distributed storage system currently supports online upgrade:
Check the cluster state; online upgrade is allowed when the state is HEALTH_OK.
Check the current cluster version and the version to upgrade to; online upgrade is allowed when the target version is newer than the current version.
Check the cluster capacity utilization; online upgrade is allowed when the utilization is no more than 80%.
2, upgrade node ip acquisition of information.
The IP that all public networks are parsed from cluster information, is ranked up, after duplicate removal IP information, you can obtain The IP information of all nodes in cluster upgrades cluster version according to these IP.
3, the member in blacklist is skipped.
Judge currently be upgraded node whether in blacklist, if, skip the node continue to upgrade it is next Node;If it was not then continuing to upgrade the node.
4, upgrade first node backup cluster environment information before.
When judging that present node is first node of upgrading, operation information acquires script, and backup cluster environment information is main If driving version information, service profiles, hardware information etc. in cluster.
5. Cut off the services on a node before it is upgraded.
Before each node is upgraded, the services on that node are suspended. This ensures that, throughout the upgrade, services are suspended on only one node at a time because of the upgrade, so that the cluster's normal services keep running.
6. Start upgrading the cluster version according to the actual content of the upgrade package.
While the upgrade package is being installed, if the upgrade fails, the upgrade progress is set to -1 and the error cause is output. The main upgrade flow writes its output to a single log file, which makes it easier to locate problems and to check the detailed upgrade progress.
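The failure convention of step 6 (progress set to -1, error cause written to the unified log) could look roughly like the following. This is a sketch under assumptions: the installation is modeled as a list of `(name, callable)` steps, progress is stored as a bare integer in a text file, and the names `install_package` and `write_progress` are hypothetical.

```python
FAILED = -1  # step 6: the progress value written when installation fails


def write_progress(path, value):
    """Overwrite the progress file with a single integer."""
    with open(path, "w") as fh:
        fh.write(str(value))


def install_package(steps, progress_path, log):
    """Run install steps in order; on any error record -1 and log the cause.

    steps is a list of (name, callable) pairs; log is a logging.Logger-like
    object. Returns True when every step succeeded, False otherwise.
    """
    total = len(steps)
    for done, (name, action) in enumerate(steps, start=1):
        try:
            action()
        except Exception as exc:
            write_progress(progress_path, FAILED)
            log.error("install step %r failed: %s", name, exc)
            return False
        write_progress(progress_path, done * 100 // total)
        log.info("install step %r finished", name)
    return True
```

Passing the logger in, instead of creating it inside the function, makes it straightforward to point every part of the main flow at the same log file.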
7. Display the upgrade progress bar.
While the upgrade package content is being installed, the upgrade progress is written to a file. The upgrade script reads this file to obtain the upgrade progress of the node currently being upgraded and prints it to the terminal as a progress bar.
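Reading the progress file and turning the value into a terminal progress bar, as step 7 describes, can be sketched as below. The file is assumed to hold one integer percentage, with -1 marking a failed upgrade as in step 6; the bar layout and the function names are illustrative assumptions.

```python
def read_progress(path):
    """Read the integer percentage stored in the progress file."""
    with open(path) as fh:
        return int(fh.read().strip())


def render_bar(percent, width=40):
    """Format a percentage as a fixed-width text progress bar."""
    if percent < 0:  # -1 marks a failed upgrade
        return "[" + "!" * width + "] FAILED"
    percent = min(percent, 100)
    filled = width * percent // 100
    return "[" + "#" * filled + "-" * (width - filled) + "] %3d%%" % percent
```

A caller would typically poll `read_progress` in a loop and print `render_bar(...)` with a leading carriage return so that the bar is redrawn on one terminal line.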
8. Synchronize the upgrade log information to one node.
9. After the upgraded node restarts, wait for the cluster state to return to HEALTH_OK.
10. If the cluster cannot return to HEALTH_OK within the timeout, exit the upgrade flow.
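Steps 9 and 10 together describe a poll-with-timeout loop. The sketch below assumes the caller supplies a `get_health` callable that returns the cluster state string; the patent does not say how the state is queried, and the polling interval is an assumption. The clock and sleep functions are injectable so the loop can be exercised without real waiting.

```python
import time


def wait_for_health_ok(get_health, timeout_s, poll_s=5.0,
                       clock=time.monotonic, sleep=time.sleep):
    """Poll the cluster state until it is HEALTH_OK or the timeout expires.

    Returns True if HEALTH_OK was seen in time, False if the caller should
    exit the upgrade flow (step 10).
    """
    deadline = clock() + timeout_s
    while True:
        if get_health() == "HEALTH_OK":
            return True
        if clock() >= deadline:
            return False
        sleep(poll_s)
```

Using a monotonic clock rather than wall-clock time keeps the timeout correct even if the system time is adjusted while a node restarts.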
The online upgrade method of this embodiment designs both a normal processing flow and an exception handling flow for online upgrade, so that the whole online upgrade flow is transparent and controllable. In the normal flow, the distributed storage system keeps providing storage service during the upgrade and the customer's front-end services are not interrupted. When an exception occurs during the upgrade, its cause can be obtained from the log information; once the exception is resolved, the upgrade can be resumed by configuring the blacklist, which avoids the loss of time and space caused by upgrading nodes repeatedly. The distributed storage system can thus be upgraded smoothly in customer application scenarios, so that users do not perceive the back-end upgrade. The online upgrade is also automated, which reduces manual intervention and improves user experience.
Referring to FIG. 2, FIG. 2 is a structural block diagram of the distributed storage system online upgrade apparatus provided by an embodiment of the invention. The apparatus may mainly include an information acquisition unit 210, a version upgrade unit 220 and a failure handling unit 230. The apparatus of this embodiment and the distributed storage system online upgrade method described above may be read in cross-reference with each other.
The information acquisition unit 210 is mainly used to collect node information for the obtained IPs of the nodes to be upgraded, and to back up and store the collected node information.
The version upgrade unit 220 is mainly used to upgrade the version of the nodes to be upgraded one by one, and to query the cluster state after each node is upgraded.
The failure handling unit 230 is mainly used, when the upgrade fails, to restore the node information of the failed node from the node information backed up in advance, to record the upgrade failure information, and to terminate the current upgrade. Upgrade failure includes installation failure and abnormal cluster state; the upgrade failure information includes the failed node, the error cause and the upgrade progress.
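The behavior of the failure handling unit (restore from backup, record the failure information, terminate the current upgrade) can be sketched as follows. Everything beyond those three actions is an assumption of the example: the `UpgradeFailure` record, the JSON-lines record file, the `restore` callback and the `UpgradeAborted` exception are hypothetical names, not part of the patent.

```python
import json
from dataclasses import asdict, dataclass


class UpgradeAborted(Exception):
    """Raised to terminate the current upgrade after a failure is handled."""


@dataclass
class UpgradeFailure:
    node_ip: str   # the failed node
    reason: str    # the error cause, e.g. install failure or abnormal cluster state
    progress: int  # the upgrade progress at the time of failure


def handle_failure(failure, backups, restore, record_path):
    """Restore the failed node from its backup, record the failure, then abort.

    backups maps node IP to the node information saved before the upgrade;
    restore is a caller-supplied function (ip, saved_info) that applies it.
    """
    restore(failure.node_ip, backups[failure.node_ip])
    with open(record_path, "a") as fh:
        fh.write(json.dumps(asdict(failure)) + "\n")
    raise UpgradeAborted("upgrade failed on %s: %s" % (failure.node_ip, failure.reason))
```

Raising an exception, rather than returning a status code, is one way to make sure the surrounding node-by-node loop cannot continue past a failed node by accident.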
By upgrading the version of each node one by one, the distributed storage system online upgrade apparatus of this embodiment keeps the storage service continuously available during the upgrade, and it handles the various exceptions that may occur during the upgrade automatically. It can therefore upgrade the distributed storage system smoothly in application scenarios and improve user experience.
Preferably, an IP obtaining unit, which is connected to the information acquisition unit and is used to obtain the IPs of the nodes to be upgraded, mainly includes:
a cluster information obtaining subunit, used to obtain the cluster information in the system;
an IP parsing subunit, used to parse node IPs from the cluster information to obtain the node IPs;
a to-be-upgraded IP determining subunit, used to deduplicate the node IPs to obtain the IPs of the nodes to be upgraded.
Preferably, the distributed storage system online upgrade apparatus may further include a capacity detection unit. The capacity detection unit is connected to the version upgrade unit and is mainly used to detect the cluster's current capacity utilization. If the current capacity utilization is below the threshold, the nodes to be upgraded are upgraded one by one; if the current capacity utilization is not below the threshold, the current upgrade is terminated.
Preferably, the distributed storage system online upgrade apparatus may further include a progress output unit connected to the version upgrade unit. The progress output unit mainly includes: a progress information writing subunit, used to write the real-time node upgrade progress to a progress file while the nodes to be upgraded are upgraded one by one; and an output subunit, used to obtain the current upgrade progress information from the progress file and output it to the terminal.
Preferably, the output subunit may specifically be used to convert the obtained current upgrade progress information into progress bar form and output the resulting upgrade progress bar to the terminal.
Preferably, the distributed storage system online upgrade apparatus may further include a prompt unit, used to output, when the upgrade fails, the prompt information corresponding to the upgrade failure.
Preferably, the distributed storage system online upgrade apparatus may further include a blacklist processing unit. The blacklist processing unit is connected to both the version upgrade unit and the failure handling unit, and mainly includes:
a recording subunit, used, when the upgrade fails, to obtain and record the IPs of the nodes already upgraded in the current upgrade process, thereby obtaining the blacklist;
a traversal subunit, used, before the next upgrade, to traverse the nodes recorded in the blacklist and remove the blacklisted nodes from the nodes to be upgraded, thereby obtaining the reduced list of nodes to be upgraded.
The version upgrade unit connected to the blacklist processing unit is then specifically used to upgrade the version of the nodes in the reduced list one by one.
This embodiment provides a distributed storage system online upgrade device, including a memory and a processor.
The memory is used to store a program.
The processor is used to implement the steps of the distributed storage system online upgrade method described above when executing the program; for details, refer to the description of that method.
Referring to FIG. 3, which is a schematic structural diagram of the distributed storage system online upgrade device provided by this embodiment: the online upgrade device may vary considerably depending on configuration or performance, and may include one or more central processing units (CPU) 322, a memory 332, and one or more storage media 330 (for example, one or more mass storage devices) that store application programs 342 or data 344. The memory 332 and the storage medium 330 may be transient or persistent storage. The program stored in the storage medium 330 may include one or more modules (not marked in the figure), and each module may include a series of instruction operations on the data processing device. Further, the central processing unit 322 may be configured to communicate with the storage medium 330 and to execute, on the online upgrade device 301, the series of instruction operations in the storage medium 330.
The online upgrade device 301 may also include one or more power supplies 326, one or more wired or wireless network interfaces 350, one or more input/output interfaces 358, and/or one or more operating systems 341, such as Windows Server™, Mac OS X™, Unix™, Linux™, FreeBSD™ and so on.
The steps of the distributed storage system online upgrade method described above with reference to FIG. 1 can be implemented by the structure of this online upgrade device.
This embodiment discloses a readable storage medium on which a program is stored. When the program is executed by a processor, the steps of the distributed storage system online upgrade method described above are implemented; for details, refer to the description of that method.
The embodiments in this specification are described progressively. Each embodiment focuses on its differences from the other embodiments, and for identical or similar parts the embodiments may be referred to one another. Because the apparatus disclosed in the embodiments corresponds to the method disclosed in the embodiments, its description is relatively brief; for the relevant details, see the description of the method.
Those skilled in the art will further appreciate that the units and algorithm steps of the examples described in connection with the embodiments disclosed herein can be implemented in electronic hardware, computer software, or a combination of the two. To illustrate the interchangeability of hardware and software clearly, the composition and steps of each example have been described above generally in terms of function. Whether these functions are performed in hardware or in software depends on the specific application and the design constraints of the technical solution. Those skilled in the art may implement the described functions in different ways for each specific application, but such implementations should not be considered to go beyond the scope of the invention.
The steps of the method or algorithm described in connection with the embodiments disclosed herein can be implemented directly in hardware, in a software module executed by a processor, or in a combination of the two. The software module may reside in random access memory (RAM), memory, read-only memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the technical field.
The distributed storage system online upgrade method, apparatus, device and readable storage medium provided by the invention have been described in detail above. Specific examples are used herein to explain the principle and implementation of the invention, and the description of the above embodiments is only intended to help in understanding the method of the invention and its core idea. It should be noted that those of ordinary skill in the art may make several improvements and modifications to the invention without departing from its principle, and these improvements and modifications also fall within the protection scope of the claims of the invention.

Claims (10)

1. A distributed storage system online upgrade method, characterized by comprising:
collecting node information for the obtained IPs of the nodes to be upgraded, and backing up and storing the collected node information;
upgrading the version of the nodes to be upgraded one by one, and querying the cluster state after each node is upgraded;
when the upgrade fails, restoring the node information of the failed node from the node information backed up in advance, recording the upgrade failure information, and terminating the current upgrade; wherein the upgrade failure includes installation failure and abnormal cluster state, and the upgrade failure information includes the failed node, the error cause and the upgrade progress.
2. The distributed storage system online upgrade method according to claim 1, characterized in that the method for obtaining the IPs of the nodes to be upgraded comprises:
obtaining the cluster information in the system;
parsing node IPs from the cluster information to obtain the node IPs;
deduplicating the node IPs to obtain the IPs of the nodes to be upgraded.
3. The distributed storage system online upgrade method according to claim 1, characterized in that, before the version of the nodes to be upgraded is upgraded one by one, the method further comprises:
detecting the cluster's current capacity utilization;
if the current capacity utilization is below a threshold, upgrading the version of the nodes to be upgraded one by one;
if the current capacity utilization is not below the threshold, terminating the current upgrade.
4. The distributed storage system online upgrade method according to claim 1, characterized by further comprising:
while the version of the nodes to be upgraded is upgraded one by one, writing the real-time node upgrade progress to a progress file;
obtaining the current upgrade progress information from the progress file, and outputting the obtained current upgrade progress information to a terminal.
5. The distributed storage system online upgrade method according to claim 4, characterized in that outputting the obtained current upgrade progress information to the terminal comprises:
converting the obtained current upgrade progress information into progress bar form, and outputting the resulting upgrade progress bar to the terminal.
6. The distributed storage system online upgrade method according to claim 1, characterized by further comprising:
when the upgrade fails, outputting the prompt information corresponding to the upgrade failure.
7. The distributed storage system online upgrade method according to any one of claims 1 to 6, characterized by further comprising:
when the upgrade fails, obtaining and recording the IPs of the nodes already upgraded in the current upgrade process, thereby obtaining a blacklist;
before the next upgrade, traversing the nodes recorded in the blacklist and removing the blacklisted nodes from the nodes to be upgraded, thereby obtaining a reduced list of nodes to be upgraded;
upgrading the version of the nodes to be upgraded one by one then specifically means upgrading the version of the nodes in the reduced list one by one.
8. A distributed storage system online upgrade apparatus, characterized by comprising:
an information acquisition unit, used to collect node information for the obtained IPs of the nodes to be upgraded, and to back up and store the collected node information;
a version upgrade unit, used to upgrade the version of the nodes to be upgraded one by one, and to query the cluster state after each node is upgraded;
a failure handling unit, used, when the upgrade fails, to restore the node information of the failed node from the node information backed up in advance, to record the upgrade failure information, and to terminate the current upgrade; wherein the upgrade failure includes installation failure and abnormal cluster state, and the upgrade failure information includes the failed node, the error cause and the upgrade progress.
9. A distributed storage system online upgrade device, characterized by comprising:
a memory, used to store a program;
a processor, used to implement the steps of the distributed storage system online upgrade method according to any one of claims 1 to 7 when executing the program.
10. A readable storage medium, characterized in that a program is stored on the readable storage medium, and when the program is executed by a processor, the steps of the distributed storage system online upgrade method according to any one of claims 1 to 7 are implemented.
CN201811013865.6A 2018-08-31 2018-08-31 Distributed memory system online upgrading method, apparatus, equipment and storage medium Pending CN108768758A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811013865.6A CN108768758A (en) 2018-08-31 2018-08-31 Distributed memory system online upgrading method, apparatus, equipment and storage medium

Publications (1)

Publication Number Publication Date
CN108768758A true CN108768758A (en) 2018-11-06

Family

ID=63966647

Country Status (1)

Country Link
CN (1) CN108768758A (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6304882B1 (en) * 1998-05-05 2001-10-16 Informix Software, Inc. Data replication system and method
USRE41162E1 (en) * 2000-02-28 2010-03-02 Lucent Technologies Inc. Method for providing scaleable restart and backout of software upgrades for clustered computing
CN102103613A (en) * 2009-12-22 2011-06-22 中兴通讯股份有限公司 Distributed database upgrade method, upgrade processing device and upgrade control device
CN102427466A (en) * 2011-08-24 2012-04-25 厦门雅迅网络股份有限公司 Long-distance updating system and long-distance software automatic updating method based on same
CN105468418A (en) * 2015-12-09 2016-04-06 上海爱数信息技术股份有限公司 System and method for upgrading software of smart terminal cluster
CN105635216A (en) * 2014-11-03 2016-06-01 华为软件技术有限公司 Distributed application upgrade method, device and distributed system
CN107943510A (en) * 2017-11-23 2018-04-20 郑州云海信息技术有限公司 Distributed memory system upgrade method, system, device and readable storage medium storing program for executing

Cited By (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2021022713A1 (en) * 2019-08-05 2021-02-11 平安科技(深圳)有限公司 Distributed module update method, device, and storage medium
CN110688125A (en) * 2019-08-28 2020-01-14 北京浪潮数据技术有限公司 Deployment method and system of big data platform
CN110795121A (en) * 2019-09-27 2020-02-14 北京浪潮数据技术有限公司 Virtualization system upgrading method, device, equipment and computer readable storage medium
CN110795123A (en) * 2019-10-18 2020-02-14 苏州浪潮智能科技有限公司 Version upgrading method and system for power supply unit in storage system and related device
CN111338678A (en) * 2020-03-08 2020-06-26 苏州浪潮智能科技有限公司 Method and equipment for upgrading and checking storage system
CN111694516A (en) * 2020-05-26 2020-09-22 苏州浪潮智能科技有限公司 Version online upgrading method and terminal of distributed block storage system
CN111694516B (en) * 2020-05-26 2022-07-19 苏州浪潮智能科技有限公司 Version online upgrading method and terminal of distributed block storage system
CN112333471A (en) * 2020-11-05 2021-02-05 上海网达软件股份有限公司 Hot upgrading method, device, equipment and storage medium of audio and video online transcoder
CN112650624A (en) * 2020-12-25 2021-04-13 浪潮(北京)电子信息产业有限公司 Cluster upgrading method, device and equipment and computer readable storage medium
CN112650624B (en) * 2020-12-25 2023-05-16 浪潮(北京)电子信息产业有限公司 Cluster upgrading method, device, equipment and computer readable storage medium
CN113031987A (en) * 2021-03-26 2021-06-25 山东英信计算机技术有限公司 Method, system and device for upgrading client
CN114844799A (en) * 2022-05-27 2022-08-02 深信服科技股份有限公司 Cluster management method and device, host equipment and readable storage medium
CN115827789A (en) * 2023-02-21 2023-03-21 苏州浪潮智能科技有限公司 Method, system, equipment and storage medium for optimizing file type database upgrading

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20181106