CN101079896A - A multi-availability mechanism coexistence framework of concurrent storage system - Google Patents

Info

Publication number
CN101079896A
Authority
CN
China
Prior art keywords
data
framework
state
high available
available mechanism
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN 200710018108
Other languages
Chinese (zh)
Other versions
CN101079896B (en
Inventor
伍卫国
张虎
董小社
钱德沛
王恩东
胡雷钧
戴罗庚
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Inspur Beijing Electronic Information Industry Co Ltd
Xian Jiaotong University
Original Assignee
Inspur Beijing Electronic Information Industry Co Ltd
Xian Jiaotong University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Inspur Beijing Electronic Information Industry Co Ltd, Xian Jiaotong University filed Critical Inspur Beijing Electronic Information Industry Co Ltd
Priority to CN200710018108A priority Critical patent/CN101079896B/en
Publication of CN101079896A publication Critical patent/CN101079896A/en
Application granted granted Critical
Publication of CN101079896B publication Critical patent/CN101079896B/en
Expired - Fee Related legal-status Critical Current
Anticipated expiration legal-status Critical

Abstract

The invention discloses a framework in which multiple high-availability mechanisms coexist in a parallel storage system. The framework comprises a state detection and control framework, a data service framework, a metadata service framework, a data synchronization framework, a client framework, a system management framework, and high-availability mechanism modules that can be loaded and unloaded online. The high-availability mechanism modules implement, as plug-ins, the interface functions that the frameworks need to call. Each framework calls the corresponding interface implemented in the module that matches the high-availability mechanism type used by the logical data in order to carry out the specific function. Users can select the most suitable of the available high-availability mechanisms to ensure the reliability and availability of their logical data.

Description

A multi-availability mechanism coexistence framework for a parallel storage system
Technical field
The present invention relates to the field of computer application technology. It is a multi-availability mechanism coexistence framework for parallel storage systems, in particular for distributed storage systems built on parallel file systems and distributed file systems.
Background art
A high-availability system is one in which a software or hardware fault does not cause the system to stop providing service; the system is allowed to keep operating despite the failure. In parallel storage systems, the prior art mostly achieves this through data redundancy: if some data becomes unavailable, its backup copy can continue to provide the service. A high-availability system usually consists of two or more nodes that are connected to clients over the internet, and each node has its own local storage space.
Most existing high-availability systems provide only a single high-availability mechanism, and all logical data relies on that one mechanism for data safety. Because different logical data has different availability requirements, using a single mechanism inevitably causes performance loss and wasted storage space. Although some high-availability systems allow the high-availability mechanism to be configured dynamically as needed, they cannot dynamically determine which mechanism to use according to the requirements of the logical data.
When implementing high availability for a parallel storage system, it is therefore necessary to provide a multi-availability mechanism coexistence framework that supports multiple high-availability mechanisms and lets different logical data use different mechanisms. The framework proposed in this patent is intended to meet this need. With its support, the user can choose a suitable mechanism from those provided by the system, based on the availability requirements and read/write characteristics of the logical data and on the user's quality-of-service requirements for that data, so as to ensure the reliability of the logical data and the availability of the data service.
Summary of the invention
The object of the invention is to overcome the above deficiencies of the prior art by providing a multi-availability mechanism coexistence framework for a parallel storage system that lets the user select the high-availability mechanism for each piece of logical data, thereby reducing unnecessary performance loss and disk redundancy.
The technical solution of the invention is as follows. The framework consists of seven parts: a state detection and control framework, a data service framework, a metadata service framework, a data synchronization framework, a client framework, a system management framework, and high-availability mechanism modules. The state detection and control framework detects and controls the state of all entities on its node. The data service framework creates the concrete data service threads, dispatches requests to them, and carries out the data redundancy and service takeover functions required by a specific high-availability mechanism. The metadata service framework calls different functions to perform metadata operations depending on the high-availability mechanism of the logical data. The data synchronization framework supports the coexistence of data synchronization threads for multiple high-availability mechanisms and carries out synchronization between mutually redundant copies of data. The client framework provides a complete set of functions for user access to the parallel storage system, supports multiple high-availability mechanism modules, and calls the function of the corresponding mechanism according to the mechanism type of the request. The system management framework provides an interface for system configuration, system monitoring, and system control. The high-availability mechanism modules implement, as plug-ins, the functional interfaces of the other six parts.
The workflow of the whole framework is as follows:
a. When the user initiates a read or write access to logical data, a function in the client framework first sends a request to the metadata service framework to obtain the metadata of that logical data. The metadata indicates the high-availability mechanism type used by this piece of logical data.
b. The client framework then calls, according to the high-availability mechanism type of the logical data, the client-framework interface function implemented in the corresponding high-availability mechanism module. This function sends an access request to the data service framework on the data node to carry out the read or write operation.
c. According to the high-availability mechanism type attached to the access request, the data service framework dispatches the request to the corresponding data service thread. That thread processes the request and responds to the client framework, completing the data request.
d. If the data service framework or a data service thread on some data node cannot be accessed, the client framework sends a state confirmation request to the state detection and control framework of that data node to confirm the current state of the data service thread. On receiving the request, the state detection and control framework calls the corresponding state query function to confirm the current state of the queried entity. At the same time, it calls the relevant state query functions to communicate with the state detection and control frameworks on the nodes that host this entity's related entities and obtains their current states. Using all the states obtained, it looks up the state transition table in the configuration information of this high-availability mechanism. When the antecedent of an entry in the table matches all the current states, the state of the local entity is set to the state given in the consequent of that entry; if no entry matches, nothing is done. The resulting entity state is then returned to the client.
e. When a data node fails while the related data service threads on other data nodes are running normally, the data service frameworks on those related data nodes record a modification log of the redundant data associated with the failed node's data.
f. After a fault occurs, the system administrator can learn through the system management framework that a data service thread has failed. After manual intervention, the administrator starts the data synchronization procedure through the system management framework. The data synchronization framework on the formerly failed node loads the synchronization function of the high-availability mechanism module and starts a data synchronization thread, which accesses the data synchronization frameworks of the other related nodes and synchronizes the failed node's data according to the modification log. When data synchronization completes, the state detection and control framework on this data node is notified and adjusts the state of the data service thread.
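The fault-free part of this workflow (steps a to c) can be illustrated with a minimal Python sketch. The patent does not specify any programming interface, so every class, method, and field name here is hypothetical, as is the "mirror" mechanism type; plain callables stand in for data service threads.

```python
from dataclasses import dataclass


@dataclass
class Metadata:
    data_nodes: list   # nodes that hold the blocks of the logical data
    ha_type: str       # high-availability mechanism type (returned in step a)


class MirrorClientPlugin:
    """Hypothetical client-side interface of a 'mirror' mechanism module."""

    def read(self, data_service, name, meta):
        # Step b: the plug-in sends the request, tagged with the HA type.
        return data_service.handle(
            {"op": "read", "name": name, "ha_type": meta.ha_type})


class DataServiceFramework:
    """Step c: dispatch each request to the handler for its HA type."""

    def __init__(self):
        self.handlers = {}   # ha_type -> callable standing in for a thread

    def register(self, ha_type, handler):
        self.handlers[ha_type] = handler

    def handle(self, request):
        return self.handlers[request["ha_type"]](request)


class ClientFramework:
    def __init__(self, metadata_service, plugins, data_service):
        self.metadata_service = metadata_service   # name -> Metadata
        self.plugins = plugins                     # ha_type -> client plug-in
        self.data_service = data_service

    def read(self, name):
        meta = self.metadata_service[name]         # step a: fetch metadata
        plugin = self.plugins[meta.ha_type]        # step b: pick the plug-in
        return plugin.read(self.data_service, name, meta)


service = DataServiceFramework()
service.register("mirror", lambda req: f"mirror-served:{req['name']}")
client = ClientFramework(
    {"fileA": Metadata(["dn1", "dn2"], "mirror")},
    {"mirror": MirrorClientPlugin()},
    service,
)
result = client.read("fileA")
```

In this sketch the mechanism type travels with the request, so the data service side never needs to consult the metadata service itself.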
The parallel storage system consists of multiple nodes interconnected by a network. By function, nodes fall into four types: metadata nodes, data nodes, client nodes, and system management nodes. A metadata node stores the descriptive information of logical data and responds to access requests for that information. The file data itself is split into logical data blocks that are stored on multiple data nodes. A client node is a user of logical data. The management node provides system configuration and management functions to the system administrator.
A state detection and control framework runs on every data node and metadata node of the parallel storage system and is responsible for detecting and controlling the state of all entities on that node. If the state of an entity on another node is needed, the local state detection and control framework communicates with its counterpart on that node, which obtains the state of the required entity. Entities are all service programs related to high-availability mechanisms, including the data service threads and data synchronization threads of each mechanism. The parallel storage system needs to monitor the running condition of these entities in real time and obtain the state of each one. The set of entity states differs with the high-availability mechanism type, but always includes at least two states that indicate whether the entity is active or stopped.
When it starts, the data service framework creates the data service threads and loads the concrete data service functions. It then dispatches each access request for a piece of logical data to the data service thread that corresponds to the high-availability mechanism type of that logical data; the thread responds to the request and returns the data the client needs. The data service thread also carries out the data redundancy and service takeover functions required by its specific high-availability mechanism. The data service framework supports the coexistence of multiple high-availability mechanism modules and can load and unload them dynamically.
Each high-availability mechanism has one or more corresponding data service threads. From the point of view of the state detection and control framework, each data service thread is an entity. Entity states include the active state, the backup state, the synchronizing state, and the stopped state, each representing a different running condition. The data service thread and the data service framework provide an access interface to the entity state that may be used only by the state detection and control framework on the same node; an entity can also change its own state autonomously.
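As an illustration only, the four entity states and the node-local access rule could be modelled as below. This is a sketch under assumptions: the enum members mirror the states named above, but the class, its methods, and the way the caller's node is identified are all hypothetical.

```python
from enum import Enum


class EntityState(Enum):
    ACTIVE = "active"
    BACKUP = "backup"
    SYNCHRONIZING = "synchronizing"
    STOPPED = "stopped"


class DataServiceEntity:
    """A data service thread as seen by the state detection and control framework."""

    def __init__(self, ha_type, node):
        self.ha_type = ha_type
        self.node = node
        self._state = EntityState.STOPPED

    def _check_same_node(self, caller_node):
        # Assumption: the caller identifies itself by node name.
        if caller_node != self.node:
            raise PermissionError("state interface is local to the node")

    def get_state(self, caller_node):
        self._check_same_node(caller_node)
        return self._state

    def set_state(self, caller_node, new_state):
        self._check_same_node(caller_node)
        self._state = new_state

    def stop_self(self):
        # The entity may also change its own state autonomously.
        self._state = EntityState.STOPPED


entity = DataServiceEntity("mirror", "dn1")
entity.set_state("dn1", EntityState.ACTIVE)
state_after_set = entity.get_state("dn1")
try:
    entity.get_state("dn2")
    remote_denied = False
except PermissionError:
    remote_denied = True
entity.stop_self()
```

A state detection and control framework on another node would have to ask the framework on `dn1` rather than call the entity directly, which matches the node-to-node communication described for cross-node state queries.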
The metadata service framework supports multiple high-availability mechanisms and calls different interface functions to perform metadata operations depending on the high-availability mechanism of the logical data. The metadata operations are implemented in the different high-availability mechanism modules, which are loaded and unloaded dynamically.
The data synchronization framework supports the coexistence of data synchronization threads for multiple high-availability mechanisms and provides, for fault recovery under a high-availability mechanism, the synchronization between mutually redundant copies of data. Data synchronization threads are of two types: data synchronization server threads, which provide the data to be synchronized, and data synchronization client threads, which request it. The data synchronization framework dispatches each data synchronization request to the corresponding data synchronization server thread according to the high-availability mechanism type involved and the type of the requesting thread.
The client framework provides a complete set of function interfaces for user access to the parallel storage system; the user accesses data in the system by calling these interfaces. The client framework also supports the coexistence of multiple high-availability mechanisms, provides operations for loading and unloading high-availability mechanism modules, and calls the appropriate function to complete a data access according to the high-availability mechanism type of the logical data being accessed. Each high-availability mechanism provides a set of client access functions that correspond to the function interfaces offered by the client framework. With the support of these functions, the client carries out the data redundancy, failover, and fault recovery operations specific to each mechanism.
The system management framework provides a user-friendly interface for managing the parallel storage system, through which system configuration, system monitoring, and system control are performed. System configuration covers the parallel storage system itself, including system scale, node types, and node information, as well as the configuration of the high-availability mechanism modules. System monitoring covers the overall information of the system and the running information of each node, together with the entity information and entity states of all high-availability mechanism modules, giving the system administrator a complete view of the system. System control provides real-time control of all nodes in the system and state control of the entities of all high-availability mechanisms. The administrator can use these control functions to manually start and stop data services and data synchronization operations in the system.
The high-availability mechanism modules are implemented as separate plug-ins, each of which implements the functional interfaces of the other six frameworks. According to the high-availability mechanism type used by the logical data, these frameworks select and call the interface functions implemented in the corresponding high-availability mechanism module to carry out data access, fault recovery, and system management.
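One way to picture such a plug-in is as an abstract interface with one hook per framework, plus a registry that supports loading and unloading at run time. The sketch below is an assumption made for illustration: the patent does not name the interface functions, so the method names, the `ModuleRegistry` class, and the trivial "none" mechanism are all hypothetical.

```python
from abc import ABC, abstractmethod


class HAMechanismModule(ABC):
    """One plug-in per mechanism, with one hook for each of the six frameworks."""

    ha_type = "abstract"

    @abstractmethod
    def query_state(self, entity_id): ...      # state detection and control

    @abstractmethod
    def serve_data(self, request): ...         # data service

    @abstractmethod
    def metadata_op(self, request): ...        # metadata service

    @abstractmethod
    def synchronize(self, request): ...        # data synchronization

    @abstractmethod
    def client_access(self, request): ...      # client

    @abstractmethod
    def manage(self, command): ...             # system management


class NoRedundancyModule(HAMechanismModule):
    """Trivial example mechanism that keeps a single copy of each block."""

    ha_type = "none"

    def __init__(self):
        self.blocks = {}

    def query_state(self, entity_id):
        return "active"

    def serve_data(self, request):
        return self.blocks.get(request["block"])

    def metadata_op(self, request):
        return {"ha_type": self.ha_type}

    def synchronize(self, request):
        return []   # nothing to resynchronize without redundancy

    def client_access(self, request):
        return self.serve_data(request)

    def manage(self, command):
        return f"{self.ha_type}:{command}"


class ModuleRegistry:
    """Loads and unloads mechanism modules at run time, keyed by HA type."""

    def __init__(self):
        self._modules = {}

    def load(self, module):
        self._modules[module.ha_type] = module

    def unload(self, ha_type):
        self._modules.pop(ha_type, None)

    def get(self, ha_type):
        return self._modules[ha_type]


registry = ModuleRegistry()
registry.load(NoRedundancyModule())
reply = registry.get("none").manage("status")
registry.unload("none")
```

Because each framework looks a module up by mechanism type, adding a new mechanism in this sketch only requires registering another subclass.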
Because it adopts the above technical solution, the invention can dynamically accommodate multiple high-availability mechanisms at the same time and, according to the availability requirements of the logical data, call the appropriate high-availability mechanism thread provided by the system to handle it. Different types of logical data use the mechanism that suits them best, which avoids the unnecessary performance loss and wasted storage space caused by using a single mechanism. Quality of service is also guaranteed while the user's availability requirements for the data are met.
Description of drawings
Fig. 1 is a structural diagram of a parallel storage system that uses this multi-availability mechanism coexistence framework.
Fig. 2 is a flow chart of file access in the parallel file system when there is no fault.
Fig. 3 is a diagram of the software composition of a data node.
Fig. 4 is a diagram of the software composition of a metadata node.
Fig. 5 is a diagram of the software composition of a client node.
Fig. 6 is a flow chart of data synchronization during fault recovery.
Fig. 7 is a flow chart of client access when a data node fails.
The invention is described in further detail below with reference to the accompanying drawings.
Embodiment
As shown in Fig. 1, a parallel storage system that uses this framework consists of multiple nodes interconnected by a network. By function, nodes fall into four types: metadata nodes, data nodes, client nodes, and system management nodes. A metadata node stores the descriptive information of logical data and responds to access requests for that information. Logical data is split into data blocks that are stored on multiple data nodes. A client node is a user of logical data. The management node provides system configuration and management functions to the system administrator.
As shown in Fig. 2, file access in the parallel storage system when there is no fault proceeds as follows. The client first accesses the metadata service module through a function in the client framework and obtains the metadata of the logical data. According to the high-availability mechanism type, the client framework calls the function of the corresponding mechanism to access the data service framework on the data node. The data service framework receives the request over the network, determines the high-availability mechanism type of the logical data from the metadata attached to the request, and dispatches the request to the data service thread of that mechanism type. That thread processes the request and responds to the client, completing the data request.
As shown in Fig. 3, the software on a data node comprises the state detection and control framework, the data service framework, the various high-availability mechanism modules, the management agent, and the data synchronization framework. The state detection and control framework calls the corresponding detection and control function interfaces in the various high-availability mechanism modules and is responsible for detecting and setting the states of the various entities. The data service framework calls the corresponding data service functions and is responsible for the data service on this node. The management agent receives requests from the management node and carries out the operations they require. The data synchronization framework calls the corresponding data synchronization function interfaces and is responsible for bringing data back into consistency during fault recovery.
As shown in Fig. 4, the software on a metadata node comprises the state detection and control framework, the metadata service framework, the various high-availability mechanism modules, and the management agent. The state detection and control framework obtains and sets the state of the entities related to a high-availability mechanism by calling the specific functions of that mechanism's module. The metadata service framework likewise calls functions in the concrete high-availability mechanism module to carry out the metadata operations related to that mechanism.
As shown in Fig. 5, the software on a client node comprises various library (lib) functions, the client framework, the various high-availability mechanism modules, and the management agent. When an operation calls a library function, the library function completes the operation by having the client framework call the corresponding function in the appropriate high-availability mechanism module. Every high-availability mechanism module must implement the interface that the client framework requires.
As shown in Fig. 6, data synchronization during fault recovery proceeds as follows. The system management framework sends a request to start data synchronization to the state detection and control framework of the failed data node, and the request is forwarded to the data synchronization framework of that node. According to the type of the high-availability mechanism, the data synchronization framework asks the state detection and control framework to set the service state of that mechanism to blocked, and starts all related data synchronization client threads. The state detection and control framework then sends a request to the other related data nodes to start their data synchronization server threads; by the time this request completes, the data service threads of the corresponding high-availability mechanism on the current node have also been blocked. Once a data synchronization server thread has started, it analyzes the recorded modification log, reorganizes the log, generates a synchronization list, and waits for access from the data synchronization client threads. Once a data synchronization client thread has started, it requests synchronization items directly from the data synchronization server threads of the related nodes until synchronization is complete. Finally, the log information is deleted, the client and server sides of the data synchronization threads each notify their own state detection and control framework that synchronization has finished, the state detection and control framework sets the state of the data service back to normal, and the data synchronization threads are terminated.
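The log-driven part of this procedure can be sketched in a few lines of Python. The patent does not describe the log format or how the log is reorganized, so the sketch assumes that a log entry is a `(block_id, data)` pair and that reorganizing means keeping only the last write to each block; the function names and the `"normal"` state string are likewise hypothetical.

```python
def build_sync_list(mod_log):
    """Server side: compact the modification log into a synchronization list.

    Assumption: later log entries for a block supersede earlier ones.
    """
    latest = {}
    for block_id, data in mod_log:
        latest[block_id] = data
    return sorted(latest.items())


def resynchronize(recovered_store, mod_log):
    """Client side: pull every synchronization item, then finish the procedure."""
    for block_id, data in build_sync_list(mod_log):
        recovered_store[block_id] = data
    mod_log.clear()      # the log information is deleted at the end
    return "normal"      # state to which the data service is reset


# Blocks on the recovered node before synchronization.
store = {1: b"old", 2: b"keep"}
# Modifications recorded on a related node while the node was down.
log = [(1, b"v1"), (3, b"new"), (1, b"v2")]
final_state = resynchronize(store, log)
```

Compacting the log first means that a block rewritten many times during the outage is transferred only once.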
As shown in Fig. 7, a client node comprises the user process, the client framework, and the high-availability mechanism modules; a data node comprises the state detection and control framework, the data service framework, the data synchronization framework, the management agent, and the high-availability mechanism modules. When a data node fails, client access proceeds as follows. The client first fails to connect to the data service framework, and then queries the state detection and control framework for the state of the current high-availability mechanism thread. The state detection and control framework calls the state query function in the current high-availability mechanism module to obtain the state of this entity and of its related entities, and returns the result to the client. The management agent reports the state change to the system management module.
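From the client's side, this flow amounts to a retry loop over the nodes that the high-availability mechanism allows it to use. The following Python sketch assumes that the transport and the state query are supplied as callables and that `"active"` denotes an accessible entity; the exception class and all function names are hypothetical.

```python
class NodeUnavailable(Exception):
    """Raised by the transport when a data node cannot be reached."""


def read_with_failover(nodes, name, send, query_state):
    """Try each candidate node in turn until one of them answers."""
    for node in nodes:
        try:
            return send(node, name)
        except NodeUnavailable:
            # Ask the node's state detection and control framework.
            if query_state(node) == "active":
                try:
                    return send(node, name)   # one more attempt
                except NodeUnavailable:
                    pass
            # Otherwise fall through to the next node.
    raise OSError(f"no reachable replica for {name!r}")


def send(node, name):
    # Stand-in transport: node dn1 is down, every other node answers.
    if node == "dn1":
        raise NodeUnavailable(node)
    return f"{name}@{node}"


def query_state(node):
    # Stand-in state query: the failed node reports that it is stopped.
    return "stopped"


result = read_with_failover(["dn1", "dn2"], "fileA", send, query_state)
```

Which nodes appear in `nodes`, and in what order, would be decided by the mechanism's client-side functions.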
Building a high-availability mechanism coexistence framework suitable for a parallel storage system involves building its seven parts: the state detection and control framework, the data service framework, the metadata service framework, the client framework, the data synchronization framework, the system management module, and the high-availability mechanism modules.
The state detection and control framework acts as the agent for state requests concerning the entities on its node. It is responsible for detecting and controlling the state of all entities on the node, so as to satisfy the entity-state requests that arise in the failover and fault recovery parts of a high-availability mechanism. In implementation it is a resident program that exists on every data node and metadata node of the system and loads all high-availability mechanism modules when it starts. Its work includes receiving external requests concerning entity state, sending state requests to the monitored entities, controlling entity state as required, and returning responses. The state detection and control framework loads and unloads the various high-availability mechanism modules dynamically.
The data service framework is also a resident program and exists on every data node. When it starts, it loads all high-availability mechanism modules and creates at least one data service thread for each module. According to the high-availability mechanism type of the logical data requested by the client, the data service framework dispatches each access request for that logical data to the corresponding data service thread, which responds to the request and returns the data the client needs. The data service thread also carries out functions such as the data redundancy and service takeover required by its specific high-availability mechanism. The data service framework supports the coexistence of multiple high-availability mechanism modules, can dynamically load and unload the data service threads of the various modules, and dispatches client requests correctly according to their content. It also handles operations common to all mechanisms, such as restarting the service.
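A minimal sketch of this dispatching, assuming one worker thread and one request queue per mechanism, is given below. The use of Python's `threading` and `queue` modules, the request format, and the "mirror" and "parity" mechanism names are illustrative assumptions rather than details taken from the patent.

```python
import queue
import threading


class DataServiceFramework:
    """Creates one worker thread per mechanism and dispatches requests by HA type."""

    def __init__(self, modules):
        # modules: ha_type -> function that serves one request
        self._queues = {}
        self._threads = []
        for ha_type, serve in modules.items():
            q = queue.Queue()
            t = threading.Thread(target=self._worker, args=(q, serve), daemon=True)
            t.start()
            self._queues[ha_type] = q
            self._threads.append(t)

    @staticmethod
    def _worker(q, serve):
        while True:
            item = q.get()
            if item is None:          # shutdown marker
                break
            request, reply = item
            reply.put(serve(request))

    def submit(self, request):
        """Dispatch to the thread of the request's mechanism and wait for the reply."""
        reply = queue.Queue(maxsize=1)
        self._queues[request["ha_type"]].put((request, reply))
        return reply.get(timeout=5)

    def shutdown(self):
        for q in self._queues.values():
            q.put(None)
        for t in self._threads:
            t.join(timeout=5)


framework = DataServiceFramework({
    "mirror": lambda req: ("mirror", req["block"]),
    "parity": lambda req: ("parity", req["block"]),
})
answers = [
    framework.submit({"ha_type": "mirror", "block": 7}),
    framework.submit({"ha_type": "parity", "block": 9}),
]
framework.shutdown()
```

Keeping a separate queue per mechanism means that a slow or blocked mechanism does not delay requests addressed to the others.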
The metadata service framework is deployed on the metadata node and is also a resident program. When it starts, it loads all high-availability mechanism modules, obtains pointers to the processing functions associated with each mechanism, and begins to provide the metadata service. It receives metadata requests from other nodes, calls different functions depending on the high-availability mechanism to carry out the mechanism-specific operations, and returns a reply.
The client framework provides a complete set of function interfaces for user access to the parallel storage system, either as a function library interface or through the VFS interface; the user accesses data in the parallel storage system by calling this interface. The concrete access interface differs from one parallel storage system to another. To support the coexistence of multiple high-availability mechanisms, the client framework provides operations for loading and unloading high-availability mechanism modules, and it calls the appropriate function to complete a data access according to the high-availability mechanism type of the logical data being accessed. In a concrete operation, the client specifies the high-availability mechanism that the logical data requires; the user access function interface loads that mechanism's module according to the configuration information, obtains pointers to the concrete mechanism-specific functions, and calls them to complete the operation. Each high-availability mechanism provides a set of client access functions that correspond to the interface functions offered by the client framework. With the support of these functions, the client carries out operations such as the data redundancy, failover, and fault recovery specific to each mechanism.
The data synchronization framework provides, for the fault recovery stage of a high-availability mechanism, the synchronization between mutually redundant copies of data, and it supports the coexistence of data synchronization threads for multiple high-availability mechanisms. In implementation it is a resident program that exists on every data node. When the resident program starts, it loads all modules according to the configuration information but does not create any threads; concrete threads are created only when a synchronization request is received. Data synchronization threads are of two types: data synchronization server threads, which provide the data to be synchronized, and data synchronization client threads, which request it. The data synchronization framework dispatches the request of a data synchronization client thread to the corresponding data synchronization server thread according to the high-availability mechanism type involved and the type of the synchronization thread. As an entity monitored by the state detection and control framework, a data synchronization thread has several running states. External control of its state is performed by the state detection and control framework on its node.
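The load-at-start-up, create-on-request behaviour can be sketched as follows. For brevity the sketch uses a plain callable in place of a real server thread, and the request fields (`role`, `ha_type`, `item`), the factory convention, and the class name are all hypothetical.

```python
class DataSyncFramework:
    """Loads mechanism modules at start-up but creates server handlers lazily."""

    def __init__(self, modules):
        # modules: ha_type -> factory that builds a server-side handler
        self._factories = dict(modules)   # loaded when the program starts
        self._servers = {}                # created on the first request

    def handle(self, request):
        # Only requests from synchronization client threads go to a server.
        if request["role"] != "client":
            raise ValueError("expected a request from a sync client thread")
        ha_type = request["ha_type"]
        if ha_type not in self._servers:
            self._servers[ha_type] = self._factories[ha_type]()
        return self._servers[ha_type](request)


created = []


def mirror_server_factory():
    # Stand-in for starting a data synchronization server thread.
    created.append("mirror")
    pending = {"blk1": b"a"}
    return lambda req: pending.get(req["item"])


sync = DataSyncFramework({"mirror": mirror_server_factory})
created_before_request = list(created)
first = sync.handle({"role": "client", "ha_type": "mirror", "item": "blk1"})
second = sync.handle({"role": "client", "ha_type": "mirror", "item": "blk2"})
```

Creating handlers lazily keeps a node that never needs recovery from holding idle synchronization threads.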
The system management module provides a user-friendly interface for managing the parallel storage system, through which system configuration, system monitoring, and system control are performed. System configuration covers the parallel storage system itself, including system scale, node types, and node information, as well as the configuration of the high-availability mechanism modules: the number of high-availability mechanisms supported, the basic information of the corresponding modules, and the configuration information specific to each mechanism. System monitoring covers the overall information of the system and the running information of each node, together with the entity information and entity states of all high-availability mechanism modules, giving the system administrator a complete view of the system. System control provides real-time control of all nodes in the system and state control of the entities of all high-availability mechanisms. The administrator can use these controls to manually start and stop data services and data synchronization operations in the system. In implementation, every service node, including metadata nodes and data nodes, has a corresponding management agent module, and the system management module communicates with these agents to obtain the information it needs from each node.
The different high-availability mechanism modules are implemented as separate plug-ins, each of which implements the functional interfaces of the other six parts. According to the high-availability mechanism type used by the logical data, the other frameworks select and call the interfaces implemented in the corresponding high-availability mechanism module and cooperate with one another to carry out data access, fault recovery, and system management.
When a user accesses a segment of logical data, a function in the client framework first contacts the metadata service framework and obtains the metadata of that logical data, which includes the type of high-availability mechanism the data uses. If the logical data is being created, the create operation must specify the high-availability mechanism type that the data will use. The metadata obtained is kept in the client program. In subsequent data accesses, the client framework calls the function of the corresponding high-availability mechanism, selected by mechanism type, to access the data service framework on the data node. The request travels over the network to the data service framework, which determines the high-availability mechanism type from the logical-data metadata attached to the request and dispatches the request to a data service thread of the corresponding mechanism. That thread processes the request and responds to the client, completing the data request.
If the data node targeted by a request fails, so that the data service framework or a data service thread on that node cannot be accessed normally, the client function sends a state confirmation request to the state detection and control framework of that node to confirm the current state of the data service thread. On receiving the request, the state detection and control framework uses the high-availability mechanism type and the identifier of the requested entity to call that entity's state query function and confirm its current state. At the same time, for every entity related to it, the framework calls the related entity's state query function, communicating with the state detection and control frameworks on other nodes, to obtain its current state. With all the states obtained, it looks up the state transition table of the high-availability mechanism: when the antecedent of an entry matches all current states, the state of the local entity is set to the state given in the consequent of that entry. If no entry matches, nothing is done. The entity state after the transition is then returned to the client. Based on the latest state of the data service thread, the client decides whether access can proceed. If the entity can no longer be accessed, the client may select another node according to the high-availability mechanism in use and repeat the above process until it receives a correct response or an error is returned. Entity state requests are distinguished by initiator: when state detection and control frameworks exchange states with one another, the state returned is the entity state before the transition; for any other initiator, it is the entity state after the transition.
During data storage, common high-availability mechanisms produce some redundant data to keep the storage system available when a node fails. The operation that produces the redundancy can be implemented in the data service thread or in the client function. When running normally, the data service thread writes redundant data to the storage medium of the data node as usual. If a node fails, data destined for the failed node cannot be written to its storage medium, so changes to the redundant data related to the failed node's data must be recorded on the other related data nodes. The data service framework therefore keeps a modification log associated with the logical data: each data service thread has one modification log file, which records the modifications made to related data while other nodes are failed and serves as the basis for data synchronization when the failed node recovers.
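The state transition table lookup described above can be pictured with a minimal Python sketch. The state names, the rule layout (an antecedent tuple listing the local entity's state followed by the states of its related entities, plus a consequent state), and the sample table are all assumptions made for illustration; the text only requires that an entry whose antecedent matches every current state sets the local entity to the consequent, and that an unmatched lookup changes nothing.

```python
from typing import List, Optional, Sequence, Tuple

# Hypothetical state names; the text names active, backup, synchronizing and stopped.
ACTIVE, BACKUP, SYNCING, STOPPED = "active", "backup", "syncing", "stopped"

# A rule is (antecedent, consequent). The antecedent holds the expected state of
# the local entity followed by the expected states of its related entities.
# None in the antecedent matches any state (an assumption of this sketch).
Rule = Tuple[Tuple[Optional[str], ...], str]


def apply_transition(table: Sequence[Rule], local: str, related: Sequence[str]) -> str:
    """Return the new local state, or the unchanged state if no entry matches."""
    observed = (local, *related)
    for antecedent, consequent in table:
        if len(antecedent) == len(observed) and all(
            want is None or want == got for want, got in zip(antecedent, observed)
        ):
            return consequent
    return local


def confirm_state(table: Sequence[Rule], local: str, related: Sequence[str],
                  from_peer_framework: bool = False) -> Tuple[str, str]:
    """Return (new local state, state reported to the requester).

    Peer state detection and control frameworks see the state before the
    transition; every other requester sees the state after it.
    """
    new_state = apply_transition(table, local, related)
    reported = local if from_peer_framework else new_state
    return new_state, reported


# Hypothetical table for a two-copy mechanism with one related entity.
MIRROR_TABLE: List[Rule] = [
    ((BACKUP, STOPPED), ACTIVE),   # the backup takes over when its peer has stopped
    ((STOPPED, ACTIVE), SYNCING),  # a stopped copy resynchronizes from an active peer
]
```

With this table, a backup entity whose related entity is stopped becomes active, while a combination that no entry covers, such as two active entities, is left unchanged.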
When a failed node exists, the storage system automatically enters a degraded operation stage and records the modification log automatically. The system administrator learns through the system management module that a node has failed. After manual intervention, if the node has returned to normal, the data it held before the failure has not been lost, and the node has rejoined the system in its initial state, the administrator uses the system control function of the system management module to send a start-data-synchronization request to the state detection and control framework of the failed node. The request is forwarded to the data synchronization framework of that node, which starts all the relevant data synchronization client threads according to the high-availability mechanism type. The state detection and control framework then sends requests to the other related nodes to start their data synchronization server threads. Once started, a data synchronization server thread first analyzes the recorded modification log, reorganizes it into a synchronization list, and waits for data synchronization client threads to connect. After a data synchronization client thread starts, it requests synchronization items directly from the data synchronization server threads on the related nodes and processes the requested items one by one. When a data synchronization server thread finds that the number of items still to be synchronized in the current log is below a threshold, it asks the state detection and control framework to block the data service thread. After the request is confirmed, the data service thread stops serving, and within a short time the data synchronization client can quickly synchronize all remaining data. When the synchronization list in the data synchronization server thread is empty, the client and server sides of the data synchronization thread each notify their own state detection and control framework that synchronization is complete; the state detection and control framework adjusts the state to normal and stops the data synchronization threads. Once every high-availability mechanism on the failed node has completed the above operation, data synchronization of the failed node is finished.
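The server-side part of this recovery flow can be sketched in Python as follows. This is a single-process illustration under stated assumptions, not the patented implementation: the log entry format (a block identifier with its new contents), the function names, and the callbacks `apply_block` and `block_service` are hypothetical, and the network exchange between client and server threads is omitted.

```python
from collections import OrderedDict
from typing import Callable, Iterable, Tuple

LogEntry = Tuple[int, bytes]  # (block id, new contents): a hypothetical log format


def build_sync_list(log: Iterable[LogEntry]) -> OrderedDict:
    """Reorganize the modification log so each block appears once, latest write last."""
    sync = OrderedDict()
    for block_id, data in log:
        sync.pop(block_id, None)  # discard an older write to the same block
        sync[block_id] = data
    return sync


def run_sync(sync: OrderedDict,
             apply_block: Callable[[int, bytes], None],
             block_service: Callable[[], None],
             threshold: int) -> bool:
    """Drain the synchronization list; return True if the data service was blocked.

    Once fewer than `threshold` items remain, the data service is blocked a
    single time so that the tail of the list can be drained without new writes.
    """
    blocked = False
    while sync:
        if not blocked and len(sync) < threshold:
            block_service()  # stands in for the request to the state framework
            blocked = True
        block_id, data = sync.popitem(last=False)  # oldest remaining item first
        apply_block(block_id, data)
    return blocked
```

Blocking only near the end keeps the data service available for most of the synchronization, at the cost of a short pause while the last few items are transferred.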

Claims (10)

1. A multi-availability mechanism coexistence framework for a parallel storage system, characterized in that the framework consists of the following seven parts: a state detection and control framework, a data service framework, a metadata service framework, a data synchronization framework, a client framework, a system management framework, and high-availability mechanism modules; the state detection and control framework is responsible for detecting and controlling the states of all entities on its node; the data service framework is responsible for creating the concrete data service threads, dispatching requests to the data service threads, and completing the data redundancy and service takeover functions required by a specific high-availability mechanism; the metadata service framework calls different functions, depending on the high-availability mechanism of the logical data, to complete metadata operations; the data synchronization framework supports the coexistence of data synchronization threads of multiple high-availability mechanisms and completes data synchronization between mutually redundant data; the client framework provides a complete set of functions for user access to the parallel storage system, supports multiple high-availability mechanism modules, and calls the corresponding high-availability mechanism function according to the high-availability mechanism type of a request; the system management framework provides an interface that implements system configuration, system monitoring, and system control functions; the high-availability mechanism modules, as plug-ins, implement the functional interfaces of the other six parts.
The workflow of the whole framework is as follows:
a. When a user initiates a read or write access to logical data, a function in the client framework first sends a request to the metadata service framework and obtains the metadata of the logical data, which indicates the high-availability mechanism type used by that logical data;
b. The client framework then, according to the high-availability mechanism type of the logical data, calls the client-framework interface function implemented in the corresponding high-availability mechanism module, and that function sends an access request to the data service framework on a data node to complete the read or write operation;
c. The data service framework, according to the high-availability mechanism type of the logical data attached to the access request, dispatches the request to the corresponding data service thread, which processes the request and responds to the client framework, completing the response to the data request;
d. If the data service framework or a data service thread on a data node cannot be accessed, the client framework sends a state confirmation request to the state detection and control framework of that data node to confirm the current state of the data service thread; on receiving the state confirmation request, the state detection and control framework calls the corresponding state query function to confirm the current state of the queried entity and, at the same time, calls the relevant state query functions and communicates with the state detection and control frameworks on the nodes hosting the entity's related entities to obtain their current states; with all the states obtained, it looks up the state transition table in the configuration information of the high-availability mechanism; when the antecedent of an entry of the state transition table matches all current states, the state of the local entity is set to the state given in the consequent of that entry; if no entry matches, nothing is done; the entity state after the transition is then returned to the client;
e. When a data node fails, and the data service threads on the other data nodes that are related to the data service thread of the failed node are running normally, the data service frameworks of the related data nodes record a modification log of the redundant data related to the failed node's data;
f. After the failure, the system administrator learns through the system management framework that a data service thread has failed; after manual intervention, the administrator starts the data synchronization process through the system management framework; the data synchronization framework on the formerly failed node loads the synchronization function of the high-availability mechanism module and starts the data synchronization thread, which accesses the data synchronization frameworks of the other related nodes and synchronizes the failed node's data according to the modification log; when data synchronization is complete, the state detection and control framework on that data node is notified and adjusts the state of the data service thread.
2. The multi-availability mechanism coexistence framework for a parallel storage system according to claim 1, characterized in that: the parallel storage system consists of a plurality of nodes interconnected by a network; by function, the nodes are divided into four types: metadata nodes, data nodes, client nodes, and system management nodes; a metadata node stores the descriptive information of logical data and responds to access requests for that information; the file data itself is split into logical data blocks stored on a plurality of data nodes; a client node is the user of logical data; and the management node provides system configuration and management functions to the system administrator.
3. The multi-availability mechanism coexistence framework for a parallel storage system according to claim 1, characterized in that: a state detection and control framework resides on every data node and metadata node of the parallel storage system and is responsible for detecting and controlling the states of all entities on that node; if the state of an entity on another node is needed, the state detection and control framework communicates with the state detection and control framework on that node and obtains the required entity state through it; entities are all service programs related to high-availability mechanisms, including the data service threads and data synchronization threads of each high-availability mechanism; the parallel storage system needs to detect the running condition of these entities in real time and obtain the state of each entity; entity states differ with the high-availability mechanism type, but include at least two states indicating whether the entity is active or stopped.
4. The multi-availability mechanism coexistence framework for a parallel storage system according to claim 1, characterized in that: the data service framework, on startup, creates the data service threads and loads the concrete data service functions; then, according to the high-availability mechanism type of the logical data requested by the client, it dispatches access requests for that logical data to the corresponding data service thread, which responds to the request and returns the data the client needs; the data service thread also completes the data redundancy and service takeover functions required by the specific high-availability mechanism; the data service framework supports the coexistence of multiple high-availability mechanism modules and can load and unload them dynamically.
5. The multi-availability mechanism coexistence framework for a parallel storage system according to claim 1 or 3, characterized in that: every high-availability mechanism has one or more corresponding data service threads; to the state detection and control framework, each data service thread is an entity; entity states include an active state, a backup state, a synchronizing state, and a stopped state, each representing a different running condition; the data service thread and the data service framework provide an access interface to entity state that only the state detection and control framework on the same node is allowed to access; an entity can also change its own state autonomously.
6. The multi-availability mechanism coexistence framework for a parallel storage system according to claim 1, characterized in that: the metadata service framework supports multiple high-availability mechanisms and calls different interface functions, depending on the high-availability mechanism of the logical data, to complete metadata operations; the metadata operations are implemented in the respective high-availability mechanism modules, which are loaded and unloaded dynamically.
7. The multi-availability mechanism coexistence framework for a parallel storage system according to claim 1, characterized in that: the data synchronization framework supports the coexistence of data synchronization threads of multiple high-availability mechanisms and provides fault recovery under a high-availability mechanism and data synchronization between mutually redundant data; data synchronization threads are of two types: data synchronization server threads, which supply the data to be synchronized, and data synchronization client threads, which request it; the data synchronization framework dispatches each data synchronization request to the corresponding data synchronization service thread according to the high-availability mechanism type of the synchronization and the type of the requesting thread.
8. The multi-availability mechanism coexistence framework for a parallel storage system according to claim 1, characterized in that: the client framework provides a complete set of function interfaces for user access to the parallel storage system, and users access data in the parallel storage system by calling these interfaces; the client framework also supports the coexistence of multiple high-availability mechanisms, provides loading and unloading of high-availability mechanism modules, and calls the corresponding function, according to the high-availability mechanism type of the logical data being accessed, to complete data access; each high-availability mechanism provides a set of client access functions corresponding to the function interfaces provided by the client framework; with the support of these functions, the client completes the data redundancy, failover, and failure recovery operations specific to each high-availability mechanism.
9. The multi-availability mechanism coexistence framework for a parallel storage system according to claim 1, characterized in that: the system management framework provides a user-friendly interface for managing the parallel storage system, which implements system configuration, system monitoring, and system control functions; system configuration covers the configuration of the parallel storage system itself, including system scale, node types, and node information, as well as the configuration of the high-availability mechanism modules; system monitoring covers the overall information of the system and the running information of each node, together with the entity information and entity states corresponding to all high-availability mechanism modules, giving the system administrator a complete view of the system; system control provides real-time control of all nodes in the system and control of the states of the entities of all high-availability mechanisms; the administrator uses these control functions to manually start and stop data services and data synchronization operations in the system.
10. The multi-availability mechanism coexistence framework for a parallel storage system according to claim 1, characterized in that: the high-availability mechanism modules are implemented as different plug-ins; each plug-in implements the functional interfaces of the other six frameworks; these frameworks, according to the high-availability mechanism type used by the logical data, select and call the interface functions implemented in the corresponding high-availability mechanism module to complete data access, fault recovery, and system management.
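As an illustration only, the plug-in arrangement recited in the claims, in which a framework selects the handler of the high-availability mechanism named in each request, might be sketched in Python as below. The class name, the method names, the request layout with an `ha_type` key, and the `"mirror"` mechanism are all hypothetical; the claims specify only that modules are loaded and unloaded dynamically and that requests are dispatched by mechanism type.

```python
from typing import Callable, Dict

Handler = Callable[[dict], str]


class DataServiceFramework:
    """Toy registry mapping a high-availability mechanism type to its handler."""

    def __init__(self) -> None:
        self._handlers: Dict[str, Handler] = {}

    def load_module(self, ha_type: str, handler: Handler) -> None:
        """Register (or replace) the plug-in for one mechanism type."""
        self._handlers[ha_type] = handler

    def unload_module(self, ha_type: str) -> None:
        """Remove a plug-in; unloading an unknown type is a no-op."""
        self._handlers.pop(ha_type, None)

    def dispatch(self, request: dict) -> str:
        """Route a request to the plug-in named by the metadata it carries."""
        ha_type = request["ha_type"]
        handler = self._handlers.get(ha_type)
        if handler is None:
            raise LookupError(f"no high-availability module loaded for {ha_type!r}")
        return handler(request)


# Usage with a hypothetical "mirror" mechanism.
framework = DataServiceFramework()
framework.load_module("mirror", lambda req: "mirror:" + req["op"])
result = framework.dispatch({"ha_type": "mirror", "op": "read"})
```

Keeping the registry keyed by mechanism type is what lets several mechanisms coexist: each request carries its own type, so no global switch between mechanisms is needed.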
CN200710018108A 2007-06-22 2007-06-22 A method for constructing multi-availability mechanism coexistence framework of concurrent storage system Expired - Fee Related CN101079896B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN200710018108A CN101079896B (en) 2007-06-22 2007-06-22 A method for constructing multi-availability mechanism coexistence framework of concurrent storage system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN200710018108A CN101079896B (en) 2007-06-22 2007-06-22 A method for constructing multi-availability mechanism coexistence framework of concurrent storage system

Publications (2)

Publication Number Publication Date
CN101079896A true CN101079896A (en) 2007-11-28
CN101079896B CN101079896B (en) 2010-05-19

Family

ID=38907123

Family Applications (1)

Application Number Title Priority Date Filing Date
CN200710018108A Expired - Fee Related CN101079896B (en) 2007-06-22 2007-06-22 A method for constructing multi-availability mechanism coexistence framework of concurrent storage system

Country Status (1)

Country Link
CN (1) CN101079896B (en)


Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7149918B2 (en) * 2003-03-19 2006-12-12 Lucent Technologies Inc. Method and apparatus for high availability distributed processing across independent networked computer fault groups
US6996502B2 (en) * 2004-01-20 2006-02-07 International Business Machines Corporation Remote enterprise management of high availability systems

Cited By (38)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101470735B (en) * 2007-12-27 2011-05-04 财团法人工业技术研究院 Virtual file management system and its system configuration establishing and file access method
CN102291449A (en) * 2011-08-08 2011-12-21 浪潮电子信息产业股份有限公司 Method for testing and adjusting cluster storage system performance based on synchronous strategy
CN102291449B (en) * 2011-08-08 2014-04-02 浪潮电子信息产业股份有限公司 Method for testing and adjusting cluster storage system performance based on synchronous strategy
CN108388632A (en) * 2011-11-15 2018-08-10 起元科技有限公司 Data divide group, segmentation and parallelization
CN108388632B (en) * 2011-11-15 2021-11-19 起元科技有限公司 Data clustering, segmentation, and parallelization
CN103235753A (en) * 2013-04-09 2013-08-07 国家电网公司 Method and device for monitoring information server
CN104123300A (en) * 2013-04-26 2014-10-29 上海云人信息科技有限公司 Data distributed storage system and method
CN104123300B (en) * 2013-04-26 2017-10-13 上海云人信息科技有限公司 Data distribution formula storage system and method
US10841360B2 (en) 2014-12-08 2020-11-17 Umbra Technologies Ltd. System and method for content retrieval from remote network regions
US11503105B2 (en) 2014-12-08 2022-11-15 Umbra Technologies Ltd. System and method for content retrieval from remote network regions
US11711346B2 (en) 2015-01-06 2023-07-25 Umbra Technologies Ltd. System and method for neutral application programming interface
US11881964B2 (en) 2015-01-28 2024-01-23 Umbra Technologies Ltd. System and method for a global virtual network
US10630505B2 (en) 2015-01-28 2020-04-21 Umbra Technologies Ltd. System and method for a global virtual network
US11240064B2 (en) 2015-01-28 2022-02-01 Umbra Technologies Ltd. System and method for a global virtual network
US11750419B2 (en) 2015-04-07 2023-09-05 Umbra Technologies Ltd. Systems and methods for providing a global virtual network (GVN)
US11418366B2 (en) 2015-04-07 2022-08-16 Umbra Technologies Ltd. Systems and methods for providing a global virtual network (GVN)
US10574482B2 (en) 2015-04-07 2020-02-25 Umbra Technologies Ltd. Multi-perimeter firewall in the cloud
US10659256B2 (en) 2015-04-07 2020-05-19 Umbra Technologies Ltd. System and method for virtual interfaces and advanced smart routing in a global virtual network
US10756929B2 (en) 2015-04-07 2020-08-25 Umbra Technologies Ltd. Systems and methods for providing a global virtual network (GVN)
US11799687B2 (en) 2015-04-07 2023-10-24 Umbra Technologies Ltd. System and method for virtual interfaces and advanced smart routing in a global virtual network
US11271778B2 (en) 2015-04-07 2022-03-08 Umbra Technologies Ltd. Multi-perimeter firewall in the cloud
CN105550094B (en) * 2015-12-10 2018-02-06 国网四川省电力公司信息通信公司 A kind of high-availability system state automatic monitoring method
CN105550094A (en) * 2015-12-10 2016-05-04 国网四川省电力公司信息通信公司 Automatic state monitoring method of high-availability system
US11360945B2 (en) 2015-12-11 2022-06-14 Umbra Technologies Ltd. System and method for information slingshot over a network tapestry and granularity of a tick
US11681665B2 (en) 2015-12-11 2023-06-20 Umbra Technologies Ltd. System and method for information slingshot over a network tapestry and granularity of a tick
CN107710165A (en) * 2015-12-15 2018-02-16 华为技术有限公司 Method and apparatus for the request of memory node synchronous service
CN107710165B (en) * 2015-12-15 2020-01-03 华为技术有限公司 Method and device for storage node synchronization service request
US11630811B2 (en) 2016-04-26 2023-04-18 Umbra Technologies Ltd. Network Slinghop via tapestry slingshot
US10922286B2 (en) 2016-04-26 2021-02-16 UMBRA Technologies Limited Network Slinghop via tapestry slingshot
US11146632B2 (en) 2016-04-26 2021-10-12 Umbra Technologies Ltd. Data beacon pulser(s) powered by information slingshot
US11743332B2 (en) 2016-04-26 2023-08-29 Umbra Technologies Ltd. Systems and methods for routing data to a parallel file system
WO2017187263A1 (en) * 2016-04-26 2017-11-02 Umbra Technologies Ltd. Sling-routing logic and load balancing
US11789910B2 (en) 2016-04-26 2023-10-17 Umbra Technologies Ltd. Data beacon pulser(s) powered by information slingshot
CN106357646B (en) * 2016-09-21 2019-12-31 苏州浪潮智能科技有限公司 Agent control system for storage management software
CN106357646A (en) * 2016-09-21 2017-01-25 郑州云海信息技术有限公司 Agent control system for storing management software
CN109120691B (en) * 2018-08-15 2021-05-14 恒生电子股份有限公司 Method, system, device and computer readable medium for detecting state of service system
CN113395358A (en) * 2021-08-16 2021-09-14 贝壳找房(北京)科技有限公司 Network request execution method and execution system
CN113395358B (en) * 2021-08-16 2021-11-05 贝壳找房(北京)科技有限公司 Network request execution method and execution system

Also Published As

Publication number Publication date
CN101079896B (en) 2010-05-19

Similar Documents

Publication Publication Date Title
CN101079896B (en) A method for constructing multi-availability mechanism coexistence framework of concurrent storage system
US7340637B2 (en) Server duplexing method and duplexed server system
EP2104041B1 (en) System and method for failover
US6134673A (en) Method for clustering software applications
US7565572B2 (en) Method for rolling back from snapshot with log
KR100725066B1 (en) A system server for data communication with multiple clients and a data processing method
US8856091B2 (en) Method and apparatus for sequencing transactions globally in distributed database cluster
US20010056554A1 (en) System for clustering software applications
US20070061379A1 (en) Method and apparatus for sequencing transactions globally in a distributed database cluster
EP2281240A1 (en) Maintaining data integrity in data servers across data centers
CN101751415B (en) Metadata service system, metadata synchronized method and writing server updating method
US8527454B2 (en) Data replication using a shared resource
CN102411639A (en) Multi-copy storage management method and system of metadata
US20050283636A1 (en) System and method for failure recovery in a cluster network
CN110807064A (en) Data recovery device in RAC distributed database cluster system
CN103294167A (en) Data behavior based low-energy consumption cluster storage replication device and method
CN116055563A (en) Task scheduling method, system, electronic equipment and medium based on Raft protocol
CN115878384A (en) Distributed cluster based on backup disaster recovery system and construction method
CN106294031B (en) A kind of business management method and storage control
CN114363350A (en) Service management system and method
CN107404511A (en) The replacement method and equipment of server in cluster
CN112702206A (en) Main and standby cluster deployment method and system
CN117201284A (en) Gateway management method, system, device and medium
CN115632947A (en) Configuration issuing method and device
CN108241701A (en) A kind of method for improving MySQL high availability

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
C17 Cessation of patent right
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20100519

Termination date: 20130622