CN103199972B - Dual-machine hot-standby switching method and hot-standby system based on SOA and an RS485 bus - Google Patents


Publication number
CN103199972B
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN201310097042.7A
Other languages
Chinese (zh)
Other versions
CN103199972A (en)
Inventor
余学波
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
CHENGDU REALCODE ELECTRIC Co Ltd
Original Assignee
CHENGDU REALCODE ELECTRIC Co Ltd
Application filed by CHENGDU REALCODE ELECTRIC Co Ltd
Priority to CN201310097042.7A
Publication of CN103199972A
Application granted
Publication of CN103199972B
Status: Expired - Fee Related


Landscapes

  • Hardware Redundancy (AREA)

Abstract

The invention discloses a dual-machine hot-standby switching method. A master control system handles host/slave switching and distributed data storage, and coordinates the host and slave of a power station monitoring system. The host and the slave each use information sent by the master control system and/or a heartbeat to judge whether the other is abnormal and, if so, take over the other's tasks. Both the host and the slave start all task threads, but the host mainly handles data query tasks, while the slave, besides supervising the host, receives monitoring data in real time and filters it, accepting only data that the lower-level intelligent monitoring devices upload on their own initiative. The invention also discloses a system applying the method, comprising the master control system, the host, and the slave. Because the host and slave run simultaneously with all task processes already started, and the data in their databases is continuously backed up in sync through the master control system, a switchover does not have to wait for a database to start or risk losing data, which saves time during switching.

Description

Dual-machine hot-standby switching method and hot-standby system based on SOA and an RS485 bus
Technical field
The present invention relates to a dual-machine hot-standby switching method and a hot-standby system implemented on the basis of SOA and an RS485 bus.
Background art
With the development of industrial control technology, online monitoring is used ever more widely. Monitoring software must watch field data at all times, so it must remain highly reliable: once the software fails, real-time data is lost, with serious consequences. In such systems, however, the soundness of each part's design, hardware reliability, and similar factors mean that absolute stability cannot be guaranteed. Besides optimizing the program itself, backing up the system is therefore the usual choice. When one system fails, operation switches to the other, so an effective dual-machine switching scheme is needed to keep the whole system running reliably and stably.
The standard scheme for traditional dual-machine hot standby is based on shared storage. Two servers use a shared storage device (a disk array cabinet or a storage area network) to keep data continuous after a switchover. A heartbeat link based on TCP/IP sockets runs between the active and standby machines, which confirm through continuous communication that the peer's system is running normally. Once the host fails, the standby machine immediately starts the relevant database service or user application. This model means that traditional dual-machine hot standby cannot achieve switching times on the order of seconds. The reasons are as follows:
1. The active and standby machines share a storage subsystem, but sharing does not mean that both can access it at the same time. While the host is working, it monopolizes I/O reads and writes to the storage subsystem, and the standby machine cannot access the disk array. Only at switchover can the standby machine obtain I/O read/write control of the storage subsystem, and this handover takes time, generally about 5 to 20 seconds. If the system cache is large, the host needs longer to write back its buffers, and the switching time grows further.
2. When the standby machine takes over from the host, it must start its user application or database program. The time needed to start these services depends entirely on the machine's performance and on how quickly the applications start.
3. With a heartbeat link, the standby machine does not start the relevant services the instant the host goes down. After the host goes down, the standby machine must check repeatedly before it can conclude that the host has really stopped working or crashed. The safety threshold for this detection time is generally set to about 12 seconds, and some data is inevitably lost in the meantime.
In summary, a common dual-machine hot-standby product takes about 1 to 2 minutes to complete one switchover.
SOA
Service-oriented architecture (SOA) is an architectural model in which loosely coupled, coarse-grained application components are deployed, combined, and used over a network as required. All functions or services are defined in a descriptive language, and their interfaces are defined independently of the hardware platform, operating system, and programming language on which the service is implemented.
WCF
WCF (Windows Communication Foundation) is a unified framework for building and running service-oriented applications in managed code. It lets developers build secure, reliable, transacted solutions that work across platforms and interoperate with existing systems. It also has the following advantages:
(1) High productivity:
A. Unifies the existing distributed computing technologies
B. Attribute-based development
C. Seamless integration with VS2005 and later versions
(2) Good interoperability:
A. Broad support for the WS-* family of specifications
B. Compatibility with existing Microsoft distributed computing technologies
(3) Service-oriented development:
A. Writing loosely coupled services becomes easier
B. Service behaviors and attributes can be specified through configuration
RS485 bus:
RS485 is a digital communication bus standard that supports multiple nodes, has high receiver sensitivity, and is suitable for long-distance communication.
Summary of the invention
To solve the technical problems in the prior art, the invention provides a stable and efficient dual-machine hot-standby switching method and hot-standby system.
To achieve the above object, the technical solution adopted by the invention is a dual-machine hot-standby switching method based on SOA and an RS485 bus, characterized in that:
A master control system handles host/slave switching and distributed data storage, and coordinates the host and slave of the power station monitoring system;
The host and the slave each use information sent by the master control system and/or a heartbeat to judge whether the other is abnormal and, if so, take over the other's tasks;
Both the host and the slave start all task threads, but:
The host handles data query tasks. After receiving and processing such data, the host backs it up to storage and at the same time sends it to the master control system in real time; the master control system backs the data up to storage and then mirrors it to the slave in batches;
The slave, while supervising the host, also receives monitoring data in real time and filters it, accepting only data that the lower-level intelligent monitoring devices upload on their own initiative. Likewise, the slave processes and backs up this data and at the same time sends it to the master control system for storage; the master control system backs the data up to storage and then mirrors it to the host in batches. The slave also backs up the host's data in real time, and users can likewise monitor data on it through the configuration interface.
Further, in the method, the host monitors the slave through a heartbeat. The host periodically sends heartbeat packets to the slave, and the slave extracts the packet information and replies accordingly. If the expected heartbeat exchange does not take place or its content is incorrect, the host is judged abnormal and the slave takes over the host's tasks. If the slave is abnormal, the host receives no reply, so the slave is judged abnormal and the host takes over the slave's tasks;
The master control system is connected to the host and to the slave over dual-channel TCP connections. If either channel is disconnected, the corresponding host or slave is judged abnormal, and the other machine takes over its tasks.
Further, in the method, both the host and the slave start a write thread that writes data to the lower-level intelligent monitoring devices. The write thread is a loop, comprising the following steps:
A1. Begin;
A2. Start the write thread;
A3. Determine whether this machine is the host; if yes, go to step A4; if not, go to step A5;
A4. Issue the data frame, then go to step A6;
A5. The data frame is intercepted and not issued; return to step A3;
A6. Message processing, then go to step A7;
A7. Data processing, then return to step A3.
Further, in the method, both the host and the slave start a read thread that, under normal conditions, reads data from the lower-level intelligent monitoring devices. The read thread is a loop, comprising the following steps:
B1. Begin;
B2. Start the read thread;
B3. Determine whether this machine is the host; if yes, go to step B4; otherwise go to step B6;
B4. Receive data messages, then go to step B5;
B5. Message processing, then go to step B8;
B6. Receive data frames, then go to step B7;
B7. Message processing, then go to step B8;
B8. Data processing, then return to step B3.
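The role check in step B3 can be illustrated with a minimal Python sketch. It assumes, following the division of tasks stated earlier, that the host accepts replies to query or command frames and the slave accepts only data uploaded on a device's own initiative. The names `Kind`, `accepts`, and `read_cycle` are hypothetical and do not come from the patent, and the payload strings are placeholders.

```python
from enum import Enum


class Kind(Enum):
    POLLED = "polled"            # reply to a query/control command frame
    SPONTANEOUS = "spontaneous"  # uploaded by the device on its own initiative


def accepts(is_master: bool, kind: Kind) -> bool:
    """B3: the host takes polled replies (B4); the slave takes spontaneous data (B6)."""
    return kind is (Kind.POLLED if is_master else Kind.SPONTANEOUS)


def read_cycle(is_master: bool, incoming: list) -> list:
    """One pass of the read loop: keep only the payloads this role accepts."""
    return [payload for kind, payload in incoming if accepts(is_master, kind)]


frames = [(Kind.POLLED, "telemetry"), (Kind.SPONTANEOUS, "SOE")]
print(read_cycle(True, frames))   # ['telemetry']
print(read_cycle(False, frames))  # ['SOE']
```

Both machines see the same incoming traffic in this sketch; only the filter differs, which is why a role change needs no thread restart.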
Further, in the method, after starting the read thread, the host and the slave each also start an abnormality-judgment thread, which judges whether the other machine is abnormal and responds accordingly. The host or the slave judges whether the other is abnormal from information sent by the master control system or from the heartbeat. If the other machine is normal, monitoring continues. If it is abnormal, the surviving machine immediately starts receiving the data messages or data frames that it did not previously receive, and processes them as well.
Further, in the method, after a switchover the host or slave must register with the master control system again and is identified by its registration information. The master control system then checks the database to determine whether any data has been lost.
Further, in the method, the abnormality-judgment thread comprises the following steps:
C1. Begin;
C2. Start the read thread;
C3. Determine whether this machine is the host; if yes, go to step C4; otherwise go to step C8;
C4. Judge whether the slave is abnormal; if yes, go to step C5; otherwise return to step C3;
C5. Receive the data frames that the slave receives under normal conditions, then go to step C6;
C6. Message processing, then go to step C7;
C7. Data processing, then return to step C4;
C8. Judge whether the host is abnormal; if yes, go to step C9; otherwise return to step C3;
C9. Receive the data messages that the host receives under normal conditions, then go to step C10;
C10. Message processing, then go to step C11;
C11. Data processing, then return to step C8.
Further, in the method, when the host and the slave are both normal, a user operation on the slave that requires issuing a command frame causes the host and the slave to switch roles passively through the heartbeat.
Further, in the method, the passive switching of the host and the slave through the heartbeat and/or the master control system comprises the following steps:
D1. The original slave sends a message to the original host with content AAAAAA ...; its own current state is IsMaster=false, IsStandby=true, IsSwitch=false, meaning that it is still the slave, has not yet switched, and is sending a switch request to the host;
D2. After receiving the message, the original host adjusts its state and replies to the slave with content BBBBBB ...; after the adjustment its state is IsMaster=false, IsStandby=false, IsSwitch=true, meaning that it has now switched to being the new slave;
D3. After receiving the message, the original slave adjusts its state and sends a message to the new slave with content CCCCCC ...; after the adjustment its state is IsMaster=true, IsStandby=false, IsSwitch=false, meaning that it has switched to being the host and no longer sends switch requests to the original host;
D4. After receiving the message, the new slave adjusts its state to IsMaster=false, IsStandby=false, IsSwitch=false, meaning that it is the slave and no longer sends switch-success messages to the new host.
The invention also provides a dual-machine hot-standby system based on SOA and an RS485 bus, characterized in that:
The architecture of the system is based on the SOA model and comprises
a master control system acting as the server and at least one substation monitoring system acting as a client,
with a data connection between the master control system and the substation monitoring system;
The master control system comprises at least one storage server and at least one data store;
The substation monitoring system comprises a host, a slave, and lower-level intelligent monitoring devices. The host and the slave are connected to the lower-level monitoring devices by a bus; the host and the slave each have a data store; the host and the slave each have a data connection to the storage server; and a heartbeat link is provided between the host and the slave;
The storage server is a WCF server module, and the host and the slave are WCF clients;
The heartbeat link is an RS485 bus.
In summary, the invention has the following advantages:
1. A third-party system (the master control system) is introduced between the host and the slave to coordinate switching.
2. The master control system communicates with the host and the slave through WCF, with the master control system as the server and the host and slave as clients. The server and each client are connected over a dual-channel TCP connection; when one channel drops, the master control system can judge that the host or slave is abnormal and notify the other side to switch. In addition, the host and the slave communicate over an RS485 bus, which serves as the heartbeat through which they monitor each other. Combining the TCP channels with the RS485 bus allows abnormalities to be judged more quickly and accurately.
3. Following the idea of cloud storage, data is stored in a distributed way and no shared disk array is used. The host and the slave access only their local databases, which avoids both the restriction that the two machines cannot access a disk subsystem at the same time and the delay for the standby machine to obtain exclusive write control at switchover, so switching is faster.
4. While the host is running, the standby machine shares its load and stays in a running state, so no application or database has to be restarted; at switchover it takes over the host's tasks directly, which saves time.
5. Taken together, points 2, 3 and 4 bring the switching time down to the millisecond level (50 ms to 100 ms), far better than traditional dual-machine switching.
6. Network communication (WCF) and serial communication (RS485) are combined to implement switching at the same time, which is safer: even if one link fails, the other can still perform the switch. This avoids the risk in traditional designs that, once the single heartbeat fails, two hosts or two slaves exist at the same time.
7. The host and the slave can switch freely and can switch automatically in response to user operations. Users can monitor data on both machines and perform the same operations on each, so they do not notice any difference between host and slave.
Description of the drawings
Fig. 1 is a flow chart of traditional dual-machine switching.
Fig. 2 is a network diagram of dual-machine switching.
Fig. 3 shows the relationship between the master control system and the host and slave at each site.
Fig. 4 is a schematic diagram of the host and slave of a site registering with the master control system.
Fig. 5 shows the monitoring data and tasks of the master control system, host, and slave, together with the distributed data storage.
Fig. 6 is the write flow chart for communication between the host or slave and the lower-level intelligent monitoring devices.
Fig. 7 is the read flow chart for communication between the host or slave and the lower-level intelligent monitoring devices.
Fig. 8 is the data-reading flow chart when the host or the slave is abnormal.
Fig. 9 shows part of the interface of the master control system (WCF server).
Fig. 10 is a logic diagram of host/slave switching when both machines are normal.
In Fig. 9, bold italic text marks the flag bits that change. IsMaster: host flag (true means host, false means slave). IsStandby: switch-request flag (true means the slave has sent a switch-request command frame to the host; this state does not change until the host's reply is received, and the slave keeps sending the command frame; once a reply is received, the flag becomes false). IsSwitch: switch-success flag (true means the host has switched successfully and has notified the slave by command; the slave switches to host on receiving the command and replies; when the original host receives that reply, the flag becomes false).
Detailed description of the embodiments
Specific embodiments of the invention are described in detail below with reference to the drawings:
Fig. 1 shows the traditional dual-machine hot-standby switching flow; see the background art for details.
Fig. 2 shows the dual-machine switching configuration for a single substation. The system architecture is based on the SOA (service-oriented architecture) model. As the figure shows, the system does not rely on shared storage: all data is stored in a distributed way, and the host, the slave, and the master control system all take on storage tasks. The host and the slave both communicate with the lower-level intelligent monitoring devices over RS485 or Ethernet (which one depends on the device's communication port). The host and the slave communicate with the master control system through WCF, with the master control system as the WCF server and the host and slave as WCF clients. In addition, the host and the slave are connected to each other by an RS485 bus (the heartbeat). The lower-level intelligent monitoring devices in the figure are generator protection equipment, electric energy meters, GPS units, DC cabinets, and the like. The master control system consists of several program modules, including the WCF server and a data server, and handles host/slave switching and the management of host and slave data storage. The host, the slave, and the lower-level intelligent monitoring devices together form the substation monitoring system.
Fig. 3 shows the relationship between the master control system and the host and slave of each substation. The master control system communicates with the host and the slave through WCF, and the server and each client are connected over a dual-channel TCP connection. As long as a channel stays connected, the master control system and the host or slave can exchange data through the service, so both information sent by devices in real time and control commands from data access layers such as a web server or mobile phone (forwarded to the host and slave by the master control system) are transmitted efficiently in real time. Conversely, if either channel drops, the corresponding host or slave can be judged abnormal. In addition, the host and the slave communicate over an RS485 bus, which serves as the heartbeat between them: the host sends a heartbeat packet to the slave at a fixed interval (5 ms), and the slave extracts the packet information and replies accordingly. Using both mechanisms at once to judge a host or slave fault is therefore safer and more efficient. For example, if the host fails, the TCP channel between the master control system and the host (TCP channel 1 in Fig. 3) drops immediately, and the master control system immediately notifies the slave, over the channel between itself and the slave (TCP channel 2 in Fig. 3), that the host has failed. The disconnection of channel 1 and the notification over channel 2 are handled entirely in software, with no hardware interface access constraints, and should take less than 20 ms in total. At the same time, if the host is abnormal, the heartbeat packets between host and slave also stop. The host sends heartbeat packets to the slave every 5 ms; if the slave receives none within 20 ms, it judges the host abnormal (likewise, if the slave is abnormal and the host receives no reply within 20 ms, the host judges the slave abnormal). The slave thus has two sources of information, the heartbeat and the master control system, for judging that the host has failed, and takes over the host's tasks immediately. This is safer and more efficient than switching based on a traditional single heartbeat link.
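The two fault signals described above can be sketched in Python as follows. The 5 ms interval and 20 ms timeout come from the text; combining the signals with a logical OR is an assumption (the text says "and/or"), and the class and method names are hypothetical. Time is passed in explicitly so that the logic can be followed without real timers.

```python
from dataclasses import dataclass

HEARTBEAT_INTERVAL_S = 0.005  # host sends a heartbeat packet every 5 ms (from the text)
HEARTBEAT_TIMEOUT_S = 0.020   # no heartbeat for 20 ms means "abnormal" (from the text)


@dataclass
class PeerMonitor:
    """Tracks the two fault signals one machine has about its peer."""
    last_heartbeat: float                # time the last valid heartbeat arrived
    control_reports_fault: bool = False  # set when the master control system reports a dropped channel

    def on_heartbeat(self, now: float) -> None:
        self.last_heartbeat = now

    def on_control_notice(self) -> None:
        self.control_reports_fault = True

    def peer_abnormal(self, now: float) -> bool:
        # Assumption: either signal alone is enough to declare the peer abnormal.
        heartbeat_lost = (now - self.last_heartbeat) > HEARTBEAT_TIMEOUT_S
        return heartbeat_lost or self.control_reports_fault


monitor = PeerMonitor(last_heartbeat=0.0)
print(monitor.peer_abnormal(now=0.010))  # False: only 10 ms since the last heartbeat
print(monitor.peer_abnormal(now=0.025))  # True: more than 20 ms of silence
```

A real deployment would feed `on_heartbeat` from the RS485 receive path and `on_control_notice` from the WCF callback; neither is shown here.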
As described above, the master control system handles host/slave switching and distributed data storage and coordinates the host and slave at each site. An electric power monitoring system may contain several power stations and therefore several hosts and slaves. How does the master control system manage and distinguish them? As shown in Fig. 4, the host and slave register with the master control system when they start. The master control system broadcasts a command on the network; on receiving the broadcast, the host and slave submit their registration information, which includes the site they belong to, their host/slave state, and so on. The master control system has a dedicated module that manages and processes registration information.
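One possible shape for such a registration module is sketched below in Python. The text states only that a registration carries the site and the host/slave state; the field names, the `node_id` key, and the `peer_of` lookup are illustrative assumptions.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Registration:
    site: str        # site the machine belongs to (from the text)
    node_id: str     # hypothetical unique identifier of the machine
    is_master: bool  # host/slave state (from the text)


@dataclass
class Registry:
    """Groups registrations by site so that each host/slave pair can be told apart."""
    by_site: dict = field(default_factory=dict)

    def register(self, reg: Registration) -> None:
        # Re-registering after a switchover simply overwrites the old entry.
        self.by_site.setdefault(reg.site, {})[reg.node_id] = reg

    def peer_of(self, site: str, node_id: str) -> Optional[Registration]:
        """Return the other machine registered at the same site, if any."""
        for other_id, reg in self.by_site.get(site, {}).items():
            if other_id != node_id:
                return reg
        return None


registry = Registry()
registry.register(Registration("station-1", "pc-a", is_master=True))
registry.register(Registration("station-1", "pc-b", is_master=False))
print(registry.peer_of("station-1", "pc-a").node_id)  # pc-b
```

With such a lookup, the master control system can find which machine to notify when one site's channel drops.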
Fig. 5 shows the monitoring data and tasks of the master control system, host, and slave, together with the distributed data storage.
In an electric power system, the data uploaded by lower-level intelligent monitoring devices falls into two kinds by transmission mode: passively uploaded and actively uploaded. For passive upload, the monitoring system issues a query or control command frame and the lower-level device replies with data according to that frame; this data is generally large in volume and needs millisecond-level refresh. Actively uploaded data is smaller in volume and arrives at irregular times, only when a fault or abnormality occurs, which is relatively rare. The dual-machine hot standby designed here therefore differs from the traditional approach, as follows:
The host mainly handles data queries (telemetry, remote signaling, accumulated quantities, and so on) and the issuing of control command frames (remote control, fault recording, and so on). After receiving and processing such data, the host backs it up to storage and at the same time sends it in real time, through its WCF client, to the WCF server module (the storage server). The WCF server module stores a further backup and then mirrors the data to the slave in batches.
The slave appears to be on standby but is actually sharing the host's load. While supervising the host, it also receives monitoring data in real time and filters it, accepting only data that the lower-level intelligent monitoring devices upload actively (such as SOE records and accident messages). Likewise, the slave processes and backs up this data and at the same time sends it to the WCF server module for storage; the WCF server module stores a further backup and then mirrors the data to the host in batches. The slave also backs up the host's data in real time, and users can monitor data on it through the configuration interface, so it is by no means idle.
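The store-and-mirror path described in the two paragraphs above can be sketched in Python. This is a simplified in-memory model: the real system uses WCF calls and databases, and mirrors in batches, whereas the sketch mirrors each record immediately. All class and method names are hypothetical.

```python
class Node:
    """A host or slave with its own local data store."""

    def __init__(self, name):
        self.name = name
        self.local_db = []

    def process(self, record, control):
        self.local_db.append(record)       # back up locally
        control.submit(self.name, record)  # send to the master control system in real time


class ControlSystem:
    """Stands in for the WCF server module (storage server)."""

    def __init__(self):
        self.db = []
        self.nodes = {}

    def attach(self, node):
        self.nodes[node.name] = node

    def submit(self, sender, record):
        self.db.append(record)  # the master control system keeps its own copy
        for name, node in self.nodes.items():
            if name != sender:  # mirror to the other machine only
                node.local_db.append(record)


control = ControlSystem()
host, slave = Node("host"), Node("slave")
control.attach(host)
control.attach(slave)
host.process("telemetry", control)  # polled data handled by the host
slave.process("SOE", control)       # spontaneous data handled by the slave
print(slave.local_db)  # ['telemetry', 'SOE']
```

After both calls, all three stores hold both records, which is what lets either machine take over without waiting for a shared disk.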
As the figure shows, the master control system, the host, and the slave all store data, and the master control system stores all of the host's and slave's data. As described above, both the host and the slave are running, but each actually sends only the data related to its own tasks. At switchover there is therefore no need to worry about access rights to a shared storage device, database startup, or similar issues, which saves time. Details follow.
Fig. 6 shows the flow by which the host and slave write data to the lower-level intelligent monitoring devices. The write thread is a loop that writes data to the devices every 10 ms. As the figure shows, both the host and the slave start a write thread. The difference is that the host processes the data (messages) to be written and issues them to the corresponding lower-level device, whereas data that the slave would write is intercepted: it is not issued to the device, and the thread simply loops back, so the lower-level device never receives data written by the slave. If the host becomes abnormal and the slave switches to host (through the process described for Fig. 3), the interception is lifted immediately. Because the thread is already running, no time is lost starting it; the change takes only a flag bit and completes within 5 ms.
The write thread comprises the following steps:
A1. Begin;
A2. Start the write thread;
A3. Determine whether this machine is the host; if yes, go to step A4; if not, go to step A5;
A4. Issue the data frame, then go to step A6;
A5. The data frame is intercepted and not issued; return to step A3;
A6. Message processing, then go to step A7;
A7. Data processing, then return to step A3.
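Steps A3 to A5 amount to gating the send on a single flag, which can be sketched in Python as below. The `send` callback stands in for the RS485 or Ethernet write and is an assumption, as are the class and method names; whether an intercepted frame is discarded or retried is not stated in the text, and the sketch discards it. A real thread would call `step` every 10 ms.

```python
import queue


class WriteLoop:
    """One pass of the write loop per call to step()."""

    def __init__(self, is_master, send):
        self.is_master = is_master   # the only state that changes at switchover
        self.send = send             # hypothetical device-write callback
        self.outbox = queue.Queue()  # frames waiting to be issued

    def step(self):
        """Return True if a frame was issued to the device."""
        try:
            frame = self.outbox.get_nowait()
        except queue.Empty:
            return False
        if not self.is_master:  # A5: intercepted, nothing reaches the device
            return False
        self.send(frame)        # A4: issue the data frame
        return True


sent = []
loop = WriteLoop(is_master=False, send=sent.append)
loop.outbox.put(b"query-1")
loop.step()            # slave: frame is intercepted
loop.is_master = True  # switchover: flip the flag, no thread restart
loop.outbox.put(b"query-2")
loop.step()
print(sent)  # [b'query-2']
```

Because the loop is already running on both machines, promotion is only the assignment to `is_master`, which matches the text's point that the change costs a flag bit.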
Fig. 7 shows the flow by which the host and slave read data from the lower-level devices. The read thread is likewise a loop and reads data every 10 ms. As the figure shows, both the host and the slave start a read thread and both process data. Because the host can issue command frames to the devices, the host handles all information that can only be obtained by issuing a command frame. One part, such as telemetry, remote signaling, and accumulated quantities, is uploaded periodically and changes constantly; another part, such as remote control and setting modification, also requires command frames. Together these account for more than 80% of all information, so the monitoring system that handles them is defined as the host. The slave, by contrast, handles data that the devices upload automatically, such as SOE records and accident messages; this information occurs only occasionally, for example when a device fails and sends an accident message. This lightens the host's load. The flow comprises the following steps:
B1. Begin;
B2. Start the read thread;
B3. Determine whether this machine is the host; if yes, go to step B4; otherwise go to step B6;
B4. Receive data messages, then go to step B5;
B5. Message processing, then go to step B8;
B6. Receive data frames, then go to step B7;
B7. Message processing, then go to step B8;
B8. Data processing, then return to step B3.
Fig. 7 covers the case in which the host and slave are both normal and share the tasks. Fig. 8 is the data-processing flow chart for the case in which the host or the slave is abnormal. To keep the monitoring data complete, when one machine is abnormal the other must process all data. Therefore, after starting the read thread, the host and the slave each also start a flow that judges whether the other is abnormal and responds accordingly, referred to as the abnormality-judgment thread. Each machine judges from information sent by the master control system whether the other is abnormal. If the other is normal, monitoring continues. If it is abnormal, the surviving machine immediately starts receiving the data messages or data frames that it did not previously receive and processes them as well, thereby replacing the abnormal machine. The steps of the abnormality-judgment thread are as follows:
C1. Begin;
C2. Start the read thread;
C3. Determine whether this machine is the host; if yes, go to step C4; otherwise go to step C8;
C4. Judge whether the slave is abnormal; if yes, go to step C5; otherwise return to step C3;
C5. Receive the data frames that the slave receives under normal conditions, then go to step C6;
C6. Message processing, then go to step C7;
C7. Data processing, then return to step C4;
C8. Judge whether the host is abnormal; if yes, go to step C9; otherwise return to step C3;
C9. Receive the data messages that the host receives under normal conditions, then go to step C10;
C10. Message processing, then go to step C11;
C11. Data processing, then return to step C8.
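The effect of steps C3 to C11 is that a machine widens the set of data it processes once its peer is judged abnormal. A minimal Python sketch of that decision follows; the labels "polled" (data obtained by issuing command frames) and "spontaneous" (data uploaded by devices on their own initiative) and the function name are assumptions made for illustration.

```python
def kinds_to_process(is_master: bool, peer_abnormal: bool) -> set:
    """Return the kinds of device data this machine should accept right now."""
    own = "polled" if is_master else "spontaneous"
    if not peer_abnormal:
        # C4/C8 answered "no": keep to this machine's normal share of the work.
        return {own}
    # C5/C9: also accept what the abnormal peer would normally have received.
    return {"polled", "spontaneous"}


print(sorted(kinds_to_process(is_master=True, peer_abnormal=False)))   # ['polled']
print(sorted(kinds_to_process(is_master=False, peer_abnormal=True)))   # ['polled', 'spontaneous']
```

The sketch covers only the receiving side; a slave that takes over polled data would also need its write-side interception lifted, as described for Fig. 6.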
Fig. 9 shows one interface of the master control system (named the data server in practice). The master control system communicates with other components mainly through WCF. It can comprise several storage servers (each a WCF server module) and several storage control nodes (each containing several logic modules that process services). Each WCF server module has a clearly defined task, for example: a bulletin board service module (publishes service addresses, the site to which each registered host or slave belongs, and similar information); a real-time data service module (handles telemetry, remote signaling, and similar data); a data and file transfer service module (transfers data and files and implements functions such as mirroring backup data between host and slave); and an event management service (monitors channel connection status, notifies the host or slave of abnormalities, and so on). Only part of the master control system's functionality serves dual-machine hot standby. Since this document focuses on the hot-standby switching method, the other functional modules are not described here.
Fig. 10 is the logic diagram of a master-slave exchange when both machines are operating normally. A user operation that requires a command frame to be issued, such as remote control or setting-value modification, causes the master and the slave to switch passively; in this situation the two roles are exchanged. The passive switch is carried out through the heartbeat and/or the control system. As the figure shows, the master and the slave swap roles.
The process by which the master and the slave switch passively through the heartbeat and/or the control system comprises the following steps:
D1, the original slave sends a message to the original master with content AAAAAA ...; its own current state is IsMaster=false, IsStandby=true, IsSwitch=false, which indicates that it is still the slave, has not yet switched, and is sending a switch request command to the master;
D2, after the original master receives the message, it adjusts its state and replies to the slave with content BBBBBB ...; after adjustment its state is IsMaster=false, IsStandby=false, IsSwitch=true, which indicates that it has now switched to being the new slave;
D3, after the original slave receives the message, it adjusts its state and sends a message to the new slave with content CCCCCC ...; after adjustment its state is IsMaster=true, IsStandby=false, IsSwitch=false, which indicates that it has switched to being the master and no longer sends switch request commands to the original master;
D4, after the new slave receives the message, it adjusts its state to IsMaster=false, IsStandby=false, IsSwitch=false, which indicates that it is the slave and no longer sends switch-success messages to the new master.
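The D1 to D4 exchange can be replayed as a small state walk-through. The sketch below is an assumption-laden illustration: `RoleState` and `passive_switch` are hypothetical names, the message constants stand in for the contents that the text elides with "...", and the starting state of the master (IsMaster=true) is assumed because the text does not state it. The flag values after each step are copied from the steps above.

```python
from dataclasses import dataclass


@dataclass
class RoleState:
    is_master: bool
    is_standby: bool
    is_switch: bool


# Placeholders only: the full message contents are elided in the source text.
REQUEST, ACK, CONFIRM = "AAAAAA", "BBBBBB", "CCCCCC"


def passive_switch(slave: RoleState, master: RoleState, log: list) -> None:
    """Replay D1..D4 in place; `log` records the messages in the order sent."""
    if slave.is_master or not slave.is_standby:
        raise ValueError("D1 expects the requester to still be the slave")
    log.append(REQUEST)  # D1: slave asks the master to switch
    # D2: original master marks itself as switched and replies.
    master.is_master, master.is_standby, master.is_switch = False, False, True
    log.append(ACK)
    # D3: original slave becomes the master and confirms.
    slave.is_master, slave.is_standby, slave.is_switch = True, False, False
    log.append(CONFIRM)
    # D4: new slave settles into its final state.
    master.is_master, master.is_standby, master.is_switch = False, False, False
```

A real implementation would send these messages over the heartbeat line and/or through the control system and would have to handle lost or repeated messages, which this walk-through ignores.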
In a real substation system, the master and the slave may be located at different engineer stations or power distribution rooms, so that the user can monitor from several places. In an emergency, however, the user may need to control a circuit breaker or some other device from the slave, which requires a command frame to be issued, yet the slave has no control authority at that moment. The RC3000 electric power monitoring system handles this case specially: the user can perform remote control (of a given device) directly from the slave. At the moment the user initiates remote control, the slave sends a message to the master stating that it needs control. The master immediately switches to slave and notifies the original slave to switch to master, and the remote control command frame is issued at the same time. (Note: how the command frame is issued depends mainly on the communication interface of the device: through a COM port it is issued over the RS485 bus; through a network port it is issued over the network cable.) User operation is not affected: the roles switch automatically with the user's operation, and the user does not perceive any difference between master and slave, which makes the design more user-friendly.
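The two decisions in this paragraph (whether a role switch is needed, and which medium carries the command frame) can be illustrated with a short Python sketch. The names `Port`, `transport_for` and `remote_control` are hypothetical and are not taken from the RC3000 system.

```python
from enum import Enum
from typing import Tuple


class Port(Enum):
    COM = "com"
    NETWORK = "network"


def transport_for(port: Port) -> str:
    """Pick the medium for a command frame from the device's interface type."""
    if port is Port.COM:
        return "RS485 bus"
    return "network cable"


def remote_control(issuer_is_master: bool, port: Port) -> Tuple[bool, str]:
    """Return (switch_needed, transport) for a remote-control request.

    A request issued on the slave needs a role switch first, because only
    the master may issue command frames.
    """
    return (not issuer_is_master, transport_for(port))
```

For example, a request issued on the slave for a device on a COM port yields `(True, "RS485 bus")`: the roles are exchanged first and the frame then goes out over the RS485 bus.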
In summary, the present invention has the following advantages:
1. A third-party system (the control system) is introduced between the master and the slave to coordinate master-slave switching.
2. The control system communicates with the master and the slave through WCF, with the control system as the server and the master and slave as clients. The server and each client are connected by dual TCP channels; when one of the channels is disconnected, the control system can judge that the master or the slave is abnormal and notify the other side to switch. In addition, the master and the slave communicate with each other over the RS485 bus; this line carries the heartbeat between them, through which the master and the slave monitor each other. Combining the dual TCP channels with the RS485 bus allows abnormalities to be judged more quickly and accurately.
3. Drawing on the idea of cloud storage, data is stored in a distributed manner and no shared disk array is used; the master and the slave each access only their local database. This avoids the situation where the master and the slave cannot access the disk subsystem at the same time, and avoids the delay the standby machine would incur in acquiring exclusive write control during a master-slave switch, which shortens the switching time.
4. While the master is running, the standby machine shares the master's load and remains in the running state throughout. No application or database needs to be restarted during a switch; the standby machine takes over the master's tasks directly, which saves switching time.
5. Taken together, points 2, 3 and 4 bring the switching time down to the millisecond level (50 ms to 100 ms), far faster than traditional dual-machine switching.
6. Network-port (WCF) communication and serial (RS485) communication are combined to carry out the dual-machine switch, which is safer and more reliable: even if one link fails, the other can still carry out the switch. This avoids the risk in traditional dual-machine switching that, once the single heartbeat fails, two masters or two slaves exist at the same time.
7. The master and the slave can be switched arbitrarily and can switch automatically according to the user's operation. The user can monitor data on the master and the slave at the same time and can perform the same operations on both, so the user does not perceive any difference between them.
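The two fault indications named in points 2 and 6 (a dropped TCP channel and a missing or wrong heartbeat reply) can be collected as in the Python sketch below. The names `PeerVerdict` and `check_peer` are hypothetical. The text does not specify how the two indications are arbitrated, so the sketch reports them separately, and the `any_fault` property is only one possible reading of "and/or".

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PeerVerdict:
    tcp_fault: bool        # at least one of the two TCP channels is down
    heartbeat_fault: bool  # no RS485 heartbeat reply, or a wrong one

    @property
    def any_fault(self) -> bool:
        # Assumption: either indication alone is treated as "abnormal".
        return self.tcp_fault or self.heartbeat_fault


def check_peer(channel_a_up: bool, channel_b_up: bool,
               reply: Optional[bytes], expected: bytes) -> PeerVerdict:
    """Combine the dual-channel state with the latest heartbeat reply."""
    tcp_fault = not (channel_a_up and channel_b_up)
    heartbeat_fault = reply is None or reply != expected
    return PeerVerdict(tcp_fault, heartbeat_fault)
```

Keeping the two flags apart lets the caller distinguish a failed link from a failed machine, which matters for avoiding the two-master situation mentioned in point 6.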
The present invention is not limited to the above examples. Within the scope defined by the claims of the present invention, all variations or modifications that those skilled in the art can make without creative work are protected by this patent.

Claims (9)

1. A dual-machine hot standby switching method based on SOA and the RS485 bus, characterized in that:
a control system is responsible for handling master-slave switching and distributed data storage, and for coordinating the work of the master and the slave of a power station monitoring system;
the master and the slave each judge, from information sent by the control system and/or from the heartbeat, whether the other is abnormal, and if so take over the other's tasks;
the master and the slave both start all task threads, but:
the master undertakes the data query tasks; after receiving and processing such data, the master backs up and stores the data and at the same time sends it to the control system in real time; after backing up and storing the data, the control system mirrors the data to the slave piece by piece;
while monitoring the master, the slave also receives monitoring data in real time and filters it, accepting only the data that the lower-level intelligent monitoring devices actively report; likewise, the slave processes and backs up the data and at the same time sends it to the control system for storage; after backing up and storing the data, the control system mirrors the data to the master piece by piece; the slave backs up the master's data in real time, and the user can equally monitor the data through the configuration interface.
2. The method according to claim 1, characterized in that the master monitors the slave through the heartbeat: the master periodically sends a heartbeat packet to the slave, and the slave extracts the heartbeat packet information and replies accordingly; if no corresponding reply can be made or the reply content is incorrect, the master is judged abnormal and the slave takes over the master's tasks; if the slave is abnormal, the master does not receive a reply, the slave is judged abnormal, and the master takes over the slave's tasks;
the control system is connected to both the master and the slave by dual TCP channels; if either channel is disconnected, the master or the slave is judged abnormal and the other machine takes over its tasks.
3. The method according to claim 1 or 2, characterized in that the master and the slave each start a write thread that writes data to the lower-level intelligent monitoring devices; the write thread is a loop thread comprising the following concrete steps:
A1, start;
A2, start the write thread;
A3, judge whether this machine is the master; if yes, go to step A4; if not, go to step A5;
A4, issue the data frame, then go to step A6;
A5, issuing of the data frame is blocked, then return to step A3;
A6, message processing, then go to step A7;
A7, data processing, then return to step A3.
4. The method according to claim 1, characterized in that the master and the slave each start a read thread that, under normal conditions, reads data from the lower-level intelligent monitoring devices; the read thread is a loop thread comprising the following concrete steps:
B1, start;
B2, start the read thread;
B3, judge whether this machine is the master; if yes, go to step B4; if not, go to step B6;
B4, accept the data message, then go to step B5;
B5, message processing, then go to step B8;
B6, accept the data frame, then go to step B7;
B7, message processing, then go to step B8;
B8, data processing, then return to step B3.
5. The method according to claim 4, characterized in that after starting the read thread, the master and the slave each also start an abnormality judgment and processing thread, which judges whether the other side is abnormal and responds accordingly; the master or the slave judges whether its peer is abnormal from the information sent by the control system; if the peer is normal, monitoring continues; if the master or the slave is abnormal, the other machine immediately begins accepting the data messages or data frames that it did not previously receive and processes them as well; the master or the slave judges whether its peer is abnormal from the information sent by the control system or from the heartbeat.
6. The method according to claim 5, characterized in that after the master or the slave switches, it must register with the control system again and is identified by its registration information; the control system must check the database data to determine whether any data has been lost.
7. The method according to claim 5 or 6, characterized in that the abnormality judgment and processing thread comprises the following concrete steps:
C1, start;
C2, start the read thread;
C3, judge whether this machine is the master; if yes, go to step C4; if not, go to step C8;
C4, judge whether the slave is abnormal; if yes, go to step C5; if not, return to step C3;
C5, accept the data frames that are normally accepted by the slave, then go to step C6;
C6, message processing, then go to step C7;
C7, data processing, then return to step C4;
C8, judge whether the master is abnormal; if yes, go to step C9; if not, return to step C3;
C9, accept the data messages that are normally accepted by the master, then go to step C10;
C10, message processing, then go to step C11;
C11, data processing, then return to step C8.
8. The method according to claim 1, characterized in that when the master and the slave are operating normally, a user operation on the slave that requires a command frame to be issued causes the master and the slave to switch passively through the heartbeat.
9. The method according to claim 8, characterized in that the process by which the master and the slave switch passively through the heartbeat and/or the control system comprises the following steps:
D1, the original slave sends a message to the original master with content AAAAAA ...; its own current state is IsMaster=false, IsStandby=true, IsSwitch=false, which indicates that it is still the slave, has not yet switched, and is sending a switch request command to the master;
D2, after the original master receives the message, it adjusts its state and replies to the slave with content BBBBBB ...; after adjustment its state is IsMaster=false, IsStandby=false, IsSwitch=true, which indicates that it has now switched to being the new slave;
D3, after the original slave receives the message, it adjusts its state and sends a message to the new slave with content CCCCCC ...; after adjustment its state is IsMaster=true, IsStandby=false, IsSwitch=false, which indicates that it has switched to being the master and no longer sends switch request commands to the original master;
D4, after the new slave receives the message, it adjusts its state to IsMaster=false, IsStandby=false, IsSwitch=false, which indicates that it is the slave and no longer sends switch-success messages to the new master.
CN201310097042.7A 2013-03-25 2013-03-25 The two-node cluster hot backup changing method realized based on SOA, RS485 bus and hot backup system Expired - Fee Related CN103199972B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201310097042.7A CN103199972B (en) 2013-03-25 2013-03-25 The two-node cluster hot backup changing method realized based on SOA, RS485 bus and hot backup system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201310097042.7A CN103199972B (en) 2013-03-25 2013-03-25 The two-node cluster hot backup changing method realized based on SOA, RS485 bus and hot backup system

Publications (2)

Publication Number Publication Date
CN103199972A CN103199972A (en) 2013-07-10
CN103199972B true CN103199972B (en) 2016-04-20

Family

ID=48722340

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201310097042.7A Expired - Fee Related CN103199972B (en) 2013-03-25 2013-03-25 The two-node cluster hot backup changing method realized based on SOA, RS485 bus and hot backup system

Country Status (1)

Country Link
CN (1) CN103199972B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109062184A (en) * 2018-08-10 2018-12-21 中国船舶重工集团公司第七〇九研究所 Two-shipper emergency and rescue equipment, failure switching method and rescue system

Families Citing this family (40)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103473328A (en) * 2013-09-17 2013-12-25 中电长城网际系统应用有限公司 MYSQL (my structured query language)-based database cloud and construction method for same
CN103532753B (en) * 2013-10-11 2016-08-17 中国电子科技集团公司第二十八研究所 A kind of double hot standby method of synchronization of skipping based on internal memory
CN103532767B (en) * 2013-10-28 2016-06-08 国家电网公司 Based on SOA framework active and standby adjust integration man-machine interactive system realize method
CN104669268B (en) * 2013-11-26 2016-08-03 中国科学院沈阳自动化研究所 A kind of redundancy underwater robot self-control system based on Hot Spare and method
CN105354015A (en) * 2014-08-20 2016-02-24 南京普爱射线影像设备有限公司 Multi-thread communication technology
CN104394142B (en) * 2014-11-24 2018-02-16 北京京东尚科信息技术有限公司 For realizing the method and system of the automatic master-slave swaps of Redis
CN105824571A (en) * 2015-01-05 2016-08-03 中国移动通信集团四川有限公司 Data seamless migration method and device
CN105207866B (en) * 2015-10-26 2019-03-05 珠海格力电器股份有限公司 Communication means and airconditioning control network based on airconditioning control network-based control terminal
CN105933135B (en) * 2015-11-16 2019-07-16 中国银联股份有限公司 It is a kind of it is determining execute scheduler task method and execute scheduler task the first host
CN105323889A (en) * 2015-11-19 2016-02-10 广东正力通用电气有限公司 Hot backup automatic switching time control system
CN106201825B (en) * 2016-07-13 2019-03-08 深圳市爱培科技术股份有限公司 A kind of intelligent back vision mirror running state monitoring method and system
CN106354589A (en) * 2016-08-24 2017-01-25 天津天大求实电力新技术股份有限公司 Double-unit hot standby method of micro-grid energy management system service programs
CN106294236B (en) * 2016-08-25 2018-12-04 广东迪奥技术有限公司 A kind of communication means based on RS485, device and communication system
CN106648997A (en) * 2016-12-23 2017-05-10 北京航天测控技术有限公司 Master-salve switching method based on non-real-time operating system
CN110018925B (en) * 2018-01-10 2023-08-29 厦门雅迅网络股份有限公司 System security redundancy method and computer readable storage medium
CN108390781A (en) * 2018-02-12 2018-08-10 王磊 A kind of method and system of the automatic Hot Spare of host
CN110213065B (en) * 2018-02-28 2022-11-25 杭州宏杉科技股份有限公司 Method and device for switching paths
CN108415797A (en) * 2018-03-05 2018-08-17 山东超越数控电子股份有限公司 A method of avoid server failure switching according to library loss of data
CN108510726A (en) * 2018-04-02 2018-09-07 国网上海市电力公司 A kind of distribution terminal signal turns GPRS signal components
CN108650115B (en) * 2018-04-16 2021-08-24 宁波三星医疗电气股份有限公司 Fault processing method for multi-channel cascade topological structure of centralized meter reading system
CN108809995B (en) * 2018-06-16 2021-03-19 武汉商启网络信息有限公司 Management control system for preventing cloud host password from being decoded
CN109698775A (en) * 2018-11-21 2019-04-30 中国航空工业集团公司洛阳电光设备研究所 A kind of dual-machine redundancy backup system based on real-time status detection
CN109560993A (en) * 2018-12-20 2019-04-02 航天信息股份有限公司 The method of communication link abnormality detection, device, electronic equipment and network
CN109799797B (en) * 2019-01-10 2021-12-07 国网陕西省电力公司 Method for hot standby of double machines of plant station electric energy acquisition terminal
CN110417584A (en) * 2019-07-10 2019-11-05 南京南瑞继保电气有限公司 A kind of two-shipper main/standby switching method based on multi-link election mechanism
CN110347536A (en) * 2019-08-15 2019-10-18 深圳市万连通讯技术有限公司 A kind of 485 bus system of double hosts and working host fault redundance guard method
CN110750480B (en) * 2019-10-18 2021-06-29 苏州浪潮智能科技有限公司 Dual-computer hot standby system
CN111007815B (en) * 2019-11-28 2021-04-30 中国电子科技集团公司第二十八研究所 Centralized control host supporting dual-computer hot standby
CN111277596A (en) * 2020-01-20 2020-06-12 广东电网有限责任公司电力调度控制中心 Power grid regulation and control safety zone data transmission system, method and equipment
CN113992696A (en) * 2020-07-10 2022-01-28 中国电信股份有限公司 Memcache cache system, synchronization method thereof and computer readable storage medium
CN112367214B (en) * 2020-10-12 2022-06-14 成都精灵云科技有限公司 Method for rapidly detecting and switching main node based on etcd
CN112230625B (en) * 2020-10-30 2022-04-01 北京汽车研究总院有限公司 Vehicle control method of intelligent driving controller, storage medium and computer equipment
CN112653734B (en) * 2020-12-11 2023-09-19 邦彦技术股份有限公司 Real-time master-slave control and data synchronization system and method for server cluster
CN114115091B (en) * 2021-01-12 2024-05-17 无锡信捷电气股份有限公司 PLC data redundancy method based on time synchronization and finite data element interaction
CN112954008B (en) * 2021-01-26 2022-11-04 网宿科技股份有限公司 Distributed task processing method and device, electronic equipment and storage medium
CN113542028A (en) * 2021-07-17 2021-10-22 辽宁工业大学 Dual-computer hot standby method for receiving data of Internet of things
CN114003551A (en) * 2021-11-01 2022-02-01 山东芯慧微电子科技有限公司 FPGA hot standby controller for master-slave dual-computer hot standby
CN114935779B (en) * 2022-06-14 2022-11-29 天津君秒安减灾科技有限公司 Master-slave switching system for automatic connection between earthquake rescue field devices
CN115390490B (en) * 2022-08-23 2024-04-26 南京芯传汇电子科技有限公司 Remote control terminal redundancy management method, device, equipment and storage medium
CN115407640B (en) * 2022-11-01 2023-04-25 山东博硕自动化技术有限公司 Multi-control multi-machine automatic control system and control method thereof

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6359871B1 (en) * 1994-05-27 2002-03-19 Curtin University Of Technology Cellular communications network
CN1775606A (en) * 2005-12-19 2006-05-24 北京交通大学 Wireless locomotive signal dual-engine warm standby control method
CN101572724A (en) * 2009-03-05 2009-11-04 国电南瑞科技股份有限公司 Software version management system
CN102281563A (en) * 2010-06-11 2011-12-14 海能达通信股份有限公司 Communication system, switching method applied to communication system, and network management server

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101651361A (en) * 2008-08-15 2010-02-17 上海致达智利达系统控制有限责任公司 Integrated automation system of substation
CN201662715U (en) * 2010-03-29 2010-12-01 河南电力试验研究院 Power generation energy consumption data acquisition device
CN102073284B (en) * 2010-12-21 2012-10-10 北京航空航天大学 Dual-computer redundant embedded control system suitable for nuclear industrial robot
CN102930392A (en) * 2012-10-25 2013-02-13 沈阳化工大学 System for running information of transformer substation

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109062184A (en) * 2018-08-10 2018-12-21 中国船舶重工集团公司第七〇九研究所 Two-shipper emergency and rescue equipment, failure switching method and rescue system
CN109062184B (en) * 2018-08-10 2021-05-14 中国船舶重工集团公司第七一九研究所 Double-machine emergency rescue equipment, fault switching method and rescue system

Also Published As

Publication number Publication date
CN103199972A (en) 2013-07-10

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20160420
