WO2012028013A1 - Implementing method for main/standby configuration of board cards, and board card - Google Patents

Implementing method for main/standby configuration of board cards, and board card

Info

Publication number
WO2012028013A1
Authority
WO
WIPO (PCT)
Prior art keywords
board
node
standby
candidate
competition
Prior art date
Application number
PCT/CN2011/075989
Other languages
French (fr)
Chinese (zh)
Inventor
黄文伟
周海山
杨骐
Original Assignee
中兴通讯股份有限公司
Application filed by 中兴通讯股份有限公司

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04B: TRANSMISSION
    • H04B1/00: Details of transmission systems, not covered by a single one of groups H04B3/00 - H04B13/00; Details of transmission systems not characterised by the medium used for transmission
    • H04B1/74: Details of transmission systems, not covered by a single one of groups H04B3/00 - H04B13/00; Details of transmission systems not characterised by the medium used for transmission, for increasing reliability, e.g. using redundant or spare channels or apparatus
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00: Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/08: Configuration management of networks or network elements

Definitions

  • The present invention relates to the field of communications, and in particular to a method for implementing an active/standby configuration of boards, and to a board.
  • Background
  • Router capacity has improved greatly, which is mainly reflected in the increased number of slots, the higher switching capacity of each slot, and the higher port density of each card.
  • The number of routers is increasing, and the functional division between router types is becoming more and more pronounced.
  • Core layer routers need fast forwarding, high security, high stability, and sufficient capacity.
  • Although router technology is constantly improving, Internet traffic will always grow faster than device capacity can be upgraded.
  • Expanding a single router requires considering the maturity of optical devices, the power supply, heat dissipation, and the load-bearing capacity of the equipment room.
  • Single-chassis router technology has gradually approached its limit, and the emergence of cluster routers has solved the problem of router expansion.
  • Cluster technology enables the routers in a cluster system to work together, breaking the limits of single-chassis routers in switching capacity, power consumption, and heat dissipation.
  • In this way, a single-chassis router system can expand smoothly into a larger-capacity routing and switching system.
  • Although cluster routers have an advantage in capacity, they are relatively complex, and their deployment places high demands on uninterrupted online upgrade, disaster tolerance, automatic fault monitoring, and alarm functions. This increases the difficulty of implementing a cluster router system.
  • A cluster router system includes a master node and a standby node.
  • The master node is the centralized control point of the cluster router system and is mainly used for version loading, user operation and maintenance, centralized calculation, and decision making for the whole cluster router system.
  • The selection and monitoring of the master node directly affect the reliability and availability of the whole system. Given its importance, the master node must be designed with 1+1 active/standby redundancy.
  • The master node selection mode of a single-chassis router system can be reused, that is, the main control board of a particular chassis is fixed as the master node. With this choice, the software flow of the single-chassis system can be used directly, so the implementation is relatively simple. However, the whole cluster router system then depends entirely on the chassis in which the selected master node is located, and there is no guarantee that the selected master node meets the performance requirements of a master node. This affects the normal operation of the whole cluster router system and results in poor reliability.
  • The present invention provides a method for implementing an active/standby configuration of boards, and a board, which can reasonably select a master node with optimal performance.
  • A method for implementing an active/standby configuration of boards is provided, for determining a master node and a standby node from a plurality of boards located in a plurality of chassis.
  • The method includes: for a plurality of candidate boards among the plurality of boards, each candidate board sends a competition message to the other candidate boards, where the competition message sent by each candidate board carries the hardware information of that candidate board; during the competition, each candidate board receives competition messages from the other candidate boards; each candidate board compares its own hardware information with the hardware information in each received competition message and determines the winning candidate board according to a competition principle, where the competition principle is used to determine, from the comparison result of the hardware information, the candidate board that is to serve as the master node; a unique master node is determined according to the competition among the plurality of candidate boards, and the master node determines one of the other candidate boards as the standby node.
  • Determining the unique master node according to the competition among the plurality of candidate boards includes: determining the only candidate board that has never failed in the competition as the master node.
  • The method further includes: if a candidate board fails in the competition, it stops sending competition messages to the other candidate boards.
  • The method may further include: setting a competition flag on a candidate board in advance and setting it to true, which indicates that the candidate board is allowed to become the master node by sending competition messages; for each candidate board, if the candidate board determines that it has failed in the competition with any other candidate board, its competition flag is set to false, which indicates that the candidate board is prohibited from becoming the master node by sending competition messages.
  • Determining the only candidate board that has never failed as the master node includes: when the number of times a candidate board has broadcast or multicast the competition message to all other candidate boards reaches a predetermined threshold, if that candidate board no longer receives competition messages from other candidate boards, it is determined as the master node.
  • The hardware information includes the physical location information of the board and the active/standby status information of the board within its chassis, where the status information indicates whether the board is the main control board or the standby control board of its chassis.
  • The competition principle includes: preferentially determining the candidate board that is the main control board of its chassis as the winner; if both compared candidate boards are the main control boards of their respective chassis, determining the one with the smaller (or larger) physical position as the winner.
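  • The competition principle can be sketched as a pairwise comparison. The following is a minimal illustration, not the patented implementation: the `HardwareInfo` fields, the `wins` function, and the encoding of physical position as a (chassis, slot) pair are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HardwareInfo:
    """Hypothetical payload of a competition message."""
    chassis_id: int         # chassis in which the board sits
    slot_id: int            # position of the board inside that chassis
    is_main_control: bool   # True: main control board; False: standby control board


def position(info: HardwareInfo) -> tuple:
    # Assumption: physical positions are unique and ordered by (chassis, slot).
    return (info.chassis_id, info.slot_id)


def wins(own: HardwareInfo, other: HardwareInfo, prefer_smaller: bool = True) -> bool:
    """Return True if `own` beats `other` under the competition principle."""
    # A main control board always beats a standby control board.
    if own.is_main_control != other.is_main_control:
        return own.is_main_control
    # Same role on both sides: the physical position breaks the tie.
    if prefer_smaller:
        return position(own) < position(other)
    return position(own) > position(other)
```

  • Every board must be configured with the same `prefer_smaller` setting; otherwise two boards could each conclude that they have won.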
  • The master node determining one of the other candidate boards as the standby node includes: the master node notifies the other candidate boards to report their hardware information and, according to the reported hardware information, determines a candidate board located in a different chassis from the master node as the standby node.
  • Determining, according to the reported hardware information, a candidate board located in a different chassis from the master node as the standby node further includes: if there are multiple candidate boards that are located in chassis different from that of the master node and that are the main control boards of their respective chassis, randomly selecting one of them as the standby node, or determining the one with the smallest or largest physical position as the standby node.
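  • The standby node selection can be sketched as a filter followed by a tie-break. In this illustration the `Report` type and `choose_standby` function are hypothetical, the smallest physical position is used as the tie-break (the text equally allows random or largest), and falling back to standby control boards when no main control board qualifies is an assumption not stated in the text.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Report:
    """Hypothetical hardware report sent to the master node."""
    chassis_id: int
    slot_id: int
    is_main_control: bool


def choose_standby(master: Report, reports: Sequence[Report]) -> Optional[Report]:
    """Pick a standby node located in a different chassis from the master."""
    other_chassis = [r for r in reports if r.chassis_id != master.chassis_id]
    if not other_chassis:
        return None  # no board outside the master's chassis reported
    # Prefer boards that are the main control board of their chassis.
    mains = [r for r in other_chassis if r.is_main_control]
    pool = mains or other_chassis  # assumption: fall back to standby control boards
    # Tie-break: smallest physical position.
    return min(pool, key=lambda r: (r.chassis_id, r.slot_id))
```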
  • The method further includes: the master node sends a service allocation request to the main control boards of the multiple chassis and receives the hardware information and load information returned by each of them; the master node determines a main service node and a standby service node from the main control boards of the multiple chassis according to the reported hardware information and load information. Specifically, the master node first determines the main control board with the smallest load, according to the load information, as the main service node, and then determines one of the main control boards among the candidate boards, other than the master node, the standby node, and the main service node, as the standby service node.
  • Determining one of the main control boards other than the master node, the standby node, and the main service node as the standby service node includes: determining a main control board located in a different chassis from the main service node as the standby service node.
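  • A minimal sketch of the service node allocation follows. The `BoardLoad` type, the `assign_service_nodes` function, and the board identifiers are hypothetical; choosing the least-loaded remaining board as the standby service node is also an assumption, since the text only requires a main control board in a different chassis from the main service node.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass(frozen=True)
class BoardLoad:
    """Hypothetical reply to the master node's service allocation request."""
    board_id: str
    chassis_id: int
    load: float


def assign_service_nodes(
    reports: Sequence[BoardLoad], master_id: str, standby_id: str
) -> Tuple[Optional[BoardLoad], Optional[BoardLoad]]:
    """Return (main service node, standby service node)."""
    if not reports:
        return None, None
    # Main service node: the main control board with the smallest load.
    main = min(reports, key=lambda b: b.load)
    # Standby service node: not the master, standby, or main service node,
    # and located in a different chassis from the main service node.
    excluded = {master_id, standby_id, main.board_id}
    pool = [
        b for b in reports
        if b.board_id not in excluded and b.chassis_id != main.chassis_id
    ]
    backup = min(pool, key=lambda b: b.load) if pool else None  # assumption: least load
    return main, backup
```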
  • The method further includes: the master node sends a keep-alive message to the standby node at a predetermined period; if the master node meets a switching condition, the master node and the standby node are switched, where the switching condition includes: the standby node does not receive a keep-alive message from the master node within a predetermined period of time, and the standby node learns, through another board in the same chassis as the master node, that the working status of the master node is abnormal.
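  • The switching condition combines two checks on the standby side. The sketch below is illustrative only: the `StandbyMonitor` class, its method names, and the use of plain seconds for timestamps are assumptions, and how the other board reports the master's status is left abstract as a boolean.

```python
from dataclasses import dataclass


@dataclass
class StandbyMonitor:
    """Hypothetical keep-alive tracker running on the standby node."""
    timeout_s: float               # predetermined period without keep-alive
    last_keepalive_s: float = 0.0  # time of the most recent keep-alive

    def on_keepalive(self, now_s: float) -> None:
        self.last_keepalive_s = now_s

    def keepalive_lost(self, now_s: float) -> bool:
        return now_s - self.last_keepalive_s > self.timeout_s

    def should_switch(self, now_s: float, peer_reports_abnormal: bool) -> bool:
        # Both conditions must hold: keep-alive lost AND a board in the
        # master's chassis reports that the master is working abnormally.
        return self.keepalive_lost(now_s) and peer_reports_abnormal
```

  • Requiring both conditions avoids a switchover when only the keep-alive path between the two chassis has failed while the master itself is still healthy.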
  • The candidate boards may include some or all of the main control boards and standby control boards of the multiple chassis.
  • A board is also provided.
  • The board includes: a sending module, configured to send a competition message to other boards, where the competition message carries the hardware information of the board; a receiving module, configured to receive competition messages from other boards while the sending module is sending competition messages to the other boards; a first determining module, configured to compare the hardware information of the board with the hardware information in the competition messages received by the receiving module from other boards, and to determine the winning board according to a competition principle, where the competition principle is used to determine the board that is to serve as the master node according to the comparison result of the hardware information; a control module, configured to control the sending module to stop sending competition messages when the first determining module determines that the board has failed in the competition; and a second determining module, configured to determine one of the other candidate boards as the standby node when the board on which it resides is the master node.
  • The hardware information includes the physical location information of the board and the active/standby status information of the board within its chassis, where the status information indicates whether the board is the main control board or the standby control board of its chassis.
  • The board may further include: a notification module, configured to notify other boards to report their hardware information after the board is determined to be the master node; the second determining module is specifically configured to determine, according to the hardware information reported by the other boards, a candidate board located in a different chassis from the master node as the standby node.
  • The present invention determines the master node through competition among a plurality of candidate boards capable of serving as the master node, so that the selection of the master node is not limited to a fixed chassis and the performance of the finally determined master node can be ensured. This avoids the poor system reliability and unreasonable determination process that result when the master node can only be determined within a fixed chassis.
  • FIG. 1 is a schematic flowchart diagram of a method for implementing a card master/slave configuration according to an embodiment of the present invention
  • FIG. 2 is a schematic diagram of processing for determining a master node in an implementation method of a board master/slave configuration according to an embodiment of the present invention
  • FIG. 3 is a schematic diagram of processing for determining a primary service node and a standby service node in an implementation method of a board master/slave configuration according to an embodiment of the present invention
  • FIG. 4 is a schematic diagram showing the distribution of nodes in the active/standby relationship determined by the implementation method of the active/standby configuration of the board according to the embodiment of the present invention
  • FIG. 5 is a schematic diagram of processing performed by a master control node and a standby control node according to an implementation method of a board master/slave configuration according to an embodiment of the present invention
  • FIG. 6 is a schematic structural view of a card according to an embodiment of the present invention.
  • To address the problem that, in current cluster router systems, the selection mechanism of the master node is unreasonable and the performance of the selected master node cannot be guaranteed, the present invention proposes that multiple candidate boards in multiple chassis compete with one another and that the master node is determined according to the result of the competition. The selection of the master node is thus not limited to a fixed chassis, and the performance of the finally determined master node can be guaranteed.
  • a method for implementing a card master/slave configuration is provided, which is used to determine a master node and a standby node from a plurality of boards located in multiple chassis.
  • FIG. 1 is a schematic flowchart of a method for implementing a master/slave configuration of a card according to an embodiment of the invention.
  • an implementation method of a board active/standby configuration according to an embodiment of the present invention includes:
  • Step 101: for multiple candidate boards among the multiple boards (in this embodiment, a candidate board should meet the performance requirements of a master node), each candidate board sends a competition message to the other candidate boards, where the competition message sent by each candidate board carries the hardware information of that candidate board;
  • Step 103: during the competition, each candidate board receives competition messages from the other candidate boards;
  • Step 105: each candidate board compares its own hardware information with the hardware information in the received competition message and determines the candidate board that wins the competition according to the competition principle, where the competition principle is used to determine the candidate board that is to serve as the master node according to the comparison result of the hardware information (each comparison involves only two boards, so the board that wins a comparison is not necessarily the one finally determined as the master node; for a single comparison, the purpose of the competition principle is in fact to determine which of the two compared candidate boards is more suitable as the master node);
  • Step 107: a unique master node is determined according to the competition among the multiple candidate boards, and the master node determines one of the other candidate boards as the standby node.
  • Because the master node is determined through competition among a plurality of candidate boards capable of serving as the master node, the selection of the master node is not limited to a fixed chassis and the performance of the finally determined master node can be ensured. This avoids the poor system reliability and unreasonable determination process that result when the master node can only be determined within a fixed chassis.
  • The candidate boards may include all normally running main control boards and standby control boards of the multiple chassis, or only some of them, but each candidate board should have the function or ability to control the other boards in its own chassis.
  • The hardware information includes the physical location information of the board (for example, the identifier of the chassis in which the board is located, the location identifier of the board within the chassis, and the address information of the board) and the active/standby status information of the board within its chassis, where the status information indicates whether the board is the main control board or the standby control board of its chassis.
  • A competition message sent by a board may be lost. With the competition method used in this embodiment, a board that judges that it has won a competition should therefore continue to send competition messages, for example at intervals of 1 second (other intervals can be used as needed).
  • The competition principle may include: preferentially determining the candidate board that is the main control board of its chassis as the winner; if both compared boards are the main control boards of their respective chassis, either one of the two is selected as the winner, or the one with the smaller (or larger) physical position is selected as the winner.
  • Each chassis is pre-configured with a main control board and a standby control board.
  • The hardware of the main control board is configured to support management by hardware signals, so the main control board can control the boards in its chassis by hardware signals (for example, to perform reset operations). A candidate board that can communicate by hardware signals should therefore be considered more suitable as the master node.
  • Because the main control board needs to control other boards through hardware signals, the requirements on the main control board are higher. The standby control board of a chassis may be able to control the other boards in the chassis through both hardware signals and logic signals, or only through logic signals (in the latter case, the board sends logic signals to boards that can be managed by hardware signals, and those boards then send the corresponding hardware signals to the other boards). In addition, the performance of the standby control board in a chassis is not higher than that
  • For example, when board A in chassis 1 receives a competition message from board B in chassis 2, if board A is the main control board of chassis 1 and board B is the standby control board of chassis 2, board A compares its own hardware information with the hardware information of board B carried in the competition message. Since board A is a main control board, it judges that it has won the competition and continues to send competition messages.
  • If board A and board B are both the main control boards of their respective chassis, board A can determine whether it wins according to a predetermined principle; for example, the board with the smaller (or larger) address can be determined as the winner.
  • Board B belongs to chassis 2 and its address is 2XXX, so the address of board A is the smaller one.
  • Since board A judges that it wins, it continues to send competition messages.
  • Determining the party with the smaller address as the winner is only one specific example. In practice, other comparison methods can be used, as long as a unique candidate board can be determined among the plurality of candidate boards. All boards should apply the same principle when judging, to avoid multiple master nodes existing at the same time.
  • During the competition, a candidate board can keep sending competition messages at a predetermined interval. Once a candidate board receives a competition message from any other candidate board and judges that it has itself failed, it immediately stops sending competition messages to the other boards. In this way, after the candidate boards that have not yet failed have sent competition messages multiple times, there will be a unique candidate board that has not failed in the competition.
  • Specifically, a threshold for the number of transmissions and a time interval for sending competition messages may be set. After a board has started in its chassis and entered a steady state, each candidate board sends competition messages to the other boards at that interval (once a board fails in the competition, it immediately stops sending). When the number of times a candidate board has broadcast or multicast the competition message to all other candidate boards reaches the threshold and it no longer receives competition messages from other boards, that board may be considered the only candidate board that has not failed, and it can then be determined as the master node.
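  • The competition rounds can be illustrated with a small simulation. This is a sketch under simplifying assumptions: message delivery is lossless and synchronous, every sender reaches every other board in each round, the smaller physical position wins ties, and the `Board`, `beats`, and `elect` names as well as the default threshold of 3 are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Board:
    name: str
    chassis_id: int
    slot_id: int
    is_main_control: bool
    competing: bool = True  # False once the board has lost a comparison
    sent: int = 0           # number of competition messages sent so far


def beats(a: Board, b: Board) -> bool:
    """Main control board wins; otherwise the smaller position wins."""
    if a.is_main_control != b.is_main_control:
        return a.is_main_control
    return (a.chassis_id, a.slot_id) < (b.chassis_id, b.slot_id)


def elect(boards: List[Board], send_threshold: int = 3,
          max_rounds: int = 1000) -> Optional[Board]:
    """Run sending rounds until one board reaches the threshold unopposed."""
    for _ in range(max_rounds):
        senders = [b for b in boards if b.competing]
        if not senders:
            return None
        for s in senders:
            s.sent += 1  # each competing board broadcasts once per round
        for receiver in senders:
            # A board that loses any comparison stops sending.
            if any(beats(s, receiver) for s in senders if s is not receiver):
                receiver.competing = False
        # Master: sent enough messages and heard from nobody this round.
        if len(senders) == 1 and senders[0].sent >= send_threshold:
            return senders[0]
    return None  # no decision, e.g. two boards with identical positions
```

  • With real, lossy links the losers may need several rounds to hear the eventual winner, which is why the threshold on the number of transmissions matters.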
  • When the master node determines one of the other candidate boards as the standby node, the master node can notify the other candidate boards to report their hardware information and, according to the reported hardware information, determine a candidate board located in a different chassis from the master node as the standby node.
  • If a cluster router system limits the selection of the master node and the standby node to the same chassis, hardware signals can be used to monitor the master node and the standby node, which reduces the complexity of monitoring. However, if that chassis becomes abnormal, the normal operation of the whole cluster system is affected.
  • If there are multiple candidate boards that are located in chassis different from that of the master node and that are the main control boards of their respective chassis, one of them is randomly selected as the standby node, or the one with the smallest or largest physical position is determined as the standby node. Other selection methods can also be used, as long as one board can be uniquely determined from the multiple candidate boards.
  • A competition flag can be set for a board that qualifies as a candidate board. If the competition flag is set to true in advance, the candidate board is allowed to become the master node by sending competition messages; if the flag is set to false, the candidate board is prohibited from becoming the master node by sending competition messages.
  • The competition flag may be preset before the board sends competition messages. It may also be changed during the competition according to whether the board has failed, so that in the end only a single candidate board that has won the competition remains, and that board can be determined as the master node according to the competition flag. For example, for each candidate board, if the candidate board determines that it has failed in the competition with any other candidate board, its competition flag is set to false, which indicates that the candidate board is prohibited from becoming the master node by sending competition messages. After competition messages have been sent and compared multiple times, only one candidate board whose competition flag is true may remain, and that board can be directly determined as the master node. Alternatively, the competition flag may be left unset, in which case the main control board and the standby control board of every chassis participate in the competition by default.
  • The operator or the system can specify in advance which candidate boards have their competition flag set to true. The determination can also be based on historical data, for example by preferentially setting to true the competition flag of a candidate board that was once the master node.
  • The master node can send a service allocation request to the main control boards of the multiple chassis and receive the hardware information and load information returned by each of them. In this way, the master node can determine the main service node and the standby service node from the main control boards of the multiple chassis according to the reported hardware information and load information.
  • Specifically, the master node first determines the main control board with the smallest load, according to the load information, as the main service node, and then, based on the hardware information, determines a main control board located in a different chassis from the main service node as the standby service node.
  • The master node sends a keep-alive message to the standby node at a predetermined period; if the master node meets the switching condition, the master node and the standby node are switched. The switching condition includes: the standby node does not receive a keep-alive message from the master node within a predetermined time period, and the hardware working state of the master node is abnormal (the standby node can obtain the hardware working state of the master node through the partner node of the master node). If the standby node learns through the partner node that the hardware working state of the master node is normal, it may notify the partner node by a logic signal, and the partner node instructs the master node to reset. In this way, when the master node fails, the standby node is switched to become the master node, and a main control board that is not in the local chassis is selected as the new standby node, thus implementing 1+1
  • Alternatively, a board may be configured to stop sending competition messages only after the number of times it has failed reaches a certain threshold; this threshold should be less than the threshold for the number of message transmissions that is referenced when determining the master node.
  • the method for determining the standby control node, determining the primary service node, determining the standby service node, and monitoring the master control node may refer to the description of the previous example.
  • The difference between this example and example 1 is as follows.
  • When determining the master node, if a board fails in a competition, it records the candidate board that won that competition and continues to send competition messages. When the number of messages sent by the board reaches the predetermined number, each board counts the other candidate boards that won against it. From the statistics of each board, all candidate boards can be ranked. One of the top-ranked candidate boards can then be selected as the master node, and the selection can be random.
  • The setting of the competition flag can be further optimized. For example, once the ranking of all candidate boards is obtained, it can be recorded; the next time a master node needs to be determined, the competition flags of the top-ranked boards can preferentially be set to true.
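  • The ranking variant can be sketched as a tally of recorded wins. The text does not specify how the per-board statistics are combined, so this example assumes they are pooled into a list of (winner, loser) pairs; `rank_by_wins`, `pick_master`, and the `top_k` parameter are hypothetical names.

```python
import random
from collections import Counter
from typing import Iterable, List, Optional, Tuple


def rank_by_wins(results: Iterable[Tuple[str, str]]) -> List[str]:
    """Rank boards by number of recorded wins, most wins first."""
    wins: Counter = Counter()
    for winner, loser in results:
        wins[winner] += 1
        wins[loser] += 0  # make sure boards without wins still appear
    # Assumption: ties are broken by board name to keep the order stable.
    return sorted(wins, key=lambda name: (-wins[name], name))


def pick_master(ranking: List[str], top_k: int = 2,
                rng: Optional[random.Random] = None) -> str:
    """Randomly pick the master node among the top-ranked boards."""
    rng = rng or random.Random()
    return rng.choice(ranking[:top_k])
```

  • The recorded ranking could also be kept for the next election, for instance to decide which boards get their competition flag set to true.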
  • step 21 needs to be performed first, that is, when the system starts, the competition process of entering the master node is performed.
  • step 22 When the main control board is started, the competition flag is read from the configuration information, and the main control board can perform the competition of the main control node only if the competition flag is true.
  • the competition flag can be configured by the operator at startup, which gives the operator an opportunity to specify the location of the master node in the cluster router system; and the competition flag can be set or modified during operation.
  • the main control board When the main control board synchronizes all the data of the master node and can take over the function of the master node at any time, the content flag of the main control board can be set to true. In this way, through the judgment of the competition sign, it is guaranteed that the main control board of the main control node can complete the main control function.
  • step 23 is performed; if yes, step 24 is performed; step 23, the main control board will abandon the competition master. The control node waits for the main control node to perform service allocation on the board, and ends the current process;
  • Step 24: Each board that satisfies the competition condition periodically sends a master node competition message. The message carries the board's hardware information, including its physical location and its hardware active/standby state; this information is the basis for judging the master node competition.
  • Step 25: Each main control board determines whether it has received a competition message from another main control board. A main control board that has not yet lost compares itself against the hardware information carried in each received message; if it wins, it can continue sending competition messages. A main control board that has lost no longer sends competition messages and may also refrain from processing the competition messages sent by other main control boards.
  • If the board has received a competition message from another board, go to step 26; otherwise go to step 29.
  • Step 26: Based on the competition message sent by the other board, a board that is the main control board of its own chassis (that is, the board whose hardware signal marks it as main) wins first. If both boards are main control boards, or both are standby control boards, the board with the smaller physical address wins. If this board wins, proceed to step 28; if it loses, go to step 23.
  • Step 28: Continue sending competition messages and return to step 25 to check again whether competition messages from other boards are received.
  • Step 29: If the board has not lost the competition and receives no competition messages from other boards, it is determined to be the master node. The master node then collects the hardware information (for example, the physical address) of the other main control boards and performs step 30.
  • Step 30: Based on the reported physical addresses, select a main control board that is not in the master node's own chassis as the standby control node.
  • In step 29, if a board has sent the master node competition message 60 consecutive times and no longer receives competition messages from other boards, it has won the master node competition among the main control boards that are currently running normally, and it can enter the master node processing flow.
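The comparison in step 26 and its outcome can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: the names `HardwareInfo`, `beats`, and `elect_master`, and the encoding of the physical address as a (chassis, slot) pair, are hypothetical. Message exchange and the 60-message threshold are omitted; the sketch only shows which board would remain unbeaten.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class HardwareInfo:
    chassis: int    # chassis number (hypothetical encoding)
    slot: int       # slot inside the chassis (hypothetical encoding)
    hw_main: bool   # True if the hardware signal marks the board as main in its chassis

    @property
    def physical_address(self):
        # Assumption: addresses compare as (chassis, slot) pairs.
        return (self.chassis, self.slot)


def beats(local: HardwareInfo, remote: HardwareInfo) -> bool:
    """Competition rule: hardware-main boards win first, then the smaller address."""
    if local.hw_main != remote.hw_main:
        return local.hw_main
    return local.physical_address < remote.physical_address


def elect_master(boards: List[HardwareInfo]) -> Optional[HardwareInfo]:
    """Return the board that never loses, i.e. the one that beats every other board."""
    for candidate in boards:
        if all(beats(candidate, other) for other in boards if other != candidate):
            return candidate
    return None  # no boards took part in the competition
```

Because the rule defines a strict ordering whenever physical addresses are unique, exactly one board ends up unbeaten, which matches the requirement that the master node be unique.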
  • The master node continuously sends master node announcement messages, so that a main control board that starts later does not need to compete for the master node once it receives an announcement; it simply waits for the master node to allocate services to it. At the same time, the message sent by the master node also serves as a kind of keep-alive message.
  • The master node can sort the hardware information collected within 5 seconds (or 10 seconds, etc.) by priority and select the standby control node according to the following priority principles: a board outside the master node's chassis, a board whose hardware signal marks it as main, and a board with a smaller physical location. If no information reported by other main control boards is received within 5 seconds, no other main control board in the system is currently running normally; in that case the master node restarts the collection of hardware information after 1 minute (or 2 minutes, half a minute, etc.) and repeats this until a qualified standby control node is produced. The system then enters the 1+1 backup state for the control nodes.
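The selection of the standby control node can be sketched as a filter followed by a priority sort. The code below is an illustrative assumption: the names `Board` and `select_standby_control` are hypothetical, the "different chassis" principle is treated as a hard requirement, and the physical location is modeled as a (chassis, slot) pair.

```python
from typing import List, NamedTuple, Optional


class Board(NamedTuple):
    chassis: int    # chassis number (hypothetical encoding)
    slot: int       # slot inside the chassis (hypothetical encoding)
    hw_main: bool   # True if the hardware signal marks the board as main


def select_standby_control(master: Board, reports: List[Board]) -> Optional[Board]:
    """Pick the standby control node from the hardware information reported so far."""
    # Assumption: only boards outside the master node's chassis qualify.
    eligible = [b for b in reports if b.chassis != master.chassis]
    if not eligible:
        return None  # caller waits (1 minute in the text) and collects again
    # Prefer hardware-main boards, then the smaller physical location.
    return min(eligible, key=lambda b: (not b.hw_main, b.chassis, b.slot))
```

Returning `None` leaves the retry policy to the caller, which corresponds to the master node restarting collection after a delay.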
  • The standby control node carries no control services, but all data configuration of the master node is synchronized to it, so when the master node fails the standby control node can take over immediately.
  • Next, the service allocation process is entered.
  • When selecting the service nodes, the master node sends a request to all boards (these can be the boards that took part in the competition; in Figure 3 they are board C and board D). On receiving the request, board C and board D return their hardware information and load information to the master node. After collecting this information, the master node determines the board with the lowest load to be the active service node. If several main control boards share the lowest load, the choice is made by preferring a board whose hardware signal marks it as main and then a board with a smaller physical location; the judgment is similar to the selection of the standby control node described above.
  • The standby service node is preferably the board with the lowest load among the boards in chassis other than the one containing the active service node. If these conditions do not determine a unique board, the standby service node can be selected by physical location.
  • The master node needs to set a 1-minute timer (other durations are also possible); when the timer expires, it resends the service node allocation request to collect information again.
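A load-based choice of the active and standby service nodes might look like the sketch below. It is only an illustration: `Report` and `select_service_nodes` are hypothetical names, the load is modeled as a single number, and the input is assumed to contain only boards that are eligible to carry services.

```python
from typing import List, NamedTuple, Optional, Tuple


class Report(NamedTuple):
    name: str       # board label, e.g. "C" or "D" (illustrative)
    chassis: int    # chassis number (hypothetical encoding)
    slot: int       # slot inside the chassis (hypothetical encoding)
    hw_main: bool   # True if the hardware signal marks the board as main
    load: float     # reported load (assumed to be a single comparable number)


def select_service_nodes(
    reports: List[Report],
) -> Tuple[Optional[Report], Optional[Report]]:
    """Return (active, standby) service nodes chosen from the collected reports."""
    if not reports:
        return None, None
    # Active: lowest load; ties broken by hardware-main status, then location.
    active = min(reports, key=lambda r: (r.load, not r.hw_main, r.chassis, r.slot))
    # Standby: lowest load among boards in a different chassis, then location.
    others = [r for r in reports if r.chassis != active.chassis]
    standby = min(others, key=lambda r: (r.load, r.chassis, r.slot)) if others else None
    return active, standby
```

If either result is `None`, the caller could start the timer mentioned above and resend the allocation request once it expires.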
  • In this way, an active/standby node configuration such as that shown in Fig. 4 can be obtained.
  • The master node and the standby service node are located in chassis X. The standby control node is located in another chassis, which also contains that chassis's own main control board (carrying no service). The active service node is located in a chassis other than chassis X, which likewise contains its own main control board (carrying no service).
  • The standby control node and the active and standby service nodes need to periodically send keep-alive messages to the master node, and the master node can also periodically send keep-alive messages to the standby control node. Monitoring therefore falls into the following two types.
  • If the master node receives no keep-alive message from a board for 10 seconds, it considers that board to have failed and can delete the failed board's data on the master node. If the failed board is the standby control node or the standby service node, the master node needs to run the corresponding standby node selection process again. If the failed board is the active service node, the master node first notifies the standby service node to initiate an active/standby switchover so that the standby service node takes over the service, and then starts the standby service node selection process.
  • The standby control node periodically receives the keep-alive messages sent by the master node. If no keep-alive message from the master node is received within 5 seconds, the master node may be in a fault state.
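The two kinds of keep-alive monitoring differ only in who watches whom and in the timeout (10 seconds on the master node, 5 seconds on the standby control node). The sketch below illustrates this under assumptions: `KeepAliveMonitor` and `actions_on_failure` are hypothetical names, role labels are plain strings, and timestamps are passed in explicitly so that the behavior is deterministic.

```python
from typing import Dict, List


class KeepAliveMonitor:
    """Tracks the last keep-alive time per peer and reports peers that timed out."""

    def __init__(self, timeout: float) -> None:
        self.timeout = timeout
        self.last_seen: Dict[str, float] = {}

    def record(self, peer: str, now: float) -> None:
        # Called whenever a keep-alive message from `peer` arrives.
        self.last_seen[peer] = now

    def expired(self, now: float) -> List[str]:
        # A peer has failed if nothing was heard from it for `timeout` seconds.
        return sorted(p for p, t in self.last_seen.items() if now - t >= self.timeout)


def actions_on_failure(role: str) -> List[str]:
    """Master-side reaction to a failed board, following the text above."""
    if role == "active-service":
        return ["delete board data",
                "notify standby service node to switch over",
                "reselect standby service node"]
    if role in ("standby-control", "standby-service"):
        return ["delete board data", "reselect " + role + " node"]
    return ["delete board data"]
```

The master node would use `KeepAliveMonitor(10.0)` for the boards it supervises, while the standby control node would use `KeepAliveMonitor(5.0)` for the master node alone.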
  • The master node is the only control point of the system; if two master nodes appeared in the system at the same time, the system's processing flow would become confused. The standby control node therefore monitors the master node through logic signals (because the master node and the standby control node are located in different chassis, they can only communicate with each other through logic signals). After detecting that the master node is abnormal, it also relies on hardware signals to make sure that the master node is no longer working.
  • The standby control node queries the partner board of the master node (for example, the other board in the same chassis as the master node) for the status of the master node. The partner board and the master node are connected by an active/standby signal line, so the partner board can obtain the running status of the master node. If the partner board finds through the hardware signal line that the master node is running abnormally, it can notify the standby control node to perform the switchover immediately. If the partner board finds through the hardware signal line that the master node is still running normally, it reports this status to the standby control node, which then rechecks the master node. If a keep-alive message from the master node is still not received within 1 minute, then although the master node's hardware signal is normal, its software flow is already in a fault state. The standby control node then notifies the partner board to reset the master node through the hardware signal, and after the reset completes the partner board reports the reset result to the standby control node. After receiving the partner board's response that the master node is abnormal, the standby control node initiates the switchover, takes over control of the system, enters the master node processing flow, becomes the master node, and selects a new standby control node.
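The switchover logic on the standby control node can be summarized as a small decision function. This is a hedged sketch: the enum `Decision`, the function `standby_decision`, and its boolean inputs are hypothetical abstractions of the keep-alive timeout, the partner board's hardware-signal check, and the 1-minute recheck.

```python
from enum import Enum


class Decision(Enum):
    NO_ACTION = "stay standby"
    SWITCH_NOW = "switch over immediately"
    RESET_THEN_SWITCH = "ask partner board to reset master, then switch over"


def standby_decision(
    keepalive_timed_out: bool,        # no keep-alive from the master within 5 seconds
    hw_signal_normal: bool,           # partner board's view of the master via the signal line
    keepalive_seen_on_recheck: bool,  # keep-alive received again within the 1-minute recheck
) -> Decision:
    """Decide what the standby control node does, following the flow described above."""
    if not keepalive_timed_out:
        return Decision.NO_ACTION
    if not hw_signal_normal:
        # Hardware confirms the fault, so takeover is safe right away.
        return Decision.SWITCH_NOW
    if keepalive_seen_on_recheck:
        # The master recovered; switching now could create two master nodes.
        return Decision.NO_ACTION
    # Hardware looks fine but the software is silent: force a reset first.
    return Decision.RESET_THEN_SWITCH
```

Requiring a hardware-level confirmation or reset before taking over is what keeps the system from ever running with two master nodes at once.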
  • In this way, the multi-chassis, multi-main-control-board architecture of the cluster router system can be fully exploited: the master node, the standby control node, the active service node, and the standby service node are determined across multiple chassis, the master node is allowed to run in any chassis, and boards in an active/standby relationship are prevented from being located in the same chassis. This improves the reliability and throughput of the system while ensuring the performance of the master node and the active service node. The master node and the standby control node can also be monitored by means of hardware signals and logic signals.
  • According to an embodiment of the present invention, a board is also provided. As shown in FIG. 6, the board includes:
  • a sending module 61, configured to send competition messages to other boards, where each competition message carries the hardware information of the board;
  • a receiving module 62, configured to receive competition messages from other boards while the sending module 61 is sending competition messages to them;
  • a first determining module 63, connected to the sending module 61 and the receiving module 62, and configured to compare the hardware information of the board with the hardware information in the competition messages that the receiving module 62 receives from other boards, and to determine the winning board according to the competition principle, where the competition principle is used to determine, from the result of comparing hardware information, the board that is more suitable as the master node;
  • a control module 64, connected to the sending module 61, and configured to make the sending module 61 stop sending competition messages if the first determining module 63 determines that the board has lost the competition;
  • a second determining module 65, connected to the sending module 61 and the receiving module 62, and configured to determine one of the other candidate boards to be the standby control node when the board is the master node.
  • The hardware information includes the physical location information of the board and its active/standby status information within its chassis. The active/standby status information indicates whether the board is the main control board or the standby control board of its chassis.
  • The board according to the embodiment of the present invention may further include a notification module (not shown), configured to notify the other boards to report their hardware information after the board has been determined to be the master node;
  • in that case the second determining module 65 is specifically configured to determine, based on the hardware information reported by the other boards, a candidate board located in a different chassis from the master node to be the standby control node.
  • The sending module 61 can also send competition messages by broadcast or multicast and can send keep-alive messages; the receiving module 62 can also receive keep-alive messages from other boards as well as the load information and hardware information they report.
  • The service node allocation request may be sent by the sending module 61 or by the notification module. If necessary, the notification module and the sending module 61 may be combined into one module.
  • The active service node and the standby service node may be selected by the first determining module 63 and the second determining module 65 respectively, or both may be selected by one of them. If necessary, the first determining module 63 and the second determining module 65 may be combined into one module.
  • The control module 64 can also be configured to notify the sending module to stop sending competition messages once it determines that the number of competitions the board has lost has reached a predetermined value.
  • In summary, the multi-chassis, multi-main-control-board architecture of the cluster router system can be fully exploited: the master node, the standby control node, the active service node, and the standby service node are determined across multiple chassis, the master node is allowed to run in any chassis, and boards in an active/standby relationship are prevented from being located in the same chassis. This effectively improves system reliability and throughput while ensuring the performance of the master node and the active service node. In addition, the master node and the standby control node can be monitored by means of hardware signals and logic signals.

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Hardware Redundancy (AREA)

Abstract

The present invention discloses a method for implementing a main/standby configuration of board cards, and a board card. The method includes: for multiple candidate board cards among multiple board cards, during the competition each candidate board card sends a competition message to the other candidate board cards and receives competition messages from them, where the competition message sent by each candidate board card carries that board card's hardware information; each candidate board card compares its hardware information with the hardware information in the received competition messages and determines the winning candidate board card according to competition principles; a unique main control node is determined according to the competition results of the multiple candidate board cards, and the main control node determines one of the other candidate board cards to be the standby control node. This solution addresses the problem that, in current cluster router systems, the mechanism for choosing the main control node is unreasonable and the performance of the chosen main control node is not guaranteed.

Description

Method for Implementing an Active/Standby Configuration of Boards, and Board

Technical Field

The present invention relates to the field of communications, and in particular to a method for implementing an active/standby configuration of boards, and to a board.

Background
With the rapid development of router technology, routing capacity has increased substantially, mainly through a larger number of board slots, higher switching capacity per slot, and higher port density per board.
At present, the variety of routers is growing and the functional division among router types is becoming more pronounced. Among them, core-layer routers, for example, need fast forwarding, high security, high stability, and sufficiently large capacity. Although router technology keeps improving, Internet traffic always grows noticeably faster than device capacity is upgraded. Expanding a single router requires considering many factors, such as the maturity of optical components, power supply capacity, heat dissipation, and the load-bearing capacity of the equipment room. Because of these limits, single-chassis router development is gradually approaching its ceiling, and the emergence of cluster routers solves the router expansion problem well.
Through centralized, integrated control and management, cluster technology lets the individual routers in a cluster system work together well. It thereby breaks through the limits of single-chassis routers in switching capacity, power consumption, and heat dissipation, and allows a single-chassis router system to be expanded smoothly into a routing and switching system of larger capacity.
Although cluster routers have an advantage in capacity, they are comparatively complex. Moreover, applications of cluster router systems place high demands on non-stop online upgrade, disaster tolerance, automatic fault monitoring, and alarm functions, which makes cluster router systems harder to implement.
In a cluster router system, the routers are assigned different roles; typically the system contains a master node and a standby control node. The master node is the centralized control point of the cluster router system and mainly handles version loading for the whole system, user operation and maintenance, and the system's centralized computation and decision making. For a cluster router system, the selection and monitoring of the master node directly affect the reliability and availability of the whole system. Given its importance, the master node must therefore be designed in a 1+1 active/standby arrangement.
Usually, when a master node is selected in a cluster router system, the same selection method as in a single-chassis router system can be used; that is, the main control board of one particular chassis is fixed as the master node. With this method the software flow of the single-chassis system can be reused directly, so implementation is relatively simple. However, the whole cluster router system then depends entirely on the chassis containing the selected master node, and there is no guarantee that the selected master node meets the performance required of a master node. This affects the normal operation of the whole cluster router system and results in poor reliability.
No effective solution has yet been proposed for the problem that, in current cluster router systems, the master node selection mechanism is unreasonable and the performance of the selected master node is not guaranteed.

Summary of the Invention

To address the problem that, in current cluster router systems, the master node selection mechanism is unreasonable and the performance of the selected master node is not guaranteed, the present invention provides a method for implementing an active/standby configuration of boards, and a board, which can reasonably select the master node with the best performance.
To solve the above technical problem, the technical solution of the present invention is implemented as follows:
According to one aspect of the present invention, a method for implementing an active/standby configuration of boards is provided, for determining a master node and a standby control node from among multiple boards located in multiple chassis.
The method according to the present invention includes: for multiple candidate boards among the multiple boards, each candidate board sends competition messages to the other candidate boards, where the competition message sent by each candidate board carries that board's hardware information; during the competition, each candidate board receives competition messages from the other candidate boards; each candidate board compares its own hardware information with the hardware information in the received competition messages and determines the winning candidate board according to a competition principle, where the competition principle is used to determine, from the result of comparing hardware information, the candidate board that is to act as the master node; a unique master node is determined according to the competition results of the multiple candidate boards, and the master node determines one of the other candidate boards to be the standby control node.
Determining the unique master node according to the competition results of the multiple candidate boards includes: determining the only candidate board among them that has never lost a competition to be the master node.
In addition, the method further includes: during the competition, a candidate board that loses stops sending competition messages to the other candidate boards.
The method may further include: setting a competition flag for each candidate board in advance and setting it to true, which indicates that the candidate board is allowed to compete to become the master node by sending competition messages; and, for each candidate board, if the board determines that it has lost a competition with any other candidate board, setting its competition flag to false, which indicates that the board is prohibited from competing to become the master node by sending competition messages. Furthermore, determining the only candidate board that has never lost a competition to be the master node includes: when the number of times a candidate board has broadcast or multicast competition messages to all other candidate boards reaches a predetermined threshold, if that board no longer receives competition messages from other candidate boards, determining it to be the master node.
Preferably, the hardware information includes the board's physical location information and its active/standby status information within its chassis, where the active/standby status information indicates whether the board is the main control board or the standby control board of its chassis.
Moreover, the competition principle includes: a candidate board that is the main control board of its own chassis is preferentially determined to be the winner; if both candidate boards being compared are main control boards of their respective chassis, the one with the smaller, or alternatively the larger, physical location is determined to be the winner.
The master node determining one of the other candidate boards to be the standby control node includes: the master node notifies the other candidate boards to report their hardware information and, based on the reported hardware information, determines a candidate board located in a different chassis from the master node to be the standby control node.
Preferably, determining, based on the reported hardware information, a candidate board located in a different chassis from the master node to be the standby control node further includes: if there are multiple candidate boards that are located in a different chassis from the master node and are main control boards of their respective chassis, either one of them is selected at random as the standby control node, or the one with the smallest or largest physical location is determined to be the standby control node.
In addition, after the master node and the standby control node are determined, the method further includes: the master node sends a service allocation request to the main control boards of the multiple chassis and receives the hardware information and load information returned by each of them; based on the reported hardware information and load information, the master node determines an active service node and a standby service node from among the main control boards of the multiple chassis. Determining the active and standby service nodes in this way includes: the master node, based on the load information, preferentially determines the main control board with the lowest load to be the active service node, and determines one of the main control boards among the candidate boards, other than the master node, the standby control node, and the active service node, to be the standby service node.
Determining one of the main control boards among the candidate boards, other than the master node, the standby control node, and the active service node, to be the standby service node includes: determining a main control board located in a different chassis from the active service node to be the standby service node.
In addition, after the master node and the standby control node are determined, the method further includes: the master node sends keep-alive messages to the standby control node at a predetermined period; if the master node satisfies a switchover condition, the master node and the standby control node are switched over, where the switchover condition includes: the standby control node has not received a keep-alive message from the master node within a predetermined time period, and the standby control node has learned, through another board in the same chassis as the master node, that the working state of the master node is abnormal.
Optionally, the candidate boards may include some or all of the main control boards and standby control boards of the multiple chassis.
According to another aspect of the present invention, a board is provided.
The board includes: a sending module, configured to send competition messages to other boards, where each competition message carries the hardware information of the board; a receiving module, configured to receive competition messages from other boards while the sending module is sending competition messages to them; a first determining module, configured to compare the hardware information of its own board with the hardware information in the competition messages that the receiving module receives from other boards, and to determine the winning board according to a competition principle, where the competition principle is used to determine, from the result of comparing hardware information, the board that is to act as the master node; a control module, configured to make the sending module stop sending competition messages if the result is that the board containing the first determining module has lost the competition; and a second determining module, configured to determine one of the other candidate boards to be the standby control node when its own board is the master node.
The hardware information includes the board's physical location information and its active/standby status information within its chassis, where the active/standby status information indicates whether the board is the main control board or the standby control board of its chassis.
The board may further include: a notification module, configured to notify the other boards to report their hardware information after its own board has been determined to be the master node; in that case the second determining module is specifically configured to determine, based on the hardware information reported by the other boards, a candidate board located in a different chassis from the master node to be the standby control node.
本发明通过使多个能够作为主控节点的备选板卡进行竟争选择主控节点的方式确 定主控节点, 能够使主控节点的确定范围不局限于某个固定的机框内, 并且能够使保证 最终确定的主控节点的性能,避免了相关技术中只能够在某个固定机框中确定主控节点 导致的系统可靠性差以及确定过程不合理的问题。  The present invention determines the master node by making a plurality of candidate boards capable of being used as the master node to select the master node, so that the determination range of the master node is not limited to a fixed chassis, and The performance of the finalized control node can be ensured, and the problem that the system reliability caused by the master node can be determined in a certain fixed machine frame and the determination process is unreasonable can be avoided.
Further, the above technical solution makes full use of the multi-chassis, multi-main-control-board architecture of a cluster router system. The master node, the standby node, the active service node and the standby service node are determined across multiple chassis, the master node is allowed to run in any chassis, and boards that stand in an active/standby relationship are kept out of the same chassis. This effectively improves system reliability and throughput while guaranteeing the performance of the master node and the active service node. In addition, the master node and the standby node can be monitored by means of hardware signals and logic signals.
Brief Description of the Drawings
FIG. 1 is a schematic flowchart of a method for implementing an active/standby configuration of boards according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of the processing for determining the master node in the method for implementing an active/standby configuration of boards according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of the processing for determining the active service node and the standby service node in the method for implementing an active/standby configuration of boards according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of the distribution across chassis of the nodes that stand in an active/standby relationship, as determined by the method for implementing an active/standby configuration of boards according to an embodiment of the present invention;
FIG. 5 is a schematic diagram of the processing for monitoring the master node and the standby node in the method for implementing an active/standby configuration of boards according to an embodiment of the present invention;
FIG. 6 is a schematic structural diagram of a board according to an embodiment of the present invention.
Detailed Description
In current cluster router systems, the mechanism for selecting the master node is unreasonable and the performance of the selected master node is not guaranteed. To address this, the present invention proposes that multiple candidate boards contend across multiple chassis and that the master node be determined from the result of the contention. The selection of the master node is then not confined to one fixed chassis, and the performance of the finally determined master node can be guaranteed.
Embodiments of the present invention are described in detail below with reference to the accompanying drawings.
According to an embodiment of the present invention, a method for implementing an active/standby configuration of boards is provided. The method is used to determine a master node and a standby node from among multiple boards located in multiple chassis.
FIG. 1 is a schematic flowchart of the method for implementing an active/standby configuration of boards according to an embodiment of the present invention. As shown in FIG. 1, the method includes:
Step 101: among the multiple boards there are multiple candidate boards (in this embodiment, a candidate board should perform well enough to meet the performance requirements of a master node). Each candidate board sends contention messages to the other candidate boards, and the contention message sent by each candidate board carries the hardware information of that candidate board;
Step 103: during the contention, each candidate board receives contention messages from the other candidate boards;
Step 105: each candidate board compares its own hardware information with the hardware information in each received contention message and determines the winning candidate board according to the contention principle, where the contention principle is used to determine, from the result of comparing hardware information, the candidate board that serves as the master node. (Each comparison involves only two candidate boards, so the winner of a comparison is not necessarily the candidate board that is finally determined as the master node. For a single comparison, the purpose of the contention principle is in effect to determine which of the two candidate boards being compared is better suited to serve as the master node.);
Step 107: a unique master node is determined according to the outcome of the contention among the multiple candidate boards, and the master node determines one of the other candidate boards as the standby node.
With the above processing, the master node is determined by having multiple candidate boards, each capable of serving as the master node, contend for the role. The selection of the master node is therefore not confined to one fixed chassis, and the performance of the finally determined master node can be guaranteed. This avoids the problems of the related art, in which the master node can only be determined within one fixed chassis, leading to poor system reliability and an unreasonable determination process.
Example 1
Preferably, the candidate boards include all normally running main control boards and standby control boards of the multiple chassis. They may also be only a subset of these boards, but every candidate board should have the function or capability of controlling the other boards in its own chassis.
The hardware information includes the physical location information of the board (for example, the identifier of the chassis in which the board is located, the identifier of the board's position within the chassis, and the board's address information) and the active/standby state information of the board within its chassis, where the active/standby state information indicates whether the board is the main control board or the standby control board of its chassis.
Because a contention message sent by a board may be lost, and given the contention scheme used in this embodiment, a board that judges itself to have won a contention should continue to send contention messages, for example at intervals of 1 second (other intervals may be used as needed).
The contention principle may include the following: a candidate board that is the main control board of its own chassis is preferentially determined as the winner; if both boards being compared are main control boards of their respective chassis, either one of the two is selected at random as the winner, or the candidate board with the smaller (or larger) physical location is determined as the winner.
Typically, each chassis is preconfigured with one main control board and one standby control board. For reasons of reliability and management efficiency, some functions need to be added to the main control board (including configuring the board's hardware so that it can perform management through hardware signals), so that the main control board can control the boards in its own chassis through hardware signals (for example, reset them). A candidate board that communicates using hardware signals should therefore be regarded as better suited to serve as the master node. Because the main control board must control other boards through hardware signals, the requirements on it are higher. The standby control board of a chassis may either be able to control the other boards in the chassis through both hardware signals and logic signals, or be able to control them through logic signals only. In the latter case, when exercising control, the board can send logic signals to boards that are capable of hardware-signal management, and those boards then send the corresponding hardware signals to the other boards. Moreover, the performance of the standby control board of a chassis is normally no higher than that of the main control board. Selecting a main control board as the master node therefore effectively guarantees the performance of the master node.
For example, suppose board A in chassis 1 receives a contention message from board B in chassis 2, where board A is the main control board of chassis 1 and board B is the standby control board of chassis 2. Board A compares its own hardware information with the hardware information of board B carried in the contention message. Because board A is a main control board, board A judges that it has won and continues to send contention messages.
Now suppose that, in the above example, board A and board B are both main control boards of their respective chassis. Board A can then decide whether it has won according to a predetermined principle, for example that the board with the smaller (or larger) address wins. Board A belongs to chassis 1 and has address 1XXX, and board B belongs to chassis 2 and has address 2XXX, so board A has the smaller address; board A judges that it has won and continues to send contention messages. Treating the smaller address as the winner is only one specific example. Other comparison methods can be used in practice, provided that they single out exactly one candidate board from among the multiple candidate boards, and provided that all boards apply the same principle when judging, so that multiple master nodes never appear at the same time.
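The pairwise comparison in the two examples above can be sketched as follows. This is a minimal illustration only, assuming the "main control board first, then smaller address" variant of the contention principle; the `HardwareInfo` type, its field names, and the concrete address values are hypothetical and do not come from the patent.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HardwareInfo:
    """Hypothetical container for the hardware information in a contention message."""
    chassis_id: int        # chassis in which the board is located
    address: int           # physical address, assumed unique across the system
    is_main_control: bool  # True if the board is the main control board of its chassis


def wins_contention(local: HardwareInfo, remote: HardwareInfo) -> bool:
    """Return True if the local board wins the comparison against the remote board."""
    # Rule 1: a main control board beats a standby control board.
    if local.is_main_control != remote.is_main_control:
        return local.is_main_control
    # Rule 2 (assumed tie-break): the board with the smaller address wins.
    return local.address < remote.address


# Board A: main control board of chassis 1; board B: standby control board of chassis 2.
board_a = HardwareInfo(chassis_id=1, address=0x1000, is_main_control=True)
board_b = HardwareInfo(chassis_id=2, address=0x2000, is_main_control=False)
# Second case: board B is also a main control board, so the addresses decide.
board_b_main = HardwareInfo(chassis_id=2, address=0x2000, is_main_control=True)
```

Because every board applies the same deterministic rule to the same pair of inputs, the two boards in a comparison always agree on which of them won.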
As long as a candidate board has never lost a contention, it may keep sending contention messages at the predetermined interval. Once a candidate board receives a contention message from any other candidate board and judges that it has lost, it immediately stops sending contention messages to other boards. In this way, after the candidate boards that have not yet lost have sent contention messages several times, exactly one candidate board that has never lost remains.
In practice, a threshold on the number of transmissions and an interval between contention messages can be configured, so that every candidate board is able to send contention messages to the other boards during the period from board startup until the chassis reaches a stable state (a board that loses a contention stops sending immediately). When the number of times a candidate board has broadcast or multicast contention messages to all other candidate boards reaches the predetermined threshold, and the candidate board is no longer receiving contention messages from other candidate boards, that board can be regarded as the only candidate board that has never lost, and it can then be determined as the master node.
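The send-until-quiet behaviour described above can be sketched as a small simulation. It is an illustration under stated assumptions, not the patented implementation: the `Candidate` class, the `run_election` helper, and the representation of hardware information as a comparable key (smaller key wins) are all hypothetical, messages are assumed never to be lost, and keys are assumed to be unique.

```python
class Candidate:
    """One candidate board taking part in the master node contention (hypothetical)."""

    def __init__(self, key, send_threshold):
        self.key = key                  # comparable hardware key; smaller wins (assumption)
        self.send_threshold = send_threshold
        self.lost = False               # set once the board loses any comparison
        self.quiet_sends = 0            # messages sent since a rival was last heard
        self.is_master = False

    def on_tick(self):
        """Called once per send interval; returns the key to broadcast, or None."""
        if self.lost or self.is_master:
            return None
        if self.quiet_sends >= self.send_threshold:
            self.is_master = True       # threshold reached and no rival heard
            return None
        self.quiet_sends += 1
        return self.key

    def on_message(self, remote_key):
        """Handle a contention message received from another candidate."""
        if self.lost:
            return                      # a board that has lost may ignore further messages
        if remote_key < self.key:
            self.lost = True            # lost: stop sending from now on
        else:
            self.quiet_sends = 0        # a rival is still contending


def run_election(keys, send_threshold=3, max_ticks=1000):
    """Simulate lossless broadcast rounds; return the master's key, or None."""
    boards = [Candidate(key, send_threshold) for key in keys]
    for _ in range(max_ticks):
        outgoing = [(board, board.on_tick()) for board in boards]
        masters = [board for board in boards if board.is_master]
        if masters:
            return masters[0].key
        for sender, key in outgoing:
            if key is None:
                continue
            for receiver in boards:
                if receiver is not sender:
                    receiver.on_message(key)
    return None
```

Here a key such as `(0, 0x1000)` stands for "main control board, address 0x1000" and `(1, 0x2000)` for "standby control board, address 0x2000", so tuple ordering reproduces the priority described in the text. The `max_ticks` guard is only there so that the simulation terminates if the uniqueness assumption is violated.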
When the master node determines one of the other candidate boards as the standby node, it can notify the other candidate boards to report their respective hardware information and, according to the reported hardware information, determine a candidate board located in a different chassis from the master node as the standby node.
In the related art, a cluster router system restricts the selection of the master node and the standby node to the same chassis. Hardware signals can then be used to monitor the master node and the standby node, which reduces the complexity of monitoring. However, if that chassis becomes abnormal, the normal operation of the entire cluster system is affected.
To avoid this problem, in the embodiment of the present invention the master node can notify the other candidate boards to report their respective hardware information and, according to the reported hardware information, determine a candidate board located in a different chassis from the master node as the standby node. If several candidate boards are located in a different chassis from the master node and are main control boards of their respective chassis, either one of them is selected at random as the standby node, or the one with the smallest (or largest) physical location is determined as the standby node. Other selection methods may also be used when choosing the standby node from among multiple candidate boards, provided that they single out exactly one board.
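A minimal sketch of the standby node selection described above, assuming the deterministic variant (different chassis, then main control boards first, then smallest address). The report format and the `select_standby` function are hypothetical names introduced for illustration.

```python
def select_standby(master_chassis, reports):
    """Pick the standby node from reported hardware information.

    `reports` is a list of dicts with the hypothetical keys
    "chassis", "address" and "is_main_control".
    Returns the chosen report, or None if no board outside the
    master node's chassis has reported.
    """
    # Only boards in a different chassis from the master node are eligible.
    eligible = [r for r in reports if r["chassis"] != master_chassis]
    if not eligible:
        return None
    # Prefer main control boards (False sorts before True), then the smaller address.
    return min(eligible, key=lambda r: (not r["is_main_control"], r["address"]))


reports = [
    {"chassis": 1, "address": 0x1001, "is_main_control": False},  # same chassis as master
    {"chassis": 2, "address": 0x2001, "is_main_control": False},
    {"chassis": 3, "address": 0x3000, "is_main_control": True},
    {"chassis": 2, "address": 0x2000, "is_main_control": True},
]
standby = select_standby(master_chassis=1, reports=reports)
```

Returning `None` leaves it to the caller to decide what to do when no eligible board has reported yet.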
In addition, in this embodiment the system can set a contention flag on each board that is eligible to be a candidate board. It can be specified in advance that a flag set to true allows the candidate board to contend for the master node role by sending contention messages, and that a flag set to false forbids it from doing so.
Specifically, in practice, to let the operator specify the range from which the master node is selected, the contention flag can be set before the boards send contention messages. The contention flag can also be set during the contention according to which boards lose, so that in the end only one winning candidate board remains, and that board can be determined as the master node from the state of the contention flags. For example, each candidate board that judges it has lost a contention against any other candidate board sets its own contention flag to false, which forbids it from contending for the master node role by sending contention messages. After several rounds of sending contention messages and contending, only one candidate board with a true contention flag may remain, and that board can be determined directly as the master node. Alternatively, the contention flag can be omitted, in which case the main control boards and standby control boards of all chassis take part in the contention by default.
If the candidate boards whose contention flag is to be set to true must be determined before the contention starts, they can be specified in advance by the operator or by the system. They can also be determined from historical data, for example by preferentially setting to true the contention flag of candidate boards that have previously served as the master node.
In addition, after the master node and the standby node have been determined, the master node can send a service allocation request to the main control boards of the multiple chassis and receive the hardware information and load information that each of those main control boards returns. The master node can then determine the active service node and the standby service node from among the main control boards of the multiple chassis according to the reported hardware information and load information.
Specifically, the master node determines the active service node and the standby service node as follows: according to the load information, it preferentially determines the main control board with the lowest load as the active service node, and according to the hardware information, it determines a main control board located in a different chassis from the active service node as the standby service node.
In addition, after the master node and the standby node have been determined, the master node sends keep-alive messages to the standby node at a predetermined period. If the master node satisfies the switchover condition, the master node and the standby node are switched over. The switchover condition is that the standby node has not received a keep-alive message from the master node within a predetermined period and the hardware working state of the master node is abnormal (the standby node can obtain the hardware working state of the master node through the master node's partner node). If instead the standby node learns through the partner node that the hardware working state of the master node is normal, it can notify the partner node by a logic signal, and the partner node instructs the master node to reset. In this way, when the master node fails, the standby node switches over to become the master node and selects a main control board outside its own chassis as the new standby node, thereby implementing 1+1 protection switching.
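The standby node's decision described above can be sketched as a small function. This is an illustrative reading of the paragraph, not the patent's implementation: the `Action` enum, the function name and its parameters are hypothetical, and the way the hardware state is obtained from the partner node is abstracted into a boolean argument.

```python
from enum import Enum


class Action(Enum):
    """Hypothetical outcomes of the standby node's periodic check."""
    NONE = "none"                  # keep-alive still fresh, nothing to do
    RESET_MASTER = "reset_master"  # ask the partner node to reset the master node
    SWITCH_OVER = "switch_over"    # standby node takes over as master node


def standby_decision(seconds_since_keepalive, timeout, master_hardware_ok):
    """Decide what the standby node does, given keep-alive age and hardware state.

    `master_hardware_ok` stands for the hardware working state that the
    standby node obtains through the master node's partner node.
    """
    if seconds_since_keepalive < timeout:
        return Action.NONE
    if master_hardware_ok:
        # Keep-alive missing but hardware normal: notify the partner node
        # by a logic signal so that it instructs the master node to reset.
        return Action.RESET_MASTER
    # Keep-alive missing and hardware abnormal: switchover condition met.
    return Action.SWITCH_OVER
```

After a `SWITCH_OVER`, the new master node would go on to select a new standby node outside its own chassis, which is outside the scope of this sketch.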
In practice, a board can also be configured to stop sending contention messages only after the number of contentions it has lost reaches a certain limit. This limit should be lower than the threshold on the number of contention message transmissions that is used when determining the master node.
Example 2
In this example, the determination of the standby node, the active service node and the standby service node, and the monitoring of the master node and other nodes, can follow the description of the previous example. This example differs from Example 1 in how the master node is determined. If a board loses a contention, it records the candidate board that won that contention and continues to send contention messages. When the number of contention messages the board has sent reaches a predetermined number, the board tallies which of the other candidate boards have beaten it. From the tallies of all boards, all candidate boards can be ranked, and one of the top-ranked candidate boards can then be selected as the master node; the selection may be random.
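The ranking step of Example 2 can be sketched as follows. The patent does not specify how the per-board tallies are combined, so this sketch assumes that boards are ranked by how few other boards have beaten them, with the board name as a tie-break; `rank_candidates`, `pick_master` and the `top_n` parameter are hypothetical.

```python
import random


def rank_candidates(beaten_by):
    """Rank boards by how few other boards have beaten them.

    `beaten_by` maps each board name to the set of boards that won
    a contention against it (the tally each board keeps in Example 2).
    """
    return sorted(beaten_by, key=lambda board: (len(beaten_by[board]), board))


def pick_master(beaten_by, top_n=2, rng=random):
    """Select the master node at random from the top-ranked boards."""
    ranking = rank_candidates(beaten_by)
    return rng.choice(ranking[:top_n])


# Board A was never beaten, B lost only to A, C lost to both A and B.
tallies = {"C": {"A", "B"}, "A": set(), "B": {"A"}}
ranking = rank_candidates(tallies)
```

With `top_n=1` the choice is deterministic and always yields the best-ranked board; larger values reproduce the random selection among the top-ranked boards that the text allows.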
Building on this example, the setting of the contention flag can be further optimized. For example, once the ranking of all candidate boards has been obtained, it can be recorded, and the next time the master node role has to be contended for, the contention flags of the top-ranked boards can preferentially be set to true.
The method according to this embodiment is described below with reference to FIG. 2 and to Example 1 above.
As shown in FIG. 2, the method specifically includes the following processing:
Because the master node is the centralized control point of the cluster router and is responsible for version loading and service configuration for the whole system, step 21 is performed first: at system startup, the master node contention flow is entered.
Step 22: when a main control board starts, it reads the contention flag from its configuration information; only if the flag is true may the board contend for the master node role. The contention flag can be configured by the operator at startup, which gives the operator of the cluster router system an opportunity to specify where the master node is located. The flag can also be set or modified at run time: once a main control board has synchronized all data of the master node and can take over the master node's functions at any time, its contention flag can be set to true. Checking the contention flag thus ensures that any main control board that wins the master node role is able to perform the master control function. Before the contention begins, if a main control board is judged not to satisfy the conditions for contending for the master node role, step 23 is performed; if it does satisfy them, step 24 is performed;
Step 23: the main control board gives up contending for the master node role, waits for the master node to allocate services to it, and ends the current flow;
Step 24: a board that satisfies the contention conditions sends a master node contention message every 1 second. The message carries the hardware information of the board, including its physical location and hardware active/standby state; this hardware information is the basis for the judgment made during master node contention.
Step 25: each main control board determines whether it has received contention messages from other main control boards. A main control board that has never lost a contention performs the contention judgment according to the hardware information carried in the received message and, if it wins, may continue to send contention messages. A main control board that has lost a contention does not continue to send contention messages and may also ignore the contention messages sent by other main control boards.
If the board has received a contention message from another board, step 26 is performed; otherwise, step 29 is performed;
Step 26: the judgment is made according to the contention message sent by the other board. A board that is the main control board of its own chassis (that is, a board whose hardware signal indicates active) wins by priority; if both boards are main control boards, or both are standby control boards, the board with the smaller physical address wins. If this board wins, step 28 is performed; if this board loses, step 23 is performed;
Step 28: the board continues to send contention messages and returns to step 25 to determine again whether contention messages from other boards have been received;
Step 29: if this board has never lost a contention and receives no contention messages from other boards, this board is determined as the master node. The master node collects the hardware information reported by the other main control boards (for example, their physical addresses), and step 30 is performed;
Step 30: according to the reported physical addresses, a main control board whose physical address lies outside the local chassis is selected as the standby node.
In step 29, if a board has sent the master node contention message 60 consecutive times without receiving any further contention message from other boards, the board has won the master node role among the main control boards that are currently running normally, and it can enter the master node processing flow.
In addition, during the above process the master node continuously sends master node announcement messages. A main control board that starts later therefore does not need to contend for the master node role once it receives a master node announcement message; it simply waits for the master node to allocate services to it. The master node contention message also serves as a keep-alive message, through which the standby node can monitor the operation of the master node.
When the master node notifies the other boards to report their physical information, it can sort the physical information collected within 5 seconds (or another period, such as 10 seconds) by priority and select the standby node in the following priority order: not in the local chassis, hardware signal active, smaller physical location. If no information reported by other main control boards is received within 5 seconds, the other main control boards in the system are not yet running normally, and the master node repeats the collection of physical information after 1 minute (or another period, such as 2 minutes or half a minute) until a qualifying standby node is found. Once the master node and the standby node have been selected, the system enters the 1+1 backup state for control nodes. The standby node carries no control services, but all data configuration of the master node is synchronized to it, so it can take over immediately if the master node fails.
After the master node and the standby node of the system have been selected, the service allocation flow is entered.
As shown in FIG. 3, when selecting service nodes the master node sends a service node allocation request to all boards (these may be the boards that took part in the contention; in FIG. 3 they are board C and board D). On receiving the request, board C and board D return their hardware information and load information to the master node. After collecting this information, the master node determines the board with the lowest load as the active service node. If several main control boards share the same lowest load, the selection follows the priority of hardware-active first and smaller physical location first, in the same way as the earlier selection of the standby node. After the active service node has been selected, the standby service node is chosen from the chassis other than the one containing the active service node, preferring the board with the lower load; if these conditions do not single out one board, the standby service node can be selected by physical location.
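A minimal sketch of the service node selection described above, assuming the priority order lowest load, then hardware-active, then smallest address. The report format and the `select_service_nodes` function are hypothetical; board E is an extra board added for illustration and does not appear in FIG. 3.

```python
def select_service_nodes(reports):
    """Choose the active and standby service nodes from the reported information.

    `reports` is a non-empty list of dicts with the hypothetical keys
    "name", "chassis", "load", "is_main_control" and "address".
    Returns (active, standby); standby is None if every reporting board
    is in the same chassis as the active service node.
    """
    def priority(report):
        # Lower load first, then hardware-active boards, then smaller address.
        return (report["load"], not report["is_main_control"], report["address"])

    active = min(reports, key=priority)
    # The standby service node must be in a different chassis from the active one.
    others = [r for r in reports if r["chassis"] != active["chassis"]]
    standby = min(others, key=priority) if others else None
    return active, standby


reports = [
    {"name": "C", "chassis": 1, "load": 10, "is_main_control": True, "address": 0x1000},
    {"name": "D", "chassis": 2, "load": 30, "is_main_control": True, "address": 0x2000},
    {"name": "E", "chassis": 1, "load": 20, "is_main_control": True, "address": 0x1001},
]
active, standby = select_service_nodes(reports)
```

In this sample, board E has a lower load than board D but shares a chassis with the active service node C, so D is chosen as the standby service node.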
In addition, if during this procedure no information reported by other boards is collected within 5 seconds (or 10 seconds or another period) after the master control node sends the service node allocation request, the other boards are not yet running normally. The master control node then sets a 1-minute timer (other durations are possible) and, when it expires, resends the service node allocation request to collect the information again.
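The collect-then-retry behaviour can be sketched as a small loop. The callables `send_request`, `collect`, and `wait`, and the optional `max_attempts` bound, are hypothetical hooks introduced here so the example is self-contained; the text itself only specifies the 5-second window and the 1-minute timer.

```python
from typing import Callable, List, Optional


def collect_with_retry(
    send_request: Callable[[], None],
    collect: Callable[[float], List[dict]],
    wait: Callable[[float], None],
    window_s: float = 5.0,
    retry_after_s: float = 60.0,
    max_attempts: Optional[int] = None,
) -> List[dict]:
    """Send the allocation request and gather replies, retrying while none arrive."""
    attempt = 0
    while max_attempts is None or attempt < max_attempts:
        attempt += 1
        send_request()                 # ask the boards to report
        replies = collect(window_s)    # replies gathered within the window
        if replies:
            return replies
        wait(retry_after_s)            # nothing heard: the 1-minute timer
    return []
```

Passing the timing functions in as parameters keeps the loop testable without real delays.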
The processing above yields a configuration such as the one shown in FIG. 4. As shown in FIG. 4, of the three chassis connected to the control-plane switching network, chassis X holds the master control node and the standby service node; chassis Y holds the standby control node together with chassis Y's own main control board (which carries no service); and chassis Z holds the primary service node together with chassis Z's own main control board (which carries no service).
This ensures that the two nodes of a 1+1 backup pair are never in the same chassis, so the system has no single point of failure and its reliability is effectively improved.
As shown in FIG. 5, the standby control node and the primary and standby service nodes periodically send keep-alive messages to the master control node, and the master control node likewise periodically sends keep-alive messages to the standby control node. Two cases arise:
(Case 1) If the master control node receives no keep-alive message from a board for 10 consecutive seconds, it considers that board to have failed and may delete the failed board's data configuration from the master control node. If the failed board is the standby control node or the standby service node, the master control node reruns the selection procedure for the corresponding standby node. If the failed board is the primary service node, the master control node first notifies the standby service node to initiate an active/standby switchover so that it takes over the service, and then starts the selection procedure for a new standby service node;
(Case 2) The standby control node periodically receives keep-alive messages from the master control node. If none arrives for 5 consecutive seconds, it considers that the master control node may have failed. The master control node is the system's sole point of control; if two master control nodes were present at once, the system's processing would be thrown into disorder. Therefore, after detecting the anomaly through the logical flow (the master and standby control nodes sit in different chassis, so they communicate by logical signals), the standby control node must also use hardware signals to confirm that the master control node really can no longer work. To do so, it queries the state of the master control node from the master's partner board (for example, another board in the same chassis as the master control node). An active/standby signal line connects the partner board and the master control node, so the partner board can determine the master's running state reliably. If the partner board finds over the hardware signal line that the master control node is running abnormally, it notifies the standby control node, which can switch over immediately. If the hardware signal line shows the master control node still running normally, the partner board reports that state to the standby control node, which rechecks the master's state. If it still receives no keep-alive message from the master control node within 1 minute, the master's hardware signal is normal but its software is in a failed state; the standby control node then tells the partner board to reset the master control node by hardware signal, and once the reset completes the partner board reports the result to the standby control node. On receiving the partner board's reply that the master control node is running abnormally, the standby control node initiates the switchover, takes over the system's control processing, enters the master control node processing flow, becomes the master control node, and selects a new standby control node.
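The two keep-alive cases can be sketched as two small decision functions. This is only an illustration under stated assumptions: the action names returned as strings, the `Role` enumeration, and the three callables that stand in for the partner board query, the recheck wait, and the hardware reset are all hypothetical.

```python
from enum import Enum
from typing import Callable, List

MASTER_TIMEOUT_S = 10.0    # master declares a board failed after this much silence
STANDBY_TIMEOUT_S = 5.0    # standby suspects the master after this much silence
RECHECK_WINDOW_S = 60.0    # standby's second wait when hardware still looks healthy


class Role(Enum):
    STANDBY_CONTROL = "standby control node"
    PRIMARY_SERVICE = "primary service node"
    STANDBY_SERVICE = "standby service node"


def master_actions(role: Role, silent_for_s: float) -> List[str]:
    """Case 1: steps the master takes for a board that has gone silent."""
    if silent_for_s < MASTER_TIMEOUT_S:
        return []
    actions = ["delete_failed_board_config"]
    if role is Role.PRIMARY_SERVICE:
        # The service moves to the standby first; then a new standby is chosen.
        actions += ["notify_standby_service_to_switch_over", "select_standby_service"]
    elif role is Role.STANDBY_SERVICE:
        actions.append("select_standby_service")
    else:
        actions.append("select_standby_control")
    return actions


def standby_takes_over(
    silent_for_s: float,
    partner_sees_master_abnormal: Callable[[], bool],
    keepalive_arrives_within: Callable[[float], bool],
    reset_master_via_partner: Callable[[], None],
) -> bool:
    """Case 2: decide whether the standby control node should switch over."""
    if silent_for_s < STANDBY_TIMEOUT_S:
        return False                        # master still considered healthy
    if partner_sees_master_abnormal():
        return True                         # hardware signal confirms the failure
    if keepalive_arrives_within(RECHECK_WINDOW_S):
        return False                        # master recovered during the recheck
    reset_master_via_partner()              # hardware normal, software failed
    return True
```

The second function only ever returns `True` after the partner board has either confirmed the failure or reset the master, which is how the scheme avoids two master control nodes being active at once.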
Through the above processing, the advantage of a cluster router system having multiple chassis and multiple main control boards is fully exploited: the master control node, standby control node, primary service node, and standby service node are determined across multiple chassis, the master control node may run in any chassis, and boards in an active/standby relationship are kept out of the same chassis. System reliability and throughput are thereby improved while the performance of the master control node and the primary service node is preserved. In addition, hardware signals and logical signals together make it possible to monitor the master and standby control nodes.
According to an embodiment of the present invention, a board is also provided. As shown in FIG. 6, the board according to the embodiment of the present invention includes:
a sending module 61, configured to send competition messages to other boards, where each competition message carries the hardware information of this board;
a receiving module 62, configured to receive competition messages from other boards while the sending module is sending competition messages to other boards;
a first determining module 63, connected to the sending module 61 and the receiving module 62, configured to compare the hardware information of its own board with the hardware information in the competition messages that the receiving module receives from other boards, and to determine the winning board according to the competition principle, where the competition principle determines, from the result of comparing hardware information, which board is better suited to be the master control node;
a control module 64, connected to the sending module 61, configured to make the sending module stop sending competition messages when the determination shows that the board hosting the first determining module 63 has lost the competition;
a second determining module 65, connected to the sending module 61 and the receiving module 62, configured to designate one of the other candidate boards as the standby control node when its own board is the master control node.
The hardware information includes the board's physical position information and its active/standby state information within its chassis, where the active/standby state information indicates whether the board is the main control board or the standby control board of its chassis.
The board according to the embodiment of the present invention may further include a notification module (not shown), configured to notify the other boards to report their hardware information after its own board has been determined to be the master control node;
in that case, the second determining module 65 is specifically configured to designate, based on the hardware information reported by the other boards, a candidate board located in a different chassis from the master control node as the standby control node.
In addition, the sending module 61 may send competition messages by broadcast or multicast and may also send keep-alive messages, and the receiving module 62 can likewise receive other boards' keep-alive messages and their reported load information and hardware information. The service node allocation request may be sent either by the sending module 61 or by the notification module; where appropriate, the notification module and the sending module 61 may be combined into one.
The primary service node and the standby service node may be selected by the first determining module 63 and the second determining module 65 respectively, or both by one of them; where appropriate, the first determining module 63 and the second determining module 65 may be combined into one.
The control module may also be configured to notify the sending module to stop sending competition messages once it determines that the number of times this board has lost the competition reaches a predetermined value.
The processing by which the board shown in FIG. 6 sends competition messages, determines the competition winner, selects the standby control node, selects the primary and standby service nodes, and monitors the master control node has already been described in the method embodiment with reference to Example 1 and Example 2, and is not repeated here.
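For orientation, the competition carried out by the sending, receiving, first determining, and control modules can be simulated in a few lines. This is a sketch under assumptions, not the embodiment itself: the `Candidate` type, the in-process "broadcast" loop, the threshold of 3 broadcasts, and the choice of the smaller physical position as the winner are all illustrative (the claims allow either the smaller or the larger position).

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Candidate:
    position: int               # physical position; assumed unique per board
    is_chassis_master: bool     # main control board of its own chassis?
    compete: bool = True        # competition flag, true until the board loses
    sent: int = 0               # competition messages broadcast so far


def beats(a: Candidate, b: Candidate) -> bool:
    """Competition principle: chassis main control board first, then smaller position."""
    if a.is_chassis_master != b.is_chassis_master:
        return a.is_chassis_master
    return a.position < b.position


def elect(boards: List[Candidate], threshold: int = 3) -> Optional[Candidate]:
    """Simulate the broadcast rounds and return the unique winner, if any."""
    for _ in range(threshold):
        # Boards whose flag is still true at the start of the round broadcast.
        senders = [b for b in boards if b.compete]
        for sender in senders:
            sender.sent += 1
            for receiver in boards:
                if receiver is not sender and receiver.compete and beats(sender, receiver):
                    receiver.compete = False    # loser clears its flag and stops sending
    # The master is the one board that never lost and reached the broadcast threshold.
    winners = [b for b in boards if b.compete and b.sent >= threshold]
    return winners[0] if len(winners) == 1 else None
```

Each board only ever clears its own flag after comparing a received message against its own hardware information, so no central arbiter is needed.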
In summary, the technical solution of the present invention fully exploits the advantage of a cluster router system having multiple chassis and multiple main control boards: the master control node, standby control node, primary service node, and standby service node are determined across multiple chassis, the master control node may run in any chassis, and boards in an active/standby relationship are kept out of the same chassis. System reliability and throughput are thereby improved while the performance of the master control node and the primary service node is preserved. In addition, hardware signals and logical signals together make it possible to monitor the master and standby control nodes.
The above are merely preferred embodiments of the present invention and are not intended to limit it. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within its scope of protection.

Claims

1. A method for implementing an active/standby configuration of boards, used to determine a master control node and a standby control node from among a plurality of boards located in a plurality of chassis, characterized in that the method comprises:
for a plurality of candidate boards among the plurality of boards, each candidate board sending competition messages to the other candidate boards, wherein the competition message sent by each candidate board carries the hardware information of that candidate board;
during the competition, each candidate board receiving competition messages from the other candidate boards;
each candidate board comparing its own hardware information with the hardware information in the received competition messages and determining the winning candidate board according to a competition principle, wherein the competition principle is used to determine, from the result of comparing hardware information, the candidate board to serve as the master control node;
determining a unique master control node according to the competition among the plurality of candidate boards, the master control node designating one of the other candidate boards as the standby control node.
2. The method according to claim 1, characterized in that determining a unique master control node according to the competition among the plurality of candidate boards comprises:
determining as the master control node the only candidate board among the plurality of candidate boards that has never lost a competition.
3. The method according to claim 1, characterized by further comprising:
if a candidate board loses the competition, that candidate board stopping sending competition messages to the other candidate boards.
4. The method according to claim 1, characterized by further comprising:
setting a competition flag for each candidate board in advance and setting the flag to true, which indicates that the candidate board is allowed to compete to become the master control node by sending competition messages;
for each candidate board, if the candidate board determines that it has lost the competition against any other candidate board, setting its competition flag to false, which indicates that the candidate board is prohibited from competing to become the master control node by sending competition messages.
5. The method according to claim 3, characterized in that determining as the master control node the only candidate board among the plurality of candidate boards that has never lost a competition comprises:
when the number of times a candidate board has broadcast or multicast competition messages to all other candidate boards reaches a predetermined threshold, determining that candidate board as the master control node if it no longer receives competition messages from other candidate boards.
6. The method according to claim 1, characterized in that the hardware information comprises the board's physical position information and its active/standby state information within its chassis, wherein the active/standby state information indicates whether the board is the main control board or the standby control board of its chassis.
7. The method according to claim 6, characterized in that the competition principle comprises:
preferentially determining as the winner a candidate board that is the main control board of its own chassis;
if both candidate boards being compared are the main control boards of their respective chassis, determining as the winner the one of the two with the smaller, or alternatively the larger, physical position.
8. The method according to claim 6, characterized in that the master control node designating one of the other candidate boards as the standby control node comprises:
the master control node notifying the other candidate boards to report their hardware information and, based on the reported hardware information, designating a candidate board located in a different chassis from the master control node as the standby control node.
9. The method according to claim 8, characterized in that designating, based on the reported hardware information, a candidate board located in a different chassis from the master control node as the standby control node further comprises:
if there are a plurality of candidate boards that are located in a different chassis from the master control node and are the main control boards of their respective chassis, either randomly selecting one of them as the standby control node, or designating as the standby control node the one with the smallest or the largest physical position.
10. The method according to claim 1, characterized in that, after the master control node and the standby control node have been determined, the method further comprises:
the master control node sending a service allocation request to the main control boards of the plurality of chassis, and receiving the hardware information and load information of each main control board returned by the main control boards of the plurality of chassis;
the master control node determining a primary service node and a standby service node from among the main control boards of the plurality of chassis according to the hardware information and load information reported by those main control boards.
11. The method according to claim 10, characterized in that the master control node determining a primary service node and a standby service node from among the main control boards of the plurality of chassis according to the hardware information and load information reported by those main control boards comprises:
the master control node, based on the load information, preferentially designating the main control board with the lowest load as the primary service node, and designating as the standby service node one of the main control boards among the candidate boards other than the master control node, the standby control node, and the primary service node.
12. The method according to claim 11, characterized in that designating as the standby service node one of the main control boards among the candidate boards other than the master control node, the standby control node, and the primary service node comprises: designating as the standby service node a main control board located in a different chassis from the primary service node.
13. The method according to claim 1, characterized in that, after the master control node and the standby control node have been determined, the method further comprises:
the master control node sending keep-alive messages to the standby control node at a predetermined period;
if the master control node meets a switchover condition, switching over the master control node and the standby control node, wherein the switchover condition comprises: the standby control node not receiving a keep-alive message from the master control node within a predetermined period, and the standby control node learning, through another board located in the same chassis as the master control node, that the working state of the master control node is abnormal.
14. The method according to any one of claims 1 to 13, characterized in that the candidate boards comprise some or all of the main control boards and standby control boards of the plurality of chassis.
15. A board, characterized by comprising:
a sending module, configured to send competition messages to other boards, wherein each competition message carries the hardware information of the board;
a receiving module, configured to receive competition messages from other boards while the sending module is sending competition messages to other boards;
a first determining module, configured to compare the hardware information of its own board with the hardware information in the competition messages that the receiving module receives from other boards, and to determine the winning board according to a competition principle, wherein the competition principle is used to determine, from the result of comparing hardware information, the board to serve as the master control node;
a control module, configured to make the sending module stop sending competition messages when the determination shows that the board hosting the first determining module has lost the competition;
a second determining module, configured to designate one of the other candidate boards as the standby control node when its own board is the master control node.
16. The board according to claim 15, characterized in that the hardware information comprises the board's physical position information and its active/standby state information within its chassis, wherein the active/standby state information indicates whether the board is the main control board or the standby control board of its chassis.
17. The board according to claim 15 or 16, characterized by further comprising:
a notification module, configured to notify the other boards to report their hardware information after its own board has been determined to be the master control node;
wherein the second determining module is specifically configured to designate, based on the hardware information reported by the other boards, a candidate board located in a different chassis from the master control node as the standby control node.
PCT/CN2011/075989 2010-09-01 2011-06-20 Implementing method for main/standby configuration of board cards, and board card WO2012028013A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201010270476.9 2010-09-01
CN2010102704769A CN101938417A (en) 2010-09-01 2010-09-01 Method for realizing configuration of main and auxiliary board cards as well as board cards

Publications (1)

Publication Number Publication Date
WO2012028013A1 true WO2012028013A1 (en) 2012-03-08

Family

ID=43391559

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2011/075989 WO2012028013A1 (en) 2010-09-01 2011-06-20 Implementing method for main/standby configuration of board cards, and board card

Country Status (2)

Country Link
CN (1) CN101938417A (en)
WO (1) WO2012028013A1 (en)

Also Published As

Publication number Publication date
CN101938417A (en) 2011-01-05

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application (Ref document number: 11821039; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 EP: PCT application non-entry in European phase (Ref document number: 11821039; Country of ref document: EP; Kind code of ref document: A1)