CN110914806B - System and method for synchronous distributed multi-node code execution - Google Patents

Info

Publication number
CN110914806B
Authority
CN
China
Prior art keywords
monitored
progress
code
synchronization
calculating
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201780090777.3A
Other languages
Chinese (zh)
Other versions
CN110914806A (en)
Inventor
亚历山大·克拉夫索夫
吴祖光
什洛莫·蓬格拉茨
拉米·茨卡里埃
锡安·高尔
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huawei Technologies Co Ltd
Original Assignee
Huawei Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co Ltd
Publication of CN110914806A
Application granted
Publication of CN110914806B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/52 Program synchronisation; Mutual exclusion, e.g. by means of semaphores
    • G06F9/522 Barrier synchronisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/52 Program synchronisation; Mutual exclusion, e.g. by means of semaphores

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Multi Processors (AREA)

Abstract

A synchronized distributed system comprising at least one hardware processor configured to: instruct a plurality of monitored hardware processors to begin executing at least one monitored code, the at least one monitored code comprising a sequence of computational steps; receive a progress indication associated with a time period from one of the plurality of monitored hardware processors; update a set of code progress indications according to the progress indication; calculate a synchronization delay for the one of the plurality of monitored hardware processors using the set of code progress indications; and send the synchronization delay to the one of the plurality of monitored hardware processors, so that the one of the plurality of monitored hardware processors pauses execution between every two consecutive computation steps of the sequence of computation steps for a time interval corresponding to the synchronization delay.

Description

System and method for synchronous distributed multi-node code execution
Background
The present invention, in some embodiments thereof, relates to a system for synchronizing distributed code execution on multiple hardware processors, and more particularly, but not exclusively, to a system for synchronizing distributed simulation code execution on multiple hardware processors.
When running distributed code on multiple hardware processors, synchronization may need to be performed between some of the multiple hardware processors running the distributed code. When the distributed code is code for a simulation system, the distributed code typically includes a plurality of simulation engines; each simulation engine is a software object that executes code that simulates a portion of the system. For example, software simulation of a large-scale digital communications network containing a plurality of connected network nodes is typically implemented using a plurality of simulation engines executed by a plurality of hardware processors, wherein each of the plurality of hardware processors executes code that simulates some of the plurality of connected network nodes. In order to properly simulate the behavior of the large-scale digital communication network, it is necessary to synchronize the progress of the multiple simulation engines executed by the multiple hardware processors. For example, when a first simulated network node of the simulated large-scale digital communication network sends a message to a second simulated network node of the simulated large-scale digital communication network, it may be necessary to stop a hardware processor from executing the code that simulates the second simulated network node until the code that simulates the first simulated network node, executed by another hardware processor, has sent the message to the second simulated network node over the simulated large-scale digital communication network.
Disclosure of Invention
It is an object of the present invention to provide a system and method for synchronizing distributed code execution on multiple hardware processors.
The above and other objects are achieved by the features of the independent claims. Further implementations are apparent from the dependent claims, the detailed description and the drawings.
According to a first aspect of the invention, a synchronized distributed system comprises at least one hardware processor for: instructing a plurality of monitored hardware processors to begin executing at least one monitored code, the at least one monitored code comprising a sequence of computational steps; receiving a progress indication associated with a time period from one of the plurality of monitored hardware processors; updating a set of code progress indications according to the progress indication; calculating a synchronization delay for the one of the plurality of monitored hardware processors using the set of code progress indications; and sending the synchronization delay to the one of the plurality of monitored hardware processors, so that the one of the plurality of monitored hardware processors pauses execution between every two consecutive computation steps of the sequence of computation steps for a time interval corresponding to the synchronization delay. Because each of the plurality of monitored hardware processors pauses execution independently of the other monitored hardware processors, synchronization may be achieved between the plurality of monitored hardware processors without introducing delays caused by waiting to receive progress indications from all of the plurality of monitored hardware processors. Further, by calculating a dedicated synchronization delay for each monitored hardware processor, an optimal synchronization delay may be selected for each of the plurality of monitored hardware processors, thereby minimizing the delays in executing the at least one monitored code that are introduced when using other schemes.
According to a second aspect of the invention, a method for synchronizing a distributed system comprises, at a synchronization controller: instructing a plurality of monitored hardware processors to begin executing at least one monitored code, the at least one monitored code comprising a sequence of computational steps; receiving a progress indication associated with a time period from one of the plurality of monitored hardware processors; updating a code progress indication set according to the progress indication; calculating a synchronization delay for one of the plurality of monitored hardware processors using the set of code progress indications; sending the synchronization delay to the one of the plurality of monitored hardware processors.
According to a third aspect of the invention, a method for adjusting execution speed of a monitored hardware processor comprises, on the monitored hardware processor: executing a calculation step; incrementing a step counter; sending a value of the step counter associated with a time period to a synchronization controller; receiving a synchronization delay value from the synchronization controller; calculating a time interval consistent with the synchronization delay value; and executing a further sequence of calculation steps while pausing execution for said time interval between every two consecutive calculation steps of said further sequence of calculation steps. Pausing execution of the monitored hardware processor, between every two successive calculation steps, for a time interval corresponding to the synchronization delay received from the synchronization controller enables the execution progress on the monitored hardware processor to be continuously adjusted.
With reference to the first, second, and third aspects, in a first possible implementation manner of the first, second, and third aspects of the present invention, the synchronization delay is calculated by: identifying a minimum progress in the set of code progress indications; calculating a progress gap x by subtracting the minimum progress from the progress indication; calculating a synchronization interval y by subtracting x from a maximum synchronization interval ΔT of said time period; calculating a modification term by subtracting the reciprocal of ΔT from the reciprocal of y; and calculating the synchronization delay by adding the modification term to x. Calculating the synchronization delay relative to the progress of the slowest monitored hardware processor, using a maximum synchronization interval, ensures that all of the plurality of monitored hardware processors progress at a rate sufficiently similar to that of the slowest monitored hardware processor. This may reduce problems that arise from loss of synchronization when the at least one monitored code is executed.
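The calculation listed in this implementation manner can be sketched in Python as follows. This is a minimal illustration, not the claimed implementation: the function name, the parameter names, and the guard against a non-positive synchronization interval y are assumptions, since the text does not state what happens when the progress gap reaches ΔT.

```python
def synchronization_delay(progress, all_progress, max_sync_interval):
    """Synchronization delay for one processor, following the listed steps.

    progress          -- progress indication of the processor being adjusted
    all_progress      -- the set of code progress indications (any iterable)
    max_sync_interval -- the maximum synchronization interval (Delta T)
    """
    minimum = min(all_progress)       # minimum progress in the set
    x = progress - minimum            # progress gap x
    y = max_sync_interval - x         # synchronization interval y
    if y <= 0:
        # Assumption: the text does not cover this case, so the sketch rejects it.
        raise ValueError("progress gap must be smaller than the maximum synchronization interval")
    modification = 1.0 / y - 1.0 / max_sync_interval   # 1/y - 1/Delta T
    return x + modification
```

Under this sketch the slowest processor has x = 0 and y = ΔT, so its modification term and its delay are both zero, and the delay grows as a processor moves further ahead of the slowest one.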
With reference to the first, second, and third aspects or the first implementation manner of the first, second, and third aspects, in a second possible implementation manner of the first, second, and third aspects of the present invention, the plurality of monitored hardware processors send periodic progress indications to the at least one hardware processor at fixed time intervals. Continuously monitoring progress and recalculating the synchronization delay allows the system to respond to sharp changes in the execution speed of one or more of the plurality of monitored hardware processors.
With reference to the first, second, and third aspects or the first and second possible implementations of the first, second, and third aspects, in a third possible implementation of the first, second, and third aspects of the present invention, the fixed time interval has a value between 1 millisecond and 5 milliseconds.
With reference to the first, second, and third aspects or the first, second, or third implementation manners of the first, second, and third aspects, in a fourth possible implementation manner of the first, second, and third aspects of the present invention, the at least one hardware processor and the plurality of monitored hardware processors are connected to a data communication network. When sending and receiving messages using a data communication network, delays are typically introduced into the synchronization time due to the round trip delay of the messages on the data communication network. The present invention avoids these delays. In a system comprising a data communications network, the present invention introduces less delay than other common schemes.
With reference to the first, second, and third aspects or the fourth possible implementation manners of the first, second, and third aspects, in a fifth possible implementation manner of the first, second, and third aspects of the present invention, the data communication network is a Dynamic Circuit Network (DCN).
With reference to the first, second, and third aspects or the fourth possible implementation manners of the first, second, and third aspects, in a sixth possible implementation manner of the first, second, and third aspects of the present invention, the data communication network is a switched data communication network.
With reference to the first, second, and third aspects, or the first, second, third, fourth, fifth, or sixth implementation manners of the first, second, and third aspects, in a seventh possible implementation manner of the first, second, and third aspects of the present invention, the progress indication is a number of steps of the sequence of calculation steps. Measuring progress in calculation steps is useful for monitored code such as a simulation.
With reference to the first, second and third aspects, or the first, second, third, fourth, fifth or sixth implementation manners of the first, second and third aspects, in an eighth possible implementation manner of the first, second and third aspects of the present invention, the progress indication is an amount of time. In some monitored codes, it is useful to measure the time progress.
With reference to the first, second, and third aspects, or the first, second, third, fourth, fifth, sixth, seventh, or eighth implementation manners of the first, second, and third aspects, in a ninth possible implementation manner of the first, second, and third aspects of the present invention, the at least one monitored code is a simulator. A simulator is an example of distributed code that requires synchronization.
With reference to the first, second, and third aspects, or the first, second, third, fourth, fifth, sixth, seventh, eighth, or ninth implementation manners of the first, second, and third aspects, in a tenth possible implementation manner of the first, second, and third aspects of the present invention, at least one of the plurality of monitored hardware processors is configured to execute a plurality of monitored codes. In some systems, such as a system for performing simulation in which some of the at least one monitored code is a simulation engine, one monitored hardware processor may execute more than one simulation engine.
With reference to the first aspect, the second aspect, and the third aspect, in an eleventh possible implementation manner of the first aspect, the second aspect, and the third aspect of the present invention, calculating the time interval includes: calculating a denominator by multiplying the number of hardware processor cycles per second by the number of hardware processor cycles within the time period; and dividing the synchronization delay value by the denominator. A calculation step is a unit common to more than one of the plurality of monitored hardware processors, whereas the number of hardware processor cycles per second may differ between the monitored hardware processors. Because the time interval is calculated on the monitored hardware processor, the synchronization controller can perform its calculations in a unit common to more than one monitored hardware processor, and the monitored hardware processors need not share their cycle counts or speeds with the synchronization controller.
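Taken literally, the conversion described in this implementation manner amounts to one multiplication and one division on the monitored hardware processor. The Python sketch below follows the wording of the text; the function and parameter names are hypothetical, the text does not state the units of the result, and the zero check is an added assumption.

```python
def pause_interval(sync_delay_value, cycles_per_second, cycles_in_period):
    """Time interval derived locally from a received synchronization delay value.

    Follows the wording of the text: the denominator is the product of the
    processor's cycles per second and its cycles within the time period.
    """
    denominator = cycles_per_second * cycles_in_period
    if denominator == 0:
        # Assumption: a zero denominator is treated as a configuration error.
        raise ValueError("cycle counts must be non-zero")
    return sync_delay_value / denominator
```

Because both cycle counts are local to the monitored hardware processor, two processors that receive the same synchronization delay value may pause for different time intervals.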
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. Methods and materials similar or equivalent to those described herein can be used in the practice or testing of embodiments of the present invention, with the exemplary methods and/or materials being described below. In case of conflict, the patent specification, including definitions, will control. In addition, the materials, methods, and examples are illustrative only and not necessarily limiting.
Drawings
Some embodiments of the invention are described herein, by way of example only, with reference to the accompanying drawings. With specific reference now to the drawings in detail, it is stressed that the particulars shown are by way of example and for purposes of illustrative discussion of the embodiments of the present invention. Thus, it will be apparent to one skilled in the art from the description of the figures how embodiments of the invention may be practiced.
In the drawings:
FIG. 1 is a schematic diagram of an exemplary system according to some embodiments of the invention;
FIG. 2 is a sequence diagram of an alternative operational flow according to some embodiments of the present invention;
FIG. 3 is a sequence diagram of a second alternative operational flow, involving calculating a synchronization delay, according to some embodiments of the invention;
FIG. 4 is a sequence diagram of a third alternative operational flow, involving delayed execution of a portion of distributed code, according to some embodiments of the present invention;
FIG. 5 is a sequence diagram of a fourth alternative operational flow, involving calculating time intervals, according to some embodiments of the invention.
Detailed Description
The present invention, in some embodiments thereof, relates to a system for synchronizing distributed code execution on multiple hardware processors, and more particularly, but not exclusively, to a system for synchronizing distributed simulation code execution on multiple hardware processors.
The following description focuses on an embodiment wherein large scale simulation software is implemented using the present invention. However, the invention is not limited to such embodiments. Examples of other uses include parallel scientific computing and distributed rendering in computer graphics.
In a typical distributed computing system that includes multiple hardware processors, each of the multiple hardware processors executes a portion of the distributed code. Each portion of the distributed code includes a sequence of atomic computation steps, each atomic computation step including a set of instructions that execute uninterrupted within step boundaries. A simulated clock tick is one example of a step boundary. Another example of a step boundary is an amount of simulation time, for example, 1 microsecond. A third example of a step boundary is the execution of a software function. In such a distributed computing system, it may be desirable to synchronize the code execution progress of the multiple hardware processors, measured in atomic computation steps, to within a predefined accuracy range. For example, the difference between the number of atomic computation steps performed by one of the plurality of hardware processors and the number of atomic computation steps performed by another of the plurality of hardware processors may not exceed a threshold difference.
One approach for synchronizing the plurality of hardware processors uses the Bulk Synchronous Parallel (BSP) model, where a synchronization controller is a first hardware processor that executes code for controlling the simulation and is connected to a plurality of other hardware processors. In the BSP model, the synchronization controller instructs the plurality of other hardware processors to perform an atomic computation step and receives a completion indication from each of the plurality of other hardware processors. The synchronization controller waits for completion indications from all of the plurality of other hardware processors before instructing them to perform another atomic computation step. In this method, the total execution time is gated by the slowest of the plurality of other hardware processors: all of them are delayed until a completion indication is received from the slowest one. Only then does the synchronization controller instruct the next atomic computation step to be performed. Furthermore, when the plurality of other hardware processors are connected to the synchronization controller using a data communication network, sending a message from the synchronization controller to a second hardware processor may cause a delay, as may sending a completion indication from the second hardware processor to the synchronization controller. These delays can extend the total execution time of the distributed code. Long waits on a hardware processor may have adverse effects when the distributed code simulates real-world time. For example, when a hardware processor waits for more than a few seconds, the hardware processor's network connection may be interrupted by a timeout.
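For contrast with the scheme described later, the lockstep behavior of the BSP model can be sketched with Python threads. This is an illustration under stated assumptions, not part of the described system: a `threading.Barrier` stands in for the synchronization controller collecting completion indications, `time.sleep` stands in for an atomic computation step, and the function name and step durations are hypothetical.

```python
import threading
import time


def run_bsp(step_durations, num_steps):
    """Run one worker per entry of step_durations in lockstep.

    Returns the per-worker step counts and the elapsed wall-clock time.
    """
    barrier = threading.Barrier(len(step_durations))
    completed = [0] * len(step_durations)

    def worker(index, duration):
        for _ in range(num_steps):
            time.sleep(duration)      # stand-in for one atomic computation step
            completed[index] += 1
            barrier.wait()            # nobody starts the next step until all finish

    threads = [
        threading.Thread(target=worker, args=(i, d))
        for i, d in enumerate(step_durations)
    ]
    start = time.monotonic()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return completed, time.monotonic() - start
```

With one fast and one slow worker, both complete the same number of steps, and the elapsed time is set by the slow worker, which is the gating effect described above.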
In another approach to synchronizing multiple hardware processors, used by the Massachusetts Institute of Technology's Graphite (MIT Graphite), each of the multiple hardware processors is randomly paired with another one of the multiple hardware processors to perform synchronization for each quantum of atomic computation steps. This scheme introduces significant waiting periods, during which at least one of the plurality of hardware processors is stalled. Furthermore, with this approach, it is not easy to bring all of the plurality of hardware processors to a converged state, in which each of the plurality of hardware processors has performed a certain number of atomic computation steps at a particular time.
The present invention, in some embodiments thereof, uses a feedback control approach to balance the execution speed of multiple hardware processors. In these embodiments, a plurality of monitored hardware processors executing at least one monitored code comprising a sequence of atomic computation steps are connected over a data communications network to a first hardware processor executing a synchronization controller. Each of the plurality of monitored hardware processors periodically samples its progress and sends a progress report to the synchronization controller. The synchronization controller calculates, for each of the plurality of monitored hardware processors, a synchronization delay for which that monitored hardware processor pauses execution between every two consecutive atomic computation steps of the at least one monitored code it executes. This scheme allows the execution speed of some of the plurality of monitored hardware processors to be slowed down without completely stopping execution of the at least one monitored code, thereby allowing the plurality of monitored hardware processors to execute their respective portions of the at least one monitored code at substantially the same speed (within a predetermined difference limit). Further, in such embodiments, the synchronization controller communicates with each of the plurality of monitored hardware processors independently of the other monitored hardware processors, and a monitored hardware processor need not coordinate its communication with the synchronization controller with any other monitored hardware processor. This may eliminate the delay introduced by synchronous communication with the plurality of monitored hardware processors on the data communication network.
Therefore, the time required to complete the execution of the at least one monitored code is shorter than the time required to complete the execution of the at least one monitored code using other schemes (e.g., the BSP model or MIT Graphite). Further, in such embodiments, there are fewer interruptions in executing the at least one monitored code (e.g., fewer network connections lost due to timeouts).
In some embodiments of the invention, the synchronization delay is calculated relative to the speed of the slowest hardware processor at the time the synchronization delay is calculated. In these embodiments, the continuous periodic sampling of progress and calculation of synchronization delays allow the execution speed of each of the monitored hardware processors to be adjusted dynamically, keeping all of the plurality of monitored hardware processors executing at approximately the same speed relative to each other during execution of the distributed code. Furthermore, as the speed of the slowest hardware processor increases, the speed of the other monitored hardware processors may increase by approximately the same amount. Dynamically adjusting the execution speed allows a quick response to sharp changes in the execution speed of one or more of the plurality of monitored hardware processors.
Before explaining at least one embodiment of the invention in detail, it is to be understood that the invention is not necessarily limited in its application to the details of construction and the arrangement of the components and/or methods set forth in the following description and/or illustrated in the drawings and/or examples. The invention is capable of other embodiments or of being practiced or carried out in various ways.
The present invention may be a system, method and/or computer program product. The computer program product may include one (or more) computer-readable storage media having computer-readable program instructions for causing a processor to perform various aspects of the present invention.
The computer readable storage medium may be a tangible device that can retain and store instructions for use by an instruction execution device. For example, the computer-readable storage medium may be, but is not limited to, an electronic memory device, a magnetic memory device, an optical memory device, an electromagnetic memory device, a semiconductor memory device, or any suitable combination of the foregoing.
The computer-readable program instructions described herein may be downloaded to the respective computing/processing device from a computer-readable storage medium, or downloaded to an external computer or external storage device over a network. The network may be, for example, the Internet, a local area network, a wide area network and/or a wireless network.
The computer readable program instructions may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the latter scenario, the remote computer may be connected to the user's computer through any type of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider). In some embodiments, an electronic circuit comprising, for example, a programmable logic circuit, a Field Programmable Gate Array (FPGA) or a Programmable Logic Array (PLA) may execute the computer readable program instructions by using state information of the computer readable program instructions to personalize the electronic circuit, in order to perform aspects of the present invention.
Aspects of the present invention are described herein in connection with flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the invention. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer-readable program instructions.
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of instructions, which comprises one or more executable instructions for implementing the specified logical function(s). In some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
Referring now to FIG. 1, shown therein is a schematic diagram of an exemplary system 100 in accordance with some embodiments of the present invention. A synchronization (central) controller 102 is connected to the digital communication network 101. In these embodiments, the synchronization controller is at least one hardware processor executing code for controlling the plurality of monitored hardware processors 103, 104, and 105. Each monitored hardware processor executes at least one monitored code. Optionally, the at least one monitored code is part of a distributed code. In these embodiments, the plurality of monitored hardware processors are connected to the digital communication network. In some embodiments, the digital communication network is a Dynamic Circuit Network (DCN). In embodiments where the digital communication network is a DCN, the DCN may use the Inter-Domain Controller (IDC) protocol. In other embodiments, the digital communication network is a switched data communication network, such as an Internet Protocol (IP) based switching network.
Referring now to FIG. 2, shown therein is a sequence diagram of an alternative operational flow 200 in accordance with some embodiments of the present invention. In these embodiments, the synchronization controller instructs 201 the plurality of monitored hardware processors to begin executing at least one monitored code. The at least one monitored code comprises a sequence of computation steps. The computation steps of the sequence are atomic, i.e., synchronization between the plurality of monitored hardware processors is done on the boundaries of the computation steps, and a hardware processor performs an entire computation step before stopping or reporting progress. The monitored code may be distributed code, wherein each of the plurality of monitored hardware processors executes at least a portion of the distributed code. In some embodiments, the monitored code is code for simulation, such as system simulation. Optionally, the monitored code simulates a large-scale data communication network. Optionally, the monitored code simulates an electronic device. In some embodiments, the monitored code performs distributed computations.
At 202, the synchronization controller receives, from a monitored hardware processor of the plurality of monitored hardware processors, a progress indication for one of the at least one monitored code. The progress indication is associated with a certain time period. The progress indication may be the number of steps of the sequence of computation steps performed within the time period associated with the progress indication. Optionally, the time period is a period of real time, and the progress indication is an amount of simulated time. At 203, the synchronization controller may update a set of code progress indications, which includes the last code progress indication received, for each of the at least one monitored code, from the hardware processor executing that code. Optionally, the synchronization controller adds the progress indication to the set of code progress indications. Optionally, the synchronization controller replaces a previous progress indication in the set of code progress indications with the progress indication. At 204, the synchronization controller uses the set of code progress indications to calculate a synchronization delay for the monitored hardware processor. At 205, the synchronization controller sends the synchronization delay to the monitored hardware processor, so that the monitored hardware processor stops executing, between every two consecutive computation steps of the sequence, for a time interval consistent with the synchronization delay. Introducing a delay between two consecutive computation steps slows the overall progress of the monitored hardware processor, thereby synchronizing its pace with the pace of the other monitored hardware processors of the plurality of monitored hardware processors.
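The bookkeeping at 203 can be illustrated with a minimal Python sketch. The class name `ProgressSet`, its methods, and the use of a dictionary keyed by a code identifier are assumptions made for illustration only; the description does not prescribe a data structure.

```python
from typing import Dict, Hashable


class ProgressSet:
    """Hypothetical container for the set of code progress indications."""

    def __init__(self) -> None:
        # Maps an identifier of a monitored code to its last reported progress.
        self._latest: Dict[Hashable, float] = {}

    def update(self, code_id: Hashable, progress: float) -> None:
        # Adds a first indication, or replaces the previous one for this code.
        self._latest[code_id] = progress

    def latest(self, code_id: Hashable) -> float:
        # Last progress indication received for the given monitored code.
        return self._latest[code_id]

    def minimum(self) -> float:
        # Progress of the slowest monitored code reported so far.
        return min(self._latest.values())
```

Keeping only the most recent indication per monitored code means the minimum always reflects the current slowest participant, which is the value the delay calculation at 204 relies on.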
In some embodiments, each of the plurality of monitored hardware processors sends a progress indication to the synchronization controller at fixed time intervals. The fixed time interval may be a few milliseconds, for example between 1 and 5 milliseconds.
In some embodiments, the synchronization delay calculated at 204 for the monitored hardware processor is calculated, using the set of code progress indications, relative to the speed of the slowest monitored hardware processor. Referring now to FIG. 3, shown therein is a sequence diagram of a second alternative operational flow 300 for calculating a synchronization delay for a monitored hardware processor, according to some embodiments of the present invention. In these embodiments, the synchronization controller identifies 301 a minimum progress in the set of code progress indications. At 302, the synchronization controller calculates a progress interval x by subtracting the minimum progress from the progress indication of the monitored hardware processor. At 303, the synchronization controller calculates a synchronization interval y by subtracting x from a predefined maximum synchronization interval ΔT. In these embodiments, ΔT represents a maximum synchronization interval for the time period associated with the progress indication of the monitored hardware processor. Using a maximum synchronization interval, the difference between the progress of a first monitored hardware processor and the progress of a second monitored hardware processor may be kept within a predefined limit. In these embodiments, the units of the progress indication of the monitored hardware processor are the same as the units of ΔT. For example, the units may be a number of steps of the sequence of computation steps. In some embodiments, the time period associated with the progress indication of the monitored hardware processor is less than half of the maximum synchronization interval.
In some embodiments, the synchronization controller calculates 304 a modification term by subtracting the reciprocal of ΔT from the reciprocal of y, and at 305 the synchronization controller calculates the synchronization delay by adding the modification term to the progress interval x.
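Thus, the calculation of the synchronization delay $G_n$ of the hardware processor can be described by the following formula:
$$G_n = x + \frac{1}{y} - \frac{1}{\Delta T} = x + \frac{1}{\Delta T - x} - \frac{1}{\Delta T}$$
where x is the progress interval, y = ΔT − x is the synchronization interval, and ΔT is the maximum synchronization interval.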
In these embodiments, the synchronization delay has a non-linear relationship with the progress interval. For a relatively slow monitored hardware processor, the synchronization delay approaches the progress interval as the progress indication approaches the minimum progress. However, as the progress interval of the monitored hardware processor approaches the maximum synchronization interval, the synchronization delay increases toward infinity. This may prevent loss of synchronization within a time window having a length equal to the maximum synchronization interval, by completely preventing the monitored hardware processor from executing much faster than the slowest monitored hardware processor.
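The calculation at 301 to 305 can be sketched in Python as follows. The function name `synchronization_delay` and its parameter names are illustrative assumptions. Returning an infinite delay when the progress interval reaches or exceeds the maximum synchronization interval is also an assumption, chosen to match the limiting behavior described above; the description does not say how that boundary case is handled.

```python
import math


def synchronization_delay(progress: float, min_progress: float, delta_t: float) -> float:
    """Sketch of steps 302 to 305; all names here are hypothetical."""
    if delta_t <= 0:
        raise ValueError("maximum synchronization interval must be positive")
    x = progress - min_progress            # 302: progress interval
    y = delta_t - x                        # 303: synchronization interval
    if y <= 0:
        # Assumption: at or beyond the limit the processor is held back indefinitely.
        return math.inf
    modification = 1.0 / y - 1.0 / delta_t  # 304: modification term
    return x + modification                 # 305: synchronization delay


if __name__ == "__main__":
    # With a maximum synchronization interval of 10 units and a minimum progress of 5:
    for reported in (5, 8, 12, 14, 14.9):
        print(reported, synchronization_delay(reported, 5, 10))
```

For the slowest processor the progress interval is zero and so is the delay. As the reported progress nears the minimum progress plus the maximum synchronization interval, the reciprocal term dominates and the delay grows without bound.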
In some embodiments of the invention, the monitored hardware processor adjusts its execution speed according to the synchronization delay received from the synchronization controller. Referring now to FIG. 4, shown therein is a sequence diagram of a third alternative operational flow 400, involving delayed execution of a portion of distributed code, in accordance with some embodiments of the present invention. In these embodiments, the monitored hardware processor executes at least one monitored code comprising a sequence of atomic computation steps. The monitored hardware processor performs 401 a computation step of the sequence of atomic computation steps and then increments 402 a step counter. Optionally, the monitored hardware processor increments a simulated time counter. At 403, the monitored hardware processor sends the value of the step counter to the synchronization controller. Optionally, the monitored hardware processor sends the value of the simulated time counter to the synchronization controller. In some embodiments, the value of the step counter is associated with a time period. In some embodiments, the monitored hardware processor receives 404 a synchronization delay value from the synchronization controller. In some embodiments, the synchronization delay value is a delay between atomic computation steps in the sequence of atomic computation steps. In these embodiments, the monitored hardware processor calculates 405 a time interval consistent with the synchronization delay value. Next, in such embodiments, the monitored hardware processor performs 406 further computation steps of the sequence and stops executing, for the calculated time interval, between every two consecutive computation steps of those further steps.
By stopping execution between every two successive computation steps, the execution progress of the monitored code on the monitored hardware processor will be slowed down to synchronize with the speed specified by the synchronization controller. In some embodiments, the synchronization controller specifies a speed that is synchronized to the speed of the slowest monitored hardware processor.
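The pacing behavior on the monitored hardware processor can be sketched in Python as below. The function `run_steps`, the representation of each atomic computation step as a callable, and the injectable `sleep` parameter are assumptions made for illustration; reporting the counter to the synchronization controller (403) and receiving the delay (404) are omitted.

```python
import time
from typing import Callable, Iterable


def run_steps(steps: Iterable[Callable[[], None]],
              pause_seconds: float,
              sleep: Callable[[float], None] = time.sleep) -> int:
    """Runs atomic steps, pausing between consecutive ones (hypothetical sketch)."""
    step_counter = 0
    for step in steps:
        if step_counter > 0:
            # Stop executing between two consecutive computation steps.
            sleep(pause_seconds)
        step()             # Each step runs to completion before any pause.
        step_counter += 1  # The counter is what would be reported as progress.
    return step_counter
```

Because the pause is inserted only on step boundaries, a step is never interrupted, which keeps the steps atomic in the sense used above. Passing a different `sleep` callable makes the sketch easy to exercise without actually waiting.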
In embodiments comprising a plurality of monitored hardware processors executing at least one monitored code, the processor speed of one of the plurality of monitored hardware processors may differ from the processor speed of a second monitored hardware processor of the plurality of monitored hardware processors. Optionally, the processor speed is a number of instructions per second. Optionally, the processor speed is a number of hardware processor cycles per second. The time period associated with a progress indication sent by the monitored hardware processor to the synchronization controller contains a certain number of hardware processor cycles. The time interval for stopping execution, in accordance with the synchronization delay value received from the synchronization controller, may be calculated using the number of hardware processor cycles within the time period and the processor speed. Referring now to FIG. 5, shown therein is a sequence diagram of a fourth alternative operational flow 500, involving calculating a time interval, in accordance with some embodiments of the present invention. In these embodiments, the monitored hardware processor calculates 501 a denominator by multiplying the number of hardware processor cycles per second by the number of hardware processor cycles in the time period. Next, in such embodiments, the monitored hardware processor divides 502 the synchronization delay value by the calculated denominator to obtain the time interval between execution of successive atomic computation steps.
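Steps 501 and 502 amount to a single division, sketched here in Python. The function name `pause_interval` and its parameter names are hypothetical, and the description does not state the units of the result, so none are assumed here.

```python
def pause_interval(sync_delay: float,
                   cycles_per_second: float,
                   cycles_in_period: float) -> float:
    """Sketch of steps 501 and 502; names are illustrative assumptions."""
    # 501: denominator is the processor speed times the cycles in the reporting period.
    denominator = cycles_per_second * cycles_in_period
    if denominator <= 0:
        raise ValueError("cycle counts must be positive")
    # 502: divide the received synchronization delay value by the denominator.
    return sync_delay / denominator
```

Dividing by the processor speed means that, for the same synchronization delay value, a faster monitored hardware processor obtains a shorter interval per step than a slower one.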
In some embodiments of the invention, the synchronization controller configures the time interval at which a monitored hardware processor of the plurality of monitored hardware processors sends progress indications to the synchronization controller. In these embodiments, the time period associated with the progress indication is that time interval.
Furthermore, the present invention, in some embodiments thereof, provides a centralized schedule reference through the use of a synchronization controller and allows for centralized control of the progress of execution of the at least one monitored code. In these embodiments, the synchronization controller may speed up or slow down the entire execution of the at least one monitored code by a plurality of synchronization delay values sent to the plurality of monitored hardware processors.
The description of the various embodiments of the present invention is presented for purposes of illustration, and is not intended to be exhaustive or limited to the embodiments disclosed. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the described embodiments. The terminology used herein was chosen to best explain the principles of the embodiments, the practical application or technical improvement over technologies available in the market, or to enable others of ordinary skill in the art to understand the embodiments disclosed herein.
It is expected that during the life of a patent maturing from this application many relevant monitored codes will be developed, and the scope of the term monitored code is intended to include all such new technologies a priori.
The term "about" as used herein means ± 10%.
The terms "including" and "having" mean "including but not limited to". These terms encompass the terms "consisting of" and "consisting essentially of".
The phrase "consisting essentially of …" means that the composition or method may include additional ingredients and/or steps, provided that the additional ingredients and/or steps do not materially alter the basic and novel characteristics of the claimed composition or method.
As used herein, the singular forms "a", "an" and "the" include plural references unless the context clearly dictates otherwise. For example, the term "a complex" or "at least one complex" may include a plurality of complexes, including mixtures thereof.
The word "exemplary" is used herein to mean "serving as an example, instance, or illustration". Any "exemplary" embodiment is not necessarily to be construed as preferred or advantageous over other embodiments and/or to exclude the incorporation of features from other embodiments.
The word "optionally" is used herein to mean "provided in some embodiments and not provided in other embodiments". Any particular embodiment of the invention may incorporate a plurality of "optional" features, unless these features contradict each other.
Throughout this application, various embodiments of the present invention may be presented in a range format. It is to be understood that the description of the range format is merely for convenience and brevity and should not be construed as a fixed limitation on the scope of the present invention. Accordingly, the description of a range should be considered to have specifically disclosed all the possible sub-ranges as well as individual numerical values within that range. For example, description of a range such as from 1 to 6 should be considered to have specifically disclosed sub-ranges such as from 1 to 3, from 1 to 4, from 1 to 5, from 2 to 4, from 2 to 6, from 3 to 6, etc., as well as individual numbers within the range, such as 1, 2, 3, 4, 5, and 6.
Whenever a numerical range is indicated herein, it is meant to include any cited number (fractional or integer) within the indicated range. The phrases "ranging between a first indicated number and a second indicated number" and "ranging from a first indicated number to a second indicated number" are used interchangeably herein and are meant to include the first and second indicated numbers and all fractions and integers in between.
It is appreciated that certain features of the invention, which are, for clarity, described in the context of separate embodiments, may also be provided in combination in a single embodiment. Conversely, various features of the invention which are, for brevity, described in the context of a single embodiment, may also be provided separately or in any suitable subcombination or as suitable in any other described embodiment of the invention. Certain features described in the context of various embodiments are not to be considered essential features of those embodiments, unless the embodiment is inoperative without those features.
All publications, patents and patent applications mentioned in this specification are herein incorporated in their entirety by reference into the specification, to the same extent as if each individual publication, patent or patent application was specifically and individually indicated to be incorporated herein by reference. In addition, citation or identification of any reference in this application shall not be construed as an admission that such reference is available as prior art to the present invention. To the extent that section headings are used, they should not be construed as necessarily limiting.

Claims (13)

1. A synchronous distributed system, comprising a synchronization controller and at least one hardware processor, wherein the synchronization controller is connected to each hardware processor, and wherein:
the synchronization controller is configured to:
instructing a plurality of monitored hardware processors to begin executing at least one monitored code, the at least one monitored code comprising a sequence of computational steps;
receiving a progress indication associated with a time period from one of the plurality of monitored hardware processors; the progress indication is a quantity of steps of a sequence of computational steps performed within a time period associated with the progress indication;
updating a code progress indication set according to the progress indication; the set of code progress indications comprises: a last code progress indication for each of the at least one monitored code received from a hardware processor executing the at least one monitored code;
calculating a synchronization delay for one of the plurality of monitored hardware processors using the set of code progress indications;
sending the synchronization delay to the one of the plurality of monitored hardware processors to allow the one of the plurality of monitored hardware processors to stop executing between each two consecutive computation steps of the sequence of computation steps to reach a time interval that coincides with the synchronization delay;
wherein the synchronization controller is configured to calculate the synchronization delay by:
identifying a minimum progress in the set of code progress indications;
calculating a progress gap x by subtracting the minimum progress from the progress indication;
calculating a synchronization interval y by subtracting said x from a maximum synchronization interval Δ T of said time period;
calculating a modification term by subtracting the reciprocal of Δ T from the reciprocal of y;
calculating a synchronization delay by adding the modification term to the x.
2. The synchronous distributed system according to claim 1, wherein said plurality of monitored hardware processors are configured to send periodic progress indications to said synchronization controller at regular time intervals.
3. The synchronous distributed system according to claim 2, wherein the fixed time interval has a value between 1 millisecond and 5 milliseconds.
4. The synchronous distributed system according to any one of claims 1 to 3, wherein said synchronization controller and said plurality of monitored hardware processors are connected to a data communication network.
5. The synchronous distributed system according to claim 4, wherein said data communication network is a Dynamic Circuit Network (DCN).
6. The synchronous distributed system of claim 4, wherein said data communications network is a switched data communications network.
7. The synchronous distributed system according to any of claims 1 to 3, wherein said progress indications are step quantities of said sequence of calculation steps.
8. The synchronous distributed system according to any of claims 1 to 3, wherein said progress indication is an amount of time.
9. The synchronous distributed system according to any of claims 1 to 3, wherein said at least one monitored code is a simulator.
10. The synchronous distributed system according to any one of claims 1 to 3, wherein at least one of said plurality of monitored hardware processors is configured to execute a plurality of monitored codes.
11. A method for synchronizing a distributed system, comprising:
on the synchronization controller:
instructing a plurality of monitored hardware processors to begin executing at least one monitored code, the at least one monitored code comprising a sequence of computational steps;
receiving a progress indication associated with a time period from one of the plurality of monitored hardware processors; the progress indication is a quantity of steps of a sequence of computational steps performed within a time period associated with the progress indication;
updating a code progress indication set according to the progress indication; the set of code progress indications comprises: a last code progress indication for each of the at least one monitored code received from a hardware processor executing the at least one monitored code;
calculating a synchronization delay for one of the plurality of monitored hardware processors using the set of code progress indications;
sending the synchronization delay to the one of the plurality of monitored hardware processors;
wherein the synchronization delay is calculated at the synchronization controller by:
identifying a minimum progress in the set of code progress indications;
calculating a progress gap x by subtracting the minimum progress from the progress indication;
calculating a synchronization interval y by subtracting said x from a maximum synchronization interval Δ T of said time period;
calculating a modification term by subtracting the reciprocal of Δ T from the reciprocal of y;
calculating a synchronization delay by adding the modification term to the x.
12. A method for adjusting execution speed of a monitored hardware processor based on the system of claim 1, comprising:
at the monitored hardware processor:
executing a calculation step;
incrementing a step counter;
sending a value of the step counter associated with a time period to a synchronization controller;
receiving a synchronization delay value from the synchronization controller; wherein the synchronization delay value is calculated on the synchronization controller by: identifying a minimum progress in the set of code progress indications; calculating a progress gap x by subtracting the minimum progress from the progress indication; calculating a synchronization interval y by subtracting said x from a maximum synchronization interval Δ T of said time period; calculating a modification term by subtracting the reciprocal of Δ T from the reciprocal of y; calculating a synchronization delay by adding the modification term to the x;
calculating a time interval consistent with the synchronization delay value;
executing a further sequence of calculation steps and stopping execution in said time interval between every two consecutive calculation steps of said further sequence of calculation steps.
13. The method of claim 12, wherein calculating a time interval comprises:
calculating a denominator by multiplying the number of hardware processor cycles per second by the number of hardware processor cycles within the time period;
dividing the synchronization delay value by the denominator.
CN201780090777.3A 2017-05-18 2017-05-18 System and method for synchronous distributed multi-node code execution Active CN110914806B (en)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/EP2017/061914 WO2018210419A1 (en) 2017-05-18 2017-05-18 System and method of synchronizing distributed multi-node code execution

Publications (2)

Publication Number Publication Date
CN110914806A CN110914806A (en) 2020-03-24
CN110914806B true CN110914806B (en) 2022-06-14

Family

ID=58709990

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201780090777.3A Active CN110914806B (en) 2017-05-18 2017-05-18 System and method for synchronous distributed multi-node code execution

Country Status (2)

Country Link
CN (1) CN110914806B (en)
WO (1) WO2018210419A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112732493B (en) * 2021-03-30 2021-06-18 恒生电子股份有限公司 Method and device for newly adding node, node of distributed system and storage medium

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101588494A (en) * 2009-06-30 2009-11-25 华为技术有限公司 Method for processing media stream, communication system, and relative devices
CN102316575A (en) * 2006-01-18 2012-01-11 华为技术有限公司 Synchronized method and system thereof in communication system
CN103237191A (en) * 2013-04-16 2013-08-07 成都飞视美视频技术有限公司 Method for synchronously pushing audios and videos in video conference
CN104798069A (en) * 2012-09-18 2015-07-22 诺基亚技术有限公司 Methods, apparatuses and computer program products for providing a protocol to resolve synchronization conflicts when synchronizing between multiple devices

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP5568048B2 (en) * 2011-04-04 2014-08-06 株式会社日立製作所 Parallel computer system and program

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
A Very Fast Simulator for Exploring the Many-Core Future;Olivier Certner,Zheng Li,Arun Raman et;《IEEE》;20110908;第443-454页 *

Also Published As

Publication number Publication date
WO2018210419A1 (en) 2018-11-22
CN110914806A (en) 2020-03-24

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant