US9244701B2 - Method for system scenario based design of dynamic embedded systems - Google Patents
Method for system scenario based design of dynamic embedded systems
- Publication number
- US9244701B2 (application US13/940,247)
- Authority
- US
- United States
- Prior art keywords
- scenario
- cost
- based design
- system scenario
- values
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Fee Related, expires
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/44—Arrangements for executing specific programs
- G06F9/445—Program loading or initiating
- G06F9/44505—Configuring for program initiating, e.g. using registry, configuration files
-
- G06F17/5045—
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F30/00—Computer-aided design [CAD]
- G06F30/30—Circuit design
Definitions
- The present disclosure is related to the field of generic and systematic design-time/run-time methods capable of handling the dynamic nature of embedded systems.
- Real-time embedded systems have become much more complex due to the introduction of a lot of new functionality in one application, and due to running multiple applications concurrently. This increases the dynamic nature of today's applications and systems and tightens the requirements for their constraints in terms of deadlines and energy consumption. State-of-the-art design methodologies try to cope with these issues by identifying several most used cases and dealing with them separately, reducing the newly introduced complexity.
- Embedded systems usually comprise processors that execute domain-specific applications. These systems are software intensive, having much of their functionality implemented in software, which is running on one or multiple processors, leaving only the high performance functions implemented in hardware. Typical examples include TV sets, cellular phones, wireless access points, MP3 players and printers. Most of these systems are running multimedia and/or telecom applications and support multiple standards. Thus, these applications are full of dynamism, i.e., their execution costs (e.g., number of processor cycles, memory usage, energy) are environment dependent (e.g., input data, processor temperature).
- Scenario-based design in general has been used for some time in both hardware and software design of embedded systems. In both of these cases, scenarios concretely describe, in an early phase of the development process, the use of a future system. These scenarios are called use-case scenarios. They focus on the application functional and timing behaviours and on the interaction with the users and environment, and not on the resources required by a system to meet its constraints. These scenarios are used as an input for design approaches centred round the application context.
- A different and complementary type of scenarios is considered here, called system scenarios.
- System scenario-based design methodologies have recently been successfully applied to reduce the costs of dynamic embedded systems. They provide a systematic way of constructing workload-adaptive embedded systems and have already been proposed for multimedia and wireless domains.
- the system is separately optimized for a set of system scenarios with different costs, e.g., alternative mapping and scheduling of tasks on multiprocessor systems.
- certain parameters are monitored, changes to the current scenario situation are detected, and mapping and scheduling are changed accordingly.
- control variable-based system scenario approaches based on bottom-up clustering have been studied in-depth in the literature.
- TCM: Task Concurrency Management
- An application is divided into thread frames, each consisting of a number of thread nodes, as will be further detailed below.
- each thread node is profiled to find its execution time and power consumption for all possible input data and on all possible processors on the platform.
- By profiling is meant the simulation or hardware-based emulation of the system behaviour to obtain the system responses for a representative set of input stimuli.
- the resulting numbers are used to find all thread frame schedules with an optimal trade-off between execution time and energy consumption.
- a schedule candidate is optimal if it, e.g., has the lowest energy consumption for a given execution time.
- each thread frame has a set of optimal schedules along a curve in a two-dimensional execution time-energy consumption solution space.
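- The selection of optimal schedules along a time-energy curve can be illustrated with a minimal sketch. It assumes that each schedule candidate is reduced to a hypothetical `(execution_time, energy)` pair and that both values are to be minimized; the function name and the sample numbers are illustrative only and are not taken from the disclosure.

```python
from typing import List, Tuple

Schedule = Tuple[float, float]  # (execution_time, energy), both to be minimized


def pareto_front(schedules: List[Schedule]) -> List[Schedule]:
    """Return the schedules that no other schedule beats in both time and energy."""
    front: List[Schedule] = []
    best_energy = float("inf")
    # After sorting by time (then energy), a point is kept only if it
    # strictly lowers the best energy seen so far.
    for time, energy in sorted(set(schedules)):
        if energy < best_energy:
            front.append((time, energy))
            best_energy = energy
    return front


# Hypothetical profiling results for one thread frame.
candidates = [(10.0, 5.0), (12.0, 4.0), (11.0, 6.0), (15.0, 4.5), (9.0, 8.0)]
print(pareto_front(candidates))
```

- A candidate such as (11.0, 6.0) is dropped because (10.0, 5.0) is both faster and cheaper in energy.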
- system scenarios can be used to find a thread frame Pareto-curve for each of the individual scenarios.
- Each system scenario corresponds then to a different (non-overlapping) cluster of run-time situations that are similar enough in their Pareto-curve positions (close enough in the N-dimensional trade-off space, see also further).
- input data is monitored to keep track of the currently active scenario.
- the scenarios are derived from the combination of the application behaviour and the application mapping on the system platform. These scenarios are used to reduce the system cost by exploiting information about what can happen at run-time to make better design decisions at design-time, and to exploit the time-varying behaviour at run-time. While use-case scenarios classify the application's behaviour based on the different ways the system can be used in its over-all context, system scenarios classify the behaviour based on the multi-dimensional cost trade-off during the implementation trajectory. By optimizing the system per scenario and by ensuring that the actual system scenario is predictable at run-time, a system setting can be derived per scenario to optimally exploit the system scenario knowledge.
- FIG. 1 depicts a design trajectory using use-case and system scenarios. It starts from a product idea, for which the product's functionality is manually defined as use-case scenarios 1, 2, and 3. These scenarios characterize the system from a user perspective and are used as an input to the design of an embedded system that includes both software and hardware components. In order to optimize the design of the system, the detection and usage of system scenarios augments this trajectory (the cost perspective box of FIG. 1 ). The run-time behaviour of the system is classified into several system scenarios (A and B in FIG. 1 ), with similar cost trade-offs within a scenario. For each individual scenario, more specific and aggressive design decisions can be made.
- The sets of use-case scenarios and system scenarios are not necessarily disjoint, and it is possible that one or more use-case scenarios correspond to one system scenario. Usually, however, the two sets do not coincide: it is likely that a use-case scenario is split into several system scenarios, or even that several system scenarios intersect several use-case scenarios.
- the system scenario-based design methodology is a powerful tool that can also be used for fine grain optimizations at the task abstraction level and for simultaneous optimization of multiple system costs.
- The ability to handle multiple and non-linear system costs differentiates system scenario-based design methodologies from the dynamic run-time managers intended for Dynamic Voltage and Frequency Scaling (DVFS) type platforms.
- DVFS methodologies concentrate on optimization of a single cost (the energy consumption of the system), that scales monotonically with frequency and voltage. They perform direct selection of the system reconfiguration from the current workload situation. This, however, cannot be generalized for costs that depend on the parameters in a non-uniform way. That makes the decision in one run-time step too complex.
- Scenario-based design methodologies solve this problem by a two-stage approach decided at run-time: they first identify what scenario the working situation belongs to, and then choose the best system reconfiguration for that scenario. Since the relationship between the parameters and the costs will, in practice, be very complex, the scenario identification is, however, performed at design-time.
- RTS: run-time situation
- An RTS is a piece of system execution with an associated cost that is treated as a unit. The cost usually consists of one or several primary costs, like quality and resource usage (e.g., number of processor cycles, memory size).
- the system execution on a given system platform is a sequence of RTSs. One complete run of the application on the target platform represents the sequence of RTSs. The current RTS is known only at the moment it occurs.
- Using the RTS parameters, it can be predicted in advance in which RTS the system will run next for a non-zero future time window. If the information about all possible RTSs in which a system may run is known at design-time, and the RTSs are considered in different steps of the embedded system design, a better optimized (e.g., faster or more energy efficient) system can be built because specific and aggressive design decisions can be made for each RTS. These intermediate per-RTS optimizations lead to a smaller, cheaper and more energy efficient system that can deliver the required quality. In general, any combination of N cost dimensions may be targeted. However, the number of cost dimensions and all possible values of the considered RTS parameters may lead to an exponential number of RTSs.
- The RTSs are classified and clustered from an N-dimensional cost perspective into system scenarios, such that the cost trade-off combinations within a scenario are always fairly similar (i.e., their Euclidean distance in the N-dimensional cost space is relatively small), the RTS parameter values allow an accurate prediction, and a system setting can be defined that exploits the scenario knowledge and optimizations.
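- As a minimal sketch of grouping RTSs whose cost tuples lie close together in the N-dimensional cost space, the following greedy procedure assigns each RTS to the first cluster whose seed lies within a given Euclidean distance. The `radius` parameter, the seeding rule and the sample costs are assumptions made for illustration; the disclosure does not prescribe this particular procedure.

```python
import math
from typing import List, Sequence


def cluster_by_cost(costs: Sequence[Sequence[float]], radius: float) -> List[List[int]]:
    """Group RTS indices whose cost tuples lie within `radius` of a cluster seed."""
    clusters: List[List[int]] = []       # RTS indices per cluster
    seeds: List[Sequence[float]] = []    # cost tuple of the first member of each cluster
    for index, cost in enumerate(costs):
        for members, seed in zip(clusters, seeds):
            if math.dist(cost, seed) <= radius:
                members.append(index)
                break
        else:
            # No existing cluster is close enough: start a new one.
            clusters.append([index])
            seeds.append(cost)
    return clusters


# Hypothetical two-dimensional costs, e.g. (cycles, energy) in arbitrary units.
rts_costs = [(1.0, 1.0), (1.2, 0.9), (5.0, 5.0), (5.1, 4.8), (1.1, 1.1)]
print(cluster_by_cost(rts_costs, radius=0.5))
```

- The result depends on the order in which the RTSs are visited, which is one reason a real design flow would likely use a more careful clustering step.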
- A scenario identification technique therefore lies at the heart of any system scenario-based design methodology. It determines how the different observed RTSs should be divided into groups with similar costs, i.e., the system scenarios, and how these system scenarios should be represented to make their run-time prediction as simple as possible.
- parameters that decide the scenario boundaries have been limited to control variables, or variables with a limited number of distinct values.
- the relevant RTS parameters are selected and the RTSs are clustered into scenarios. This clustering is based on the cost trade-offs of the RTSs, or an estimate thereof.
- the identification step should take as much as possible into account the overhead costs introduced in the system by the following steps of the methodology. As this is not easy to achieve, an alternative solution is to refine the scenario identification (i.e., to further cluster RTSs) during these steps.
- A task-level scenario identification can be split into two steps.
- the variables in the application code are analyzed, either statically, or through profiling of the application with a representative data set.
- the variables having most impact on the run-time cost of the system are determined.
- These variables are called RTS parameters, denoted by ξ1, ξ2, . . . , ξk, and are used to characterize system scenarios and design the scenario prediction mechanism.
- The number N of RTS signatures depends on the number of RTS parameters and on how many different values each of them can take; it can therefore range from small to very large.
- the RTS signatures are divided into groups with similar costs—the system scenarios. This can be done by a bottom-up clustering of RTS signatures with a resulting multi-valued decision diagram (MDD) that is used as a predictor for the upcoming system scenario (see step of scenario prediction).
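- A bottom-up clustering can be sketched in one cost dimension as follows. Every RTS signature starts as its own group, and the two adjacent groups with the smallest cost gap are merged until the requested number of scenarios remains. The merge criterion and the single cost dimension are simplifying assumptions, and the construction of the decision diagram used for prediction is not shown.

```python
from typing import List


def bottom_up_cluster(costs: List[float], num_scenarios: int) -> List[List[float]]:
    """Merge adjacent cost groups, smallest gap first, until num_scenarios remain."""
    if num_scenarios < 1:
        raise ValueError("num_scenarios must be at least 1")
    groups = [[cost] for cost in sorted(costs)]
    while len(groups) > num_scenarios:
        # Gap between neighbours: lowest cost of the next group minus
        # highest cost of the current group.
        gaps = [groups[i + 1][0] - groups[i][-1] for i in range(len(groups) - 1)]
        i = gaps.index(min(gaps))
        groups[i:i + 2] = [groups[i] + groups[i + 1]]
    return groups


# Hypothetical per-signature costs.
print(bottom_up_cluster([6.0, 13.0, 9.1, 6.2, 9.0], num_scenarios=3))
```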
- RTS parameters can be evaluated either statically or through profiling.
- the frequencies of occurrence of different RTS parameter values are not taken into consideration. Therefore, system scenarios may be produced that almost never occur.
- This technique can be extended with profiling information, and then forms a system scenario set that exploits run-time statistics. This approach typically leads to only a limited amount of parameters being labelled as important enough to incorporate in the identification step, which is crucial to limit the complexity.
- a scenario has to be selected from the scenario set based on the actual parameter values.
- This selection process is referred to as scenario prediction.
- the parameter values may not be known before the RTS starts, so they may have to be estimated.
- Prediction is not a trivial task; both the number of parameters and the number of scenarios may be considerable, so a simple lookup in a list of scenarios may not be feasible.
- the prediction incurs a certain run-time overhead, which depends on the chosen scenario set. Therefore, the scenario set may be refined based on the prediction overhead. In this step two decisions are made at design-time, namely selection of the run-time detection algorithm and scenario set refinement.
- the exploitation step is essentially based on some optimization that is applied when no scenario approach is applied.
- a scenario approach can simply be put on top of this optimization by applying the optimization to each scenario of the scenario set separately. Using the additional scenario information enables better optimization.
- Switching is the act of changing the system from one set of knob positions (see below) to another. This implies some overhead (e.g., time and energy), which may be large (e.g., when migrating a task from one processor to another). Therefore, even when a certain scenario (different from the current one) is predicted, it is not always a good idea to switch to it, because the overhead may be larger than the gain.
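- The trade-off between switching overhead and gain can be sketched as a simple run-time check. The sketch assumes a single scalar cost per RTS and a hypothetical estimate `expected_runs` of how many RTSs the predicted scenario will stay active; neither the names nor the decision rule come from the disclosure.

```python
def should_switch(current_cost: float, predicted_cost: float,
                  switch_overhead: float, expected_runs: int) -> bool:
    """Switch only if the accumulated gain outweighs the one-off switching overhead."""
    gain = (current_cost - predicted_cost) * expected_runs
    return gain > switch_overhead


# Hypothetical numbers: saving 5 units per RTS, paying 20 units to switch.
print(should_switch(13.0, 8.0, switch_overhead=20.0, expected_runs=3))
print(should_switch(13.0, 8.0, switch_overhead=20.0, expected_runs=5))
```

- With three expected runs the gain (15) stays below the overhead (20), so the system would remain in its current scenario; with five runs the gain (25) justifies the switch.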
- The switching step selects at design-time an algorithm that is used at run-time to decide whether to switch or not. It also introduces into the application the way to implement the switch, and refines the scenario set by taking into account the switching overhead.
- the previous steps of the methodology make different choices (e.g., scenario set, prediction algorithm) at design-time that depend very much on the values that the RTS parameters typically have at run-time; it makes no sense to support a certain scenario if in practice it (almost) never occurs.
- profiling augmented with static analysis can be used.
- The ability to predict the actual run-time environment, including the input data, is obviously limited. Therefore, support is also foreseen for infrequent calibration at run-time, which complements all the methodology steps previously described.
- the disclosure relates to a method for system scenario based design for an embedded platform whereon a dynamic application is implemented, whereby said dynamic application has to meet at least one guaranteed constraint and whereby the application has temporal correlations in the behaviour of internal data variables used in the dynamic application.
- the internal data variables represent parameters used for executing a portion of the application.
- the present disclosure proposes determining cost regions that are derived from data variables, from which corresponding system scenarios can be obtained. More particularly, correlations over time assumed present between the internal data variables used in the application (algorithm) are exploited.
- An N-dimensional cost function is determined for the implementation of the application for a set of combinations of the internal data variables.
- the cost space is partitioned into a number of bounded regions. Each bounded region clusters combinations of internal variable values that have a similar cost and also have a certain, sufficiently large frequency of occurrence. In contrast, a separate bounded region is provided that collects rarely occurring combinations of internal data variable values.
- the proposed solution yields a gain in cost due to the fact that instead of the overall worst-case cost solution for the system, a worst-case cost solution per a bounded region is now available.
- the method comprises a step of subdividing the at least two bounded regions into one or more system scenarios and clustering within a system scenario the tuples of internal data variable values that have similar cost and sufficiently large frequency of occurrence.
- a separate scenario is provided that corresponds to the above-mentioned separate bounded region provided for rarely occurring combinations. This is called the back-up scenario.
- the method comprises a step of performing a backward reasoning on an observed set of internal data variable values for determining the bounded region the observed set belongs to. This subdivision in the variable space enables run-time detection of the scenarios.
- each of the system scenarios is represented by taking the combination of values of internal data variables with the highest cost impact in each system scenario.
- This representation, i.e., this set of internal data variable values, is then used in the implementation of the application, e.g., for performing calculations on the platform. Because the frequently occurring RTSs are clustered in scenarios that each correspond to a bounded region with a lower than “overall worst-case” maximum cost, the average cost goes down. Different costs can for example be due to the use of an alternative mapping or scheduling of tasks in one scenario as compared to another.
- the original guaranteed constraint imposed to the embedded system is met for each bounded region separately by the worst-case cost (i.e. the N-dimensional cost tuple in the cost space) that can occur in said bounded region. In that way it is assured that the application continues to meet this constraint, also when various scenarios for the application are defined.
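- The cost gain of a per-region worst case over a single overall worst case can be illustrated with a small calculation. The sketch assumes one cost dimension, a list of profiled costs, and regions described only by their upper bounds, where the last region plays the role of the back-up region and covers the overall worst case. All values are hypothetical.

```python
from bisect import bisect_left
from typing import List


def average_cost_with_regions(observed: List[float], upper_bounds: List[float]) -> float:
    """Average cost when every observation pays the worst case of its bounded region."""
    bounds = sorted(upper_bounds)
    if max(observed) > bounds[-1]:
        raise ValueError("the last region must cover the overall worst case")
    # bisect_left finds the first region whose upper bound is >= the cost.
    total = sum(bounds[bisect_left(bounds, cost)] for cost in observed)
    return total / len(observed)


profile = [6.0, 6.0, 7.0, 7.0, 7.0, 8.0, 13.0]  # hypothetical profiled costs
print(average_cost_with_regions(profile, [13.0]))       # one region: overall worst case
print(average_cost_with_regions(profile, [8.0, 13.0]))  # frequent region plus back-up region
```

- In both configurations every observation is still covered by a bound that is at least as large as its own cost, which is what keeps the guaranteed constraint intact.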
- The N-dimensional cost function takes into account at least two costs of the group {area, memory size, dynamic energy, leakage energy}.
- the cost related to system scenario detection at run-time and/or the cost related to a switch of system scenario at run-time is taken into account.
- temporal correlations in one or more input signals of the dynamic application are exploited.
- Said exploiting then advantageously comprises predicting to which range of a plurality of input signal value ranges future values of said input signal belong.
- the predicted range is next preferably taken into account when identifying a scenario.
- the frequency of occurrence of input signal values is taken into account. This further allows keeping the behaviour within certain limits such that the risk of not meeting the imposed constraint is reduced or even avoided.
- Said exploiting comprises predicting to which range of a plurality of input signal value ranges future values of the input signal belong, said input signal value ranges being indicated by threshold levels.
- a step of trimming is then performed to fine-tune the threshold level values until a desired threshold level distribution is reached.
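- Mapping signal values onto ranges delimited by threshold levels can be sketched as below. The sketch assumes ascending threshold levels and uses a histogram of range occupancy as a hypothetical aid for trimming; the actual trimming procedure and the target distribution are not specified here.

```python
from bisect import bisect_right
from collections import Counter
from typing import List, Sequence


def range_index(value: float, thresholds: Sequence[float]) -> int:
    """Index of the value range a sample falls into (0 = below the first threshold)."""
    return bisect_right(thresholds, value)


def range_histogram(samples: Sequence[float], thresholds: Sequence[float]) -> List[int]:
    """Count how many samples fall into each of the len(thresholds) + 1 ranges."""
    counts = Counter(range_index(sample, thresholds) for sample in samples)
    return [counts[i] for i in range(len(thresholds) + 1)]


# Two hypothetical threshold levels give three value ranges.
levels = [-0.5, 0.5]
signal = [-0.9, -0.2, 0.0, 0.3, 0.7, 0.6]
print(range_histogram(signal, levels))
```

- If the histogram is too uneven for the intended use, the threshold levels could be moved and the histogram recomputed until the occupancy is acceptable.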
- data curves of input signals waveforms are buffered and exploited in the prediction.
- the step of partitioning comprises performing a balancing function for determining the ranges of the bounded regions.
- the method comprises a step of ordering the combinations of values of the internal data variables according to the cost function value they yield.
- the N-dimensional cost space is advantageously partitioned in polyhedrons.
- In one embodiment, a top-down partitioning for an arbitrarily large domain is applied.
- FIG. 1 illustrates a scenario-based design flow for embedded systems.
- FIG. 2 illustrates a two-dimensional cost space divided in regions.
- FIG. 3 illustrates the concept of system scenario identification.
- FIG. 4 represents an example of clustering of signal values in three levels.
- FIG. 5 illustrates the original input of a digital filter.
- FIG. 6 illustrates clustered input of the digital filter.
- FIG. 7 illustrates prediction curves coding.
- the goal of a scenario based method is, given an embedded system, to exploit at design-time the possible RTSs, without getting into an explosion of detail. If the environment, the inputs, and the hardware architecture status would always be the same, then it would be possible to optimally tune the system to that particular situation. However, since a lot of parameters are changing all the time, the system must be designed for the worst case situation. Still, it is possible to tune the system at run-time based on the actual RTS. If this has to happen entirely at run-time, the overhead is most likely too large. So, an optimal configuration of the system is selected up front, at design-time. However, if a different configuration would be stored for every possible RTS, a huge database is required. Therefore, the RTSs similar from the resource usage perspective are clustered together into a single scenario, for which a tuned configuration is stored for the worst case of all RTSs included in it.
- Many system parameters exist that can be tuned at run-time while the system operates, in order to optimize the application behaviour on the platform which it is mapped on. These parameters are called system knobs.
- a huge variety of system knobs is available. Anything that can be changed about the system during operation and that affects system cost (directly or indirectly) can be considered a system knob. The changes do not have to occur at the hardware level; they can occur at the software level as well.
- a particular position or tuning of a system knob is called a knob position. If the knob positions are fully fixed at design-time, then the system always has the same fixed, worst case cost. By configuring knobs while the system is operating, the system cost can be affected. However, tuning knob positions at run-time introduces overhead, which should be taken into account when the system cost is computed.
- A knob position is chosen depending on the actual RTS.
- the appropriate knob position should be set.
- The knob position is not changed during the RTS execution. Therefore, it is necessary to determine which RTS is about to start. This prediction is based on RTS parameters, which have to be observable and which are assumed to remain sufficiently constant during the RTS execution. These parameters together with their values in a given RTS form the RTS snapshot. Taking the example of an H.264 decoder, the RTS corresponds to the decoding of a frame, and the RTS parameter is the frame breakup into the macroblock types.
- the number of distinguishable RTSs from a system is exponential in the number of observable parameters. Therefore, to avoid the complexity of handling all of them at run-time, several RTSs are clustered into a single system scenario.
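- The growth of the number of distinguishable RTSs can be made concrete with a short count. The parameter names and value sets below are purely hypothetical; the point is only that the count is the product of the number of values per parameter, so each added parameter multiplies it.

```python
from itertools import product
from math import prod
from typing import Dict, List, Tuple


def count_rts(parameter_values: Dict[str, List[int]]) -> int:
    """Number of distinguishable RTSs: product of the value counts per parameter."""
    return prod(len(values) for values in parameter_values.values())


def enumerate_rts(parameter_values: Dict[str, List[int]]) -> List[Tuple[int, ...]]:
    """All value combinations, with parameters taken in sorted name order."""
    names = sorted(parameter_values)
    return list(product(*(parameter_values[name] for name in names)))


# Hypothetical observable parameters and their possible values.
params = {"frame_type": [0, 1, 2], "resolution": [0, 1], "channels": [1, 2, 4, 8]}
print(count_rts(params))
params["quality"] = [0, 1, 2, 3]
print(count_rts(params))
```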
- a trade-off is present between optimisation quality and run-time overhead of the scenario exploitation.
- the RTS parameters are used to detect the current scenario rather than the current RTS.
- the same knob position is used for all the RTSs in a scenario, so they all have the same cost value: the worst case of all the RTSs in the scenario. Therefore, as motivated also above, it is best to cluster RTSs which have nearby cost values. Since at run-time any RTS may be encountered, it is necessary to design not one scenario but rather a scenario set.
- a scenario set is a partitioning of all possible RTSs, i.e., each RTS must belong to exactly one scenario.
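- The two properties stated above, namely that a scenario set partitions the RTSs and that each scenario carries the worst-case cost of its members, can be checked with a minimal sketch. The RTS labels, scenario names and costs are hypothetical, and a single scalar cost per RTS is assumed.

```python
from typing import Dict, Set


def is_partition(all_rts: Set[str], scenarios: Dict[str, Set[str]]) -> bool:
    """True if every RTS belongs to exactly one scenario."""
    seen: Set[str] = set()
    for members in scenarios.values():
        if seen & members:      # an RTS appears in two scenarios
            return False
        seen |= members
    return seen == all_rts      # no RTS left out, none unknown


def scenario_costs(rts_cost: Dict[str, float],
                   scenarios: Dict[str, Set[str]]) -> Dict[str, float]:
    """Cost of each (non-empty) scenario: the worst case over its RTSs."""
    return {name: max(rts_cost[rts] for rts in members)
            for name, members in scenarios.items()}


rts_cost = {"r1": 6.0, "r2": 6.5, "r3": 9.0, "r4": 13.0}
scenario_set = {"A": {"r1", "r2"}, "B": {"r3"}, "backup": {"r4"}}
print(is_partition(set(rts_cost), scenario_set))
print(scenario_costs(rts_cost, scenario_set))
```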
- Recent biomedical applications for outpatient care have a dynamic nature and are at the same time subject to strict cost constraints. They continuously monitor a patient's signals for an anomaly and perform specific tasks when the anomaly is detected. They may use complex signal processing on multiple channels and are required to be powered by a battery for months or even years.
- One such example is an epileptic seizure predictor, which tracks electroencephalography (EEG) signals from up to 32 channels and may warn patients of upcoming seizures hours in advance. A part of this predictor performs calculations for each channel once every 10 seconds. Due to different EEG input data, the energy consumption of one calculation can vary widely from 6 mJ to 13 mJ. The peak energy consumption for this application occurs only once in the 6 hours long EEG recording. A system designed based on this worst case energy consumption will consume 829 J/channel while processing the recording.
- An ideal workload-adaptive system is able to configure the system optimally in each run so that it consumes the minimum amount of energy possible.
- a heavily optimized thread-level workload-adaptive design can never be built in practice as the costs of reconfiguring such system, storing the different configurations and predicting them would be excessive.
- a system scenario-based design methodology uses the same concept of adaptively reconfiguring the system, but allows only a limited set of possible configurations.
- a given system scenario has a fixed system cost corresponding to its system configuration. It contains the group of runs for which this configuration is better than any of the other configurations in the limited number of scenarios. For most runs the system will then require a small energy overhead compared with running on the optimal configuration, but far less than the system based on the worst case energy consumption. There will be an added energy consumption related to the scenario detection and reconfiguration, but this can be kept low if the guidelines for scenario based design are followed.
- Internal data variables in the algorithm have partly predictable future evolution in time, which can be derived from these correlations.
- These internal data variables also include the current set of internal state variables. For instance in a FIR filter of order n, the n variables in the n-tap delay line represent the state variables from which all other variables can be derived at a given time instance. These time evolutions are labelled as sequences of RTSs.
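- The role of the delay line as the state of a FIR filter can be sketched as follows. The class keeps one delay-line entry per coefficient, so the output at any time instance is fully determined by that state; the class name, the coefficients and the input samples are illustrative assumptions.

```python
from collections import deque
from typing import List, Sequence


class FirFilter:
    """FIR filter whose only state is its tap delay line."""

    def __init__(self, coefficients: Sequence[float]) -> None:
        self.coefficients: List[float] = list(coefficients)
        # One delay-line entry per tap, newest sample first, initially zero.
        taps = len(self.coefficients)
        self.delay_line = deque([0.0] * taps, maxlen=taps)

    def step(self, sample: float) -> float:
        """Shift in a new sample and return the filter output."""
        self.delay_line.appendleft(sample)  # the oldest sample drops off the right end
        return sum(c * x for c, x in zip(self.coefficients, self.delay_line))


fir = FirFilter([0.5, 0.5])  # hypothetical two-tap moving average
print([fir.step(x) for x in [1.0, 3.0, 5.0]])
```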
- This input data and internal state variable behaviour is in general very non-uniform, though, and it can only be represented by a heavily non-linear function F_system of the input data and the “previous” or “old” internal state variables. Hence, the internal data variables can be written as F_system applied to the input data and the old internal state variables.
- both sys_scen 1 and sys_scen 2 have an upper bound on the number of iterations which is now significantly better (lower) than the worst-case bound of the original application code.
- the remaining more rarely occurring cost combinations in the overall application profiling are then grouped together into a single so-called back-up system scenario which on average will be active very seldom.
- this back-up scenario also has to take the worst-case cost combination into account as it has to serve a highly non-uniform set of RTSs that cannot be restricted to better than worst-case behaviour given that we want to meet a guaranteed constraint.
- Each of these system scenarios can now be represented as a bounded region and, in most practical cases, it is observed that an approximation of this boundary by polyhedral regions or sets of polyhedrons provides a very useful formal representation.
- the worst-case behaviour per system scenario in terms of cost axes can now typically be significantly restricted compared to the overall worst-case behaviour of the entire system, because of these polyhedral bounds that span only part of the potential cost behaviour.
- the only exception is the rarely occurring back-up scenario which still has to be represented by the worst-case behaviour.
- This reduced behaviour of the regular system scenario set {sys_scen} can then also be mapped more efficiently by separately tuning the mappings to the characteristics of each specific system scenario.
- the clustering into system scenarios determines that gain to a large extent. In principle, at design-time and based on profiling one can then find the system scenario clustering which maximizes this average cost gain but a trade-off is present then with the detection cost for the active system scenario and also with the switching overhead of having to move to the “best” scenario whenever the input and state have sufficiently evolved. These three cost contributions should hence drive the best system scenario identification method (see below).
- FIG. 3 illustrates the theoretical concepts of the proposed scenario identification technique, given k RTS parameters and N profiled RTS signatures as in Equation 1. If one dimension is assigned to each RTS parameter, the resulting k-dimensional space defines all theoretically possible values for the RTS parameters in the application. Such space is called an RTS parameter space. When static max and min constraints on the values are added, the space reduces to one or several k-dimensional domains.
- the scenario identification task can be viewed as a distribution of points into S different groups, representing system scenarios, according to which the overall configuration cost is minimized.
- An RTS point i is assigned to scenario j whenever its cost c(i) falls into that scenario's cost range {C(j)min, C(j)max}.
- the scenario cost ranges are determined by a balancing function that ensures that all scenarios have a near-equal probability to occur at run-time. In this way, rare system scenarios are avoided since their storage cost will exceed the gains of adding them. This probability is measured by the number of points, including the repeating ones, that each scenario contains; this number is called the scenario size.
- the projection of scenarios onto the RTS parameter space will produce M≥S regions that characterize the system scenarios in terms of RTS parameter values.
- Each region can be described as a polyhedron, and the run-time scenario prediction can be done by checking which polyhedron contains the RTS parameter values of the next RTS. Since it is known which scenario the region belongs to, one can foresee that the next running cost will be no more than the cost of that scenario.
- Checking if a point lies inside a polyhedron is the classical data point location problem from the computational geometry domain, and the advantage of using it for prediction instead of MDD is that it operates on/stores only the vertices of polyhedral regions, not the whole RTS parameter space.
- This top-down partitioning approach based on geometrical techniques can handle arbitrarily large domains, provided that the number of distinct geometric regions stays reasonably low. Otherwise the prediction overhead will grow.
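The run-time prediction by point location described above can be sketched as follows. This is a minimal illustration, assuming two RTS parameters, convex and disjoint regions with counter-clockwise vertices, and a fall-back to the back-up scenario when no region matches; the names `inside_convex`, `predict_scenario` and `backup` are hypothetical and do not come from the patent.

```python
from typing import List, Tuple

Point = Tuple[float, float]

def inside_convex(poly: List[Point], p: Point) -> bool:
    """True if p lies inside or on a convex polygon given in counter-clockwise order."""
    n = len(poly)
    for i in range(n):
        ax, ay = poly[i]
        bx, by = poly[(i + 1) % n]
        # Cross product of edge a->b with a->p; a negative value means p is
        # strictly to the right of the edge, i.e. outside a CCW polygon.
        if (bx - ax) * (p[1] - ay) - (by - ay) * (p[0] - ax) < 0:
            return False
    return True

def predict_scenario(regions: List[Tuple[int, List[Point]]], p: Point, backup: int) -> int:
    """Return the scenario whose region contains p, or the back-up scenario (assumed policy)."""
    for scenario_id, poly in regions:
        if inside_convex(poly, p):
            return scenario_id
    return backup

regions = [
    (1, [(0.0, 0.0), (2.0, 0.0), (2.0, 2.0), (0.0, 2.0)]),
    (2, [(2.5, 0.0), (5.0, 0.0), (5.0, 2.0), (2.5, 2.0)]),
]
print(predict_scenario(regions, (1.0, 1.0), backup=0))
```

Only the vertices of each region are stored, so the memory footprint grows with the number of regions and vertices rather than with the size of the RTS parameter space.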
- the number of regions depends on the number of system scenarios and the underlying structure of the system—the relationship between the cost locality of RTS points and the value locality of their RTS parameters.
- the desired number of system scenarios is best defined by the user according to the characteristics of the application domain. Typically this is limited to a few tens because beyond that the potential gains in better following the system dynamics are counterbalanced with the additional cost complexity of detecting and exploiting the (too) large set of possible system scenarios.
- a pre-processing step may be performed, where profiled RTS signatures are sorted by their costs starting from the worst case.
- a worst case system scenario is created.
- the system scenario is filled in with signatures having the next costs in the sorted sequence.
- a new system scenario is created.
- Each completed system scenario is checked for overlap with previously calculated higher cost system scenarios.
- An overlap means that the scenario regions in the RTS parameter space are not disjoint, and equals the intersection of the regions. The intersections make prediction of scenarios ambiguous and have to be eliminated.
- the complexity of the algorithm depends on the number of RTS signatures, the number of scenarios and the complexity of the underlying geometric algorithms in the labelled functions.
- the algorithm is modified such that it calculates the distance between the points on the hull and removes those that are closer than L/v_max, where L is the perimeter of the hull and v_max is a user-defined constraint on the maximum number of vertices in the prediction polyhedra.
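The vertex-thinning rule can be illustrated with a short sketch. It assumes a 2-D hull given as an ordered vertex list and measures the distance from the last kept vertex; the function names are hypothetical. Dropping vertices changes the region, and how points that end up outside the thinned hull are treated is not covered by this sketch.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]

def perimeter(hull: List[Point]) -> float:
    """Perimeter L of a closed polygon given as an ordered vertex list."""
    return sum(math.dist(hull[i], hull[(i + 1) % len(hull)]) for i in range(len(hull)))

def thin_hull(hull: List[Point], v_max: int) -> List[Point]:
    """Drop hull vertices closer than L / v_max to the previously kept vertex."""
    min_gap = perimeter(hull) / v_max
    kept = [hull[0]]
    for p in hull[1:]:
        if math.dist(kept[-1], p) >= min_gap:
            kept.append(p)
    # Also respect the minimum gap across the wrap-around edge.
    if len(kept) > 1 and math.dist(kept[-1], kept[0]) < min_gap:
        kept.pop()
    return kept

hull = [(0.0, 0.0), (0.1, 0.0), (4.0, 0.0), (4.0, 4.0), (0.0, 4.0)]
print(thin_hull(hull, v_max=5))
```

Because every kept gap is at least L/v_max, the number of kept vertices stays at or below roughly v_max.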
- convex scenario projections are produced in the RTS parameter space.
- concave scenario projections are preferable: a) the inherent correlation between the RTS parameter values and the corresponding costs has a concave shape; and b) the system scenarios overlap in the RTS parameter space and complete migration of the signatures to the higher cost system scenarios results in considerable reduction of run-time gain.
- the overlaps may be produced by variables affecting the costs but not selected as RTS parameters. They can also be caused by non-deterministic properties of the underlying platform resulting in different costs for the same RTS parameters.
- a concave scenario projection generally reduces the overlap and improves the run-time gain.
- a large overhead may be incurred since algorithms processing concave polyhedra are much more complex.
- a possible solution is to split the concave projection into a set of convex polyhedra at design-time and apply convex hull algorithms. The separate polyhedra still require additional storage and processing time, which should be kept low. To achieve that, the number of reflex angles in the concave projection must be restricted, and the cost trade-offs must be considered carefully.
- the dynamic range of the signal is split into three different ranges separated by two thresholds.
- In FIG. 4 the signal waveform at the digital filter input is shown (dashed line).
- the value clustering curve (solid line) shows the result of levelling the signal values after using the two thresholds UP and DOWN as shown in the same figure.
- the general rule to use these thresholds is shown in the value clustering function below:
- $$\text{clustered\_value}(\text{value}) = \begin{cases} 1 & \text{value} > \text{THRESHOLD} \\ -1 & \text{value} < -\text{THRESHOLD} \\ 0 & \text{otherwise} \end{cases}$$
- First absolute data is used to define the positive threshold.
- the negative threshold is just the negative value of the positive.
- a threshold is selected that evenly divides the dynamic range. Then the threshold is trimmed to end up with a distribution of the signal values across the three sub-ranges as uniform as possible.
- the creation of one or more large sub-ranges, the sizes of which might be comparable with the dynamic range, should be avoided, because it compromises the usefulness of the prediction.
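The value clustering and the threshold trimming can be sketched as below. This is only an illustration: the candidate-search strategy in `trim_threshold` and the use of the largest count difference as the uniformity measure are assumptions, not the procedure of the patent.

```python
from collections import Counter
from typing import Dict, Iterable, List

def clustered_value(value: float, threshold: float) -> int:
    """Map a sample to one of three ranges: 1, -1 or 0."""
    if value > threshold:
        return 1
    if value < -threshold:
        return -1
    return 0

def range_histogram(samples: Iterable[float], threshold: float) -> Dict[int, int]:
    """Count how many samples fall into each of the three ranges."""
    counts = Counter(clustered_value(v, threshold) for v in samples)
    return {k: counts.get(k, 0) for k in (-1, 0, 1)}

def trim_threshold(samples: List[float], candidates: List[float]) -> float:
    """Pick the candidate threshold giving the most uniform range occupancy (assumed metric)."""
    def spread(t: float) -> int:
        hist = range_histogram(samples, t)
        return max(hist.values()) - min(hist.values())
    return min(candidates, key=spread)

samples = [-0.9, -0.5, -0.1, 0.1, 0.5, 0.9]
print(trim_threshold(samples, [0.05, 0.3, 0.95]))
```

A threshold that leaves one range nearly empty or nearly full scores badly here, which matches the advice to avoid sub-ranges comparable in size with the whole dynamic range.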
- the main approach of the prediction process is detailed below for the example of a digital filter and Fast Fourier Transform (FFT) of which the inputs need to be estimated.
- the threshold used for the digital filter is 0.3 and for the FFT 0.8. This selection is based on the dynamic range and the data histogram. A lot of samples of the digital filter input are obtained.
- the pattern one tries to exploit is not easily distinguishable in the raw signal. It becomes visible only after clustering the signal values into the three ranges mentioned above. In FIGS. 5 and 6 the original waveform is shown before and after the clustering. In FIG. 6 the pattern is indeed more obvious.
- One option is to predict the next value based solely on the previous and current signal values.
- a noteworthy property of the waveform is that there are always two consecutive samples residing in the same range. So, if the previous sample belongs to a different range than the current one, the next sample will be in the same range as the current one. Hence, if the current value differs from the previous one, it is repeated: the next value is predicted to be the same as the current one. If the current value equals the previous one, the next value is predicted as the opposite of the current one. Data may be used as is or may be flipped (whereby only the absolute signal value and only the positive threshold are used) before being processed.
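A minimal sketch of this two-sample predictor follows. It assumes flipped data, so that clustered values are only 0 or 1 and "opposite" means `1 - current`; it reads the rule as "a value that has just changed is repeated, a value that has already been repeated flips". The names `predict_next` and `hit_rate` are hypothetical.

```python
from typing import List

def predict_next(previous: int, current: int) -> int:
    """Predict the next clustered value (0 or 1) from the previous and current ones."""
    if current != previous:
        # The range has just changed, so the new value is expected to repeat once.
        return current
    # The value has already been repeated, so a change of range is expected.
    return 1 - current

def hit_rate(sequence: List[int]) -> float:
    """Fraction of correct predictions over a clustered sequence."""
    hits = sum(
        predict_next(sequence[i - 2], sequence[i - 1]) == sequence[i]
        for i in range(2, len(sequence))
    )
    return hits / (len(sequence) - 2)

print(hit_rate([1, 1, 0, 0, 1, 1, 0, 0]))
```

On a sequence that strictly alternates in pairs the predictor is always right; any deviation from the pair pattern shows up directly as a lower hit rate.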
- Another option for the prediction algorithm is one whereby an additional buffer is used that stores curves so that the predictability can be increased.
- a single curve can be described by four samples which are stored in a buffer.
- the curves are coded based on the ranges of four consecutive samples.
- In FIG. 7 the coding used is shown. In the figure each curve is described by three points. Due to the previously described property of the value repetition, the middle point is repeated and the repetition is omitted in the figure.
- To identify a curve the samples at the edges of each curve are needed (three in total). The predictions based on the curve are based only on the previous curve. It is safe to assume that when using a larger curve buffer, the exploitation of the pattern appearing at a larger scale is feasible. On the other hand, the complexity of the algorithm will increase slightly.
- Three samples in a row form a function f(x) (a different function each time).
- We can calculate f′(x) using the differences between f(0), f(1) and f(2). This can be done easily by using a two-sample buffer for f(0) and f(1) and taking f(2) as the current value.
- f(0) and f(1) can be stored in another buffer too.
- the differences of the derivative values form the second derivative, which can be used to obtain a better approximation of the Taylor series.
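One way to read the difference-based prediction is as a finite-difference extrapolation, sketched below. The unit sample spacing, the function name `predict_taylor` and the exact combination of first and second differences are assumptions made for illustration.

```python
def predict_taylor(f0: float, f1: float, f2: float, use_second: bool = True) -> float:
    """Extrapolate the sample after f2 from three consecutive samples.

    f0 and f1 come from the two-sample buffer; f2 is the current value.
    """
    d1 = f1 - f0          # first difference between the buffered samples
    d2 = f2 - f1          # most recent first difference (slope estimate)
    if not use_second:
        return f2 + d2    # first-order extrapolation only
    dd = d2 - d1          # difference of the differences (curvature estimate)
    return f2 + d2 + dd   # second-order extrapolation, exact for quadratics

print(predict_taylor(0.0, 1.0, 4.0))
```

With the second difference included the extrapolation reproduces a quadratic exactly, whereas the first-order variant is exact only for a straight line.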
- a computer program may be stored/distributed on a suitable non-transitory medium, such as an optical storage medium or a solid-state medium supplied together with or as part of other hardware, but may also be distributed in other forms, such as via the Internet or other wired or wireless telecommunication systems. Any reference signs in the claims should not be construed as limiting the scope.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Software Systems (AREA)
- Computer Hardware Design (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Evolutionary Computation (AREA)
- Geometry (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
- Supply And Distribution Of Alternating Current (AREA)
Abstract
Description
r(i)=ξ1(i),ξ2(i), . . . ,ξk(i);c(i), (1)
containing parameter values ξ1(i), ξ2(i), . . . , ξk(i) and the corresponding task costs c(i), i.e., each run instance of each task has its own RTS signature. The number N of RTS signatures will hence be very large. Depending on the number of RTS parameters and how many different values each of them can take, there is a small or large number of different RTS signatures. This is important for the complexity of the second step of the scenario identification. In this second step, the RTS signatures are divided into groups with similar costs—the system scenarios. This can be done by a bottom-up clustering of RTS signatures with a resulting multi-valued decision diagram (MDD) that is used as a predictor for the upcoming system scenario (see step of scenario prediction).
- determining a distribution over time of an N-dimensional cost function, with N being an integer and N>=1, corresponding to the implementation on the platform for a set of combinations of the internal data variables; and
- partitioning the N-dimensional cost space in at least two bounded regions, each bounded region containing cost combinations corresponding to combinations of values of the internal data variables of the set that have similar cost and frequency of occurrence, whereby one bounded region is provided for rarely occurring cost combinations.
- internal_data_var=Fsystem(old_state_var, input_data)
Hence, any implementation/realisation of this behaviour on some given platform also has such a correlation-based partial predictability, involving another non-linear function Fplatf mapping with the internal data variable behaviour as input this time. One can write:
- platform_node_behav=Fplatf(internal_data_var)
For instance, the values stored in the registers of the processor data path are important parts of the platform node behaviour. Finally, also the N-dimensional cost associated with this implementation has a non-uniform distribution over time with some cost combinations in the N-dimensional space occurring more often than others. In particular, one can write:
- {cost}=Fcost(platform_node_behav)
where {cost} is an N-dimensional set. For instance, important costs include area, memory size, dynamic energy, leakage energy and so on. Which cost combinations (or regions of cost combinations) occur more frequently can also be analysed from this non-linear function behaviour.
- {internal_state_var_regions}=Fsys_scen_regions(internal_state_var).
Per system scenario a bounded variable region is then obtained, which preferably is again approximated based on (sets of) polyhedra. See FIG. 2 for an illustration of how a 2-dimensional internal data variable domain is partitioned into three regions corresponding to three system scenarios. For instance, the iteration range of a while loop or the size of an array from which data are read can be such data variables on which we want to determine (detect) the active system scenario. Whenever one now has an input data and internal state variable condition falling within these polyhedra at run-time, one knows that it will give rise to the corresponding system scenario. That is indeed true also for the input data when one has a proper prediction procedure available.
C(j)max =r((j−1)·N/S+1) (2)
C(j)min =r((j)·N/S) (3)
The cost of scenario Cj is defined as the maximum cost of RTS signatures that it includes: Cj=C(j)max.
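Equations (2) and (3) can be illustrated with a short sketch that splits the profiled costs into S groups of near-equal size. It assumes the costs are sorted starting from the worst case and that integer division is used when N is not a multiple of S; the function name `scenario_cost_ranges` is hypothetical.

```python
from typing import List, Tuple

def scenario_cost_ranges(costs: List[float], num_scenarios: int) -> List[Tuple[float, float]]:
    """Return (C_min, C_max) for each scenario j = 1..S, with equal scenario sizes."""
    ordered = sorted(costs, reverse=True)   # worst-case cost first
    n = len(ordered)
    if not 0 < num_scenarios <= n:
        raise ValueError("need 1 <= S <= N")
    ranges = []
    for j in range(1, num_scenarios + 1):
        first = (j - 1) * n // num_scenarios   # 0-based index of signature (j-1)*N/S + 1
        last = j * n // num_scenarios - 1      # 0-based index of signature j*N/S
        ranges.append((ordered[last], ordered[first]))
    return ranges

costs = [9, 1, 5, 3, 7, 2, 8, 4]
print(scenario_cost_ranges(costs, 4))
```

Each scenario's cost is then the second element of its tuple, C(j)max. Repeated cost values that straddle a group boundary would make the assignment ambiguous; this sketch does not resolve that case.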
Claims (17)
Applications Claiming Priority (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
EP12176304.9 | 2012-07-13 | ||
EP12176304.9A EP2685395A1 (en) | 2012-07-13 | 2012-07-13 | Method for System Scenario Based Design of Dynamic Embedded Systems |
EP12176304 | 2012-07-13 |
Publications (2)
Publication Number | Publication Date |
---|---|
US20140019739A1 US20140019739A1 (en) | 2014-01-16 |
US9244701B2 true US9244701B2 (en) | 2016-01-26 |
Family
ID=46798993
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US13/940,247 Expired - Fee Related US9244701B2 (en) | 2012-07-13 | 2013-07-11 | Method for system scenario based design of dynamic embedded systems |
Country Status (2)
Country | Link |
---|---|
US (1) | US9244701B2 (en) |
EP (1) | EP2685395A1 (en) |
Families Citing this family (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP3144820A1 (en) | 2015-09-18 | 2017-03-22 | Stichting IMEC Nederland | Inter-cluster data communication network for a dynamic shared communication platform |
US10318703B2 (en) * | 2016-01-19 | 2019-06-11 | Ford Motor Company | Maximally standard automatic completion using a multi-valued decision diagram |
CN113595059A (en) * | 2021-05-25 | 2021-11-02 | 国网天津市电力公司电力科学研究院 | Generation method of active frequency response control typical scene of wide area power grid |
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6396492B1 (en) * | 1999-08-06 | 2002-05-28 | Mitsubishi Electric Research Laboratories, Inc | Detail-directed hierarchical distance fields |
-
2012
- 2012-07-13 EP EP12176304.9A patent/EP2685395A1/en not_active Withdrawn
-
2013
- 2013-07-11 US US13/940,247 patent/US9244701B2/en not_active Expired - Fee Related
Patent Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6396492B1 (en) * | 1999-08-06 | 2002-05-28 | Mitsubishi Electric Research Laboratories, Inc | Detail-directed hierarchical distance fields |
Non-Patent Citations (12)
Title |
---|
32nd WIC Symposium on Information Theory in the Benelux-First Joint WIC/IEEE SP Symposium on Information Theory and Signal Processing in the Benelux, May 10-11, 2011, Brussels, Belgium, 7 pages. |
Bebelis, Vagelis, "System Scenarios Using Data Correlation in FFT Input", WIC Symposium May 10, 2011, 1 page. |
Caldwell et al., "Relaxed Partitioning Balance Constraints in Top-Down Placement", 1998. * |
European Search Report, European Patent Application No. 12176304.9, dated Nov. 19, 2012. |
Frantzen, L. et al., "A Symbolic Framework for Model-Based Testing", FATES/RV 2006, LNCS, vol. 4262, Aug. 15-16, 2011, pp. 40-54. |
Gheorghita et al., "System Scenario based Design of Dynamic Embedded Systems", 2007. * |
Gheorghita, S.V. et al., "System Scenario Design of Dynamic Embedded Systems", ACM Trans. on Design Automation of Electronic Systems, vol. 14, No. 1, Jan. 2008, pp. 1-44. |
Van Stralen, Peter et al., "A Trace-Based Scenario Database for High-Level Simuliation of Multimedia MP-SoCs", Embedded Computer Systems (SAMOS), 2010 International Conference, Jul. 19-22, 2010, pp. 11-19. |
Van Stralen, Peter et al., "Fast Scenario-Based Design Space Exploration Using Feature Selection", ARCS Workshops (ARCS), IEEE, Feb. 28, 2012, pp. 1-7. |
Van Stralen, Peter et al., "Scenario-Based Design Space Exploration of MPSoCs", Computer Design (ICCD), IEEE International Conference, Oct. 3, 2010, pp. 305-312. |
Zimmermann, Jochen et al., "Analysis of Multi-Domain Scenarios for Optimized Dynamic Power Management Strategies", Design, Automation & Test in Europe Conference & Exhibition, IEEE, Mar. 12, 2012, pp. 862-865. |
Also Published As
Publication number | Publication date |
---|---|
US20140019739A1 (en) | 2014-01-16 |
EP2685395A1 (en) | 2014-01-15 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
AS | Assignment |
Owner name: KATHOLIEKE UNIVERSITEIT LEUVEN, K.U. LEUVEN R&D, B Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:CATTHOOR, FRANCKY;VAN THILLO, WIM;RAGHAVAN, PRAVEEN;AND OTHERS;SIGNING DATES FROM 20130723 TO 20131210;REEL/FRAME:032806/0737 Owner name: STICHTING IMEC NEDERLAND, NETHERLANDS Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:CATTHOOR, FRANCKY;VAN THILLO, WIM;RAGHAVAN, PRAVEEN;AND OTHERS;SIGNING DATES FROM 20130723 TO 20131210;REEL/FRAME:032806/0737 Owner name: IMEC, BELGIUM Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:CATTHOOR, FRANCKY;VAN THILLO, WIM;RAGHAVAN, PRAVEEN;AND OTHERS;SIGNING DATES FROM 20130723 TO 20131210;REEL/FRAME:032806/0737 |
|
STCF | Information on status: patent grant |
Free format text: PATENTED CASE |
|
MAFP | Maintenance fee payment |
Free format text: PAYMENT OF MAINTENANCE FEE, 4TH YEAR, LARGE ENTITY (ORIGINAL EVENT CODE: M1551); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY Year of fee payment: 4 |
|
FEPP | Fee payment procedure |
Free format text: MAINTENANCE FEE REMINDER MAILED (ORIGINAL EVENT CODE: REM.); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY |
|
LAPS | Lapse for failure to pay maintenance fees |
Free format text: PATENT EXPIRED FOR FAILURE TO PAY MAINTENANCE FEES (ORIGINAL EVENT CODE: EXP.); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY |
|
STCH | Information on status: patent discontinuation |
Free format text: PATENT EXPIRED DUE TO NONPAYMENT OF MAINTENANCE FEES UNDER 37 CFR 1.362 |
|
FP | Lapsed due to failure to pay maintenance fee |
Effective date: 20240126 |