EP2176754A2 - System and method for optimizing data analysis - Google Patents
System and method for optimizing data analysis
Info
- Publication number
- EP2176754A2 (EP08835071A)
- Authority
- EP
- European Patent Office
- Prior art keywords
- nodes
- node
- tasks
- data
- graphing
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5061—Partitioning or combining of resources
- G06F9/5066—Algorithms for mapping a plurality of inter-dependent sub-tasks onto a plurality of physical CPUs
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/52—Program synchronisation; Mutual exclusion, e.g. by means of semaphores
- G06F9/526—Mutual exclusion algorithms
-
- G—PHYSICS
- G01—MEASURING; TESTING
- G01N—INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
- G01N21/00—Investigating or analysing materials by the use of optical means, i.e. using sub-millimetre waves, infrared, visible or ultraviolet light
- G01N21/62—Systems in which the material investigated is excited whereby it emits light or causes a change in wavelength of the incident light
- G01N21/63—Systems in which the material investigated is excited whereby it emits light or causes a change in wavelength of the incident light optically excited
- G01N21/64—Fluorescence; Phosphorescence
- G01N21/6428—Measuring fluorescence of fluorescent products of reactions or of fluorochrome labelled reactive substances, e.g. measuring quenching effects, using measuring "optrodes"
- G01N2021/6439—Measuring fluorescence of fluorescent products of reactions or of fluorochrome labelled reactive substances, e.g. measuring quenching effects, using measuring "optrodes" with indicators, stains, dyes, tags, labels, marks
- G01N2021/6441—Measuring fluorescence of fluorescent products of reactions or of fluorochrome labelled reactive substances, e.g. measuring quenching effects, using measuring "optrodes" with indicators, stains, dyes, tags, labels, marks with two or more labels
Definitions
- the present invention relates to systems and methods for optimizing computing resources, and more particularly, to systems and methods for optimizing processing of tasks within a parallel computing architecture.
- vector or parallel computer processor architectures have been used for many years to allow tasks to be executed concurrently, thereby increasing overall computational speed.
- Early vector processors have evolved to massively parallel systems such as the IBM Blue Gene/L, which in one configuration has 65,536 computational nodes distributed among 64 cabinets delivering a theoretical peak performance of 360 teraFLOPS.
- integrated circuit microprocessors have been developed that include a plurality of processor cores, making parallel computing possible even on modest physical scales. Each of the processor cores can perform operations independently from another processor core and can perform operations in parallel with another processor core.
- Intel Pentium processors have offered dual core configurations, and quad core implementations are soon planned for market introduction.
- processors available in the system may become underutilized as one or more processors sit idle while others perform significantly more work.
- Such efficient use of resources is needed particularly in an environment that requires analysis of large amounts of data originating from a variety of sources.
- adaptive semi-synchronous parallel processing systems and methods which may be adapted to various data analysis applications such as flow cytometry systems.
- By identifying the relationship and memory dependencies between tasks that are necessary to complete an analysis it is possible to significantly reduce the analysis processing time by selectively executing tasks after careful assignment of tasks to one or more processor queues, where the queue assignment is based on an optimal execution strategy.
- Further strategies are disclosed to address optimal processing once a task undergoes computation by a computational element in a multiprocessor system. Also disclosed is a technique to perform fluorescence compensation to correct spectral overlap between different detectors in a flow cytometry system due to emission characteristics of various fluorescent dyes.
- the analysis can be decomposed into a set of operations (such as tasks) that need to be performed on the data to achieve the desired results.
- the operations are predicated on the results of other operations, there is a dependency between the operations; some operations cannot be processed until other operations have completed processing.
- a directed graph results, with the operations represented by vertices (nodes), and the dependencies represented by the directed edges of the graph.
- algorithms can be applied to establish a preferred order of processing for the operations to reduce processing times and/or obtain optimal utilization of resources such as processor cores in a multiprocessor system.
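As a minimal sketch of the directed graph described above, the operations can be stored as a mapping from each operation to the operations it depends on, and one valid processing order can be obtained with a topological sort. The operation names below are hypothetical, and the standard-library `graphlib.TopologicalSorter` stands in for whichever ordering algorithm an embodiment actually uses.

```python
from graphlib import TopologicalSorter

# Hypothetical operations; each key maps to the set of operations it depends on.
dependencies = {
    "load_events": set(),
    "compensate": {"load_events"},
    "histogram": {"compensate"},
    "gate": {"compensate"},
    "statistics": {"histogram", "gate"},
}


def processing_order(deps):
    """Return one order in which every operation follows all of its dependencies."""
    return list(TopologicalSorter(deps).static_order())


order = processing_order(dependencies)
```

Any order returned this way respects the directed edges; choosing among the valid orders to balance processor load is a separate scheduling decision.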
- example attributes may include blocking status (e.g. "blocked" or "unblocked"), node generation (e.g. an ordinal number such as "1," "2," "3," and so on), vertex count (e.g. an ordinal number), processing states (e.g. "clean" or "dirty") or any other suitable information.
- the graph may also be partitioned so that if only one operation needs to be performed, the corresponding nodes (or operations) from which it depends may be updated so that the relevant operation may produce meaningful and up-to-date results without having to process every operation or task defined within the graph.
- a blocked or blocking vertex is a vertex that cannot be computed until all its input data is available.
- An example would be an auto region in a display that adjusts its position to be an upper percentile of a histogram. This operation normally cannot execute until the entire histogram is built with all input data.
- the difference between blocking and non-blocking vertices should be distinguished and the nodes appropriately partitioned. Any vertex that depends on a blocking vertex is within the blocking set.
- the blocking attribute then propagates down the graph's edges designating vertices that depend upon the blocked vertex to also be in a blocked state.
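The propagation of the blocking attribute can be sketched as a breadth-first walk along the directed edges. The graph layout (a dict from a vertex to its dependents) and the vertex names are assumptions made for illustration.

```python
from collections import deque


def propagate_blocking(dependents, blocking):
    """Return the full blocking set.

    dependents: dict mapping a vertex to the vertices that depend on it.
    blocking:   vertices that are blocking in their own right.
    """
    blocked = set(blocking)
    pending = deque(blocking)
    while pending:
        vertex = pending.popleft()
        for child in dependents.get(vertex, ()):
            if child not in blocked:
                blocked.add(child)       # inherits the blocked state
                pending.append(child)    # and passes it further down
    return blocked


# Hypothetical graph: the auto region needs the complete histogram.
dependents = {
    "load": ["histogram", "gate"],
    "histogram": ["auto_region"],
    "auto_region": ["region_stats"],
}
blocking_set = propagate_blocking(dependents, {"auto_region"})
```

Vertices outside the returned set form the non-blocking partition.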
- the partitioned vertices are sorted into generations.
- the first generation includes vertices that have no dependencies; the second generation includes vertices that depend upon the first generation, and so on.
- the vertex generation is determined to be the highest generation of its parent vertices plus one.
- calculation of the generations is performed by using a shortest paths algorithm.
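The generation rule stated above (highest parent generation plus one, with dependency-free vertices in generation 1) can be sketched directly with a memoized recursion. This illustrates the rule only; it is not the path algorithm the text refers to, and it assumes the graph is acyclic. The vertex names are hypothetical.

```python
def assign_generations(parents):
    """Map each vertex to its generation number.

    parents: dict mapping every vertex to the list of vertices it depends on.
    Assumes the graph has no cycles.
    """
    generation = {}

    def visit(vertex):
        if vertex not in generation:
            preds = parents.get(vertex, [])
            # No dependencies: generation 1; otherwise one past the latest parent.
            generation[vertex] = 1 if not preds else 1 + max(visit(p) for p in preds)
        return generation[vertex]

    for vertex in parents:
        visit(vertex)
    return generation


generations = assign_generations({"a": [], "b": ["a"], "c": ["a", "b"], "d": []})
```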
- each vertex may include the designation of a threading model under which it may execute.
- the generations are partitioned into the desired set of threading models.
- the threading models may include "Parallel," "Strip-mine," and "Do Not Thread" models.
- Parallel threading treats the set of vertices as a queue of tasks; a thread pool consumes the tasks within the queue. Once a thread has completed a task, it fetches the next available task from the queue.
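A minimal sketch of the Parallel model, assuming each task is a zero-argument callable: worker threads repeatedly fetch the next available task from a shared queue until it is empty. The function name and the result bookkeeping are illustrative assumptions.

```python
import queue
import threading


def run_parallel(tasks, num_threads=4):
    """Run zero-argument callables from a shared queue; return results by task index."""
    work = queue.Queue()
    for item in enumerate(tasks):
        work.put(item)

    results = {}
    results_lock = threading.Lock()

    def worker():
        while True:
            try:
                index, task = work.get_nowait()  # fetch the next available task
            except queue.Empty:
                return                           # queue drained: thread finishes
            value = task()
            with results_lock:
                results[index] = value

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results


squares = run_parallel([lambda i=i: i * i for i in range(10)])
```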
- Strip-mine threading splits data into a series of data sections. Threads from a thread pool consume a data section and perform all tasks within the set of strip-mined tasks on that data section.
- the thread fetches the next available data section.
- One of the advantages of this method is that where multiple tasks within a generation are using the same data, the data is more likely to be immediately available in a processor's cache memory. For tasks that produce output data with a 1 to 1 correspondence with an event, it also eliminates or substantially reduces the need to use a lock to protect data access.
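The strip-mine model can be sketched as follows, assuming per-event tasks whose output has a 1 to 1 correspondence with the input events. Each worker takes one data section and runs every task on it before moving on; because sections cover disjoint index ranges, the output writes need no lock. The function and parameter names are hypothetical, and in CPython the global interpreter lock means this shows the structure rather than a real speed-up.

```python
from concurrent.futures import ThreadPoolExecutor


def strip_mine(data, tasks, section_size, num_threads=4):
    """Apply every per-event task to each data section; return one output list per task."""
    outputs = [[None] * len(data) for _ in tasks]

    def process_section(start):
        end = min(start + section_size, len(data))
        # All tasks run on the same section while it is likely still cached.
        for task_index, task in enumerate(tasks):
            for i in range(start, end):
                outputs[task_index][i] = task(data[i])

    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        # Consuming the iterator waits for completion and surfaces any exception.
        list(pool.map(process_section, range(0, len(data), section_size)))
    return outputs


plus_one, doubled = strip_mine(list(range(10)), [lambda x: x + 1, lambda x: x * 2], 3)
```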
- Do Not Thread threading results in an approach where tasks are sequentially executed by the thread that executes the scheduling function.
- execution of the graph is undertaken by processing non-blocking nodes, followed by processing of blocking nodes.
- the node attributes may be appropriately updated such as identifying them as "clean" or "processed."
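The execution order described above can be sketched by sorting each partition by generation and running the non-blocking partition first. The generation numbers and blocking set are taken as given inputs here; the node names and the "dirty"/"clean" status dict are illustrative assumptions.

```python
def execution_order(generation, blocked):
    """Non-blocking nodes by generation, followed by blocking nodes by generation."""
    def by_generation(nodes):
        return sorted(nodes, key=lambda n: (generation[n], n))

    non_blocking = by_generation(n for n in generation if n not in blocked)
    blocking = by_generation(n for n in generation if n in blocked)
    return non_blocking + blocking


def run_graph(generation, blocked, run):
    """Execute every node once and mark it clean afterwards."""
    status = {node: "dirty" for node in generation}
    for node in execution_order(generation, blocked):
        run(node)
        status[node] = "clean"
    return status


generation = {"load": 1, "gate": 2, "histogram": 2, "auto_region": 3, "region_stats": 4}
blocked = {"auto_region", "region_stats"}
executed = []
status = run_graph(generation, blocked, executed.append)
```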
- An embodiment provides for a method for processing data that comprises: defining a plurality of tasks requiring execution; identifying one or more dependencies for each respective task of the plurality of tasks, the one or more dependencies comprising at least one of a resource dependency and a memory dependency; determining dependencies for each respective task of the plurality of tasks wherein the dependencies include directional dependencies; creating a directed graph; and analyzing the directed graph to assign execution of tasks to a plurality of processors to maximize computational efficiency.
- the directed graph may include a plurality of nodes, the nodes respectively corresponding to the plurality of tasks requiring execution; and a plurality of directed edges that respectively correspond to the directional dependencies, each of said directed edges originating from a node of the plurality of nodes and terminating on another node of the plurality of nodes.
- One or more threading models may be attributed to each node of the plurality of nodes, and attributing a threading model to each node of the plurality of nodes may further include: assigning a parallel threading model to each said node where said node corresponds to a task that is independent, may operate in parallel, and has a non-blocked status; assigning a strip-mine threading model to each node in a collection of nodes that correspond to tasks that operate on common data; and assigning a non-threading model to nodes corresponding to tasks that must be executed sequentially.
- each corresponding task may be iteratively computed for each data range associated with the task.
- Attributing a strip-mine threading model comprises: partitioning a data set into a plurality of subsets, the size of each subset of the plurality of subsets chosen to be containable within a cache memory of predetermined size associated with a processor of the plurality of processors; allocating processing of a subset of the plurality of subsets to the processor; loading the subset of the plurality of subsets into the cache memory associated with the processor; executing a task utilizing the subset of the plurality of subsets; and indicating, by the processor, that another subset of the plurality of subsets may be allocated.
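The partitioning step can be sketched as choosing the largest subset size that fits in a cache of known capacity and cutting the data set into index ranges of that size. The fixed bytes-per-event figure and the function name are assumptions made for illustration; real record sizes and usable cache capacity would have to be measured.

```python
def partition_for_cache(num_events, bytes_per_event, cache_bytes):
    """Return (start, end) index ranges whose data fits within cache_bytes."""
    # At least one event per subset, even if a single event exceeds the cache.
    events_per_subset = max(1, cache_bytes // bytes_per_event)
    return [
        (start, min(start + events_per_subset, num_events))
        for start in range(0, num_events, events_per_subset)
    ]


subsets = partition_for_cache(num_events=10, bytes_per_event=8, cache_bytes=32)
```

Each range can then be handed to a processor, which requests another range once it has finished.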
- a branch prediction model may be assigned to a node of the plurality of nodes, the branch prediction model determined by an entropy characterization of data operated on by a function corresponding to the node.
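One way to characterize the entropy of the data a function operates on is the Shannon entropy of the outcomes of its branch condition. The sketch below computes that quantity; how an embodiment maps the entropy value to a particular branch prediction model is not specified here, so no such mapping is shown.

```python
import math
from collections import Counter


def shannon_entropy(outcomes):
    """Shannon entropy, in bits, of a sequence of discrete outcomes."""
    counts = Counter(outcomes)
    total = len(outcomes)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


predictable = shannon_entropy([True] * 8)           # branch always taken
unpredictable = shannon_entropy([True, False] * 4)  # taken half of the time
```

Low entropy indicates a branch whose outcome is nearly always the same; entropy near 1 bit indicates a two-way branch that is close to a coin flip.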
- Each node of the plurality of nodes may be assigned to one or more processing queues and tasks corresponding to each node assigned to the one or more processing queues are executed, and alternatively or in combination, each node of the plurality of nodes may be assigned to one or more processing queues based on an identified memory dependency for that node.
- the consideration of memory dependency for making the assignment may include a desired cache level allocation.
- Executing tasks assigned to the one or more processing queues may comprise: for each node of the plurality of nodes, determining whether the status of each node is blocked or unblocked; and executing tasks corresponding to all non-blocked nodes of the plurality of nodes then executing tasks corresponding to all blocked nodes of the plurality of nodes. Execution of tasks corresponding to all blocked nodes of the plurality of graphing nodes may be repeated once data has become available.
- the following steps are performed until all data is processed: setting status of all nodes of the plurality of nodes to processed or unprocessed status; when a task corresponding to a node of the plurality of nodes has been completely computed, setting status of the node to processed status; when data corresponding to a node of the plurality of nodes has been changed, setting status of the node to unprocessed status; and executing tasks corresponding to all unprocessed nodes of the plurality of nodes in accordance with respectively assigned threading models.
- a method for optimizing utilization of processor resources comprising: determining a plurality of processing tasks required to perform a requested computation and obtaining from the plurality of processing tasks a set of task dependencies; determining, from the plurality of processing tasks and task dependencies, a blocking state for each of the plurality of processing tasks and setting the blocking state to blocked or non-blocked; instantiating a plurality of graphing nodes respectively representing the plurality of processing tasks wherein said plurality of nodes are initialized to a dirty state; defining a set of directed edges between pairs of individual nodes from the plurality of graphing nodes; counting references for each node of the plurality of graphing nodes; adding blocking attributes to one or more nodes of the plurality of graphing nodes; partitioning the plurality of graphing nodes into a blocked set and a non-blocked set; within each of the blocked set partition and the non-blocked set partition, assigning a generation number to each node of the plurality of graphing nodes; attributing a thread
- the following steps may be performed until all data is processed: setting status of all nodes of the plurality of graphing nodes to clean or dirty status; when a task corresponding to a node of the plurality of graphing nodes has been completely computed, setting status of the node to clean status; when data corresponding to a node of the plurality of graphing nodes has been changed, setting status of the node to dirty status; and executing all dirty nodes of the plurality of graphing nodes in accordance with respectively assigned threading models.
- Directed edges may be defined between nodes in the graph, for instance, between pairs of nodes. Defining a set of directed edges between pairs of individual nodes from the plurality of graphing nodes may further comprise: adding a node reference to a node from the pair of individual nodes when the node has at least one of the set of directed edges directed to it; and not adding a node reference to a node from the pair of individual nodes when the node has at least one of the set of directed edges originating from the node.
- counting references for each node of the plurality of graphing nodes further comprises removing a node of the plurality of graphing nodes if the node has zero counted references.
- assigning a generation number to each node of the plurality of graphing nodes may further comprise: assigning a node a generation number of 1 if the node has no directed edges terminating at the node; assigning a node a generation number of n if the node has a single directed edge terminating at the node that originated from a generation n-1 node; and if the node has multiple directed edges terminating at the node, assigning the node a generation number equal to 1 plus the maximum generation number among the nodes from which the multiple directed edges originate.
- adding blocking attributes to one or more nodes of the plurality of graphing nodes may comprise: for each node of the plurality of graphing nodes, adding a blocking attribute to the node if: the node corresponds to a task with a blocked blocking state; or the node has a directed edge terminating at the node and the originating node of the directed edge has a blocked blocking state.
- One or more threading models may be attributed to a node in any desired manner.
- attributing a threading model to each node of the plurality of graphing nodes may comprise: assigning a parallel threading model to each said node where said node corresponds to a task that is independent, may operate in parallel, and has a non-blocked status; assigning a strip-mine threading model to each node in a collection of nodes that correspond to tasks that operate on common data; and assigning a non-threading model to nodes corresponding to tasks that must be executed sequentially.
- a branch prediction model may be assigned to a node, the branch prediction model determined by an entropy characterization of data operated on by a function corresponding to the node.
- Threading models that include a strip-mine threading model may iteratively compute each corresponding task for each data range associated with the task.
- Tasks may be executed according to a blocked/blocking or non-blocked/non-blocking status indicator.
- executing tasks assigned to the one or more processing queues further comprises executing tasks corresponding to all non-blocked nodes of the plurality of graphing nodes then executing tasks corresponding to all blocked nodes of the plurality of graphing nodes.
- executing tasks corresponding to all non-blocked nodes of the plurality of graphing nodes further comprises repeating execution of tasks corresponding to all non-blocked nodes of the plurality of graphing nodes.
- executing tasks corresponding to all blocked nodes of the plurality of graphing nodes further comprises repeating execution of tasks corresponding to all blocked nodes of the plurality of graphing nodes once data has become available. Processing status may also be considered in accordance with various embodiments.
- the status of the node may be set to dirty status until all asynchronous data has been loaded.
- executing all dirty nodes of the plurality of graphing nodes in accordance with respectively assigned threading models may further comprise: re-executing a task corresponding to a node of the plurality of graphing nodes, wherein the re-execution comprises: locating one or more predecessor nodes of the node; determining whether the one or more predecessor nodes are set to dirty status; re-executing each task corresponding to the one or more predecessor nodes that are set to dirty status; and executing the task corresponding to the node of the plurality of graphing nodes.
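The re-execution steps above can be sketched as a recursive walk that brings dirty predecessors up to date before running the requested node. Applying the rule recursively to the predecessors' own predecessors is an interpretation, and the node names, the `predecessors` dict, and the `run` callback are hypothetical.

```python
def re_execute(node, predecessors, dirty, run):
    """Run `node`, first re-running any predecessor that is marked dirty.

    predecessors: dict mapping a node to the nodes it depends on.
    dirty:        set of nodes currently marked dirty (updated in place).
    run:          callable invoked with the name of each node that is executed.
    """
    for pred in predecessors.get(node, []):
        if pred in dirty:
            re_execute(pred, predecessors, dirty, run)
    run(node)
    dirty.discard(node)  # the node is clean once its task has completed


predecessors = {"stats": ["gate"], "gate": ["load"], "load": []}
dirty = {"gate", "stats"}
log = []
re_execute("stats", predecessors, dirty, log.append)
```

Here the clean node "load" is not recomputed, so only the work needed for an up-to-date "stats" result is performed.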
- assigning each node of the plurality of graphing nodes to one or more processing queues is accomplished by a scheduler. Also, assigning each node of the plurality of graphing nodes to one or more processing queues may further comprise grouping tasks within queues by predecessor tasks or by data sources. Assigning each node of the plurality of graphing nodes to one or more processing queues may also comprise a weighted combination of: grouping tasks within queues by predecessor tasks; and grouping tasks within queues by data sources.
- a system for optimizing data computed by multiple processors.
- the system comprises: a plurality of processors wherein: the processors are respectively coupled to a plurality of cache memories; and the processors are coupled to a common data bus; a user interface comprising a user data entry interface and a display interface; a memory for storing data and one or more instructions for execution by the plurality of processors to implement one or more functions of the data processing system to: define a plurality of tasks requiring execution; identify one or more dependencies for each respective task of the plurality of tasks, the one or more dependencies comprising at least one of a resource dependency and a memory dependency; determine dependencies for each respective task of the plurality of tasks wherein the dependencies include directional dependencies; create a directed graph, wherein the directed graph includes: a plurality of nodes, the nodes respectively corresponding to the plurality of tasks requiring execution; and a plurality of directed edges that respectively correspond to the directional dependencies, each of said directed edges originating from a no
- user interfaces may include commonly used visual, tactile, and aural input and output interface elements such as keyboards, touch screens, mice or other cursor manipulation devices, displays such as LCD panels or CRT displays, printers, speakers, and microphones.
- Instructions may be executed in any desired manner.
- the instructions to be executed by the plurality of processors further include instructions to attribute a threading model to each node of the plurality of nodes.
- the instructions to be executed by the plurality of processors may further include instructions to: assign one or more threading models, including assigning a parallel threading model to each said node where said node corresponds to a task that is independent, may operate in parallel, and has a non-blocked status; assign a strip-mine threading model to each node in a collection of nodes that correspond to tasks that operate on common data; and assign a non-threading model to nodes corresponding to tasks that must be executed sequentially.
- a branch prediction model may be assigned to a node of the plurality of nodes, the branch prediction model determined by an entropy characterization of data operated on by a function corresponding to the node. Further, each node of the plurality of nodes may be assigned to one or more processing queues and tasks corresponding to each node assigned to the one or more processing queues are executed. Each node of the plurality of nodes may be assigned to one or more processing queues based on an identified memory dependency for that node, and the consideration of memory dependency may include a desired cache level allocation. Node execution may also be determined by node processing status.
- systems and methods include for each node of the plurality of nodes, determining whether the status of each node is blocked or unblocked; and executing tasks corresponding to all non-blocked nodes of the plurality of nodes then executing tasks corresponding to all blocked nodes of the plurality of nodes.
- the system may further comprise repeating execution of tasks corresponding to all blocked nodes of the plurality of graphing nodes once data has become available.
- An embodiment of the system of the present invention includes iterative techniques to provide continued or additional computation.
- the instructions to be executed by the plurality of processors may further include instructions to perform the following steps until all data is processed: setting status of all nodes of the plurality of nodes to processed or unprocessed status; when a task corresponding to a node of the plurality of nodes has been completely computed, setting status of the node to processed status; when data corresponding to a node of the plurality of nodes has been changed, setting status of the node to unprocessed status; and executing tasks corresponding to all unprocessed nodes of the plurality of nodes in accordance with respectively assigned threading models.
- An embodiment also provides a fluorescence compensation system that comprises: a plurality of detectors coupled to a communication bus; a plurality of processors, each respectively coupled to a cache memory and the communication bus; a memory coupled to the communication bus, the memory containing instructions to be executed by the one or more processors to: load a data file, the data file comprising dye response data from the plurality of detectors for one or more dyes; analyze spectral response from each detector for the plurality of detectors; identify which of said plurality of detectors detected a dye from the one or more dyes within a predetermined detection threshold; and calculate a spectral overlap for each dye of the one or more dyes to compute a fluorescence compensation matrix.
- the instructions may be executed by the one or more processors to store the compensation matrix in the memory for analysis with the dye response data.
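A minimal sketch of computing and applying a compensation matrix, under the common assumption that each dye is measured in a single-stained control and that the observed detector signals are a linear mix of the true dye signals. The spillover matrix is each control's response normalized to its strongest detector, and the compensation matrix is its inverse. The function names and control values are hypothetical, and the detection-threshold step described above is not modeled.

```python
def spillover_matrix(controls):
    """controls[i][j]: mean signal of single-stained dye i in detector j.

    Each row is normalized by the dye's strongest (primary) detector.
    """
    return [[value / max(row) for value in row] for row in controls]


def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    aug = [
        [float(x) for x in row] + [float(i == j) for j in range(n)]
        for i, row in enumerate(matrix)
    ]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("spillover matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [x / scale for x in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def compensate(event, compensation):
    """Row vector times matrix: per-detector signals to per-dye signals."""
    n = len(event)
    return [sum(event[i] * compensation[i][j] for i in range(n)) for j in range(n)]


spill = spillover_matrix([[1000.0, 200.0], [50.0, 500.0]])
compensation = invert(spill)
corrected = compensate([130.0, 320.0], compensation)
```

In this example the observed event (130, 320) corresponds to true dye signals of about (100, 300) once the 20% and 10% overlaps are removed.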
- FIG. 1 is an illustration of an exemplary embodiment of a multiprocessor system of the present invention
- FIG. 2 is an illustration of an exemplary embodiment of a multiprocessor system of the present invention showing task allocation by processor
- FIG. 2A is a representation of a directed graph utilized by embodiments of the present invention.
- FIG. 2B is a representation of a directed graph utilized by embodiments of the present invention where nodes have been marked with a "dirty" status;
- FIG. 2C is a representation of a directed graph utilized by embodiments of the present invention where nodes have been marked with a generation number indicia
- FIG. 2D is a representation of a directed graph utilized by embodiments of the present invention where nodes have been identified as not needing processing because of independence from predecessor nodes;
- FIG. 2E is a representation of a directed graph utilized by embodiments of the present invention where nodes have been partitioned into blocking and non-blocking nodes;
- FIG. 3 is an illustration of an exemplary embodiment of a flowchart depicting a method of the present invention
- FIG. 4 is a continued illustration of an exemplary embodiment of a flowchart depicting a method of the present invention.
- FIG. 5 is a continued illustration of an exemplary embodiment of a flowchart depicting a method of the present invention.
- the system 100 includes a pool 10 of processors 10a(1), 10a(2) through 10a(n), where n may comprise any number of processors.
- the processors 10a(n) from the processor pool 10 are respectively coupled to a common bus 20, which relays data to and from other system components 30 such as displays, keyboards, mice, peripherals, sensors, detectors, storage devices, and the like; main memory 40 such as volatile dynamic random access memory (DRAM); and persistent memory 50 such as a hard drive, FLASH memory, CD-RWs, DVD+/-RWs and the like.
- system 100 may comprise a unitary or federated computer system, a personal computer system, a networked system of computing components, or distributed computers interconnected through a network such as the Internet via protocols such as TCP/IP.
- the bus 20 may comprise a local computer bus, a plurality of local computer busses, a network connection bus and protocol such as Ethernet, or any combination thereof.
- Each processor 10a further comprises a processor element and a cache element for storing frequently-accessed information.
- caches are utilized in computer systems to increase performance by decreasing the access time required to write or retrieve data from main memory.
- Caches generally comprise memory with decreased access time and/or latency, and memory locations within the cache store copies of the data from the most frequently used main memory locations. As long as most memory accesses are to cached memory locations, the average latency of memory accesses will be closer to the cache latency than to the latency of main memory.
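The latency statement above can be made concrete with a simple two-level model in which a hit costs the cache latency and a miss costs the main-memory latency. The figures used are illustrative assumptions, not measurements.

```python
def average_latency(hit_rate, cache_latency, memory_latency):
    """Average access latency when a fraction `hit_rate` of accesses hit the cache."""
    return hit_rate * cache_latency + (1.0 - hit_rate) * memory_latency


# Illustrative figures: 1 time unit for a cache hit, 100 for main memory.
mostly_cached = average_latency(0.95, 1.0, 100.0)
rarely_cached = average_latency(0.10, 1.0, 100.0)
```

With a 95% hit rate the average stays close to the cache latency, while with a 10% hit rate it approaches the latency of main memory.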
- each processor 10a comprises a core of a microprocessor such as a Pentium dual core or quad core processor, an AMD multi-core processor, a core of a CELL processor, or the like.
- each processor core includes dedicated cache memory to increase performance by decreasing data storage and retrieval latency as opposed to more lengthy times required to access main memory.
- the plurality of caches operate independently of each other, but in alternate embodiments, each of the caches in the processor pool operate coherently, that is, for example, data values that are duplicated among caches are maintained at the same value based on independent computation from each of the processors 10a.
- a multilevel cache scheme may also be employed to better utilize the resources of the system 100.
- users utilize the system 100 to create plots in order to explore their data and produce results. These plots are a combination of one or more parameters within the data. Traditionally users will create a plot and select the parameters that the plot is graphing. For creating a large number of plots this is a tedious process and the user does not see the data until the plots are created.
- all combinations of parameters of the data are automatically produced as plot thumbnails and the user creates plots by selecting the thumbnails in the user interface (e.g. by clicking). Thus, the user can select and create plots by looking at the data, rather than having to create the plot then look at the data.
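As a minimal sketch of how the thumbnail candidates might be enumerated, the following assumes that each thumbnail is a two-parameter plot and that parameters are identified by name; both are assumptions made for illustration, and the parameter names are hypothetical.

```python
from itertools import combinations


def thumbnail_pairs(parameters):
    """Return every unordered pair of parameters, one per candidate thumbnail."""
    return list(combinations(parameters, 2))


# Hypothetical parameter names, used only for illustration.
params = ["FSC", "SSC", "FL1", "FL2"]
pairs = thumbnail_pairs(params)
print(len(pairs), "candidate thumbnails")
```

With n parameters this produces n(n-1)/2 pairs, which is one reason filtering the displayed parameters may be useful for data with many parameters.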
- the thumbnails update based upon the user's focus within the application; if the user had selected an analysis region then the thumbnails may update based on that analysis region; if the user had selected a gate then the thumbnails may update based on that gate.
- the user can select subsets of plot thumbnails to add in a single operation.
- the user may wish to filter the parameters that are displayed as thumbnails and a user interface may allow selection of this, or a heuristic may filter the parameters (e.g. parameters not used in the current gate), or the user may select a profile for a particular type of input data based on matching keywords and values within the input data, or associated metadata.
- the thumbnails may initially display only a partial set of the data; in many cases where the data is distributed randomly, this may be sufficient to allow the user to begin working whilst the whole set of data is calculated in the background. Once the whole set of data is calculated, the thumbnails update.
- the partial subset may be user selectable or the application may choose; options may be a fixed number of events from the start of the data, a random sampling of the data, or a semi random sampling based on the user's selected context within the user interface (e.g. if the user was looking at a gated subset of data, then using elements from that subset plus elements from the whole data set may give a realistic representation of the whole data set).
- users perform an analysis using the tools within the application to transform the data, to find subsets of data, and to produce statistical and graphical output.
- the application maps the user designed analysis into a mathematical construct called a directed graph (See, e.g., FIG. 2A).
- a directed graph represents a set whose elements are called vertices or nodes, and includes a set of ordered pairs of directed edges connecting the vertices or nodes.
- operations or data represent the vertices or nodes in the graph construct (see FIG. 2A, 230, 232, 234, 238, 238, 239, 240, 242, 244, 250, 252, 254), and the dependencies between tasks are represented by directed edges (see FIG. 2A, arrows such as 260).
- the directional dependency relationships reflect the temporal relationships of tasks; that is, some tasks cannot be completed before a predecessor task is complete (e.g. a density plot task requires a dual parameter histogram task to be complete).
- the analysis is decomposed into a set of operations that can be performed on that data. Some of the operations become predicated on the results of other operations and so there is a dependency between the operations; that is, some operations cannot be computed until the other operations have completed.
- graph algorithms can then be used to identify the order in which the tasks should be processed.
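One such graph algorithm is a topological sort. The sketch below uses Python's standard `graphlib` module (Python 3.9 or later); the task names and the dependency map are hypothetical and merely echo the density plot example given above.

```python
from graphlib import TopologicalSorter

# Hypothetical dependency map: task -> set of tasks it depends on.
deps = {
    "compensate": set(),
    "histogram_2d": {"compensate"},
    "density_plot": {"histogram_2d"},
    "statistics": {"histogram_2d"},
}

# static_order() yields each task only after all of its predecessors.
order = list(TopologicalSorter(deps).static_order())
print(order)
```

Any order produced this way respects the directed edges, so a task is never processed before the tasks it depends on.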
- embodiments of the present invention may utilize the directed graph to determine a minimum number of nodes that need to be processed to achieve a particular result, thereby minimizing the computation time and expense.
- a filtered view of the task graph is produced using a subset of the tasks; that is, the tasks that are 'dirty' and their dependents.
- a task is 'dirty' if it has not been calculated yet or its state or data has changed so that it requires recalculation.
- a dependent task of a dirty task will also need recalculation as it is using products of the dirty task.
- a visitor algorithm is applied to the graph to mark a dirty node and all its children as dirty.
- the graph may be filtered so that only dirty nodes remain, and only the dirty nodes need be recalculated to obtain the desired result.
- Task 2 (232) has become "dirty" because, for instance, its data became out of date. Then all of its dependent "children" tasks (Tasks 2A-2D) will need recalculation.
- As shown in FIG. 2B, only the tasks within the dashed area 270 require recalculation when Task 2 (232) becomes "dirty"; computational effort and cost are saved because the remaining nodes do not require re-computation.
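The dirty-marking visitor described above can be sketched as a simple graph traversal. The adjacency map below is hypothetical: it treats Tasks 2A-2D as a chain beneath Task 2, which is an assumption made for illustration and not a transcription of FIG. 2B.

```python
def mark_dirty(children, start):
    """Return the set containing `start` and every task reachable from it."""
    dirty = set()
    stack = [start]
    while stack:
        node = stack.pop()
        if node not in dirty:
            dirty.add(node)
            stack.extend(children.get(node, ()))
    return dirty


# Hypothetical graph: task -> tasks that depend on it.
children = {
    "Task1": ["Task3"],
    "Task2": ["Task2A"],
    "Task2A": ["Task2B"],
    "Task2B": ["Task2C"],
    "Task2C": ["Task2D"],
}
dirty = mark_dirty(children, "Task2")
print(sorted(dirty))
```

Filtering the graph to the returned set leaves only the nodes that need recalculation; Task1 and Task3 are untouched in this example.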
- Each task is assigned to a generation, with each generation ordinally numbered. Referring to FIG. 2C for example, Tasks 1 and 2 (230, 232, respectively) are assigned to generation 1 (see small circled number within each task). Subsequent tasks are assigned to generations (Tasks 3 and 4 to generation 2, Tasks 2A, 5, and 6 to generation 3, Task 2B to generation 4, Task 2C to generation 5, and Tasks 2D, 5A, and 6A to generation 6). Each task within a generation does not depend on any other task within the same generation, and therefore these tasks may be executed in parallel by a multitude of processing units, further improving execution speed and efficiency.
- An additional refinement includes the ability to calculate a partial set of the task graph that includes the predecessors of a specified task. This provides functionality such as the ability to interactively manipulate a subset of the user analysis in response to user interaction and just perform the calculations needed for that action rather than calculating all products that may be affected by the action. Another example includes the situation where a user is interested in a limited portion of the analysis such as the statistical results, where just the involved tasks are calculated rather than all the other products within the analysis. For an example, by analyzing the predecessors for a given node (see FIG. 2D, Task 2D (239)), we can calculate the minimum set of nodes required to obtain the desired result. The nodes within the dashed box 280 do not need calculating, as they do not affect Task 2D (239). This approach may be combined with the dirty node functionality mentioned above to further reduce the set of nodes.
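A minimal sketch of the predecessor analysis follows. The function name, the `parents` map and the task names are hypothetical; the final line shows one way the result might be combined with a dirty set, as suggested above.

```python
def required_nodes(parents, target):
    """Return `target` plus all of its direct and indirect predecessors."""
    needed = set()
    stack = [target]
    while stack:
        node = stack.pop()
        if node not in needed:
            needed.add(node)
            stack.extend(parents.get(node, ()))
    return needed


# Hypothetical graph: task -> tasks it depends on.
parents = {
    "stats": ["hist"],
    "hist": ["load"],
    "density": ["hist2d"],
    "hist2d": ["load"],
    "load": [],
}
needed = required_nodes(parents, "stats")

# Combining with a dirty set reduces the work further.
dirty = {"hist", "stats", "hist2d", "density"}
to_run = needed & dirty
print(sorted(to_run))
```

Here the density plot branch is never visited, because it does not contribute to the requested statistics.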
- Some tasks are dependent on the number of elements within the input data set (generally they perform an operation on each element within the set) while other tasks are independent of the number of elements within the input data set (e.g. they take constant time, for example, producing a density plot bitmap from a dual histogram plot is dependent on the dimensions of the density plot and not the input data set).
- Some of these tasks therefore require all elements within the input data set to be present in order to execute, whereas some tasks can execute on a subset of the input data, and then when the rest of the input data is available, execute on the rest of the input data. But other tasks that require the whole input data to be available cannot execute, or if they are executed, will need to re-execute when the whole set is available, which adds to overall execution time.
- a blocking task is a task that requires the entire data set to be present to execute.
- a non-blocking task can be performed on incremental sets of data. Examples of blocking tasks include calculating statistics from histograms, producing density plot bitmaps from dual parameter histograms, and analysis regions that adjust based on the profile of the input data. Examples of non-blocking tasks include performing fluorescence compensation, calculating computed parameters, and building histograms. Since a non-blocking task may depend on a blocking task (e.g.
- FIG. 2E illustrates an exemplary partition of a directed graph into blocking nodes 295 and non-blocking nodes 290. The partition is established so that any descendants of blocking nodes are in the blocking partition even if the descendant is non-blocking.
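The partition rule (descendants of blocking nodes join the blocking partition) can be sketched as follows. The graph and the "export" task are hypothetical; only the classification of compensation, histogram building and statistics follows the examples given above.

```python
def partition_by_blocking(children, blocking_tasks):
    """Split nodes into (non_blocking, blocking) sets.

    A node lands in the blocking set if it is blocking itself or if it
    descends from a blocking node.
    """
    blocking = set()
    stack = list(blocking_tasks)
    while stack:
        node = stack.pop()
        if node not in blocking:
            blocking.add(node)
            stack.extend(children.get(node, ()))
    all_nodes = set(children) | {c for cs in children.values() for c in cs}
    return all_nodes - blocking, blocking


# Hypothetical graph: task -> tasks that depend on it.
children = {
    "compensate": ["histogram"],
    "histogram": ["statistics"],
    "statistics": ["export"],
    "export": [],
}
non_blocking, blocking = partition_by_blocking(children, {"statistics"})
print(sorted(non_blocking), sorted(blocking))
```

The "export" task is placed in the blocking partition even though it might be non-blocking in isolation, because its input comes from a blocking task.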
- the disk may provide data at a slower rate than it is possible to perform all the tasks on it, so the application performs non-blocking calculations on blocks of data as they arrive from the disk, and once all data has arrived from the disk, the blocking calculations can execute. If the application finds itself still waiting for data from the disk even after non-blocking calculations have executed, it may elect to perform the blocking calculations so as to provide feedback to the user, even though it would have to perform those blocking calculations again once the whole input data set is available. Since it was waiting on the external event of data arriving from the disk anyway, this may not affect the total elapsed time from starting disk loading to the final results being available to the user.
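The interplay of the two kinds of calculation can be sketched as below, assuming for illustration that values lie in the range [0, 1), that the non-blocking work is histogram building, and that the blocking work is a statistic derived from the finished histogram. The function and variable names are hypothetical.

```python
def run_streaming(chunks, n_bins=4):
    """Build a histogram chunk by chunk, then compute statistics at the end."""
    hist = [0] * n_bins
    for chunk in chunks:
        # Non-blocking step: each arriving block updates the histogram.
        for value in chunk:
            hist[min(int(value * n_bins), n_bins - 1)] += 1
    # Blocking step: needs the complete histogram.
    total = sum(hist)
    mode_bin = max(range(n_bins), key=hist.__getitem__)
    return hist, total, mode_bin


# Two blocks of data, as if arriving from disk one after the other.
hist, total, mode_bin = run_streaming([[0.1, 0.2], [0.6, 0.9, 0.95, 0.99]])
print(hist, total, mode_bin)
```

An interim result for user feedback could be obtained by running the blocking step on the partial histogram, at the cost of repeating it once all data has arrived.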
- tasks are grouped as to a threading preference.
- There are three threading preferences available: DoNotThread, Parallel, and Stripmine. This enables further optimization of performance for a number of reasons.
- Microprocessors are much faster than the memory that they are connected to, and therefore accessing memory is slow and involves the processor waiting on memory to deliver data or instructions to it.
- Microprocessors compensate for this by having a set of very fast local cache memory. When data is requested, it is fetched into this cache, and subsequent access to memory that is already in the cache is much faster. The amount of cache is small compared to the main memory size, however, and infrequently used data is evicted from the cache to main memory.
- whole data sets 220 can be partitioned 210 into subsets that fit within a cache, and each processor 10a in the pool of processors 10 takes an individual partition and executes the tasks on the data in the partition. Once a processor has completed a partition, it is allocated the next available partition that has not been executed.
- this task-cache allocation process may be referred to as "stripmining." As some partitions may execute faster than others (due to properties of the input data, such as a temporal element to the user's experiment), this helps to balance out the processing load across processors 10a as they are less affected by any variation in the execution time of partitions of the input data set.
- One advantage of stripmining is that, where the output data of a task has a 1 to 1 correspondence with the input data but the output data is shared between tasks (e.g. the task is writing a single bit per element of the input data into a word of memory per element), there is no need for locking within the task, and the task executes in a lock-free manner. It may be advantageous to sort the stripmining tasks within a build generation by their links to input data; on a processor with multiple levels of cache, consecutive tasks then have a greater chance of finding their data within nearer (and thus faster) levels of cache within the processor.
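The stripmining schedule can be sketched with a worker pool as below. This is an illustration of the scheduling pattern only: the function name and parameters are hypothetical, `strip_size` stands in for a size chosen so that a strip fits within a cache, and Python threads do not by themselves provide the cache-level behavior or CPU parallelism described above.

```python
from concurrent.futures import ThreadPoolExecutor


def stripmine(data, strip_size, task, workers=4):
    """Apply `task` to consecutive strips of `data` using a pool of workers."""
    strips = [data[i:i + strip_size] for i in range(0, len(data), strip_size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Every strip is queued; a worker that finishes one strip takes the
        # next pending strip. Results are returned in strip order.
        results = list(pool.map(task, strips))
    # Output has a 1 to 1 correspondence with input, so strips are joined
    # back in order without any locking inside the task.
    return [x for strip in results for x in strip]


doubled = stripmine(list(range(10)), 3, lambda strip: [2 * x for x in strip])
print(doubled)
```

Because each worker pulls the next strip only when it becomes free, a slow strip delays one worker rather than the whole pool.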
- Parallel tasks are tasks where it is better to assign tasks to a pool, and each processor within the system is assigned a task from the pool. The processor executes the task, and upon completion is assigned another task from the pool until the pool is empty.
- These tasks may be examples such as building a two dimensional histogram, where the access pattern of writes and reads to the histogram is relatively random because it is dependent on random unpredictable input data, and this histogram is large relative to the cache size and therefore it is advantageous to keep the histogram in the cache rather than the input data.
- Some tasks may not be safe to execute in parallel, and therefore require execution in sequence.
- Embodiments refer to these tasks as DoNotThread tasks, and they are executed sequentially by the system.
- data within the application is stored within a Structure of Arrays format.
- the Array of Structures format lays out data as the parameters for each event being contiguous, repeating for each new event. If there were 6 parameters per event, then the first 6 locations of memory would represent the 6 parameters of the first event, the next 6 locations would represent the 6 parameters of the second event, and so on.
- the Structure of Arrays format groups by parameter rather than event, so that the first 6 locations in memory would be the value of the 1st parameter for the first 6 events.
- SoA Structure of Arrays
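The difference between the two layouts can be expressed as index arithmetic over a flat block of memory. The function names are hypothetical; events and parameters are numbered from zero.

```python
def aos_index(event, param, n_params):
    """Array of Structures: all parameters of one event are contiguous."""
    return event * n_params + param


def soa_index(event, param, n_events):
    """Structure of Arrays: all events of one parameter are contiguous."""
    return param * n_events + event


# With 6 parameters per event, the second event starts at location 6 in AoS.
print(aos_index(1, 0, n_params=6))

# In SoA, the first 6 locations hold parameter 0 for events 0..5.
print([soa_index(e, 0, n_events=1000) for e in range(6)])
```

A task that reads a single parameter for every event therefore touches consecutive memory locations in the Structure of Arrays layout, which tends to use each fetched cache line fully.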
- Modern microprocessors may contain logic to predict which way a branch in program execution will be taken, and perform speculative work beyond the branch ahead of time. When the logic mispredicts the branch there is a significant penalty as the processor has to throw away the speculative work and restore its internal state. When it predicts correctly, performance is improved however.
- the operations within a system depend on input data that has a large degree of randomness. Where the input data is random the branch prediction does not perform well. For some algorithms it may be possible to implement them as variations that use a branch or do not use a branch. If the amount of randomness within the input data can be measured, it is possible to select a branched implementation for less random data, and a branchless implementation for random data. Depending on the relative execution time of the implementations, the level of randomness can be chosen as to where to switch between the implementations.
- the measurement of the level of randomness may be derived from a predecessor task's results. For example, a count of events within a gate relative to the total event count will give an indication of the randomness of the gated data for gating. Alternatively it may in some cases be actually faster to measure the randomness of the data and then use either implementation of the algorithm.
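The selection between implementations can be sketched as follows. Everything here is an assumption made for illustration: the gate is a one-dimensional range, the measure of unpredictability is the distance of the gated fraction from 0 or 1, and the switch-over threshold of 0.1 is arbitrary. In an interpreted language the two variants do not differ at the hardware level, so the sketch shows only the selection logic.

```python
def count_in_gate_branched(values, lo, hi):
    """Count values inside [lo, hi) using a conditional branch per value."""
    count = 0
    for v in values:
        if lo <= v < hi:
            count += 1
    return count


def count_in_gate_branchless(values, lo, hi):
    """Count values inside [lo, hi) by summing comparison results."""
    return sum((lo <= v) & (v < hi) for v in values)


def pick_implementation(gate_fraction, threshold=0.1):
    """Choose a variant from the fraction of events that fall in the gate."""
    # Fractions near 0 or 1 are predictable; fractions near 0.5 are not.
    unpredictability = min(gate_fraction, 1.0 - gate_fraction)
    if unpredictability > threshold:
        return count_in_gate_branchless
    return count_in_gate_branched


impl = pick_implementation(gate_fraction=0.5)
print(impl.__name__, impl([1, 5, 9], 2, 8))
```

In practice the threshold would be chosen from the measured relative execution time of the two implementations, as described above.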
- the randomness over the whole data set may vary, e.g. where a user has added a reagent mid-experiment that then affects the subsequent data, and hence its randomness. For example, 50% of the data may fall within a gate, but because the gate was on a kinetics experiment, that 50% occurs in the first 50% of the data set. A measurement over the whole data set would ordinarily imply selection of a branchless implementation, which would actually reduce performance.
- different implementations of the algorithm can be applied to different subsets, and this would solve the performance issue; for sufficiently large subsets the cost of the sub-measurement can be amortized.
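The kinetics example can be reproduced with a per-subset measurement. The sketch below assumes that gate membership is already available as a list of 0/1 flags and that subsets are fixed-size consecutive ranges; both are illustrative assumptions.

```python
def gate_fraction_per_subset(in_gate_flags, subset_size):
    """Return the gated fraction for each consecutive subset of the data."""
    fractions = []
    for start in range(0, len(in_gate_flags), subset_size):
        subset = in_gate_flags[start:start + subset_size]
        fractions.append(sum(subset) / len(subset))
    return fractions


# Kinetics-like data: the first half is entirely in the gate, the rest is not.
flags = [1] * 50 + [0] * 50
overall = sum(flags) / len(flags)
per_subset = gate_fraction_per_subset(flags, 25)
print(overall, per_subset)
```

The overall fraction of 0.5 suggests unpredictable data, while every subset is in fact perfectly predictable, so a branched implementation could be selected for each subset.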
- analysis tools e.g., gates, regions, plots
- the same set of tools can be used to analyze results from entire sets of data.
- a protocol for a user's experiment may produce a series of statistics (for example CD4 count, CD8 count). But if the user has a whole series of data files analyzed using the same protocol, then these statistics form a multi-parameter dataset themselves, which can in turn be analyzed using the same toolset. Instead of each 'event' representing a cell or other particle, each event represents a whole sample that the user has analyzed. Thus, a researcher could use the same tools he uses for analyzing a single file to analyze a whole study, and look for populations of samples within the study.
- Embodiments of the present invention may be better understood when considering the flow chart 300 shown in FIGS. 3-5.
- the tasks are analyzed for dependencies and are analyzed for blocking/non-blocking status 305.
- Nodes may then be instantiated to represent the defined tasks, whereupon indicia associated with node data structures are initially set to "dirty" 315.
- Edges may then be added to represent the dependencies between nodes 320, for example, references are added when a node has an edge directed to it, and a reference is not added when a node has an edge originating from it.
- a reference count is performed 325. If after assigning all nodes and edges, a node has no references, then nodes may be optionally deleted 330, 335. Blocking attributes are added to blocking nodes and all nodes in all generations dependent therefrom 340, and the set of nodes is partitioned by blocking/nonblocking status 345. Within each partition, data structures associated with the nodes are assigned 350 a generation number. For example, generation 1 is assigned to nodes with no dependencies, and generation 2 is assigned to nodes dependent upon generation 1. Nodes that depend from a generation "n-1," where n is any arbitrary generation number, are assigned generation number n. If a node has multiple dependencies, then generation of the node is set to be the maximum of the node's parent node generations plus 1.
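The generation numbering rule (1 for nodes with no dependencies, otherwise the maximum parent generation plus 1) can be sketched as below. The function name and the small graph are hypothetical, and the sketch assumes the graph is acyclic.

```python
def assign_generations(parents):
    """Map each node to 1 + the largest generation among its parents."""
    generation = {}

    def visit(node):
        if node not in generation:
            parent_gens = (visit(p) for p in parents.get(node, ()))
            # Nodes without dependencies get generation 1.
            generation[node] = 1 + max(parent_gens, default=0)
        return generation[node]

    for node in parents:
        visit(node)
    return generation


# Hypothetical graph: node -> nodes it depends on.
parents = {"A": [], "B": [], "C": ["A"], "D": ["C", "B"]}
generations = assign_generations(parents)
print(generations)
```

Node D depends on C (generation 2) and B (generation 1), so it receives generation 3. Nodes that share a generation number do not depend on one another and are candidates for parallel execution.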
- Threading models are assigned 355 to each node, including an optional branch prediction 355.
- a threading model may be assigned as (a) a parallel model, when nodes are independent and non-blocking; (b) a Strip-mine model, when nodes operate on common data; (c) a Do Not Thread model for tasks that must be executed sequentially; and (d) combinations of those factors that may be desirable for conditions such as load balancing.
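One way to represent the threading models is an enumeration with a simple selection rule. The two boolean inputs are hypothetical simplifications of the criteria listed above, and the sketch does not cover combined models chosen for load balancing.

```python
from enum import Enum


class ThreadingModel(Enum):
    DO_NOT_THREAD = "DoNotThread"
    PARALLEL = "Parallel"
    STRIPMINE = "Stripmine"


def choose_model(must_run_sequentially, operates_on_common_data):
    """Pick a threading model from two simplified task properties."""
    if must_run_sequentially:
        return ThreadingModel.DO_NOT_THREAD
    if operates_on_common_data:
        return ThreadingModel.STRIPMINE
    return ThreadingModel.PARALLEL


print(choose_model(False, True))
```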
- Nodes may be assigned to one or more processing queues by any method, such as via a scheduler 360. If a task is designated with a Strip-mine threading model, each data range is iteratively computed for each task. In one embodiment, assignment to processing queues may be performed by (a) grouping tasks within queues by predecessor tasks, (b) grouping tasks within queues by data sources, or a weighted combination of (a) and (b).
- Non-blocking nodes may be executed first, then blocking nodes according to threading model 365.
- Non-blocking node execution may occur multiple times before blocking node computation, if desired. Examples include data that has partially been loaded. Blocking node execution may be repeated for interim calculations or after all data becomes available.
- Iteration may then be undertaken to perform a full, partial, or incremental analysis.
- all nodes are visited 370 to mark their processing status as "clean" or "dirty" (i.e. processed, or unprocessed). Once all nodes are computed, then they are marked as clean, meaning their computation is completed. If data is being loaded asynchronously, nodes are assigned dirty status until the last pass is complete. If a node's data has changed since previous calculation, the node and the node's dependent nodes are assigned dirty status.
- All dirty nodes are executed 375 in accordance with previous thread model/generation partitions.
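Executing only the dirty nodes, in dependency order, and marking each one clean afterwards can be sketched as below. The function name, the graph and the `execute` callback are hypothetical, and the sketch runs sequentially rather than applying a threading model.

```python
from graphlib import TopologicalSorter


def run_dirty(parents, dirty, execute):
    """Execute dirty nodes in dependency order and mark each one clean."""
    ran = []
    for node in TopologicalSorter(parents).static_order():
        if node in dirty:
            execute(node)
            dirty.discard(node)  # the node is now clean
            ran.append(node)
    return ran


# Hypothetical graph: node -> nodes it depends on.
parents = {"load": set(), "hist": {"load"}, "stats": {"hist"}}
dirty = {"hist", "stats"}
ran = run_dirty(parents, dirty, execute=lambda node: None)
print(ran, dirty)
```

The clean "load" node is skipped, and the dirty set is empty once the pass completes.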
- To re-execute 380 a task corresponding to a node somewhere in the graph, its predecessors are determined 383, and it is determined 385 whether the node and/or its ancestor nodes are dirty. If a node is dirty, each relevant predecessor task is re-executed 385, and then the task corresponding to the specified node is executed 387. This process continues 390 until data is entirely processed (for instance, no more screen updates remain). It is to be understood that the foregoing description is exemplary and explanatory only and is not restrictive of the invention, as disclosed or claimed.
Landscapes
- Engineering & Computer Science (AREA)
- Software Systems (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Debugging And Monitoring (AREA)
- Investigating, Analyzing Materials By Fluorescence Or Luminescence (AREA)
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US94638107P | 2007-06-26 | 2007-06-26 | |
PCT/IB2008/003706 WO2009044296A2 (en) | 2007-06-26 | 2008-06-26 | System and method for optimizing data analysis |
Publications (2)
Publication Number | Publication Date |
---|---|
EP2176754A2 true EP2176754A2 (de) | 2010-04-21 |
EP2176754B1 EP2176754B1 (de) | 2019-10-16 |
Family
ID=40162380
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
EP08835071.5A Active EP2176754B1 (de) | 2007-06-26 | 2008-06-26 | System und verfahren zur optimierung einer datenanalyse |
Country Status (3)
Country | Link |
---|---|
US (1) | US8166479B2 (de) |
EP (1) | EP2176754B1 (de) |
WO (1) | WO2009044296A2 (de) |
Family Cites Families (23)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US3662401A (en) * | 1970-09-23 | 1972-05-09 | Collins Radio Co | Method of program execution |
GB9112754D0 (en) | 1991-06-13 | 1991-07-31 | Int Computers Ltd | Data processing apparatus |
CA2076293A1 (en) | 1991-10-11 | 1993-04-12 | Prathima Agrawal | Multiprocessor computer for solving sets of equations |
US5675517A (en) | 1995-04-25 | 1997-10-07 | Systemix | Fluorescence spectral overlap compensation for high speed flow cytometry systems |
US6044394A (en) | 1997-07-21 | 2000-03-28 | International Business Machines Corporation | Managing independently executing computer tasks that are interrelated by dataflow |
JPH11175355A (ja) | 1997-12-15 | 1999-07-02 | Sony Corp | 情報処理装置及び方法、オペレーティングシステム並びにコンピュータ読み取り可能な媒体 |
JPH11175357A (ja) | 1997-12-17 | 1999-07-02 | Chokosoku Network Computer Gijutsu Kenkyusho:Kk | タスク管理方法 |
US6189141B1 (en) | 1998-05-04 | 2001-02-13 | Hewlett-Packard Company | Control path evaluating trace designator with dynamically adjustable thresholds for activation of tracing for high (hot) activity and low (cold) activity of flow control |
US6553355B1 (en) * | 1998-05-29 | 2003-04-22 | Indranet Technologies Limited | Autopoietic network system endowed with distributed artificial intelligence for the supply of high volume high-speed multimedia telesthesia telemetry, telekinesis, telepresence, telemanagement, telecommunications, and data processing services |
US6499023B1 (en) * | 1999-02-19 | 2002-12-24 | Lucent Technologies Inc. | Data item evaluation based on the combination of multiple factors |
US7024316B1 (en) | 1999-10-21 | 2006-04-04 | Dakocytomation Colorado, Inc. | Transiently dynamic flow cytometer analysis system |
US6748518B1 (en) | 2000-06-06 | 2004-06-08 | International Business Machines Corporation | Multi-level multiprocessor speculation mechanism |
US7089557B2 (en) | 2001-04-10 | 2006-08-08 | Rusty Shawn Lee | Data processing system and method for high-efficiency multitasking |
US20030078703A1 (en) | 2001-10-19 | 2003-04-24 | Surromed, Inc. | Cytometry analysis system and method using database-driven network of cytometers |
US6954722B2 (en) | 2002-10-18 | 2005-10-11 | Leland Stanford Junior University | Methods and systems for data analysis |
US20060015291A1 (en) | 2002-10-18 | 2006-01-19 | Leland Stanford Junior University | Methods and systems for data analysis |
US8381037B2 (en) | 2003-10-09 | 2013-02-19 | International Business Machines Corporation | Method and system for autonomic execution path selection in an application |
US7321983B2 (en) | 2003-12-09 | 2008-01-22 | Traverse Systems Llc | Event sensing and meta-routing process automation |
US7743376B2 (en) | 2004-09-13 | 2010-06-22 | Broadcom Corporation | Method and apparatus for managing tasks in a multiprocessor system |
US20060168571A1 (en) | 2005-01-27 | 2006-07-27 | International Business Machines Corporation | System and method for optimized task scheduling in a heterogeneous data processing system |
US20060294499A1 (en) * | 2005-06-24 | 2006-12-28 | John Shim | A method of event driven computation using data coherency of graph |
JP3938387B2 (ja) | 2005-08-10 | 2007-06-27 | インターナショナル・ビジネス・マシーンズ・コーポレーション | コンパイラ、制御方法、およびコンパイラ・プログラム |
US20070143759A1 (en) | 2005-12-15 | 2007-06-21 | Aysel Ozgur | Scheduling and partitioning tasks via architecture-aware feedback information |
2008
- 2008-06-26 WO PCT/IB2008/003706 patent/WO2009044296A2/en active Application Filing
- 2008-06-26 US US12/147,312 patent/US8166479B2/en active Active
- 2008-06-26 EP EP08835071.5A patent/EP2176754B1/de active Active
Non-Patent Citations (1)
Title |
---|
See references of WO2009044296A2 * |
Also Published As
Publication number | Publication date |
---|---|
US20090007127A1 (en) | 2009-01-01 |
WO2009044296A3 (en) | 2009-12-10 |
US8166479B2 (en) | 2012-04-24 |
WO2009044296A2 (en) | 2009-04-09 |
EP2176754B1 (de) | 2019-10-16 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US8166479B2 (en) | Optimizing data analysis through directional dependencies of a graph including plurality of nodes and attributing threading models and setting status to each of the nodes | |
CN110399222B (zh) | Method, apparatus, and electronic device for parallelizing deep learning tasks on a GPU cluster | |
Blelloch et al. | Provably good multicore cache performance for divide-and-conquer algorithms | |
US9063982B2 (en) | Dynamically associating different query execution strategies with selective portions of a database table | |
Gautam et al. | A survey on job scheduling algorithms in big data processing | |
Breß et al. | Efficient co-processor utilization in database query processing | |
US11966785B2 (en) | Hardware resource configuration for processing system | |
US8788986B2 (en) | System and method for capacity planning for systems with multithreaded multicore multiprocessor resources | |
US11475342B2 (en) | Systems, methods, and apparatuses for solving stochastic problems using probability distribution samples | |
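TWI827792B (zh) | Multipath neural network, resource allocation method, and multipath neural network analyzer | |
CN107908536B (zh) | Method and system for performance evaluation of GPU applications in a CPU-GPU heterogeneous environment | |
CN116501505A (zh) | Data flow generation method, apparatus, device, and medium for load tasks | |
CN113452546A (zh) | Dynamic quality-of-service management for deep learning training communication | |
CN110415162B (zh) | Adaptive graph partitioning method for heterogeneous fused processors in big data | |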
US7689958B1 (en) | Partitioning for a massively parallel simulation system | |
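TWI782845B (zh) | System and method for predicting configuration settings of general-purpose graphics processing unit kernel functions | |
CN118502964B (zh) | CUDA simulation implementation method for tokamak neoclassical toroidal viscous torque | |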
Gautama et al. | Low-cost static performance prediction of parallel stochastic task compositions | |
US11874836B2 (en) | Configuring graph query parallelism for high system throughput | |
WO2024212617A1 (zh) | Virtual machine associated scheduling method based on graph partitioning | |
Gong et al. | Intermediate Data Placement Strategy for Different Data Skew Levels Based on Random Sampling in Spark | |
Neytcheva et al. | Multidimensional performance and scalability analysis for diverse applications based on system monitoring data | |
Gallet | Efficient Euclidean Distance Calculations and Distance Similarity Searches: An Examination of Heterogeneous CPU, GPU, and Tensor Core Architectures | |
Marszalek | The Analysis of Energy Performance in Use Parallel Merge Sort Algorithms | |
US20120303337A1 (en) | Systems and methods for improving the execution of computational algorithms |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PUAI | Public reference made under article 153(3) epc to a published international application that has entered the european phase |
Free format text: ORIGINAL CODE: 0009012 |
|
17P | Request for examination filed |
Effective date: 20100205 |
|
AK | Designated contracting states |
Kind code of ref document: A2 Designated state(s): AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC MT NL NO PL PT RO SE SI SK TR |
|
AX | Request for extension of the european patent |
Extension state: AL BA MK RS |
|
17Q | First examination report despatched |
Effective date: 20100715 |
|
DAX | Request for extension of the european patent (deleted) | ||
GRAP | Despatch of communication of intention to grant a patent |
Free format text: ORIGINAL CODE: EPIDOSNIGR1 |
|
STAA | Information on the status of an ep patent application or granted ep patent |
Free format text: STATUS: GRANT OF PATENT IS INTENDED |
|
INTG | Intention to grant announced |
Effective date: 20190507 |
|
GRAS | Grant fee paid |
Free format text: ORIGINAL CODE: EPIDOSNIGR3 |
|
GRAA | (expected) grant |
Free format text: ORIGINAL CODE: 0009210 |
|
STAA | Information on the status of an ep patent application or granted ep patent |
Free format text: STATUS: THE PATENT HAS BEEN GRANTED |
|
AK | Designated contracting states |
Kind code of ref document: B1 Designated state(s): AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC MT NL NO PL PT RO SE SI SK TR |
|
REG | Reference to a national code |
Ref country code: GB Ref legal event code: FG4D |
|
REG | Reference to a national code |
Ref country code: CH Ref legal event code: EP |
|
REG | Reference to a national code |
Ref country code: DE Ref legal event code: R096 Ref document number: 602008061437 Country of ref document: DE |
|
REG | Reference to a national code |
Ref country code: IE Ref legal event code: FG4D |
|
REG | Reference to a national code |
Ref country code: AT Ref legal event code: REF Ref document number: 1191958 Country of ref document: AT Kind code of ref document: T Effective date: 20191115 |
|
REG | Reference to a national code |
Ref country code: NL Ref legal event code: MP Effective date: 20191016 |
|
REG | Reference to a national code |
Ref country code: LT Ref legal event code: MG4D |
|
REG | Reference to a national code |
Ref country code: AT Ref legal event code: MK05 Ref document number: 1191958 Country of ref document: AT Kind code of ref document: T Effective date: 20191016 |
|
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] |
Ref country code: NO Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20200116 Ref country code: FI Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20191016 Ref country code: PT Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20200217 Ref country code: BG Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20200116 Ref country code: LV Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20191016 Ref country code: SE Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20191016 Ref country code: PL Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20191016 Ref country code: AT Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20191016 Ref country code: GR Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20200117 Ref country code: ES Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20191016 Ref country code: LT Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20191016 Ref country code: NL Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20191016 |
|
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] |
Ref country code: HR Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20191016 Ref country code: IS Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20200224 |
|
REG | Reference to a national code |
Ref country code: DE Ref legal event code: R097 Ref document number: 602008061437 Country of ref document: DE |
|
PG2D | Information on lapse in contracting state deleted |
Ref country code: IS |
|
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] |
Ref country code: DK Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20191016 Ref country code: EE Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20191016 Ref country code: RO Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20191016 Ref country code: CZ Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20191016 Ref country code: IS Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20200216 |
|
PLBE | No opposition filed within time limit |
Free format text: ORIGINAL CODE: 0009261 |
|
STAA | Information on the status of an ep patent application or granted ep patent |
Free format text: STATUS: NO OPPOSITION FILED WITHIN TIME LIMIT |
|
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] |
Ref country code: IT Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20191016 Ref country code: SK Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20191016 |
|
26N | No opposition filed |
Effective date: 20200717 |
|
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] |
Ref country code: SI Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20191016 |
|
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] |
Ref country code: MC Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20191016 |
|
REG | Reference to a national code |
Ref country code: CH Ref legal event code: PL |
|
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] |
Ref country code: LU Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES Effective date: 20200626 |
|
REG | Reference to a national code |
Ref country code: BE Ref legal event code: MM Effective date: 20200630 |
|
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] |
Ref country code: FR Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES Effective date: 20200630 Ref country code: LI Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES Effective date: 20200630 Ref country code: IE Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES Effective date: 20200626 Ref country code: CH Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES Effective date: 20200630 |
|
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] |
Ref country code: BE Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES Effective date: 20200630 |
|
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] |
Ref country code: TR Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20191016 Ref country code: MT Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20191016 Ref country code: CY Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20191016 |
|
PGFP | Annual fee paid to national office [announced via postgrant information from national office to epo] |
Ref country code: GB Payment date: 20240612 Year of fee payment: 17 |
|
PGFP | Annual fee paid to national office [announced via postgrant information from national office to epo] |
Ref country code: DE Payment date: 20240628 Year of fee payment: 17 |