WO2012051757A1 - Method and tool suite device for identifying modular structure in complex network - Google Patents

Method and tool suite device for identifying modular structure in complex network

Info

Publication number
WO2012051757A1
WO2012051757A1, PCT/CN2010/077949, CN2010077949W
Authority
WO
WIPO (PCT)
Prior art keywords
parallel processing
task
data
block
processing units
Prior art date
Application number
PCT/CN2010/077949
Other languages
French (fr)
Inventor
Rui Wang
Xingyuan Chen
Huachang Li
Original Assignee
Beijing Prosperous Biopharm Co., Ltd.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Prosperous Biopharm Co., Ltd. filed Critical Beijing Prosperous Biopharm Co., Ltd.
Priority to CN201080051364.2A priority Critical patent/CN102667710B/en
Priority to PCT/CN2010/077949 priority patent/WO2012051757A1/en
Publication of WO2012051757A1 publication Critical patent/WO2012051757A1/en

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00Administration; Management
    • G06Q10/04Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00Commerce
    • G06Q30/02Marketing; Price estimation or determination; Fundraising

Definitions

  • the present invention relates generally to modular structure identification, and in particular, to a method and a tool suite device for identifying modular structure in a complex network, and a computing system.
  • a complex network is a network with non-trivial topological features.
  • the study of complex networks is inspired by empirical study of real networks such as networks in biotechnology (cell networks, protein-protein interaction networks, neuro-networks), Internet/WWW (World Wide Web) networks, social networks, etc.
  • One of the most pervasive features in such real networks is the existence of modular structure, or clustering, i.e., if a graph is used to represent a complex network, organization of vertices in the clusters, with many edges within the same cluster and relatively few edges connecting vertices from different clusters.
  • Identifying modular structure in a complex network is of great importance for understanding real problems the graph represents, e.g., tracking online viruses, community behaviors analysis in, for example, social network services, detecting important gene functions, etc.
  • There have been some methods in the related art of identifying the modular structure in a complex network which are based on sequential computing devices, namely, CPUs (Central Processing Units).
  • Traditional methods, such as hierarchical clustering, partitioning clustering and spectral clustering, divisive methods, e.g., the Girvan-Newman algorithm, and modularity-based greedy algorithms, take hours to complete the computation due to the massive scale of the complex networks. For instance, in the field of social networks, Facebook announced 400 million users in February 2010.
  • a corresponding graph has millions of edges and vertices.
  • the existing real complex networks are characterized by a massive amount of data and high computational complexity in both time and storage space. Detecting modular structures in the complex networks using CPUs therefore suffers from long execution time, poor user interaction and energy inefficiency.
  • supercomputer workstations or high-performance computing clusters, though capable of completing the computation in a short time, are expensive, developer-unfriendly, and raise an entry barrier for common research and business entities.
  • the present invention provides a tool suite device, a method and a system for identifying a modular structure in a complex network that are capable of completing the computation in a short time while saving in cost.
  • a tool suite device for identifying a modular structure in a complex network using a computing system with a CPU and a parallel processing device, the tool suite device comprising a data reading means on the CPU for reading task data which includes nodes in the complex network, edges with values indicating relationships among the nodes, and task parameter for a task to be performed on the complex network; a block storage means on the CPU for storing a predefined set of sub-blocks each of which indicates a particular process; a determining means on the CPU for determining a task block for assigning subtask process to be performed in a plurality of parallel processing units on the parallel processing device, respectively, from the predefined set of sub-blocks stored in the block storage means according to the task data; a first interface on the CPU for receiving the task block transferred from the determining means; a dispatcher means on the CPU for dividing the task data into a plurality of data subsets with respect to the plurality of parallel processing units; a second interface on the CPU for receiving
  • a method of identifying a modular structure in a complex network using a computing system with a CPU and a parallel processing device comprising reading, by a data reading means on the CPU, task data which includes nodes in the complex network, edges with values indicating relationships among the nodes, and task parameter for a task to be performed on the complex network; determining, by a determining means on the CPU, a task block for assigning subtask process to be performed in a plurality of parallel processing units on the parallel processing device, respectively, from a predefined set of sub-blocks each of which indicates a particular process, according to the task data, and transferring the task block to a first interface on the CPU; transferring the task block, by the first interface, to a first frontend on the parallel processing device; passing, by the first frontend, the task block to an assembler means on the parallel processing device; generating, by the assembler means, the subtask process readable by the plurality of parallel processing units from the task block
  • a system for identifying modular structure in a complex network comprising a CPU and a parallel processing device.
  • the CPU includes a data reading means for reading task data which includes nodes in the complex network, edges with values indicating relationships among the nodes, and task parameter for a task to be performed on the complex network; a block storage means for storing a predefined set of sub-blocks each of which indicates a particular process; a determining means for determining a task block for assigning subtask process to be performed in a plurality of parallel processing units on the parallel processing device, respectively, from the predefined set of sub-blocks stored in the block storage means according to the task data; a first interface for receiving the task block transferred from the determining means; a dispatcher means for dividing the task data into a plurality of data subsets with respect to the plurality of parallel processing units; and a second interface for receiving the task data transferred from the dispatcher means.
  • the parallel processing device includes: a first frontend on the parallel processing device connected to the first interface for receiving the task block transferred from the first interface; an assembler means on the parallel processing device for receiving the task block passed from the first frontend, generating the subtask process readable by the plurality of parallel processing units from the task block and assigning the subtask process to the plurality of parallel processing units; a second frontend on the parallel processing device connected to the second interface for receiving the plurality of data subsets from the second interface and passing the plurality of data subsets to the plurality of parallel processing units, respectively; the plurality of parallel processing units for performing in parallel, the subtask process assigned by the assembler means on the data subsets, respectively, to obtain parallel results; and a classification means for processing the parallel results to obtain the modular structure in the complex network.
  • FIG.1 is a block diagram illustrating a computing system in accordance with an embodiment of the invention.
  • FIG.2 is a block diagram illustrating a computing system in accordance with another embodiment of the invention.
  • FIG.3 illustrates a structure example of the complex network data.
  • FIG.4 shows an embodiment of the block storage means storing the predefined set of sub-blocks.
  • FIG.5 is a flow chart of a BFS routine that is applicable in the graph search sub-block.
  • FIG.6 is a flow chart of the routine indicated by the centrality measure sub-block.
  • FIG.7 is a flow chart of a routine performed by the classification means.
  • FIG.8 shows a flow chart of a method of identifying a modular structure in a complex network in accordance with another embodiment of the invention.
  • FIG.1 is a block diagram illustrating a computing system 1000 in accordance with an embodiment of the invention.
  • the computing system 1000 includes a CPU 200 and a GPU 300 as an example of a parallel processing device.
  • the numbers of the CPUs and GPUs contained in the computing system 1000 are not limited to only one and may be altered as necessary.
  • GPU 300 is shown in FIG.1 as the parallel processing device, the specific product of the parallel processing device may be changed in different cases.
  • the parallel processing device may be a plurality of processing units distributed on a network that can perform parallel processing and communicate data or information with each other, such as a LAN (Local Area Network) or a WAN (Wide Area Network).
  • GPU 300 may use the general-purpose graphics processing unit (GPGPU) platform Compute Unified Device Architecture (CUDA) developed by nVidia.
  • other commercially available GPUs can be applied under the spirit and scope of the present invention.
  • the CPU 200 includes a data reading means 210, a dispatcher means 220, a block storage means 230, a determining means 240, a first interface 250 and a second interface 255.
  • the GPU 300 comprises a plurality of parallel processing units 310-1, 310-2 ..., 310-N, where N is an integer.
  • the plurality of parallel processing units 310-1, 310-2, ..., 310-N will be collectively referred to as the parallel processing units 310 hereinafter as appropriate.
  • the GPU 300 further comprises a first frontend 320, a second frontend 325, an assembler means 330, and a classification means 340.
  • the classification means 340 is shown as included in the GPU 300, it can be located alternatively.
  • the classification means 340 may be located in the CPU 200, and in such case, the present invention is also applicable.
  • the data reading means 210 reads task data which includes nodes in a complex network, edges with values indicating relationships among the nodes, and task parameter for a task to be performed on the complex network.
  • the modular structure to be determined by the computing system 1000 may be comprised of a plurality of communities each of which contains the nodes that have closer relationships.
  • in a protein-protein interaction network, the interactions between proteins are important for almost all biological functions.
  • signals from the exterior of a cell are mediated to the inside of that cell by protein-protein interactions of the signaling molecules.
  • This process, called signal transduction, plays a fundamental role in many biological processes and in many diseases, e.g. cancers.
  • the nodes in the task data can be various proteins, respectively.
  • the relationships among the nodes indicate interactions among the proteins.
  • the task parameters describe the task for determining the modular structure of the proteins, and each community in the modular structure contains the proteins that tend to interact among them.
  • the nodes can represent different members in the social network, and the edges with corresponding values represent specific relationships among the nodes.
  • the relationships, denoted by edges with corresponding values may represent spatial distances between work locations of two arbitrary staff members. In such case, the staff members having closer distances can be deemed as in a department. Therefore, the modular structure obtained finally contains the staff members in the same department.
  • the task parameter may be used to control the modular structure obtained finally.
  • the task parameter may specify a condition that should be met by the final modular structure, such as a maximum size of the modules in the modular structure or a minimum number of modules in the modular structure.
  • the block storage means 230 stores a predefined set of sub-blocks each of which indicates a particular process.
  • the predefined sub-blocks will be described in detail later.
  • the determining means 240 determines a task block from the predefined set of sub-blocks stored in the block storage means 230 according to the task data. Then the determining means 240 transfers the task block to the first interface 250.
  • the task block is used to assign the subtask processes to be performed in the plurality of parallel processing units 310, respectively.
  • the determination of the task block may be customized to specific tasks and will be exemplified below.
  • the dispatcher means 220 divides the task data into a plurality of data subsets with respect to the plurality of parallel processing units 310 on the parallel processing device (GPU) 300.
  • the dispatcher means 220 may divide the task data by checking the GPU configurations and then setting size of the data subsets.
  • the first frontend 320 on the GPU 300 is connected to the first interface 250 on the CPU 200.
  • the first frontend 320 receives the task block transferred from the first interface 250, and passes the task block to the assembler means 330.
  • the assembler means 330 receives the task block passed from the first frontend 320 and generates the subtask process readable by the plurality of parallel processing units 310 from the task block. For example, the assembler means 330 translates the task block into the subtask process in the form of GPU-readable machine code for the units 310. Then, the assembler means 330 assigns the subtask process to the plurality of parallel processing units 310, respectively.
  • the second frontend 325 on the GPU 300 is connected to the second interface 255 on the CPU 200 and receives the plurality of data subsets from the second interface 255. Then the second frontend 325 passes the plurality of data subsets to the plurality of parallel processing units 310, respectively.
  • the plurality of parallel processing units 310 performs in parallel, the subtask process assigned by the assembler means on the data subsets, respectively, to obtain parallel results.
  • the classification means 340 processes the parallel results to obtain the modular structure in the complex network.
  • interfaces 250 and 255 are shown in FIG.1 separately, they can be combined. That is, the interfaces 250 and 255 may be embodied by one component. This holds true for the first frontend 320 and the second frontend 325, and they may be embodied by one component.
  • the components in FIG.1 are shown as distributed on the CPU and the parallel processing device. However, these components may be integrated into one entity, i.e. a tool suite device according to the present invention.
  • the present invention proposes a parallel computing system based on a low-cost parallel processing device, such as a GPU, to identify the modular structures in complex networks, reduces the execution time of the computation and the cost significantly, and provides a complex network research platform for commercial and academic entities.
  • FIG.2 is a block diagram illustrating a computing system 1100 in accordance with another embodiment of the invention. Similar reference numbers are used to indicate the same or similar parts as those in FIG.1 and therefore the detailed descriptions thereof will be omitted for the purpose of clarity.
  • the computing system 1100 in FIG.2 further comprises a visualization means 260, a network data means 270 and a data storage means 280.
  • the visualization means 260 receives the modular structure obtained by the classification means 340 directly (if the classification means 340 is located in the CPU 200) or via other intermediary components (such as the frontends and the interfaces), and displays on a monitor the modular structure. Therefore, a user-friendly way of interpreting the data is provided.
  • the network data means 270 can extract a complex network data representing the nodes and the edges with corresponding values.
  • the network data means 270 can convert a real problem in a specific network, such as a biological or social network, into the complex network data.
  • FIG.3 illustrates a structure example of the complex network data.
  • an entry of the complex network data may comprise an index of a node (Node 1) and an index of a node adjacent to the Node 1 (Node 2), and a value representing the edge between the Node 1 and the adjacent Node 2.
  • FIG.3 shows only one entry of the complex network data for the purpose of explanation. However, there is no limitation on the number of the entries in the complex network data.
  • the number of the entries in the complex network data may be from 1 to C(M,2).
  • the edges have directions, e.g. the edge from Node 1 to Node 2 is different from that from Node 2 to Node 1 and therefore has different value, the number of the entries in the complex network data may be from 1 to P(M,2).
  • the data storage means 280 stores the complex network data as a part of the task data, to be read by the data reading means 210. It should be noted that although the data storage means 280 is shown located in CPU 200 in FIG.2, it may be located alternatively.
  • the data storage means 280 may be a non-volatile or volatile memory independent of the CPU 200 and the GPU 300, such as a ROM (Read-Only Memory), a RAM (Random Access Memory), a hard disk, an optical disk, a flash memory, or the like.
  • let G(V, E, C) be a complex network with vertices v ∈ V and edges e ∈ E with corresponding costs c ∈ C.
  • one node may include gene/protein names, and edges with values may refer to specific interactions therebetween, such as catalysis or binding.
  • the task data may be determined in advance or generated by the network data means 270.
  • the network data means 270 may parse protein-protein interaction networks into a numerical format G(V, E, C).
  • Betweenness is a centrality measure of an edge e within a graph. Edges that occur on many shortest paths between other nodes have higher betweenness than those that do not.
  • the data storage means 280 stores the task data for the complex network G(V, E, C), e.g. in the form of adjacency matrix for further processing.
  • FIG.4 shows an embodiment of the block storage means 230 storing the predefined set of sub-blocks.
  • some processing sub-blocks each of which indicates a particular processing are illustrated, such as a graph search sub-block 231, a shortest path sub-block 233, and a centrality measure sub-block 235.
  • other kinds of processing sub-block(s) can be incorporated into the block storage means 230, alternatively or additionally.
  • a task block may be determined by the determining means 240 by selecting/combining one or more sub-blocks from the sub-blocks stored in the block storage means 230.
  • the task block may comprise a combination of the graph search sub-block 231 and the centrality measure sub-block 235, or a combination of the shortest path sub-block 233 and the centrality measure sub-block 235.
  • the graph search sub-block 231 indicates a routine to cause the parallel processing units 310 to perform Breadth First Search (BFS).
  • inputs to the graph search sub-block 231 may be a preprocessed input matrix and source node.
  • Output of the graph search sub-block 231 is a breadth first tree from a source node s in the network.
  • FIG.5 is a flow chart of a BFS routine 500 that is applicable in the graph search sub-block 231.
  • data is input to the graph search sub-block 231, such as one of the plurality of the data subsets.
  • each node is mapped to a thread on one CUDA streaming multiprocessor (parallel processing unit).
  • the graph search sub-block 231 specifies a source node s and an initial frontier set F. Let an array F denote the frontier of the search, an array X denote the visited nodes. During each iteration, every frontier node explores its neighbor nodes and adds them to the frontier node set F.
  • the graph search sub-block 231 finds the connected nodes of F in the next level and adds them to the visited array X. In the meantime, upon the completion of searching neighbor nodes, the current node adds itself to the visited node set X.
  • the frontier node set F is updated. For example, if the current path is shorter than the existing path, then the length of the path is updated and one iteration is completed.
  • the graph search sub-block 231 may further or alternatively indicate DFS (Depth First Search), which is well-known in the art and can be applied in the present invention similarly to the BFS.
  • the centrality measure sub-block 235 has a routine to cause the parallel processing units to calculate a breadth first tree for all of the nodes.
  • FIG.6 is a flow chart of the routine 600 indicated by the centrality measure sub-block 235.
  • the centrality measure sub-block 235 may call the graph search sub-block 231 to obtain the breadth first trees for the nodes in the network.
  • the routine goes to S630 where a parallel reduction is performed on the betweenness of each edge to obtain correlation coefficients between the nodes.
  • the correlation coefficient of an edge is obtained using the formula w/b, where w denotes the weight of the edge in the complex network, and b denotes the betweenness of the edge.
  • the routines 500 and 600 are of parallel processes, and can be performed on the parallel processing units 310, respectively, to obtain parallel results therefrom, such as the correlation coefficients described above.
  • the classification means 340 uses the global correlation coefficients input from the centrality measure sub-block 235 to obtain the modular structures of the network.
  • FIG.7 is a flow chart of a routine 700 performed by the classification means 340.
  • the correlation coefficients are input from all of the plurality of parallel processing units 310, as the parallel results.
  • the classification means 340 identifies the edge with the largest correlation coefficient.
  • the edge with the largest correlation coefficient is deleted. Then at S740, it is determined whether the network with the edge deleted satisfies the condition specified by the task parameter. For example, it is determined at S740 whether all the communities remaining in the network have sizes that are smaller than the maximum size.
  • if the condition is not satisfied at S740, the routine 700 goes to S705, where the centrality measure sub-block 235 is called again to obtain correlation coefficients for the network from which that edge has been deleted. If the condition is satisfied at S740, the routine 700 terminates at S750, where a modular structure with modules satisfying the condition specified by the task parameter is obtained.
  • the shortest path sub-block 233 may have a routine for finding shortest paths between every pair of nodes in the graph (network). For example, routines for APSP (All pairs shortest path), SPSP (Single Pair Shortest Path), SSSP (Single Source Shortest Path), SDSP (Single Destination Shortest Path) or the like can be applied in the shortest path sub-block 233. These shortest path methods are all well-known in the art, therefore omitted herein.
  • the centrality measure sub-block 235 has a routine to cause the parallel processing units to build a hierarchical clustering tree.
  • the classification means 340 cuts the hierarchical clustering tree using the task parameters as constraints to obtain the modular structure.
  • A hierarchical clustering procedure produces a series of partitions of the data, Pn, Pn-1, ..., P1, where the first partition Pn consists of n single-object 'clusters' and the last partition P1 consists of a single group containing all n cases.
  • at each stage, the method joins together the two clusters which are closest together (most similar). (At the first stage, of course, this amounts to joining together the two objects that are closest together, since at the initial stage each cluster has one object.)
  • Differences between methods arise because of the different ways of defining distance (or similarity) between clusters. In this embodiment, the distance is calculated using a matrix-multiplication-based all-pairs shortest path algorithm.
  • the invention develops a GPU-based parallel computing system to identify the modular structures in complex networks, reduces the execution time of computation and the cost significantly, and provides a complex network research platform for commercial and academic entities.
  • FIG.8 shows a flow chart of a method 2000 of identifying a modular structure in a complex network using a computing system with a CPU and a parallel processing device (e.g. a GPU) in accordance with an embodiment of the present invention.
  • the method 2000 may be performed by the respective components in the computing system 1000 in FIG.1.
  • task data which includes nodes in the complex network, edges with values indicating relationships among the nodes, and task parameters for a task to be performed on the complex network, may be read by the data reading means 210 on the CPU 200.
  • the determining means 240 on the CPU 200 determines a task block from a predefined set of sub-blocks each of which indicates a particular process, according to the task data, and transferring the task block to the first interface 250 on the CPU 200.
  • the task block is used for assigning subtask process to be performed in the plurality of parallel processing units 310-1, 310-2, ..., 310-N on the parallel processing device 300, respectively, where N is an integer.
  • the task block is transferred by the first interface 250 to a first frontend 320 on the parallel processing device 300. Then at S2240, the first frontend 320 passes the task block to an assembler means 330 on the parallel processing device 300.
  • the assembler means 330 generates the subtask process readable by the plurality of parallel processing units 310 from the task block and assigns the subtask process to the plurality of parallel processing units 310.
  • the dispatcher means 220 may divide the task data into a plurality of data subsets with respect to the plurality of parallel processing units 310.
  • a second interface 255 on the CPU 200 transfers the plurality of data subsets to a second frontend 325 on the parallel processing device. Then at S2340, the second frontend 325 passes the plurality of data subsets to the plurality of parallel processing units, respectively.
  • the plurality of parallel processing units 310-1, 310-2, ..., 310-N perform, in parallel, the subtask process assigned by the assembler means on the data subsets, respectively, to obtain parallel results.
  • the classification means 340 (located in the CPU 200 or GPU 300) processes the parallel results to obtain the modular structure in the complex network.
  • the method of identifying modular structure in a complex network according to the present invention may incorporate one or more aspects described above with reference to FIGs.1-7.
  • the method may further comprise extracting a complex network data representing the nodes and the edges with values and storing the complex network data as a part of the task data.
  • the task block may comprise a combination of a graph search sub-block and a centrality measure sub-block, or a combination of a shortest path sub-block and a centrality measure sub-block.
  • the routines described in FIGs.5-7 can also be applied in the method 2000.
  • steps in method 2000 as shown in FIG.8 do not have to be performed in the order as shown.
  • steps S2200-S2260 may be performed after or at the same time as S2300-S2340.
  • a computer system with an associated computer-readable medium containing instructions for controlling the computer system can be utilized to implement the exemplary embodiments that are disclosed herein.
  • the computer system may include at least one computer such as a microprocessor, digital signal processor, and associated peripheral electronic circuitry.

Landscapes

  • Business, Economics & Management (AREA)
  • Engineering & Computer Science (AREA)
  • Strategic Management (AREA)
  • Development Economics (AREA)
  • Economics (AREA)
  • Theoretical Computer Science (AREA)
  • Game Theory and Decision Science (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Marketing (AREA)
  • Finance (AREA)
  • Accounting & Taxation (AREA)
  • Human Resources & Organizations (AREA)
  • Physics & Mathematics (AREA)
  • General Business, Economics & Management (AREA)
  • General Physics & Mathematics (AREA)
  • Tourism & Hospitality (AREA)
  • Quality & Reliability (AREA)
  • Operations Research (AREA)
  • Multi Processors (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

Disclosed is a tool suite device for identifying a modular structure in a complex network using a computing system with a CPU and a parallel processing device. The tool suite device comprises a data reading means for reading task data; a block storage means for storing a predefined set of sub-blocks each of which indicates a particular process; a determining means for determining a task block for assigning subtask process to be performed in a plurality of parallel processing units on the parallel processing device, respectively, from the predefined set of sub-blocks stored in the block storage means; a first interface for receiving the task block transferred from the determining means; a dispatcher means for dividing the task data into data subsets; a second interface for receiving the task data from the dispatcher means; a first frontend for receiving the task block from the first interface; an assembler means for receiving the task block passed from the first frontend, generating the subtask process readable by the parallel processing units from the task block and assigning the subtask process to the parallel processing units; a second frontend for receiving the data subsets from the second interface and passing data subsets to the parallel processing units; the parallel processing units for performing in parallel, the subtask process on the data subsets, respectively, to obtain parallel results; and a classification means for processing the parallel results to obtain the modular structure in the complex network.

Description

METHOD AND TOOL SUITE DEVICE FOR IDENTIFYING
MODULAR STRUCTURE IN COMPLEX NETWORK
FIELD OF THE INVENTION [0001] The present invention relates generally to modular structure identification, and in particular, to a method and a tool suite device for identifying modular structure in a complex network, and a computing system.
BACKGROUND OF THE INVENTION
[0002] A complex network is a network with non-trivial topological features. The study of complex networks is inspired by empirical study of real networks such as networks in biotechnology (cell networks, protein-protein interaction networks, neuro-networks), Internet/WWW (World Wide Web) networks, social networks, etc. One of the most pervasive features in such real networks is the existence of modular structure, or clustering, i.e., if a graph is used to represent a complex network, organization of vertices in the clusters, with many edges within the same cluster and relatively few edges connecting vertices from different clusters. Identifying modular structure in a complex network is of great importance for understanding the real problems the graph represents, e.g., tracking online viruses, community behavior analysis in, for example, social network services, detecting important gene functions, etc. [0003] There have been some methods in the related art of identifying the modular structure in a complex network which are based on sequential computing devices, namely, CPUs (Central Processing Units). Traditional methods, such as hierarchical clustering, partitioning clustering and spectral clustering, divisive methods, e.g., the Girvan-Newman algorithm, and modularity-based greedy algorithms, take hours to complete the computation due to the massive scale of the complex networks. For instance, in the field of social networks, Facebook announced 400 million users in February 2010. Hence, a corresponding graph has millions of edges and vertices. As a result, the existing real complex networks are characterized by a massive amount of data and high computational complexity in both time and storage space. Detecting modular structures in the complex networks using CPUs therefore suffers from long execution time, poor user interaction and energy inefficiency. On the other hand, supercomputer workstations or high-performance computing clusters, though capable of completing the computation in a short time, are expensive, developer-unfriendly, and raise an entry barrier for common research and business entities.
[0004] Thus, there is a need for a method and a tool suite device for identifying modular structure in a complex network, and a computing system using a computing system with a CPU and a parallel processing device such as GPU (Graphic Processing Unit), or processing units distributed on a network, which are capable of completing the computation in a short time while saving in cost.
SUMMARY OF THE INVENTION
[0005] The present invention provides a tool suite device, a method and a system for identifying a modular structure in a complex network that are capable of completing the computation in a short time while saving in cost.
[0006] According to one aspect of the present invention, a tool suite device for identifying a modular structure in a complex network using a computing system with a CPU and a parallel processing device, the tool suite device comprising a data reading means on the CPU for reading task data which includes nodes in the complex network, edges with values indicating relationships among the nodes, and task parameter for a task to be performed on the complex network; a block storage means on the CPU for storing a predefined set of sub-blocks each of which indicates a particular process; a determining means on the CPU for determining a task block for assigning subtask process to be performed in a plurality of parallel processing units on the parallel processing device, respectively, from the predefined set of sub-blocks stored in the block storage means according to the task data; a first interface on the CPU for receiving the task block transferred from the determining means; a dispatcher means on the CPU for dividing the task data into a plurality of data subsets with respect to the plurality of parallel processing units; a second interface on the CPU for receiving the task data transferred from the dispatcher means; a first frontend on the parallel processing device connected to the first interface for receiving the task block transferred from the first interface; an assembler means on the parallel processing device for receiving the task block passed from the first frontend, generating the subtask process readable by the plurality of parallel processing units from the task block and assigning the subtask process to the plurality of parallel processing units; a second frontend on the parallel processing device connected to the second interface for receiving the plurality of data subsets from the second interface and passing the plurality of data subsets to the plurality of parallel processing units, respectively; the plurality of parallel processing units for performing in parallel, the subtask process assigned by the assembler means on the data subsets, respectively, to obtain parallel results; and a classification means for processing the parallel results to obtain the modular structure in the complex network.
[0007] According to another aspect of the present invention, a method of identifying a modular structure in a complex network using a computing system with a CPU and a parallel processing device, the method comprising reading, by a data reading means on the CPU, task data which includes nodes in the complex network, edges with values indicating relationships among the nodes, and task parameter for a task to be performed on the complex network; determining, by a determining means on the CPU, a task block for assigning subtask process to be performed in a plurality of parallel processing units on the parallel processing device, respectively, from a predefined set of sub-blocks each of which indicates a particular process, according to the task data, and transferring the task block to a first interface on the CPU; transferring the task block, by the first interface, to a first frontend on the parallel processing device; passing, by the first frontend, the task block to an assembler means on the parallel processing device; generating, by the assembler means, the subtask process readable by the plurality of parallel processing units from the task block and assigning the subtask process to the plurality of parallel processing units; dividing, by a dispatcher means on the CPU, the task data into a plurality of data subsets with respect to the plurality of parallel processing units; transferring, by a second interface on the CPU, the plurality of data subsets to a second frontend on the parallel processing device; passing, by the second frontend, the plurality of data subsets to the plurality of parallel processing units, respectively; performing in parallel, by the plurality of parallel processing units, the subtask process assigned by the assembler means on the data subsets, respectively, to obtain parallel results; and processing, by a classification means, the parallel results to obtain the modular structure in the complex network.
[0008] According to another aspect of the present invention, A system for identifying modular structure in a complex network comprising a CPU and a parallel processing device. The CPU includes a data reading means for reading task data which includes nodes in the complex network, edges with values indicating relationships among the nodes, and task parameter for a task to be performed on the complex network; a block storage means for storing a predefined set of sub-blocks each of which indicates a particular process; a determining means for determining a task block for assigning subtask process to be performed in a plurality of parallel processing units on the parallel processing device, respectively, from the predefined set of sub-blocks stored in the block storage means according to the task data; a first interface for receiving the task block transferred from the determining means; a dispatcher means for dividing the task data into a plurality of data subsets with respect to the plurality of parallel processing units; and a second interface for receiving the task data transferred from the dispatcher means. The parallel processing device includes: a first frontend on the parallel processing device connected to the first interface for receiving the task block transferred from the first interface; an assembler means on the parallel processing device for receiving the task block passed from the first frontend, generating the subtask process readable by the plurality of parallel processing units from the task block and assigning the subtask process to the plurality of parallel processing units; a second frontend on the parallel processing device connected to the second interface for receiving the plurality of data subsets from the second interface and passing the plurality of data subsets to the plurality of parallel processing units, respectively; the plurality of parallel processing units for performing in parallel, the subtask process assigned by the assembler means on the data subsets, respectively, to obtain parallel results; and a classification means for processing the parallel results to obtain the modular structure in the complex network.
BRIEF DESCRIPTION OF THE DRAWINGS
[0009] The foregoing summary of the invention, as well as the following detailed description of exemplary embodiments of the invention, is better understood when read in conjunction with the accompanying drawings, which are included by way of example, and not by way of limitation with regard to the claimed invention.
[0010] FIG.1 is a block diagram illustrating a computing system in accordance with an embodiment of the invention.
[0011] FIG.2 is a block diagram illustrating a computing system in accordance with another embodiment of the invention.
[0012] FIG.3 illustrates a structure example of the complex network data. [0013] FIG.4 shows an embodiment of the block storage means storing the predefined set of sub-blocks.
[0014] FIG.5 is a flow chart of a BFS routine that is applicable in the graph search sub-block.
[0015] FIG.6 is a flow chart of the routine indicated by the centrality measure sub-block.
[0016] FIG.7 is a flow chart of a routine performed by the classification means.
[0017] FIG.8 shows a flow chart of a method of identifying a modular structure in a complex network in accordance with another embodiment of the invention.
DETAILED DESCRIPTION OF PREFERRED EMBODIMENTS
[0018] Reference will now be made in detail to embodiments, examples of which are illustrated in the accompanying drawings. In the following detailed description, numerous specific details are set forth in order to provide a thorough understanding of the present invention.
[0019] FIG.1 is a block diagram illustrating a computing system 1000 in accordance with an embodiment of the invention. The computing system 1000 includes a CPU 200 and a GPU 300 as an example of a parallel processing device. It should be noted that although only one CPU 200 and one GPU 300 are illustrated in FIG.1, the numbers of the CPUs and GPUs contained in the computing system 1000 are not limited to only one and may be altered as necessary. Also it is to be understood that although GPU 300 is shown in FIG.1 as the parallel processing device, the specific product of the parallel processing device may be changed in different cases. For example, the parallel processing device may be a plurality of processing units distributed on a network that can perform parallel processing and communicate data or information with each other, such as a LAN (Local Area Network) or a WAN (Wide Area Network).
[0020] For example, the GPU 300 may use the general-purpose graphics processing unit (GPGPU) platform Compute Unified Device Architecture (CUDA) developed by nVidia. However, other commercially available GPUs can be applied within the spirit and scope of the present invention.
[0021] Referring to FIG.1, the CPU 200 includes a data reading means 210, a dispatcher means 220, a block storage means 230, a determining means 240, a first interface 250 and a second interface 255. The GPU 300 comprises a plurality of parallel processing units 310-1, 310-2, ..., 310-N, where N is an integer. For clarity of description, the plurality of parallel processing units 310-1, 310-2, ..., 310-N will be collectively referred to as the parallel processing units 310 hereinafter as appropriate.
[0022] The GPU 300 further comprises a first frontend 320, a second frontend 325, an assembler means 330, and a classification means 340. However, it should be noted that although the classification means 340 is shown as included in the GPU 300, it can be located alternatively. For example, the classification means 340 may be located in the CPU 200, and in such case, the present invention is also applicable.
[0023] In particular, the data reading means 210 reads task data which includes nodes in a complex network, edges with values indicating relationships among the nodes, and task parameter for a task to be performed on the complex network.
[0024] The modular structure to be determined by the computing system 1000 may be comprised of a plurality of communities each of which contains the nodes that have closer relationships. For example, in a protein-protein interaction network, the interactions between proteins are important for almost all biological functions. For example, signals from the exterior of a cell are mediated to the inside of that cell by protein-protein interactions of the signaling molecules. This process, called signal transduction, plays a fundamental role in many biological processes and in many diseases e.g. cancers. It is common practice to visualize protein interactions in a network representation where characterizing the embedded modular structure is of great importance. In such case, the nodes in the task data can be various proteins, respectively. The relationships among the nodes indicate interactions among the proteins. The task parameters describe the task for determining the modular structure of the proteins, and each community in the modular structure contains the proteins that tend to interact among them.
[0025] As another example, in a social network, the nodes can represent different members in the social network, and the edges with corresponding values represent specific relationships among the nodes. For instance, in the social network of a company, the nodes refer to respective staff members in the company. The relationships, denoted by edges with corresponding values, may represent spatial distances between work locations of two arbitrary staff members. In such case, the staff members having closer distances can be deemed as belonging to the same department. Therefore, the modular structure obtained finally contains the staff members in the same department. [0026] The task parameter may be used to control the modular structure obtained finally. For example, the task parameter may specify a condition that should be met by the final modular structure, such as a maximum size of the modules in the modular structure or a minimum number of modules in the modular structure.
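As a purely illustrative, non-limiting sketch of how such task data and task parameters might be represented on the CPU side (the names TaskData, TaskParameter and max_module_size below are assumptions made for illustration, not terms used by the claims):

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

@dataclass
class TaskParameter:
    """Hypothetical task parameter: a condition the final modular structure must meet."""
    max_module_size: int = 50   # e.g. maximum size of a module
    min_module_count: int = 2   # e.g. minimum number of modules

@dataclass
class TaskData:
    """Hypothetical task data: nodes, weighted edges, and the task parameter."""
    nodes: List[str]                     # e.g. proteins or staff members
    edges: Dict[Tuple[str, str], float]  # (node 1, node 2) -> edge value
    parameter: TaskParameter = field(default_factory=TaskParameter)

# Example: a tiny protein-interaction-like network.
task = TaskData(
    nodes=["P1", "P2", "P3", "P4"],
    edges={("P1", "P2"): 1.0, ("P2", "P3"): 0.8, ("P3", "P4"): 1.2},
)
```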
[0027] The block storage means 230 stores a predefined set of sub-blocks each of which indicates a particular process. The predefined sub-blocks will be described in detail later.
[0028] The determining means 240 determines a task block from the predefined set of sub-blocks stored in the block storage means 230 according to the task data. Then the determining means 240 transfers the task block to the first interface 250. The task block is used to assign the subtask processes to be performed in the plurality of parallel processing units 310, respectively. The determination of the task block may be customized to specific tasks and will be exemplified below.
[0029] The dispatcher means 220 divides the task data into a plurality of data subsets with respect to the plurality of parallel processing units 310 on the parallel processing device (GPU) 300. The dispatcher means 220 may divide the task data by checking the GPU configurations and then setting size of the data subsets.
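For illustration only, the division performed by the dispatcher means 220 can be pictured as the following round-robin split of an edge list into one data subset per parallel processing unit; an actual dispatcher would derive num_units and the subset sizes from the GPU configuration rather than hard-coding them, and all names here are assumptions:

```python
from typing import List, Sequence, Tuple

Edge = Tuple[str, str, float]  # (node 1, node 2, edge value)

def dispatch(edges: Sequence[Edge], num_units: int) -> List[List[Edge]]:
    """Divide the task data into one data subset per parallel processing unit."""
    subsets: List[List[Edge]] = [[] for _ in range(num_units)]
    for i, edge in enumerate(edges):
        subsets[i % num_units].append(edge)  # simple round-robin assignment
    return subsets

edges = [("P1", "P2", 1.0), ("P2", "P3", 0.8), ("P3", "P4", 1.2), ("P1", "P4", 0.5)]
print(dispatch(edges, num_units=2))
```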
[0030] As shown in FIG.1, the first frontend 320 on the GPU 300 is connected to the first interface 250 on the CPU 200. The first frontend 320 receives the task block transferred from the first interface 250, and passes the task block to the assembler means 330.
[0031] The assembler means 330 receives the task block passed from the first frontend 320 and generates the subtask process readable by the plurality of parallel processing units 310 from the task block. For example, the assembler means 330 translates the task block into the subtask process in the form of GPU-readable machine code for the units 310. Then, the assembler means 330 assigns the subtask process to the plurality of parallel processing units 310, respectively.
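The assembler means 330 itself emits GPU-readable machine code, which cannot be reproduced here; as a loose CPU-side analogue (an assumption for illustration only), a task block listing sub-block identifiers could be resolved into an ordered subtask process of callables:

```python
from typing import Callable, Dict, List

# Hypothetical stand-ins for the predefined sub-blocks; a real assembler means
# would emit GPU machine code rather than Python callables.
def graph_search(data_subset):       return ("breadth first tree", data_subset)
def centrality_measure(data_subset): return ("edge betweenness", data_subset)
def shortest_path(data_subset):      return ("all-pairs shortest paths", data_subset)

SUB_BLOCKS: Dict[str, Callable] = {
    "graph_search": graph_search,
    "centrality_measure": centrality_measure,
    "shortest_path": shortest_path,
}

def assemble(task_block: List[str]) -> List[Callable]:
    """Resolve a task block (ordered sub-block names) into a subtask process."""
    return [SUB_BLOCKS[name] for name in task_block]

subtask_process = assemble(["graph_search", "centrality_measure"])
```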
[0032] The second frontend 325 on the GPU 300 is connected to the second interface 255 on the CPU 200 and receives the plurality of data subsets from the second interface 255. Then the second frontend 325 passes the plurality of data subsets to the plurality of parallel processing units 310, respectively.
[0033] The plurality of parallel processing units 310 performs in parallel, the subtask process assigned by the assembler means on the data subsets, respectively, to obtain parallel results. The classification means 340 processes the parallel results to obtain the modular structure in the complex network.
[0034] It should be noted that although two interfaces 250 and 255 are shown in FIG.1 separately, they can be combined. That is, the interfaces 250 and 255 may be embodied by one component. This holds true for the first frontend 320 and the second frontend 325, and they may be embodied by one component.
[0035] On the other hand, the components in FIG.1 are shown as distributed on the CPU and the parallel processing device. However, these components may be integrated into one entity, i.e. a tool suite device according to the present invention.
[0036] Therefore, the present invention proposes a parallel computing system based on a low-cost parallel processing device, such as a GPU, to identify the modular structures in complex networks, reduces the execution time of the computation and the cost significantly, and provides a complex network research platform for commercial and academic entities.
[0037] FIG.2 is a block diagram illustrating a computing system 1100 in accordance with another embodiment of the invention. Similar reference numbers are used to indicate the same or similar parts as those in FIG.1 and therefore the detailed descriptions thereof will be omitted for the purpose of clarity.
[0038] In addition to those components of the computing system 1000 as shown in FIG.l, the computing system 1100 in FIG.2 further comprises a visualization means 260, a network data means 270 and a data storage means 280.
[0039] Particularly, in the case of FIG.2, the visualization means 260 receives the modular structure obtained by the classification means 340 directly (if the classification means 340 is located in the CPU 200) or via other intermediary components (such as the frontends and the interfaces), and displays on a monitor the modular structure. Therefore, a user-friendly way of interpreting the data is provided.
[0040] The network data means 270 can extract a complex network data representing the nodes and the edges with corresponding values. For example, the network data means 270 can convert a real problem in a specific network, such as a biological or social network, into the complex network data. FIG.3 illustrates a structure example of the complex network data. As shown in FIG.3, an entry of the complex network data may comprise an index of a node (Node 1), an index of a node adjacent to the Node 1 (Node 2), and a value representing the edge between the Node 1 and the adjacent Node 2. Note that FIG.3 shows only one entry of the complex network data for the purpose of explanation. However, there is no limitation on the number of the entries in the complex network data. For example, if there are M nodes in the complex network and the edges among the nodes have no direction (i.e. the edge from Node 1 to Node 2 is equivalent to that from Node 2 to Node 1), then the number of the entries in the complex network data may be from 1 to C(M,2). In the case that the edges have directions, e.g. the edge from Node 1 to Node 2 is different from that from Node 2 to Node 1 and therefore may have a different value, the number of the entries in the complex network data may be from 1 to P(M,2).
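To make the bounds concrete, C(M,2) = M(M-1)/2 and P(M,2) = M(M-1); a short, purely illustrative check:

```python
def max_undirected_entries(m: int) -> int:
    """C(M,2): one entry per unordered pair of nodes (undirected edges)."""
    return m * (m - 1) // 2

def max_directed_entries(m: int) -> int:
    """P(M,2): one entry per ordered pair of nodes (directed edges)."""
    return m * (m - 1)

print(max_undirected_entries(5), max_directed_entries(5))  # 10 20
```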
[0041] Returning to FIG.2, the data storage means 280 stores the complex network data as a part of the task data, to be read by the data reading means 210. It should be noted that although the data storage means 280 is shown located in CPU 200 in FIG.2, it may be located alternatively. For example, the data storage means 280 may be a non-volatile or volatile memory independent of the CPU 200 and the GPU 300, such as a ROM (Read-Only Memory), a RAM (Random Access Memory), a hard disk, an optical disk, a flash memory, or the like.
[0042] Let G(V, E, C) be a complex network with vertices v ∈ V and edges e ∈ E with corresponding costs c ∈ C. In the case of real complex networks such as biological networks, depending on the particular network representation, one node may include gene/protein names, and edges with values may refer to specific interactions therebetween, such as catalysis or binding. For example, the task data may be determined in advance or generated by the network data means 270. The network data means 270 may parse protein-protein interaction networks into a numerical format G(V, E, C).
[0043] Define the betweenness of an edge e as the number of all-pairs shortest paths that pass through the edge e. Betweenness is a centrality measure of an edge e within a graph. Edges that occur on many shortest paths between other nodes have higher betweenness than those that do not.
[0044] The data storage means 280 stores the task data for the complex network G(V, E, C), e.g. in the form of an adjacency matrix, for further processing.
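As a minimal sketch (the patent does not prescribe a concrete parser, so the helper below is an assumption), entries of the form (Node 1, Node 2, cost) for G(V, E, C) could be turned into the adjacency matrix stored by the data storage means 280 as follows:

```python
from typing import Dict, List, Tuple

def build_adjacency_matrix(entries: List[Tuple[str, str, float]],
                           directed: bool = False):
    """Convert (node 1, node 2, cost) entries of G(V, E, C) into an adjacency matrix."""
    nodes = sorted({n for n1, n2, _ in entries for n in (n1, n2)})
    index: Dict[str, int] = {n: i for i, n in enumerate(nodes)}
    matrix = [[0.0] * len(nodes) for _ in nodes]
    for n1, n2, cost in entries:
        matrix[index[n1]][index[n2]] = cost
        if not directed:
            matrix[index[n2]][index[n1]] = cost  # mirror undirected edges
    return nodes, matrix

# Example: two protein-protein interactions with their costs.
nodes, adj = build_adjacency_matrix([("geneA", "geneB", 1.0), ("geneB", "geneC", 2.0)])
```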
[0045] FIG.4 shows an embodiment of the block storage means 230 storing the predefined set of sub-blocks. In the block storage means 230, several processing sub-blocks, each of which indicates a particular process, are illustrated, such as a graph search sub-block 231, a shortest path sub-block 233, and a centrality measure sub-block 235. However, other kinds of processing sub-block(s) can be incorporated into the block storage means 230, alternatively or additionally.
[0046] As described above, a task block may be determined by the determining means 240 by selecting/combining one or more sub-blocks from the sub-blocks stored in the block storage means 230. For example, the task block may comprise a combination of the graph search sub-block 231 and the centrality measure sub-block 235, or a combination of the shortest path sub-block 233 and the centrality measure sub-block 235.
[0047] The graph search sub-block 231 indicates a routine to cause the parallel processing units 310 to perform Breadth First Search (BFS). In such case, inputs to the graph search sub-block 231 may be a preprocessed input matrix and source node. Output of the graph search sub-block 231 is a breadth first tree from a source node s in the network. For example, FIG.5 is a flow chart of a BFS routine 500 that is applicable in the graph search sub-block 231.
[0048] At S520, data is input to the graph search sub-block 231, such as one of the plurality of the data subsets.
[0049] At S530, each node is mapped to a thread on one CUDA streaming multiprocessor (parallel processing unit). At S540, based on the topological features, the graph search sub-block 231 specifies a source node s and an initial frontier set F. Let an array F denote the frontier of the search and an array X denote the visited nodes. During each iteration, every frontier node explores its neighbor nodes and adds them to the frontier node set F. At S550, the graph search sub-block 231 finds the connected nodes of F in the next level and adds them to the visited array X. In the meantime, upon the completion of searching neighbor nodes, the current node adds itself to the visited node set X. At S560, the frontier node set F is updated. For example, if the current path is shorter than the existing path, then the length of the path is updated and one iteration is completed.
[0050] Then, at S570, it is determined whether all the nodes have been discovered. If there are nodes still to be discovered, the routine 500 returns to S550 and starts another iteration. If the frontier set F is empty and there is no node left to be discovered, the routine of the graph search sub-block 231 terminates at S580, where a breadth first tree is generated. The breadth first tree contains information on the betweenness of its edges.
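By way of a non-limiting illustration, one level-synchronous iteration of the BFS routine 500 may be implemented as the CUDA kernel sketched below. The sketch assumes a CSR (compressed sparse row) adjacency layout; the names (bfs_level_kernel, row_offsets, col_indices and so on) are illustrative assumptions and do not correspond to the actual implementation of the embodiment.

// One level-synchronous BFS step (S550-S560): each thread owns one node; a node
// on the current frontier F visits its neighbors, adds undiscovered neighbors to
// the next frontier, and adds itself to the visited set X.
__global__ void bfs_level_kernel(const int *row_offsets,    // CSR row pointers, size n+1
                                 const int *col_indices,    // CSR column indices
                                 int *level,                // BFS depth per node, -1 if undiscovered
                                 bool *frontier,            // F: nodes discovered in the previous step
                                 bool *next_frontier,       // frontier for the next step
                                 bool *visited,             // X: nodes already explored
                                 int n, int current_level, bool *done)
{
    int v = blockIdx.x * blockDim.x + threadIdx.x;
    if (v >= n || !frontier[v]) return;

    frontier[v] = false;            // leave the current frontier
    visited[v]  = true;             // the current node adds itself to X
    for (int i = row_offsets[v]; i < row_offsets[v + 1]; ++i) {
        int u = col_indices[i];
        if (level[u] == -1) {       // undiscovered neighbor in the next level
            level[u] = current_level + 1;
            next_frontier[u] = true;
            *done = false;          // at least one node is still to be discovered
        }
    }
}

On the host side, the done flag is set to true before each launch; the frontier and next_frontier buffers are swapped after each launch, and the loop stops once the flag remains true (S570), at which point the level array encodes the breadth first tree of S580. Concurrent writes to level[u] by different threads all store the same value in this scheme and are therefore benign.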
[0051] The graph search sub-block 231 may further or alternatively indicate DFS (Depth First Search), which is well known in the art and can be applied in the present invention similarly to BFS.

[0052] When the task block comprises a combination of the graph search sub-block 231 and the centrality measure sub-block 235, serially in this order, the centrality measure sub-block 235 has a routine to cause the parallel processing units to calculate a breadth first tree for each of the nodes. FIG.6 is a flow chart of the routine 600 indicated by the centrality measure sub-block 235.
[0053] At S610 of the routine 600, the centrality measure sub-block 235 may call the graph search sub-block 231 to obtain the breadth first trees for the nodes in the network. At S620, it is determined whether the breadth first trees for all of the nodes as sources have been found. If it is determined that there is still a breadth first tree to be found for a certain node, the routine 600 returns to S610 to obtain the tree with that node as the source.
[0054] If it is determined at S620 that all the breadth first trees have been found, the routine goes to S630, where a parallel reduction is performed on the betweenness of each edge to obtain correlation coefficients between the nodes. The correlation coefficient of an edge is obtained using the formula w/b, where w denotes the weight of the edge from the complex network, and b denotes the betweenness of the edge.
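Purely as an illustrative sketch, the reduction of S630 may be written as the CUDA kernel below, which assumes that the per-source betweenness contributions produced by the routines above have been stored in a dense array of size num_sources x num_edges. The names (edge_correlation_kernel, betweenness_per_source and so on) are illustrative assumptions, and the reduction is written as a per-edge loop over sources rather than a tree-shaped reduction purely for brevity.

// S630: accumulate the betweenness b of each edge over all source nodes and
// form the correlation coefficient w / b for that edge.
__global__ void edge_correlation_kernel(const float *betweenness_per_source, // [num_sources * num_edges]
                                        const float *edge_weights,           // w: one weight per edge
                                        float *corr_coeff,                   // output: w / b per edge
                                        int num_sources, int num_edges)
{
    int e = blockIdx.x * blockDim.x + threadIdx.x;
    if (e >= num_edges) return;

    float b = 0.0f;                                        // total betweenness of edge e
    for (int s = 0; s < num_sources; ++s)
        b += betweenness_per_source[s * num_edges + e];

    // Edges lying on no shortest path (b == 0) are given a coefficient of zero
    // here; their handling is not specified in the embodiment.
    corr_coeff[e] = (b > 0.0f) ? edge_weights[e] / b : 0.0f;
}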
[0055] The routines 500 and 600 are parallel processes and can be performed on the parallel processing units 310, respectively, to obtain parallel results therefrom, such as the correlation coefficients described above.
[0056] In such case, the classification means 340 uses the global correlation coefficients input from the centrality measure sub-block 235 to obtain the modular structures of the network. FIG.7 is a flow chart of a routine 700 performed by the classification means 340.
[0057] At S710, the correlation coefficients are input from all of the plurality of parallel processing units 310, as the parallel results. At S720, the classification means 340 identifies the edge with the largest correlation coefficient.
[0058] At S730, the edge with the largest correlation coefficient is deleted. Then, at S740, it is determined whether the network with this edge deleted satisfies the condition specified by the task parameter. For example, it is determined at S740 whether all the communities remaining in the network have sizes smaller than the maximum size.
[0059] If the condition is not satisfied at S740, the routine 700 goes to S705, where the centrality measure sub-block 235 is called again to obtain correlation coefficients for the network with the edge deleted.

[0060] If the condition is satisfied at S740, the routine 700 terminates at S750, where a modular structure with modules satisfying the condition specified by the task parameter is obtained.
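By way of a non-limiting illustration, the control flow of the routine 700 may be sketched on the CPU side as follows. The helper functions compute_correlation_coefficients and community_sizes, as well as the Edge and Graph types, are hypothetical placeholders for the GPU routines 500/600 and for a connected-component pass over the current network.

#include <vector>
#include <algorithm>
#include <iterator>

struct Edge  { int u, v; double weight; };
struct Graph { int num_nodes; std::vector<Edge> edges; };

// Placeholders for the parallel routines described above (declarations only).
std::vector<double> compute_correlation_coefficients(const Graph &g); // w / b per edge (S705/S710)
std::vector<int>    community_sizes(const Graph &g);                  // size of each remaining community

void routine_700(Graph &g, int max_community_size)
{
    while (!g.edges.empty()) {
        std::vector<double> cc = compute_correlation_coefficients(g);  // S710 (or S705 on re-entry)

        // S720/S730: identify and delete the edge with the largest coefficient.
        auto best = std::distance(cc.begin(), std::max_element(cc.begin(), cc.end()));
        g.edges.erase(g.edges.begin() + best);

        // S740: check the condition specified by the task parameter.
        std::vector<int> sizes = community_sizes(g);
        bool satisfied = std::all_of(sizes.begin(), sizes.end(),
                                     [&](int s) { return s < max_community_size; });
        if (satisfied) break;                                          // S750: modular structure obtained
    }
}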
[0061] On the other hand, the shortest path sub-block 233 may have a routine for finding shortest paths between pairs of nodes in the graph (network). For example, routines for APSP (All Pairs Shortest Path), SPSP (Single Pair Shortest Path), SSSP (Single Source Shortest Path), SDSP (Single Destination Shortest Path) or the like can be applied in the shortest path sub-block 233. These shortest path methods are all well known in the art and are therefore not described in detail herein.
[0062] When the task block comprises a combination of the shortest path sub-block 233 and the centrality measure sub-block 235 serially in this order, the centrality measure sub-block 235 has a routine to cause the parallel processing units to build a hierarchical clustering tree.
[0063] In such a case, the classification means 340 cuts the hierarchical clustering tree using the task parameters as constraints to obtain the modular structure.
[0064] Thus, a modular structure with modules satisfying the condition specified in the task parameter is obtained.
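For illustration only, one simple way of cutting the hierarchical clustering tree under a maximum-community-size constraint is sketched below. The Cluster type, the max_size parameter and the recursive strategy are illustrative assumptions; the embodiment does not prescribe a particular cutting procedure beyond using the task parameters as constraints.

#include <vector>
#include <cstddef>

// A node of the hierarchical clustering tree: a leaf holds one network node,
// an internal node holds the union of its two merged sub-clusters.
struct Cluster {
    std::vector<int> members;                     // network nodes contained in this cluster
    Cluster *left = nullptr, *right = nullptr;
};

// Emit a cluster as a module if it satisfies the size constraint,
// otherwise split it further by descending into its children.
void cut_tree(const Cluster *c, std::size_t max_size,
              std::vector<std::vector<int>> &modules)
{
    if (c == nullptr) return;
    bool is_leaf = (c->left == nullptr && c->right == nullptr);
    if (c->members.size() <= max_size || is_leaf) {
        modules.push_back(c->members);            // this cluster becomes one module
        return;
    }
    cut_tree(c->left,  max_size, modules);
    cut_tree(c->right, max_size, modules);
}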
[0065] In traditional hierarchical clustering algorithms, a typical procedure is as follows. A hierarchical clustering procedure produces a series of partitions of the data, Pn, Pn-1, ..., P1. The first partition Pn consists of n single-object 'clusters', and the last partition P1 consists of a single group containing all n cases. At each stage the method joins together the two clusters that are closest together (most similar). (At the first stage, of course, this amounts to joining together the two objects that are closest together, since at the initial stage each cluster has one object.) Differences between methods arise because of the different ways of defining the distance (or similarity) between clusters. In this embodiment, the distance is calculated using a matrix multiplication-based all-pairs shortest path algorithm.
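As a non-limiting illustration, the distance computation mentioned above amounts to repeatedly squaring the weighted adjacency matrix W in the min-plus (tropical) semiring until the all-pairs distance matrix D stops changing, which takes on the order of log2(n) iterations:

\[
D^{(0)} = W, \qquad D^{(k+1)}[i][j] \;=\; \min_{1 \le l \le n} \left( D^{(k)}[i][l] + D^{(k)}[l][j] \right).
\]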
[0066] Our implementation modifies the matrix multiplication routine given by Volkov and Demmel by replacing the multiplication and addition operations with addition and minimum operations. Shared memory is used as a user-managed cache to improve performance. Volkov and Demmel bring sections of matrices R, C and Di into shared memory in blocks: R is brought in 64x4 blocks, C in 16x16 blocks and Di in 64x16 blocks. These values are selected to maximize throughput of the CUDA device. During execution, each thread block computes a 64x16 block of Di. Algorithm 1 describes the modified matrix multiplication kernel. Please see http://www.eecs.berkeley.edu/Pubs/TechRpts/2008/EECS-2008-49.html for full details on the matrix multiplication kernel.
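By way of a non-limiting illustration, a deliberately simplified, shared-memory-tiled min-plus kernel is sketched below. It shows only the substitution of multiplication/addition by addition/minimum; it does not reproduce the 64x4 / 16x16 / 64x16 blocking of the Volkov-Demmel kernel referenced above, and the tile size and names are illustrative assumptions.

#define TILE 16
#define INF  1e30f

// One min-plus "multiplication": D_out[i][j] = min_k ( D[i][k] + D[k][j] ).
// The multiply-add of an ordinary matrix product is replaced by add-minimum.
__global__ void minplus_kernel(const float *D, float *D_out, int n)
{
    __shared__ float As[TILE][TILE];      // tile of rows of D
    __shared__ float Bs[TILE][TILE];      // tile of columns of D

    int row = blockIdx.y * TILE + threadIdx.y;
    int col = blockIdx.x * TILE + threadIdx.x;
    float best = INF;

    for (int t = 0; t < (n + TILE - 1) / TILE; ++t) {
        int ka = t * TILE + threadIdx.x;  // column index loaded into As
        int kb = t * TILE + threadIdx.y;  // row index loaded into Bs
        As[threadIdx.y][threadIdx.x] = (row < n && ka < n) ? D[row * n + ka] : INF;
        Bs[threadIdx.y][threadIdx.x] = (kb < n && col < n) ? D[kb * n + col] : INF;
        __syncthreads();

        for (int k = 0; k < TILE; ++k)
            best = fminf(best, As[threadIdx.y][k] + Bs[k][threadIdx.x]);   // add + min
        __syncthreads();
    }
    if (row < n && col < n) D_out[row * n + col] = best;
}

The host launches this kernel repeatedly, swapping D and D_out, for roughly log2(n) iterations (or until the distance matrix stops changing), with D initialized to the weighted adjacency matrix.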
[0067] The invention provides a GPU-based parallel computing system to identify the modular structures in complex networks, significantly reduces the execution time and cost of the computation, and provides a complex network research platform for commercial and academic entities.
[0068] FIG.8 shows a flow chart of a method 2000 of identifying modular structure in a complex network using a computing system with a CPU and a parallel processing device (e.g., a GPU) in accordance with an embodiment of the present invention. The method 2000 may be performed by the respective components of the computing system 1000 in FIG.1.
[0069] In particular, at S2100, task data which includes nodes in the complex network, edges with values indicating relationships among the nodes, and task parameters for a task to be performed on the complex network, may be read by the data reading means 210 on the CPU 200.
[0070] At S2200, the determining means 240 on the CPU 200 determines a task block from a predefined set of sub-blocks, each of which indicates a particular process, according to the task data, and transfers the task block to the first interface 250 on the CPU 200. The task block is used for assigning the subtask process to be performed in the plurality of parallel processing units 310-1, 310-2, ..., 310-N on the parallel processing device 300, respectively, where N is an integer.
[0071] At S2220, the task block is transferred by the first interface 250 to a first frontend 320 on the parallel processing device 300. Then, at S2240, the first frontend passes the task block to an assembler means 330 on the parallel processing device 300.
[0072] At S2260, the assembler means 330 generates the subtask process readable by the plurality of parallel processing units 310 from the task block and assigns the subtask process to the plurality of parallel processing units 310.
[0073] On the other hand, at S2300, the dispatcher means 220 may divide the task data into a plurality of data subsets with respect to the plurality of parallel processing units 310.
[0074] At S2320, the second interface 255 on the CPU 200 transfers the plurality of data subsets to a second frontend 325 on the parallel processing device. Then, at S2340, the second frontend 325 passes the plurality of data subsets to the plurality of parallel processing units, respectively.
[0075] Having received the subtask process and the data subsets, at S2400 the plurality of parallel processing units 310-1, 310-2, ..., 310-N perform the subtask process assigned by the assembler means on the data subsets, respectively, in parallel to obtain parallel results.
[0076] At S2500, the classification means 340 (located in the CPU 200 or GPU 300) processes the parallel results to obtain the modular structure in the complex network.
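As a rough, non-limiting host-side illustration of S2300-S2400, the sketch below divides the per-source workload into N subsets and launches one kernel per subset on its own CUDA stream, treating each parallel processing unit as one stream purely for illustration. The kernel subtask_kernel and the partitioning of the task data by source node are assumptions made only to keep the example short.

#include <cuda_runtime.h>
#include <vector>
#include <algorithm>

// Hypothetical subtask kernel: processes the source nodes in [first, last).
__global__ void subtask_kernel(const int *graph_data, float *results, int first, int last);

void dispatch_subtasks(const int *d_graph, float *d_results,
                       int num_sources, int num_units /* N parallel processing units */)
{
    std::vector<cudaStream_t> streams(num_units);
    int chunk = (num_sources + num_units - 1) / num_units;

    for (int i = 0; i < num_units; ++i) {                 // S2300/S2340: one data subset per unit
        cudaStreamCreate(&streams[i]);
        int first = i * chunk;
        int last  = std::min(num_sources, first + chunk);
        if (first < last)                                 // S2400: perform the subtasks in parallel
            subtask_kernel<<<(last - first + 255) / 256, 256, 0, streams[i]>>>(
                d_graph, d_results, first, last);
    }
    cudaDeviceSynchronize();                              // gather the parallel results before S2500
    for (cudaStream_t &s : streams) cudaStreamDestroy(s);
}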
[0077] The method of identifying modular structure in a complex network according to the present invention may incorporate one or more of the aspects described above with reference to FIGs.1-7.
[0078] For example, the method may further comprise extracting a complex network data representing the nodes and the edges with values and storing the complex network data as a part of the task data.
[0079] Also, the task block may comprise a combination of a graph search sub-block and a centrality measure sub-block, or a combination of a shortest path sub-block and a centrality measure sub-block. The routines described in FIGs.5-7 can also be applied in the method 2000.
[0080] It should be noted that the steps in method 2000 as shown in FIG.8 do not have to be performed in the order as shown. For example, steps S2200-S2260 may be performed after or at the same time as S2300-S2340.
[0081] As can be appreciated by one skilled in the art, a computer system with an associated computer-readable medium containing instructions for controlling the computer system can be utilized to implement the exemplary embodiments that are disclosed herein. The computer system may include at least one computer such as a microprocessor, digital signal processor, and associated peripheral electronic circuitry.
[0082] Although the subject matter has been described in language specific to structural features and/or methodological acts, it is to be understood that the subject matter defined in the appended claims is not necessarily limited to the specific features or acts described above. Rather, the specific features and acts described above are disclosed as example forms of implementing the claims.

Claims

What is claimed is:
1. A method of identifying a modular structure in a complex network using a computing system with a CPU and a parallel processing device, the method comprising
reading, by a data reading means on the CPU, task data which includes nodes in the complex network, edges with values indicating relationships among the nodes, and task parameter for a task to be performed on the complex network;
determining, by a determining means on the CPU, a task block for assigning subtask process to be performed in a plurality of parallel processing units on the parallel processing device, respectively, from a predefined set of sub-blocks each of which indicates a particular process, according to the task data, and transferring the task block to a first interface on the CPU;
transferring the task block, by the first interface, to a first frontend on the parallel processing device;
passing, by the first frontend, the task block to an assembler means on the parallel processing device;
generating, by the assembler means, the subtask process readable by the plurality of parallel processing units from the task block and assigning the subtask process to the plurality of parallel processing units;
dividing, by a dispatcher means on the CPU, the task data into a plurality of data subsets with respect to the plurality of parallel processing units;
transferring, by a second interface on the CPU, the plurality of data subsets to a second frontend on the parallel processing device;
passing, by the second frontend, the plurality of data subsets to the plurality of parallel processing units, respectively;
performing in parallel, by the plurality of parallel processing units, the subtask process assigned by the assembler means on the data subsets, respectively, to obtain parallel results; and processing, by a classification means, the parallel results to obtain the modular structure in the complex network.
2. The method of claim 1, wherein the parallel processing device is a Graphic Processing Unit, or is distributed on a local area network or a wide area network.
3. The method of claim 1, wherein the task block comprises a combination of a graph search sub-block and a centrality measure sub-block, or a combination of a shortest path sub-block and a centrality measure sub-block.
4. The method of claim 1, wherein the nodes represent different genes or proteins, and the values represent specific interactions among the nodes.
5. The method of claim 1, wherein the nodes represent different members in a group, and the values represent specific relationships among the nodes.
6. The method of claim 1, further comprising:
extracting, by a network data means on the CPU, a complex network data representing the nodes and the edges with values; and
storing, by a data storage means on the CPU, the complex network data as a part of the task data.
7. The method of claim 1, wherein the modular structure is comprised of a plurality of communities each of which contains the nodes that have closer relationships.
8. A tool suite device for identifying a modular structure in a complex network using a computing system with a CPU and a parallel processing device, the tool suite device comprising
a data reading means on the CPU for reading task data which includes nodes in the complex network, edges with values indicating relationships among the nodes, and task parameter for a task to be performed on the complex network;
a block storage means on the CPU for storing a predefined set of sub-blocks each of which indicates a particular process;
a determining means on the CPU for determining a task block for assigning subtask process to be performed in a plurality of parallel processing units on the parallel processing device, respectively, from the predefined set of sub-blocks stored in the block storage means according to the task data;
a first interface on the CPU for receiving the task block transferred from the determining means;
a dispatcher means on the CPU for dividing the task data into a plurality of data subsets with respect to the plurality of parallel processing units;
a second interface on the CPU for receiving the task data transferred from the dispatcher means;
a first frontend on the parallel processing device connected to the first interface for receiving the task block transferred from the first interface;
an assembler means on the parallel processing device for receiving the task block passed from the first frontend, generating the subtask process readable by the plurality of parallel processing units from the task block and assigning the subtask process to the plurality of parallel processing units;
a second frontend on the parallel processing device connected to the second interface for receiving the plurality of data subsets from the second interface and passing the plurality of data subsets to the plurality of parallel processing units, respectively;
the plurality of parallel processing units for performing in parallel, the subtask process assigned by the assembler means on the data subsets, respectively, to obtain parallel results; and a classification means for processing the parallel results to obtain the modular structure in the complex network.
9. The tool suite device of claim 8, wherein the parallel processing device is a
Graphic Processing Unit, or is distributed on a local area network or a wide area network.
10. The tool suite device of claim 8, wherein the task block comprises a combination of a graph search sub-block and a centrality measure sub-block, or a combination of a shortest path sub-block and a centrality measure sub-block.
11. The tool suite device of claim 8, wherein the nodes represent different genes or proteins, and the edges with values represent specific interactions among the nodes.
12. The tool suite device of claim 8, wherein the nodes represent different members in a group, and the edges with values represent specific relationships among the nodes.
13. The tool suite device of claim 8, further comprising:
a network data unit for extracting a graph representation for the nodes and the edges with values; and
a data storage unit for storing the graph representation as the task data.
14. The tool suite device of claim 8, wherein the modular structure is comprised of a plurality of communities each of which contains the nodes that have closer relationships.
15. A system for identifying modular structure in a complex network comprising a CPU and a parallel processing device, wherein
the CPU includes
a data reading means for reading task data which includes nodes in the complex network, edges with values indicating relationships among the nodes, and task parameter for a task to be performed on the complex network;
a block storage means for storing a predefined set of sub-blocks each of which indicates a particular process;
a determining means for determining a task block for assigning subtask process to be performed in a plurality of parallel processing units on the parallel processing device, respectively, from the predefined set of sub-blocks stored in the block storage means according to the task data;
a first interface for receiving the task block transferred from the determining means;
a dispatcher means for dividing the task data into a plurality of data subsets with respect to the plurality of parallel processing units; and a second interface for receiving the task data transferred from the dispatcher means, and
the parallel processing device includes:
a first frontend on the parallel processing device connected to the first interface for receiving the task block transferred from the first interface;
an assembler means on the parallel processing device for receiving the task block passed from the first frontend, generating the subtask process readable by the plurality of parallel processing units from the task block and assigning the subtask process to the plurality of parallel processing units;
a second frontend on the parallel processing device connected to the second interface for receiving the plurality of data subsets from the second interface and passing the plurality of data subsets to the plurality of parallel processing units, respectively; the plurality of parallel processing units for performing in parallel, the subtask process assigned by the assembler means on the data subsets, respectively, to obtain parallel results; and
a classification means for processing the parallel results to obtain the modular structure in the complex network.
PCT/CN2010/077949 2010-10-21 2010-10-21 Method and tool suite device for identifying modular structure in complex network WO2012051757A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201080051364.2A CN102667710B (en) 2010-10-21 2010-10-21 Method and tool suite device for identifying modular structure in complex network
PCT/CN2010/077949 WO2012051757A1 (en) 2010-10-21 2010-10-21 Method and tool suite device for identifying modular structure in complex network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2010/077949 WO2012051757A1 (en) 2010-10-21 2010-10-21 Method and tool suite device for identifying modular structure in complex network

Publications (1)

Publication Number Publication Date
WO2012051757A1 true WO2012051757A1 (en) 2012-04-26

Family

ID=45974623

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2010/077949 WO2012051757A1 (en) 2010-10-21 2010-10-21 Method and tool suite device for identifying modular structure in complex network

Country Status (2)

Country Link
CN (1) CN102667710B (en)
WO (1) WO2012051757A1 (en)


Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10255105B2 (en) * 2017-04-11 2019-04-09 Imagination Technologies Limited Parallel computing architecture for use with a non-greedy scheduling algorithm

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20030009621A1 (en) * 2001-07-06 2003-01-09 Fred Gruner First tier cache memory preventing stale data storage
WO2006044258A1 (en) * 2004-10-12 2006-04-27 International Business Machines Corporation Optimizing layout of an aplication on a massively parallel supercomputer
CN101278257A (en) * 2005-05-10 2008-10-01 奈特希尔公司 Method and apparatus for distributed community finding
CN101272328A (en) * 2008-02-29 2008-09-24 吉林大学 Dispersion type community network clustering method based on intelligent proxy system
CN101383748A (en) * 2008-10-24 2009-03-11 北京航空航天大学 Community division method in complex network

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2014020122A1 (en) * 2012-08-01 2014-02-06 Netwave System for processing data for connecting to a platform of an internet site
FR2994358A1 (en) * 2012-08-01 2014-02-07 Netwave SYSTEM FOR PROCESSING CONNECTION DATA TO A PLATFORM OF AN INTERNET SITE
CN104737520A (en) * 2012-08-01 2015-06-24 诺夫尔公司 System for processing data for connecting to a platform of an Internet site
CN104737159A (en) * 2012-08-01 2015-06-24 诺夫尔公司 Data-processing method for situational analysis
RU2654171C2 (en) * 2012-08-01 2018-05-16 Нетвэйв System for processing data for connecting to platform of internet site
CN104737159B (en) * 2012-08-01 2018-05-25 诺夫尔公司 For the data processing method of scenario analysis

Also Published As

Publication number Publication date
CN102667710A (en) 2012-09-12
CN102667710B (en) 2014-09-03

Legal Events

Date Code Title Description
WWE Wipo information: entry into national phase (Ref document number: 201080051364.2; Country of ref document: CN)
121 Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 10858545; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 Ep: pct application non-entry in european phase (Ref document number: 10858545; Country of ref document: EP; Kind code of ref document: A1)