CN106598743A - Attribute reduction method for information system based on MPI parallel solving - Google Patents
- Publication number: CN106598743A (application CN201611259383.XA)
- Authority: CN (China)
- Legal status: Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5061—Partitioning or combining of resources
Abstract
The invention discloses an attribute reduction method for an information system based on MPI parallel solving. The method comprises the following steps: first, the data of the information system are read, the numerical values are preprocessed, and the data are discretized; next, the information system is horizontally partitioned into p sample data subsets, which are assigned to n nodes through communication; the equivalence classes of the data subsets are computed in parallel, and the per-node results are merged to obtain m equivalence classes of the whole information system, each corresponding to a sub-information system; the m sub-information systems are then assigned to the n nodes, which compute attribute cores in parallel until all sub-information systems have been processed, after which the per-node results are merged to obtain the attribute core of the whole information system; finally, the attribute reduction is solved in parallel, and the per-node reduction results are merged to obtain the attribute reduction of the whole information system. By combining rough-set attribute reduction with MPI parallel computing, the discrimination-matrix-based attribute reduction computation can be solved in parallel, improving algorithmic efficiency.
Description
Technical field
The invention belongs to the fields of data mining, rough sets, and parallel computation, and in particular relates to a method for obtaining an attribute reduction in parallel based on MPI using a discrimination matrix.
Background technology
With the explosive growth of data in recent years, parallel techniques have become increasingly important. The main purpose of parallel computation is to reduce the processing time of large, complex problems and massive data; by pooling "cheap" computing resources into a parallel computing platform, the performance and memory limits of a single machine can be overcome.
Parallel computation means splitting a large computing task into multiple subtasks on a parallel computer or parallel computing platform, assigning them to individual processors, and letting the processors cooperate to complete the subtasks, thereby improving solution efficiency or making large-scale tasks feasible. The key to an optimal parallel solution is that the problem to be processed exhibits concurrency. Parallel computation is divided into temporal parallelism and spatial parallelism: temporal parallelism in effect refers to pipelining, while spatial parallelism means multiple processors participating in a computation simultaneously and is the main research topic of parallel computation. Parallel computation can further be divided into data parallelism and task parallelism, both of which let multiple processors participate in a computation to improve efficiency and performance.
The Message Passing Interface (MPI) has been the de facto standard for parallel program development in high-performance computing since the 1990s, and most current high-performance computing platforms provide an MPI parallel environment. MPI is currently the most important parallel programming tool; it has good portability, powerful functionality, high efficiency, and several free, highly efficient implementations. Almost all parallel computer vendors provide support for it, which no other parallel programming environment can match.

MPI appeared in 1994. Although it arrived relatively late, it absorbed the strengths of various earlier parallel environments while balancing performance, functionality, and portability, and within a few short years it rapidly became the standard for the message-passing parallel programming model. This in itself illustrates the vitality and superiority of MPI. MPI is in fact a library with over a hundred function-call interfaces that can be invoked directly from C. Although MPI offers many calls, only six are commonly used, and almost all communication functionality can be accomplished with just these six functions.
Characteristics of MPI: (1) Ease of use and good portability. Almost all parallel computers support the MPI framework; any parallel computer that supports inter-process communication supports MPI programming. (2) A complete asynchronous communication mechanism. Each concurrent process has its own independent memory space, which guarantees that processes can communicate without interfering with other parallel processes, solving the data synchronization problem and achieving true asynchronous communication. (3) Explicit data exchange. The user must explicitly send and receive messages to realize messaging and data exchange between concurrent processes. (4) Coarse parallel granularity. Programming in the message-passing model requires a good task decomposition; it suits compute-intensive applications and, to reduce communication overhead, is suitable for large-grained, large-scale, scalable parallel algorithms.
In practice, the attributes of an information system are not only diverse and high-dimensional but also contain noise, redundancy, and irrelevant attributes. To address the computational complexity and accuracy problems of the data, to eliminate the influence of noise on the computation process and the final result, and to shorten the computing time of rule-extraction algorithms so that the distribution of the essential features of the data can be seen clearly, attribute reduction is indispensable. In recent years, rough set theory has become an effective mathematical tool for processing uncertain information.
Rough sets: the theory was proposed by the Polish scholar Professor Pawlak in 1982 and is a mathematical theory that can effectively process imprecise, uncertain, and fuzzy information. Rough sets have been successfully applied in fields such as machine learning, data mining, intelligent data analysis, and control algorithm acquisition. The main idea of rough set theory is to use a known knowledge base to characterize (approximate) imprecise or uncertain knowledge with the knowledge already in that base. Rough sets do not depend on prior knowledge; they perform knowledge discovery according to the decisions and distribution of the data.
Attribute reduction: attribute reduction is one of the important research topics of rough set theory, a main application direction, and a long-standing research hotspot. Attribute reduction is a process applied to the full data set: on the basis of preserving the original classification ability of the information system, redundant or irrelevant attributes are deleted, so it is generally also regarded as dimensionality reduction of the data set. Common attribute reduction methods include the discrimination-matrix method, the positive-region method, and the information-entropy method. The discrimination-matrix method has the advantages of being easy to understand and convenient to implement, and is favored by many scholars.
Discrimination matrix: classical attribute reduction algorithms based on the discrimination matrix construct, for each given information system, a corresponding discrimination matrix and find, for each element, the set of attributes on which it differs from the other elements, representing the concrete knowledge in the information system. The advantage of this method is that the information contained in the knowledge representation system is visualized by the discrimination matrix, and the attributes that distinguish each pair of objects can be seen at a glance.
The content of the invention
To address the defects of the prior art, namely that the attributes of an information system are diverse and high-dimensional, contain noise and redundancy, and the data volume is large, the present invention proposes a method for solving information system attribute reduction in parallel based on MPI, which uses a discrimination matrix to obtain the attribute reduction in parallel, in order to address the computational complexity and accuracy problems of the data and to improve computing performance and efficiency. The technical scheme is as follows:
A method for solving information system attribute reduction in parallel based on MPI, comprising the following steps:
Step 1), in the data preprocessing stage, read the data of the information system and preprocess the numerical values, i.e. perform discretization. Depending on the characteristics of the data, a simple equal-width or equal-frequency interval method, a discretization method based on attribute importance, or a clustering-based discretization method can be used to discretize continuous data;
Step 2), horizontally partition the information system evenly, in units of samples, into p sample data subsets, and assign the p subsets to n nodes. Each node computes the equivalence classes of its data subsets in parallel according to the condition attributes; the results of the nodes are then merged, yielding m equivalence classes of the whole information system, each equivalence class corresponding to one sub-information system;
Step 3), distribute the m sub-information systems to the n nodes. Each node computes the attribute core of its assigned sub-information systems in parallel until all sub-information systems have been processed; the results of the nodes are then merged to obtain the attribute core of the original information system;
Step 4), finally, send the attribute core of the original information system to the nodes, which compute attribute reductions in parallel; the per-node attribute reduction results are then merged to obtain the attribute reduction of the whole information system.
Further, in the data preprocessing stage of step 1), reading the information system specifically comprises: the information system, i.e. the decision table, is a four-tuple IS = (U, A, V, f), where U represents the set of all objects in the problem domain, called the universe; A = C ∪ D is the attribute set, with subsets C and D the condition attribute set and the decision attribute set respectively; V_a is the value domain of attribute a; and f : U × A → V is an information function that assigns an information value to each attribute of each object, i.e. for every x ∈ U and a ∈ A, f(x, a) ∈ V_a.
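As a concrete illustration, the four-tuple IS = (U, A, V, f) can be held in plain Python structures. This is a minimal sketch; the object and attribute names are hypothetical, not taken from the patent.

```python
# A minimal sketch of the decision table IS = (U, A, V, f) as plain Python
# data. Object and attribute names are illustrative only.
U = ["x1", "x2", "x3"]                 # universe of objects
C = ["a1", "a2"]                       # condition attributes
D = ["d"]                              # decision attribute
A = C + D                              # A = C ∪ D

# f : U × A → V, stored as a nested dict; f[x][a] is the information value
f = {
    "x1": {"a1": 0, "a2": 1, "d": "no"},
    "x2": {"a1": 0, "a2": 1, "d": "yes"},
    "x3": {"a1": 1, "a2": 0, "d": "yes"},
}

# value domain V_a of each attribute, derived from f
V = {a: {f[x][a] for x in U} for a in A}
```

For every x ∈ U and a ∈ A, `f[x][a]` is an element of `V[a]`, matching the definition of the information function.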
Further, when discretizing the continuous data of the information system, depending on the characteristics of the data, a simple equal-width or equal-frequency interval method, a method based on attribute importance, or a clustering-based discretization method can be used.
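Of the discretization methods mentioned, the simple equal-width interval method is the easiest to sketch. The function below is an illustrative implementation under the assumption of numeric input; the function name and bin-indexing convention are ours, not the patent's.

```python
def equal_width_discretize(values, bins):
    """Equal-width discretization sketch: map each continuous value to a
    bin index in [0, bins-1]. Real preprocessing might instead use
    equal-frequency, attribute-importance, or clustering-based methods."""
    lo, hi = min(values), max(values)
    if hi == lo:                        # constant column: single bin
        return [0] * len(values)
    width = (hi - lo) / bins
    out = []
    for v in values:
        k = int((v - lo) / width)
        out.append(min(k, bins - 1))    # clamp the maximum onto the last bin
    return out
```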
Further, the equivalence class partition of the information system in step 2) classifies the universe by the condition attributes using an equivalence relation; the condition attribute set of the data set has the form {condition attribute 1, condition attribute 2, ..., condition attribute p}. An equivalence class may contain consistent and inconsistent objects: objects that agree on both the condition attributes and the decision attribute are consistent, while objects that agree on the condition attributes but differ on the decision attribute are inconsistent.
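The consistency test just described can be sketched directly: two objects are inconsistent when they agree on every condition attribute but disagree on the decision attribute. The representation (a dict of attribute dicts) and the names are assumptions for illustration.

```python
def inconsistent_objects(objs, cond, dec):
    """Return the set of inconsistent objects: pairs that agree on every
    condition attribute in `cond` but differ on the decision attribute
    `dec`. `objs` maps an object name to its attribute dict."""
    bad = set()
    names = list(objs)
    for i, x in enumerate(names):
        for y in names[i + 1:]:
            same_cond = all(objs[x][a] == objs[y][a] for a in cond)
            if same_cond and objs[x][dec] != objs[y][dec]:
                bad.update((x, y))
    return bad

# illustrative data: x1 and x2 share condition values but differ on d
sample = {"x1": {"a": 1, "d": "no"},
          "x2": {"a": 1, "d": "yes"},
          "x3": {"a": 2, "d": "no"}}
```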
Further, step 3) distributes the m sub-information systems to the n nodes. A master-slave mode is adopted when distributing tasks: one node is selected as the master node and the remaining nodes are slave nodes; the master node is responsible for allocating tasks to the slave nodes and receiving their results. Tasks are distributed dynamically, using either random or in-order assignment, so faster nodes receive more tasks; each time, a sub-information system is distributed to an idle node, until all sub-information systems have been processed.
Further, the parallel attribute core computation of step 3) creates a sub decision discrimination matrix on each node. If the sub-information system is a set of inconsistent objects determined by the decision attribute D, its attribute core is the empty set ∅; otherwise, the single attributes that determine the decision in the sub-information system are found, and the union of these single condition attributes is the attribute core of the sub-information system.
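The per-subsystem core and the merging step can be sketched as follows, under the assumption that each sub decision discrimination matrix is available as a list of attribute sets: singleton entries yield the core, and the per-node cores are merged by union.

```python
def sub_core(sub_matrix):
    """Attribute core of one sub-information system from its sub decision
    discrimination matrix (a list of attribute sets): the union of all
    singleton entries, i.e. attributes that alone distinguish some pair
    of objects. If no singleton entry exists, the core is empty."""
    core = set()
    for entry in sub_matrix:
        if len(entry) == 1:
            core |= entry
    return core

def merge_cores(cores):
    """Merging step: by definition, the core of the whole system is the
    union of the per-subsystem cores computed on each node."""
    out = set()
    for c in cores:
        out |= c
    return out
```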
Further, solving the attribute reduction of a sub-information system in parallel comprises: in the sub decision discrimination matrix, set the value of every element that contains a core attribute to the empty set, obtaining a new matrix; establish the corresponding disjunctive logical expression for each remaining non-empty element; take the conjunction of all these disjunctive expressions to obtain a conjunctive normal form; convert the conjunctive normal form into disjunctive normal form; and finally add all core attributes to each conjunct of the disjunctive normal form, obtaining the attribute reduction result of the sub-information system.
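This CNF-to-DNF step can be sketched as follows, under the assumption that the matrix is a list of attribute sets. Expanding the conjunctive normal form by picking one attribute per clause and keeping only the minimal conjuncts (absorption) yields the disjunctive normal form, and the core is then added to every conjunct.

```python
from itertools import product

def reducts(matrix, core):
    """Sketch of the reduction step: drop entries containing a core
    attribute, treat the remaining non-empty entries as a conjunction of
    disjunctions, expand it to disjunctive normal form, absorb non-minimal
    conjuncts, and add the core to every conjunct. Returns frozensets."""
    clauses = [e for e in matrix if e and not (e & core)]
    if not clauses:
        return {frozenset(core)}
    # expand CNF to DNF: choose one attribute from each clause
    conjuncts = {frozenset(choice) for choice in product(*clauses)}
    # absorption law: keep only minimal conjuncts
    minimal = {c for c in conjuncts if not any(o < c for o in conjuncts)}
    return {frozenset(c | core) for c in minimal}
```

For example, the matrix [{a, b}, {b, c}] encodes (a ∨ b) ∧ (b ∨ c), which simplifies to b ∨ (a ∧ c), i.e. the reducts {b} and {a, c}.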
Further, creating the sub decision discrimination matrix specifically comprises: for each partition block, finding for each element in the block the attributes on which it differs from the other elements.
Advantages and beneficial effects of the present invention:

The present invention provides a method for solving information system attribute reduction in parallel based on MPI. The problems to be solved include: the attributes of an information system are diverse and high-dimensional, contain noise and redundancy, and the data volume is large; traditional attribute reduction methods are limited by computing time and cannot quickly and effectively perform attribute reduction on large information systems. By applying MPI parallel techniques to the discrimination-matrix method for attribute reduction, the computational complexity and accuracy problems of the data can be addressed, and computing performance and efficiency improved. The method can process large-scale data sets that serial algorithms cannot handle, and substantially reduces the time needed to obtain the attribute reduction, solving problems such as excessive computing time, memory overflow, and machine crashes that arise when traditional serial attribute reduction algorithms reduce large information systems.
Description of the drawings
Fig. 1 is the flow block diagram of the preferred embodiment of the method for solving information system attribute reduction in parallel based on MPI;

Fig. 2 is the master-slave mode design model;

Fig. 3 is the peer-to-peer mode design model;

Fig. 4 is the node task distribution diagram;

Fig. 5 is the attribute core computation flow chart;

Fig. 6 is the flow chart of solving the attribute reduction from the attribute core;

Fig. 7 is the MPI communication mode flow chart.
Specific embodiment
The technical solutions in the embodiments of the present invention are described clearly and in detail below with reference to the accompanying drawings. The described embodiments are only some of the embodiments of the present invention.
As its technical solution to the above technical problems, the present invention proposes a method for obtaining an attribute reduction in parallel based on MPI using a discrimination matrix, comprising the following steps:

First, in the data preprocessing stage, the data of the information system are read and discretized. Second, in the data partitioning stage, the data of the information system are horizontally partitioned into sample data subsets and distributed to the different nodes of the MPI cluster, so that the equivalence classes are computed in parallel; the computation results of the nodes are then collected to obtain the equivalence classes of the information system, which serve as the basis for partitioning the information system, each equivalence class corresponding to one sub-information system. Then, in the parallel attribute core computation stage, tasks are distributed in master-slave mode: the sub-information systems are distributed to different nodes to compute the attribute core in parallel, and the results are merged through communication to obtain the attribute core of the information system. Finally, according to the task distribution of the previous stage, the attribute reduction is solved in parallel, and the results of the nodes are merged to obtain the attribute reduction of the whole information system.
Specifically, the information system is read first in the data preprocessing stage. A four-tuple IS = (U, A, V, f) is an information system (also a decision table), where U represents the set of all objects in the problem domain, called the universe; A = C ∪ D is the attribute set, with subsets C and D the condition attribute set and the decision attribute set respectively; V_a is the value domain of attribute a; and f : U × A → V is an information function that assigns an information value to each attribute of each object, i.e. for every x ∈ U and a ∈ A, f(x, a) ∈ V_a.
The data are then discretized: the continuous variables are transformed scientifically and rationally into discrete quantities that fit the actual distribution of the data.
In the data partitioning stage, the information system is first horizontally partitioned into p data subsets and distributed to the n nodes. The number of subsets must be chosen appropriately: too many subsets increase the communication overhead, while too few make the parallel granularity too coarse, so that node processing times differ too much and the total time overhead increases.
Parallel programs based on MPI can be classified, according to the relationship between nodes, into the peer-to-peer programming model and the master-slave programming model. In the peer-to-peer model, the nodes cooperate to complete the task together and do not depend on one another. In the master-slave model, the nodes are divided into a master node and slave nodes: the master node is responsible for distributing computing tasks, coordinating the progress of the slave nodes, and collecting the computation results, while the slave nodes receive and compute their tasks and cooperate to complete the overall job. In practical parallel program design the two models are often combined to improve parallel efficiency.
The different nodes then compute the equivalence classes of their data subsets in parallel. In an information system IS = (U, A, V, f), each attribute subset P ⊆ A determines an indiscernibility relation (i.e. an equivalence relation) IND(P):

IND(P) = {(x, y) ∈ U × U : f(x, a) = f(y, a) for every a ∈ P}.

The relation IND(P) constitutes a partition of U, denoted U/IND(P); the intersection of equivalence relations is again an equivalence relation, i.e. [x]_IND(P) = ∩_{a∈P} [x]_IND({a}), and each block of the partition is called an equivalence class.

The partition induced by P on U is written U/P. For X ⊆ U, the sets P̲(X) = {x ∈ U : [x]_P ⊆ X} and P̄(X) = {x ∈ U : [x]_P ∩ X ≠ ∅} are called the lower and upper approximation sets of X; POS_P(X) = P̲(X) is called the P-positive region of X, and POS_C(D) = ∪_{X∈U/D} C̲(X) is the positive region of C with respect to D. In an information system IS, if there exist x, y ∈ U with f(x, C) = f(y, C) and f(x, D) ≠ f(y, D), then IS is called an inconsistent information system and x and y are called inconsistent objects; otherwise IS is called a consistent information system.
The computation results of the different nodes are then collected and identical equivalence classes are merged, yielding the equivalence class partition of the whole information system. Each equivalence class corresponds to one sub-information system; if there are m equivalence classes, the information system is accordingly partitioned into m sub-information systems.
In the parallel attribute core computation stage, tasks are distributed first. The m sub-information systems are distributed to the n nodes in master-slave mode: one node is selected as the master node and the remaining nodes are slave nodes; the master node allocates tasks to the slave nodes and receives their results. Tasks are distributed dynamically; since the order of the sub-information systems does not affect the result, either random or in-order assignment can be used, faster nodes receive more tasks, and each time a sub-information system is distributed to an idle node, until all sub-information systems have been processed.
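The dynamic, fastest-node-gets-more policy can be illustrated without MPI at all. The simulation below is our own pure-Python simplification of the master-slave scheme (no real MPI calls): each sub-information system is handed to whichever worker becomes idle first, so workers whose tasks are cheaper end up processing more of them.

```python
from collections import deque

def dynamic_assign(num_subsystems, num_workers, cost):
    """Simulate dynamic master-slave scheduling of sub-information systems.
    `cost[k]` is the simulated processing time of subsystem k. Returns the
    list of subsystem indices handled by each worker."""
    tasks = deque(range(num_subsystems))
    # each worker: [time at which it becomes idle, worker id, tasks done]
    workers = [[0.0, w, []] for w in range(num_workers)]
    while tasks:
        workers.sort(key=lambda t: (t[0], t[1]))  # next idle worker
        idle = workers[0]
        k = tasks.popleft()                       # hand one subsystem to it
        idle[0] += cost[k]
        idle[2].append(k)
    workers.sort(key=lambda t: t[1])
    return [w[2] for w in workers]
```

With two workers and one expensive subsystem, the cheap tasks accumulate on the other worker, mirroring the "faster nodes receive more tasks" behavior described above.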
Each node then computes attribute cores in parallel. The different nodes first build a decision discrimination matrix for each sub-information system in parallel. For an information system IS = (U, A, V, f) with U = {x_1, ..., x_n}, the decision discrimination matrix is defined as DM = {m_ij}, where m_ij satisfies:

m_ij = {a ∈ C : f(x_i, a) ≠ f(x_j, a)} if f(x_i, D) ≠ f(x_j, D), and m_ij = ∅ otherwise.

In an information system IS, let U/C = {U_1, U_2, ..., U_m}; the information system can then be horizontally partitioned into m sub-information systems, the k-th sub-information system being IS_k = (U_k, A, V, f) (1 ≤ k ≤ m). The sub decision discrimination matrix of IS_k = (U_k, A, V, f) is defined as DM_k = {m_ij^(k)}, where m_ij^(k) is defined over the objects of U_k in the same way as m_ij.
The attribute core of each sub-information system is then computed. The core attribute set of sub-information system IS_k = (U_k, A, V, f) is defined as DCORE_k(C) and satisfies:

DCORE_k(C) = {a ∈ C : there exist i, j with m_ij^(k) = {a}},

i.e. the attributes that appear as singleton entries of the sub decision discrimination matrix; the attribute core of the decision discrimination matrix of IS = (U, A, V, f) is defined correspondingly. The nodes finally communicate to merge the attribute cores: according to the definition of the attribute core, the merge operation takes the union of the attribute cores computed on the nodes.
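The construction of the decision discrimination matrix defined above can be sketched as follows; the object representation (a dict of attribute dicts) and the names are assumptions for illustration, and only pairs with differing decision values contribute entries.

```python
def decision_discrimination_matrix(objs, cond, dec):
    """Sketch of the decision discrimination matrix DM = {m_ij}: for each
    pair of objects with different decision values, record the set of
    condition attributes on which they differ. Pairs with equal decision
    values would contribute the empty set and are omitted here."""
    names = list(objs)
    dm = []
    for i, x in enumerate(names):
        for y in names[i + 1:]:
            if objs[x][dec] != objs[y][dec]:
                m = {a for a in cond if objs[x][a] != objs[y][a]}
                dm.append(m)
    return dm

# illustrative three-object decision table
sample_is = {"x1": {"a": 0, "b": 0, "d": 0},
             "x2": {"a": 0, "b": 1, "d": 1},
             "x3": {"a": 1, "b": 1, "d": 0}}
```

On `sample_is`, the pairs (x1, x2) and (x2, x3) differ in decision value, giving the entries {b} and {a}; both are singletons, so both attributes would belong to the core.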
In the parallel attribute reduction stage, tasks are first divided as in the attribute core computation stage: each node processes its corresponding sub-information systems and solves their attribute reductions.

Given an information system IS = (U, A, V, f) with A = C ∪ D, where subsets C and D are the condition attribute set and the decision attribute set, a subset P ⊆ C is called a reduction of IS if γ_P(D) = γ_C(D) and, for every proper subset B ⊂ P, γ_B(D) ≠ γ_C(D).
According to the definition of attribute reduction, the master node broadcasts the attribute core to each slave node by collective communication. In the sub decision discrimination matrix, the value of every element containing a core attribute is set to ∅, yielding a new matrix. For every element m_ij of the matrix whose value is a non-empty set, the corresponding disjunctive logical expression L_ij = ∨_{a ∈ m_ij} a is established; the conjunction of all these expressions, L = ∧_{m_ij ≠ ∅} L_ij, gives a conjunctive normal form, which is then converted into disjunctive normal form. Finally, all core attributes are added to each conjunct of the disjunctive normal form, yielding the attribute reduction result.

The attribute reduction results are finally merged through communication: each slave node submits its result to the master node, which takes the conjunction of all reduction results and converts it to disjunctive normal form, thereby obtaining the attribute reduction of the whole information system as the final result.
MPI communication mechanism: MPI communication refers to the exchange of messages and data between the concurrent processes of a program. According to the target of the message transfer, communication falls into two classes: point-to-point communication and collective communication.

MPI provides two broad types of point-to-point communication functions. The first type is blocking; the second is non-blocking. A blocking function waits until the specified operation has actually completed, or at least until the data involved have been safely backed up by the MPI system, before returning. A non-blocking function call always returns immediately, and the actual operation is carried out by the MPI system in the background. For point-to-point message sending, MPI provides four send modes: standard mode, buffered mode, synchronous mode, and ready mode.
MPI standard-mode communication counts in units of the transmitted/received data type, and the receive buffer must not be smaller than the data to be received, otherwise an error is raised. If the amount of arriving data is less than the buffer size, only the region at the start of the buffer with the length of the actually received data is overwritten by the received data. To receive data of unknown length, MPI_Probe can be used. The receiving process may specify wildcard receive envelopes, i.e. MPI_ANY_SOURCE and MPI_ANY_TAG, to receive messages with any tag from any source process. It can be seen that the send and receive operations are asymmetric: the sender must give a specific destination address, while the receiver can receive information from any source. The source and destination can also be specified as the same process, but blocking communication should then be avoided because it easily causes deadlock.
In standard mode, the MPI system decides whether to copy the message into a buffer and return immediately (the transmission then being carried out by the MPI system in the background), or to wait until the data have been sent out before returning. Most MPI systems reserve a buffer of a certain size; when the message to be sent is shorter than the buffer, it is copied into the buffer and the call returns immediately, otherwise the call returns only after part or all of the message has been sent. The standard-mode send operation is non-local, because its completion requires contact with the receiver. The standard-mode blocking send function is MPI_Send.
MPI collective communication uses a self-created set of nodes as a communication subset, allowing messages and data to be transmitted only within that subset. Unlike point-to-point communication, collective communication is blocking, so all parallel processes in the set must take part, and the next operation can proceed only after the collective operation has completed; otherwise the processes fall into unbounded waiting. Compared with point-to-point communication, collective communication can better exploit parallel efficiency.
By direction, collective communication can be divided into three modes: one-to-many, many-to-one, and many-to-many. Synchronization functions coordinate the progress of processes, in effect setting up a synchronization point: execution continues only after all processes have reached it. Computation functions process the data received by a process.
Collective communication is based on point-to-point communication, but it is not a simple wrapper around it; it is specifically optimized according to its own characteristics. Collective communication greatly eases the programmer's burden: it not only makes parallel programs concise but also improves their performance and efficiency. For example, to send a message to all processes in a communication domain, the user can directly call the collective communication function MPI_Bcast, which requires only a single statement.
Fig. 1 is the flow block diagram of the present invention, comprising the following steps:
(1) Data preprocessing stage.
This stage mainly reads the information system and discretizes its data, as follows:

The data sets are downloaded from the UCI experimental data platform (http://archive.ics.uci.edu/ml/). The format of a data set is {condition attribute 1, condition attribute 2, ..., condition attribute n, decision attribute}, where the condition attribute set is {condition attribute 1, condition attribute 2, ..., condition attribute n} and the decision attribute set is {decision attribute 1, decision attribute 2, ..., decision attribute p}.
A four-tuple IS = (U, A, V, f) is an information system (also a decision table), where U represents the set of all objects in the problem domain, called the universe, and A = C ∪ D is the attribute set, with subsets C and D the condition attribute set and the decision attribute set.

The master node reads the data of the information system and, according to the distribution of the data, converts the continuous data into discrete quantities that meet the actual needs.
(2) Data partitioning stage.
This stage is divided into three steps: task distribution, parallel equivalence class computation, and equivalence class merging:

1. Task distribution.

According to the definition of the information system, the information system is horizontally partitioned into p data subsets, which are distributed to different nodes for the next computation step. Tasks are distributed in master-slave mode: the master node distributes the data subsets to the slave nodes.
2. Parallel equivalence class computation.

According to the definition of equivalence classes, each node computes the equivalence classes in its data subsets in parallel; the computation uses the peer-to-peer mode, i.e. the master node also acts as a computing node.

In an information system IS = (U, A, V, f), each attribute subset P ⊆ A determines an indiscernibility relation (i.e. an equivalence relation) IND(P):

IND(P) = {(x, y) ∈ U × U : f(x, a) = f(y, a) for every a ∈ P}.

The relation IND(P) constitutes a partition of U, denoted U/IND(P); the intersection of equivalence relations is again an equivalence relation, i.e. [x]_IND(P) = ∩_{a∈P} [x]_IND({a}), and each block of the partition is called an equivalence class.
A simple example below illustrates how equivalence classes are formed.
Table 1: Influenza data set example
Object number | Headache | Myalgia | Body temperature | Influenza |
e1 | Yes | Yes | Normal | No |
e2 | Yes | Yes | High | Yes |
e3 | Yes | Yes | Very high | Yes |
e4 | No | Yes | Normal | No |
e5 | No | No | High | No |
e6 | No | Yes | Very high | Yes |
Classifying by myalgia:
U/{myalgia} = {{e1, e2, e3, e4, e6}, {e5}}.
Classifying jointly by the two attributes headache and influenza:
U/{headache, influenza} = {{e1}, {e2, e3}, {e4, e5}, {e6}}.
The universe can thus be classified by different criteria, yielding different concepts and abstractions.
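The partition U/IND(P) is computed by grouping objects that agree on every attribute of P. A minimal single-process Python sketch on the Table 1 influenza data (attribute names abbreviated):

```python
from collections import defaultdict

def partition(universe, table, attrs):
    """Group objects whose values agree on every attribute in `attrs`
    (the equivalence classes of IND(attrs))."""
    classes = defaultdict(list)
    for x in universe:
        key = tuple(table[x][a] for a in attrs)
        classes[key].append(x)
    return sorted(classes.values())

# Table 1, the influenza data set
flu = {
    "e1": {"headache": "yes", "myalgia": "yes", "temp": "normal",    "flu": "no"},
    "e2": {"headache": "yes", "myalgia": "yes", "temp": "high",      "flu": "yes"},
    "e3": {"headache": "yes", "myalgia": "yes", "temp": "very high", "flu": "yes"},
    "e4": {"headache": "no",  "myalgia": "yes", "temp": "normal",    "flu": "no"},
    "e5": {"headache": "no",  "myalgia": "no",  "temp": "high",      "flu": "no"},
    "e6": {"headache": "no",  "myalgia": "yes", "temp": "very high", "flu": "yes"},
}
U = sorted(flu)
print(partition(U, flu, ["myalgia"]))
# [['e1', 'e2', 'e3', 'e4', 'e6'], ['e5']]
print(partition(U, flu, ["headache", "flu"]))
# [['e1'], ['e2', 'e3'], ['e4', 'e5'], ['e6']]
```

Grouping on value tuples is also how each node processes its horizontal slice in the parallel stage; only the key-building step depends on the chosen attribute subset.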
3. Merging equivalence classes.
Using the master-slave mode, the master node collects the computation results of the slave nodes and merges identical equivalence classes, obtaining the
equivalence-class partition of the whole information system. This partition is the basis for dividing the information system: each equivalence class corresponds to one
sub-information system, so if there are m equivalence classes there are m sub-information systems, completing the division of the data.
(3) Parallel attribute core stage.
This stage comprises three steps: task distribution, parallel computation of the sub-information systems' attribute cores, and merging of attribute cores.
1. Task distribution.
The MPI parallel programming model combines the master-slave mode with the peer mode, as shown in Fig. 2 and Fig. 3; the nodes cooperate
to complete the task jointly, improving parallel efficiency.
Because the order of the sub-information systems does not affect the result, they can be assigned in order or at random;
whichever node processes quickly can be assigned more sub-information systems. Each time, a sub-information system is assigned to an idle node,
and the results of the nodes arrive in no particular order. Node task distribution is shown in Fig. 4.
For example, 7 sub-information systems are processed by 3 nodes: three of them are first given to node 1, node 2, and node 3.
If node 2 finishes first while four sub-information systems remain unprocessed, the next task is assigned to node 2;
if node 3 is idle, a task is assigned to node 3, and so on, until all sub-information systems have been processed.
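The first-idle-first-served policy above can be illustrated with a small scheduling simulation in plain Python (this sketches the dispatch policy only, not the MPI implementation; the per-task processing times are hypothetical):

```python
import heapq

def dynamic_dispatch(num_tasks, speeds):
    """Simulate first-idle-first-served assignment: each task goes to
    whichever node becomes free earliest, so fast nodes receive more work.
    `speeds[i]` is the (hypothetical) time node i needs per sub-system."""
    # Heap of (time the node becomes idle, node id)
    idle = [(0.0, node) for node in range(len(speeds))]
    heapq.heapify(idle)
    assignment = []
    for task in range(num_tasks):
        t, node = heapq.heappop(idle)          # earliest-idle node
        assignment.append((task, node))
        heapq.heappush(idle, (t + speeds[node], node))
    return assignment

# 7 sub-information systems on 3 nodes; node 1 (index 1) is twice as fast
plan = dynamic_dispatch(7, [2.0, 1.0, 2.0])
counts = [sum(1 for _, n in plan if n == node) for node in range(3)]
print(counts)  # [2, 3, 2] -- the fast node ends up with more sub-systems
```

In the MPI version the same policy emerges naturally: the master posts a receive for "done" messages and sends the next sub-information system to whichever worker replied first.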
2. Parallel computation of attribute cores.
Each node computes the attribute core of its assigned sub-information systems in parallel. For each sub-information system,
a sub decision discernibility matrix is built first.
According to the definition of the decision discernibility matrix, each node builds the matrix for its assigned sub-information systems in parallel.
For a sub-information system ISk = (Uk, A, V, f), the sub decision discernibility matrix DMk records, for each pair of objects with different decision values, the attributes that discern them: the condition attributes on which the two objects differ, together with the decision attribute D when the objects are inconsistent (equal condition values but different decision values).
Then the attribute core of each sub-information system is computed.
According to the definition of the attribute core, each node computes the cores of its sub-information systems in parallel; the flowchart is shown in
Fig. 5. The core attributes of a sub-information system ISk = (Uk, A, V, f) are denoted DCOREk(C): the condition attributes that appear as singleton entries of the sub decision discernibility matrix, i.e. that by themselves discern some pair of objects.
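To make the two definitions concrete, here is a minimal single-process sketch on the influenza data of Table 1 (this illustrates the standard rough-set discernibility matrix and singleton-entry core, not the patented sub-matrix layout):

```python
from itertools import combinations

# Table 1, the influenza data set; index 3 is the decision attribute (flu)
flu = {
    "e1": ("yes", "yes", "normal",    "no"),
    "e2": ("yes", "yes", "high",      "yes"),
    "e3": ("yes", "yes", "very high", "yes"),
    "e4": ("no",  "yes", "normal",    "no"),
    "e5": ("no",  "no",  "high",      "no"),
    "e6": ("no",  "yes", "very high", "yes"),
}
COND = ("headache", "myalgia", "temp")

def discernibility_matrix(table):
    """For each pair of objects with different decision values, record the
    condition attributes on which they differ."""
    m = {}
    for x, y in combinations(sorted(table), 2):
        if table[x][3] != table[y][3]:
            m[(x, y)] = {COND[i] for i in range(3) if table[x][i] != table[y][i]}
    return m

def attribute_core(matrix):
    """Core = union of all singleton entries: attributes that alone
    discern some pair of objects and hence cannot be dropped."""
    core = set()
    for entry in matrix.values():
        if len(entry) == 1:
            core |= entry
    return core

dm = discernibility_matrix(flu)
print(attribute_core(dm))  # {'temp'}
```

For this table only body temperature discerns, e.g., e1 from e2 on its own, so the core is {temp}, in line with the classical analysis of this data set.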
3. Merging attribute cores.
The attribute core of the whole information system IS = (U, A, V, f) is likewise defined from its decision discernibility matrix and is denoted DCORE(C).
According to the relation between the attribute cores of the information system and of its sub-information systems, the nodes communicate and merge
their results: the attribute cores computed at each node are combined by union.
(4) Parallel attribute reduction stage.
This stage comprises two steps: computing the sub-information systems' attribute reductions in parallel and merging the attribute reductions.
1. Computing sub-information system attribute reductions in parallel.
Given an information system IS = (U, A, V, f) with attribute set A = C ∪ D, where the subsets C and D are the condition attribute
set and the decision attribute set, a subset P ⊆ C is called a reduction of IS
if γ(P, D) = γ(C, D) and, for every proper subset B ⊂ P, γ(B, D) ≠ γ(C, D).
According to the definition of attribute reduction, the master node broadcasts the attribute core to each slave node by collective communication.
In each sub decision discernibility matrix, the value of every element containing a core attribute is set to the empty set, yielding a new matrix.
For every element cij of the discernibility matrix whose value is a non-empty set, a disjunctive logical expression Lij (the disjunction of the attributes in cij) is built.
All the disjunctive expressions Lij are combined by conjunction, giving a conjunctive normal form.
The conjunctive normal form is then converted into disjunctive normal form.
Finally, all core attributes are added to each conjunct of the disjunctive normal form, giving the attribute
reduction results of all sub-information systems.
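The conversion from the conjunctive discernibility function to disjunctive normal form can be sketched as follows; the clauses below are the distinct discernibility entries of the Table 1 influenza data, and the resulting conjuncts are its reducts (a sketch of the general CNF-to-DNF technique with absorption, not the patented sub-matrix variant):

```python
from itertools import product

def reducts_from_clauses(clauses):
    """Turn the discernibility function (a conjunction of attribute-set
    clauses) into its prime implicants, i.e. the attribute reducts."""
    # Absorption law A ∧ (A ∨ B) = A: drop clauses that contain another clause.
    minimal = [c for c in clauses if not any(o < c for o in clauses)]
    # Expand CNF into DNF by choosing one attribute from every clause ...
    candidates = {frozenset(choice) for choice in product(*minimal)}
    # ... then keep only the minimal conjuncts.
    return sorted(sorted(r) for r in candidates
                  if not any(o < r for o in candidates))

# The distinct discernibility entries of the influenza data in Table 1
clauses = [frozenset(c) for c in (
    {"temp"}, {"headache", "temp"}, {"headache", "myalgia"},
    {"headache", "myalgia", "temp"}, {"myalgia", "temp"},
)]
print(reducts_from_clauses(clauses))
# [['headache', 'temp'], ['myalgia', 'temp']]
```

Absorption first shrinks the CNF to {temp} ∧ {headache, myalgia}, whose expansion gives the two reducts (temp ∧ headache) and (temp ∧ myalgia); note that the core attribute temp appears in every reduct, which is why adding the core back to each conjunct, as the method does, is sound.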
2. Merging attribute reductions; the flowchart is shown in Fig. 6.
Each slave node communicates with the master node to merge the attribute reduction results. The inter-process communication mode is the same as in the
parallel attribute core stage, using the standard mode. Merging the attribute reduction results means combining all reduction results by conjunction and then
converting to disjunctive normal form, which gives the final result, i.e. the attribute reduction of the whole information system.
Node communication uses the standard communication mode and the collective communication mode. The standard mode is blocking communication:
the sender's send call can only complete in cooperation with the receiver's recv call; the flowchart is shown in Fig. 7. In blocking
standard-mode communication, the MPI environment itself decides whether to buffer the outgoing message: if MPI has buffered the data to be sent,
the send can return immediately even if the receiving side has not yet started to receive. For performance and resource optimization, the MPI
environment provides a limited amount of buffer space; once it is exhausted, the send must block until a matching receive operation has collected
the data. That is, in blocking communication, whether the sender completes depends not only on the state of the local process but also on the state
of the remote receiving process.
The collective communication mode treats a set of nodes created by the program as a communication subset; messages and data are transmitted only
within this subset. Unlike point-to-point communication, collective communication is always blocking, so all parallel processes in the set
must take part, and the next operation can proceed only after the collective operation has completed; otherwise the processes fall into unbounded waiting. Compared with point-to-point
communication, collective communication exploits parallelism more effectively.
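For illustration, the two modes map onto the MPI API roughly as follows. This is a non-runnable pseudocode sketch in mpi4py style (it would need an MPI launcher such as `mpirun -n <p>`), with hypothetical payloads:

```python
# Pseudocode sketch (mpi4py style); run under an MPI launcher, e.g. mpirun
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()

# Standard (blocking) point-to-point mode: send may return once MPI has
# buffered the message; otherwise it blocks until the matching recv runs.
if rank == 0:
    comm.send({"subsystem": "U1"}, dest=1, tag=0)   # master hands out a task
elif rank == 1:
    task = comm.recv(source=0, tag=0)               # blocks until data arrives

# Collective mode: every process in the communicator must call bcast,
# e.g. broadcasting the attribute core from the master to all nodes.
core = {"a", "d"} if rank == 0 else None            # example core from Step (3)
core = comm.bcast(core, root=0)
```

The point-to-point pair implements the master-slave task distribution, while the broadcast matches the collective distribution of the attribute core before the reduction stage.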
The implementation of the present invention is further described below. This example is carried out on the premise of the technical solution of the
present invention and gives a detailed embodiment and a specific operating process, but the protection scope of the present invention is not limited to the
following example.
Table 2: Information system IS
Step (1): data preprocessing. The data of the information system are read in and discretized. For example, condition attribute
a is split at 0.5: values greater than 0.5 are discretized to 1 and values less than 0.5 to 0, so the a-values of the data
objects become 1, 0, 1, 0, 1, 1, 0. Likewise, condition attribute c is discretized to 0 and 1, where 0
represents a negative number (less than 0) and 1 a positive number (greater than 0), so the c-values of the data objects become
1, 1, 1, 1, 1, 0, 0. The result of discretizing the whole information system IS is shown in Table 3.
Table 3: Discretized information system IS0
Step (2): data division. The information system is divided into m equivalence classes; that is, according to the definitions of the information system and the
equivalence class, the information system is divided into m sub-information systems.
1. Task distribution.
The information system is horizontally partitioned into p data subsets, and the master-slave mode is used to assign the p subsets to
different nodes. Suppose IS0 is divided into one data subset per three data objects, giving 3 data subsets IS1, IS2, IS3
assigned to 3 nodes. During computation the peer mode is also used, i.e. the master node computes as well. Thus node 1 is the master node; IS1 is first assigned
to node 2, IS2 to node 3, and IS3 to node 1, completing the task distribution.
Table 3.1: Data subset IS1
Table 3.2: Data subset IS2
Table 3.3: Data subset IS3
2. Parallel computation of equivalence classes.
According to the definition of the equivalence class, each node computes in parallel, per its assigned task, the equivalence classes based on the condition
attributes in its data subset; the peer mode is used, i.e. the master node also serves as a compute node.
Within each node, data objects with identical condition attribute values are grouped together. From the allocation result of the previous step, this
yields IS1', IS2', IS3', as follows:
Table 4.1: Data subset IS1'
Table 4.2: Data subset IS2'
Table 4.3: Data subset IS3'
In data subset IS1', x1 and x3 are placed in the same equivalence class, while x2 forms another equivalence class. In data
subset IS2', the condition attribute values of x4, x5 and x6 are all different, so no objects merge into a common equivalence class; the same holds for IS3'.
3. Merging equivalence classes.
Using the master-slave mode, the master node collects the computation results of the slave nodes and merges identical equivalence classes, obtaining the
equivalence-class partition of the information system; each equivalence class corresponds to one sub-information system, realizing the division of the data.
The condition attributes of x2 in IS1' and x4 in IS2' agree, so they belong to the same equivalence class and can be merged. This finally yields 5
equivalence classes, U1 = {x1, x3}, U2 = {x2, x4}, U3 = {x5}, U4 = {x6}, U5 = {x7}, corresponding to 5 sub-information systems, as
shown below:
Table 5.1: Sub-information system U1
Table 5.2: Sub-information system U2
Table 5.3: Sub-information system U3
Table 5.4: Sub-information system U4
Table 5.5: Sub-information system U5
Step (3): parallel computation of the attribute core. The sub-information systems are distributed to different nodes; each node computes the attribute cores of its
sub-information systems in parallel, and the attribute cores are then merged.
1. Task distribution.
The master node distributes the sub-information systems to the nodes; the master-slave mode is used when distributing tasks and the peer
mode during computation, which improves parallel efficiency. The order of the sub-information systems does not affect the result, so a node
that processes quickly can be assigned more sub-information systems; each time, a sub-information system is assigned to an idle node. Here 3
nodes process 5 sub-information systems: U1 is assigned to node 2, U2 to node 3, and U3 to node 1 (since the master node must also
distribute tasks and collect results, its computing speed is reduced). Node 2 finishes its task first, so U4 is assigned to
node 2; node 1 then finishes, so U5 is assigned to node 1. Thus node 1 processes U3 and U5, node 2 processes U1 and U4, and node 3 processes
U2.
2. Parallel computation of attribute cores. The decision discernibility matrices are built in parallel first; then, by the definition of the attribute core, the cores of the sub-information
systems are computed.
According to the assigned tasks and the definition of the decision discernibility matrix, each node builds the decision discernibility matrix of its sub-information systems U1,
U2, U3, U4, U5. The resulting matrices of the sub-information systems are as follows:
Table 6.1: Sub-information system decision discernibility matrix DM1
Table 6.2: Sub-information system decision discernibility matrix DM2
Table 6.3: Sub-information system decision discernibility matrix DM3
Table 6.4: Sub-information system decision discernibility matrix DM4
Table 6.5: Sub-information system decision discernibility matrix DM5
Next, according to the node task distribution and the definition of the attribute core, the attribute cores of the sub-information systems U1, U2,
U3, U4, U5 are computed in parallel from the sub discernibility matrices. In DM1, an entry D ∈ DM1 is found, i.e. sub-information system U1 is an inconsistent object set, so DM1
yields no core attributes, CORE1(C) = ∅. Likewise, DM2 yields no core attributes, CORE2(C) = ∅. For
DM3, there is no entry D ∈ DM3, but aD ∈ DM3 and dD ∈ DM3, so sub-information system U3 has core attributes CORE3(C) = {a, d};
in DM4, there is no D ∈ DM4 but aD ∈ DM4, so sub-information system U4 has core attributes CORE4(C) = {a}; in DM5,
there is no D ∈ DM5 but aD ∈ DM5, so sub-information system U5 has core attributes CORE5(C) = {a}.
3. Merging attribute cores.
According to the relation between the attribute cores of the information system and its sub-information systems, the master node collects, via the master-slave mode, the
results computed by the other nodes and merges them: the attribute cores computed at each node are combined by union.
By the definition of the attribute core and step 2, CORE1(C) = CORE2(C) = ∅, CORE3(C) = {a, d},
CORE4(C) = {a}, CORE5(C) = {a}.
From the above steps, the attribute core of information system IS is DCORE(C) = {a, d}, which agrees with the
core attributes obtained by the positive-region method.
Step (4): parallel computation of the attribute reduction. With the task distribution of step (3), each node computes the attribute reductions of its sub-information
systems in parallel without interference, and finally the master node merges the results.
1. Computing sub-information system attribute reductions in parallel. According to the definition of attribute reduction, the reductions are obtained from the
results of the previous step.
In the sub decision discernibility matrices, the entries that contain only the decision attribute and those that do not contain the decision attribute are set to ∅, and the
entries of elements containing core attributes are modified in parallel to ∅, yielding new matrices as follows:
Table 7.1: Modified decision discernibility matrix DM1
Table 7.2: Modified decision discernibility matrix DM2
Table 7.3: Modified decision discernibility matrix DM3
Table 7.4: Modified decision discernibility matrix DM4
Table 7.5: Modified decision discernibility matrix DM5
Then the corresponding disjunctive logical expressions are built for each sub-information system; e.g. for DM4 the expression b ∨
c is built. All the disjunctive expressions are combined by conjunction to obtain a conjunctive normal form, which is converted to disjunctive normal
form and simplified to b ∨ c. Finally all core attributes are added to each conjunct of the disjunctive normal form, giving the attribute
reduction result of each sub-information system; each conjunct is one attribute reduction. DM1 gives the result (a ∧ b ∧ d) ∨ (a ∧ c
∧ d), i.e. the attribute reduction of DM1 is a ∧ b ∧ d or a ∧ c ∧ d; DM4 gives (a ∧ b ∧ d) ∨ (a ∧ c ∧ d), i.e.
the attribute reduction of DM4 is a ∧ b ∧ d or a ∧ c ∧ d; the rest are ∅.
2. Merging attribute reductions. Using the master-slave mode, the master node collects the computation results of the other nodes and combines
them.
When merging the attribute reduction results, all non-empty reduction results are combined by conjunction, i.e. ((a ∧ b ∧ d) ∨ (a
∧ c ∧ d)) ∧ ((a ∧ b ∧ d) ∨ (a ∧ c ∧ d)), which simplifies to the disjunctive normal form (a ∧ b ∧ d) ∨ (a ∧ c ∧ d). This is the final
result, i.e. the attribute reduction of the whole information system: a ∧ b ∧ d or a ∧ c ∧ d.
The above embodiment should be understood as merely illustrating the present invention rather than limiting its scope.
After reading the disclosure of the present invention, those skilled in the art can make various changes or modifications to it, and such equivalent
changes and modifications likewise fall within the scope of the claims of the present invention.
Claims (8)
1. A method for computing an information system attribute reduction in parallel based on MPI, characterized by comprising the following steps:
Step 1): in the data preprocessing phase, read the data of the information system and preprocess the values, i.e. perform discretization,
turning continuous data discrete according to the characteristics of the data;
Step 2): horizontally partition the information system uniformly, in units of samples, into p sample data subsets, and assign the p sample
data subsets to n nodes; each node computes in parallel the equivalence classes of its data subset according to the condition attributes; then the
results of the nodes are integrated to obtain the m-equivalence-class partition of the whole information system, each equivalence class corresponding to one sub-information
system;
Step 3): distribute the m sub-information systems to the n nodes; each node computes in parallel the attribute core of its assigned
sub-information systems until all sub-information systems have been processed; then merge the results of the nodes to obtain the attribute
core of the original information system;
Step 4): finally, send the attribute core of the original information system to each node to compute the attribute reductions in parallel, then merge and
integrate the attribute reduction results of the nodes to obtain the attribute reduction result of the whole information system.
2. The method for computing an information system attribute reduction in parallel based on MPI according to claim 1, characterized in that
reading the information system in the data preprocessing phase of step 1) specifically comprises: the information system, namely a decision table, is a four-tuple
IS = (U, A, V, f), where U represents the set of all objects in the problem domain, called the universe; A = C ∪ D is the attribute set, whose
subsets C and D represent the condition attribute set and the decision attribute set, respectively; Va is the value domain of attribute a; and f: U × A → V is an
information function that assigns an information value to each attribute of each object, i.e. for every a ∈ A and x ∈ U, f(x, a) ∈ Va.
3. The method for computing an information system attribute reduction in parallel based on MPI according to claim 2, characterized in that,
when discretizing the continuous data of the information system, a discretization method is chosen according to the characteristics of the data, including
equal-width interval, equal-frequency interval, attribute-importance-based, and clustering-based discretization methods.
4. The method for computing an information system attribute reduction in parallel based on MPI according to any one of claims 1-3, characterized
in that the equivalence-class partition of the information system in step 2) classifies the universe by the condition attributes using the equivalence relation; the
condition attribute set of the data set has the form {condition attribute 1, condition attribute 2, ..., condition attribute p}; an equivalence class contains consistent
objects and inconsistent objects: if both the condition attributes and the decision attribute agree, the objects are consistent; if the condition attributes agree but the decision
attributes differ, they are inconsistent objects.
5. The method for computing an information system attribute reduction in parallel based on MPI according to claim 4, characterized in that
step 3) distributes the m sub-information systems to the n nodes; the master-slave mode is adopted when distributing tasks: one node is selected as the master
node and the remaining nodes are slave nodes; the master node is responsible for allocating tasks to the slave nodes and receiving the task execution results
from the slave nodes; task distribution uses a dynamic allocation scheme, assigning either randomly or in order, so that nodes that process quickly are assigned
more tasks; each time, a sub-information system is assigned to an idle node, until all sub-information systems have been processed.
6. The method for computing an information system attribute reduction in parallel based on MPI according to claim 5, characterized in that
the parallel computation of attribute cores in step 3) creates a sub decision discernibility matrix on each node; if the sub decision discernibility matrix of a sub-information system
contains the decision attribute D, i.e. the sub-information system is an inconsistent object set, the attribute core of that sub-information system is ∅; otherwise, the
single condition attributes that determine the decision in the sub-information system are found, and the union of these single attributes is the attribute
core of the sub-information system.
7. The method for computing an information system attribute reduction in parallel based on MPI according to claim 2, characterized in that
computing the attribute reductions of the sub-information systems in parallel comprises: in each sub decision discernibility matrix, setting the value of every element containing a core attribute to the empty
set, thereby obtaining a new matrix; building the corresponding disjunctive logical expressions; combining all disjunctive logical expressions by
conjunction to obtain a conjunctive normal form; converting the conjunctive normal form into disjunctive normal form; and finally adding all core attributes to each
conjunct of the disjunctive normal form, obtaining the attribute reduction result of the sub-information system.
8. The method for computing an information system attribute reduction in parallel based on MPI according to claim 6 or 7, characterized in that
creating the sub decision discernibility matrix specifically comprises: for each partition class, finding, for each element in the class, the attributes that
distinguish it from the other elements.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201611259383.XA CN106598743B (en) | 2016-12-30 | 2016-12-30 | MPI-based method for parallel attribute reduction of information system |
Publications (2)
Publication Number | Publication Date |
---|---|
CN106598743A true CN106598743A (en) | 2017-04-26 |
CN106598743B CN106598743B (en) | 2020-06-16 |
Family
ID=58581486
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201611259383.XA Active CN106598743B (en) | 2016-12-30 | 2016-12-30 | MPI-based method for parallel attribute reduction of information system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106598743B (en) |
Cited By (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107544945A (en) * | 2017-08-31 | 2018-01-05 | 北京语言大学 | The distribution of decision table and change precision part reduction method |
CN107590566A (en) * | 2017-09-16 | 2018-01-16 | 西安科技大学 | The air duct air outlet regulation regulation obtaining method of the optimal dust field of fully mechanized workface |
CN107958266A (en) * | 2017-11-21 | 2018-04-24 | 重庆邮电大学 | It is a kind of based on MPI and be about to connection attribute carry out discretization method |
CN108345999A (en) * | 2018-02-09 | 2018-07-31 | 重庆科技学院 | A kind of manufacture system production process information reduction method based on Dynamic Programming |
CN108599151A (en) * | 2018-04-28 | 2018-09-28 | 国网山东省电力公司电力科学研究院 | A kind of Model in Reliability Evaluation of Power Systems dynamic parallel computational methods |
CN112749012A (en) * | 2021-01-15 | 2021-05-04 | 北京智芯微电子科技有限公司 | Data processing method, device and system of terminal equipment and storage medium |
CN112988871A (en) * | 2021-03-23 | 2021-06-18 | 重庆飞唐网景科技有限公司 | Information compression transmission method for MPI data interface in big data |
CN117111585A (en) * | 2023-09-08 | 2023-11-24 | 广东工业大学 | Numerical control machine tool health state prediction method based on tolerance sub-relation rough set |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20020101421A1 (en) * | 2001-01-31 | 2002-08-01 | Kim Pallister | Reducing detail in animated three-dimensional models |
CN102539132A (en) * | 2011-12-16 | 2012-07-04 | 西安交通大学 | Method used for evaluating double-shaft linkage performance of numerical control machine and based on rough set |
CN103646118A (en) * | 2013-12-27 | 2014-03-19 | 重庆绿色智能技术研究院 | Confidence dominance-based rough set analysis model and attribute reduction methods |
Patent Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20020101421A1 (en) * | 2001-01-31 | 2002-08-01 | Kim Pallister | Reducing detail in animated three-dimensional models |
CN102539132A (en) * | 2011-12-16 | 2012-07-04 | 西安交通大学 | Method used for evaluating double-shaft linkage performance of numerical control machine and based on rough set |
CN103646118A (en) * | 2013-12-27 | 2014-03-19 | 重庆绿色智能技术研究院 | Confidence dominance-based rough set analysis model and attribute reduction methods |
Non-Patent Citations (2)
Title |
---|
Zhu Wei et al., "MPI runtime parameter optimization based on attribute reduction", Computer and Modernization * |
Yang Chuanjian et al., "Attribute reduction algorithm for horizontally partitioned decision tables", Computer Engineering and Applications * |
Cited By (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107544945A (en) * | 2017-08-31 | 2018-01-05 | 北京语言大学 | The distribution of decision table and change precision part reduction method |
CN107590566A (en) * | 2017-09-16 | 2018-01-16 | 西安科技大学 | The air duct air outlet regulation regulation obtaining method of the optimal dust field of fully mechanized workface |
CN107958266A (en) * | 2017-11-21 | 2018-04-24 | 重庆邮电大学 | It is a kind of based on MPI and be about to connection attribute carry out discretization method |
CN108345999A (en) * | 2018-02-09 | 2018-07-31 | 重庆科技学院 | A kind of manufacture system production process information reduction method based on Dynamic Programming |
CN108345999B (en) * | 2018-02-09 | 2021-09-28 | 重庆科技学院 | Manufacturing system production process information reduction method based on dynamic programming |
CN108599151A (en) * | 2018-04-28 | 2018-09-28 | 国网山东省电力公司电力科学研究院 | A kind of Model in Reliability Evaluation of Power Systems dynamic parallel computational methods |
CN112749012A (en) * | 2021-01-15 | 2021-05-04 | 北京智芯微电子科技有限公司 | Data processing method, device and system of terminal equipment and storage medium |
CN112749012B (en) * | 2021-01-15 | 2024-05-28 | 北京智芯微电子科技有限公司 | Data processing method, device and system of terminal equipment and storage medium |
CN112988871A (en) * | 2021-03-23 | 2021-06-18 | 重庆飞唐网景科技有限公司 | Information compression transmission method for MPI data interface in big data |
CN117111585A (en) * | 2023-09-08 | 2023-11-24 | 广东工业大学 | Numerical control machine tool health state prediction method based on tolerance sub-relation rough set |
CN117111585B (en) * | 2023-09-08 | 2024-02-09 | 广东工业大学 | Numerical control machine tool health state prediction method based on tolerance sub-relation rough set |
Also Published As
Publication number | Publication date |
---|---|
CN106598743B (en) | 2020-06-16 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN106598743A (en) | Attribute reduction method for information system based on MPI parallel solving | |
US8381230B2 (en) | Message passing with queues and channels | |
CN104298713B (en) | A kind of picture retrieval method based on fuzzy clustering | |
WO2018166270A2 (en) | Index and direction vector combination-based multi-objective optimisation method and system | |
CN107329828A (en) | A kind of data flow programmed method and system towards CPU/GPU isomeric groups | |
CN108021449A (en) | One kind association journey implementation method, terminal device and storage medium | |
Reddy et al. | A review on density-based clustering algorithms for big data analysis | |
CN111738341A (en) | Distributed large-scale face clustering method and device | |
Abbasi et al. | Enhancing the performance of decision tree-based packet classification algorithms using CPU cluster | |
CN107958266A (en) | It is a kind of based on MPI and be about to connection attribute carry out discretization method | |
Duan et al. | Distributed in-memory vocabulary tree for real-time retrieval of big data images | |
US8543722B2 (en) | Message passing with queues and channels | |
Zhang et al. | Egraph: efficient concurrent GPU-based dynamic graph processing | |
CN108776814A (en) | A kind of Electric Power Communication Data resource parallelization clustering method | |
Volk et al. | Clustering uncertain data with possible worlds | |
CN108880871A (en) | A kind of wireless sensor network topology resource distribution method and device | |
CN104063230B (en) | The parallel reduction method of rough set based on MapReduce, apparatus and system | |
Ding et al. | Efficient probabilistic skyline query processing in mapreduce | |
Song et al. | Towards modeling large-scale data flows in a multidatacenter computing system with petri net | |
Atrushi et al. | Distributed Graph Processing in Cloud Computing: A Review of Large-Scale Graph Analytics | |
Schreiber et al. | SFC-based communication metadata encoding for adaptive mesh refinement | |
Lin et al. | A parallel Cop-Kmeans clustering algorithm based on MapReduce framework | |
CN103942235A (en) | Distributed computation system and method for large-scale data set cross comparison | |
Sakouhi et al. | Hammer lightweight graph partitioner based on graph data volumes | |
CN108875786B (en) | Optimization method of consistency problem of food data parallel computing based on Storm |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||