CN103051509A - Tree-structure-based initialization method - Google Patents

Info

Publication number
CN103051509A
Authority
CN
China
Prior art keywords
container
tree
node
monitoring process
ibe
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201210274685XA
Other languages
Chinese (zh)
Other versions
CN103051509B (en)
Inventor
胡凯
丁毅
赵祯龙
吴恺
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Yunnan Innovation Institute of Beihang University
Original Assignee
Beihang University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beihang University
Priority to CN201210274685.XA
Publication of CN103051509A
Application granted
Publication of CN103051509B
Legal status: Active
Anticipated expiration

Abstract

The invention provides an initialization method for a tree-shaped transmission architecture. The method comprises the following steps: 1. a front-end process FE polls and listens to collect the information of a monitoring process set BE over TCP/IP and generates the topology of a communication process set CP according to a topology configuration file; the BE sends its information to the FE; an auxiliary service process set AP is triggered according to a system configuration file and, together with the BE, sends the information to the FE; the internal tree topology is generated, and the BE is then attached to an internal tree terminal process set IBE, thereby establishing the tree-shaped transmission architecture for transmitting performance data; 2. the BE collects the performance data of parallel programs by means of an acquisition and monitoring technique and submits the performance data to the tree-shaped transmission architecture in real time; 3. the CP receives and forwards the performance data; 4. the FE receives the performance data from the tree-shaped transmission architecture, stores it in a data storage carrier, and displays it visually to support real-time and/or post-mortem performance analysis; and 5. the tree-shaped transmission architecture is destroyed automatically, ending the online data transmission and collection process.

Description

An initialization method based on a tree-shaped architecture
Technical field
The present invention relates to an initialization construction method for a tree-shaped transmission architecture, and in particular to an initialization construction method for a tree-shaped transmission architecture used to aggregate, in real time, the performance data of different nodes.
Background art
To meet the demands of large-scale, long-running scientific computing and to narrow the widening gap between software and hardware in high-performance computing, traditional static and post-mortem performance monitoring and analysis techniques are gradually being replaced by online, scalable monitoring methods, which improve the performance of parallel programs and make full use of existing computing resources. Likewise, the traditional 1-n architecture is gradually being replaced by tree-shaped transmission architectures because it lacks scalability. A typical architecture in the performance monitoring field is the following model:
Figure BDA00001970563800011
The prototype implementation of this model, the MRNet communication library, is widely used in this field. The architecture is process-based and can be built in parallel: the root node creates each of its child nodes according to the system configuration file; each child node then connects to its parent and obtains the configuration information of its own subtree, so that the communication tree is generated in parallel. A performance monitoring tool that wants to use this tree-shaped transmission architecture must use the attach mode: the internal communication tree is built first, and the monitoring processes are then attached to the leaf nodes of the internal communication tree, which initializes the whole tree-shaped transmission structure. The precondition is that each application process knows in advance the relevant information (such as IP address and port) of the internal leaf node to which it must attach. Although MRNet provides flexible interfaces, information identification, internal process placement, auxiliary process triggering and so on must still be designed and implemented by hand. This patent addresses these problems; it is applicable to typical cluster systems and can be used for data aggregation by online performance monitoring tools built on them.
Previous research on the initialization and construction of transmission architectures falls into two classes: 1) process launch methods not based on MRNet, including LaunchMON, LIBI and TDP; 2) MRNet-based performance tools, such as TAU, Extrae, DPCL and Open|SpeedShop.
The LaunchMON, LIBI and TDP projects share the goal of building a general framework for launching tool daemons and application processes, but their dependence on the resource manager (RM) increases complexity. In addition, for cases where the LIBI method is unavailable, another strategy has been proposed: a process agent in each node manages the process instances (it is responsible for all processes in that node) and the node-level communication service. However, the agent itself adds overhead; an agent failure affects the connections of many application processes and lengthens error recovery time.
The transmission network initialization methods of typical MRNet-based performance tools are described below. DPCL and MRNet/XT focus on modifying the original network construction method to suit their specific system architectures, BG/L and Cray XT respectively. The MRNet transmission network construction mechanism of TAU requires the user to add TAU_ONLINE_DUMP() calls to the source code to define the aggregation points, usually at synchronization points, iteration points and the like. This places higher demands on the user and increases the user's burden. It also does not apply when no source code is available, and it can only aggregate and analyze profile data, which limits its range of application. TAU itself builds the transmission network through the MPI mechanism, which requires every node participating in transmission (both aggregation nodes and computing nodes) to start the mpd program. First, MPI is used to start the probe tool with the total number of processes planned in advance (including the aggregation process, the communication processes and the monitoring processes); the tool records the nodes and the processes running on them (stored in a file). Next, MPI starts probediff with the number of monitoring (computing) processes; this tool records the nodes and processes used this time and subtracts the process information used by the monitoring processes from the node and process table recorded first. The remaining nodes and processes are used to build the MRNet internal transmission network, the monitoring processes are then attached to the leaf nodes of the internal tree, and the transmission network is complete. This construction method does not clearly separate the aggregation process from the computing processes at the node level, which hinders system maintenance, rational resource use and changes to the system configuration. For example, as analysis tasks grow, a single node may no longer suffice, and the analysis node must be extended to a distributed cluster (distributed database) or to a newer distributed processing framework such as Hadoop; because monitoring processes and aggregation processes are intertwined, this is complicated to implement under the current mechanism, which lacks flexibility. In addition, every step of this mechanism saves information in files, which is costly, and it does not consider a placement strategy for the internal transmission processes. Other tools, such as Open|SpeedShop and Extrae, also use MRNet as their transmission architecture, but their construction methods lack generality. For example, the transmission network initialization mechanism in Extrae only supports placing the intermediate communication processes on additional host nodes.
Summary of the invention
The present invention introduces an initialization construction mechanism for a transmission architecture and implements a prototype system to help developers build online performance tools quickly. The mechanism has the following characteristics.
1) The mechanism is simple in design and general across typical high-performance cluster systems; it can be used with different instrumentation methods (source level, binary level, library level, etc.) for data acquisition and with different measurement methods, including profiling and tracing;
2) it provides allocation strategies for the internal processes and an attach method for the monitoring processes BE, so that expensive computing resources are used rationally;
3) auxiliary service processes can be triggered along with program monitoring, so that comprehensive information can be collected to assist the performance analysis of parallel applications;
4) the mechanism is flexibly configurable, transparent to the user and efficient.
The present invention aims to provide an initialization construction method for a tree-shaped transmission architecture of a cluster system. The tree-shaped transmission architecture logically comprises an aggregation node, communication nodes and computing nodes, wherein the computing nodes carry the computing tasks and the monitoring processes also run on them, and the aggregation node is used to aggregate performance data; the aggregation node comprises a front-end process FE, which is responsible for data aggregation and storage; the communication nodes of the tree-shaped transmission architecture comprise the communication process set, which is responsible for data aggregation and filtering; characterized in that the initialization method comprises the following steps:
Step 1: the front-end process FE polls and listens, collecting the information of the monitoring process set BE over the TCP/IP protocol, and generates the topology of the communication process set CP according to the topology configuration file; the monitoring process set BE sends said information to the front-end process FE; the auxiliary service process set AP is triggered according to the system configuration file and, together with the monitoring process set BE, sends said information to the front-end process FE; the internal tree topology is generated, and the monitoring process set BE is then attached to the internal tree terminal process set IBE, thereby establishing the tree-shaped transmission architecture for transmitting performance data;
Step 2: the monitoring process set BE collects the performance data of the parallel program through the corresponding acquisition and monitoring technique and submits it to the tree-shaped transmission architecture in real time;
Step 3: the communication process set CP receives and forwards the performance data;
Step 4: the front-end process FE receives the performance data from the tree-shaped transmission architecture, stores it in a data storage carrier, and displays it visually to support real-time and/or post-mortem performance analysis;
Step 5: the tree-shaped transmission architecture is destroyed automatically, ending the online data transmission and collection process.
The initialization construction method of the present invention is characterized in that, when communication nodes and computing nodes are not distinguished, the topology configuration file is generated by the placement strategy of the internal communication processes, and the topology of the communication process set CP is then generated from it; the placement strategy of the internal communication processes is implemented by the best-effort average strategy or the comprehensive allocation strategy, and the attachment of the monitoring process set BE to the internal tree terminal process set IBE is implemented by the attach strategy.
The initialization construction method of the present invention further comprises the following, wherein the goal of the best-effort average strategy is to keep the total process count nt distributed as evenly as possible among the hosts, and the strategy comprises the following steps:
Step 1: first empty the TS container;
Step 2: divide sum by m to obtain the quotient quo and the remainder rem;
Step 3: assign quo to each element $TS_i$ ($i \in [0, m-1]$) of the TS container;
Step 4: if rem is not 0, add 1 to each of the first rem elements of TS;
Step 5: compare the elements at corresponding positions in the OS and TS containers; wherever the OS element is greater than the TS element, replace the TS element at that position with the OS element, put it into the RS container, and subtract its value from sum;
Step 6: if the RS container is not empty, repeat step 7 until the size of the RS container no longer changes; if the RS container is empty, or its size no longer changes after step 7, go directly to step 8;
Step 7: divide sum by the difference between m and the number of elements in the RS container to obtain the quotient quo and the remainder rem; traverse the TS container and assign quo to each element that is not in the RS container; if rem is not 0, add 1 to each of the rem smallest elements of the TS container; then compare the OS and TS containers again, and wherever an OS element is still greater than the TS element at the corresponding position, replace that TS element with the OS element, put it into the RS container, and subtract its value from sum;
Step 8: compare the TS and OS containers; wherever the TS element is greater than the OS element at the corresponding position, store the difference (TS element minus OS element) at that position in the AS container;
Here the OS container holds the list of CN together with the number of BE on each node; the AS container gives the number of cp that each cn receives; the TS container records intermediate allocation results; the RS container stores excluded results; n is the number of cp; m is the number of cn; the initial value of sum is the total number of be and cp; a container is a space for storing the related data; CN is the computing node set, cn a computing node, cp a communication process and be a monitoring process.
The initialization construction method of the present invention further comprises the following, wherein the comprehensive allocation strategy comprises the steps:
Step 1: establish the factor set $U = [U_1, U_2, U_3, \ldots, U_{nf}]$, which represents nf evaluation factors, here set to characteristic indicators of the host;
Step 2: establish the judgment set $V = [\upsilon_1, \upsilon_2, \upsilon_3, \ldots, \upsilon_{mf}]$, which represents mf judgments, here set to judgments that a person can readily recognize;
Step 3: to evaluate $U_i$ ($i \in [1, nf]$), let its degree of membership in $\upsilon_j$ ($j \in [1, mf]$) be $r_{ij}$; the fuzzy set for this factor is then $r_i = [r_{i1}, r_{i2}, r_{i3}, \ldots, r_{i\,mf}]$, that is, $f(U_i) = r_i$. After the factor set has been evaluated, the fuzzy evaluation matrix is obtained as follows:
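$$R = \begin{bmatrix} r_1 \\ r_2 \\ \vdots \\ r_{nf} \end{bmatrix} = \begin{bmatrix} r_{11} & r_{12} & \cdots & r_{1\,mf} \\ r_{21} & r_{22} & \cdots & r_{2\,mf} \\ \vdots & \vdots & \ddots & \vdots \\ r_{nf\,1} & r_{nf\,2} & \cdots & r_{nf\,mf} \end{bmatrix}$$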
Step 4: establish a weight set that indicates the importance of the evaluation factors; the weight set is obtained from expert assessment or prior experience and is expressed as $A = [a_1, a_2, \ldots, a_{nf}]$; A and R are then multiplied to obtain the matrix B:
$$B = A \cdot R = [b_1, b_2, \ldots, b_{mf}];$$
Step 5: normalize B to obtain:
$$C = [c_1, c_2, \ldots, c_{mf}], \qquad c_i = \frac{b_i}{\sum_{j=1}^{mf} b_j}, \qquad \sum_{i=1}^{mf} c_i = 1;$$
The resulting score is expressed as $C V'$ with $V' = [k_1, k_2, \ldots, k_{mf}]^T$, where $V'$ is the numerical representation of $V^T$, $V^T$ is the transpose of the vector $V$, and $k_i$ ($i \in [1, mf]$) is the numerical representation of the judgment $\upsilon_i$ in $V$;
Step 6: finally, the score of each node in the computing node set CN is obtained, the proportions among the nodes are derived from the scores, and the communication process set CP is allocated according to these proportions.
The initialization construction method of the present invention further comprises the following, wherein the attach strategy comprises the steps:
Step 1: traverse the available monitoring process set BE and the internal tree terminal process set IBE and match them, wherein the fanout (branching factor) of the internal tree terminal process set IBE is computed by dividing the number of monitoring processes be by the number of internal tree terminal processes ibe;
Step 2: to keep the attach method for the monitoring process set BE adaptable, when the number of monitoring processes be is an integer multiple of the number of internal tree terminal processes ibe, the quotient is the fanout; if the division leaves a remainder, the fanout is the quotient plus 1, which guarantees that the numbers of ibe and be need not depend on each other and that every monitoring process be can be attached to the transmission architecture;
Step 3: the matching strategy for internal tree terminal processes ibe and monitoring processes be gives matching priority to processes located on the same host; once the monitoring process set BE has been attached to the internal tree terminal process set IBE, the transmission network is successfully built.
The initialization construction method of the present invention further comprises an attach method for the auxiliary service process set AP, which can be implemented through a trigger mechanism in one of the processes in the node.
In the initialization construction method of the present invention, the aggregation node further comprises an aggregated-data storage module and a visualization and analysis module.
Brief description of the drawings
Fig. 1 is the process description based on a state machine;
Fig. 2a is the polling and listening stage in the construction of the transmission architecture;
Fig. 2b is the information collection stage in the construction of the transmission architecture;
Fig. 2c is the generation stage of the internal communication network in the construction of the transmission architecture;
Fig. 2d is the transmission architecture construction stage (the initialization construction stage of the transmission network) in the construction of the transmission architecture;
Fig. 3 is the start-up process of AP;
Fig. 4 is a diagram of the data transmission process.
Detailed description of the embodiments
This part describes the initialization construction mechanism of the transmission architecture in detail. It first defines some terms, then describes the initialization construction process of the transmission architecture, then introduces the placement strategies and the attach method for the internal communication processes, and finally introduces the handling of the auxiliary service processes.
1 Term definitions
The transmission architecture model is described formally below; nodes and processes are divided into different roles, each responsible for its own tasks.
The architecture consists of an aggregation node (analysis node), communication nodes and computing nodes, where the communication nodes may be separate from the computing nodes or not distinguished from them. The user can choose according to need through configuration information and parameters. The definitions are as follows:
Definition 1: Computing node
A computing node is a node in the cluster that carries computing tasks; the set of computing nodes is denoted CN:
$$CN = \{cn_1, cn_2, cn_3, \ldots, cn_m\}$$
which indicates that m computing nodes take part in the computation (transmission).
Definition 2: Aggregation node
An aggregation node is a node used to aggregate performance data; the set is denoted SN. It contains the aggregation process, some configuration files, output log files and so on. The aggregated-data storage module and the visualization and analysis module may reside on the aggregation node or on other nodes, depending on the specific system architecture.
Definition 3: Front-end process
The front-end process is the process located on the aggregation node SN that is responsible for data aggregation and storage. It is denoted FE, so $FE \in SN$. The FE here is in fact not the same process as the FE in the MRNet model; however, the two reside on the same host and have similar functions, such as receiving and processing data, so they are treated as the same role here without explicit distinction. The front-end process starts the system on SN and sets the relevant parameters, such as the total number of processes (nt) and the number of monitoring processes (nb).
Definition 4: Communication process
The communication processes are the set of processes that are responsible for data aggregation and filtering and are located on the communication nodes of the overlay tree. They can be distributed over SN and CN or over additional nodes as required; here they are assumed to be distributed over SN and CN. The set is expressed as
$$CP = \{cp_1, cp_2, \ldots, cp_n\} \quad (cp_i \in cn_j \ \text{or} \ SN,\ 1 \le i \le n,\ 1 \le j \le m)$$
They are internal communication processes whose purpose is to handle transmission tasks in a distributed way and thereby improve the scalability of the performance tool.
Definition 5: Internal tree terminal process
The internal tree terminal processes are the leaf nodes of the internal tree network; the set is expressed as
$$IBE = \{ibe_1, ibe_2, ibe_3, \ldots, ibe_l\}$$
The monitoring processes are attached to them to build the transmission architecture.
Definition 6: Monitoring process
The monitoring processes (terminal processes) are the set of processes responsible for monitoring the scientific application; each is itself an application process that has been instrumented or wrapped by a wrapper library. The set is expressed as
$$BE = \{be_1, be_2, be_3, \ldots, be_{nb}\} \quad (be_k \in cn_j,\ 1 \le k \le nb,\ 1 \le j \le m)$$
nb is the number of worker processes; it is specified to and started by MPI, with the parallel program passed as its argument.
From the above definitions it follows that
$$n + nb + 1 = nt$$
Definition 7: Auxiliary service process
The auxiliary service processes are the set of processes that provide other auxiliary functions (such as reading system state information). The set is expressed as
$$AP = \{ap_1, ap_2, ap_3, \ldots, ap_{na}\}$$
In this model, each computing node is assumed to have one service process. Once AP is started, it is treated like BE and takes part in data transmission. In this case the number of monitoring processes equals nb + na, and the relation between the process counts becomes
$$n + nb + na + 1 = nt$$
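As a quick sanity check of the two counting relations, the following worked instance uses assumed values (n = 3, nb = 8 and na = 2 are illustrative only and do not come from the source):

```latex
\begin{aligned}
\text{without AP:}\quad nt &= n + nb + 1 = 3 + 8 + 1 = 12,\\
\text{with AP:}\quad nt &= n + nb + na + 1 = 3 + 8 + 2 + 1 = 14.
\end{aligned}
```

The constant 1 presumably accounts for the single front-end process FE.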
2 Process description
In this part, a state machine model is used to describe the data transmission process, as shown in Fig. 1; it is also the reference framework of the present invention. $S_i$ denotes a state, $(S_i \to S_j)$ a state transition, and $C_i$ the trigger event of a state transition, that is, the task that must be completed before execution can proceed (for simplicity, these are not marked in the figure). FE and BE start at $S_0$ and $S_5$ respectively and end at $S_{11}$. First, FE polls and listens to collect the information of BE, which is used to generate the topology of CP; meanwhile, BE sends its information to FE. After the topology has been generated, FE is responsible for attaching BE to IBE by means of certain strategies. This process includes reliability mechanisms such as $C_{Rf}$ and $C_{Rb}$. Once BE knows its connection information, MRNet can instantiate the transmission architecture through the start-up procedure of its own attach mode. When the monitored parallel program begins to execute, all application processes are connected to FE through the transmission architecture for the collection and storage of performance data.
So that it can be used widely on typical commodity cluster systems (x86), the communication library is implemented with sockets over the TCP/IP protocol and uses several strategies, such as handshaking and retransmission, to guarantee reliability. Figs. 2a-2d describe how the transmission architecture is initialized while the parallel program starts. Although in the information exchange stage FE exchanges information with each BE, which is a 1-n structure, the amount of information exchanged is small, so practical requirements are still met. Performance can be improved by handling multiple clients with the "select" call; in addition, the connect and send functions are embedded in the application or in the wrapper library to reduce extra process overhead. The mechanism also supports attaching monitoring to an application that is already running.
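The information exchange stage could be sketched as follows. This is a minimal Python sketch, not the patent's implementation: the function names `collect_be_info` and `report_be_info`, the `"host rank\n"` message format and the `ACK` reply are all assumptions made for illustration. It shows FE multiplexing BE connections with `select`, plus a simple acknowledge-and-retry handshake on the BE side.

```python
import select
import socket


def collect_be_info(listener, nb, timeout=5.0):
    """FE side (hypothetical helper): gather one report from each of nb BEs.

    Returns {host: number of BEs on that host}, i.e. the data that the
    OS container is later built from.
    """
    buffers = {}   # open BE connection -> bytes received so far
    seen = set()   # (host, rank) pairs already reported; absorbs retransmissions
    while len(seen) < nb:
        readable, _, _ = select.select([listener, *buffers], [], [], timeout)
        if not readable:
            raise TimeoutError("some BEs did not report in time")
        for sock in readable:
            if sock is listener:
                conn, _addr = listener.accept()
                buffers[conn] = b""
                continue
            data = sock.recv(1024)
            buffers[sock] += data
            if data and b"\n" not in buffers[sock]:
                continue  # message incomplete, wait for more bytes
            line = buffers.pop(sock).split(b"\n")[0].decode()
            parts = line.split()
            if len(parts) == 2:              # assumed format: "<host> <rank>"
                seen.add((parts[0], parts[1]))
                sock.sendall(b"ACK\n")       # handshake: confirm receipt
            sock.close()
    for conn in buffers:                     # drop any leftover connections
        conn.close()
    per_host = {}
    for host, _rank in seen:
        per_host[host] = per_host.get(host, 0) + 1
    return per_host


def report_be_info(fe_addr, host, rank, retries=3):
    """BE side (hypothetical helper): send info to FE, retrying until ACKed."""
    for _attempt in range(retries):
        try:
            with socket.create_connection(fe_addr, timeout=2.0) as sock:
                sock.sendall(f"{host} {rank}\n".encode())
                if sock.recv(16).startswith(b"ACK"):
                    return True
        except OSError:
            pass  # connection failed or timed out: retransmit
    return False
```

Deduplicating on `(host, rank)` is one way to make retransmission harmless; a real tool would also need to carry whatever connection details (such as IP address and port) the later attach step requires.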
3 Allocation strategies for the internal communication processes
In Fig. 2b, FE receives the information of all BE and thus knows their distribution over the nodes. The next task is to generate the topology of CP. The traditional approach places the CP on additional nodes other than the hosts running the application processes, which avoids perturbing the behavior of the monitored run. In many cases, however, using extra nodes for CP is impractical, mainly because high-performance computing resources are expensive, scarce and shared by many users. A scalable transmission network must make full use of limited resources, which creates a practical need to allocate CP and BE cooperatively during initialization. The present invention therefore proposes two strategies for different scenarios, described below:
1) Best-effort average allocation strategy
The goal of this strategy is to keep nt (the total process count) distributed as evenly as possible among the hosts (whether SN takes part in the allocation depends on whether it takes part in the computation). In the description below, the OS (Original Set) container is produced in the information collection stage shown in Fig. 2b and holds the list of CN together with the number of BE on each node; the AS (Allocated Set) container gives the number of cp that each cn receives; the TS (Temp Set) container records intermediate allocation results; the RS (Record Set) container stores excluded results (which no longer take part in the quotient computation); n is the number of cp; m is the number of cn; the initial value of sum is the total number of be and cp; a container is a space for storing the related data; CN is the computing node set, cn a computing node, cp a communication process and be a monitoring process. The steps of strategy 1 are as follows:
Step 1: first empty the TS container;
Step 2: divide sum by m to obtain the quotient quo and the remainder rem;
Step 3: assign quo to each element $TS_i$ ($i \in [0, m-1]$) of the TS container;
Step 4: if rem is not 0, add 1 to each of the first rem elements of TS;
Step 5: compare the elements at corresponding positions in the OS and TS containers; wherever the OS element is greater than the TS element, replace the TS element at that position with the OS element, put it into the RS container, and subtract its value from sum;
Step 6: if the RS container is not empty, repeat step 7 until the size of the RS container no longer changes; if the RS container is empty, or its size no longer changes after step 7, go directly to step 8;
Step 7: divide sum by the difference between m and the number of elements in the RS container to obtain the quotient quo and the remainder rem; traverse the TS container and assign quo to each element that is not in the RS container; if rem is not 0, add 1 to each of the rem smallest elements of the TS container; then compare the OS and TS containers again, and wherever an OS element is still greater than the TS element at the corresponding position, replace that TS element with the OS element, put it into the RS container, and subtract its value from sum;
Step 8: compare the TS and OS containers; wherever the TS element is greater than the OS element at the corresponding position, store the difference (TS element minus OS element) at that position in the AS container.
The strategy is also given as an algorithm listing in the following figure:
[Figure BDA00001970563800081: algorithm listing of the best-effort average allocation strategy]
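The eight steps of the best-effort average strategy can be sketched in Python as follows. This is an illustrative reading of the steps, not the algorithm listing from the figure: the function name `best_effort_average` and the list-based representation of the OS, TS, RS and AS containers are assumptions. Where step 7 asks for the rem smallest TS elements, all non-excluded elements are equal at that point, so this sketch simply takes the first rem of them.

```python
def best_effort_average(os_counts, n_cp):
    """Return the AS container: number of cp to place on each host.

    os_counts -- OS container: number of be already on each host (length m)
    n_cp      -- n, the number of communication processes to place
    """
    m = len(os_counts)
    if m == 0 or n_cp < 0:
        raise ValueError("need at least one host and a non-negative n")
    total = sum(os_counts) + n_cp            # "sum": all be plus all cp

    # Steps 1-4: start from an even split, first rem hosts get one extra.
    quo, rem = divmod(total, m)
    ts = [quo + 1 if i < rem else quo for i in range(m)]
    excluded = set()                         # RS container (host indices)

    def exclude_overloaded():
        """Steps 5 / 7b: pin hosts whose be count already exceeds their share."""
        nonlocal total
        changed = False
        for i in range(m):
            if i not in excluded and os_counts[i] > ts[i]:
                ts[i] = os_counts[i]
                excluded.add(i)
                total -= os_counts[i]
                changed = True
        return changed

    # Steps 5-7: redistribute among the remaining hosts until RS is stable.
    changed = exclude_overloaded()
    while changed:
        free = [i for i in range(m) if i not in excluded]
        quo, rem = divmod(total, len(free))
        for pos, i in enumerate(free):
            ts[i] = quo + 1 if pos < rem else quo
        changed = exclude_overloaded()

    # Step 8: the cp count per host is the surplus of TS over OS.
    return [max(ts[i] - os_counts[i], 0) for i in range(m)]
```

For example, with be counts `[6, 1, 1]` and four cp to place, the first host already exceeds the even share of 4 and is excluded, and the call returns `[0, 2, 2]`.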
2) Comprehensive allocation strategy
Features of current high-performance computing cluster systems, such as resource sharing and system heterogeneity (the nodes of a cluster may differ in CPU speed, memory size and so on), mean that the computing resources available at the node level differ from node to node. For this case, the present invention proposes another allocation strategy based on fuzzy comprehensive evaluation, whose idea is described below.
Establish the factor set $U = [U_1, U_2, U_3, \ldots, U_{nf}]$ to represent nf evaluation factors; these factors can be set to the number of CPU cores, CPU utilization, memory size and so on. Establish the judgment set $V = [\upsilon_1, \upsilon_2, \upsilon_3, \ldots, \upsilon_{mf}]$ to represent mf judgments; it can be configured as, for example, "very poor, poor, fairly poor, fairly good, good, very good". To evaluate $U_i$ ($i \in [1, nf]$), let its degree of membership in $\upsilon_j$ ($j \in [1, mf]$) be $r_{ij}$; the fuzzy set for this factor can then be expressed as $r_i = [r_{i1}, r_{i2}, r_{i3}, \ldots, r_{i\,mf}]$, that is, $f(U_i) = r_i$. After the factor set U has been evaluated, the fuzzy evaluation matrix is obtained as follows.
R = r 1 r 2 . . . r nf = R 11 R 12 . . . R 1 mf r 21 r 22 . . . r 2 mf . . . . . . . . . . . . r nf 1 r nf 2 . . . r nfmf
In addition, need to set up a weight sets and identify the significance level of these evaluation factors (weight sets can be obtained by expert assessment and evaluation or previous experience), be expressed as A=[a 1, a 2..., a Nf]. and then A and R obtain matrix B by the dot product operation
B=A·R=[b 1,b 2,...,b mf]
Then, B need to be carried out normalization, obtain
C = [ c 1 , c 2 , . . . , c mf ] ( c i = b i / Σ j = 1 mf b j and Σ i = 1 mf c i = 1 )
Obatained score can be expressed as CV ' (V '=[k 1, k 2..., k Mf] T), wherein V ' is V TThe expression that quantizes, V TThe transposition that represents vectorial V, k i(i ∈ [1...mf]) is to pass judgment on υ among the V iThe expression that quantizes of (i ∈ [1...mf]);
Finally, the score of each node in CN is obtained, which gives the proportional relationship among the nodes; the CPs can then be distributed according to this proportion.
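The scoring and proportional distribution described above can be sketched as follows. The function names are hypothetical, and the source does not say how fractional shares of CPs are rounded; the largest-remainder rounding used in `allocate_by_score` is an assumption made only to keep the total equal to the number of CPs.

```python
def fuzzy_score(weights, membership, grades):
    """Score one node: B = A·R, normalise B to C, then return C·V'.

    weights    -- A, one weight per evaluation factor (length nf)
    membership -- R, nf rows of mf membership degrees r_ij
    grades     -- V', the quantified value k_j of each judgment (length mf)
    """
    nf, mf = len(weights), len(grades)
    b = [sum(weights[i] * membership[i][j] for i in range(nf)) for j in range(mf)]
    total = sum(b)
    c = [bj / total for bj in b]          # normalisation, so that sum(c) == 1
    return sum(cj * kj for cj, kj in zip(c, grades))


def allocate_by_score(scores, n_cp):
    """Split n_cp communication processes among nodes in proportion to scores.

    Rounding uses the largest-remainder rule (an assumption, not from the source).
    """
    total = sum(scores)
    shares = [n_cp * s / total for s in scores]
    counts = [int(x) for x in shares]     # whole part of each share
    leftover = n_cp - sum(counts)
    # Hand the remaining CPs to the nodes with the largest fractional parts.
    by_fraction = sorted(range(len(scores)),
                         key=lambda i: shares[i] - counts[i], reverse=True)
    for i in by_fraction[:leftover]:
        counts[i] += 1
    return counts
```

A node whose score is three times that of another would receive roughly three times as many CPs under this sketch.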
4. BE attachment strategy
After the topology of the CPs has been generated, the BEs must be attached to the IBE (the set of leaf nodes of the internal communication tree). This is done by traversing the available BEs and IBEs and matching them. The fanout (number of branches) of an IBE is obtained by dividing the number of be by the number of ibe. To keep the attachment method general, when the number of be is an integer multiple of the number of ibe, the quotient is the fanout; if the division leaves a remainder, the fanout is the quotient plus 1. This ensures that the numbers of ibe and be do not depend on each other and that every be can be attached to the transmission architecture. In addition, the matching strategy gives priority to pairing an ibe and a be that run on the same host, which improves transmission efficiency. Once the BEs are attached to the IBEs, the transmission network has been built successfully.
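The fanout rule and the same-host matching priority can be sketched as follows. The function name and the representation of processes by host name are hypothetical, and so is the two-pass order (same-host matches first, then any ibe with spare capacity), since the source only states that same-host processes have priority.

```python
def attach_be_to_ibe(be_hosts, ibe_hosts):
    """Return (fanout, mapping of be index -> ibe index).

    be_hosts[i] / ibe_hosts[k] are the host names on which each be / ibe runs.
    """
    if not ibe_hosts:
        raise ValueError("at least one ibe is required")
    # Quotient if the division is exact, quotient + 1 otherwise (ceiling).
    fanout = -(-len(be_hosts) // len(ibe_hosts))
    load = [0] * len(ibe_hosts)
    mapping = {}

    # First pass: prefer an ibe on the same host as the be.
    for b, host in enumerate(be_hosts):
        for k, ibe_host in enumerate(ibe_hosts):
            if ibe_host == host and load[k] < fanout:
                mapping[b] = k
                load[k] += 1
                break

    # Second pass: place the remaining be on any ibe with spare capacity.
    # fanout * len(ibe_hosts) >= len(be_hosts), so a free slot always exists.
    for b in range(len(be_hosts)):
        if b not in mapping:
            k = next(k for k in range(len(ibe_hosts)) if load[k] < fanout)
            mapping[b] = k
            load[k] += 1

    return fanout, mapping
```

With five be and two ibe the division leaves a remainder, so the fanout is 3 and every be finds a parent.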
5. AP startup mechanism
Current performance monitoring and analysis techniques often need to gather and combine information from several sources in order to understand program behavior fully and find performance problems. For example, when collecting user-level application performance data, it may also be necessary to obtain system state information. Each node then runs one additional process that monitors system performance, so that the operating condition of each node can be understood, such as whether its task load is too heavy or whether it runs too many processes. This information helps the user understand system behavior as a whole, for instance whether tasks are distributed reasonably. The original construction method cannot meet this need and is not flexible enough, so the present invention adds an AP startup mechanism. As shown in Fig. 3, the FE reads the information about CN and BE and sends an instruction to a certain be in CN to start the AP service. If this succeeds, each cn returns the information related to the AP process. If it fails, the FE waits for a certain time interval and then selects another be in CN to start the AP, until a success message is received (a reliable retry trigger mechanism). If APs are to be run, this procedure must be inserted between 2b and 2c in the figure above. Once started, an AP plays the same role as a BE in performance monitoring and data transmission. A simple and general way to collect system state is to read performance data from the proc pseudo-file system provided by the Linux operating system. The proc information mainly includes CPU utilization, memory usage, and the packets (bytes) sent and received. This information helps developers allocate resources reasonably and, combined with the collected performance data, understand the program in depth and find problems. The AP can of course also be used to start other auxiliary service processes.
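The retry behavior of the AP startup mechanism can be sketched as follows. Everything here is hypothetical: `try_start` stands in for whatever call the FE uses to send the start instruction to a be and returns the AP information or `None` on failure, and `max_rounds` is an added safety bound, whereas the text above retries until a success message is received.

```python
import time


def start_ap(candidates, try_start, retry_interval=1.0, max_rounds=3):
    """Try to start an AP through one be after another.

    candidates     -- the be processes in CN that may be asked to start the AP
    try_start      -- hypothetical callable: be -> AP info, or None on failure
    retry_interval -- seconds to wait after a failed attempt
    max_rounds     -- how many times to cycle through all candidates
    Returns (be, ap_info) on success, or None if every attempt failed.
    """
    for _ in range(max_rounds):
        for be in candidates:
            info = try_start(be)
            if info is not None:
                return be, info
            # Failure: wait, then move on to another be.
            time.sleep(retry_interval)
    return None
```

Bounding the number of rounds keeps the FE from blocking forever when no be can start the service; a caller that wants the unbounded behavior can loop around this function.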
After the network has been built successfully, the mature transport protocol mechanism of MRNet can be used to aggregate the performance data; the main process is shown in Fig. 4.
The scope of rights claimed by the present invention is not limited to the specific forms disclosed, but covers all improvements, equivalents, and any other content that falls within the spirit and scope of the present invention.

Claims (7)

1. An initialization and construction method for a tree-shaped transmission architecture of a cluster system, wherein the tree-shaped transmission architecture logically comprises an aggregation node, communication nodes and computing nodes; the computing nodes carry out the computing tasks, and the monitoring processes are also attached to them; the aggregation node is used to aggregate performance data and comprises a front-end process FE, which is responsible for data aggregation and storage; the communication nodes of the tree-shaped transmission architecture comprise a communication process set, which is responsible for data aggregation and filtering; characterized in that the initialization method comprises the following steps:
Step 1: the front-end process FE listens by polling and uses the TCP/IP protocol to collect the information of the monitoring process set BE, and generates the topology of the communication process set CP according to a topology configuration file; the monitoring process set BE sends said information to the front-end process FE; according to a system configuration file, an auxiliary service process set AP is triggered to assist the monitoring process set BE in sending said information to the front-end process FE; an internal tree topology is generated, and the monitoring process set BE is then attached to the internal tree terminal process set IBE, thereby establishing the tree-shaped transmission architecture for transmitting performance data;
Step 2: the monitoring process set BE collects the performance data of the parallel program using the corresponding collection and monitoring technology, and submits it to the tree-shaped transmission architecture in real time;
Step 3: the communication process set CP receives and forwards said performance data;
Step 4: the front-end process FE receives said performance data from the tree-shaped transmission architecture, stores it in a data storage carrier, and presents it visually, so as to realize real-time and/or after-the-fact performance analysis;
Step 5: the tree-shaped transmission architecture is destroyed automatically, and the online data transmission and collection process ends.
2. The initialization and construction method for a tree-shaped transmission architecture as claimed in claim 1, characterized in that, where communication nodes and computing nodes are not distinguished, the topology configuration file is generated by a placement strategy for the internal communication processes, and the topology of the communication process set CP is then generated from it; the placement strategy for the internal communication processes is implemented by a best-effort averaging strategy or a comprehensive allocation strategy, and the attachment of the monitoring process set BE to the internal tree terminal process set IBE is implemented by an attachment strategy.
3. The initialization and construction method for a tree-shaped transmission architecture as claimed in claim 2, wherein the goal of the best-effort averaging strategy is to keep the total number of processes nt evenly distributed among the hosts, and the strategy comprises the following steps:
Step 1: first empty the TS container;
Step 2: divide sum by m to obtain the quotient quo and the remainder rem;
Step 3: assign the quotient quo to each element $TS_i$ ($i \in [0 \ldots m-1]$) of the TS container;
Step 4: if rem is not 0, add 1 to each of the first rem elements of TS in turn;
Step 5: compare the elements at corresponding positions in the OS and TS containers; if an element of the OS container is greater than the corresponding element of the TS container, replace that TS element with the OS element, put it into the RS container, and subtract its value from sum;
Step 6: if the RS container is not empty, execute step 7 repeatedly until the size of the RS container no longer changes; if instead the RS container is empty, or its size no longer changes after step 7, proceed directly to step 8;
Step 7: divide sum by the difference between m and the number of elements in the RS container to obtain the quotient quo and the remainder rem; traverse the elements of the TS container and assign quo to every element that is not in the RS container; if rem is not 0, select the rem smallest elements in the TS container and increase each of them by 1; then compare the elements of the OS and TS containers again, and if an element of the OS container is still greater than the element at the corresponding position in the TS container, replace that TS element with the OS element, put it into the RS container, and subtract its value from sum;
Step 8: compare the elements of the TS and OS containers; wherever an element of the TS container is greater than the element at the corresponding position in the OS container, assign the difference (the TS element minus the OS element) to the AS container;
wherein the OS container holds the list of CN together with the number of BEs already present on each node; the AS container gives the number of cp assigned to each cn; the TS container records intermediate, temporary allocation results; the RS container stores the excluded results; n denotes the number of cp, m denotes the number of cn, and the initial value of sum is the total number of be and cp; a container is a space for storing the related data; CN is the computing node set, cn is a computing node, cp is a communication process, and be is a monitoring process.
4. The initialization and construction method for a tree-shaped transmission architecture as claimed in claim 2, wherein the comprehensive allocation strategy comprises the steps of:
Step 1: establish a factor set $U = [U_1, U_2, U_3, \ldots, U_{nf}]$, which represents nf evaluation factors, here set to the characteristic indicators of a host;
Step 2: establish a judgment set $V = [\upsilon_1, \upsilon_2, \upsilon_3, \ldots, \upsilon_{mf}]$, which represents mf judgments, here set to judgments that a person can recognize;
Step 3: to evaluate $U_i$ ($i \in [1 \ldots nf]$), let its degree of membership in $\upsilon_j$ ($j \in [1 \ldots mf]$) be $r_{ij}$, so that the fuzzy set for this factor is expressed as $r_i = [r_{i1}, r_{i2}, r_{i3}, \ldots, r_{i\,mf}]$, i.e. $f(U_i) = r_i$; after the factor set has been evaluated, the following fuzzy evaluation matrix is obtained:
$$R = \begin{bmatrix} r_1 \\ r_2 \\ \vdots \\ r_{nf} \end{bmatrix} = \begin{bmatrix} r_{11} & r_{12} & \cdots & r_{1\,mf} \\ r_{21} & r_{22} & \cdots & r_{2\,mf} \\ \vdots & \vdots & \ddots & \vdots \\ r_{nf\,1} & r_{nf\,2} & \cdots & r_{nf\,mf} \end{bmatrix}$$
Step 4: establish a weight set to express the importance of the above evaluation factors, the weight set being obtained from expert assessment or previous experience and expressed as $A = [a_1, a_2, \ldots, a_{nf}]$; A and R then yield the matrix B through the dot-product operation:
$$B = A \cdot R = [b_1, b_2, \ldots, b_{mf}];$$
Step 5: normalize B to obtain:
$$C = [c_1, c_2, \ldots, c_{mf}], \qquad c_i = \frac{b_i}{\sum_{j=1}^{mf} b_j}, \qquad \sum_{i=1}^{mf} c_i = 1;$$
the resulting score is expressed as $C V'$ with $V' = [k_1, k_2, \ldots, k_{mf}]^T$, where $V'$ is the quantified representation of $V^T$, $V^T$ denotes the transpose of the vector V, and $k_i$ ($i \in [1 \ldots mf]$) is the quantified representation of the judgment $\upsilon_i$ in V;
Step 6: finally obtain the score of each node in the computing node set CN, derive from the scores the proportional relationship among the nodes, and distribute the communication process set CP according to this proportion.
5. The initialization and construction method for a tree-shaped transmission architecture as claimed in claim 2, wherein the attachment strategy comprises the steps of:
Step 1: traverse the available monitoring process set BE and the internal tree terminal process set IBE and match them, wherein the fanout (number of branches) of the internal tree terminal process set IBE is obtained by dividing the number of monitoring processes be by the number of internal tree terminal processes ibe;
Step 2: to keep the attachment method for the monitoring process set BE general, when the number of monitoring processes be is an integer multiple of the number of internal tree terminal processes ibe, the quotient is the fanout; if the division leaves a remainder, the fanout is the quotient plus 1, which ensures that the numbers of internal tree terminal processes ibe and monitoring processes be do not depend on each other and that every monitoring process be can be attached to the transmission architecture;
Step 3: the matching strategy for internal tree terminal processes ibe and monitoring processes be gives matching priority to processes located on the same host; once the monitoring process set BE is attached to the internal tree terminal process set IBE, the transmission network has been built successfully.
6. The initialization and construction method for a tree-shaped transmission architecture as claimed in claim 1, further comprising a method of adding the auxiliary service process set AP, which can be implemented by a trigger mechanism of a certain process in a node.
7. The initialization and construction method for a tree-shaped transmission architecture as claimed in claim 1, characterized in that the aggregation node comprises an aggregated-data storage module and a visualization and analysis module.
CN201210274685.XA 2012-08-03 2012-08-03 Tree-structure-based initialization method Active CN103051509B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201210274685.XA CN103051509B (en) 2012-08-03 2012-08-03 Tree-structure-based initialization method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201210274685.XA CN103051509B (en) 2012-08-03 2012-08-03 Tree-structure-based initialization method

Publications (2)

Publication Number Publication Date
CN103051509A true CN103051509A (en) 2013-04-17
CN103051509B CN103051509B (en) 2015-04-22

Family

ID=48064014

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201210274685.XA Active CN103051509B (en) 2012-08-03 2012-08-03 Tree-structure-based initialization method

Country Status (1)

Country Link
CN (1) CN103051509B (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103647830A (en) * 2013-12-13 2014-03-19 浪潮电子信息产业股份有限公司 Dynamic management method for multilevel configuration files in cluster management system
WO2015062369A1 (en) * 2013-11-04 2015-05-07 华为技术有限公司 Method and apparatus for optimizing compilation in profiling technology
CN103795727B (en) * 2014-02-21 2016-09-28 中国电子科技集团公司第五十四研究所 A kind of data distributing method based on tree configuration
CN108647134A (en) * 2018-05-04 2018-10-12 北京物资学院 A kind of task monitoring, tracking and recognition methods towards multicore architecture
CN108830130A (en) * 2018-03-30 2018-11-16 徐国明 A kind of polarization EO-1 hyperion low-altitude reconnaissance image typical target detection method

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20070064731A1 (en) * 2005-09-06 2007-03-22 Hitachi Communication Technologies Ltd. Transmission apparatus with function of multi-step bandwidth assignment to other communication apparatuses
CN101582826A (en) * 2009-06-25 2009-11-18 北京洲洋伟业信息技术有限公司 Data transmission method based on dynamic binary tree child-nephew two-channel in internet classroom
CN101883039A (en) * 2010-05-13 2010-11-10 北京航空航天大学 Data transmission network of large-scale clustering system and construction method thereof

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
YI DING KAI HU LUJIA CHEN RISU NA: "Research and Exploration of Mechanism for Reconfigurable Multi-Cluster System", 《THE 1ST INTERNATIONAL CONFERENCE ON INFORMATION SCIENCE AND ENGINEERING (ICISE2009)》, 31 December 2009 (2009-12-31), pages 332 - 335, XP031662976 *

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2015062369A1 (en) * 2013-11-04 2015-05-07 华为技术有限公司 Method and apparatus for optimizing compilation in profiling technology
CN104615473A (en) * 2013-11-04 2015-05-13 华为技术有限公司 Optimization method and optimization device for profiling compilation
CN104615473B (en) * 2013-11-04 2017-11-24 华为技术有限公司 The optimization method and device of outline technology compiling
CN103647830A (en) * 2013-12-13 2014-03-19 浪潮电子信息产业股份有限公司 Dynamic management method for multilevel configuration files in cluster management system
CN103647830B (en) * 2013-12-13 2017-09-15 浪潮电子信息产业股份有限公司 The dynamic management approach of multi-level configuration file in a kind of cluster management system
CN103795727B (en) * 2014-02-21 2016-09-28 中国电子科技集团公司第五十四研究所 A kind of data distributing method based on tree configuration
CN108830130A (en) * 2018-03-30 2018-11-16 徐国明 A kind of polarization EO-1 hyperion low-altitude reconnaissance image typical target detection method
CN108647134A (en) * 2018-05-04 2018-10-12 北京物资学院 A kind of task monitoring, tracking and recognition methods towards multicore architecture

Also Published As

Publication number Publication date
CN103051509B (en) 2015-04-22

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20210528

Address after: No.8, Expo Road, Panlong District, Kunming City, Yunnan Province

Patentee after: Yunnan Institute of innovation Beijing University of Aeronautics and Astronautics

Address before: 100191 No. 37, Haidian District, Beijing, Xueyuan Road

Patentee before: BEIHANG University
