CN105389201A - Process management method and system thereof based on high-performance computing cluster - Google Patents
- Publication number
- CN105389201A (application number CN201410446186.3A)
- Authority
- CN
- China
- Prior art keywords
- rubbish
- hpcc
- difference
- information
- processes
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Landscapes
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
Abstract
The invention provides a process management method and system based on a high-performance computing cluster. The method comprises the following steps: S1, collecting the processes of all nodes of the high-performance computing cluster and the corresponding process information; S2, using an automatic comparison method or a search-and-screen method to determine whether a process is a garbage process; and S3, terminating the garbage process. In step S2, when the automatic comparison method is used, the residence time of the process (how long it has existed on the machine) and its CPU occupation time are first obtained from the process information, the difference between the two is calculated, and the difference is compared with a preset value. If the difference exceeds the preset value, the process is determined to be a garbage process. With this technical scheme, the user processes of a system can be collected, analyzed and judged by the automatic comparison method or the search-and-screen method, and garbage processes can be cleared. As a result, computing resources are released and system load is reduced.
Description
Technical field
The present invention relates to high-performance computing clusters, and in particular to a process management method and system based on a high-performance computing cluster.
Background technology
A cluster is a group of computers that, as a whole, provides a set of network resources to users. The individual computer systems are the nodes of the cluster. In an ideal cluster, users are never aware of the underlying nodes; to them, the cluster is one system rather than multiple computer systems. A process is one concrete execution of a program, that is, the course of running the program. A program can be executed many times, and each execution is loaded into its own independent space in memory, so that multiple processes are produced.
A high-performance computing (High Performance Computing) cluster, HPC cluster for short, is a system formed by connecting many computers through a network. It is mainly used for parallel computing, and all of its computers run the Linux operating system. Improper use by users and computer faults can produce a large number of garbage processes. System administrators often have to spend a great deal of time analyzing and judging the processes in the cluster and clearing these garbage processes, in order to solve the problem of low system utilization that they cause.
Summary of the invention
The features and advantages of the present invention are set forth in part in the following description, or may be apparent from the description, or may be learned by practicing the invention.
To overcome the problems of the prior art, the invention provides a process management method and system based on a high-performance computing cluster, which use an automatic comparison method or a search-and-screen method to determine whether a process is a garbage process, so that processes are managed effectively and the utilization of cluster nodes is improved.
The technical scheme adopted by the invention to solve the above technical problem is as follows:
According to one aspect of the invention, a process management method based on a high-performance computing cluster is provided, characterized by comprising: S1, collecting the processes of all nodes in the high-performance computing cluster and the corresponding process information; S2, using an automatic comparison method or a search-and-screen method to determine whether the process is a garbage process; S3, terminating the garbage process. In step S2, when the automatic comparison method is used to determine whether the process is a garbage process, the residence time of the process (the time it has existed in the system) and its CPU occupation time are first obtained from the process information; the difference between the residence time and the CPU occupation time is then calculated and compared with a preset value; if the difference is greater than the preset value, the process is determined to be a garbage process.
Preferably, in step S2, the preset value is set to different values according to the type of the process.
Preferably, steps S1, S2 and S3 are implemented by entering instructions in a shell script.
Preferably, in step S2, the processes judged by the automatic comparison method or the search-and-screen method are user processes.
Preferably, in step S2, when the search-and-screen method is used to determine whether the process is a garbage process, the user processes that have been abnormally interrupted are first found among all the user processes, and those among them that are still occupying resources of the node are then screened out and judged to be garbage processes.
Preferably, in step S2, when the search-and-screen method is used, whether the process is a garbage process is determined from any one or more of the process's user information, residence time, amount of node CPU occupied and amount of node memory occupied, combined with information on whether the process is still running.
According to another aspect of the invention, a process management system based on a high-performance computing cluster is provided, characterized by comprising: a collection unit for collecting the processes of all nodes in the high-performance computing cluster and the corresponding process information; a judging unit, connected to the collection unit and comprising an automatic comparison module and a search-and-screen module, for determining whether the process is a garbage process; and a termination unit for terminating the garbage process. In the judging unit, the automatic comparison module comprises: a calculation submodule for calculating, from the process information, the difference between the residence time and the CPU occupation time of the process; and a comparison submodule for comparing the difference with a preset value, the process being a garbage process if the difference is greater than the preset value.
Preferably, the preset value takes different values according to the type of the process.
Preferably, the search-and-screen module is configured to find, among all the processes, the processes that have been abnormally interrupted, and further to screen out, from all the abnormally interrupted processes, those that are still occupying resources of the node and judge them to be garbage processes.
Preferably, the search-and-screen module is configured to determine whether the process is a garbage process from any one or more of the process's user information, residence time, amount of node CPU occupied and amount of node memory occupied, combined with information on whether the process is still running.
The invention provides a process management method and system based on a high-performance computing cluster. By calculating the total time a process has occupied the CPU and comparing the result against a preset value, garbage processes are terminated automatically according to the comparison result. Garbage processes can also be terminated by the search-and-screen method, which gives the system administrator a degree of operating latitude to confirm and clear users' garbage processes and release computing resources.
By reading the specification, those of ordinary skill in the art will better understand the features and content of these technical schemes.
Brief description of the drawings
The invention is described in detail below by way of example with reference to the accompanying drawings, from which its advantages and implementation will become more apparent. The content shown in the drawings is for explanation only and does not limit the invention in any sense. In the drawings:
Fig. 1 is a schematic flowchart of the process management method based on a high-performance computing cluster according to an embodiment of the invention.
Fig. 2 is a schematic diagram of process management implemented through shell scripts according to an embodiment of the invention.
Fig. 3 is a schematic structural diagram of the process management system based on a high-performance computing cluster according to an embodiment of the invention.
Embodiment
As shown in Fig. 1, the invention provides a process management method based on a high-performance computing cluster, comprising: S1, collecting the processes of all nodes in the high-performance computing cluster and the corresponding process information; S2, using an automatic comparison method or a search-and-screen method to determine whether the process is a garbage process; S3, terminating the garbage process. In step S2, when the automatic comparison method is used to determine whether a process is a garbage process, the residence time and the CPU occupation time of the process are first obtained from the process information; the difference between the two is then calculated and compared with a preset value; if the difference is greater than the preset value, the process is determined to be a garbage process. Here the residence time is the time the process has existed in the system, the system being the operating system; that is, it is the time elapsed from the creation of the process in the operating system up to the present.
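The comparison in step S2 can be illustrated with a minimal Python sketch. The patent states that the embodiment is built from shell scripts and gives no code, so everything here is an assumption made for illustration: the `ProcessInfo` record, its field names, the function name and the sample values are all hypothetical. Only the rule itself (residence time minus CPU occupation time, compared with a preset value) comes from the text.

```python
from dataclasses import dataclass

SECONDS_PER_DAY = 24 * 60 * 60


@dataclass(frozen=True)
class ProcessInfo:
    """Hypothetical record for one collected process (field names are assumptions)."""
    pid: int
    user: str
    residence_seconds: int  # time since the process was created in the OS
    cpu_seconds: int        # cumulative CPU time the process has consumed


def is_garbage_by_comparison(info: ProcessInfo, preset_seconds: int) -> bool:
    """Return True when (residence time - CPU time) is greater than the preset value."""
    difference = info.residence_seconds - info.cpu_seconds
    return difference > preset_seconds


# A process that has existed for 10 days but used only 1 hour of CPU.
stale = ProcessInfo(pid=4242, user="alice",
                    residence_seconds=10 * SECONDS_PER_DAY, cpu_seconds=3600)
# A process that has existed for 10 days and computed for 9 of them.
busy = ProcessInfo(pid=4243, user="bob",
                   residence_seconds=10 * SECONDS_PER_DAY,
                   cpu_seconds=9 * SECONDS_PER_DAY)

preset = 7 * SECONDS_PER_DAY  # illustrative preset value
print(is_garbage_by_comparison(stale, preset))  # True
print(is_garbage_by_comparison(busy, preset))   # False
```

The difference grows only while a process exists without computing, so a long-running job that keeps the CPU busy is not flagged, while an idle leftover eventually is.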
The preset value can be set to different values according to the type of the process; in this embodiment it is set to 7 days.
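A per-type preset value could be kept in a simple lookup table, as in the following Python sketch. The patent does not say what the process types are or how they are determined, so the type names and the 1-day figure below are purely hypothetical; only the 7-day value is taken from the embodiment.

```python
SECONDS_PER_DAY = 24 * 60 * 60

# Hypothetical mapping from process type to preset value (in seconds).
# Only the 7-day figure comes from the embodiment; the type names and the
# 1-day value are assumptions made for illustration.
PRESET_BY_TYPE = {
    "batch": 7 * SECONDS_PER_DAY,
    "interactive": 1 * SECONDS_PER_DAY,
}
DEFAULT_PRESET = 7 * SECONDS_PER_DAY


def preset_for(process_type: str) -> int:
    """Look up the preset value for a process type, falling back to the default."""
    return PRESET_BY_TYPE.get(process_type, DEFAULT_PRESET)


print(preset_for("interactive") // SECONDS_PER_DAY)  # 1
print(preset_for("unknown") // SECONDS_PER_DAY)      # 7
```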
The automatic comparison method allows processes to be optimized automatically by the algorithm the system provides, without intervention by the system administrator. It should be noted that processes can be roughly divided into system processes and user processes, and that the processes judged in step S2 by the automatic comparison method or the search-and-screen method are user processes; that is, a garbage process must be a user process and cannot be a system process. Therefore, before the automatic comparison method is applied, the user processes are filtered out of all processes, automatically or manually. Alternatively, step S1 can be set to collect directly only the user processes of all nodes in the cluster and the corresponding user process information.
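The patent does not specify how user processes are told apart from system processes. One possible automatic filter, sketched below in Python, relies on the common Linux convention that regular user accounts have numeric UIDs at or above a minimum (often 1000). That threshold, the `Proc` record and the function name are assumptions; a real site would follow its own account policy.

```python
from typing import Iterable, List, NamedTuple


class Proc(NamedTuple):
    """Hypothetical minimal process record (field names are assumptions)."""
    pid: int
    uid: int
    command: str


# Assumption: accounts with a UID below this value are system accounts.
# Many Linux distributions use 1000, but this is site-specific.
MIN_USER_UID = 1000


def user_processes(procs: Iterable[Proc], min_uid: int = MIN_USER_UID) -> List[Proc]:
    """Keep only the processes owned by regular user accounts."""
    return [p for p in procs if p.uid >= min_uid]


sample = [
    Proc(pid=1, uid=0, command="init"),
    Proc(pid=812, uid=74, command="sshd"),
    Proc(pid=4242, uid=1001, command="solver"),
]
print([p.pid for p in user_processes(sample)])  # [4242]
```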
In step S2, when the search-and-screen method is used to determine whether a process is a garbage process, the user processes that have been abnormally interrupted are first found among all user processes; those among them that are still occupying resources of the node are then screened out and judged to be garbage processes, and they are terminated in step S3. A user process that has been abnormally interrupted is one whose corresponding program has been interrupted abnormally.
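The two-stage screening just described can be sketched in Python as follows. How an abnormal interruption is detected is not stated in the patent, so the sketch takes it as a boolean flag supplied by the caller. The `UserProcess` record, its fields, and the test for "still occupying node resources" (non-zero CPU share or resident memory) are hypothetical choices made for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class UserProcess:
    """Hypothetical user-process record (field names are assumptions)."""
    pid: int
    interrupted: bool    # assumed flag: the owning program was abnormally interrupted
    cpu_percent: float   # share of node CPU currently used
    rss_kb: int          # resident memory currently held, in KiB


def screen_garbage(procs: Iterable[UserProcess]) -> List[UserProcess]:
    """Stage 1: keep abnormally interrupted processes.
    Stage 2: of those, keep the ones still occupying node resources."""
    interrupted = [p for p in procs if p.interrupted]
    return [p for p in interrupted if p.cpu_percent > 0.0 or p.rss_kb > 0]


sample = [
    UserProcess(pid=100, interrupted=False, cpu_percent=80.0, rss_kb=2048),
    UserProcess(pid=200, interrupted=True, cpu_percent=0.0, rss_kb=0),
    UserProcess(pid=300, interrupted=True, cpu_percent=0.0, rss_kb=512000),
]
print([p.pid for p in screen_garbage(sample)])  # [300]
```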
When the search-and-screen method is used, whether a process is a garbage process can also be determined from any one or more of the process's user information, residence time, amount of node CPU occupied and amount of node memory occupied, combined with information on whether the process is still running. For example, the user information can be used to check whether the user submitted a job by mistake; if a process produced by a mistakenly submitted job is still running, it can be concluded that the process is a garbage process. Alternatively, several processes that occupy a large share of the node's CPU or memory can be selected first, and whether each is a garbage process can then be concluded from how those processes are running.
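Both examples in the paragraph above can be sketched in Python. The record layout, the `submitted_by_mistake` flag (assumed to be set by an administrator after checking the user information) and the function names are hypothetical. The second function only selects candidates for review, because the text leaves the final judgment to an inspection of how those processes are running.

```python
import heapq
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class Usage:
    """Hypothetical per-process usage record (field names are assumptions)."""
    pid: int
    user: str
    cpu_percent: float
    rss_kb: int
    running: bool
    submitted_by_mistake: bool  # assumed flag derived from the user information


def mistaken_and_running(procs: Sequence[Usage]) -> List[Usage]:
    """Processes from mistakenly submitted jobs that are still running."""
    return [p for p in procs if p.submitted_by_mistake and p.running]


def heavy_consumers(procs: Sequence[Usage], n: int) -> List[Usage]:
    """Candidates for review: the n largest CPU users plus the n largest memory users."""
    by_cpu = heapq.nlargest(n, procs, key=lambda p: p.cpu_percent)
    by_mem = heapq.nlargest(n, procs, key=lambda p: p.rss_kb)
    unique = {}
    for p in by_cpu + by_mem:
        unique.setdefault(p.pid, p)  # keep first occurrence, drop duplicates
    return list(unique.values())


sample = [
    Usage(1, "alice", 95.0, 100, True, False),
    Usage(2, "bob", 5.0, 9000, True, True),
    Usage(3, "carol", 1.0, 50, False, True),
]
print([p.pid for p in mistaken_and_running(sample)])  # [2]
print([p.pid for p in heavy_consumers(sample, 1)])    # [1, 2]
```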
The search-and-screen method serves the system administrator's needs well: when a process runs into a problem, it can be terminated immediately, without waiting until it reaches its corresponding preset value. In a specific implementation, the automatic comparison method and the search-and-screen method can be carried out at the same time.
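When both methods are applied to the same set of processes, their results have to be merged before termination. The patent does not describe this merging, so the following Python sketch is only one plausible reading: it takes the union of the process IDs flagged by each method. The function name and the sample PIDs are hypothetical.

```python
from typing import Iterable, List


def pids_to_terminate(auto_flagged: Iterable[int],
                      screened: Iterable[int]) -> List[int]:
    """Union of the PIDs flagged by either method, without duplicates, ascending."""
    return sorted(set(auto_flagged) | set(screened))


# PID 6100 is flagged only by the search-and-screen method, so it is
# terminated without waiting for the preset value to be reached.
print(pids_to_terminate([4242, 5001], [5001, 6100]))  # [4242, 5001, 6100]
```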
Referring to Fig. 2, in this embodiment steps S1, S2 and S3 can be implemented by entering instructions in shell scripts. A shell is a program with a specific function: it is the interface between the user and the kernel of a UNIX/Linux operating system.
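The patent does not list the instructions used, so the following is a hedged Python sketch of how step S1 might collect process information on one Linux node by calling the `ps` command. It assumes the procps-ng version of `ps`, whose `etimes` and `times` output fields report elapsed time and cumulative CPU time in seconds. The record layout, the function names and the sample text are hypothetical, and the parser is demonstrated on canned output so that the example does not depend on the machine it runs on.

```python
import subprocess
from typing import List, NamedTuple


class ProcessInfo(NamedTuple):
    """Hypothetical record for one line of ps output."""
    pid: int
    user: str
    residence_seconds: int  # ps field "etimes": seconds since the process started
    cpu_seconds: int        # ps field "times": cumulative CPU time in seconds
    command: str


# Assumes procps-ng ps; other ps implementations may lack these fields.
PS_COMMAND = ["ps", "-e", "--no-headers", "-o", "pid,user,etimes,times,comm"]


def parse_ps_output(text: str) -> List[ProcessInfo]:
    """Turn the text printed by PS_COMMAND into records, skipping malformed lines."""
    records = []
    for line in text.splitlines():
        fields = line.split(None, 4)
        if len(fields) < 5:
            continue
        pid, user, etimes, times, command = fields
        records.append(ProcessInfo(int(pid), user, int(etimes), int(times),
                                   command.strip()))
    return records


def collect_local() -> List[ProcessInfo]:
    """Run ps on the local node and parse its output (not called in this example)."""
    result = subprocess.run(PS_COMMAND, capture_output=True, text=True, check=True)
    return parse_ps_output(result.stdout)


sample = """\
  101 root     864000   120 sshd
 4242 alice    864000  3600 solver
"""
for record in parse_ps_output(sample):
    print(record.pid, record.user, record.residence_seconds - record.cpu_seconds)
```

Collecting from every node of the cluster would additionally require running the command remotely on each node, which the patent does not detail and this sketch leaves out.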
As shown in Fig. 3, the invention also provides a process management system based on a high-performance computing cluster, comprising: a collection unit 10 for collecting the processes of all nodes in the high-performance computing cluster and the corresponding process information; a judging unit 20, connected to the collection unit 10 and comprising an automatic comparison module 21 and a search-and-screen module 22, for determining whether a process is a garbage process; and a termination unit 30 for terminating the garbage process. Although not shown in the figure, within the judging unit 20 the automatic comparison module 21 comprises: a calculation submodule for calculating, from the process information, the difference between the residence time and the CPU occupation time of the process, both of which are included in the process information; and a comparison submodule for comparing the difference with the preset value, the process being a garbage process if the difference is greater than the preset value. The preset value can take different values according to the type of the process.
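The division into units can be mirrored in code. The Python sketch below is an illustration under stated assumptions, not the patented implementation, which is based on shell scripts. All class and function names are hypothetical, the collection unit is represented simply by the list of records passed in, and the search-and-screen module is omitted for brevity. The termination unit sends `SIGTERM` through `os.kill` by default; the example injects a recording function instead so that no real process is signalled.

```python
import os
import signal
from dataclasses import dataclass
from typing import Callable, Iterable, List, Tuple


@dataclass(frozen=True)
class ProcessInfo:
    """Hypothetical collected record (what the collection unit would produce)."""
    pid: int
    residence_seconds: int
    cpu_seconds: int


class AutomaticComparisonModule:
    """Sketch of module 21: a calculation step followed by a comparison step."""

    def __init__(self, preset_seconds: int) -> None:
        self.preset_seconds = preset_seconds

    def difference(self, info: ProcessInfo) -> int:
        # Calculation submodule: residence time minus CPU occupation time.
        return info.residence_seconds - info.cpu_seconds

    def is_garbage(self, info: ProcessInfo) -> bool:
        # Comparison submodule: garbage if the difference exceeds the preset.
        return self.difference(info) > self.preset_seconds


class TerminationUnit:
    """Sketch of unit 30: asks each garbage process to exit."""

    def __init__(self, kill: Callable[[int, int], None] = os.kill) -> None:
        self._kill = kill

    def terminate(self, pids: Iterable[int]) -> List[int]:
        terminated = []
        for pid in pids:
            self._kill(pid, signal.SIGTERM)
            terminated.append(pid)
        return terminated


def run_once(collected: Iterable[ProcessInfo],
             judge: AutomaticComparisonModule,
             terminator: TerminationUnit) -> List[int]:
    """One pass: judge every collected process, then terminate the garbage ones."""
    garbage = [p.pid for p in collected if judge.is_garbage(p)]
    return terminator.terminate(garbage)


# Dry run: record the signals instead of sending them.
sent: List[Tuple[int, int]] = []
day = 24 * 60 * 60
collected = [ProcessInfo(4242, 10 * day, 3600), ProcessInfo(4243, 10 * day, 9 * day)]
result = run_once(collected,
                  AutomaticComparisonModule(preset_seconds=7 * day),
                  TerminationUnit(kill=lambda pid, sig: sent.append((pid, sig))))
print(result)  # [4242]
```

Injecting the kill function keeps the judging logic testable without privileges and makes a dry-run mode straightforward.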
The search-and-screen module 22 is configured to find, among all processes, the processes that have been abnormally interrupted, and further to screen out, from all the abnormally interrupted processes, those that are still occupying resources of the node and judge them to be garbage processes. The processes referred to here are user processes, and a process that has been abnormally interrupted is one whose corresponding program has been interrupted abnormally. In practice, the search-and-screen module 22 can be used to find the user processes among all processes. Alternatively, the collection unit 10 can be set to collect directly the user processes of all nodes in the cluster and the corresponding user process information.
The search-and-screen module 22 can also be used to determine whether a process is a garbage process from any one or more of the process's user information, residence time, amount of node CPU occupied and amount of node memory occupied, combined with information on whether the process is still running. For example, the user information can be used to check whether the user submitted a job by mistake; if a process produced by a mistakenly submitted job is still running, it can be concluded that the process is a garbage process. Alternatively, several processes that occupy a large share of the node's CPU or memory can be selected first, and whether each is a garbage process can then be concluded from how those processes are running.
In this embodiment, the collection unit 10, the judging unit 20 and the termination unit 30 are all based on shell script technology, and the automatic comparison module 21 and the search-and-screen module 22 in the judging unit 20 can work at the same time.
The invention provides a process management method and system based on a high-performance computing cluster, which can collect the user processes in a system, analyze and judge them by the automatic comparison method and the search-and-screen method, and confirm and clear garbage processes, thereby releasing computing resources and reducing system load.
The preferred embodiments of the invention have been described above with reference to the accompanying drawings. Those skilled in the art can implement the invention in many variant forms without departing from its scope and spirit. For example, features illustrated or described as part of one embodiment can be used in another embodiment to yield a further embodiment. The above are merely preferred feasible embodiments of the invention and do not thereby limit its scope of rights; all equivalent changes made using the content of the specification and drawings fall within the scope of rights of the invention.
Claims (10)
1. A process management method based on a high-performance computing cluster, characterized by comprising:
S1, collecting the processes of all nodes in the high-performance computing cluster and the corresponding process information;
S2, using an automatic comparison method or a search-and-screen method to determine whether said process is a garbage process;
S3, terminating said garbage process;
wherein in said step S2, when said automatic comparison method is used to determine whether said process is a garbage process, the residence time and the CPU occupation time of said process are first obtained from said process information, the difference between the residence time and the CPU occupation time of said process is then calculated, and said difference is compared with a preset value; if said difference is greater than said preset value, said process is determined to be a garbage process.
2. The process management method based on a high-performance computing cluster according to claim 1, characterized in that, in said step S2, said preset value is set to different values according to the type of said process.
3. The process management method based on a high-performance computing cluster according to claim 1, characterized in that said steps S1, S2 and S3 are implemented by entering instructions in a shell script.
4. The process management method based on a high-performance computing cluster according to claim 1, characterized in that, in said step S2, the processes judged by the automatic comparison method or the search-and-screen method are user processes.
5. The process management method based on a high-performance computing cluster according to claim 1 or 4, characterized in that, in said step S2, when said search-and-screen method is used to determine whether said process is a garbage process, the user processes that have been abnormally interrupted are first found among all said user processes, and those among them that are still occupying resources of said node are then screened out and judged to be garbage processes.
6. The process management method based on a high-performance computing cluster according to claim 1, characterized in that, in said step S2, when said search-and-screen method is used to determine whether said process is a garbage process, the determination is made from any one or more of the user information of said process, its residence time, the amount of CPU of said node that it occupies and the amount of memory of said node that it occupies, combined with information on whether said process is still running.
7. A process management system based on a high-performance computing cluster, characterized by comprising:
a collection unit for collecting the processes of all nodes in the high-performance computing cluster and the corresponding process information;
a judging unit, connected to said collection unit and comprising an automatic comparison module and a search-and-screen module, for determining whether said process is a garbage process;
a termination unit for terminating said garbage process;
wherein in said judging unit, said automatic comparison module comprises: a calculation submodule for calculating, from said process information, the difference between the residence time and the CPU occupation time of said process; and a comparison submodule for comparing said difference with a preset value, said process being a garbage process if said difference is greater than said preset value.
8. The process management system based on a high-performance computing cluster according to claim 7, characterized in that said preset value takes different values according to the type of said process.
9. The process management system based on a high-performance computing cluster according to claim 7, characterized in that said search-and-screen module is configured to find, among all said processes, the processes that have been abnormally interrupted, and further to screen out, from all said abnormally interrupted processes, those that are still occupying resources of said node and judge them to be garbage processes.
10. The process management system based on a high-performance computing cluster according to claim 7, characterized in that said search-and-screen module is configured to determine whether said process is a garbage process from any one or more of the user information of said process, its residence time, the amount of CPU of said node that it occupies and the amount of memory of said node that it occupies, combined with information on whether said process is still running.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201410446186.3A CN105389201B (en) | 2014-09-03 | 2014-09-03 | Process management method and system based on a high-performance computing cluster |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201410446186.3A CN105389201B (en) | 2014-09-03 | 2014-09-03 | Process management method and system based on a high-performance computing cluster |
Publications (2)
Publication Number | Publication Date |
---|---|
CN105389201A (en) | 2016-03-09 |
CN105389201B (en) | 2018-11-13 |
Family
ID=55421508
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201410446186.3A (Active, CN105389201B) | Process management method and system based on a high-performance computing cluster | 2014-09-03 | 2014-09-03 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN105389201B (en) |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106325861A (en) * | 2016-08-18 | 2017-01-11 | 北京奇虎科技有限公司 | Method and device used for managing distributed system |
CN106371928A (en) * | 2016-09-18 | 2017-02-01 | 安徽爱她有果电子商务有限公司 | Method for managing computer |
CN110955710A (en) * | 2019-11-26 | 2020-04-03 | 杭州数梦工场科技有限公司 | Method and device for processing dirty data in data exchange operation |
CN111639006A (en) * | 2020-05-29 | 2020-09-08 | 深圳前海微众银行股份有限公司 | Cluster process management method and device |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102141934A (en) * | 2011-02-28 | 2011-08-03 | 浪潮(北京)电子信息产业有限公司 | Method and device for controlling process on fat node |
CN102591765A (en) * | 2011-12-31 | 2012-07-18 | 珠海市君天电子科技有限公司 | Progress automatic management system |
CN102662762A (en) * | 2012-03-30 | 2012-09-12 | 浪潮电子信息产业股份有限公司 | Method for effectively controlling use of memory resource of fat node |
US20130055278A1 (en) * | 2011-08-29 | 2013-02-28 | Kaspersky Lab Zao | Efficient management of computer resources |
Also Published As
Publication number | Publication date |
---|---|
CN105389201B (en) | 2018-11-13 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant |