CN102880832A - Method for implementing mass data management system under colony - Google Patents

Info

Publication number
CN102880832A
CN102880832A CN201210309450XA CN201210309450A CN102880832A CN 102880832 A CN102880832 A CN 102880832A CN 201210309450X A CN201210309450X A CN 201210309450XA CN 201210309450 A CN201210309450 A CN 201210309450A CN 102880832 A CN102880832 A CN 102880832A
Authority
CN
China
Prior art keywords
data
implementation method
node
copy
management
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201210309450XA
Other languages
Chinese (zh)
Other versions
CN102880832B (en)
Inventor
吕灼恒
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shuguang zhisuan Information Technology Co.,Ltd.
Original Assignee
Dawning Information Industry Beijing Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Dawning Information Industry Beijing Co Ltd
Priority to CN201210309450.XA
Publication of CN102880832A
Application granted
Publication of CN102880832B
Active
Anticipated expiration

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention belongs to the field of computers and discloses a method for implementing a mass data management system in a cluster. After a computing application finishes a computing job, the data computed by the computing node is copied to the management node by the script processing procedure of the job; when the copy fails, a peripheral thread management process performs an auxiliary copy and modifies the data state. The method guarantees the real-time performance, security, correctness, and efficiency of data, addresses the problems of computing islands and inequality, provides corresponding solutions to the common problems of periodic leakage and result delay, and improves system stability and reliability.

Description

A method for implementing a mass data management system in a cluster
Technical field
The present invention relates to the field of computers, and in particular to a method for implementing a mass data management system in a cluster.
Background
Management is an important guarantee that an IT system runs optimally, and different IT equipment comes with its own management system. A large-scale computing data center in particular must operate and manage its computing, storage, and network equipment through a centralized management system so that it can respond quickly to business changes and abnormal events and continuously optimize data center processes. In data centers under a cloud computing environment, the demand for computing power keeps rising. Computation in fields such as engineering design, aviation, bioscience, medicine, and the military is becoming increasingly complex, and its scale is growing by orders of magnitude. A single computer cannot complete such huge computing tasks, so ultra-large-scale clusters are widely used. In large-scale parallel computing, the scheduling of computing tasks is a popular topic. How to manage the data once the computation finishes is an equally important topic: how can the data produced by a computation be presented to the user in real time without becoming disordered? In addition, much of the data produced by scientific computing is highly confidential, so how to manage this data and guarantee its security is a problem that deserves careful thought.
Under a cloud computing environment, operating modes in which each system runs separately and independently cannot support the expansion of cloud services. New IT operating modes challenge the traditional management architecture, and the requirements for virtualization, dynamism, relevance, automation, real-time performance, efficiency, and security keep rising. Existing systems have the following problems:
Poor real-time performance: the user cannot obtain the computed data as soon as the computing task actually finishes; there is always some delay.
Low security: in practice, placing much of the data in storage serves little purpose, and there are also many data security problems.
Low correctness: a large error in a scientific computation may even affect production safety.
Low efficiency: the overall utilization of resources is low.
Summary of the invention
To address the deficiencies of the prior art, the present invention provides a method for implementing a mass data management system in a cluster that can guarantee the real-time performance, security, correctness, and efficiency of data.
In the method provided by the present invention, the improvement is that, after a computing application finishes a computing job, the data computed by the computing node is copied to the management node by the script processing procedure of the job; when the copy fails, a peripheral thread management process performs an auxiliary copy and modifies the data state.
The script processing procedure automatically copies the data to shared storage as soon as the job finishes running.
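The copy-back performed at the end of a job can be sketched as follows. This is only an illustration under stated assumptions, not the patented implementation: Python is used in place of the job script, `sqlite3` stands in for the database so the sketch is self-contained, and the function name, the table `job_data`, and its columns `job_id`, `result_path`, and `copy_ok` are hypothetical.

```python
import shutil
import sqlite3
from pathlib import Path


def copy_back_after_job(conn: sqlite3.Connection, job_id: str,
                        result_dir: Path, shared_dir: Path) -> bool:
    """Copy a finished job's results to shared storage and record the outcome."""
    try:
        # Copy the whole result directory into a per-job folder on shared storage.
        shutil.copytree(result_dir, shared_dir / job_id)
        copy_ok = 1
    except OSError:
        # Leave a failure record so a later auxiliary copy can pick the job up.
        copy_ok = 0
    # INSERT OR REPLACE is SQLite syntax; it relies on job_id being the primary key.
    conn.execute(
        "INSERT OR REPLACE INTO job_data (job_id, result_path, copy_ok) "
        "VALUES (?, ?, ?)",
        (job_id, str(result_dir), copy_ok),
    )
    conn.commit()
    return bool(copy_ok)
```

Recording the outcome in the same step as the copy is what lets a separate process find the failed copies later without inspecting the file system.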
The auxiliary copy performed by the peripheral thread management process comprises the following steps:
(1) PBS queries the computing nodes;
(2) a Java process queries the database for data records whose copy by the script processing procedure of the job failed;
(3) a script packs the data, and the data whose copy by the script processing procedure of the job failed is copied to the shared directory of the shared disk;
(4) the peripheral thread updates the success flag in the database;
(5) the thread sleeps. If necessary, a sleep duration is set and the next round of the auxiliary copy procedure is carried out.
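One round of steps (2) to (4) can be sketched as follows. This is a minimal illustration under stated assumptions rather than the patented implementation: Python replaces the Java process and the packing script, `sqlite3` stands in for the database, and the function name, the table `job_data`, and its columns `job_id`, `result_path`, and `copy_ok` are hypothetical.

```python
import shutil
import sqlite3
from pathlib import Path


def auxiliary_copy_round(conn: sqlite3.Connection, shared_dir: Path) -> int:
    """Run one auxiliary copy round and return the number of repaired records."""
    # Step (2): find records whose copy by the job script failed.
    failed = conn.execute(
        "SELECT job_id, result_path FROM job_data WHERE copy_ok = 0"
    ).fetchall()
    repaired = 0
    for job_id, result_path in failed:
        source = Path(result_path)
        if not source.is_dir():
            continue  # results not reachable yet; try again in a later round
        # Step (3): pack the result directory into an archive in the shared directory.
        shutil.make_archive(str(shared_dir / job_id), "gztar", root_dir=str(source))
        # Step (4): update the success flag in the database.
        conn.execute("UPDATE job_data SET copy_ok = 1 WHERE job_id = ?", (job_id,))
        repaired += 1
    conn.commit()
    return repaired
```

Records that cannot be repaired keep `copy_ok = 0`, so they are simply retried in the next round.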
In step (1), PBS queries the computing nodes and generates an accounting file from the running status of the jobs.
A Java process monitors changes to the accounting file and writes those changes to a MySQL database in real time, forming historical records of the jobs in the database; at least one historical record makes up a data report.
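Following changes to the accounting file can be sketched as an incremental read that stores only the lines appended since the last call. The sketch makes several assumptions that are not in the source: Python replaces the Java process, `sqlite3` stands in for MySQL, each record is assumed to use the semicolon-separated layout `timestamp;record type;job id;message` common to PBS accounting logs, and the table `job_history` and both function names are hypothetical.

```python
import sqlite3
from typing import Optional, Tuple


def parse_accounting_line(line: str) -> Optional[Tuple[str, str, str, str]]:
    """Split one record into (timestamp, record_type, job_id, message)."""
    parts = line.rstrip("\n").split(";", 3)
    if len(parts) != 4:
        return None  # malformed line; skip it
    return parts[0], parts[1], parts[2], parts[3]


def sync_accounting_file(path: str, offset: int, conn: sqlite3.Connection) -> int:
    """Store records appended after byte `offset` and return the new offset."""
    with open(path, "rb") as handle:
        handle.seek(offset)
        for raw in handle:
            if not raw.endswith(b"\n"):
                break  # last line is still being written; read it next time
            offset += len(raw)
            record = parse_accounting_line(raw.decode("utf-8", errors="replace"))
            if record is not None:
                conn.execute(
                    "INSERT INTO job_history "
                    "(logged_at, record_type, job_id, message) VALUES (?, ?, ?, ?)",
                    record,
                )
    conn.commit()
    return offset
```

Calling `sync_accounting_file` repeatedly with the offset it returned keeps the history table in step with the file without re-reading old records.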
An index is set on the data table, and a database connection pool is used to hold n connections.
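An index on the history table and a pool that holds a fixed number of n connections can be sketched as follows. The sketch is illustrative only and rests on assumptions not found in the source: `sqlite3` stands in for MySQL, and the class `ConnectionPool`, the function `create_history_index`, the table `job_history`, and the indexed column `job_id` are hypothetical.

```python
import queue
import sqlite3
from contextlib import contextmanager


class ConnectionPool:
    """Hold a fixed number of open connections and lend them out one at a time."""

    def __init__(self, database: str, size: int) -> None:
        if size < 1:
            raise ValueError("size must be a positive integer")
        self._idle = queue.Queue(maxsize=size)
        for _ in range(size):
            # check_same_thread=False lets a pooled connection move between threads.
            self._idle.put(sqlite3.connect(database, check_same_thread=False))

    @contextmanager
    def connection(self):
        conn = self._idle.get()  # blocks while all n connections are in use
        try:
            yield conn
        finally:
            self._idle.put(conn)  # always return the connection to the pool

    def idle_count(self) -> int:
        return self._idle.qsize()


def create_history_index(conn: sqlite3.Connection) -> None:
    """Index the history table by job id so lookups avoid a full table scan."""
    conn.execute(
        "CREATE INDEX IF NOT EXISTS idx_job_history_job_id ON job_history (job_id)"
    )
    conn.commit()
```

Reusing open connections avoids paying the connection setup cost on every query, and the index keeps lookups by job fast as the history table grows.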
The data is encrypted when it is copied.
Compared with the prior art, the beneficial effects of the present invention are as follows:
The present invention guarantees the real-time performance, security, correctness, and efficiency of data, and prevents user data from being stolen or leaked.
The present invention addresses the problems of computing islands and inequality, and provides corresponding solutions to the common problems of periodic leakage and result delay.
The present invention improves the user's experience of cloud computing services.
The present invention improves the stability and reliability of the system.
Description of the drawings
Fig. 1 is a schematic diagram of communication among multiple blades or PC nodes provided by the present invention.
Fig. 2 is a flowchart of the auxiliary copy performed by the peripheral thread management process provided by the present invention.
Embodiment
Specific embodiments of the present invention are described in further detail below with reference to the accompanying drawings.
Fig. 1 shows the communication among multiple blades or PC nodes during cluster-based distributed computing in this embodiment. The nodes are connected by a high-speed local area network and equipped with parallel support software to form a loosely coupled parallel computing system, with PBS used for cluster management and job scheduling. The distributed computing structure in the figure comprises three types of nodes: submission nodes, a management node, and computing nodes.
A submission node submits jobs to the PBS management node. The management node monitors and schedules the resources of the cluster system in a unified way so that the nodes of the cluster fully share computer resources such as CPU, memory, and disk. The computing nodes perform the computation of the tasks and, after the computation finishes, return the results to the storage of the management node. NFS is mounted on the management node of the cluster system, and all computing nodes can access it by mounting it. The management node is the core of the computation, but it does not take part in computing the data; it only manages the cluster system, namely task management, node management, communication management, and database management.
In this embodiment, after the computing application finishes a computing job, the data is written back, that is, the data computed by the computing node is copied back to the management node. This embodiment guarantees the real-time performance, efficiency, and correctness of the data in two ways:
1) Processing by the script of the job: as soon as the job finishes running, the data is automatically copied back to shared storage. Because this embodiment uses shared storage, the operation amounts to a local disk copy and has no disk I/O problem, so the computation of the job and the write-back of the data are almost a single process and the delay can be ignored.
2) Management by a peripheral thread, that is, the auxiliary copy procedure, whose flowchart is shown in Fig. 2. PBS queries the computing nodes, manages the jobs, and records the running status of each job, generating an accounting file that records the details of the jobs. The present invention uses a Java process to monitor changes to the accounting file and write them to a MySQL database in real time, forming historical records of the jobs in the database from which data reports can be generated for later use. As time passes, the data in the database keeps growing and so does the query load. This embodiment therefore creates an index on the historical data table and also uses a database connection pool holding n connections (n is a positive integer) to make database queries more efficient. If the first way fails and the computed results are not copied back to storage, the second way has a Java thread that continuously checks whether any automatic copy-back of data has failed. If there is a failure, the data is packed by a script (at which point the computation itself can be considered complete), the data whose copy by the script processing procedure of the job failed is copied to the shared directory of the shared disk, and the status of the record is then modified. After this procedure finishes, the thread goes to sleep; if necessary, a sleep duration is set and the next round of the auxiliary copy procedure is carried out.
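The sleep-and-repeat cycle of the peripheral thread can be sketched as follows. This is a hedged illustration, not the patented implementation: Python's `threading` module replaces the Java thread, the class name `PeripheralThread` is hypothetical, and the work done in each round is passed in as a callable so the sketch stays independent of any particular database or copy logic.

```python
import threading
from typing import Callable


class PeripheralThread(threading.Thread):
    """Run one auxiliary round, sleep for a set duration, then repeat until stopped."""

    def __init__(self, run_round: Callable[[], object], sleep_seconds: float) -> None:
        super().__init__(daemon=True)
        self._run_round = run_round
        self._sleep_seconds = sleep_seconds
        self._stop_requested = threading.Event()
        self.rounds_completed = 0

    def run(self) -> None:
        while not self._stop_requested.is_set():
            self._run_round()
            self.rounds_completed += 1
            # Sleep between rounds; wait() returns early once stop() is called.
            self._stop_requested.wait(self._sleep_seconds)

    def stop(self) -> None:
        self._stop_requested.set()
```

Waiting on an event instead of calling a plain sleep lets the thread shut down promptly even when the sleep duration is long.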
When user data is transmitted over the network to the cloud computing service provider, strict encryption prevents the user's data from being stolen. To keep the data produced in cloud computing safe in storage, the stored data is also encrypted. Likewise, the result data transmitted back to the user is encrypted.
The above measures ensure the security, efficiency, correctness, and real-time performance of the data. The user does not perceive that computation and data management are separate. The user experience is greatly improved, and the architecture is strengthened as well.
Finally, it should be noted that the above embodiments are intended only to illustrate the technical solution of the present invention, not to limit it. Although the present invention has been described in detail with reference to the above embodiments, those of ordinary skill in the art should understand that the specific embodiments may still be modified or equivalently replaced, and any modification or equivalent replacement that does not depart from the spirit and scope of the present invention shall fall within the scope of the claims of the present invention.

Claims (7)

1. A method for implementing a mass data management system in a cluster, characterized in that, after a computing application finishes a computing job, the data computed by the computing node is copied to the management node by the script processing procedure of the job; when the copy fails, a peripheral thread management process performs an auxiliary copy and modifies the data state.
2. The method according to claim 1, characterized in that the script processing procedure automatically copies the data to shared storage as soon as the job finishes running.
3. The method according to claim 1, characterized in that the auxiliary copy performed by the peripheral thread management process comprises the following steps:
(1) PBS queries the computing nodes;
(2) a Java process queries the database for data records whose copy by the script processing procedure of the job failed;
(3) a script packs the data, and the data whose copy by the script processing procedure of the job failed is copied to the shared directory;
(4) the peripheral thread updates the success flag in the database;
(5) the thread sleeps.
4. The method according to claim 3, characterized in that, in step (1), PBS queries the computing nodes and generates an accounting file from the running status of the jobs.
5. The method according to claim 4, characterized in that a Java process monitors changes to the accounting file and writes those changes to a MySQL database in real time, forming historical records of the jobs in the database; at least one historical record makes up a data report.
6. The method according to claim 5, characterized in that an index is set on the data table, and a database connection pool is used to hold n connections.
7. The method according to claim 1, characterized in that the data is encrypted when it is copied.
CN201210309450.XA 2012-08-28 2012-08-28 A method for implementing a mass data management system in a cluster Active CN102880832B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201210309450.XA CN102880832B (en) 2012-08-28 2012-08-28 A method for implementing a mass data management system in a cluster

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201210309450.XA CN102880832B (en) 2012-08-28 2012-08-28 A method for implementing a mass data management system in a cluster

Publications (2)

Publication Number Publication Date
CN102880832A true CN102880832A (en) 2013-01-16
CN102880832B CN102880832B (en) 2016-08-31

Family

ID=47482153

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201210309450.XA Active CN102880832B (en) 2012-08-28 2012-08-28 A method for implementing a mass data management system in a cluster

Country Status (1)

Country Link
CN (1) CN102880832B (en)

Also Published As

Publication number Publication date
CN102880832B (en) 2016-08-31

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20211022

Address after: 100089 zone A-1, floor 2, building 36, yard 8, Dongbeiwang West Road, Haidian District, Beijing

Patentee after: Shuguang zhisuan Information Technology Co.,Ltd.

Address before: 100193 No.36 Zhongguancun Software Park, No.8 Dongbeiwang West Road, Haidian District, Beijing

Patentee before: Dawning Information Industry (Beijing) Co.,Ltd.