CN106502795A - Method and system for deploying scientific computing applications on a distributed cluster - Google Patents
- Publication number
- CN106502795A (application CN201610954803.XA)
- Authority
- CN
- China
- Prior art keywords
- distributed type
- type assemblies
- node
- cluster
- scientific algorithm
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F8/00—Arrangements for software engineering
- G06F8/60—Software deployment
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5061—Partitioning or combining of resources
Abstract
The invention discloses a method and system for deploying scientific computing applications on a distributed cluster, and belongs to the technical field of scientific computing applications. It addresses the problem of how to run, as a parallel computation, an application that has no parallel capability of its own. The technical solution is as follows: configure on the distributed cluster an external-facing cluster node, name nodes, and data nodes; clients access the cluster through the external-facing node; the name nodes carry out the Map process; the data nodes carry out the Reduce process; on the name node, set up the distributed cluster's shared installation directory and deploy applications and/or files in it; the shared installation directory shares applications or files globally; switch to the distributed cluster user, open the application, and execute it; log in to the data nodes and use the top command to check whether the application's Reduce process is running; after execution finishes, the results from all nodes are collected, integrated, and returned to the current user.
Description
Technical field
The present invention relates to the technical field of scientific computing applications, and specifically to a method and system for deploying scientific computing applications on a distributed cluster.
Background art
Scientific computing applications are commonly deployed on high-performance computing (HPC) clusters. The ability to squeeze the most out of computing resources is a major feature of HPC clusters, and one of the reasons they are chosen as the main cluster type for scientific computing applications.
However, to achieve efficient parallelism on an HPC cluster, the primary condition is that the scientific computing application can compute in parallel; in other words, the application must have been developed with parallel programming. In the current application market, most applications for the natural sciences are parallel programs, whereas applications for the social sciences are rarely written for parallel execution, so on an HPC cluster they can only run serially on a single node.
The distributed cluster is what is commonly called a Hadoop cluster. Hadoop is an open-source technology whose distributed architecture places the big-data processing engine as close to the storage as possible. Hadoop's MapReduce function breaks a single job into fragments, sends the fragment tasks (Map) to multiple nodes, and then loads the results (Reduce) into the data warehouse as a single data set. Hadoop consists of many components. At the bottom is the Hadoop Distributed File System (HDFS), which stores files across all storage nodes in the Hadoop cluster. Hadoop's MapReduce uses a master/slave structure. Master: the single global manager of the whole cluster, whose functions include job management, status monitoring, and task scheduling; this is the JobTracker in MapReduce. Slave: responsible for executing tasks and returning task status; this is the TaskTracker in MapReduce.
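The split, Map, and Reduce flow described above can be illustrated with a toy, single-process sketch. It does not use Hadoop; the function names (`map_reduce`, `square_mapper`, `sum_reducer`) and the `num_nodes` parameter are hypothetical and exist only for this illustration.

```python
from collections import defaultdict


def map_reduce(records, mapper, reducer, num_nodes=3):
    """Run a toy Map-Reduce pass in a single process (no Hadoop involved)."""
    # Break the single job into fragments, one per simulated node.
    fragments = [records[i::num_nodes] for i in range(num_nodes)]

    # Map: each simulated node turns its fragment into (key, value) pairs.
    grouped = defaultdict(list)
    for fragment in fragments:
        for record in fragment:
            for key, value in mapper(record):
                grouped[key].append(value)

    # Reduce: merge the values collected for each key into one result set.
    return {key: reducer(key, values) for key, values in grouped.items()}


def square_mapper(x):
    # Hypothetical map function: emit the square of each input number.
    yield "sum_of_squares", x * x


def sum_reducer(key, values):
    # Hypothetical reduce function: add up everything emitted for a key.
    return sum(values)


result = map_reduce([1, 2, 3, 4], square_mapper, sum_reducer)
print(result)
```

In a real cluster the fragments would be processed on different machines and the grouping step would move data over the network; the sketch only shows the shape of the computation.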
JobTracker is a background service process. Once started, it continuously listens for and receives the heartbeat messages sent by each TaskTracker, which include information such as resource usage and task running status. The main functions of JobTracker are: 1. Job control: in Hadoop, each application is represented as a job, and each job is divided into multiple tasks; JobTracker's job control module is responsible for decomposing jobs and monitoring their status. Status monitoring mainly covers TaskTracker status, job status, and task status; its main purposes are fault tolerance and providing a decision basis for task scheduling. 2. Resource management.
TaskTracker is the bridge between JobTracker and Task. On the one hand, it receives and executes commands from JobTracker, such as running a task, committing a task, and killing a task; on the other hand, it periodically reports the status of each task on the local node to JobTracker through heartbeats. TaskTracker communicates with JobTracker and with Task using the RPC protocol. The functions of TaskTracker are: 1. Heartbeat reporting: each TaskTracker periodically reports the information about its node to JobTracker through the heartbeat mechanism; this includes machine-level information (node health, resource usage, etc.) and task-level information (task progress, task running state, etc.). 2. Command execution: JobTracker issues commands to TaskTracker, mainly: launch task (LaunchTaskAction), commit task (CommitTaskAction), kill task (KillTaskAction), kill job (KillJobAction), and reinitialize (TaskTrackerReinitAction). 3. Resource management.
Many commercial companies offer a variety of commercial Hadoop products for applications. The core of these products is still the original Hadoop, whose architecture differs from that of an HPC cluster: the underlying HDFS (Hadoop Distributed File System) and MapReduce distributed processing form the core of Hadoop distributed computing. High-performance computing depends on the application itself being parallel, whereas in a Hadoop cluster the system layer provides the distributed processing capability. Clearly, a scientific computing application that cannot run in parallel computes serially when deployed on an HPC cluster, so its efficiency is low and cluster resources cannot be used to the fullest. How to run an application that has no parallel capability of its own as a parallel computation is the problem that currently needs to be solved.
Content of the invention
The technical assignment of the present invention is to provide on distributed type assemblies the method and system for realizing scientific algorithm application deployment, solves
The application how itself is not possessed parallel ability carries out the problem of parallel computation operation.
The technical task of the present invention is achieved in the following manner.
A method for deploying scientific computing applications on a distributed cluster, with the following steps:
(1) Deploy the distributed cluster so that its users can store files normally and run the Map-Reduce process normally. Configure on the distributed cluster one external-facing cluster node, at least two name nodes (NameNode), and multiple data nodes (DataNode) used for computation. Clients access the cluster through the external-facing node. The name nodes carry out the Map process; the data nodes carry out the Reduce process.
(2) Tune the scientific computing application used on the distributed cluster so that it includes Map-Reduce function interfaces.
(3) On the name node, set up the distributed cluster's shared installation directory and deploy applications and/or files in it. The shared installation directory shares applications or files globally, so that any child node that can access the directory can invoke the applications or read and write the files.
(4) Switch to the distributed cluster user and open the application; the application interface appears, and example runs or program debugging runs are carried out in that interface.
(5) Log in to each data node (DataNode) that executes the Reduce process and use the top command to check whether the application's Reduce process is running.
(6) After execution finishes, the distributed cluster automatically collects and integrates the results from all nodes and returns them to the current user.
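The check in step (5) can be sketched as a small parser over the output of `top` in batch mode. This is a minimal sketch under stated assumptions: a Linux procps-style `top` with the default column layout, where COMMAND is the twelfth column. The helper names and the process name `sciapp` are hypothetical.

```python
import subprocess

# Sample batch-mode output, used so the parser can be exercised without a cluster.
SAMPLE_TOP_OUTPUT = """\
top - 10:00:00 up 1 day,  1:00,  1 user,  load average: 0.50, 0.40, 0.30
Tasks: 120 total,   2 running, 118 sleeping,   0 stopped,   0 zombie

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
 4321 hadoop    20   0  512000  80000   9000 R  99.0   1.0   1:23.45 sciapp
 4400 hadoop    20   0  300000  40000   8000 S   0.3   0.5   0:01.00 java
"""


def read_top_snapshot() -> str:
    """Take one snapshot with `top -b -n 1` (assumes Linux procps top)."""
    completed = subprocess.run(
        ["top", "-b", "-n", "1"], capture_output=True, text=True, check=True
    )
    return completed.stdout


def command_is_running(top_output: str, command: str) -> bool:
    """Return True if a process row whose COMMAND column equals `command` exists."""
    in_process_table = False
    for line in top_output.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "PID":
            # Everything after the column header is a process row.
            in_process_table = True
            continue
        if in_process_table and len(fields) >= 12 and fields[11] == command:
            return True
    return False


print(command_is_running(SAMPLE_TOP_OUTPUT, "sciapp"))
```

On a data node, `command_is_running(read_top_snapshot(), "sciapp")` would perform the same check against live output; the name to look for depends on how the application's process appears in `top`.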
In step (1), the distributed cluster uses open-source Hadoop or a commercial Hadoop distribution.
In step (3), the shared installation directory uses common storage space or an NFS shared directory, such as /opt.
In step (3), applications and/or files are deployed in the shared installation directory in three steps: upload, decompress, and install. After installation, each node is tested to confirm that it can access and execute the corresponding applications and/or files normally.
In step (4), an example run or program debugging run proceeds as follows: the Map-Reduce process invokes the computing resources of all DataNodes, and the amount of computing resources invoked is set through the Hadoop configuration; open the application, enter the application's example parameters or the corresponding program, and start execution.
In step (6), the integrated node results are returned to the current user as screen output or file output.
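The upload, decompress, install sequence and the follow-up access test can be sketched locally. The sketch below uses a temporary directory in place of a real shared directory such as /opt; the function names, the `app` subdirectory, and the `run.sh` entry point are assumptions made for the illustration.

```python
import os
import shutil
import stat
import tarfile
import tempfile
from pathlib import Path


def deploy_to_shared_dir(archive: Path, shared_dir: Path, entry_point: str) -> Path:
    """Upload, decompress, and install an archive; return the installed program path."""
    shared_dir.mkdir(parents=True, exist_ok=True)
    uploaded = Path(shutil.copy2(archive, shared_dir))   # 1. upload
    install_dir = shared_dir / "app"                      # hypothetical layout
    with tarfile.open(uploaded) as tar:                   # 2. decompress
        tar.extractall(install_dir)
    program = install_dir / entry_point                   # 3. install: mark executable
    program.chmod(program.stat().st_mode | 0o111)
    return program


def node_can_use(program: Path) -> bool:
    """The per-node test: the program is readable and carries the execute bit."""
    return (
        program.is_file()
        and os.access(program, os.R_OK)
        and bool(program.stat().st_mode & stat.S_IXUSR)
    )


def demo() -> bool:
    # Build a tiny archive, deploy it, and run the access test once.
    with tempfile.TemporaryDirectory() as tmp:
        tmp_path = Path(tmp)
        script = tmp_path / "run.sh"
        script.write_text("#!/bin/sh\necho ok\n")
        archive = tmp_path / "app.tar.gz"
        with tarfile.open(archive, "w:gz") as tar:
            tar.add(script, arcname="run.sh")
        program = deploy_to_shared_dir(archive, tmp_path / "shared", "run.sh")
        return node_can_use(program)


print(demo())
```

On a real cluster the access test would be repeated on every node that mounts the shared directory, since a directory that is visible on the name node is not guaranteed to be mounted everywhere.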
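One way to picture how a serial program can be driven by a Map-style step is to launch the unmodified program once per parameter set and collect the outputs afterwards. The source does not specify this mechanism; the sketch below is an assumption for illustration, with hypothetical names (`run_serial_app`, `SQUARE_PROGRAM`), and it runs locally rather than on DataNodes.

```python
import subprocess
import sys


def run_serial_app(command, parameter_lines):
    """Map-style step: run the unmodified serial program once per parameter line."""
    for line in parameter_lines:
        params = line.split()
        if not params:
            continue  # skip blank lines in the parameter list
        completed = subprocess.run(
            command + params, capture_output=True, text=True, check=True
        )
        yield line.strip(), completed.stdout.strip()


def collect(results):
    """Reduce-style step: gather every (parameters, output) pair into one mapping."""
    return dict(results)


# Stand-in for a serial scientific application: prints the square of its argument.
SQUARE_PROGRAM = [sys.executable, "-c", "import sys; print(int(sys.argv[1]) ** 2)"]

outputs = collect(run_serial_app(SQUARE_PROGRAM, ["2", "3", "", "4"]))
print(outputs)
```

Each invocation is independent, which is what would allow a cluster to place the invocations on different nodes; how many run at once would be governed by the Hadoop configuration mentioned above.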
A system for deploying scientific computing applications on a distributed cluster: on the distributed cluster, one external-facing cluster node, at least two name nodes (NameNode), and multiple data nodes (DataNode) used for computation are configured. Clients access the cluster through the external-facing node. The name nodes carry out the Map process; the data nodes carry out the Reduce process.
The distributed cluster uses open-source Hadoop or a commercial Hadoop distribution.
Clients access the external-facing cluster node over an external or public network. The external-facing node can reach the name nodes (NameNode) and data nodes (DataNode) over the internal network. The internal network over which the name nodes (NameNode) and data nodes (DataNode) exchange data and communicate is a gigabit network, a 10-gigabit network, or an InfiniBand network.
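The node and network requirements above can be written down as a small validation sketch. The class name, field names, network labels, and host names are all hypothetical; "multiple data nodes" is read here as at least two, which is an interpretation rather than something the source states.

```python
from dataclasses import dataclass
from typing import List

# Internal network types named in the description (labels chosen for this sketch).
ALLOWED_INTERNAL_NETWORKS = {"gigabit", "10-gigabit", "infiniband"}


@dataclass
class ClusterLayout:
    external_node: str          # the single node that clients connect to
    name_nodes: List[str]
    data_nodes: List[str]
    internal_network: str

    def problems(self) -> List[str]:
        """Return a list of ways this layout departs from the described system."""
        issues = []
        if not self.external_node:
            issues.append("one external-facing node is required")
        if len(self.name_nodes) < 2:
            issues.append("at least two name nodes are required")
        if len(self.data_nodes) < 2:
            issues.append("multiple data nodes are required")
        if self.internal_network not in ALLOWED_INTERNAL_NETWORKS:
            issues.append("internal network must be gigabit, 10-gigabit, or infiniband")
        return issues


layout = ClusterLayout(
    external_node="gateway01",
    name_nodes=["nn01", "nn02"],
    data_nodes=["dn01", "dn02", "dn03"],
    internal_network="gigabit",
)
print(layout.problems())
```

An empty list means the layout matches the described configuration; each string in a non-empty list names one requirement that is not met.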
The method and system of the present invention for deploying scientific computing applications on a distributed cluster have the following advantages:
1. Cross-node distributed computing of applications developed on Hadoop is realized on the distributed cluster, compensating for the low efficiency of serial application execution on HPC clusters.
2. The approach applies to common distributed clusters, such as open-source Hadoop or commercial distributed software. After being tuned for Hadoop, such applications can be deployed and run on the Hadoop platform and thereby gain cross-node distributed computing capability, improving cluster computing efficiency.
3. The approach is simple, clear, and easy to operate. The distributed cluster is the foundation of the whole setup, and achieving distributed computing for scientific computing applications is the core and purpose of the setup; the approach still applies when the cluster contains nodes with other roles. The ultimate goal is for the deployed scientific computing application to achieve cross-node computation that cannot be achieved on an HPC cluster, making maximum use of cluster resources in the scientific computing process.
4. Deploying applications and files under the shared installation directory reduces the time and difficulty of application deployment on the distributed cluster.
Description of the drawings
The present invention is further described below with reference to the accompanying drawings.
Figure 1 is a flow chart of the method for deploying scientific computing applications on a distributed cluster;
Figure 2 is a structural block diagram of the system for deploying scientific computing applications on a distributed cluster.
Detailed description of the embodiments
The method and system of the present invention for deploying scientific computing applications on a distributed cluster are described in detail below with reference to the drawings and specific embodiments.
Embodiment 1:
The method of the present invention for deploying scientific computing applications on a distributed cluster has the following steps:
(1) Deploy the distributed cluster so that its users can store files normally and run the Map-Reduce process normally. Configure on the distributed cluster one external-facing cluster node, at least two name nodes (NameNode), and multiple data nodes (DataNode) used for computation. Clients access the cluster through the external-facing node. The name nodes carry out the Map process; the data nodes carry out the Reduce process. The distributed cluster uses open-source Hadoop.
(2) Tune the scientific computing application used on the distributed cluster so that it includes Map-Reduce function interfaces.
(3) On the name node, set up the distributed cluster's shared installation directory; the shared installation directory uses common storage space. Deploy applications and/or files in the shared installation directory. The shared installation directory shares applications or files globally, so that any child node that can access the directory can invoke the applications or read and write the files. Applications and/or files are deployed in the shared installation directory in three steps: upload, decompress, and install. After installation, each node is tested to confirm that it can access and execute the corresponding applications and/or files normally.
(4) Switch to the distributed cluster user and open the application; the application interface appears, and example runs or program debugging runs are carried out in that interface. An example run or program debugging run proceeds as follows: the Map-Reduce process invokes the computing resources of all DataNodes, and the amount of computing resources invoked is set through the Hadoop configuration; open the application, enter the application's example parameters or the corresponding program, and start execution.
(5) Log in to each data node (DataNode) that executes the Reduce process and use the top command to check whether the application's Reduce process is running.
(6) After execution finishes, the distributed cluster automatically collects and integrates the results from all nodes and returns them to the current user. The integrated node results are returned to the current user as screen output.
Embodiment 2:
The method of the present invention for deploying scientific computing applications on a distributed cluster has the following steps:
(1) Deploy the distributed cluster so that its users can store files normally and run the Map-Reduce process normally. Configure on the distributed cluster one external-facing cluster node, at least two name nodes (NameNode), and multiple data nodes (DataNode) used for computation. Clients access the cluster through the external-facing node. The name nodes carry out the Map process; the data nodes carry out the Reduce process. The distributed cluster uses a commercial Hadoop distribution.
(2) Tune the scientific computing application used on the distributed cluster so that it includes Map-Reduce function interfaces.
(3) On the name node, set up the distributed cluster's shared installation directory; the shared installation directory uses the NFS shared directory /opt. Deploy applications and/or files in the shared installation directory. The shared installation directory shares applications or files globally, so that any child node that can access the directory can invoke the applications or read and write the files. Applications and/or files are deployed in the shared installation directory in three steps: upload, decompress, and install. After installation, each node is tested to confirm that it can access and execute the corresponding applications and/or files normally.
(4) Switch to the distributed cluster user and open the application; the application interface appears, and example runs or program debugging runs are carried out in that interface. An example run or program debugging run proceeds as follows: the Map-Reduce process invokes the computing resources of all DataNodes, and the amount of computing resources invoked is set through the Hadoop configuration; open the application, enter the application's example parameters or the corresponding program, and start execution.
(5) Log in to each data node (DataNode) that executes the Reduce process and use the top command to check whether the application's Reduce process is running.
(6) After execution finishes, the distributed cluster automatically collects and integrates the results from all nodes and returns them to the current user. The integrated node results are returned to the current user as file output.
Embodiment 3:
The system of the present invention for deploying scientific computing applications on a distributed cluster: on the distributed cluster, one external-facing cluster node, at least two name nodes (NameNode), and multiple data nodes (DataNode) used for computation are configured. Clients access the cluster through the external-facing node. The name nodes carry out the Map process; the data nodes carry out the Reduce process. The distributed cluster uses open-source Hadoop or a commercial Hadoop distribution. Clients access the external-facing cluster node over an external or public network; the external-facing node can reach the name nodes (NameNode) and data nodes (DataNode) over the internal network; the internal network over which the name nodes (NameNode) and data nodes (DataNode) exchange data and communicate is a gigabit network.
With the specific embodiments above, those skilled in the art can readily implement the present invention. It should be understood, however, that the present invention is not limited to these specific embodiments. On the basis of the disclosed embodiments, those skilled in the art may combine different technical features in any way to arrive at different technical solutions.
Apart from the technical features described in the specification, everything else is known to those skilled in the art.
Claims (9)
1. A method for deploying scientific computing applications on a distributed cluster, characterized in that the steps are as follows:
(1) Deploy the distributed cluster so that its users can store files normally and run the Map-Reduce process normally; configure on the distributed cluster one external-facing cluster node, at least two name nodes, and multiple data nodes used for computation; clients access the cluster through the external-facing node; the name nodes carry out the Map process; the data nodes carry out the Reduce process;
(2) Tune the scientific computing application used on the distributed cluster so that it includes Map-Reduce function interfaces;
(3) On the name node, set up the distributed cluster's shared installation directory and deploy applications and/or files in it; the shared installation directory shares applications or files globally, so that any child node that can access the directory can invoke the applications or read and write the files;
(4) Switch to the distributed cluster user and open the application; the application interface appears, and example runs or program debugging runs are carried out in that interface;
(5) Log in to each data node that executes the Reduce process and use the top command to check whether the application's Reduce process is running;
(6) After execution finishes, the distributed cluster automatically collects and integrates the results from all nodes and returns them to the current user.
2. The method for deploying scientific computing applications on a distributed cluster according to claim 1, characterized in that in step (1), the distributed cluster uses open-source Hadoop or a commercial Hadoop distribution.
3. The method for deploying scientific computing applications on a distributed cluster according to claim 1, characterized in that in step (3), the shared installation directory uses common storage space or an NFS shared directory.
4. The method for deploying scientific computing applications on a distributed cluster according to claim 1, characterized in that in step (3), applications and/or files are deployed in the shared installation directory in three steps: upload, decompress, and install; after installation, each node is tested to confirm that it can access and execute the corresponding applications and/or files normally.
5. The method for deploying scientific computing applications on a distributed cluster according to claim 1, characterized in that in step (4), an example run or program debugging run proceeds as follows: the Map-Reduce process invokes the computing resources of all DataNodes, and the amount of computing resources invoked is set through the Hadoop configuration; open the application, enter the application's example parameters or the corresponding program, and start execution.
6. The method for deploying scientific computing applications on a distributed cluster according to claim 1, characterized in that in step (6), the integrated node results are returned to the current user as screen output or file output.
7. A system for deploying scientific computing applications on a distributed cluster, characterized in that one external-facing cluster node, at least two name nodes, and multiple data nodes used for computation are configured on the distributed cluster; clients access the cluster through the external-facing node; the name nodes carry out the Map process; the data nodes carry out the Reduce process.
8. The system for deploying scientific computing applications on a distributed cluster according to claim 7, characterized in that the distributed cluster uses open-source Hadoop or a commercial Hadoop distribution.
9. The system for deploying scientific computing applications on a distributed cluster according to claim 7, characterized in that clients access the external-facing cluster node over an external or public network; the external-facing node can reach the name nodes and data nodes over the internal network; and the internal network over which the name nodes and data nodes exchange data and communicate is a gigabit network, a 10-gigabit network, or an InfiniBand network.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610954803.XA CN106502795A (en) | 2016-11-03 | 2016-11-03 | Method and system for deploying scientific computing applications on a distributed cluster |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610954803.XA CN106502795A (en) | 2016-11-03 | 2016-11-03 | Method and system for deploying scientific computing applications on a distributed cluster |
Publications (1)
Publication Number | Publication Date |
---|---|
CN106502795A true CN106502795A (en) | 2017-03-15 |
Family
ID=58322502
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201610954803.XA Pending CN106502795A (en) | 2016-11-03 | 2016-11-03 | Method and system for deploying scientific computing applications on a distributed cluster |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106502795A (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107168795A (en) * | 2017-05-12 | 2017-09-15 | 西南大学 | Codon bias coefficient model method based on a CPU-GPU heterogeneous hybrid parallel computing framework |
CN107846411A (en) * | 2017-11-24 | 2018-03-27 | 郑州云海信息技术有限公司 | DNS cluster deployment system and method |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103631623A (en) * | 2013-11-29 | 2014-03-12 | 浪潮(北京)电子信息产业有限公司 | Method and device for allocating application software in trunking system |
CN103647797A (en) * | 2013-11-15 | 2014-03-19 | 北京邮电大学 | Distributed file system and data access method thereof |
CN103678360A (en) * | 2012-09-13 | 2014-03-26 | 腾讯科技(深圳)有限公司 | Data storing method and device for distributed file system |
CN104113597A (en) * | 2014-07-18 | 2014-10-22 | 西安交通大学 | Multi- data-centre hadoop distributed file system (HDFS) data read-write system and method |
CN104363222A (en) * | 2014-11-11 | 2015-02-18 | 浪潮电子信息产业股份有限公司 | Hadoop-based network security event analyzing method |
CN104702702A (en) * | 2012-01-11 | 2015-06-10 | 北京奇虎科技有限公司 | System and method for downloading data |
-
2016
- 2016-11-03 CN CN201610954803.XA patent/CN106502795A/en active Pending
Patent Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104702702A (en) * | 2012-01-11 | 2015-06-10 | 北京奇虎科技有限公司 | System and method for downloading data |
CN103678360A (en) * | 2012-09-13 | 2014-03-26 | 腾讯科技(深圳)有限公司 | Data storing method and device for distributed file system |
CN103647797A (en) * | 2013-11-15 | 2014-03-19 | 北京邮电大学 | Distributed file system and data access method thereof |
CN103631623A (en) * | 2013-11-29 | 2014-03-12 | 浪潮(北京)电子信息产业有限公司 | Method and device for allocating application software in trunking system |
CN104113597A (en) * | 2014-07-18 | 2014-10-22 | 西安交通大学 | Multi- data-centre hadoop distributed file system (HDFS) data read-write system and method |
CN104363222A (en) * | 2014-11-11 | 2015-02-18 | 浪潮电子信息产业股份有限公司 | Hadoop-based network security event analyzing method |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107168795A (en) * | 2017-05-12 | 2017-09-15 | 西南大学 | Codon bias coefficient model method based on a CPU-GPU heterogeneous hybrid parallel computing framework |
CN107168795B (en) * | 2017-05-12 | 2019-05-03 | 西南大学 | Codon bias coefficient model method based on a CPU-GPU heterogeneous hybrid parallel computing framework |
CN107846411A (en) * | 2017-11-24 | 2018-03-27 | 郑州云海信息技术有限公司 | DNS cluster deployment system and method |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
JP4867660B2 (en) | Componentized automated provisioning and management of computing environments for computing utilities | |
US10936589B1 (en) | Capability-based query planning for heterogenous processing nodes | |
JP6581108B2 (en) | Processing data from multiple sources | |
CN102880503B (en) | Data analysis system and data analysis method | |
CN111241078A (en) | Data analysis system, data analysis method and device | |
Tao et al. | Dynamic resource allocation algorithm for container-based service computing | |
WO2019226327A1 (en) | Data platform fabric | |
CN104954453A (en) | Data mining REST service platform based on cloud computing | |
US9747130B2 (en) | Managing nodes in a high-performance computing system using a node registrar | |
US10671621B2 (en) | Predictive scaling for cloud applications | |
CN104008012B (en) | A kind of high-performance MapReduce implementation methods based on dynamic migration of virtual machine | |
Luckow et al. | Pilot-data: an abstraction for distributed data | |
US9184982B2 (en) | Balancing the allocation of virtual machines in cloud systems | |
CN107294771A (en) | A kind of efficient deployment system and application method suitable for big data cluster | |
CN101256599A (en) | System for gathering data of distributing simulation platform based on grid | |
Luckow et al. | Pilot-edge: Distributed resource management along the edge-to-cloud continuum | |
Sánchez et al. | Agent-based platform to support the execution of parallel tasks | |
CN108153859A (en) | A kind of effectiveness order based on Hadoop and Spark determines method parallel | |
CN106502795A (en) | Method and system for deploying scientific computing applications on a distributed cluster | |
Kim et al. | RIDE: real-time massive image processing platform on distributed environment | |
Amoretti et al. | Efficient autonomic cloud computing using online discrete event simulation | |
Rossant et al. | Playdoh: a lightweight Python library for distributed computing and optimisation | |
Belgacem et al. | Virtual ez grid: A volunteer computing infrastructure for scientific medical applications | |
CN115048564B (en) | Distributed crawler task scheduling method, system and equipment | |
CN206421382U (en) | A kind of data handling system |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20170315 |