CN112398750B - Method for compressing and transmitting operation starting data in parallel computing - Google Patents
- Publication number
- CN112398750B CN112398750B CN201910764215.3A CN201910764215A CN112398750B CN 112398750 B CN112398750 B CN 112398750B CN 201910764215 A CN201910764215 A CN 201910764215A CN 112398750 B CN112398750 B CN 112398750B
- Authority
- CN
- China
- Prior art keywords
- information
- data information
- job
- data
- computing
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L47/00—Traffic control in data switching networks
- H04L47/10—Flow control; Congestion control
- H04L47/22—Traffic shaping
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L69/00—Network arrangements, protocols or services independent of the application payload and not provided for in the other groups of this subclass
- H04L69/04—Protocols for data compression, e.g. ROHC
Abstract
The invention discloses a method for compressing and transmitting job start data in parallel computing, comprising the following steps: S11, start a job and acquire the full data information that must be sent to the computing resources running the job's tasks; S12, compress the full data information into attribute data information from which the repeated commonality information has been removed; S2, compress the attribute data information into format data information described by a single formatting statement; S3, obtain compressed transmission data information to be sent to the computing resources; S4, apply general decompression matching the general-purpose compression algorithm, followed by reverse data parsing, to the transmission data information to recover the original full data information; S5, the job program on each computing resource stores the acquired full data information locally, so that whenever a computing resource needs any of this data it can read it directly from local storage. The invention solves the problems of large transmission volume and long transmission time during large-scale job startup, improves the efficiency of large-scale job startup, and effectively relieves network pressure.
Description
Technical Field
The invention relates to a method for compressing and transmitting job start data in parallel computing, and belongs to the technical field of computers.
Background
Parallel computing refers to the process of solving a computational problem using multiple computing resources simultaneously: multiple nodes/processors attack the same problem concurrently and cooperatively, improving computing speed and processing capacity. Because parallel job processes are interrelated, when a parallel job is submitted and scheduled for startup, the job management system must transfer job information, resource information, job task information, and so on, in several rounds, to the computing resources of all running job tasks in order to construct the parallel job. The speed of this job-related data transfer directly affects job start time and efficiency.
In current mainstream parallel job management systems, when job information, resource information, and job task information must be transmitted during job startup to all computing resources running the job's tasks, the relevant raw information is simply listed and transmitted as-is. When the job scale is large, the amount of information to transmit is large and the transmission takes long; meanwhile the transmission network comes under considerable pressure during the transfer, which degrades run-control efficiency. How to reduce the transmission volume and time during large-scale job startup, improve startup efficiency, and effectively relieve network pressure has therefore become a focus of effort for those skilled in the art.
Disclosure of Invention
The invention aims to provide a method for compressing and transmitting job start data in parallel computing that solves the problems of large transmission volume and long transmission time during large-scale job startup, improves the efficiency of large-scale job startup, and effectively relieves network pressure.
In order to achieve the above purpose, the invention adopts the following technical scheme: a method for compressing and transmitting job start data in parallel computing, oriented to massively parallel jobs, comprises the following steps:
S11, a job is started, and the full data information that needs to be sent to the computing resources running the job's tasks is acquired, wherein the job global resource information within this full data information contains repeated commonality information;
S12, the job management system extracts from the full data information the commonality information in the job global resource information to be sent to each computing resource, compressing the full data information into attribute data information in which the repeated commonality information is reduced to a single copy alongside each computing resource's individual information;
S2, the job management system analyzes the continuity pattern of the resource list in the attribute data information obtained in S12 and applies a range description to it, compressing the attribute data information into format data information described by a single formatting statement;
S3, the job management system compresses, with a general-purpose compression algorithm, the commonality information to be sent to the computing resources running the job's tasks together with the format data information obtained in S2, yielding the compressed transmission data information to be sent to the computing resources;
S4, the computing resources receive the transmission data information from the job management system, apply the general decompression matching the general-purpose compression algorithm to obtain the commonality information and format data information, perform reverse data parsing on them by converting the range description back into the original list description, and re-attach the extracted commonality information to each computing resource's individual attribute information, recovering the original full data information;
S5, the job program on each computing resource stores the acquired full data information locally, so that whenever a computing resource needs any of this data, it can read it directly from local storage.
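The range description of S2 and its reversal in S4 amount to run-length encoding of the node list. The patent gives no code, so the following is a minimal Python sketch of one way the encoding and reverse parsing could work (function names are illustrative assumptions):

```python
def to_ranges(nodes):
    """S2: collapse a sorted node list such as [1,2,3,4,5,7,9] into '1-5,7,9'."""
    parts, start, prev = [], nodes[0], nodes[0]
    for n in nodes[1:]:
        if n == prev + 1:            # still contiguous, extend the current run
            prev = n
            continue
        parts.append(f"{start}-{prev}" if start != prev else str(start))
        start = prev = n             # start a new run
    parts.append(f"{start}-{prev}" if start != prev else str(start))
    return ",".join(parts)

def from_ranges(text):
    """S4: reverse parsing, expanding '1-5,7,9' back into the original list."""
    nodes = []
    for part in text.split(","):
        if "-" in part:
            lo, hi = map(int, part.split("-"))
            nodes.extend(range(lo, hi + 1))
        else:
            nodes.append(int(part))
    return nodes

print(to_ranges([1, 2, 3, 4, 5, 7, 9]))   # → 1-5,7,9
print(from_ranges("1-5,7,9"))             # → [1, 2, 3, 4, 5, 7, 9]
```

As the description notes, this pays off when node numbering is largely contiguous: a list of 10,000 consecutive nodes collapses into a single short range token.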
Further refinements of the above technical scheme are as follows:
1. In the above scheme, in S12, the commonality information includes the resource type, the number of compute cores within a node, and the bitmap.
2. In the above scheme, in S11, the full data information includes job information, resource information, and job task information.
Owing to the application of the above technical scheme, the invention has the following advantages over the prior art:
The method for compressing and transmitting job start data in parallel computing solves the problems of large transmission volume and long transmission time during large-scale job startup, improves the efficiency of large-scale job startup, and effectively relieves network pressure. By effectively compressing the transmitted data during information collection and transmission, it reduces the amount of data that must cross the network during job startup and shortens transmission time; by storing global data locally on the computing nodes, it avoids repeated transmission of related data, further relieving network transmission pressure, shortening data transmission time, and improving job startup efficiency.
Drawings
FIG. 1 is a flow chart of the method for compressing and transmitting job start data in parallel computing.
Detailed Description
Examples: a method for compressing and transmitting job start data in parallel computing, oriented to massively parallel jobs, comprises the following steps:
S11, a job is started, and the full data information that needs to be sent to the computing resources running the job's tasks is acquired, wherein the job global resource information within this full data information contains repeated commonality information;
S12, the job management system extracts from the full data information the commonality information in the job global resource information to be sent to each computing resource, compressing the full data information into attribute data information in which the repeated commonality information is reduced to a single copy alongside each computing resource's individual information;
S2, the job management system analyzes the continuity pattern of the resource list in the attribute data information obtained in S12 and applies a range description to it, compressing the attribute data information into format data information described by a single formatting statement; for example, a resource list given as the node array 1,2,3,4,5,7,9 is converted by the range description into 1-5,7,9, which compresses the transmitted data markedly when the number of resources is huge and their numbering is largely contiguous;
S3, the job management system compresses, with a general-purpose compression algorithm, the commonality information to be sent to the computing resources running the job's tasks together with the format data information obtained in S2, yielding the compressed transmission data information to be sent to the computing resources;
S4, the computing resources receive the transmission data information from the job management system, apply the general decompression matching the general-purpose compression algorithm to obtain the commonality information and format data information, perform reverse data parsing on them by converting the range description back into the original list description, and re-attach the extracted commonality information to each computing resource's individual attribute information, recovering the original full data information;
S5, the job program on each computing resource stores the acquired full data information locally, so that whenever a computing resource needs any of this data, it can read it directly from local storage.
In S12, the commonality information includes the resource type, the number of compute cores within a node, and the bitmap.
In S11, the full data information includes job information, resource information, and job task information.
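The commonality extraction of S12 and its reversal in S4 can be made concrete with a short Python sketch. The record layout and field names (`id`, `type`, `cores`, `bitmap`) are illustrative assumptions echoing the examples above, not a format defined by the patent:

```python
def extract_common(resources):
    """S12: split per-node records into one shared commonality dict plus residuals."""
    keys = set(resources[0])
    # A field is commonality information if every node carries the same value.
    common = {k: resources[0][k] for k in keys
              if all(r[k] == resources[0][k] for r in resources)}
    # Each residual keeps only the node's individual (personalized) fields.
    residual = [{k: v for k, v in r.items() if k not in common} for r in resources]
    return common, residual

def restore(common, residual):
    """S4: re-attach the shared commonality info to each node's residual record."""
    return [{**common, **r} for r in residual]

nodes = [
    {"id": 1, "type": "cpu", "cores": 64, "bitmap": "ffff"},
    {"id": 2, "type": "cpu", "cores": 64, "bitmap": "ffff"},
    {"id": 3, "type": "cpu", "cores": 64, "bitmap": "ffff"},
]
common, residual = extract_common(nodes)
print(common)    # the shared fields (type, cores, bitmap) are stored once
print(residual)  # only per-node fields remain: [{'id': 1}, {'id': 2}, {'id': 3}]
assert restore(common, residual) == nodes
```

With thousands of nodes sharing the same type, core count, and bitmap, the transmitted attribute data shrinks to one copy of the common fields plus a tiny residual per node, which is exactly the D1-to-D2 reduction described below.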
The examples are explained further as follows:
Assume the total amount of data to be transmitted is initially D1. The job management system first extracts the commonality information of the job global resources to be sent to each computing resource, such as the resource type, the number of compute cores within a node, and the bitmap, and replaces many independent per-node attribute records with a single uniform global description. This step reduces the per-resource attribute data in the transmission, compressing the data volume from D1 to D2;
the resource list in the transmission data is then analyzed for regularity and given a range description, so that when the number of resources is large, a long node list can be described by a single formatting statement, compressing the transmission volume from D2 to D3;
because the information sent to a computing resource forms a complete unit that is parsed and processed uniformly after it has been fully received, with no processing needed mid-reception, the job management system can package the data and compress it with a general-purpose compression algorithm before sending it, compressing the transmission volume from D3 to D4; only a compressed packet of size D4 then needs to be sent to each computing resource;
after receiving the job information sent by the job management system, a computing resource restores the original data (of size D1) through decompression and reverse parsing, trading a very small amount of computation for a large reduction in data transmission;
the job program on each computing resource stores the received job and resource information locally, so the information can be read locally when needed; this avoids frequent data fetches from the job management system over the control network and reduces the volume of network messages;
extracting and condensing common resource attributes, analyzing and range-describing the resource list, storing global information locally on the computing resources to avoid repeated transmission, and compressing the data with a general-purpose algorithm before network transmission together reduce the transmission volume and time and improve job startup efficiency;
moreover, given the localized storage of job-related information on the computing resources, the job management system can elide part of the repeated information across the many job-data messages sent during job startup, reducing the data volume of each transmission.
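The D1-to-D4 pipeline above can be sketched end to end in Python. The patent does not name a specific general-purpose compressor or wire format, so zlib and JSON here are assumptions chosen for illustration, as is the payload layout:

```python
import json
import zlib

def compress_startup_data(common, node_ids, tasks):
    """D1->D2: one copy of commonality info; D2->D3: range-described node list;
    D3->D4: general-purpose compression of the packaged payload."""
    ranges, start = [], None
    for n in node_ids:               # build (lo, hi) runs over the node list
        if start is None:
            start = prev = n
        elif n == prev + 1:
            prev = n
        else:
            ranges.append((start, prev))
            start = prev = n
    ranges.append((start, prev))
    payload = json.dumps({"common": common, "nodes": ranges, "tasks": tasks})
    return zlib.compress(payload.encode())

def decompress_startup_data(blob):
    """S4: general decompression followed by reverse parsing of the node ranges."""
    payload = json.loads(zlib.decompress(blob).decode())
    nodes = [n for lo, hi in payload["nodes"] for n in range(lo, hi + 1)]
    return payload["common"], nodes, payload["tasks"]

common = {"type": "cpu", "cores": 64}
node_ids = list(range(1, 1001))          # 1000 contiguous nodes
blob = compress_startup_data(common, node_ids, tasks=["a.out"] * 4)
c2, nodes2, tasks2 = decompress_startup_data(blob)
assert (c2, nodes2, tasks2) == (common, node_ids, ["a.out"] * 4)
```

For the 1000 contiguous nodes here, the node list serializes as a single range pair before compression, so the packet sent to each computing resource stays tiny regardless of node count, matching the claim that a very small amount of computation replaces a large amount of data transmission.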
In order to facilitate a better understanding of the present invention, the terms used herein will be briefly explained below:
parallel computation (Parallel Computing): parallel computing refers to a process of solving a computing problem by using multiple computing resources at the same time, and solves the same problem in a concurrent and cooperative manner through multiple nodes/processors, so as to improve the computing speed and the processing capacity.
Parallel job: a set of task processes, generally written in a parallel language such as MPI, that runs on the computing resources of a parallel computer, is started and controlled by the job management system, and solves a single problem through inter-process cooperation.
Parallel job management system: a management and control system running within a parallel computer that performs parallel job scheduling, task startup, control, recovery, and related functions.
When this method for compressing and transmitting job start data in parallel computing is used, the problems of large transmission volume and long transmission time during large-scale job startup are solved, large-scale job startup becomes more efficient, and network pressure is effectively relieved. The transmitted data is effectively compressed during information collection and transmission, reducing the amount of data that must cross the network during job startup and shortening transmission time; at the same time, global data is stored locally on the computing nodes, avoiding repeated transmission of related data, further relieving network transmission pressure, shortening data transmission time, and improving job startup efficiency.
The above embodiments are provided to illustrate the technical concept and features of the present invention and are intended to enable those skilled in the art to understand the content of the present invention and implement the same, and are not intended to limit the scope of the present invention. All equivalent changes or modifications made in accordance with the spirit of the present invention should be construed to be included in the scope of the present invention.
Claims (3)
1. A method for compressing and transmitting job start data in parallel computing, characterized in that, oriented to massively parallel jobs, the method comprises the following steps:
S11, a job is started, and the full data information that needs to be sent to the computing resources running the job's tasks is acquired, wherein the job global resource information within this full data information contains repeated commonality information;
S12, the job management system extracts from the full data information the commonality information in the job global resource information to be sent to each computing resource, compressing the full data information into attribute data information in which the repeated commonality information is reduced to a single copy alongside each computing resource's individual information;
S2, the job management system analyzes the continuity pattern of the resource list in the attribute data information obtained in S12 and applies a range description to it, compressing the attribute data information into format data information described by a single formatting statement;
S3, the job management system compresses, with a general-purpose compression algorithm, the commonality information to be sent to the computing resources running the job's tasks together with the format data information obtained in S2, yielding the compressed transmission data information to be sent to the computing resources;
S4, the computing resources receive the transmission data information from the job management system, apply the general decompression matching the general-purpose compression algorithm to obtain the commonality information and format data information, perform reverse data parsing on them by converting the range description back into the original list description, and re-attach the extracted commonality information to each computing resource's individual attribute information, recovering the original full data information;
S5, the job program on each computing resource stores the acquired full data information locally, so that whenever a computing resource needs any of this data, it can read it directly from local storage.
2. The method for compressing and transmitting job start data in parallel computing according to claim 1, characterized in that: in S12, the commonality information includes the resource type, the number of compute cores within a node, and the bitmap.
3. The method for compressing and transmitting job start data in parallel computing according to claim 1, characterized in that: in S11, the full data information includes job information, resource information, and job task information.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910764215.3A CN112398750B (en) | 2019-08-19 | 2019-08-19 | Method for compressing and transmitting operation starting data in parallel computing |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112398750A CN112398750A (en) | 2021-02-23 |
CN112398750B true CN112398750B (en) | 2024-02-06 |
Family
ID=74603434
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910764215.3A Active CN112398750B (en) | 2019-08-19 | 2019-08-19 | Method for compressing and transmitting operation starting data in parallel computing |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112398750B (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPH0955708A (en) * | 1995-08-11 | 1997-02-25 | Fujitsu Ltd | Data compression and transfer system |
CN105721810A (en) * | 2014-12-05 | 2016-06-29 | 北大方正集团有限公司 | Image compression storage method and apparatus |
CN107172079A (en) * | 2017-06-27 | 2017-09-15 | 武汉蓝星软件技术有限公司 | A kind of data compression exchange method based on application service core frame platform |
CN107409152A (en) * | 2015-03-12 | 2017-11-28 | 英特尔公司 | Method and apparatus for compressing the data received by network |
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
GB2472072B (en) * | 2009-07-24 | 2013-10-16 | Hewlett Packard Development Co | Deduplication of encoded data |
US10963171B2 (en) * | 2017-10-16 | 2021-03-30 | Red Hat, Inc. | Compressibility instrumented dynamic volume provisioning |
- 2019-08-19: application CN201910764215.3A filed in China; granted as patent CN112398750B (status: active)
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPH0955708A (en) * | 1995-08-11 | 1997-02-25 | Fujitsu Ltd | Data compression and transfer system |
CN105721810A (en) * | 2014-12-05 | 2016-06-29 | 北大方正集团有限公司 | Image compression storage method and apparatus |
CN107409152A (en) * | 2015-03-12 | 2017-11-28 | 英特尔公司 | Method and apparatus for compressing the data received by network |
CN107172079A (en) * | 2017-06-27 | 2017-09-15 | 武汉蓝星软件技术有限公司 | A kind of data compression exchange method based on application service core frame platform |
Non-Patent Citations (1)
Title |
---|
A parallel job task startup model and its scalability analysis; Song Changming et al.; Computer Engineering & Science; Vol. 35, No. 11; pp. 182-186 *
Also Published As
Publication number | Publication date |
---|---|
CN112398750A (en) | 2021-02-23 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN102236581B (en) | Mapping reduction method and system thereof for data center | |
CN111209310B (en) | Service data processing method and device based on stream computing and computer equipment | |
CN103312544A (en) | Method, equipment and system for controlling terminals during log file reporting | |
CN111736907B (en) | Data analysis method of self-adaptive low-delay memory computing engine | |
CN116755844B (en) | Data processing method, device and equipment of simulation engine and storage medium | |
CN114036183A (en) | Data ETL processing method, device, equipment and medium | |
CN111107022B (en) | Data transmission optimization method, device and readable storage medium | |
CN111898009A (en) | Distributed acquisition system and method for multi-source power data fusion | |
CN112398750B (en) | Method for compressing and transmitting operation starting data in parallel computing | |
CN113918532A (en) | Portrait label aggregation method, electronic device and storage medium | |
CN112416557B (en) | Method and device for determining call relation, storage medium and electronic device | |
CN111405020B (en) | Asynchronous file export method and system based on message queue and fastDFS micro-service framework | |
CN111460021B (en) | Data export method and device | |
CN111447229A (en) | Large-scale data acquisition method and device based on compressed sensing theory | |
CN111276231A (en) | Medical data monitoring method and device, computer equipment and storage medium | |
WO2022253131A1 (en) | Data parsing method and apparatus, computer device, and storage medium | |
CN115561620A (en) | Chip testing method and device based on GRPC (glass-fiber reinforced polycarbonate) and storage medium | |
CN112612823B (en) | Big data time sequence analysis method based on fusion of Pyspark and Pandas | |
CN114063943A (en) | Data transmission system, method, device, medium, and apparatus | |
CN112417015A (en) | Data distribution method and device, storage medium and electronic device | |
CN113190237A (en) | Data processing method, system and device | |
CN107330089B (en) | Cross-network structured data collection system | |
CN106815017B (en) | Dynamic language performance analysis and display method and system | |
CN113190581A (en) | Method and terminal for dynamically generating report form based on big data | |
CN111159004A (en) | Hadoop cluster simulation test method and device and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||