CN112398750B - Method for compressing and transmitting operation starting data in parallel computing - Google Patents

Info

Publication number
CN112398750B
Authority
CN
China
Prior art keywords
information
data information
job
data
computing
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910764215.3A
Other languages
Chinese (zh)
Other versions
CN112398750A (en)
Inventor
宋长明
龚道永
钱宇
张宏宇
李伟东
刘沙
刘睿涛
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Wuxi Jiangnan Computing Technology Institute
Original Assignee
Wuxi Jiangnan Computing Technology Institute
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2019-08-19
Filing date
2019-08-19
Publication date
2024-02-06
Application filed by Wuxi Jiangnan Computing Technology Institute
Priority to CN201910764215.3A
Publication of CN112398750A
Application granted
Publication of CN112398750B

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L47/00 Traffic control in data switching networks
    • H04L47/10 Flow control; Congestion control
    • H04L47/22 Traffic shaping
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L69/00 Network arrangements, protocols or services independent of the application payload and not provided for in the other groups of this subclass
    • H04L69/04 Protocols for data compression, e.g. ROHC

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Computer Security & Cryptography (AREA)
  • Computer And Data Communications (AREA)
  • Information Transfer Between Computers (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a method for compressing and transmitting job startup data in parallel computing, comprising the following steps: S11, when a job is started, acquiring the full data information that must be sent to the computing resources running the job's tasks; S12, compressing the full data information into attribute data information from which the repeated common information has been removed; S2, compressing the attribute data information into format data information described by an independent formatted statement; S3, obtaining compressed transmission data information to be sent to the computing resources; S4, applying to the transmission data information the general decompression corresponding to the general compression algorithm, followed by reverse data analysis, to recover the original full data information; S5, the job program on each computing resource stores the recovered full data information locally, so that a computing resource can read any data it needs directly from local storage. The invention addresses the large transmission volume and long transmission time during large-scale job startup, improves the efficiency of large-scale job startup, and effectively relieves network pressure.

Description

Method for compressing and transmitting operation starting data in parallel computing
Technical Field
The invention relates to a method for compressing and transmitting job startup data in parallel computing, and belongs to the technical field of computers.
Background
Parallel computing is the process of solving a computing problem with multiple computing resources at the same time: multiple nodes or processors work on the same problem concurrently and cooperatively, which increases computing speed and processing capacity. When a parallel job is submitted and scheduled to start, the processes of the job depend on one another, so the job management system must transfer job information, resource information, job task information, and the like to every computing resource that runs a task of the job, in several rounds, in order to construct the parallel job. The speed at which this job-related data is transferred directly affects job startup time and efficiency.
In current mainstream parallel job management systems, when job information, resource information, and job task information must be transmitted to all computing resources running the job's tasks during startup, the original information is simply enumerated and transmitted as is. When the job is large, the amount of information to transmit is large and the transmission takes a long time; the transmission also puts considerable pressure on the network, which reduces the efficiency of job control. How to reduce the transmission volume and transmission time during large-scale job startup, improve startup efficiency, and relieve network pressure has therefore become a direction of effort for those skilled in the art.
Disclosure of Invention
The invention aims to provide a method for compressing and transmitting job startup data in parallel computing that addresses the large transmission volume and long transmission time during large-scale job startup, improves the efficiency of large-scale job startup, and effectively relieves network pressure.
To achieve this purpose, the invention adopts the following technical scheme: a method for compressing and transmitting job startup data in parallel computing, applied to large-scale parallel jobs, comprising the following steps:
S11, when a job is started, the full data information that must be sent to the computing resources running the job's tasks is acquired, wherein the job global resource information in the full data information contains repeated common information;
S12, the job management system extracts, from the full data information, the common information in the job global resource information that must be sent to each computing resource, so that the full data information is compressed into attribute data information in which the repetition is removed and which contains only a single copy of the common information plus the personalized information of each computing resource;
S2, the job management system analyzes the resource list in the attribute data information obtained in S12 for runs of consecutive entries and describes them as ranges, so that the attribute data information is compressed into format data information described by an independent formatted statement;
S3, the job management system compresses the common information to be sent to the computing resources running the job's tasks, together with the format data information obtained in S2, with a general compression algorithm to obtain the compressed transmission data information to be sent to the computing resources;
S4, each computing resource receives the transmission data information from the job management system and applies the general decompression corresponding to the general compression algorithm to obtain the common information and the format data information; it then performs reverse data analysis on them, converting the range description back into the original list description and re-adding the extracted common information to the personalized attribute information of each computing resource, to recover the original full data information;
S5, the job program on each computing resource stores the recovered full data information locally, so that whenever the computing resource needs data from the full data information it can read it directly from local storage.
Further refinements of the above technical scheme are as follows:
1. In the above scheme, in S12, the common information includes the resource type, the number of compute cores in a node, and the bitmap.
2. In the above scheme, in S11, the full data information includes job information, resource information, and job task information.
By applying the above technical scheme, the invention has the following advantages over the prior art:
The method addresses the large transmission volume and long transmission time during large-scale job startup. By compressing the data effectively while the information is collected and transmitted, it reduces the amount of data that must cross the network during job startup and shortens the transmission time. By storing the global data locally on the computing nodes, it also avoids transmitting the same data repeatedly. Together these measures relieve network transmission pressure and improve job startup efficiency.
Drawings
FIG. 1 is a flow chart of the method for compressing and transmitting job startup data in parallel computing.
Detailed Description
Embodiment: a method for compressing and transmitting job startup data in parallel computing, applied to large-scale parallel jobs, comprising the following steps:
S11, when a job is started, the full data information that must be sent to the computing resources running the job's tasks is acquired, wherein the job global resource information in the full data information contains repeated common information;
S12, the job management system extracts, from the full data information, the common information in the job global resource information that must be sent to each computing resource, so that the full data information is compressed into attribute data information in which the repetition is removed and which contains only a single copy of the common information plus the personalized information of each computing resource;
S2, the job management system analyzes the resource list in the attribute data information obtained in S12 for runs of consecutive entries and describes them as ranges, so that the attribute data information is compressed into format data information described by an independent formatted statement; for example, the resource list with node numbers 1,2,3,4,5,7,9 is converted by range description into 1-5,7,9; this compression is most effective when the number of resources is very large and the node numbering is largely consecutive;
S3, the job management system compresses the common information to be sent to the computing resources running the job's tasks, together with the format data information obtained in S2, with a general compression algorithm to obtain the compressed transmission data information to be sent to the computing resources;
S4, each computing resource receives the transmission data information from the job management system and applies the general decompression corresponding to the general compression algorithm to obtain the common information and the format data information; it then performs reverse data analysis on them, converting the range description back into the original list description and re-adding the extracted common information to the personalized attribute information of each computing resource, to recover the original full data information;
S5, the job program on each computing resource stores the recovered full data information locally, so that whenever the computing resource needs data from the full data information it can read it directly from local storage.
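The range description of S2 and its reversal in S4 can be illustrated with a short Python sketch. This is only an illustration under assumptions: the patent gives no syntax beyond the 1-5,7,9 example, the function names `compress_ranges` and `expand_ranges` are hypothetical, and node numbers are assumed to be unique non-negative integers.

```python
def compress_ranges(nodes):
    """Collapse node numbers into a range description such as '1-5,7,9'."""
    ids = sorted(set(nodes))  # assumption: order and duplicates carry no meaning
    parts = []
    i = 0
    while i < len(ids):
        j = i
        # Extend the run while the next number is consecutive.
        while j + 1 < len(ids) and ids[j + 1] == ids[j] + 1:
            j += 1
        parts.append(str(ids[i]) if i == j else f"{ids[i]}-{ids[j]}")
        i = j + 1
    return ",".join(parts)


def expand_ranges(text):
    """Inverse step: turn '1-5,7,9' back into the full list of node numbers."""
    nodes = []
    for part in text.split(","):
        if not part:
            continue  # an empty description expands to an empty list
        if "-" in part:
            lo, hi = part.split("-")
            nodes.extend(range(int(lo), int(hi) + 1))
        else:
            nodes.append(int(part))
    return nodes


print(compress_ranges([1, 2, 3, 4, 5, 7, 9]))  # 1-5,7,9
print(expand_ranges("1-5,7,9"))                # [1, 2, 3, 4, 5, 7, 9]
```

The description shrinks only where the numbering is consecutive; a list with no consecutive neighbours is emitted element by element and gains nothing from this step.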
In S12, the common information includes the resource type, the number of compute cores in a node, and the bitmap.
In S11, the full data information includes job information, resource information, and job task information.
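The extraction of common information in S12, and its re-addition in S4, might look like the following Python sketch. It rests on assumptions that are not in the patent: each computing resource is represented as a dictionary, a field counts as common when every record carries it with the same value, and the field names (`node`, `resource_type`, `cores_per_node`, `core_bitmap`) and function names are hypothetical.

```python
def extract_common(records):
    """Split per-node records into one shared dict and per-node remainders."""
    if not records:
        return {}, []
    first = records[0]
    # A field is common when every record has it with the same value.
    common = {
        key: value
        for key, value in first.items()
        if all(key in r and r[key] == value for r in records)
    }
    # What is left per node is its personalized information.
    personal = [
        {k: v for k, v in r.items() if k not in common} for r in records
    ]
    return common, personal


def restore_full(common, personal):
    """Reverse step: re-add the shared fields to each node's own fields."""
    return [{**common, **p} for p in personal]


records = [
    {"node": n, "resource_type": "cpu", "cores_per_node": 4, "core_bitmap": "1111"}
    for n in (1, 2, 3)
]
common, personal = extract_common(records)
print(common)    # one copy of the three shared fields
print(personal)  # [{'node': 1}, {'node': 2}, {'node': 3}]
assert restore_full(common, personal) == records
```

With N resources, each shared field is sent once instead of N times, which is where the reduction in this step comes from.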
The embodiment is further explained as follows:
Assume that the total amount of data to be transmitted is initially D1. The job management system first extracts the common information of the job global resources to be sent to each computing resource, such as the resource type, the number of compute cores in a node, and the bitmap, and replaces the many separate per-node attribute entries with one uniform global description. This step reduces the per-resource attribute data in the transmitted data and compresses the data volume from the initial D1 to D2.
The resource list in the transmission data is then analyzed for regularity and described as ranges, so that when the number of resources is large, a large amount of node list information can be described by one independent formatted statement. This compresses the transmission data volume from D2 to D3.
Because the information sent to a computing resource forms a complete unit that is analyzed and processed only after all of it has been received, with no processing needed during reception, the job management system can package the data and compress it with a general compression algorithm before sending it. This compresses the transmission data volume from D3 to D4, so only a compressed packet of size D4 needs to be sent to the computing resources.
After receiving the job information from the job management system, a computing resource recovers the original data (of size D1) through decompression and reverse analysis. A very small amount of computation thus replaces a large amount of data transmission.
The job program on the computing resource stores the job and resource information received from the job management system locally, so that it can be read locally when needed. This avoids frequently fetching data from the job management system over the control network and reduces the number of network messages.
In summary, extracting and condensing the common information of the resource attributes, analyzing the resource list for regularity and describing it as ranges, storing global information locally on the computing resources to avoid repeated transmission, and compressing the data with a general compression algorithm before network transmission together reduce the amount of data transmitted, shorten the transmission time, and improve job startup efficiency.
In addition, because job-related information is stored locally on the computing resources, the job management system can optimize away part of the information that would otherwise be repeated in the job data it sends multiple times during job startup, reducing the data volume of each transmission.
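The progression from D1 to D4 can be made concrete with a small end-to-end Python sketch. Everything specific in it is an assumption for illustration: the patent names neither a serialization format nor a particular compression algorithm, so JSON and `zlib` stand in here, the node number is treated as the only personalized field, the node list is assumed sorted and unique, and all function and field names are hypothetical.

```python
import json
import zlib


def to_ranges(ids):
    """Range description of sorted, unique integers, e.g. '1-5,7,9'."""
    parts, start, prev = [], None, None
    for n in ids:
        if start is None:
            start = prev = n
        elif n == prev + 1:
            prev = n
        else:
            parts.append(str(start) if start == prev else f"{start}-{prev}")
            start = prev = n
    if start is not None:
        parts.append(str(start) if start == prev else f"{start}-{prev}")
    return ",".join(parts)


def from_ranges(text):
    """Expand a range description back into the original list."""
    ids = []
    for part in filter(None, text.split(",")):
        lo, _, hi = part.partition("-")
        ids.extend(range(int(lo), int(hi or lo) + 1))
    return ids


def pack(common, node_ids):
    """Manager side: bundle common info and range text, then compress once."""
    bundle = {"common": common, "nodes": to_ranges(node_ids)}
    return zlib.compress(json.dumps(bundle).encode("utf-8"))


def unpack(payload):
    """Compute-resource side: decompress, then reverse the range description."""
    bundle = json.loads(zlib.decompress(payload).decode("utf-8"))
    return bundle["common"], from_ranges(bundle["nodes"])


def sizes(common, node_ids):
    """Return (D1, D2, D3, D4) in bytes for one example job."""
    full = [{"node": n, **common} for n in node_ids]
    d1 = len(json.dumps(full))                                   # raw listing
    d2 = len(json.dumps({"common": common, "nodes": node_ids}))  # common info factored out
    d3 = len(json.dumps({"common": common, "nodes": to_ranges(node_ids)}))
    d4 = len(pack(common, node_ids))                             # after general compression
    return d1, d2, d3, d4


if __name__ == "__main__":
    common = {"resource_type": "cpu", "cores_per_node": 4, "core_bitmap": "1111"}
    node_ids = [n for n in range(1, 2001) if n % 10 != 0]  # runs of 9 with gaps
    print(sizes(common, node_ids))
    assert unpack(pack(common, node_ids)) == (common, node_ids)
```

The bundle is compressed once as a whole, which matches the observation above that the receiver only processes the data after all of it has arrived. How much each stage saves depends on the data: a perfectly consecutive node list is already tiny after the range step, so the general compression step may add little in that case.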
To facilitate a better understanding of the invention, the terms used herein are briefly explained below:
Parallel computing: the process of solving a computing problem with multiple computing resources at the same time; multiple nodes or processors work on the same problem concurrently and cooperatively, which increases computing speed and processing capacity.
Parallel job: generally a set of task processes written with a parallel programming interface such as MPI, which runs on the computing resources of a parallel computer, is started and controlled by the job management system, and solves a single problem through inter-process cooperation.
Parallel job management system: a management and control system that runs on a parallel computer and performs functions such as parallel job scheduling, task startup, control, and recovery.
When the above method is used for job startup in parallel computing, it addresses the large transmission volume and long transmission time of large-scale job startup. The transmitted data is compressed effectively while the information is collected and transmitted, which reduces the amount of data that must cross the network during job startup and shortens the transmission time. Global data is also stored locally on the computing nodes, which avoids transmitting the same data repeatedly. Together these measures relieve network transmission pressure and improve job startup efficiency.
The above embodiments are provided to illustrate the technical concept and features of the present invention and are intended to enable those skilled in the art to understand the content of the present invention and implement the same, and are not intended to limit the scope of the present invention. All equivalent changes or modifications made in accordance with the spirit of the present invention should be construed to be included in the scope of the present invention.

Claims (3)

1. A method for compressing and transmitting job startup data in parallel computing, characterized in that: the method is applied to large-scale parallel jobs and comprises the following steps:
S11, when a job is started, the full data information that must be sent to the computing resources running the job's tasks is acquired, wherein the job global resource information in the full data information contains repeated common information;
S12, the job management system extracts, from the full data information, the common information in the job global resource information that must be sent to each computing resource, so that the full data information is compressed into attribute data information in which the repetition is removed and which contains only a single copy of the common information plus the personalized information of each computing resource;
S2, the job management system analyzes the resource list in the attribute data information obtained in S12 for runs of consecutive entries and describes them as ranges, so that the attribute data information is compressed into format data information described by an independent formatted statement;
S3, the job management system compresses the common information to be sent to the computing resources running the job's tasks, together with the format data information obtained in S2, with a general compression algorithm to obtain the compressed transmission data information to be sent to the computing resources;
S4, each computing resource receives the transmission data information from the job management system and applies the general decompression corresponding to the general compression algorithm to obtain the common information and the format data information; it then performs reverse data analysis on them, converting the range description back into the original list description and re-adding the extracted common information to the personalized attribute information of each computing resource, to recover the original full data information;
S5, the job program on each computing resource stores the recovered full data information locally, so that whenever the computing resource needs data from the full data information it can read it directly from local storage.
2. The method for compressing and transmitting job startup data in parallel computing according to claim 1, characterized in that: in S12, the common information includes the resource type, the number of compute cores in a node, and the bitmap.
3. The method for compressing and transmitting job startup data in parallel computing according to claim 1, characterized in that: in S11, the full data information includes job information, resource information, and job task information.
CN201910764215.3A 2019-08-19 2019-08-19 Method for compressing and transmitting operation starting data in parallel computing Active CN112398750B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910764215.3A CN112398750B (en) 2019-08-19 2019-08-19 Method for compressing and transmitting operation starting data in parallel computing

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910764215.3A CN112398750B (en) 2019-08-19 2019-08-19 Method for compressing and transmitting operation starting data in parallel computing

Publications (2)

Publication Number Publication Date
CN112398750A (en) 2021-02-23
CN112398750B (en) 2024-02-06

Family

ID=74603434

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910764215.3A Active CN112398750B (en) 2019-08-19 2019-08-19 Method for compressing and transmitting operation starting data in parallel computing

Country Status (1)

Country Link
CN (1) CN112398750B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH0955708A (en) * 1995-08-11 1997-02-25 Fujitsu Ltd Data compression and transfer system
CN105721810A (en) * 2014-12-05 2016-06-29 北大方正集团有限公司 Image compression storage method and apparatus
CN107172079A (en) * 2017-06-27 2017-09-15 武汉蓝星软件技术有限公司 A kind of data compression exchange method based on application service core frame platform
CN107409152A (en) * 2015-03-12 2017-11-28 英特尔公司 Method and apparatus for compressing the data received by network

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
GB2472072B (en) * 2009-07-24 2013-10-16 Hewlett Packard Development Co Deduplication of encoded data
US10963171B2 (en) * 2017-10-16 2021-03-30 Red Hat, Inc. Compressibility instrumented dynamic volume provisioning

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
一种并行作业任务启动模型及其可扩展性分析;宋长明等;计算机工程与科学;第35卷(第11期);第182-186页 *

Also Published As

Publication number Publication date
CN112398750A (en) 2021-02-23

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant