CN116932837A

CN116932837A - Pulsar parallel search optimization method and system based on clusters

Info

Publication number: CN116932837A
Application number: CN202311176454.XA
Authority: CN
Inventors: 李明辉; 张幸楠; 潘之辰; 张立云
Original assignee: Guizhou University
Current assignee: Guizhou University
Priority date: 2023-09-13
Filing date: 2023-09-13
Publication date: 2023-10-24

Abstract

The application provides a cluster-based pulsar parallel search optimization method and system. The method comprises the following steps: S1: dividing the raw data requested from FAST according to a preset division mode to obtain a plurality of data blocks; S2: applying for cluster resources according to preset conditions to obtain a master node and a plurality of computing nodes; S3: respectively assigning the plurality of computing nodes to the corresponding data blocks and carrying out the pulsar search calculation in parallel; S4: releasing the resources used by the calculation in step S3 and collecting the calculation results at the master node; S5: merging the results in the master node to obtain the final search result. The method and system aim to improve the calculation speed and precision of pulsar search, to solve the bottleneck problem under large-scale data by means of an efficient and reliable parallel calculation mode, and to ensure the feasibility and efficiency of the overall calculation.

Description

Pulsar parallel search optimization method and system based on clusters

Technical Field

The application relates to the technical field of computer software, in particular to a pulsar parallel search optimization method and system based on clusters.

Background

A pulsar is a high-density celestial body. Its stability, high rotation speed and electromagnetic radiation make it an important object of astronomical research, and it plays an important role in in-depth research in cosmology, astrophysics and related fields. Pulsar search is currently one of the main pulsar detection methods: based on the signals received by a radio telescope, pulsars are detected by analyzing and processing the data. However, when processing large-scale pulsar data, the slow computing speed and insufficient parallelism of conventional single-computer or cluster computing techniques have become bottlenecks, and efficient computation is difficult to achieve.

Accordingly, there is a need to develop a cluster-based pulsar parallel search optimization method and system that addresses the deficiencies of the prior art to solve or mitigate one or more of the problems described above.

Disclosure of Invention

In view of this, the application provides a pulsar parallel search optimization method and system based on clusters, which aims to improve the calculation speed and precision of pulsar search.

In one aspect, the application provides a cluster-based pulsar parallel search optimization method, which comprises the following steps:

S1: dividing the raw data according to a preset division mode to obtain a plurality of data blocks; the preset division mode is as follows: the total data is first divided into data blocks by observation batch, so that the whole data set is split into a plurality of smaller segments, each corresponding to one observation batch, and the segments are then processed in parallel using a plurality of command lines; the command lines use three commands of the PRESTO software: prepdata, realfft and accelsearch; first, the data block of each observation batch is preprocessed using the prepdata command, which is responsible for reading the data and performing basic data cleaning and calibration operations; next, a fast Fourier transform is performed on the preprocessed data using the realfft command, converting the time domain data into frequency domain data; finally, the accelsearch command is used to perform a fast cumulative search on the frequency domain data to detect pulse signals, wherein the command searches different dispersion values in the frequency domain according to a preset dispersion range and step length to find possible pulse signals;

S2: applying for cluster resources according to preset conditions to obtain a master node and a plurality of computing nodes, wherein the preset conditions are:

there are enough free resources in the cluster;

the resource requirement of the job is matched with the resource allocation strategy in the cluster;

the current scheduling policy allows the job to be submitted and executed;

S3: respectively assigning the plurality of computing nodes to the corresponding data blocks, and carrying out the pulsar search calculation in parallel;

S4: releasing the resources used by the calculation in step S3, and collecting the calculation results at the master node;

S5: merging the results in the master node to obtain the final search result, wherein the merging of the results in the master node is specifically as follows: during the parallel computation, the calculation results of all computing nodes are uploaded to the master node in real time through the MPI communication mechanism and merged there, so that the final search result is obtained.

In the aspect and any possible implementation manner described above, there is further provided an implementation manner, wherein the raw data in S1 is pulsar observation data requested from FAST, and the raw data is stored in the form of FITS files, including but not limited to the observation parameters of the pulsar, signal strength, and frequency information.

In the aspect and any possible implementation manner described above, there is further provided an implementation manner, wherein the plurality of computing nodes in S2 includes, but is not limited to, 20 core resource nodes.

In the aspect and any possible implementation manner as described above, there is further provided an implementation manner, where the cluster resource application in S2 is completed by a Slurm scheduler.

In the aspect and any possible implementation manner described above, there is further provided an implementation manner, in S3, a plurality of computing nodes are allocated to a plurality of block data correspondingly, and specifically:

S31: submitting the job using the sbatch command of Slurm, and specifying the total amount of resources required by the job, including the number of nodes and the number of CPU cores;

S32: the sbatch command adds the job to the waiting job queue; once the job's resource requirement can be met and the job reaches the head of the queue, Slurm allocates available resources to the job and assigns the resources to each data block.

In the aspect and any possible implementation manner described above, there is further provided an implementation manner, where the resource release process in S4 is specifically: after a command line finishes running, the CPU cores and memory it occupied are released, that is, the command line no longer occupies the allocated CPU cores and memory and returns them to the cluster system; during the release of resources, the results or intermediate data generated by the parallel tasks are stored as required, and the results are written to a designated storage medium after each command line has been executed.

In the aspect and any possible implementation manner described above, there is further provided a cluster-based pulsar parallel search optimization system, based on the cluster-based pulsar parallel search optimization method, the pulsar parallel search optimization system includes:

the data dividing module is used for dividing the raw data according to a preset division mode to obtain a plurality of data blocks; the preset division mode is as follows: the total data is first divided into data blocks by observation batch, so that the whole data set is split into a plurality of smaller segments, each corresponding to one observation batch, and the segments are then processed in parallel using a plurality of command lines; the command lines use three commands of the PRESTO software: prepdata, realfft and accelsearch; first, the data block of each observation batch is preprocessed using the prepdata command, which is responsible for reading the data and performing basic data cleaning and calibration operations; next, a fast Fourier transform is performed on the preprocessed data using the realfft command, converting the time domain data into frequency domain data; finally, the accelsearch command is used to perform a fast cumulative search on the frequency domain data to detect pulse signals, wherein the command searches different dispersion values in the frequency domain according to a preset dispersion range and step length to find possible pulse signals;

the resource application module is used for applying for cluster resources according to preset conditions to obtain a master node and a plurality of computing nodes, wherein the preset conditions are:

there are enough free resources in the cluster;

the current scheduling policy allows the job to be submitted and executed;

the data distribution calculation module is used for respectively assigning the plurality of computing nodes to the corresponding data blocks and carrying out the pulsar search calculation in parallel;

the data summarizing module is used for releasing the resources used by the calculation and collecting the calculation results at the master node;

the data merging module is used for merging the results in the main node to obtain a final search result; the merging process specifically comprises the following steps: in the parallel computing process, the computing results of all computing nodes are uploaded to a main node in real time through an MPI communication mechanism to be combined, so that a final search result is obtained.

Compared with the prior art, the application can obtain the following technical effects:

In conclusion, by implementing dynamic resource application and parallel queued execution of tasks, the technical scheme of the application provides a reliable guarantee for the efficient calculation of the pulsar search algorithm in a large-scale cluster environment. The scheme is simple to implement, achieves a high resource utilization rate and is easy to extend; it has been verified through a code implementation and has broad application prospects in fields such as large-scale data processing and analysis.

Of course, it is not necessary for any of the products embodying the application to achieve all of the technical effects described above at the same time.

Drawings

In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings that are needed in the embodiments will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present application, and that other drawings can be obtained according to these drawings without inventive effort for a person skilled in the art.

FIG. 1 is a flowchart of a cluster-based pulsar parallel search optimization algorithm according to an embodiment of the present application;

FIG. 2 is a step diagram of a cluster-based pulsar parallel search optimization method according to an embodiment of the present application.

Detailed Description

For a better understanding of the technical solution of the present application, the following detailed description of the embodiments of the present application refers to the accompanying drawings.

It should be understood that the described embodiments are merely some, but not all, embodiments of the application. All other embodiments, which can be made by those skilled in the art based on the embodiments of the application without making any inventive effort, are intended to be within the scope of the application.

The terminology used in the embodiments of the application is for the purpose of describing particular embodiments only and is not intended to be limiting of the application. As used in this application and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise.

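As shown in FIG. 2, the application provides a cluster-based pulsar parallel search optimization method, which comprises the following steps:

S1: dividing the raw data according to a preset division mode to obtain a plurality of data blocks;

S2: applying for cluster resources according to preset conditions to obtain a master node and a plurality of computing nodes;

S3: respectively assigning the plurality of computing nodes to the corresponding data blocks, and carrying out the pulsar search calculation in parallel;

S4: releasing the resources used by the calculation in step S3, and collecting the calculation results at the master node;

S5: merging the results in the master node to obtain the final search result.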

The preset division mode in step S1 is specifically as follows. The total data is first divided into data blocks by observation batch, which means that the whole data set is split into a number of smaller segments, one for each observation batch. The segments are then processed in parallel using a plurality of command lines. The command lines use three commands of the PRESTO software: prepdata, realfft and accelsearch. First, the data block of each observation batch is preprocessed using the prepdata command, which is responsible for reading the data and performing basic data cleaning and calibration operations. Next, the preprocessed data is subjected to a fast Fourier transform using the realfft command, converting the time domain data into frequency domain data. Finally, the accelsearch command performs a fast cumulative search on the frequency domain data to detect pulse signals; the command searches different dispersion values in the frequency domain according to a preset dispersion range and step length to find possible pulse signals. When these command lines are run in parallel, different command lines are responsible for different observation batches, which increases the efficiency of data processing. In addition, because the processing is parallel, each command line can handle different dispersion values, which widens the search coverage for pulse signals.
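The three-command chain described above can be sketched as a small Python helper that only builds the command lines for one observation batch and one trial dispersion value, without running them. This is a minimal sketch under stated assumptions: the file names, the function names `presto_commands` and `dm_grid`, and the choice of options are illustrative and not taken from the application. In PRESTO the trial dispersion measure is normally passed to prepdata with `-dm`, and the sketch assumes that convention.

```python
from typing import List


def dm_grid(dm_min: float, dm_max: float, dm_step: float) -> List[float]:
    """Trial dispersion values from dm_min to dm_max (inclusive) in steps of dm_step."""
    n_trials = int(round((dm_max - dm_min) / dm_step)) + 1
    return [round(dm_min + i * dm_step, 6) for i in range(n_trials)]


def presto_commands(batch_file: str, out_base: str, dm: float) -> List[List[str]]:
    """Build the prepdata -> realfft -> accelsearch chain for one batch and one DM.

    The option choices (-nobary, -zmax 0) are assumptions made for this sketch.
    """
    base = f"{out_base}_DM{dm:.2f}"
    return [
        # prepdata writes the dedispersed time series <base>.dat (plus <base>.inf)
        ["prepdata", "-nobary", "-dm", f"{dm:.2f}", "-o", base, batch_file],
        # realfft transforms <base>.dat into the frequency-domain file <base>.fft
        ["realfft", f"{base}.dat"],
        # accelsearch searches the spectrum for periodic signals
        ["accelsearch", "-zmax", "0", f"{base}.fft"],
    ]


if __name__ == "__main__":
    # Hypothetical batch file name and DM range, for illustration only
    for dm in dm_grid(0.0, 1.0, 0.5):
        for cmd in presto_commands("batch01.fits", "batch01", dm):
            print(" ".join(cmd))
```

Each returned list can be handed to a process launcher or written into a batch script, so one command line per observation batch and dispersion value can be queued independently.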

The raw data in S1 is pulsar observation data requested from FAST (Five-hundred-meter Aperture Spherical radio Telescope). The data is stored in the form of FITS files and includes information such as the observation parameters of the pulsar, the signal strength and the frequency.

The preset conditions in S2 are specifically: whether there are enough idle resources in the cluster; whether the resource requirement of the job matches the resource allocation policy of the cluster; and whether the current scheduling policy allows the job to be submitted and executed.
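The three conditions can be read as one boolean check performed before resources are requested. The following is a minimal sketch, assuming a simplified view of the cluster; the classes `JobRequest` and `ClusterState` and all of their fields are hypothetical and do not correspond to any Slurm data structure.

```python
from dataclasses import dataclass


@dataclass
class JobRequest:
    nodes: int            # number of nodes the job asks for
    cores_per_node: int   # CPU cores needed on each node


@dataclass
class ClusterState:
    idle_nodes: int          # nodes currently free
    cores_per_node: int      # cores a single node can offer
    max_nodes_per_job: int   # hypothetical allocation-policy limit
    submissions_open: bool   # hypothetical scheduling-policy switch


def can_apply(job: JobRequest, cluster: ClusterState) -> bool:
    """Return True only if all three preset conditions hold."""
    # Condition 1: enough idle resources in the cluster
    enough_idle = cluster.idle_nodes >= job.nodes
    # Condition 2: the request fits the resource allocation policy
    matches_policy = (
        job.cores_per_node <= cluster.cores_per_node
        and job.nodes <= cluster.max_nodes_per_job
    )
    # Condition 3: the scheduling policy allows submission and execution
    return enough_idle and matches_policy and cluster.submissions_open


if __name__ == "__main__":
    print(can_apply(JobRequest(1, 20), ClusterState(3, 20, 4, True)))
```

In a real deployment the scheduler itself evaluates these conditions; the sketch only makes the logic explicit.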

The several computing nodes in S2 include, but are not limited to, 20 core resource nodes.

The cluster resource application in S2 is completed through the Slurm scheduler.

The allocation process in S3 is specifically as follows: the total resources are applied for with the sbatch command of Slurm, and the resources are then allocated to the respective data blocks. In this method, a job is first submitted using the sbatch command, and the total amount of resources required for the job, including the number of nodes and the number of CPU cores, is specified. The sbatch command adds the job to the waiting job queue; once the job's resource requirement can be met and the job reaches the head of the queue, Slurm allocates available resources to the job and assigns the resources to each data block. This allocation is not random but is based on the availability of resources in the cluster, for example by matching against the available nodes and CPU cores. If the available resources are insufficient to meet the job's needs, the job waits until sufficient resources are available.
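As an illustration of the submission step, the sketch below generates the text of a batch script that requests a number of nodes and tasks through `#SBATCH` directives. The job name, the wrapped command and the function name `make_sbatch_script` are assumptions made for this example; only the standard sbatch options `--job-name`, `--nodes`, `--ntasks` and `--output` are used.

```python
def make_sbatch_script(job_name: str, nodes: int, ntasks: int, command: str) -> str:
    """Return a batch script that requests `nodes` nodes and `ntasks` tasks."""
    lines = [
        "#!/bin/bash",
        f"#SBATCH --job-name={job_name}",
        f"#SBATCH --nodes={nodes}",
        f"#SBATCH --ntasks={ntasks}",
        # %x expands to the job name and %j to the job ID
        "#SBATCH --output=%x_%j.out",
        "",
        command,
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical job: one node and 20 tasks for one data block
    print(make_sbatch_script("psr_batch01", 1, 20, "bash run_batch01.sh"))
```

Writing the returned text to a file and passing that file to `sbatch` would place the job in the waiting queue described above.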

The resource release process in S4 is specifically as follows: when a command line finishes running, the CPU cores and memory it occupied are released, which means that the command line no longer occupies the allocated CPU cores and memory and returns them to the cluster system. During the release of resources, the results or intermediate data generated by the parallel tasks can be stored as required, and the results can be written to a designated storage medium after each command line has been executed, which prevents data loss and provides the necessary data for subsequent analysis and processing. After the CPU cores and memory have been released, the cluster system may reassign them to command lines that have not yet been executed, so that those command lines can begin parallel execution using the resources just released.
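The release-and-reuse behaviour can be imitated on a single machine with a fixed-size worker pool: when one task finishes, its slot is freed and the next queued task starts, and each result is written to storage as soon as it is available. This is only an analogy under stated assumptions; the function `fake_search` stands in for a real command line, and the JSON file layout is hypothetical.

```python
import json
import tempfile
from concurrent.futures import ThreadPoolExecutor, as_completed
from pathlib import Path
from typing import Callable, Dict, List


def run_and_store(tasks: List[str], run: Callable[[str], Dict],
                  out_dir: Path, slots: int) -> List[Path]:
    """Run tasks on `slots` workers and persist each result once its task completes."""
    out_dir.mkdir(parents=True, exist_ok=True)
    written: List[Path] = []
    with ThreadPoolExecutor(max_workers=slots) as pool:
        futures = {pool.submit(run, name): name for name in tasks}
        # as_completed yields each task when it finishes; its worker slot is
        # then free and the pool starts the next queued task automatically
        for future in as_completed(futures):
            path = out_dir / f"{futures[future]}.json"
            path.write_text(json.dumps(future.result()))
            written.append(path)
    return written


def fake_search(name: str) -> Dict:
    """Placeholder for one search command line (hypothetical result format)."""
    return {"task": name, "candidates": len(name)}


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        paths = run_and_store(["a", "bb", "ccc"], fake_search, Path(tmp), slots=2)
        print(sorted(p.name for p in paths))
```

On a cluster the scheduler plays the role of the pool, returning cores and memory to the shared system rather than to a local thread pool.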

The merging of the results in step S5 is specifically as follows: through the MPI communication mechanism, the tasks on different computing nodes exchange data and their results are combined.

The application provides a pulsar parallel search optimization system based on clusters, which comprises:

there are enough free resources in the cluster;

the current scheduling policy allows the job to be submitted and executed;

As shown in FIG. 1, in the flow of the application, the data is divided first: the original data set is divided in a preset manner, the data blocks are then distributed to different computing nodes, and pulsar signals are searched in parallel using the parallel computing capability across the nodes. The computing nodes may be multiple computers within a cluster, with the parallel computing capability determined by the number of nodes and the number of cores. During the parallel computation, the calculation results of all computing nodes are uploaded to the master node in real time through the MPI communication mechanism and merged there, so that the final search result is obtained.
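The merge on the master node can be sketched independently of the transport. In the sketch below the per-node results are passed in as a plain list of lists, standing in for what an MPI gather to the master rank would deliver; the candidate fields `dm`, `period` and `sigma` and the ranking by `sigma` are assumptions made for illustration and are not specified in the application.

```python
from typing import Dict, List

Candidate = Dict[str, float]


def merge_candidates(per_node: List[List[Candidate]], top_n: int = 10) -> List[Candidate]:
    """Flatten the per-node candidate lists and keep the strongest top_n."""
    merged = [cand for node_results in per_node for cand in node_results]
    # Rank by detection significance, strongest first (assumed ranking key)
    merged.sort(key=lambda cand: cand["sigma"], reverse=True)
    return merged[:top_n]


if __name__ == "__main__":
    node0 = [{"dm": 10.0, "period": 0.5, "sigma": 6.2}]
    node1 = [{"dm": 12.5, "period": 0.5, "sigma": 9.1},
             {"dm": 30.0, "period": 1.2, "sigma": 4.0}]
    for cand in merge_candidates([node0, node1], top_n=2):
        print(cand)
```

Keeping the merge as a pure function makes it easy to test without an MPI runtime and to call from whichever rank acts as the master node.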

The application aims to solve the bottleneck problem of the traditional calculation mode in large-scale pulsar data processing. The technical scheme uses the supercomputer cluster environment of the State Key Laboratory of Public Big Data at Guizhou University to realize fast and efficient searching of large-scale pulsar data. The cluster consists of 20 servers, each equipped with 1 Intel Xeon E5-2620 v4 core 2.1GHz processor, 64GB of DDR4 memory and a 1TB SATA hard disk; it adopts an InfiniBand FDR 56Gb/s high-speed interconnection network and supports MPI parallel computation. All nodes in the cluster run the Red Hat Enterprise Linux Server 7.6 operating system, providing a stable software development and computing environment. The Slurm scheduler is adopted to support the allocation, submission and monitoring of computing tasks, and the utilization of computing resources is maximized by automating the reservation and release of computing nodes.

Specifically, the scheme subdivides the pulsar search task according to indexes such as data volume, dispersion range and step length, and distributes the subtasks to 20 nodes for parallel processing. Through the OpenMPI parallel programming framework, the code executes 20 parallel computing tasks on each node; each task is responsible for reading a small block of data and outputs its result through the pulsar search algorithm. Through the MPI communication mechanism, the tasks on different computing nodes exchange data and their results are combined to form the final pulsar search result.
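One way to subdivide a search by dispersion range and step length is to enumerate the trial values and cut the list into contiguous chunks of nearly equal size, one chunk per task. The following is a minimal sketch; the function name `split_dm_range` and the example range are assumptions, and the application does not state how the subdivision is balanced.

```python
from typing import List


def split_dm_range(dm_min: float, dm_max: float, dm_step: float,
                   n_tasks: int) -> List[List[float]]:
    """Split the trial dispersion values into n_tasks contiguous chunks."""
    n_trials = int(round((dm_max - dm_min) / dm_step)) + 1
    trials = [round(dm_min + i * dm_step, 6) for i in range(n_trials)]
    base, extra = divmod(n_trials, n_tasks)
    chunks: List[List[float]] = []
    start = 0
    for task in range(n_tasks):
        # The first `extra` tasks take one additional trial so sizes differ by at most 1
        size = base + (1 if task < extra else 0)
        chunks.append(trials[start:start + size])
        start += size
    return chunks


if __name__ == "__main__":
    # Hypothetical range: 200 trial values shared by 20 tasks
    chunks = split_dm_range(0.0, 99.5, 0.5, 20)
    print(len(chunks), len(chunks[0]), chunks[0][0], chunks[-1][-1])
```

Contiguous chunks keep each task's trial values adjacent, which simplifies naming the output files per task.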

In order to ensure that each search task has enough resources to run while keeping the resource utilization of the whole cluster as high as possible, the application adopts a dynamic resource application mode. Before a task is submitted to the Slurm scheduler, a resource pre-application is performed for 1 node and 20 cores. The management tools of the Slurm scheduler are used to monitor the running state of each task and to adjust its resource allocation at any time, so that tasks complete efficiently and resources are used optimally.

The specific implementation details of the application are as follows:

In order to realize the dynamic application of resources, the application adopts the pre-application function provided by the Slurm scheduler. In the code, each search task invokes a resource pre-application function before it is submitted to the Slurm scheduler. The pre-application function applies for 1 node and 20 cores on the cluster using the API provided by Slurm, and stores the resource information in the environment variables of the task. The pre-applied resources are used for the scheduling and allocation of subsequent computing tasks.
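A task can recover its allocation from environment variables. The sketch below assumes that the stored resource information is exposed through the variables Slurm itself sets inside a job, `SLURM_JOB_NUM_NODES` and `SLURM_NTASKS`; the application does not name the variables it uses, and the function name `read_allocation` and the fallback values are hypothetical.

```python
import os
from typing import Dict, Mapping


def read_allocation(env: Mapping[str, str] = os.environ) -> Dict[str, int]:
    """Read the node and task counts of the current allocation from the environment.

    Falls back to 1 node and 1 task when the variables are absent, for example
    when the code is run outside a Slurm job.
    """
    return {
        "nodes": int(env.get("SLURM_JOB_NUM_NODES", "1")),
        "tasks": int(env.get("SLURM_NTASKS", "1")),
    }


if __name__ == "__main__":
    # Simulated environment for a pre-applied allocation of 1 node and 20 tasks
    print(read_allocation({"SLURM_JOB_NUM_NODES": "1", "SLURM_NTASKS": "20"}))
```

Passing the environment as a parameter keeps the function testable without a running scheduler.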

For tasks that cannot acquire resources in time, the application relies on the waiting mechanism provided by the Slurm scheduler: the tasks wait until resources become available in the cluster.
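The waiting behaviour can be illustrated with a small first-in, first-out simulation in which the job at the head of the queue starts only when enough cores are free, and otherwise waits for the earliest running job to finish. This is a simplified model under stated assumptions: real Slurm scheduling also considers priorities and may backfill, and the function name `fifo_schedule` and the job tuples are hypothetical.

```python
from collections import deque
from typing import List, Tuple


def fifo_schedule(jobs: List[Tuple[str, int, int]], total_cores: int) -> List[Tuple[str, int]]:
    """Simulate strict FIFO scheduling.

    jobs: (name, cores, duration) in submission order.
    Returns (name, start_time) pairs in the order the jobs start.
    """
    queue = deque(jobs)
    running: List[Tuple[int, int]] = []  # (end_time, cores) of jobs in progress
    free, now = total_cores, 0
    starts: List[Tuple[str, int]] = []
    while queue:
        name, cores, duration = queue[0]
        if cores > total_cores:
            raise ValueError(f"job {name} can never fit on this cluster")
        if cores <= free:
            # Enough free cores: the head of the queue starts now
            queue.popleft()
            free -= cores
            running.append((now + duration, cores))
            starts.append((name, now))
        else:
            # Not enough cores: wait until the earliest running job releases its cores
            running.sort()
            end_time, released = running.pop(0)
            now = max(now, end_time)
            free += released
    return starts


if __name__ == "__main__":
    # Three hypothetical jobs competing for a single 20-core node
    print(fifo_schedule([("a", 20, 5), ("b", 20, 3), ("c", 10, 2)], total_cores=20))
```

With a single 20-core node the second job cannot start until the first one has released its cores, which is the waiting behaviour described above.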

For each computing task, the code of the application decomposes the task into 20 subtasks that are computed in parallel, and performs task queuing and resource scheduling through the Slurm scheduler. In the code, the 20 subtasks are submitted to the Slurm scheduler at one time by calling the job submission function provided by Slurm. Each subtask is scheduled and allocated according to the pre-applied resource information of 1 node and 20 cores. The Slurm scheduler arranges for the tasks to run on available resources according to factors such as task priority, resource requirements and the current cluster load, so as to maximize the utilization of cluster resources.
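One common way to submit a fixed number of similar subtasks in a single call is a Slurm job array, in which each array element reads its index from `SLURM_ARRAY_TASK_ID`. The application does not say that a job array is used, so the sketch below is an assumption; the script name and the helper names `array_submit_argv` and `my_chunk` are hypothetical.

```python
from typing import List, Mapping


def array_submit_argv(script: str, n_tasks: int) -> List[str]:
    """Build the sbatch argument list that submits n_tasks array elements at once."""
    return ["sbatch", f"--array=0-{n_tasks - 1}", script]


def my_chunk(chunks: List[List[float]], env: Mapping[str, str]) -> List[float]:
    """Select the work chunk that belongs to the current array element."""
    return chunks[int(env["SLURM_ARRAY_TASK_ID"])]


if __name__ == "__main__":
    # 20 subtasks submitted with one command line (script name is hypothetical)
    print(" ".join(array_submit_argv("search_block.sh", 20)))
    # Inside array element 1, the subtask picks the second chunk
    print(my_chunk([[0.0, 0.5], [1.0, 1.5]], {"SLURM_ARRAY_TASK_ID": "1"}))
```

The argument list can be passed to a process launcher on a login node; the scheduler then queues and places the 20 elements according to its own policy.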

Example 1:

The method provided by the application is implemented on the supercomputer cluster of the jointly built Key Laboratory of Public Big Data at Guizhou University. In this environment, the pulsar search algorithm achieves efficient computation in a large-scale cluster through dynamic resource application and parallel queued execution of tasks. Each search sub-block is subdivided according to the dispersion range and step length, computing resources of 1 node and 20 cores are applied for, and highly concurrent processing is carried out through 20 parallel computing tasks. Resource allocation and management are carried out through the Slurm scheduler, so that cluster resources are utilized to the greatest extent and the efficiency of the search tasks is improved; at the same time, the use of the resources of the supercomputer center is planned, and optimal allocation and utilization of the resources is realized. The method is simple to implement, achieves a high resource utilization rate and is easy to extend; it has been verified through a code implementation and can be widely applied in fields such as large-scale data processing and analysis.

The cluster-based pulsar parallel search optimization method and system provided by the embodiments of the application have been described in detail above. The above description of the embodiments is only intended to aid understanding of the method of the application and its core ideas. Meanwhile, those skilled in the art may vary the specific embodiments and the scope of application in accordance with the ideas of the application. In view of the above, this description should not be construed as limiting the application.

Certain terms are used throughout the description and claims to refer to particular components. Those of skill in the art will appreciate that a hardware manufacturer may refer to the same component by different names. The description and claims distinguish components not by name but by function. As used throughout the specification and claims, the terms "comprising," "including," and "includes" are intended to be interpreted as "including/comprising, but not limited to." "Substantially" means that, within an acceptable error range, a person skilled in the art is able to solve the technical problem and substantially achieve the technical effect. The description hereinafter sets forth a preferred embodiment for practicing the application, but is given for the purpose of illustrating the general principles of the application and is not intended to limit its scope. The scope of the application is defined by the appended claims.

It should also be noted that the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a product or system that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such product or system. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other like elements in a product or system comprising such elements.

It should be understood that the term "and/or" as used herein merely describes an association between the associated objects and indicates that three relationships are possible; for example, "A and/or B" may represent: A exists alone, A and B exist together, or B exists alone. In addition, the character "/" herein generally indicates that the objects before and after it are in an "or" relationship.

While the foregoing description illustrates and describes the preferred embodiments of the present application, it is to be understood that the application is not limited to the forms disclosed herein, is not to be construed as excluding other embodiments, and is capable of use in numerous other combinations, modifications and environments, and of changes or modifications within the scope of the inventive concept expressed herein, either as a result of the foregoing teachings or of the knowledge or technology of the relevant art. Modifications and variations that do not depart from the spirit and scope of the application are intended to be within the scope of the appended claims.

Claims

1. The pulsar parallel search optimization method based on the clusters is characterized by comprising the following steps of:

there are enough free resources in the cluster;

the current scheduling policy allows the job to be submitted and executed;

s3: respectively and correspondingly distributing a plurality of calculation nodes to a plurality of block data, and carrying out pulsar parallel search calculation;

2. The cluster-based pulsar parallel search optimization method according to claim 1, wherein the raw data in S1 is pulsar observation data requested from FAST, and the raw data is stored in the form of FITS files, including but not limited to pulsar observation parameters, signal strength, and frequency information.

3. A cluster-based pulsar parallel search optimization method according to claim 1, wherein said number of computing nodes in S2 includes, but is not limited to, 20 core resource nodes.

4. The method for optimizing pulsar parallel search based on cluster according to claim 3, wherein the step of applying for cluster resources according to preset meeting conditions in the step of S2 is accomplished by a Slurm scheduler.

5. The cluster-based pulsar parallel search optimization method according to claim 1, wherein in S3, a plurality of computing nodes are allocated to a plurality of partitioned data respectively, and specifically:

6. The cluster-based pulsar parallel search optimization method according to claim 1, wherein the resource release process in S4 specifically comprises: after a command line finishes running, the CPU cores and memory it occupied are released, that is, the command line no longer occupies the allocated CPU cores and memory and returns them to the cluster system; during the release of resources, the results or intermediate data generated by the parallel tasks are stored as required, and the results are written to a designated storage medium after each command line has been executed.

7. A cluster-based pulsar parallel search optimization system, comprising:

there are enough free resources in the cluster;

the current scheduling policy allows the job to be submitted and executed;

the data distribution calculation module is used for correspondingly distributing a plurality of calculation nodes to a plurality of block data respectively and carrying out pulsar parallel search calculation;

the data merging module is used for merging the results in the main node to obtain a final search result; the merging processing is carried out on the results in the main node, specifically: in the parallel computing process, the computing results of all computing nodes are uploaded to a main node in real time through an MPI communication mechanism to be combined, so that a final search result is obtained.