CN115827238A - Single pulse searching method based on Ray parallel framework - Google Patents

Single pulse searching method based on Ray parallel framework

Info

Publication number
CN115827238A
Authority
CN
China
Prior art keywords
cpu
creating
pulse
search
list
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211602907.6A
Other languages
Chinese (zh)
Inventor
傅志明
于徐红
谢晓尧
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guizhou Education University
Original Assignee
Guizhou Education University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2022-12-13
Filing date
2022-12-13
Publication date
2023-03-21
Application filed by Guizhou Education University
Priority to CN202211602907.6A
Publication of CN115827238A
Legal status: Pending

Classifications

    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The invention discloses a single-pulse search method based on the Ray parallel framework, comprising the following steps: S1, reading the pulsar time-series data files; S2, creating a DM list, a candlist and a num_v_DMstr list; S3, creating a local scheduler and distributing single-pulse search tasks according to the state of the CPU processes; S4, spawning a Worker process on each CPU to perform a single-pulse search task; and S5, recording and storing the pulsar candidate information found on each CPU. The parallel strategy of the invention is based on the CPU only; it does not depend on CUDA or a GPU, requires no code modification, and achieves high-performance single-pulse search in a CPU environment.

Description

Single pulse searching method based on Ray parallel framework
Technical Field
The invention relates to the technical field of pulsar data processing, in particular to a single-pulse searching method based on a Ray parallel framework.
Background
A pulsar is a special type of neutron star with an extremely strong magnetic field and an extremely stable rotation period. Because of this stable periodicity, pulsar detection currently relies mainly on the fast Fourier transform and the fast folding algorithm to perform periodicity searches. In practice, however, some signals show no obvious periodicity, such as nulling pulsars, intermittent pulsars, fast radio bursts (FRBs), a special class of astronomical pulse signal discovered by Lorimer et al., and rotating radio transients (RRATs) discovered by McLaughlin et al. Such objects cannot be found by periodicity searches and must be found by single-pulse search.
The single-pulse search tools in common use today are PRESTO and HEIMDALL, a dedicated GPU-accelerated single-pulse search package. PRESTO is a large pulsar search and analysis software suite developed by Scott Ransom. It is one of the most widely used packages for pulsar searching and has contributed to the discovery of more than 700 pulsars. FAST observation data are searched mainly with PRESTO, but PRESTO was developed relatively early and most of its search process runs serially on a single core, so its search speed remains insufficient for massive volumes of pulsar observation data.
Disclosure of Invention
The invention aims to provide a single-pulse search method based on the Ray parallel framework that solves the problem described above.
To achieve this purpose, the invention provides the following technical solution:
A single-pulse search method based on the Ray parallel framework comprises the following steps:
S1, reading the pulsar time-series data files;
S2, creating a DM list, a candlist and a num_v_DMstr list;
S3, creating a local scheduler and distributing single-pulse search tasks according to the state of the CPU processes;
S4, spawning a Worker process on each CPU to perform a single-pulse search task;
and S5, recording and storing the pulsar candidate information found on each CPU.
Specifically, the single-pulse search method based on the Ray parallel framework comprises the following steps:
S1, reading the pulsar time-series data files:
reading the pulsar time-series .dat data files obtained after dedispersion, together with the parameter information corresponding to each .dat file, and storing them as the args parameters;
S2, creating a DM list, a candlist and a num_v_DMstr list:
creating a DM list that records the dispersion measure (DM) value used to dedisperse each .dat data file, creating a candlist that records and stores the candidate information finally obtained by the single-pulse search, and creating a num_v_DMstr list that records the number of candidates found at each DM value;
S3, creating a local scheduler and distributing single-pulse search tasks according to the state of the CPU processes:
distributing the single-pulse search tasks evenly according to the number of CPUs on the current platform and the process state of each CPU;
S4, spawning a Worker process on each CPU to perform a single-pulse search task;
and S5, recording and storing the pulsar candidate information found on each CPU:
receiving the candidate information obtained by the single-pulse search task in each process and storing it in a .singlepulse file.
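Steps S1 and S2 can be sketched in Python as follows. This is a minimal illustration, not the code of the invention: the helper name `build_search_state`, the `_DM<value>.dat` file-name pattern, and the choice of a dictionary keyed by the DM string for num_v_DMstr are all assumptions made for the example.

```python
import re

# Assumed naming convention: each dedispersed file ends in "_DM<value>.dat".
DM_PATTERN = re.compile(r"_DM(\d+\.\d+)\.dat$")


def build_search_state(dat_files):
    """Create the DM list, candlist and num_v_DMstr for a set of .dat files.

    Hypothetical helper: the DM of each file is taken from its name.
    """
    dms = []            # DM value used to dedisperse each file
    num_v_DMstr = {}    # number of candidates found per DM (keyed by DM string)
    for name in dat_files:
        match = DM_PATTERN.search(name)
        if match is None:
            raise ValueError("cannot read a DM value from %r" % name)
        dm_str = match.group(1)
        dms.append(float(dm_str))
        num_v_DMstr[dm_str] = 0
    candlist = []       # filled later with the candidates returned by the search
    return dms, candlist, num_v_DMstr
```

For two files named `obs_DM10.50.dat` and `obs_DM11.00.dat`, the helper returns the DM values 10.5 and 11.0, an empty candlist, and a zero count for each DM string.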
Compared with the prior art, the invention has the following beneficial effects:
The parallel strategy of the invention is based on the CPU only; it does not depend on CUDA or a GPU, requires no code modification, and achieves high-performance single-pulse search in a CPU environment.
Compared with the detrending algorithm in the original PRESTO single-pulse search, the redesigned and optimized detrending algorithm effectively improves detrending performance.
Compared with the original PRESTO single-pulse search program, the invention optimizes that program with the Ray parallel framework and significantly improves its performance.
The method first optimizes the detrending algorithm in the PRESTO single-pulse search. It then exploits the fact that the single-pulse search algorithm has no data coupling when searching different .dat data files, and optimizes the original PRESTO single-pulse search program with the Ray parallel framework, significantly improving performance while preserving the search results.
The algorithm optimization and parallelization in the invention run in a pure CPU environment, do not depend on CUDA or a GPU, and require no code modification. They achieve high-performance single-pulse search in a CPU environment and greatly increase the speed of pulsar data processing.
Drawings
FIG. 1 is a flow chart of the optimized detrending algorithm of the present invention;
FIG. 2 is a schematic diagram of the parallelized single-pulse search of the present invention;
FIG. 3 compares the run time of the program using the optimized detrending algorithm with that of the original program;
FIG. 4 compares the run time of the parallelized single-pulse search of the present invention with that of the original program;
FIG. 5 compares the search results of the original program with those obtained after optimizing the detrending algorithm and after parallelizing the single-pulse search.
Detailed Description
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present invention.
As shown in FIG. 1 and FIG. 2, a single-pulse search method based on the Ray parallel framework is used to accelerate pulsar data processing. First, an optimized detrending algorithm is proposed: the time-series data received by the telescope, originally of array type, are converted to list type, and the list data are detrended by the least squares method. Then the single-pulse search is parallelized with the high-performance parallel framework Ray. Finally, the effectiveness of the method is verified on real FAST pulsar data.
A single-pulse search method based on the Ray parallel framework comprises the following steps:
S1, reading the pulsar time-series data files:
reading the pulsar time-series .dat data files obtained after dedispersion, together with the parameter information corresponding to each .dat file, and storing them as the args parameters;
S2, creating a DM list, a candlist and a num_v_DMstr list:
creating a DM list that records the dispersion measure (DM) value used to dedisperse each .dat data file, creating a candlist that records and stores the candidate information finally obtained by the single-pulse search, and creating a num_v_DMstr list that records the number of candidates found at each DM value;
S3, creating a local scheduler and distributing single-pulse search tasks according to the state of the CPU processes:
distributing the single-pulse search tasks evenly according to the number of CPUs on the current platform and the process state of each CPU;
S4, spawning a Worker process on each CPU to perform a single-pulse search task;
and S5, recording and storing the pulsar candidate information found on each CPU:
receiving the candidate information obtained by the single-pulse search task in each process and storing it in a .singlepulse file.
The invention first uses the FAST observation data file FH20201014_00C10.fits to profile the single-pulse search program single_pulse_search.py in PRESTO. The profile shows that the detrending part accounts for about 60% of the total run time of single_pulse_search.py. The time taken by each part of the program is shown in Table 1.
TABLE 1
| | Reading files | Detrending | Convolution | Threshold filtering | Recording candidates | Other | Total |
| Time (seconds) | 0.61 | 92.65 | 18.01 | 9.31 | 0.61 | 28.07 | 149.28 |
| Share of total time | 0.41% | 62.07% | 12.07% | 6.24% | 0.41% | 18.81% | 100.00% |
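The shares in Table 1 also bound what detrending optimization alone can achieve. The sketch below recomputes the detrending share from the stage times and applies Amdahl's law, which is not named in the source and is used here only as a standard way to reason about the numbers. The function name `overall_speedup` is hypothetical, and the shares are computed against the sum of the listed stages.

```python
# Stage times in seconds, copied from Table 1.
stage_seconds = {
    "reading files": 0.61,
    "detrending": 92.65,
    "convolution": 18.01,
    "threshold filtering": 9.31,
    "recording candidates": 0.61,
    "other": 28.07,
}

# Share of the run time spent in detrending (relative to the sum of the stages).
detrend_fraction = stage_seconds["detrending"] / sum(stage_seconds.values())


def overall_speedup(fraction, stage_speedup):
    """Amdahl's law: whole-program speedup when only one stage gets faster."""
    return 1.0 / ((1.0 - fraction) + fraction / stage_speedup)


# Even an arbitrarily fast detrending stage leaves the other stages untouched.
upper_bound = overall_speedup(detrend_fraction, float("inf"))
```

Under these assumptions the upper bound is roughly 2.6 times, which is why the remaining gain has to come from running the search on several CPUs at once.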
Detrending suppresses the power-spectrum fluctuations of the signal caused by long observations during acquisition, so as to obtain the best detection performance. In the original program, detrending is implemented by calling the detrend method of the signal processing module (signal) in the Python scientific computing library SciPy. The core of this method is to compute the parameters of a fitted straight line by the least squares method, but its execution flow also reshapes and restores the data and computes redundant parameters. The invention optimizes this method: the detrending flow is redesigned, the redundant computation in the original method is removed, and unnecessary overhead is reduced. The optimized detrending flow is shown in FIG. 1:
1. In the original method, the data are reshaped and type-converted to obtain a data type suitable for computation. In the optimized algorithm, the time-series data are converted directly to a list type supported by Cython, which removes the overhead of reshaping and type conversion;
2. In the original method, the least squares step computes the regression coefficients (coef), the residual sum of squares (residuals), the rank of the independent-variable matrix (rank) and its singular values (s). Fitting a straight line requires only the regression coefficients. In the optimized algorithm, the least squares step is redesigned to compute only the regression coefficients (coef) and is implemented in Cython, which removes the cost of computing the other parameters.
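The idea behind the two points above can be illustrated in plain Python. This is a sketch of a linear detrend that works directly on a list and computes only the slope and intercept of the fitted line. It is not the Cython implementation of the invention, and the function name `linear_detrend` is hypothetical.

```python
def linear_detrend(samples):
    """Subtract the least-squares straight line from a list of samples.

    Only the two regression coefficients are computed; no residual sum,
    rank or singular values are produced.
    """
    n = len(samples)
    if n < 2:
        return [0.0] * n  # a single sample has no trend beyond its own value
    mean_x = (n - 1) / 2.0                     # mean of the indices 0..n-1
    mean_y = sum(samples) / n
    # Closed-form sum of squared index deviations for x = 0, 1, ..., n-1.
    sxx = n * (n * n - 1) / 12.0
    sxy = sum((i - mean_x) * (y - mean_y) for i, y in enumerate(samples))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return [y - (slope * i + intercept) for i, y in enumerate(samples)]
```

A perfectly linear input such as `[1.0, 3.0, 5.0, 7.0]` is reduced to values that are zero up to floating-point rounding.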
Because the single-pulse search program has no data coupling when it searches different .dat files, the search can be parallelized by searching the files in separate processes. FIG. 2 is a schematic diagram of the single-pulse search parallelized with Ray. A global scheduler and a local scheduler are created when the main process runs; the global scheduler is enabled only in cluster deployment mode and is not described here. When the main process calls a remote function, the task is submitted to the local scheduler, which assigns it to a Worker process on the local machine; the Worker process completes the computation and returns the result.
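The remote-function flow described above can be sketched with Ray's core calls `ray.init`, `ray.remote` and `ray.get`. The search function below is a hypothetical stand-in (a simple threshold on in-memory samples), not the PRESTO search, and the names `search_one_series` and `parallel_search` are assumptions made for the example. The sketch falls back to serial execution if Ray is not installed.

```python
import os

try:
    import ray  # Ray core API: ray.init, ray.remote, ray.get
except ImportError:
    ray = None  # the sketch then runs the same tasks serially


def search_one_series(dm, samples, threshold=5.0):
    """Hypothetical stand-in for the per-file search: keep samples above threshold."""
    return [(dm, i, v) for i, v in enumerate(samples) if v > threshold]


def parallel_search(series_by_dm, num_cpus=None):
    """Search each dedispersed series independently and merge the results."""
    items = sorted(series_by_dm.items())
    if ray is None:
        per_dm = [search_one_series(dm, s) for dm, s in items]
    else:
        ray.init(num_cpus=num_cpus or os.cpu_count(), ignore_reinit_error=True)
        remote_search = ray.remote(search_one_series)
        # Each call submits one task; the scheduler hands it to a worker process.
        futures = [remote_search.remote(dm, s) for dm, s in items]
        per_dm = ray.get(futures)  # blocks until every worker has returned
        ray.shutdown()
    candlist = [cand for cands in per_dm for cand in cands]
    num_v_DMstr = {"%.2f" % dm: len(c) for (dm, _), c in zip(items, per_dm)}
    return candlist, num_v_DMstr
```

Because no task reads another task's data, the results can be merged in submission order once all tasks have finished.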
As shown in FIG. 3 and FIG. 4, to verify that the optimized detrending algorithm and the parallelized single-pulse search of the invention speed up single-pulse search on real pulsar data, the FAST observation data file FH20201014_00C10 was selected to compare run times.
In the same experimental environment and on the same data, the single-pulse search programs before and after detrending optimization were each run 10 times and the time of each run was recorded, giving the results in FIG. 3. The horizontal axis is the run number and the vertical axis is the elapsed time in seconds. single_pulse_search_raw is the line plot of the original program's run time, and single_pulse_search_Cy_rm_rad is the line plot of the run time after the detrending algorithm was redesigned and optimized with Cython.
On top of the optimized detrending algorithm, parallelization was implemented with the Ray framework. The original single-pulse search program and the Ray-based parallel program were each run 10 times and the time of each run was recorded, giving the results in FIG. 4.
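The measurement procedure described above (repeat each program a fixed number of times, record every run, then compare) can be sketched as follows. The helper names `time_runs` and `speedup_ratio` are hypothetical, and comparing mean run times is an assumption, since the source does not state how the speedup was computed from the recorded runs.

```python
import time
from statistics import mean


def time_runs(func, repeats=10):
    """Call func() `repeats` times and return the elapsed time of each run."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        func()
        durations.append(time.perf_counter() - start)
    return durations


def speedup_ratio(baseline_durations, optimized_durations):
    """Ratio of mean baseline time to mean optimized time."""
    return mean(baseline_durations) / mean(optimized_durations)
```

Keeping the per-run durations, rather than only their mean, makes it possible to draw line plots of run time against run number.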
FIG. 3 shows that redesigning the detrending algorithm and implementing it in Cython improves performance by about 1.8 times compared with the original program.
FIG. 4 shows that, after the single-pulse search program is parallelized with the Ray framework, data processing time is greatly reduced compared with the original single-pulse search program, with a speedup of about 10 times.
FIG. 3 and FIG. 4 together show that the optimized detrending algorithm and the Ray-based parallel single-pulse search method effectively improve the performance of the single-pulse search program and significantly shorten data processing time. This verifies the effectiveness of the invention in speeding up single-pulse search on real pulsar data.
In addition, a comparison of the search result plots generated before and after optimization shows that the parallelized single-pulse search algorithm with the optimized detrending algorithm preserves the search results of the serial algorithm, so that algorithm optimization and parallelization do not lead to inconsistent results. The search result plots before and after optimization are shown in FIG. 5: the left plot is generated serially by the original program, the middle plot is generated after the detrending algorithm is optimized, and the right plot is generated in parallel with the Ray framework.

Claims (2)

1. A single-pulse search method based on the Ray parallel framework, characterized by comprising the following steps:
S1, reading the pulsar time-series data files;
S2, creating a DM list, a candlist and a num_v_DMstr list;
S3, creating a local scheduler and distributing single-pulse search tasks according to the state of the CPU processes;
S4, spawning a Worker process on each CPU to perform a single-pulse search task;
and S5, recording and storing the pulsar candidate information found on each CPU.
2. The single-pulse search method based on the Ray parallel framework as claimed in claim 1, characterized by specifically comprising the following steps:
S1, reading the pulsar time-series data files:
reading the pulsar time-series .dat data files obtained after dedispersion, together with the parameter information corresponding to each .dat file, and storing them as the args parameters;
S2, creating a DM list, a candlist and a num_v_DMstr list:
creating a DM list that records the dispersion measure (DM) value used to dedisperse each .dat data file, creating a candlist that records and stores the candidate information finally obtained by the single-pulse search, and creating a num_v_DMstr list that records the number of candidates found at each DM value;
S3, creating a local scheduler and distributing single-pulse search tasks according to the state of the CPU processes:
distributing the single-pulse search tasks evenly according to the number of CPUs on the current platform and the process state of each CPU;
S4, spawning a Worker process on each CPU to perform a single-pulse search task;
and S5, recording and storing the pulsar candidate information found on each CPU:
receiving the candidate information obtained by the single-pulse search task in each process and storing it in a .singlepulse file.
CN202211602907.6A 2022-12-13 2022-12-13 Single pulse searching method based on Ray parallel framework Pending CN115827238A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211602907.6A CN115827238A (en) 2022-12-13 2022-12-13 Single pulse searching method based on Ray parallel framework

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211602907.6A CN115827238A (en) 2022-12-13 2022-12-13 Single pulse searching method based on Ray parallel framework

Publications (1)

Publication Number Publication Date
CN115827238A true CN115827238A (en) 2023-03-21

Family

ID=85547129

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211602907.6A Pending CN115827238A (en) 2022-12-13 2022-12-13 Single pulse searching method based on Ray parallel framework

Country Status (1)

Country Link
CN (1) CN115827238A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116932837A (en) * 2023-09-13 2023-10-24 贵州大学 Pulsar parallel search optimization method and system based on clusters

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination