CN115982544A - Spark-based monopulse search method and parallelization research method thereof - Google Patents
- Publication number
- CN115982544A (application number CN202310026933.7A)
- Authority
- CN
- China
- Prior art keywords
- pulse
- search
- spark
- parallelization
- searching
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Landscapes
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
Abstract
The invention discloses a Spark-based single-pulse search method and a parallelization research method thereof. The search method comprises: step S1, dedispersion: removing the delay caused by the dispersion effect; step S2, matched filtering: searching each dedispersed time series, or "DM channel", for pulses whose amplitude exceeds an S/N threshold, the threshold being chosen according to the acceptable number of false positives; step S3, candidate diagnosis: manually inspecting the data flagged as single-pulse candidates. The invention analyzes the advantages of Spark over mainstream distributed architectures, completes a Spark-based parallel optimization of the single-pulse search module in an existing pulsar search program, constructs a distributed cluster, and, for batch-processing scenarios, designs a task allocation algorithm for the distributed cluster based on load balancing.
Description
Technical Field
The invention relates to the technical field of single-pulse search, and in particular to a Spark-based single-pulse search method and a parallelization research method thereof.
Background
A pulsar is a highly magnetized, rotating compact star. In 1967, Hewish et al. discovered the first radio pulsar, PSR B1919+21 [1], and in 1974 Hewish received the Nobel Prize for contributions that included the discovery of pulsars. Pulsars have extreme physical characteristics: large mass, small radius, extremely strong gravity, extremely strong magnetic fields, and a highly stable rotation period; they are also closely tied to stellar evolution and supernova explosions. They are therefore of great significance for research on gravitational fields, magnetospheric particle acceleration mechanisms, high-energy radiation, radio emission, supernova explosion theory, spacecraft navigation, and related fields.
Early pulsar searches relied on periodicity searches. The Crab Nebula pulsar, discovered in the same year as the first pulsar, did not initially show regular rotational behavior and was detected through its giant pulses. Searching for non-periodic signals has therefore become an increasingly important topic in radio astronomy. In 1999, inspired by the discovery of the Crab pulsar, David J. Nice discovered a new pulsar, PSR J1918+08 [2], through a single-pulse search of data from the Arecibo telescope. In 2004 and 2007, Duncan Lorimer et al. used single-pulse search techniques to discover rotating radio transients (RRATs) and fast radio bursts (FRBs), respectively [3][4].
With advances in hardware and software, the volume of observation data collected by a single survey project keeps growing. Early observation data sets were generally between gigabytes and terabytes in size. The 500-meter-aperture spherical radio telescope (FAST) in Guizhou formally began operation in 2021. From July 2017 to May (the year is not given), the observation data of FAST's CRAFTS survey, which scans for multiple scientific targets simultaneously, reached several PB, and the total volume that FAST collects in the radio pulsar field is expected to be 10 to 100 PB; observation data has entered the PB era [5][6]. Data at this "astronomical" scale poses a significant challenge for analysis and processing. Traditional single-machine, serial processing programs cannot meet the timeliness requirement, so a more practical Spark-based single-pulse search method and a parallelization research method thereof are proposed.
Disclosure of Invention
The invention aims to provide a Spark-based single-pulse search method and a parallelization research method thereof that solve the problems described above.
To achieve this, the invention provides the following technical scheme. The Spark-based single-pulse search method and the parallelization research method thereof comprise the following steps:
step S1, dedispersion: removing the delay caused by the dispersion effect;
step S2, matched filtering: searching each dedispersed time series, or "DM channel", for pulses whose amplitude exceeds an S/N threshold, the threshold being chosen according to the acceptable number of false positives;
step S3, candidate diagnosis: manually inspecting the data flagged as single-pulse candidates.
Preferably, step S2 specifically includes the following:
for long observations, the detection level of strong pulses is suppressed, so the signal can be detrended, using piecewise linear fitting as the smoothing method, to approximate optimal detection effectively;
when the pulse-broadening parameters are unknown,
where σ is the root-mean-square noise of the time series, W_n is the noise correlation time, A_i is the intrinsic pulse area, and W_i is the intrinsic pulse width; for heavily scattered pulses, the measured shape is dominated by the pulse-broadening function, then
where W_b is the net pulse width.
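For orientation, the two relations that these symbol definitions belong to can be written in a form consistent with them. This is an assumed form, based on the standard matched-filter result for a pulse whose area is conserved under broadening; it is not a transcription of the patent's own formulas, and the subscripts i (intrinsic) and b (broadened) on S/N are introduced here for illustration.

```latex
\left(\frac{S}{N}\right)_i = \frac{A_i}{\sigma\sqrt{W_n W_i}},
\qquad
\left(\frac{S}{N}\right)_b = \left(\frac{S}{N}\right)_i \sqrt{\frac{W_i}{W_b}}
```

Under this assumption, a pulse broadened from W_i to W_b = 4 W_i at fixed area loses a factor of two in S/N.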
The Spark-based single-pulse search parallelization research method comprises the following steps:
step (1): PRESTO searches for single-pulse signals using single_pulse_search.py.
step (2): the system architecture for parallelizing the single-pulse search has three layers. The uppermost layer is the data source layer, which uses HDFS (the Hadoop Distributed File System) to store the .dat files generated by dedispersion;
the middle layer is the task scheduling layer, which distributes a group of computing tasks; DM channels are searched in parallel by assigning the search tasks of different DM channels to different computing nodes;
the lowest layer is the data processing layer, which carries out the single-pulse search data processing.
Compared with the prior art, the invention has the following beneficial effects:
The advantages of Spark over mainstream distributed architectures are analyzed; a Spark-based parallel optimization is completed for the single-pulse search module of an existing pulsar search program; a distributed cluster is constructed; and, for batch-processing scenarios, a task allocation algorithm for the distributed cluster is designed on the basis of load balancing. The performance of the system is evaluated experimentally, and the results show that it has clear advantages in large-scale data processing scenarios and provide data to support later deployment in real environments.
Compared with the original search program, the method gives a clear speedup, scales well, and is compatible with computing nodes of different performance and architecture, so it can use existing resources fully for distributed parallel search and suits large-scale single-pulse search. It also serves as a useful reference for later integrating the interference-removal and dedispersion stages with the search stage of a single-pulse search.
Drawings
FIG. 1 is a schematic diagram of the single_pulse_search.py search process according to the present invention;
FIG. 2 is a schematic diagram of the design of the single-pulse search distributed processing model according to the present invention;
FIG. 3 is a flow chart of the task allocation algorithm of the present invention;
FIG. 4 is a diagram of the effect of the number of threads on the speedup according to the present invention;
FIG. 5 is a diagram of the effect of data size on the speedup according to the present invention;
FIG. 6 is a diagram of the effect of the number of cluster nodes on the speedup according to the present invention;
FIG. 7 is a diagram comparing the effect of the task allocation algorithms according to the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings. The described embodiments are only some, not all, of the embodiments of the present invention.
The Spark-based single-pulse search method and the parallelization research method thereof comprise the following steps:
step S1, dedispersion: removing the delay caused by the dispersion effect;
step S2, matched filtering: searching each dedispersed time series, or "DM channel", for pulses whose amplitude exceeds an S/N threshold, the threshold being chosen according to the acceptable number of false positives;
for long observations, the detection level of strong pulses is suppressed, so the signal can be detrended, using piecewise linear fitting as the smoothing method, to approximate optimal detection effectively;
when the pulse-broadening parameters are unknown,
where σ is the root-mean-square noise of the time series, W_n is the noise correlation time, A_i is the intrinsic pulse area, and W_i is the intrinsic pulse width; for heavily scattered pulses, the measured shape is dominated by the pulse-broadening function, then
where W_b is the net pulse width;
step S3, candidate diagnosis: manually inspecting the data flagged as single-pulse candidates.
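The link in step S2 between the S/N threshold and the acceptable number of false positives can be illustrated numerically. The sketch below assumes that the detrended, normalized noise is Gaussian and that trials are independent, neither of which the text states; the function names and the sample counts are hypothetical.

```python
import math

def expected_false_positives(threshold, n_trials):
    """Expected number of pure-noise samples above `threshold` (in sigma),
    assuming independent, unit-variance Gaussian noise."""
    tail = 0.5 * math.erfc(threshold / math.sqrt(2.0))
    return n_trials * tail

def threshold_for(max_false_positives, n_trials, lo=0.0, hi=20.0):
    """Bisection for the smallest threshold whose expected false-positive
    count does not exceed `max_false_positives`."""
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if expected_false_positives(mid, n_trials) > max_false_positives:
            lo = mid
        else:
            hi = mid
    return hi

# Hypothetical search: 1e6 samples per DM channel, 20 DM channels, 10 filter widths.
n_trials = 1_000_000 * 20 * 10
thr = threshold_for(1.0, n_trials)
print(f"threshold for about one false positive: {thr:.2f} sigma")
```

The more trials a search performs (more samples, DM channels, or filter widths), the higher the threshold has to be to keep the same expected false-positive count.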
As shown in FIG. 1, the Spark-based single-pulse search parallelization research method includes the following steps:
step (1): single-pulse signals are searched for using single_pulse_search.py.
The program first reads a .dat file generated by dedispersion, detrends the time series, convolves it with window functions of different widths, and computes the sigma value (signal-to-noise ratio). It then filters out signals below the selected threshold, records the surviving candidates, and outputs a .singlepulse file;
Each DM value corresponds to its own .dat time-series file, and the single-pulse search must be run on each DM channel independently, so different DM channels can be searched in parallel. The smoothing operation applies piecewise linear fitting to the whole time series; different data segments can be fitted in parallel and the smoothed segments then merged for the subsequent calculation. Because smoothing is also one of the more time-consuming parts of the search program, it is well suited to parallelization;
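The convolve-and-threshold part of step (1) can be sketched as follows. This is a simplified illustration, not PRESTO's single_pulse_search.py: it assumes the time series has already been detrended to zero mean and unit variance, it uses plain boxcar windows, and the function names, widths, and threshold are hypothetical.

```python
import math

def boxcar_search(series, widths=(1, 2, 4, 8), threshold=5.0):
    """Convolve a zero-mean, unit-variance series with boxcars of several
    widths and return (start_index, width, snr) for sums at or above threshold."""
    # Prefix sums turn every boxcar sum into an O(1) difference.
    prefix = [0.0]
    for x in series:
        prefix.append(prefix[-1] + x)
    candidates = []
    for w in widths:
        norm = math.sqrt(w)  # a sum of w unit-variance samples has std sqrt(w)
        for start in range(len(series) - w + 1):
            snr = (prefix[start + w] - prefix[start]) / norm
            if snr >= threshold:
                candidates.append((start, w, snr))
    return candidates

def best_candidate(candidates):
    """Keep the single highest-S/N detection (a stand-in for candidate grouping)."""
    return max(candidates, key=lambda c: c[2]) if candidates else None

# Deterministic toy series: alternating +/-1 "noise" plus a 4-sample pulse.
series = [1.0 if i % 2 else -1.0 for i in range(200)]
for i in range(100, 104):
    series[i] += 4.0
print(best_candidate(boxcar_search(series)))
```

The boxcar whose width matches the pulse gives the highest S/N, which is why several widths are tried on each DM channel.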
step (2): the system architecture for parallelizing the single-pulse search (FIG. 2) has three layers. The uppermost layer is the data source layer, which uses HDFS (the Hadoop Distributed File System) to store the .dat files generated by dedispersion;
the middle layer is the task scheduling layer, which distributes a group of computing tasks (each DM channel corresponds to one subtask); DM channels are searched in parallel by assigning the search tasks of different DM channels to different computing nodes. In the Spark computing model, the Driver container drives the task and distributes computing work to Executor containers, and the single-pulse search algorithm requires the Driver to read the .dat data file first. If the Driver and the Executors are not on the same node, large data I/O overhead results and search efficiency drops. A task scheduling layer is therefore added between the data source layer and the data processing layer, and the search of one DM channel is strictly confined to a single node to improve search efficiency. In addition, because nodes differ in computing performance, load balancing through dynamic allocation prevents one slow node from prolonging the total computing time of the cluster. The task scheduling system is developed with the YARN API, and the algorithm flow is shown in FIG. 3: a task queue is assigned to each computing node, each node's running queue is monitored in real time, a suitable threshold is set according to the node's performance, and a new task is assigned to a node whenever the number of tasks in its queue falls below that threshold;
the lowest layer is the data processing layer, which carries out the single-pulse search data processing. After receiving the parameters passed by the task scheduling layer, the Master node of the Spark cluster assigns the computing tasks to the designated computing nodes; on receiving a task request, a computing node reads the .dat file of the corresponding DM channel from HDFS according to the request parameters and searches it. During the search, each DM channel gets its own Driver, and the time series that is read is turned into an RDD. Spark's map operator maps the data blocks of the RDD to multiple Executors, which run the detrending operation so that the detrending module executes in parallel across multiple threads. The smoothed time series is gathered back into the Driver container with the collect operator, and after merging, the subsequent calculation is completed in the Driver.
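The map-then-collect pattern used for detrending can be illustrated without a cluster. The sketch below swaps Spark's RDD map and collect operators for Python's standard `concurrent.futures` thread pool, so it only mirrors the data flow (in pure Python the threads give no real speedup). The chunk length and function names are hypothetical, and each chunk is detrended by subtracting its own least-squares line, which is one plausible reading of "piecewise linear fitting".

```python
from concurrent.futures import ThreadPoolExecutor

def detrend_chunk(chunk):
    """Subtract the least-squares straight line fitted to one chunk."""
    n = len(chunk)
    if n < 2:
        return [0.0] * n
    mean_x = (n - 1) / 2.0
    mean_y = sum(chunk) / n
    sxx = sum((i - mean_x) ** 2 for i in range(n))
    sxy = sum((i - mean_x) * (y - mean_y) for i, y in enumerate(chunk))
    slope = sxy / sxx
    return [y - (mean_y + slope * (i - mean_x)) for i, y in enumerate(chunk)]

def parallel_detrend(series, chunk_len=1000, workers=4):
    """Split the series into chunks, detrend them in parallel, and merge.

    pool.map stands in for the Spark map operator, and building the merged
    list stands in for collect; chunk order is preserved."""
    chunks = [series[i:i + chunk_len] for i in range(0, len(series), chunk_len)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        detrended = list(pool.map(detrend_chunk, chunks))
    return [x for chunk in detrended for x in chunk]

# Toy series: a slow linear drift plus a short pulse at samples 1500-1502.
series = [0.01 * i for i in range(4000)]
for i in range(1500, 1503):
    series[i] += 5.0
flat = parallel_detrend(series)
print(max(flat), flat.index(max(flat)))
```

Because each chunk is fitted independently, the chunks can be processed in any order on any worker, which is the property the data processing layer relies on.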
Experiment and result analysis
Because distributed computing is constrained by many factors, the experiments test system performance along three dimensions: number of threads, number of nodes, and data file size. The experimental environment is Ubuntu 18.04.6, Hadoop 3.2.3, and Spark 3.1.3. The hardware configuration of the nodes used is shown in Table 1; one node with an AMD chip and 32 GB of memory serves as the system management node, and the remaining nodes serve as computing nodes. The experiments use observation files from the CRAFTS sky survey project of the 500-meter-aperture spherical radio telescope, with the parameters shown in Table 2.
TABLE 1 Experimental node configuration parameters
TABLE 2 Experimental file parameters
First, the effect of the configured thread count on system performance was tested on the x86 and ARMv8-A architectures. The 5.fits file was used, one computing node of each architecture was tested, and the search was run on the channel with a DM of 8.10. The run-time distribution of the Spark program at different thread counts was recorded and compared with the PRESTO serial search program. The results are shown in FIG. 4, where PRESTO is the run time of the original program, spark_local is the run time of the Spark program in single-machine (local) mode, and spark_on_yarn is the single-machine run time of the Spark program scheduled by YARN. On both architectures, the speedup improves markedly when the thread count is set to 6 and improves only slightly as the thread count is increased further. On x86, the speedup reaches at most 1.83 in spark_local mode and 1.27 in spark_on_yarn mode; on ARMv8-A, it reaches at most 1.86 in spark_local mode and 1.46 in spark_on_yarn mode. In theory the speedup after parallelization should approach the number of threads started, but the measured results fall short of that. Analysis of the program run shows that the data file is relatively small, so data computation accounts for too small a share of total run time for the system's advantages to show fully. The effect of data file size on the speedup is therefore tested next.
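The gap between the measured speedup and the thread count can be made concrete with Amdahl's law. Applying that law here is an interpretive assumption, not something the experiment states. The sketch inverts the law to estimate what fraction of the run time would have to be parallelizable to produce the reported speedups, taking 6 threads for all four cases (also an assumption, since the maxima may have been measured at higher thread counts).

```python
def amdahl_speedup(parallel_fraction, n_workers):
    """Speedup predicted by Amdahl's law."""
    return 1.0 / ((1.0 - parallel_fraction) + parallel_fraction / n_workers)

def implied_parallel_fraction(speedup, n_workers):
    """Invert Amdahl's law: the parallel fraction that yields `speedup`."""
    return (1.0 - 1.0 / speedup) / (1.0 - 1.0 / n_workers)

# Maximum speedups reported in the text; 6 threads is assumed for each.
reported = {
    "x86 spark_local": 1.83,
    "x86 spark_on_yarn": 1.27,
    "ARMv8-A spark_local": 1.86,
    "ARMv8-A spark_on_yarn": 1.46,
}
for label, s in reported.items():
    p = implied_parallel_fraction(s, 6)
    print(f"{label}: implied parallel fraction {p:.2f}")
```

Under these assumptions, only a little over half of the run time is parallelized in spark_local mode, which fits the explanation that computation is a modest share of the total for a small file.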
To test the effect of data file size on system performance, several groups of .dat files generated by dedispersing drift-scan data of different sizes were used (each group of computing tasks contains sub-tasks for several DM channels). The time taken by single-machine multithreaded search and by cluster multi-node parallel search was measured on the x86 architecture, with 6 computing nodes in the cluster. The time taken to process computing tasks of different sizes was compared with the PRESTO serial search program, as shown in FIG. 5. The Spark single-machine multithreaded test and the PRESTO serial test ran on the same computing node, and, following the conclusion of the previous experiment, the thread count was set to 6 to maximize resource utilization. The results show that the system has a large advantage when processing large data files (long observation times), and the speedup improves markedly as the data files grow. For a data task of 8.34 GB, the Spark single-machine multithreaded speedup is about 1.90 and the cluster speedup is about 5.83.
To test the effect of adding nodes on the speedup, four nodes of equal performance and two nodes of better performance, all x86, were used to search 5.fits over the range DM 7.50 to DM 9.40 with a step of 0.10; the task size was 8.34 GB and the results are shown in FIG. 6. The first four nodes have the same computing performance (16 GB of memory) and the last two have better computing performance (32 GB of memory). The results show that adding nodes of equal performance incurs some performance loss because cluster management overhead grows; taken together with the previous experiment, however, overall cluster performance still grows almost linearly as the computing workload increases. When a better-performing node is added, the improvement in overall cluster performance is clearly larger than when an equal-performance node is added, which shows that the system is compatible with computing nodes of different performance and can make full use of each node in the cluster.
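The number of subtasks in this test follows from the DM range and step. The small sketch below works in integer hundredths to avoid floating-point drift and assumes both endpoints are searched (the text does not say whether DM 9.40 is included); the function name is hypothetical.

```python
def dm_channels(start, stop, step):
    """Return DM values from start to stop inclusive, as strings with 2 decimals.

    Works in integer hundredths so that repeated addition of 0.10 cannot drift."""
    lo, hi, inc = (round(v * 100) for v in (start, stop, step))
    return [f"{v / 100:.2f}" for v in range(lo, hi + 1, inc)]

channels = dm_channels(7.50, 9.40, 0.10)
print(len(channels), channels[0], channels[-1])
```

Under that assumption the range yields 20 DM channels, one subtask each, to be spread across the six nodes.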
Finally, the effect of the proposed task allocation algorithm on system performance was tested in a batch-processing scenario. Three x86 computing nodes were used, two with the same computing performance (16 GB of memory) and one with better computing performance (32 GB of memory). The test data are the same as in the previous test, and the results are shown in FIG. 7, where average means that tasks are distributed evenly across the computing nodes and balance is the allocation algorithm proposed here. The results show that, in a batch-processing scenario, the proposed task allocation algorithm uses cluster resources more sensibly and clearly improves system performance.
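The difference between the average and balance strategies can be sketched as a small simulation. It is an assumed reading of the threshold rule described for the task scheduling layer, not the YARN-based implementation: node speeds, queue thresholds, and task costs are hypothetical, and time advances in discrete ticks instead of real queue monitoring.

```python
from collections import deque

def run_cluster(task_costs, nodes, balanced):
    """Simulate a batch of tasks on a small cluster.

    task_costs: work units per DM-channel task.
    nodes: dict name -> (speed in work units per tick, queue threshold).
    balanced: True for threshold-based dynamic allocation ("balance"),
              False for an even round-robin split up front ("average").
    Returns (ticks until all tasks finish, tasks completed per node)."""
    pending = deque(task_costs)
    queues = {name: deque() for name in nodes}
    done = {name: 0 for name in nodes}
    if not balanced:
        names = list(nodes)
        i = 0
        while pending:
            queues[names[i % len(names)]].append(pending.popleft())
            i += 1
    ticks = 0
    while pending or any(queues.values()):
        # Scheduler pass: top up any queue that is below its node's threshold.
        for name, (_, threshold) in nodes.items():
            while pending and len(queues[name]) < threshold:
                queues[name].append(pending.popleft())
        # Worker pass: each node spends one tick of work on its queue.
        for name, (speed, _) in nodes.items():
            budget, q = speed, queues[name]
            while q and budget > 0:
                used = min(budget, q[0])
                q[0] -= used
                budget -= used
                if q[0] == 0:
                    q.popleft()
                    done[name] += 1
        ticks += 1
    return ticks, done

# Hypothetical cluster: two equal nodes and one node twice as fast.
nodes = {"node_a": (1, 1), "node_b": (1, 1), "node_c": (2, 2)}
tasks = [4] * 20
print("average:", run_cluster(tasks, nodes, balanced=False))
print("balance:", run_cluster(tasks, nodes, balanced=True))
```

With these made-up numbers the even split finishes in 28 ticks because the slow nodes each hold 7 tasks, while the threshold rule finishes in 20 ticks by routing 10 of the 20 tasks to the faster node.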
[1] GAO Qingpeng, LI Chunxiao, LI Rongwang, et al. Magpie Bridge satellite laser ranging time window and ranging success probability analysis [J]. Astronomical Research & Technology, 2019, 16(4): 422-430.
[2] MORIN A. Simulation of infrared imaging seeking missiles [C] // Proceedings of SPIE, 2001, 4365: 46-57.
[3] Zhang, Xie Xiaoyao, Li, Liu Shijie, Wang Bi, Xu Hong, Xu Feiping, Xu Yun, Jiang Hou. A data processing acceleration method and system for FAST PB-magnitude pulsar data [J]. Astronomical Research & Technology, 2021, 18(01): 129-137.
[4] Pan's day, Qianlie, Yueying Ridge. Pulsar search techniques and prospects for pulsar search with the FAST telescope [J]. Astronomical Research & Technology, 2017, 14(01): 8-16.
[5] Bear smart, Tian Xiuchen, Zhao Qing, von Kun, and True. Spark-based sky coverage generation algorithm for large-scale astronomical data [J]. Journal of Tianjin University of Science and Technology, 2018, 33(05): 63-67+78.
[6] Wansen. Research and implementation of a radio interferometric array imaging algorithm based on Spark [D]. Kunming University of Science and Technology, 2021. DOI: 10.27200/d.cnki.gkmlu.2021.000780.
Claims (3)
1. A Spark-based single-pulse search method, characterized by comprising the following steps:
step S1, dedispersion: removing the delay caused by the dispersion effect;
step S2, matched filtering: searching each dedispersed time series, or "DM channel", for pulses whose amplitude exceeds an S/N threshold, the threshold being chosen according to the acceptable number of false positives;
step S3, candidate diagnosis: manually inspecting the data flagged as single-pulse candidates.
2. The Spark-based single-pulse search method according to claim 1, wherein step S2 specifically includes the following:
for long observations, the detection level of strong pulses is suppressed, so the signal can be detrended, using piecewise linear fitting as the smoothing method, to approximate optimal detection effectively;
when the pulse-broadening parameters are unknown,
where σ is the root-mean-square noise of the time series, W_n is the noise correlation time, A_i is the intrinsic pulse area, and W_i is the intrinsic pulse width; for heavily scattered pulses, the measured shape is dominated by the pulse-broadening function, then
3. The Spark-based single-pulse search parallelization research method according to claim 1, comprising the following steps:
step (1): single-pulse signals are searched for using single_pulse_search.py.
step (2): the system architecture for parallelizing the single-pulse search has three layers. The uppermost layer is the data source layer, which uses HDFS (the Hadoop Distributed File System) to store the .dat files generated by dedispersion;
the middle layer is the task scheduling layer, which distributes a group of computing tasks; DM channels are searched in parallel by assigning the search tasks of different DM channels to different computing nodes;
the lowest layer is the data processing layer, which carries out the single-pulse search data processing.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310026933.7A CN115982544A (en) | 2023-01-09 | 2023-01-09 | Spark-based monopulse search method and parallelization research method thereof |
Publications (1)
Publication Number | Publication Date |
---|---|
CN115982544A (en) | 2023-04-18 |
Family
ID=85959505
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202310026933.7A Pending CN115982544A (en) | 2023-01-09 | 2023-01-09 | Spark-based monopulse search method and parallelization research method thereof |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115982544A (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116451030A (en) * | 2023-06-16 | 2023-07-18 | 中国科学院国家天文台 | Baseband data pulse searching method and system based on GPU |
CN116932837A (en) * | 2023-09-13 | 2023-10-24 | 贵州大学 | Pulsar parallel search optimization method and system based on clusters |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116451030A (en) * | 2023-06-16 | 2023-07-18 | 中国科学院国家天文台 | Baseband data pulse searching method and system based on GPU |
CN116451030B (en) * | 2023-06-16 | 2023-09-05 | 中国科学院国家天文台 | Baseband data pulse searching method and system based on GPU |
CN116932837A (en) * | 2023-09-13 | 2023-10-24 | 贵州大学 | Pulsar parallel search optimization method and system based on clusters |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||