CN111292222B

CN111292222B - Pulsar dispersion eliminating device and method

Info

Publication number: CN111292222B
Application number: CN202010073731.4A
Authority: CN
Inventors: 托乎提努尔; 王娜; 张海龙; 王杰
Original assignee: Xinjiang Astronomical Observatory of CAS
Current assignee: Xinjiang Astronomical Observatory of CAS
Priority date: 2020-01-22
Filing date: 2020-01-22
Publication date: 2023-05-12
Anticipated expiration: 2040-01-22
Also published as: CN111292222A

Abstract

The invention provides a pulsar de-dispersion method with switchable search and folding modes, comprising the following steps: using an FFT parallel computing module to obtain multi-channel data; initializing the relevant variables from the parameters of the multi-channel data, decomposing the task, and calculating the dispersion shift; allocating GPU memory and copying the dispersion shift into it; and having the CPU call a GPU kernel function so that the GPU performs the de-dispersion, then copying the de-dispersion result back to CPU memory, until de-dispersion is complete. The invention also provides a pulsar de-dispersion device. The invention greatly improves the computing performance of pulsar de-dispersion, increases processing speed, meets the requirement for real-time de-dispersion of massive data, reduces the volume of data to be stored, and lowers system cost.

Description

Pulsar dispersion eliminating device and method

Technical Field

The invention belongs to the field of astronomical observation and relates to a pulsar de-dispersion device and method.

Background

A pulsar is a rapidly rotating neutron star with a magnetic field, very high density, and a stable rotation period. As it rotates about its axis it emits electromagnetic waves along the direction of its magnetic poles, and each time the beam sweeps across the Earth, a radio telescope on Earth receives a periodic pulse signal.

As a pulsar signal propagates through space, dispersion in the interstellar medium slows it down, and high-frequency radio waves travel faster than low-frequency ones. The high-frequency and low-frequency components therefore reach the radio telescope at different times. This spreads the signal energy in time and distorts the pulse profile: the pulse is broadened, the signal-to-noise ratio drops, and the pulse may even disappear.

Because pulsar signals are extremely weak, they must be de-dispersed before a clear pulse profile can be observed. De-dispersion effectively improves the sensitivity of astronomical observation and the ability of an observing system to identify and detect pulsars. In recent years, pulsar research and observation have placed higher demands on de-dispersion technology. De-dispersion systems with ultra-wide bandwidth and high-speed signal processing capability are the inevitable direction of development for future radio pulsar observing equipment, and the related technologies face great challenges. Existing de-dispersion techniques have the following shortcomings:

(1) Improvements in observing equipment have rapidly expanded the frequency range of observable radio astronomical signals. As observing bandwidth keeps growing and resolution keeps rising, the volume of data generated becomes enormous, and existing de-dispersion techniques cannot process such massive data quickly and in real time. For example, state-of-the-art observing equipment such as ultra-wideband receivers, multi-beam receivers, and phased array feed (PAF) receivers generates very large amounts of data, typically on the order of terabytes, and processing it in real time poses unprecedented challenges for de-dispersion techniques and de-dispersion algorithms.

(2) When the search space in dispersion measure (DM) is large, existing de-dispersion methods are inefficient, computationally expensive, and slow. They cannot meet the requirements of high-speed, real-time pulsar searching, and they sharply increase system power consumption, complexity, and cost.

It is therefore necessary to develop a pulsar de-dispersion device and method with higher resolution, wider processing bandwidth, and greater system stability, which is important for pulsar searching, observation, and scientific research.

Disclosure of Invention

The invention aims to provide a high-speed, real-time pulsar de-dispersion device and method that greatly improve the computing performance of pulsar de-dispersion.

To achieve the above object, the present invention provides a pulsar de-dispersion method with a switchable search mode and folding mode, comprising:

Step S1: providing a data processing module that comprises an FFT parallel computing module and a de-dispersion processing module, and using the FFT parallel computing module to divide the baseband data of the pulsar into a plurality of mutually independent narrow-channel signals so as to obtain multi-channel data;

Step S2: using the de-dispersion processing module to carry out the data exchange between the CPU and the GPU, the multithreaded task allocation, and the optimization of GPU parallel computing resources, comprising:

Step S21: initializing the relevant variables from the parameters of the multi-channel data, the variables comprising the number of channels, the DM value, the signal frequency range, and the bandwidth of each channel;

Step S22: to speed up the parallel de-dispersion computation, decomposing the task, having the CPU calculate the dispersion shift, and storing the dispersion shift in CPU memory;

the dispersion shift is given by:

$$\mathrm{shift} = 4.15 \times 10^{3} \times \left(f_1^{-2} - f_2^{-2}\right)$$

where shift is the dispersion shift, and $f_1$ and $f_2$ are channel frequencies of the multi-channel data, the low frequency and the high frequency respectively;

Step S23: allocating GPU memory;

Step S24: copying the dispersion shift into the GPU memory allocated in step S23;

Step S25: having the CPU call the kernel function of the GPU and hand the computation to the GPU for de-dispersion, which comprises: launching multiple threads on the GPU to calculate the delay time of each channel of the multi-channel data, accumulating the delay-compensated values to obtain the de-dispersion result, and writing the de-dispersion result to the global memory of the GPU;

the delay time of the frequency channel is:

$$t_{DM} = \mathrm{shift} \times DM / t_{samp}$$

where $t_{samp}$ is the sampling period of the signal, $t_{DM}$ is the delay time of the frequency channel, DM is the dispersion measure, and shift is the dispersion shift;

Step S26: copying the de-dispersion result to CPU memory, until de-dispersion of all the baseband data is finished, and then releasing the GPU memory allocated in step S23.

In the search mode, step S1 further includes writing the multi-channel data to disk for temporary storage; step S21 further includes, before initialization, reading the multi-channel data from disk into CPU memory; and step S24 further includes copying the multi-channel data into the GPU memory allocated in step S23. In the folding mode, step S1 further includes keeping the multi-channel data in the video memory of the GPU.

In step S24, the threads of a thread block fetch the dispersion shift of each channel of the multi-channel data and store it in shared memory on the GPU.

In step S1, the parallel computation of the FFT parallel computing module is implemented with cuFFT functions; step S24 is implemented with the memory management functions of C and/or the CUDA API.

The pulsar de-dispersion method further comprises step S3: writing the de-dispersion result in CPU memory to a file. The file is stored on disk and comprises single-channel time series data, file header information, and a data part; the file header information includes the number of channels, signal bandwidth, sampling rate, and dispersion measure.

In another aspect, the present invention provides a pulsar de-dispersion device, connected to a receiver, comprising: a signal digitizing module, located on a programmable logic device and configured to digitize the pulsar analog signal from the receiver, converting it into a digital signal, and then to generate and transmit data packets; a data receiving module, located on the CPU and including a ring buffer in CPU memory, configured to receive the data packets sent by the signal digitizing module, unpack them, and write the unpacked baseband data into the ring buffer; and a data processing module, located on a GPU platform and having two switchable modes, a folding mode and a search mode, configured to read the baseband data from the ring buffer and apply the pulsar de-dispersion method described above to the baseband data.

The signal digitizing module is a ROACH2 hardware platform from CASPER at the University of California, Berkeley, USA. It is equipped with eight 10 Gigabit Ethernet interfaces, generates UDP data packets to transmit baseband data at high speed, and is developed by graphical programming.

The baseband data comprises a plurality of data elements and the ring buffer is arranged such that when one data element of the baseband data is processed, the remaining data elements do not need to be moved from their storage locations.

The pulsar de-dispersion device of the invention uses a programmable logic device to sample the signal and to package and send UDP data packets, and uses a GPU with parallel processing capability to perform the FFT and the de-dispersion, making full use of the strengths of both the FPGA and the GPU computing platforms. In addition, a shared-memory ring buffer in the CPU enables high-speed data transmission and processing across the heterogeneous platform, optimizes task allocation among the CPU, GPU, and FPGA platforms, and improves the flexibility and scalability of the de-dispersion algorithm, so that the device and method of the invention offer higher development efficiency and data processing capability at lower development cost. The device uses a high-performance GPU as the core data processing platform and raises GPU resource utilization by making efficient use of the GPU's global memory and shared memory, thereby reducing computation time. The invention achieves multi-task parallel processing, greatly improves the computing performance of pulsar de-dispersion, increases processing speed, meets the requirement for real-time de-dispersion of massive data, reduces the volume of data to be stored, and lowers system cost.

Drawings

Fig. 1 is a data processing block diagram of a pulsar de-dispersion device according to an embodiment of the present invention.

Fig. 2 is a CUDA flow chart of the search mode of a pulsar de-dispersion method according to an embodiment of the present invention.

Fig. 3 is a system overview of a pulsar de-dispersion device according to an embodiment of the present invention.

Fig. 4 is a schematic diagram of the speedup of the pulsar de-dispersion method according to an embodiment of the present invention relative to a conventional CPU de-dispersion method.

Detailed Description

The invention will be further illustrated with reference to specific examples. It should be understood that the following examples are illustrative of the present invention and are not intended to limit the scope of the present invention.

Referring to Fig. 1, a pulsar de-dispersion method according to an embodiment of the present invention has a switchable search mode and folding mode and comprises the following steps:

Step S1: Provide a data processing module 3 comprising an FFT parallel computing module 31 and a de-dispersion processing module 32, both located on a GPU, and use the FFT parallel computing module 31 to divide the baseband data of the pulsar into a plurality of mutually independent narrow-channel signals, obtaining the multi-channel data required for de-dispersion. The parallel computation of the FFT parallel computing module 31 is implemented with cuFFT functions.

The FFT parallel computing module 31 copies the baseband data from the ring buffer 23 in CPU memory to the global memory of the GPU, then uses the multithreaded parallel computing resources of the GPU to perform the fast Fourier transform at high speed, producing the multi-channel data.
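As a rough illustration of how an FFT splits baseband data into narrow channels, the sketch below uses a small pure-Python radix-2 FFT as a stand-in for cuFFT. The function names `fft` and `channelize`, the block length `n_fft`, and the choice to store detected power per channel are assumptions made for this example; the patent does not specify them.

```python
import cmath
import math


def fft(x):
    """Recursive radix-2 Cooley-Tukey FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        tw = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + tw
        out[k + n // 2] = even[k] - tw
    return out


def channelize(baseband, n_fft):
    """Split a real baseband series into consecutive blocks of n_fft samples.

    Returns spectra[t][ch], the power in channel ch of block t. Only the
    n_fft // 2 non-negative-frequency channels are kept.
    """
    n_chan = n_fft // 2
    spectra = []
    for start in range(0, len(baseband) - n_fft + 1, n_fft):
        block = [complex(v) for v in baseband[start:start + n_fft]]
        spec = fft(block)
        spectra.append([abs(spec[k]) ** 2 for k in range(n_chan)])
    return spectra


# Usage: a tone that falls exactly on bin 3 of a 16-point FFT.
tone = [math.cos(2 * math.pi * 3 * n / 16) for n in range(64)]
spectra = channelize(tone, 16)
```

Each row of `spectra` is one time sample of the multi-channel data; on the GPU, the blocks would be transformed in parallel rather than one after another.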

In addition, in the search mode, step S1 further includes writing the multi-channel data to disk for temporary storage.

In the folding mode, step S1 further includes keeping the multi-channel data in the video memory of the GPU.

Step S2: As shown in Fig. 1, the de-dispersion processing module 32 carries out the data exchange between the CPU and the GPU, the multithreaded task allocation, and the optimization of GPU parallel computing resources, specifically through the following steps:

Step S21: Initialize. The relevant variables, such as the number of channels, the DM value, the signal frequency range, and the bandwidth of each channel, are initialized from the parameters of the multi-channel data.

In addition, in the search mode, step S21 further includes reading the multi-channel data from disk into CPU memory before initialization. Thus the folding mode performs only a simple initialization, whereas the search mode first reads the multi-channel data from disk and then initializes.

Step S22: Calculate the dispersion shift. To speed up the parallel de-dispersion computation, the task is decomposed: the CPU calculates the dispersion shift and stores it in CPU memory.

Specifically, the task decomposition splits the formula of the conventional de-dispersion method into two parts, so that the part that can be parallelized is processed by the GPU with a high degree of parallelism.

The dispersion shift is given by:

$$\mathrm{shift} = 4.15 \times 10^{3} \times \left(f_1^{-2} - f_2^{-2}\right)$$

The dispersion shift is thus calculated serially on the CPU and the result is transferred to the GPU; all the remaining computation is then handed to the multiple threads of the GPU.
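The CPU-side calculation of step S22 can be sketched as follows. This is a minimal Python illustration, not the patented implementation. It assumes frequencies in MHz (with the constant 4.15×10³, the shift is then in seconds per unit DM), takes f₁ as each channel's center frequency and f₂ as the top of the band, and uses a hypothetical function name `dispersion_shifts`. The example band of 398 to 418 MHz corresponds to a 408 MHz center frequency and 20 MHz bandwidth; the channel count of 4 is chosen only to keep the example small.

```python
def dispersion_shifts(f_low_mhz, bandwidth_mhz, n_channels):
    """Return one dispersion shift per channel, in seconds per unit DM.

    Assumption: each channel is referenced to the top of the band, so the
    highest-frequency channel has the smallest shift.
    """
    chan_bw = bandwidth_mhz / n_channels
    f_top = f_low_mhz + bandwidth_mhz          # f2: high frequency
    shifts = []
    for ch in range(n_channels):
        f_ch = f_low_mhz + (ch + 0.5) * chan_bw  # f1: channel center
        shifts.append(4.15e3 * (f_ch ** -2 - f_top ** -2))
    return shifts


# Usage: 4 channels across 398-418 MHz.
shifts = dispersion_shifts(398.0, 20.0, 4)
```

Because the shifts depend only on the channel frequencies and not on the trial DM, they can be computed once on the CPU and reused for every DM on the GPU.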

Step S23: Allocate GPU memory. Because data processing on the CPU and on the GPU is relatively independent, the memory space used by the GPU is prepared in advance.

Step S24: In the search mode, both the multi-channel data and the dispersion shift are copied into the GPU memory allocated in step S23. In the folding mode, the multi-channel data is already on the GPU, so only the dispersion shift is copied into the GPU memory allocated in step S23.

The CPU and the GPU have separate memory spaces and cannot directly access each other's parameters and variables. For the GPU to process the data, the dispersion shift must first be copied from the CPU into the GPU memory allocated in step S23. In step S24, the memory management functions of C and/or the CUDA API are used to copy the multi-channel data and the dispersion shift into the GPU memory allocated in step S23, which realizes the data transfer between GPU video memory and CPU memory.

In step S24, the threads of a thread block fetch the dispersion shift of each channel of the multi-channel data and store it in shared memory on the GPU, which completes the copy of the dispersion shift.

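Step S25: Call the GPU kernel function. The CPU calls the kernel function of the GPU and hands the computation to the GPU for de-dispersion: multiple threads on the GPU calculate the delay time of each channel of the multi-channel data, accumulate the delay-compensated values to obtain the de-dispersion result, and write the result to the global memory of the GPU.

The delay time of the frequency channel is:

$$t_{DM} = \mathrm{shift} \times DM / t_{samp}$$

where $t_{samp}$ is the sampling period of the signal, $t_{DM}$ is the delay time of the frequency channel, DM is the dispersion measure, and shift is the dispersion shift.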

The dispersion shift used in the de-dispersion is held in shared memory, which hides the access latency of global memory. Shared memory is on-chip storage in the GPU, so it offers higher bandwidth and lower access latency than the GPU's local memory or global memory, which improves the computing performance of the GPU.
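To make the delay formula of claim 1, t_DM = shift × DM / t_samp, concrete, the following sketch converts per-channel shifts into whole-sample delays for several trial DM values. The function names, the rounding to the nearest sample, and the numerical values (shifts and a 1 ms sampling period) are hypothetical and chosen only for illustration.

```python
def delay_in_samples(shift, dm, t_samp):
    """Delay of one channel, shift * DM / t_samp, rounded to whole samples."""
    return int(round(shift * dm / t_samp))


def delay_table(shifts, dm_trials, t_samp):
    """One row per trial DM, one column per frequency channel."""
    return [[delay_in_samples(s, dm, t_samp) for s in shifts]
            for dm in dm_trials]


# Usage with hypothetical values: three channels, three trial DMs.
example_shifts = [2.0e-3, 1.0e-3, 0.0]   # seconds per unit DM
table = delay_table(example_shifts, dm_trials=[0.0, 10.0, 20.0], t_samp=1.0e-3)
```

The delay grows linearly with DM, so only the small per-channel shift array needs to be kept in fast memory; each thread can derive its own delays from it.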

Step S26: Copy the de-dispersion result to CPU memory. The CPU calls the cudaMemcpy() function to copy the de-dispersion result to CPU memory, until de-dispersion of all the baseband data is finished, and then the GPU memory allocated in step S23 is released.

Step S3: Write the de-dispersion result in CPU memory to a file for storage. The file is stored on disk and comprises single-channel time series data, file header information, and a data part; after folding, a clear pulsar signal profile can be seen. The file header information includes the number of channels, signal bandwidth, sampling rate, dispersion measure, and so on.
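The patent does not define the on-disk layout, so the following is only a sketch of one possible file format with a header followed by a data part. The field order, the little-endian binary encoding, the extra sample-count field, and the names `HEADER`, `write_result`, and `read_result` are all assumptions made for illustration.

```python
import io
import struct

# Hypothetical header layout: n_channels (uint32), bandwidth in MHz (float64),
# sampling rate in Hz (float64), DM (float64), number of samples (uint32).
HEADER = struct.Struct("<IdddI")


def write_result(fobj, n_channels, bandwidth_mhz, sampling_rate_hz, dm, series):
    """Write the header, then the de-dispersed time series as float32 values."""
    fobj.write(HEADER.pack(n_channels, bandwidth_mhz, sampling_rate_hz,
                           dm, len(series)))
    fobj.write(struct.pack("<%df" % len(series), *series))


def read_result(fobj):
    """Read back the header fields and the time series."""
    n_channels, bandwidth_mhz, sampling_rate_hz, dm, n = HEADER.unpack(
        fobj.read(HEADER.size))
    series = list(struct.unpack("<%df" % n, fobj.read(4 * n)))
    header = {"n_channels": n_channels, "bandwidth_mhz": bandwidth_mhz,
              "sampling_rate_hz": sampling_rate_hz, "dm": dm}
    return header, series


# Usage: round-trip through an in-memory buffer instead of a disk file.
buf = io.BytesIO()
write_result(buf, 512, 20.0, 15625.0, 26.76, [1.0, 2.5, 0.25])
buf.seek(0)
header, series = read_result(buf)
```

Storing the parameters in the header lets a later folding step interpret the time series without access to the original observation settings.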

Because the pulsar search mode is so computationally demanding, the GPU acceleration method of the invention shows its performance advantage most clearly there. The CUDA program flow of the search mode is shown in Fig. 2 and comprises data reading and initialization, dispersion shift calculation, GPU memory allocation, data copying, the GPU kernel function call, copying of the processing result, file writing, and so on. The computational complexity of de-dispersion is the number of DM values × the number of frequency channels × the number of samples. When the parallel de-dispersion tasks are processed on the GPU, the GPU needs neither large buffers nor complicated flow control: the de-dispersion processing module 32 runs the same CUDA code on all GPU threads, parallelizing over the DM and sample-time dimensions, which yields higher computational efficiency. For each sample time, the de-dispersion processing module 32 accumulates the delay-compensated values of the multiple frequency channels; that is, each GPU thread performs the summation over the frequency channels.
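The accumulation described above can be sketched in plain Python. Each iteration of the two outer loops corresponds to the work one GPU thread would do for one (DM, sample time) pair: look up each channel's delay and sum the delay-compensated samples across channels. The function name `dedisperse`, the `data[channel][sample]` layout, and the rounding of delays to whole samples are assumptions of this sketch; it runs serially and is not the CUDA kernel itself.

```python
def dedisperse(data, shifts, dm_trials, t_samp):
    """Sum delay-compensated channels for every trial DM.

    data[ch][t] holds the multi-channel samples, shifts[ch] is the dispersion
    shift of channel ch (assumed non-negative), and the result out[i][t] is
    the de-dispersed time series for dm_trials[i].
    """
    n_chan = len(data)
    n_samp = len(data[0])
    out = []
    for dm in dm_trials:                       # parallel over DM on a GPU
        delays = [int(round(s * dm / t_samp)) for s in shifts]
        n_out = n_samp - max(delays)           # samples with all channels valid
        row = []
        for t in range(n_out):                 # parallel over time on a GPU
            acc = 0.0
            for ch in range(n_chan):           # serial sum inside one thread
                acc += data[ch][t + delays[ch]]
            row.append(acc)
        out.append(row)
    return out


# Usage: a pulse that arrives 2, 1 and 0 samples late in three channels.
data = [[0.0] * 16 for _ in range(3)]
data[0][7] = 1.0
data[1][6] = 1.0
data[2][5] = 1.0
out = dedisperse(data, [2.0e-3, 1.0e-3, 0.0], [0.0, 1.0], 1.0e-3)
```

With the correct trial DM the three channel pulses line up and add to a single peak, while at DM 0 they stay spread over three samples. The triple loop also makes the stated cost of DM count × channel count × sample count visible.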

The invention maps the computationally heavy tasks, namely the FFT algorithm and the accumulation over frequency channels, onto the multiple threads of the GPU for parallel processing. This reduces the system's dependence on CPU performance and greatly improves the data processing performance of the whole system.

Tables 1 and 2 show the results of processing multi-channel data loaded into the GPU, with a center frequency of 408 MHz, 131072 samples per channel, and a bandwidth of 20 MHz.

Table 1. CPU and GPU de-dispersion time (unit: s)

Table 2. CPU and GPU de-dispersion time (unit: s)

As Tables 1 and 2 show, when the number of channels is fixed, the processing time of both the GPU and the CPU increases linearly with the number of DM values, and the GPU de-dispersion time is far shorter than the CPU computation time. When the number of DM values is fixed, both the CPU and the GPU need more time as the number of frequency channels increases, but the CPU computation time is many times longer than that of the GPU. The speedup of the GPU de-dispersion method is shown in Fig. 4: the speedup of the parallel algorithm increases with the number of DM values. When the number of DM values is 2560, the TITAN V GPU reaches its highest speedup (538 times the CPU speed), after which the speedup begins to drop. The larger the number of channels, the larger the speedup of the GPU de-dispersion and the better the acceleration obtained.

In short, the GPU de-dispersion time is far shorter than the CPU computation time, with the computing speed differing by a factor of hundreds, so the GPU greatly shortens the execution time of de-dispersion. The de-dispersion method effectively solves the problem that real-time processing cannot be carried out on a CPU platform because of the enormous amount of de-dispersion computation.

Fig. 3 shows a pulsar de-dispersion device 100 according to an embodiment of the present invention, which is suitable for de-dispersion of pulsar signals and related scientific research. The pulsar de-dispersion device 100 is connected to a receiver 200 of a radio telescope and consists mainly of three modules: a signal digitizing module 1, a data receiving module 2, and a data processing module 3.

As shown in Figs. 3-4, the signal digitizing module 1 is a signal processing platform located on a programmable logic device (field-programmable gate array, FPGA) and includes a sampling module 11, a data packet generating module 12, and a 10 Gigabit Ethernet (10 GbE) interface 13. The sampling module 11 digitizes the pulsar analog signal from the receiver 200, converting it into a digital signal; the data packet generating module 12 generates data packets of baseband data; and the 10 GbE interface 13 sends the data packets to the data receiving module 2. Apart from digitizing the signal and generating data packets, the signal digitizing module 1 does not process the data in any way, so the transmitted data is raw pulsar data, i.e. baseband data.

In this embodiment, the signal digitizing module 1 is a ROACH2 hardware platform from CASPER at the University of California, Berkeley. Its core processor is a Xilinx Virtex X series FPGA chip, which connects via Z-DOK to an A/D sampling board providing two channels, 8-bit resolution, and a maximum sampling rate of 5 GHz. The ROACH2 hardware platform is also equipped with eight 10 GbE interfaces 13; specifically, a 10 GbE network card module connects to a 10 Gigabit Ethernet switch, enabling high-speed, real-time data transmission. The software development environment of the signal digitizing module 1 comprises Xilinx System Generator, MATLAB/Simulink, and the signal processing development library provided by CASPER, so that graphical programming lowers the difficulty of FPGA development and makes the FPGA design efficient. The data packets generated by the data packet generating module 12 are preferably UDP packets.

The data receiving module 2 is located on the CPU and includes a 10 GbE network card 21, a data packet receiving module 22, and a high-speed ring buffer 23 implemented in CPU memory. The 10 GbE network card 21 receives the data packets sent by the signal digitizing module and unpacks them to obtain the baseband data, and the data packet receiving module 22 writes the baseband data into the ring buffer 23. The data receiving module 2 is thus arranged to receive the data packets sent by the signal digitizing module, unpack them, and write the unpacked baseband data into the ring buffer 23. The data in the packets received by the network card 21 is unprocessed baseband data, so its volume is large and it cannot be sent directly to the GPU for processing; a ring buffer 23 is therefore set up in CPU memory as temporary storage. While data is being buffered continuously, the next process copies the data directly from the ring buffer 23 into the video memory of the GPU described below, in first-in, first-out order.

The baseband data includes a plurality of data elements, and the ring buffer 23 stores the baseband data and delivers it in first-in, first-out order, which greatly improves data access and processing speed while avoiding data movement. The size of the ring buffer 23 is not fixed and can be set as required, for example 16 MB to 128 MB.

The key to the buffering technique is a first-in, first-out buffer management algorithm. The ring buffer 23 is a data structure that represents a fixed-size buffer whose ends are joined, which makes it well suited to buffering data streams. The ring buffer 23 is arranged such that after one data element of the baseband data is processed, the remaining data elements do not need to be moved from their storage locations. While data elements are being buffered continuously, the baseband data is therefore copied directly from CPU memory into GPU video memory for processing in first-in, first-out order, which reduces packet loss and solves the problem of high-speed data transmission between heterogeneous platforms. By contrast, a non-circular buffer requires the remaining data elements to be moved forward after one data element is consumed. By implementing the ring buffer, the invention effectively improves data exchange on the heterogeneous platform.
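The property that consumed elements never force the remaining ones to move can be seen in a minimal ring buffer sketch. The class name `RingBuffer` and its methods are hypothetical, and the sketch is single-threaded; a shared-memory buffer between a receiving process and a GPU-feeding process would additionally need synchronization, which is omitted here.

```python
class RingBuffer:
    """Fixed-size FIFO whose read and write positions wrap around."""

    def __init__(self, capacity):
        self._slots = [None] * capacity
        self._head = 0      # index of the oldest stored element
        self._count = 0     # number of stored elements

    def put(self, item):
        if self._count == len(self._slots):
            raise BufferError("ring buffer full")
        tail = (self._head + self._count) % len(self._slots)
        self._slots[tail] = item
        self._count += 1

    def get(self):
        if self._count == 0:
            raise BufferError("ring buffer empty")
        item = self._slots[self._head]
        self._slots[self._head] = None
        # Only the head index advances; no stored element changes slot.
        self._head = (self._head + 1) % len(self._slots)
        self._count -= 1
        return item

    def __len__(self):
        return self._count


# Usage: fill the buffer, consume one element, then write a new one.
rb = RingBuffer(3)
for block in ("a", "b", "c"):
    rb.put(block)
first = rb.get()
rb.put("d")          # reuses the slot freed by "a"
```

After the final `put`, blocks "b" and "c" are still in the slots they were first written to, and "d" occupies the slot that "a" vacated.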

The data processing module 3 is located on the GPU, so that the GPU's multiple threads are dedicated to the parallel computing tasks. The data processing module 3 comprises an FFT parallel computing module 31 and a de-dispersion processing module 32, and is arranged to first read the baseband data from the ring buffer 23 and then apply the pulsar de-dispersion method described above to it, so that the FFT multi-channel filtering and the de-dispersion are carried out in parallel on a high-performance GPU platform. As described above, the signal processing flow comprises buffered data acquisition, FFT, initialization, dispersion shift calculation, video memory allocation, data transfer, the GPU kernel function call, writing of the processing result, and so on, completing the data exchange between the CPU and the GPU, the multithreaded task allocation, and the optimization of GPU parallel computing resources.

The FFT parallel computing module 31 uses the cuFFT library of the CUDA parallel computing architecture. This greatly increases FFT speed; cuFFT allows flexible data layouts and handles 1D FFT transforms efficiently. The GPU cannot communicate directly with CPU memory; data is exchanged between main memory and video memory over the PCI-E bus.

In this embodiment, the CPU is an Intel Xeon E5-1620 and the GPU is an NVIDIA TITAN V of the new-generation GeForce series, and the software environment is based on the latest CUDA and a Linux system. To improve the data transfer and computing performance of the GPU, the design uses the streams provided by CUDA: the two-way transfers between CPU memory and GPU video memory are split up so that the GPU can compute while data is being exchanged between CPU memory and GPU video memory.

The foregoing is only a preferred embodiment of the present invention and is not intended to limit its scope; various modifications can be made to the above embodiment. All simple, equivalent changes and modifications made in accordance with the claims and the specification of the present application fall within the scope of the patent claims. Matters not described in detail in the present invention are conventional technology.

Claims

1. A pulsar de-dispersion method comprising a switchable search mode and a folding mode, comprising:

step S1: providing a data processing module (3), wherein the data processing module (3) comprises an FFT parallel computing module (31) and a de-dispersion processing module (32), and using the FFT parallel computing module to divide the baseband data of the pulsar into a plurality of mutually independent narrow-channel signals so as to obtain multi-channel data;

step S2: using the de-dispersion processing module to carry out the data exchange between the CPU and the GPU, the multithreaded task allocation, and the optimization of GPU parallel computing resources, comprising:

step S21: initializing the relevant variables from the parameters of the multi-channel data, the variables comprising the number of channels, the DM value, the signal frequency range, and the bandwidth of each channel;

step S22: to speed up the parallel de-dispersion computation, decomposing the task, having the CPU calculate the dispersion shift, and storing the dispersion shift in CPU memory;

the dispersion shift is given by:

$$\mathrm{shift} = 4.15 \times 10^{3} \times \left(f_1^{-2} - f_2^{-2}\right)$$

where shift is the dispersion shift, and $f_1$ and $f_2$ are channel frequencies of the multi-channel data, the low frequency and the high frequency respectively;

step S23: allocating GPU memory;

the delay time of the frequency channel is:

$$t_{DM} = \mathrm{shift} \times DM / t_{samp},$$

where $t_{samp}$ is the sampling period of the signal, $t_{DM}$ is the delay time of the frequency channel, DM is the dispersion measure, and shift is the dispersion shift;

2. The pulsar de-dispersion method according to claim 1, wherein in the search mode, step S1 further includes: writing the multi-channel data to disk to store it temporarily; step S21 further includes: before initialization, reading the multi-channel data from disk and storing it in CPU memory; and step S24 further includes: copying the multi-channel data into the GPU memory allocated in step S23;

3. The pulsar de-dispersion method according to claim 1, wherein in step S24, the threads of a thread block fetch the dispersion shift of each channel of the multi-channel data and store it in shared memory on the GPU.

4. The pulsar de-dispersion method according to claim 1, characterized in that in step S1, the parallel computation of the FFT parallel computing module is implemented with cuFFT functions; and step S24 is implemented with the memory management functions of C and/or the CUDA API.

5. The pulsar de-dispersion method according to claim 1, further comprising step S3: writing the de-dispersion result in CPU memory to a file, wherein the file is stored on disk and comprises single-channel time series data, file header information, and a data part, and the file header information comprises the number of channels, signal bandwidth, sampling rate, and dispersion measure.

6. A pulsar de-dispersion device, connected to a receiver (200), comprising:

a signal digitizing module (1), located on a programmable logic device and configured to digitize the pulsar analog signal from the receiver (200), converting it into a digital signal, and then to generate and transmit data packets;

a data receiving module (2), located on the CPU and including a ring buffer (23) in CPU memory, configured to receive the data packets sent by the signal digitizing module, unpack them, and write the unpacked baseband data into the ring buffer (23);

a data processing module (3), located on a GPU platform and having two switchable modes, a folding mode and a search mode, configured to read the baseband data from the ring buffer (23) and to apply the pulsar de-dispersion method according to one of claims 1 to 5 to the baseband data.

7. The pulsar de-dispersion device according to claim 6, characterized in that the signal digitizing module (1) is a ROACH2 hardware platform from CASPER at the University of California, Berkeley, which is equipped with eight 10 Gigabit Ethernet interfaces (13) and generates UDP data packets to transmit baseband data at high speed; the signal digitizing module (1) is a signal processing platform located on an FPGA, and the FPGA is developed by graphical programming.

8. The pulsar de-dispersion device according to claim 6, wherein the baseband data comprises a plurality of data elements, and the ring buffer is configured such that when one data element of the baseband data is processed, the remaining data elements do not need to be moved from their storage locations.