CN111368252A - Pulsar coherent de-dispersion system and method - Google Patents

Publication number
CN111368252A
Authority
CN
China
Prior art keywords
gpu
cpu
data
coherent
processing
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010130257.4A
Other languages
Chinese (zh)
Inventor
托乎提努尔
王娜
张海龙
王杰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xinjiang Astronomical Observatory of CAS
Original Assignee
Xinjiang Astronomical Observatory of CAS

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10 Complex mathematical operations
    • G06F17/14 Fourier, Walsh or analogous domain transformations, e.g. Laplace, Hilbert, Karhunen-Loeve, transforms
    • G06F17/141 Discrete Fourier transforms
    • G06F17/142 Fast Fourier transforms, e.g. using a Cooley-Tukey type algorithm
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T1/00 General purpose image data processing
    • G06T1/20 Processor architectures; Processor configuration, e.g. pipelining
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10032 Satellite or aerial image; Remote sensing

Landscapes

  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Theoretical Computer Science (AREA)
  • Mathematical Optimization (AREA)
  • Computational Mathematics (AREA)
  • Mathematical Analysis (AREA)
  • Pure & Applied Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Discrete Mathematics (AREA)
  • Algebra (AREA)
  • Databases & Information Systems (AREA)
  • Software Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Complex Calculations (AREA)

Abstract

A coherent de-dispersion method comprising the steps of: reading pulsar baseband data; initializing the variables and parameters of the host-side CPU (central processing unit) and the device-side GPU (graphics processing unit), where the CPU and the GPU process data relatively independently and exchange data by passing pointers; copying the data from CPU memory to GPU video memory; setting up and initializing an FFT plan; computing the FFT; launching, from the CPU, a GPU kernel function that distributes the work across GPU threads and multiplies the pulsar signal by the interstellar-medium chirp function in the frequency domain, so that the coherent de-dispersion algorithm is executed in parallel by the threads of the device-side GPU; setting up a 1D inverse FFT (IFFT) plan and converting the result of the GPU kernel back into a time-domain signal; copying the result to the CPU and removing the overlapped part of the data; and writing the output file and, once all the data have been de-dispersed, releasing the memory resources allocated on the GPU device. The invention solves the problem that the coherent de-dispersion algorithm cannot run in real time on a CPU platform because of its huge computational load.

Description

Pulsar coherent de-dispersion system and method
Technical Field
The invention relates to the technical field of pulsar signal observation and search, in particular to a pulsar coherent de-dispersion system and a pulsar coherent de-dispersion method.
Background
A pulsar is a rapidly rotating neutron star with very high density and a stable period. While spinning at high speed about its rotation axis, it emits electromagnetic waves along the direction of its magnetic poles, and a radio telescope on Earth receives a periodic pulse signal each time the beam sweeps across the Earth. Pulsar signals are affected by the interstellar medium as they propagate through space. Because the interstellar medium is dispersive, radio waves of different frequencies travel at different speeds, with high frequencies propagating faster than low frequencies, so the arrival time of the pulsar signal at the radio telescope is delayed by an amount that depends on frequency. Increasing the bandwidth therefore broadens the pulse: the pulse energy is smeared out, the pulse profile is distorted, the sensitivity drops, and the pulse may even be washed out entirely.
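To give a feel for the size of this effect, the following minimal sketch evaluates the standard cold-plasma dispersion delay between the two edges of an observing band. The formula and the dispersion constant are textbook values rather than something stated in this document, and the band edges and DM used below are hypothetical example values.

```python
# Dispersion constant in s MHz^2 pc^-1 cm^3 (standard textbook value,
# an assumption here: the patent text does not quote it).
K_DM = 4.148808e3


def dispersion_delay_s(dm, f_lo_mhz, f_hi_mhz):
    """Delay in seconds of the low band edge relative to the high band edge."""
    return K_DM * dm * (f_lo_mhz ** -2 - f_hi_mhz ** -2)


# Hypothetical example: DM = 71 pc cm^-3 observed across 1400-1600 MHz.
delay = dispersion_delay_s(71.0, 1400.0, 1600.0)
print(f"smearing across the band: {delay * 1e3:.2f} ms")
```

Because the delay scales with the difference of inverse squared frequencies, widening the band or lowering the observing frequency quickly increases the smearing that de-dispersion has to undo.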
Since the pulsar signal is extremely weak, it must be de-dispersed in order to observe a clearly visible pulse profile. De-dispersion can effectively improve the sensitivity of astronomical observations and the ability of an observing system to identify and detect pulsars. In recent years, pulsar research and observation have placed higher demands on de-dispersion technology. De-dispersion systems with ultra-wide bandwidth and high-speed signal processing capability are an inevitable trend in the development of future radio pulsar observing equipment, and the related technologies face great challenges. Existing coherent de-dispersion processing has the following shortcomings:
(1) Coherent de-dispersion of pulsar signals is computationally very expensive, comprising an FFT, an IFFT and multiplication by the chirp function. Existing de-dispersion methods are relatively inefficient and slow, so they cannot meet the requirements of high-speed, real-time pulsar observation. At present, coherent de-dispersion is commonly implemented serially on a computer; because it runs on CPU threads, it cannot be highly parallelized, and it is inefficient and slow.
(2) As the performance of observing equipment improves, the frequency range of celestial signals observable by radio astronomy is expanding rapidly. With ever-increasing observing bandwidth and ever-higher resolution, the volume of data generated is huge, and existing coherent de-dispersion technology cannot process such massive data quickly in real time. For example, leading-edge observing devices such as ultra-wideband receivers, multi-beam receivers and phased array feed (PAF) receivers generate very large amounts of data, usually on the order of terabytes (TB), and processing such data in real time poses unprecedented challenges for coherent de-dispersion techniques and algorithms.
(3) Because existing CPU coherent de-dispersion methods are slow and have poor real-time performance, they cannot meet the requirements of pulsar searching. Pulsar searches therefore generally use incoherent de-dispersion, which requires less computation but cannot completely remove the dispersion effect and degrades the signal-to-noise ratio to some extent.
Disclosure of Invention
It is therefore an objective of the present invention to provide a pulsar coherent de-dispersion system and method that at least partially solve at least one of the above-mentioned problems.
To achieve the above objective, as one aspect of the present invention, there is provided a coherent de-dispersion method comprising the following steps:
step 1: reading pulsar baseband data, and initializing the variables and parameters of the host-side CPU (central processing unit) and the device-side GPU (graphics processing unit);
step 2: the CPU and the GPU process data relatively independently and exchange data by passing pointers;
step 3: copying the data from CPU memory to GPU video memory;
step 4: setting up and initializing the FFT plan, using cufftPlan1d(&plan, fftsize, CUFFT_C2C, BATCH) to configure a 1D complex-to-complex FFT;
step 5: computing the FFT;
step 6: the CPU launches a GPU kernel function and distributes the work across GPU threads; the kernel multiplies the pulsar signal by the interstellar-medium chirp function in the frequency domain, so that the coherent de-dispersion algorithm is executed in parallel by the threads of the device-side GPU;
step 7: setting up a 1D inverse FFT (IFFT) plan and computing the inverse fast Fourier transform, which converts the result of the GPU kernel into a time-domain signal;
step 8: copying the result to the CPU and removing the overlapped part of the data;
step 9: writing the output file and, once all the data have been de-dispersed, releasing the memory resources allocated on the GPU device.
In step 1, the pulsar baseband data are in PSRDADA format, and the file comprises header information and a data part;
the variables and parameters of the host-side CPU and the device-side GPU include the observing frequency, the bandwidth and the DM (dispersion measure) value.
In step 2, the host-side CPU uses the cudaMalloc function to allocate GPU memory.
In step 3, data are transferred between GPU video memory and CPU memory through the memory management functions of C and the CUDA API.
In the FFT operation of step 5, each FFT takes N complex samples, of which N_DM samples overlap with the neighbouring FFT. The FFT is implemented with the cuFFT library of the CUDA parallel architecture; the cuFFT library functions are global functions that are valid throughout the program and can be called only from the host.
After the cuFFT function call completes, control returns to the host.
In step 6, the inverse transfer function of the interstellar medium is computed on the GPU. The difference from the CPU algorithm is that the complex multiplications of the FFT output by H^-1(f) are computed independently and in parallel on the GPU, which saves time and reduces memory-access latency.
In step 7, the cufftExecC2C() function of cuFFT is used to compute the inverse fast Fourier transform in parallel at high speed, converting the result of the GPU kernel into a time-domain signal.
As another aspect of the invention, there is also provided a coherent de-dispersion system comprising a CPU and a GPU, with a software development environment comprising CUDA and a Linux operating system.
The CPU is an Intel Xeon E5-1620, the GPU is an NVIDIA GPU, the CUDA version is CUDA 10.0, and the Linux operating system is Ubuntu 18.04.
Based on the above technical solution, the system and method for coherent de-dispersion of pulsar according to the present invention have at least one of the following advantages over the prior art:
(1) The invention solves the problem that the coherent de-dispersion algorithm cannot run in real time on a CPU platform because of its huge computational load. The GPU de-dispersion algorithm achieves a high speed-up ratio and greatly improves computing performance, making full use of the advantages of the GPU parallel computing platform and meeting the need for real-time de-dispersion of massive astronomical data. The GPU coherent de-dispersion method is easy to scale out to GPU clusters, which makes real-time processing of massive data easier to achieve, and it has broad prospects in pulsar research.
(2) The invention implements CUDA multi-thread task allocation, management and communication, makes efficient use of the multi-level memory hierarchy of the GPU, improves GPU resource utilization and further reduces computing time. It processes multiple tasks in parallel, greatly improves the computing performance and speed of pulsar signal de-dispersion, and meets the need for real-time coherent de-dispersion of massive data.
(3) The pulsar coherent de-dispersion system and method provided by the invention effectively solve the problem of de-dispersing pulsar signals. They can quickly recover the true profile of the pulsar signal, greatly improve its signal-to-noise ratio and the ability to detect pulsars, and can be used in fast radio burst and pulsar searches to obtain a higher signal-to-noise ratio.
Drawings
FIG. 1 is a flow chart of a method for coherent de-dispersion using a GPU according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of an overlapped FFT in an embodiment of the present invention;
FIG. 3 is a diagram of a Kernel thread layout in an embodiment of the invention;
FIG. 4 is a graph comparing the elapsed time on two GPU platforms, TITAN V and Tesla K20, according to an embodiment of the present invention;
FIG. 5 is a schematic diagram of the speed-up ratio of GPU coherent de-dispersion in an embodiment of the present invention;
FIG. 6 is a graphical representation of the results of coherent de-dispersion processing according to an embodiment of the present invention.
Detailed Description
The invention provides a pulsar coherent de-dispersion system and method that remove the dispersion effect from pulsar signals at high speed, effectively increase the data processing speed of the de-dispersion system, optimize the distribution of computing tasks on a CPU + GPU platform and make full use of the GPU computing platform. The coherent de-dispersion system implemented by the invention uses an Intel Xeon E5-1620 CPU and an NVIDIA GPU, and the software development environment uses CUDA 10.0 and a Linux operating system (Ubuntu 18.04).
Specifically, the invention discloses a coherent de-dispersion method which, as shown in FIG. 1, comprises the following steps:
step 1: reading pulsar baseband data, and initializing the variables and parameters of the host-side CPU (central processing unit) and the device-side GPU (graphics processing unit);
step 2: the CPU and the GPU process data relatively independently and exchange data by passing pointers;
step 3: copying the data from CPU memory to GPU video memory;
step 4: setting up and initializing the FFT plan, using cufftPlan1d(&plan, fftsize, CUFFT_C2C, BATCH) to configure a 1D complex-to-complex FFT;
step 5: computing the FFT;
step 6: the CPU launches a GPU kernel function and distributes the work across GPU threads; the kernel multiplies the pulsar signal by the interstellar-medium chirp function in the frequency domain, so that the coherent de-dispersion algorithm is executed in parallel by the threads of the device-side GPU;
step 7: setting up a 1D inverse FFT (IFFT) plan and computing the inverse fast Fourier transform, which converts the result of the GPU kernel into a time-domain signal;
step 8: copying the result to the CPU and removing the overlapped part of the data;
step 9: writing the output file and, once all the data have been de-dispersed, releasing the memory resources allocated on the GPU device.
In step 1, the pulsar baseband data are in PSRDADA format, and the file comprises header information and a data part;
the variables and parameters of the host-side CPU and the device-side GPU include the observing frequency, the bandwidth and the DM (dispersion measure) value.
In step 2, the host-side CPU uses the cudaMalloc function to allocate GPU memory.
In step 3, data are transferred between GPU video memory and CPU memory through the memory management functions of C and the CUDA API.
In the FFT operation of step 5, each FFT takes N complex samples, of which N_DM samples overlap with the neighbouring FFT. The FFT is implemented with the cuFFT library of the CUDA parallel architecture; the cuFFT library functions are global functions that are valid throughout the program and can be called only from the host.
After the cuFFT function call completes, control returns to the host.
In step 6, the inverse transfer function of the interstellar medium is computed on the GPU. The difference from the CPU algorithm is that the complex multiplications of the FFT output by H^-1(f) are computed independently and in parallel on the GPU, which saves time and reduces memory-access latency.
In step 7, the cufftExecC2C() function of cuFFT is used to compute the inverse fast Fourier transform in parallel at high speed, converting the result of the GPU kernel into a time-domain signal.
The invention also discloses a coherent de-dispersion system comprising a CPU and a GPU, with a software development environment comprising CUDA and a Linux operating system.
The CPU is an Intel Xeon E5-1620, the GPU is an NVIDIA GPU, the CUDA version is CUDA 10.0, and the Linux operating system is Ubuntu 18.04.
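Step 6 multiplies the spectrum by the inverse transfer function of the interstellar medium, but this document does not write that function out. The sketch below uses one form commonly given in the pulsar literature and should be read as an assumption: the dispersion constant, the phase expression, the sign convention (which depends on the FFT convention and the sideband) and the example parameters are not taken from the patent. It is plain Python on the CPU, not the GPU kernel itself.

```python
import cmath

# Dispersion constant in s MHz^2 pc^-1 cm^3 (textbook value, an assumption).
K_DM = 4.148808e3


def chirp(offset_mhz, f0_mhz, dm):
    """Assumed transfer function H at frequency f0 + offset (unit modulus)."""
    # The factor 1e6 converts s * MHz into a dimensionless number of cycles.
    cycles = (K_DM * 1e6 * dm * offset_mhz ** 2
              / ((f0_mhz + offset_mhz) * f0_mhz ** 2))
    return cmath.exp(2j * cmath.pi * cycles)


def inverse_chirp_table(n, f0_mhz, bw_mhz, dm):
    """H^-1(f) sampled on the n FFT bins of a complex band centred on f0."""
    table = []
    for k in range(n):
        # FFT bin order: bins 0..n/2-1 are offsets >= 0, the rest are negative.
        offset = (k if k < n // 2 else k - n) * bw_mhz / n
        # H has unit modulus, so its inverse is its complex conjugate.
        table.append(chirp(offset, f0_mhz, dm).conjugate())
    return table


# Hypothetical parameters: 8 bins, 1500 MHz centre, 16 MHz band, DM = 71.
table = inverse_chirp_table(8, 1500.0, 16.0, 71.0)
print([round(abs(h), 6) for h in table])
```

Since every entry is a pure phase, the filter changes only the relative timing of the frequency components and leaves their amplitudes untouched, which is why the multiplication is independent per bin and maps naturally onto one GPU thread per sample.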
In order that the objects, technical solutions and advantages of the present invention will become more apparent, the present invention will be further described in detail with reference to the accompanying drawings in conjunction with the following specific embodiments.
Coherent de-dispersion operates directly on the baseband signal; it is computationally expensive and relatively complex to implement. First, the pulsar baseband data are read and the inverse transfer function (chirp function) of the interstellar medium is computed; the data are then convolved with this inverse transfer function. Because filtering in the time domain is computationally expensive and complex to implement, the time-domain signal is instead transformed into the frequency domain by an FFT and multiplied by the chirp function, and the product is transformed back to recover the de-dispersed time-domain signal. When coherent de-dispersion is performed in this FFT-based way, N complex samples are taken each time, with N_DM samples overlapping between consecutive segments; after de-dispersion, the overlapped portions are removed, as shown in FIG. 2.
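The segment-and-discard scheme described here is the classic overlap-save method. The sketch below is a CPU illustration in plain Python, not the patent's cuFFT implementation: a short causal FIR filter (`taps`, a hypothetical stand-in for the chirp) is applied with N-point FFT segments that share N_DM samples, and the result is checked against direct time-domain convolution. With a causal filter the corrupted samples sit at the start of each segment; for a real chirp, where they fall depends on its impulse response.

```python
import cmath


def fft(x, inverse=False):
    """Recursive radix-2 FFT (unnormalized); len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2], inverse)
    odd = fft(x[1::2], inverse)
    sign = 1.0 if inverse else -1.0
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def ifft(x):
    """Inverse FFT with the 1/n normalization applied."""
    return [v / len(x) for v in fft(x, inverse=True)]


def overlap_save(signal, taps, n):
    """Filter `signal` with FIR `taps` using overlapping n-point FFT segments."""
    n_dm = len(taps) - 1            # samples shared by consecutive segments
    step = n - n_dm                 # new samples consumed per segment
    response = fft(list(taps) + [0j] * (n - len(taps)))
    padded = [0j] * n_dm + list(signal)
    out = []
    for start in range(0, len(signal), step):
        segment = padded[start:start + n]
        segment += [0j] * (n - len(segment))     # zero-pad the last segment
        product = [a * b for a, b in zip(fft(segment), response)]
        out.extend(ifft(product)[n_dm:])         # discard the overlapped samples
    return out[:len(signal)]


def direct_convolve(signal, taps):
    """Reference result: plain time-domain convolution."""
    return [sum(taps[k] * signal[i - k] for k in range(len(taps)) if i - k >= 0)
            for i in range(len(signal))]


signal = [complex(i % 5, -(i % 3)) for i in range(40)]
taps = [1 + 0j, 0.5j, -0.25 + 0j]
fast = overlap_save(signal, taps, n=16)
slow = direct_convolve(signal, taps)
max_error = max(abs(a - b) for a, b in zip(fast, slow))
print(max_error)
```

Each segment yields only N minus N_DM useful samples, so for a fixed N_DM a longer FFT wastes a smaller fraction of the work on the overlap.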
In the GPU program, the kernel function uses the multithreaded structure of the GPU to perform the complex multiplications in parallel at high speed. cuFFT accelerates the FFT computation and writes its result into the global memory of the GPU; the host then launches the GPU threads and assigns each thread its computing task. The number of threads launched is equal to the FFT length. Each thread first computes the chirp function and then multiplies the FFT result by it, and finally writes the product back into the global memory of the GPU. Each GPU thread reads and writes global memory at the index derived from its thread id. The complex multiplication itself is carried out in GPU registers, which reduces global-memory access latency. The kernel thread layout is shown in FIG. 3; both the threads and the thread blocks use a 2D layout, and the global thread index is:
idx = iy × (blockDim.x × gridDim.x) + ix
where ix and iy are the global x-axis and y-axis coordinates of the thread:
ix = blockIdx.x × blockDim.x + threadIdx.x
iy = blockIdx.y × blockDim.y + threadIdx.y
In the CUDA kernel program, each thread is responsible for multiplication of one complex sample point.
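The index arithmetic can be checked without a GPU. The sketch below emulates a 2D launch in plain Python, assuming the row-major reading idx = iy × (blockDim.x × gridDim.x) + ix of the index formula; `launch_2d` and `multiply_kernel` are hypothetical names used only for this illustration, and the sequential loop stands in for threads that would run concurrently on a device.

```python
def launch_2d(grid, block, kernel, *args):
    """Emulate a 2D CUDA launch by visiting every (block, thread) pair in turn."""
    grid_x, grid_y = grid
    block_x, block_y = block
    for by in range(grid_y):
        for bx in range(grid_x):
            for ty in range(block_y):
                for tx in range(block_x):
                    ix = bx * block_x + tx              # global x coordinate
                    iy = by * block_y + ty              # global y coordinate
                    idx = iy * (block_x * grid_x) + ix  # row-major linear index
                    kernel(idx, *args)


def multiply_kernel(idx, spectrum, chirp, out):
    """One 'thread': multiply a single complex sample by its chirp value."""
    if idx < len(spectrum):          # guard threads that fall past the data
        out[idx] = spectrum[idx] * chirp[idx]


n = 60                               # deliberately not a multiple of the grid size
spectrum = [complex(i, 1) for i in range(n)]
chirp = [1j] * n
out = [None] * n
launch_2d((2, 2), (4, 4), multiply_kernel, spectrum, chirp, out)
print(out[:3])
```

The bounds check matters whenever the launch covers more threads than samples, as in this 64-thread example over 60 samples.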
FFT computation is the main factor limiting the performance of the coherent de-dispersion algorithm: as the number of FFT points increases, the amount of computation grows rapidly and real-time processing becomes difficult. Compared with the CPU algorithm, the GPU parallel algorithm achieves a speed-up of up to several tens of times.
Table 1 lists the coherent de-dispersion processing times of the CPU and of the two GPUs. Because the FFT is so computationally expensive, the processing times of the CPU, the Tesla K20 and the TITAN V all increase with the number of FFT points. Except at the shortest FFT length, where the Tesla K20 is slower than the CPU, the GPU times are far shorter than the CPU time.
TABLE 1 Coherent de-dispersion processing time (unit: ms)
FFT length    CPU          Tesla K20    TITAN V
2^10          0.223        0.798        0.136
2^13          1.474        0.933        0.221
2^16          11.633       1.582        0.873
2^19          152.008      7.703        6.255
2^22          1400.051     56.642       48.800
2^25          13152.976    549.455      483.201
FIG. 4 shows the computation time on the two GPU platforms. The coherent de-dispersion processing times of the TITAN V and the Tesla K20 do not differ greatly; once the number of FFT points exceeds 2^22, the elapsed-time curve rises rapidly and more time is needed to complete the processing.
FIG. 5 shows the speed-up ratio of the GPU coherent de-dispersion algorithm. When the number of FFT points reaches 2^22, the GPU parallel algorithm achieves its highest speed-up, about 28 times relative to the CPU. The speed-up of the TITAN V is higher than that of the Tesla K20, and the advantage of the TITAN V is more pronounced for larger data.
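The speed-up ratios can be recomputed from the timings in Table 1. The short sketch below does this, reading the FFT lengths in the table as powers of two (2^10 to 2^25) and taking the speed-up as CPU time divided by GPU time; the variable names are illustrative only.

```python
# Table 1 timings in milliseconds, keyed by log2 of the FFT length:
# (CPU, Tesla K20, TITAN V).
TIMES_MS = {
    10: (0.223, 0.798, 0.136),
    13: (1.474, 0.933, 0.221),
    16: (11.633, 1.582, 0.873),
    19: (152.008, 7.703, 6.255),
    22: (1400.051, 56.642, 48.800),
    25: (13152.976, 549.455, 483.201),
}


def speedups(times_ms):
    """Return {exponent: (K20 speed-up, TITAN V speed-up)} relative to the CPU."""
    return {exp: (cpu / k20, cpu / titan)
            for exp, (cpu, k20, titan) in times_ms.items()}


ratios = speedups(TIMES_MS)
for exp, (k20, titan) in sorted(ratios.items()):
    print(f"2^{exp}: Tesla K20 {k20:6.2f}x, TITAN V {titan:6.2f}x")

# FFT length at which the TITAN V speed-up peaks.
best = max(ratios, key=lambda exp: ratios[exp][1])
print(f"peak TITAN V speed-up at 2^{best}")
```

Both GPUs peak at 2^22 points, and at 2^10 points the Tesla K20 ratio is below one, meaning that for very short FFTs that card is slower than the CPU.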
The GPU coherent de-dispersion method provided by the invention is generally used to remove the dispersion that incoherent de-dispersion cannot eliminate. If coherent de-dispersion is run on data from multiple channels, the GPU algorithm achieves a better speed-up as the number of channels increases. The result of coherent de-dispersion for the pulsar PSR B1937+21 is shown in FIG. 6. In theory, coherent de-dispersion recovers the true profile of the pulsar and thereby improves the signal-to-noise ratio of the pulsar signal.
The above embodiments further illustrate the objects, technical solutions and advantages of the present invention in detail. It should be understood that they are only exemplary embodiments and are not intended to limit the present invention; any modifications, equivalent substitutions, improvements and the like made within the spirit and principles of the present invention shall fall within its scope of protection.

Claims (10)

1. A coherent de-dispersion method comprising the following steps:
step 1: reading pulsar baseband data, and initializing the variables and parameters of the host-side CPU (central processing unit) and the device-side GPU (graphics processing unit);
step 2: the CPU and the GPU process data relatively independently and exchange data by passing pointers;
step 3: copying the data from CPU memory to GPU video memory;
step 4: setting up and initializing the FFT plan, using cufftPlan1d(&plan, fftsize, CUFFT_C2C, BATCH) to configure a 1D complex-to-complex FFT;
step 5: computing the FFT;
step 6: the CPU launches a GPU kernel function and distributes the work across GPU threads; the kernel multiplies the pulsar signal by the interstellar-medium chirp function in the frequency domain, so that the coherent de-dispersion algorithm is executed in parallel by the threads of the device-side GPU;
step 7: setting up a 1D inverse FFT (IFFT) plan and computing the inverse fast Fourier transform, which converts the result of the GPU kernel into a time-domain signal;
step 8: copying the result to the CPU and removing the overlapped part of the data;
step 9: writing the output file and, once all the data have been de-dispersed, releasing the memory resources allocated on the GPU device.
2. The method according to claim 1, wherein the pulsar baseband data in step 1 are in PSRDADA format, and the file comprises header information and a data part;
the variables and parameters of the host-side CPU and the device-side GPU include the observing frequency, the bandwidth and the DM (dispersion measure) value.
3. The method according to claim 1, wherein in step 2 the host-side CPU uses the cudaMalloc function to allocate GPU memory.
4. The method according to claim 1, wherein in step 3 data are transferred between GPU video memory and CPU memory through the memory management functions of C and the CUDA API.
5. The method according to claim 1, wherein in the FFT operation of step 5 each FFT takes N complex samples, of which N_DM samples overlap with the neighbouring FFT; the FFT is implemented with the cuFFT library of the CUDA parallel architecture, and the cuFFT library functions are global functions that are valid throughout the program and can be called only from the host.
6. The method according to claim 5, wherein the FFT operation cannot be executed entirely on the GPU, and after the cuFFT function call completes, control returns to the host.
7. The method according to claim 1, wherein in step 6 the inverse transfer function of the interstellar medium is computed on the GPU, the difference from the CPU algorithm being that the complex multiplications of the FFT output by H^-1(f) are computed independently and in parallel on the GPU, which saves time and reduces memory-access latency.
8. The method according to claim 1, wherein in step 7 the cufftExecC2C() function of cuFFT is used to compute the inverse fast Fourier transform in parallel at high speed, converting the result of the GPU kernel into a time-domain signal.
9. A coherent de-dispersion system employing the method according to any one of claims 1 to 8, comprising a CPU and a GPU, with a software development environment comprising CUDA and a Linux operating system.
10. The coherent de-dispersion system according to claim 9, wherein the CPU is an Intel Xeon E5-1620, the GPU is an NVIDIA GPU, the CUDA version is CUDA 10.0, and the Linux operating system is Ubuntu 18.04.
CN202010130257.4A 2020-02-28 2020-02-28 Pulsar coherent de-dispersion system and method Pending CN111368252A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010130257.4A CN111368252A (en) 2020-02-28 2020-02-28 Pulsar coherent de-dispersion system and method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010130257.4A CN111368252A (en) 2020-02-28 2020-02-28 Pulsar coherent de-dispersion system and method

Publications (1)

Publication Number Publication Date
CN111368252A true CN111368252A (en) 2020-07-03

Family

ID=71208356

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010130257.4A Pending CN111368252A (en) 2020-02-28 2020-02-28 Pulsar coherent de-dispersion system and method

Country Status (1)

Country Link
CN (1) CN111368252A (en)

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106289239A (en) * 2016-08-15 2017-01-04 Xinjiang Astronomical Observatory of CAS Method for eliminating wideband time-domain interference in pulsar time-of-arrival data

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
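刘东亮; Paul Demorest; 南仁东: "Implementation and testing of a CUDA-based coherent de-dispersion algorithm" *
张海龙: "A review of pulsar digital backend technology" *
托乎提努尔 et al.: "A high-speed median filtering algorithm based on graphics processing units" *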

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113407979A (en) * 2021-08-16 2021-09-17 深圳致星科技有限公司 Heterogeneous acceleration method, device and system for longitudinal federated logistic regression learning
CN113407979B (en) * 2021-08-16 2021-11-26 深圳致星科技有限公司 Heterogeneous acceleration method, device and system for longitudinal federated logistic regression learning
CN114331805A (en) * 2021-12-27 2022-04-12 中国科学院苏州生物医学工程技术研究所 OCT imaging method and system based on GPU
CN114331805B (en) * 2021-12-27 2023-04-14 中国科学院苏州生物医学工程技术研究所 OCT imaging method and system based on GPU

Legal Events

Code Title
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20200703