-
Intensity-sensitive quality assessment of extended sources in astronomical images
Authors:
X. Li,
K. Adamek,
W. Armour
Abstract:
Radio astronomy studies the Universe by observing the radio emissions of celestial bodies. Different methods can be used to recover the sky brightness distribution (SBD), which describes the distribution of celestial sources from recorded data, with the output dependent on the method used. Image quality assessment (IQA) indexes can be used to compare the differences between restored SBDs produced by different image reconstruction techniques to evaluate the effectiveness of different techniques. However, reconstructed images (for the same SBD) can appear to be very similar, especially when observed by the human visual system (HVS). Hence, current structural similarity methods, inspired by the HVS, are not effective. In the past, we have proposed two methods to assess point source images, where low amounts of concentrated information are present in larger regions of noise-like data. But for images that include extended source(s), the increase in complexity of the structure makes the IQA methods for point sources over-sensitive since the important objects cannot be described by isolated point sources. Therefore, in this article we propose the augmented Low-Information Similarity Index (augLISI), an improved version of LISI, to assess images including extended source(s). Experiments have been carried out to illustrate how this new IQA method can help with the development and study of astronomical imaging techniques. Note that although we focus on radio astronomical images herein, these IQA methods are also applicable to other astronomical images and imaging techniques.
Submitted 10 July, 2024;
originally announced July 2024.
-
Pulscan: Binary pulsar detection using unmatched filters on NVIDIA GPUs
Authors:
Jack White,
Karel Adámek,
Jayanta Roy,
Scott Ransom,
Wesley Armour
Abstract:
The Fourier Domain Acceleration Search (FDAS) and Fourier Domain Jerk Search (FDJS) are proven matched filtering techniques for detecting binary pulsar signatures in time-domain radio astronomy datasets. Next generation radio telescopes such as the SPOTLIGHT project at the GMRT produce data at rates that mandate real-time processing, as storage of the entire captured dataset for subsequent offline processing is infeasible. The computational demands of FDAS and FDJS make them challenging to implement in real-time detection pipelines, requiring costly high performance computing facilities. To address this we propose Pulscan, an unmatched filtering approach which achieves order-of-magnitude improvements in runtime performance compared to FDAS whilst being able to detect both accelerated and some jerked binary pulsars. We profile the sensitivity of Pulscan using a distribution (N = 10,955) of synthetic binary pulsars and compare its performance with FDAS and FDJS. Our implementation of Pulscan includes an OpenMP version for multicore CPU acceleration, a version for heterogeneous CPU/GPU environments such as NVIDIA Grace Hopper, and a fully optimized NVIDIA GPU implementation for integration into an AstroAccelerate pipeline, which will be deployed in the SPOTLIGHT project at the GMRT. Our results demonstrate that unmatched filtering in Pulscan can serve as an efficient data reduction step, prioritizing datasets for further analysis and focusing human and subsequent computational resources on likely binary pulsar signatures.
Submitted 21 June, 2024;
originally announced June 2024.
-
CLEAN algorithm implementation comparisons between popular software packages
Authors:
Daniel Wright,
Karel Adámek,
Wesley Armour
Abstract:
The CLEAN algorithm, first published by Högbom, and its later variants such as Multiscale CLEAN (msCLEAN) by Cornwell, have been the most popular tools for deconvolution in radio astronomy. Interferometric imaging used in aperture synthesis radio telescopes requires deconvolution for removal of the telescope's point spread function from the observed images. We have compared source fluxes produced by different implementations of Högbom and msCLEAN (WSCLEAN, CASA) with a prototype implementation of Högbom and msCLEAN for the Square Kilometer Array (SKA) on two datasets. The first is a simulation of multiple point sources of known intensity using Högbom, where none of the software packages detected all the simulated point sources to within 1.0% of the simulated values. The second is of supernova remnant G055.7+3.4 taken by the Karl G. Jansky Very Large Array (VLA) using msCLEAN, where each of the software packages produced different images for the same settings.
Submitted 5 March, 2024;
originally announced March 2024.
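As a rough illustration of the Högbom CLEAN loop referred to in the entry above, the following is a minimal one-dimensional sketch in pure Python. It is not the code of WSCLEAN, CASA, or the SKA prototype; the function names `convolve_same` and `hogbom_clean_1d`, the loop gain, and the stopping threshold are all assumptions chosen for illustration, and real imagers work on two-dimensional images with far more elaborate stopping criteria.

```python
def convolve_same(sky, psf):
    """Convolve sky with an odd-length psf, keeping the output the same length as sky."""
    half = len(psf) // 2
    out = [0.0] * len(sky)
    for i, s in enumerate(sky):
        if s == 0.0:
            continue
        for k, p in enumerate(psf):
            j = i + k - half
            if 0 <= j < len(out):
                out[j] += s * p
    return out


def hogbom_clean_1d(dirty, psf, gain=0.1, threshold=1e-3, max_iter=10000):
    """Illustrative 1D Högbom CLEAN.

    psf is assumed to have odd length with its peak (value 1.0) at the centre.
    Returns (model, residual): the list of CLEAN components and what is left over.
    """
    residual = list(dirty)
    model = [0.0] * len(dirty)
    half = len(psf) // 2
    for _ in range(max_iter):
        # Find the brightest residual pixel by absolute value.
        peak = max(range(len(residual)), key=lambda i: abs(residual[i]))
        if abs(residual[peak]) < threshold:
            break
        # Move a fraction (the loop gain) of the peak into the model.
        flux = gain * residual[peak]
        model[peak] += flux
        # Subtract the PSF, scaled by that flux and centred on the peak.
        for k, p in enumerate(psf):
            j = peak + k - half
            if 0 <= j < len(residual):
                residual[j] -= flux * p
    return model, residual
```

With two well-separated point sources convolved with a small synthetic PSF, the accumulated model converges to the injected fluxes, which is the kind of flux-recovery check the entry above describes at much larger scale.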
-
Toward using GANs in astrophysical Monte-Carlo simulations
Authors:
Ahab Isaac,
Wesley Armour,
Karel Adámek
Abstract:
Accurate modelling of spectra produced by X-ray sources requires the use of Monte-Carlo simulations. These simulations need to evaluate physical processes, such as those occurring in accretion processes around compact objects by sampling a number of different probability distributions. This is computationally time-consuming and could be sped up if replaced by neural networks. We demonstrate, on an example of the Maxwell-Jüttner distribution that describes the speed of relativistic electrons, that the generative adversarial network (GAN) is capable of statistically replicating the distribution. The average value of the Kolmogorov-Smirnov test is 0.5 for samples generated by the neural network, showing that the generated distribution cannot be distinguished from the true distribution.
Submitted 16 February, 2024;
originally announced February 2024.
-
Accelerating Dedispersion using Many-Core Architectures
Authors:
Jan Novotný,
Karel Adámek,
M. A. Clark,
Mike Giles,
Wesley Armour
Abstract:
Astrophysical radio signals are excellent probes of the extreme physical processes that emit them. However, to reach Earth, electromagnetic radiation passes through the ionised interstellar medium (ISM), introducing a frequency-dependent time delay (dispersion) to the emitted signal. Removing dispersion enables searches for transient signals like Fast Radio Bursts (FRB) or repeating signals from isolated pulsars or those in orbit around other compact objects. The sheer volume and high resolution of data that next generation radio telescopes will produce require High-Performance Computing (HPC) solutions and algorithms to be used in time-domain data processing pipelines to extract scientifically valuable results in real-time. This paper presents a state-of-the-art implementation of brute force incoherent dedispersion on NVIDIA GPUs, and on Intel and AMD CPUs. We show that our implementation is 4x faster (8-bit 8192 channels input) than other available solutions and demonstrate, using 11 existing telescopes, that our implementation is at least 20x faster than real-time. This work is part of the AstroAccelerate package.
Submitted 9 November, 2023;
originally announced November 2023.
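The brute force incoherent dedispersion named in the entry above can be sketched in a few lines of pure Python. This is only an illustration of the technique, not the AstroAccelerate implementation: the function names are hypothetical, the data layout `data[channel][sample]` is an assumption, and the delay uses the standard cold-plasma dispersion relation with the commonly quoted constant 4.148808e3 s MHz^2 pc^-1 cm^3.

```python
K_DM = 4.148808e3  # dispersion constant in s MHz^2 pc^-1 cm^3


def dispersion_delay(dm, freq_mhz, ref_freq_mhz):
    """Delay in seconds at freq_mhz relative to ref_freq_mhz for a DM in pc cm^-3."""
    return K_DM * dm * (freq_mhz ** -2 - ref_freq_mhz ** -2)


def dedisperse(data, freqs_mhz, tsamp, dm):
    """Shift every channel by its dispersion delay and sum across channels.

    data[channel][sample] is the (assumed) filterbank layout; tsamp is in seconds.
    The highest frequency is used as the zero-delay reference.
    """
    ref = max(freqs_mhz)
    shifts = [round(dispersion_delay(dm, f, ref) / tsamp) for f in freqs_mhz]
    n_out = len(data[0]) - max(shifts)
    return [sum(chan[t + s] for chan, s in zip(data, shifts)) for t in range(n_out)]


def search_dm(data, freqs_mhz, tsamp, dm_trials):
    """Brute force: dedisperse at every trial DM and keep the one with the highest peak."""
    best = None
    for dm in dm_trials:
        series = dedisperse(data, freqs_mhz, tsamp, dm)
        peak = max(series)
        if best is None or peak > best[1]:
            best = (dm, peak)
    return best
```

The cost grows with the number of channels, samples, and DM trials, which is why the entry above targets GPUs and many-core CPUs for real-time use.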
-
Cutting the cost of pulsar astronomy: Saving time and energy when searching for binary pulsars using NVIDIA GPUs
Authors:
Jack White,
Karel Adamek,
Wes Armour
Abstract:
Using the Fourier Domain Acceleration Search (FDAS) method to search for binary pulsars is a computationally costly process. Next generation radio telescopes will have to perform FDAS in real time, as data volumes are too large to store. FDAS is a matched filtering approach for searching time-domain radio astronomy datasets for the signatures of binary pulsars with approximately linear acceleration. In this paper we explore how we have reduced the energy cost of an SKA-like implementation of FDAS in AstroAccelerate, utilising a combination of mixed-precision computing and dynamic frequency scaling on NVIDIA GPUs. Combining the two approaches, we have managed to save 58% of the overall energy cost of FDAS with a sacrifice of less than 3% in numerical sensitivity.
Submitted 24 November, 2022;
originally announced November 2022.
-
Bits missing: Finding exotic pulsars using bfloat16 on NVIDIA GPUs
Authors:
Jack White,
Karel Adamek,
Jayanta Roy,
Sofia Dimoudi,
Scott M. Ransom,
Wesley Armour
Abstract:
The Fourier Domain Acceleration Search (FDAS) is an effective technique for detecting faint binary pulsars in large radio astronomy datasets. This paper quantifies the sensitivity impact of reducing numerical precision in the GPU accelerated FDAS pipeline of the AstroAccelerate software package. The prior implementation used IEEE-754 single-precision in the entire binary pulsar detection pipeline, spending a large fraction of the runtime computing GPU accelerated FFTs. AstroAccelerate has been modified to use bfloat16 (and IEEE-754 double-precision to provide a "gold standard" comparison) within the Fourier domain convolution section of the FDAS routine. Approximately 20,000 synthetic pulsar filterbank files representing binary pulsars were generated using SIGPROC with a range of physical parameters. They have been processed using bfloat16, single and double-precision convolutions. All bfloat16 peaks are within 3% of the predicted signal-to-noise ratio of their corresponding single-precision peaks. Of 14,971 "bright" single-precision fundamental peaks above a power of 44.982 (our experimentally measured highest noise value), 14,602 (97.53%) have a peak in the same acceleration and frequency bin in the bfloat16 output plane, whilst in the remaining 369 the nearest peak is located in the adjacent acceleration bin. There is no bin drift measured between the single and double-precision results. The bfloat16 version of FDAS achieves a speedup of approximately 1.6x compared to single-precision. A comparison between AstroAccelerate and the PRESTO software package is presented using observations collected with the GMRT of PSR J1544+4937, a 2.16 ms black widow pulsar in a 2.8 hour compact orbit.
Submitted 24 June, 2022;
originally announced June 2022.
-
A Novel Greedy Approach To Harmonic Summing Using GPUs
Authors:
Karel Adamek,
Jayanta Roy,
Wesley Armour
Abstract:
Incoherent harmonic summing is a technique which is used to improve the sensitivity of Fourier domain search methods. A one dimensional harmonic sum is used in time-domain radio astronomy as part of the Fourier domain periodicity search, a type of search used to detect isolated single pulsars. The main problem faced when implementing the harmonic sum on many-core architectures, like GPUs, is the very unfavourable memory access pattern of the harmonic sum algorithm. The memory access pattern gets worse as the dimensionality of the harmonic sum increases. Here we present a set of algorithms for calculating the harmonic sum that are suited to many-core architectures such as GPUs. We present an evaluation of the sensitivity of these different approaches, and their performance. This work forms part of the AstroAccelerate project which is a GPU accelerated software package for processing time-domain radio astronomy data.
Submitted 25 February, 2022;
originally announced February 2022.
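For orientation, the one-dimensional incoherent harmonic sum discussed in the entry above can be written as a short pure-Python sketch. This is the plain textbook form, not one of the GPU algorithms the paper proposes; the function name `harmonic_sum` and the choice to stop at the end of the spectrum are assumptions made for illustration.

```python
def harmonic_sum(power, n_harmonics):
    """Sum the power at each fundamental bin i and its harmonics 2i, 3i, ...

    out[i] = power[i] + power[2*i] + ... for up to n_harmonics terms,
    dropping harmonics that fall beyond the end of the spectrum.
    Bin 0 (DC) simply accumulates itself and is normally ignored.
    """
    n = len(power)
    out = []
    for i in range(n):
        total = 0.0
        for k in range(1, n_harmonics + 1):
            j = k * i  # strided read: the stride grows with the fundamental bin
            if j >= n:
                break
            total += power[j]
        out.append(total)
    return out
```

The strided reads `power[k * i]` are the unfavourable memory access pattern the entry refers to: neighbouring fundamentals touch harmonics that lie progressively further apart in memory.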
-
Implementation of 3D degridding algorithm on the NVIDIA GPUs using CUDA
Authors:
Karel Adámek,
Peter Wortmann,
Bojan Nikolic,
Ben Mort,
Wesley Armour
Abstract:
Practical aperture synthesis imaging algorithms work by iterating between estimating the sky brightness distribution and a comparison of a prediction based on this estimate with the measured data ("visibilities"). Accuracy in the latter step is crucial but is made difficult by irregular and non-planar sampling of data by the telescope. In this work we present a GPU implementation of 3D degridding which accurately deals with these two difficulties and is designed for distributed operation. We address the load balancing issues caused by large variation in visibilities that need to be computed. Using CUDA and NVIDIA GPUs we measure performance of up to 1.2 billion visibilities per second.
Submitted 25 February, 2021;
originally announced February 2021.
-
Implementing CUDA Streams into AstroAccelerate -- A Case Study
Authors:
Jan Novotný,
Karel Adámek,
Wes Armour
Abstract:
To be able to run tasks asynchronously on NVIDIA GPUs a programmer must explicitly implement asynchronous execution in their code using the syntax of CUDA streams. Streams allow a programmer to launch independent concurrent execution tasks, providing the ability to utilise different functional units on the GPU asynchronously. For example, it is possible to transfer the results from a previous computation performed on input data n-1, over the PCIe bus whilst computing the result for input data n, by placing different tasks in different CUDA streams. The benefit of such an approach is that the time taken for the data transfer between the host and device can be hidden with computation. This case study deals with the implementation of CUDA streams into AstroAccelerate. AstroAccelerate is a GPU accelerated real-time signal processing pipeline for time-domain radio astronomy.
Submitted 6 May, 2021; v1 submitted 4 January, 2021;
originally announced January 2021.
-
Initial results from a realtime FRB search with the GBT
Authors:
Devansh Agarwal,
D. R. Lorimer,
M. P. Surnis,
X. Pei,
A. Karastergiou,
G. Golpayegani,
D. Werthimer,
J. Cobb,
M. A. McLaughlin,
S. White,
W. Armour,
D. H. E. MacMahon,
A. P. V. Siemion,
G. Foster
Abstract:
We present the data analysis pipeline, commissioning observations and initial results from the GREENBURST fast radio burst (FRB) detection system on the Robert C. Byrd Green Bank Telescope (GBT) previously described by Surnis et al. which uses the 21 cm receiver observing commensally with other projects. The pipeline makes use of a state-of-the-art deep learning classifier to winnow down the very large number of false positive single-pulse candidates that mostly result from radio frequency interference. In our observations totalling 156.5 days so far, we have detected individual pulses from 20 known radio pulsars which provide an excellent verification of the system performance. We also demonstrate, through blind injection analyses, that our pipeline is complete down to a signal-to-noise threshold of 12. Depending on the observing mode, this translates to peak flux sensitivities in the range 0.14-0.89 Jy. Although no FRBs have been detected to date, we have used our results to update the analysis of Lawrence et al. to constrain the FRB all-sky rate to be $1140^{+200}_{-180}$ per day above a peak flux density of 1 Jy. We also constrain the source count index $α=0.83\pm0.06$ which indicates that the source count distribution is substantially flatter than expected from a Euclidean distribution of standard candles (where $α=1.5$). We discuss this result in the context of the FRB redshift and luminosity distributions. Finally, we make predictions for detection rates with GREENBURST, as well as other ongoing and planned FRB experiments.
Submitted 31 March, 2020;
originally announced March 2020.
-
Development of production-ready GPU data processing pipeline software for AstroAccelerate
Authors:
Cees Carels,
Karel Adámek,
Jan Novotný,
Wesley Armour
Abstract:
Upcoming large scale telescope projects such as the Square Kilometre Array (SKA) will see high data rates and large data volumes, requiring tools that can analyse telescope event data quickly and accurately. In modern radio telescopes, analysis software forms a core part of the data read out, and long-term software stability and maintainability are essential. AstroAccelerate is a many core accelerated software package that uses NVIDIA(R) GPUs to perform realtime analysis of radio telescope data, and it has been shown to be substantially faster than realtime at processing simulated SKA-like data. AstroAccelerate contains optimised GPU implementations of signal processing tools used in radio astronomy including dedispersion, Fourier domain acceleration search, single pulse detection, and others. This article describes the transformation of AstroAccelerate from a C-like prototype code to a production-ready software library with a C++ API and a Python interface, while preserving compatibility with legacy software that is implemented in C. The design of the software library interfaces, refactoring aspects, and coding techniques are discussed.
Submitted 16 January, 2020; v1 submitted 16 December, 2019;
originally announced December 2019.
-
Searching for pulsars in extreme orbits -- GPU acceleration of the Fourier domain 'jerk' search
Authors:
Karel Adámek,
Jan Novotný,
Sofia Dimoudi,
Wesley Armour
Abstract:
Binary pulsars are an important target for radio surveys because they present a natural laboratory for a wide range of astrophysics, for example testing general relativity, including detection of gravitational waves. The orbital motion of a pulsar which is locked in a binary system causes a frequency shift (a Doppler shift) in its normally very periodic pulse emissions. These shifts cause a reduction in the sensitivity of traditional periodicity searches. To correct this smearing, Ransom [2001] and Ransom et al. [2002] developed the Fourier domain acceleration search (FDAS), which uses a matched filtering technique. This method is, however, limited to a constant pulsar acceleration. Therefore, Andersen and Ransom [2018] broadened the Fourier domain acceleration search to account also for a linear change in the acceleration by implementing the Fourier domain "jerk" search into the PRESTO software package. This extension increases the number of matched filters used significantly. We have implemented the Fourier domain "jerk" search (JERK) on GPUs using CUDA. We have achieved a 90x performance increase when compared to the parallel implementation of JERK in PRESTO. This work is part of the AstroAccelerate project (Armour et al. [2019]), a many-core accelerated time-domain signal processing library for radio astronomy.
Submitted 4 November, 2019;
originally announced November 2019.
-
Single Pulse Detection Algorithms for Real-time Fast Radio Burst Searches using GPUs
Authors:
Karel Adamek,
Wesley Armour
Abstract:
The detection of non-repeating or irregular events in time-domain radio astronomy has gained importance over the last decade due to the discovery of fast radio bursts. Existing or upcoming radio telescopes are gathering more and more data and consequently the software, which is an important part of these telescopes, must process large data volumes at high data rates. Data has to be searched through to detect new and interesting events, often in real-time. These requirements necessitate new and fast algorithms which must process data quickly and accurately. In this work we present new algorithms for single pulse detection using boxcar filters. We have quantified the signal loss introduced by single pulse detection algorithms which use boxcar filters and based on these results, we have designed two distinct "lossy" algorithms. Our lossy algorithms use an incomplete set of boxcar filters to accelerate detection at the expense of a small reduction in detected signal power. We present formulae for signal loss, descriptions of our algorithms and their parallel implementation on NVIDIA GPUs using CUDA. We also present tests of correctness, tests on artificial data and the performance achieved. Our implementation can process SKA-MID-like data 266x faster than real-time on an NVIDIA P100 GPU and 500x faster than real-time on an NVIDIA Titan V GPU with a mean signal power loss of 7%. We conclude with prospects for single pulse detection for beyond SKA era, nanosecond time resolution radio astronomy.
Submitted 27 April, 2020; v1 submitted 18 October, 2019;
originally announced October 2019.
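The idea of searching with boxcar filters, and of trading sensitivity for speed by using an incomplete set of widths, can be illustrated with a small pure-Python sketch. It is not either of the lossy algorithms from the entry above: the function name `boxcar_search`, the prefix-sum formulation, and the assumption of zero-mean white noise with known standard deviation `sigma` are all illustrative choices.

```python
import math


def boxcar_search(series, widths, sigma=1.0):
    """Return (best_snr, start, width) for the strongest boxcar response.

    Each boxcar of width w sums w consecutive samples; under the assumed
    zero-mean white noise of standard deviation sigma, the sum has standard
    deviation sigma * sqrt(w), which is used to normalise it to an S/N.
    """
    # Prefix sums make every boxcar sum a single subtraction.
    prefix = [0.0]
    for x in series:
        prefix.append(prefix[-1] + x)
    best = (float("-inf"), None, None)
    for w in widths:
        norm = sigma * math.sqrt(w)
        for start in range(len(series) - w + 1):
            snr = (prefix[start + w] - prefix[start]) / norm
            if snr > best[0]:
                best = (snr, start, w)
    return best
```

For a rectangular pulse eight samples wide, a set of widths that includes 8 recovers the full S/N, while a sparser set such as 1, 4, 16 recovers only about 71% of it; the sparser set performs fewer filter evaluations, which is the kind of speed-for-sensitivity trade the entry describes.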
-
GREENBURST: a commensal fast radio burst search back-end for the Green Bank Telescope
Authors:
Mayuresh P. Surnis,
Devansh Agarwal,
Duncan R. Lorimer,
Xin Pei,
Griffin Foster,
Aris Karastergiou,
Golnoosh Golpayegani,
Ronald J. Maddalena,
Steve White,
Wes Armour,
Jeff Cobb,
Maura A. McLaughlin,
David H. E. MacMahon,
Andrew P. V. Siemion,
Dan Werthimer,
Chris J. Williams
Abstract:
We describe the design and deployment of GREENBURST, a commensal Fast Radio Burst (FRB) search system at the Green Bank Telescope. GREENBURST uses the dedicated L-band receiver tap to search over the 960-1920 MHz frequency range for pulses with dispersion measures out to $10^4$ pc cm$^{-3}$. Due to its unique design, GREENBURST will obtain data even when the L-band receiver is not being used for scheduled observing. This makes it a sensitive single pixel detector capable of reaching deeper in the radio sky. While single pulses from Galactic pulsars and rotating radio transients will be detectable in our observations, and will form part of the database we archive, the primary goal is to detect and study FRBs. Based on recent determinations of the all-sky rate, we predict that the system will detect approximately one FRB for every 2-3 months of continuous operation. The high sensitivity of GREENBURST means that it will also be able to probe the slope of the FRB source function, which is currently uncertain in this observing band.
Submitted 13 March, 2019;
originally announced March 2019.
-
A GPU implementation of the harmonic sum algorithm
Authors:
Karel Adámek,
Wesley Armour
Abstract:
Time-domain radio astronomy utilizes a harmonic sum algorithm as part of the Fourier domain periodicity search; this type of search is used to discover single pulsars. The harmonic sum algorithm is also used as part of the Fourier domain acceleration search, which aims to discover pulsars that are locked in orbit around another pulsar or compact object. However, porting the harmonic sum to many-core architectures like GPUs is not a straightforward task. The main problem that must be overcome is the very unfavourable memory access pattern, which gets worse as the dimensionality of the harmonic sum increases. We present a set of algorithms for calculating the harmonic sum that are more suited to many-core architectures such as GPUs. We present an evaluation of the sensitivity of these different approaches, and their performance. This work forms part of the AstroAccelerate project which is a GPU accelerated software package for processing time-domain radio astronomy data.
Submitted 6 December, 2018;
originally announced December 2018.
-
Science Pipelines for the Square Kilometre Array
Authors:
Jamie Farnes,
Ben Mort,
Fred Dulwich,
Stef Salvini,
Wes Armour
Abstract:
The Square Kilometre Array (SKA) will be both the largest radio telescope ever constructed and the largest Big Data project in the known Universe. The first phase of the project will generate on the order of 5 zettabytes of data per year. A critical task for the SKA will be its ability to process data for science, which will need to be conducted by science pipelines. Together with polarization data from the LOFAR Multifrequency Snapshot Sky Survey (MSSS), we have been developing a realistic SKA-like science pipeline that can handle the large data volumes generated by LOFAR at 150 MHz. The pipeline uses task-based parallelism to image, detect sources, and perform Faraday Tomography across the entire LOFAR sky. The project thereby provides a unique opportunity to contribute to the technological development of the SKA telescope, while simultaneously enabling cutting-edge scientific results. In this paper, we provide an update on current efforts to develop a science pipeline that can enable tight constraints on the magnetised large-scale structure of the Universe.
Submitted 20 November, 2018;
originally announced November 2018.
-
A GPU implementation of the Correlation Technique for Real-time Fourier Domain Pulsar Acceleration Searches
Authors:
Sofia Dimoudi,
Karel Adamek,
Prabu Thiagaraj,
Scott M. Ransom,
Aris Karastergiou,
Wesley Armour
Abstract:
The study of binary pulsars enables tests of general relativity. Orbital motion in binary systems causes the apparent pulsar spin frequency to drift, reducing the sensitivity of periodicity searches. Acceleration searches are methods that account for the effect of orbital acceleration. Existing methods are currently computationally expensive, and the vast amount of data that will be produced by next generation instruments such as the Square Kilometre Array (SKA) necessitates real-time acceleration searches, which in turn requires the use of High Performance Computing (HPC) platforms. We present our implementation of the Correlation Technique for the Fourier Domain Acceleration Search (FDAS) algorithm on Graphics Processor Units (GPUs). The correlation technique is applied as a convolution with multiple Finite Impulse Response filters in the Fourier domain. Two approaches are compared: the first uses the NVIDIA cuFFT library for applying Fast Fourier Transforms (FFTs) on the GPU, and the second contains a custom FFT implementation in GPU shared memory. We find that the FFT shared memory implementation performs between 1.5 and 3.2 times faster than our cuFFT-based application for smaller but sufficient filter sizes. It is also 4 to 6 times faster than the existing GPU and OpenMP implementations of FDAS. This work is part of the AstroAccelerate project, a many-core accelerated time-domain signal processing library for radio astronomy.
Submitted 15 April, 2018;
originally announced April 2018.
-
Pulsar Searches with the SKA
Authors:
L. Levin,
W. Armour,
C. Baffa,
E. Barr,
S. Cooper,
R. Eatough,
A. Ensor,
E. Giani,
A. Karastergiou,
R. Karuppusamy,
M. Keith,
M. Kramer,
R. Lyon,
M. Mackintosh,
M. Mickaliger,
R van Nieuwpoort,
M. Pearson,
T. Prabu,
J. Roy,
O. Sinnen,
L. Spitler,
H. Spreeuw,
B. W. Stappers,
W. van Straten,
C. Williams
, et al. (2 additional authors not shown)
Abstract:
The Square Kilometre Array will be an amazing instrument for pulsar astronomy. While the full SKA will be sensitive enough to detect all pulsars in the Galaxy visible from Earth, already with SKA1, pulsar searches will discover enough pulsars to increase the currently known population by a factor of four, no doubt including a range of amazing unknown sources. Real time processing is needed to deal with the 60 PB of pulsar search data collected per day, using a signal processing pipeline required to perform more than 10 POps. Here we present the suggested design of the pulsar search engine for the SKA and discuss challenges and solutions to the pulsar search venture.
Submitted 4 December, 2017;
originally announced December 2017.
-
Improved Acceleration of the GPU Fourier Domain Acceleration Search Algorithm
Authors:
Karel Adámek,
Sofia Dimoudi,
Mike Giles,
Wesley Armour
Abstract:
We present an improvement of our implementation of the Correlation Technique for the Fourier Domain Acceleration Search (FDAS) algorithm on Graphics Processor Units (GPUs) (Dimoudi & Armour 2015; Dimoudi et al. 2017). Our new, improved convolution code, which uses our custom GPU FFT code, is between 2.5 and 3.9 times faster than our cuFFT-based implementation (on an NVIDIA P100) and allows for a wider range of filter sizes than our previous version. By using this new version of our convolution code in FDAS we have achieved a 44% performance increase over our previous best implementation. It is also approximately 8 times faster than the existing PRESTO GPU implementation of FDAS (Luo 2013). This work is part of the AstroAccelerate project (Armour et al. 2002), a many-core accelerated time-domain signal processing library for radio astronomy.
Submitted 29 November, 2017;
originally announced November 2017.
-
ALFABURST: A commensal search for Fast Radio Bursts with Arecibo
Authors:
Griffin Foster,
Aris Karastergiou,
Golnoosh Golpayegani,
Mayuresh Surnis,
Duncan R. Lorimer,
Jayanth Chennamangalam,
Maura McLaughlin,
Wes Armour,
Jeff Cobb,
David H. E. MacMahon,
Xin Pei,
Kaustubh Rajwade,
Andrew P. V. Siemion,
Dan Werthimer,
Chris J. Williams
Abstract:
ALFABURST has been searching for Fast Radio Bursts (FRBs) commensally with other projects using the Arecibo L-band Feed Array (ALFA) receiver at the Arecibo Observatory since July 2015. We describe the observing system and report on the non-detection of any FRBs from that time until August 2017 for a total observing time of 518 hours. With current FRB rate models, along with measurements of telescope sensitivity and beam size, we estimate that this survey probed redshifts out to about 3.4 with an effective survey volume of around 600,000 Mpc$^3$. Based on this, we would expect, at the 99% confidence level, to see at most two FRBs. We discuss the implications of this non-detection in the context of results from other telescopes and the limitation of our search pipeline. During the survey, single pulses from 17 known pulsars were detected. We also report the discovery of a Galactic radio transient with a pulse width of 3 ms and dispersion measure of 281 pc cm$^{-3}$, which was detected while the telescope was slewing between fields.
Submitted 22 November, 2017; v1 submitted 30 October, 2017;
originally announced October 2017.
-
Initial Results from the ALFABURST Survey
Authors:
Mayuresh Surnis,
Griffin Foster,
Golnoosh Golpayegani,
Aris Karastergiou,
Duncan Lorimer,
Jayanth Chennamangalam,
Kaustubh Rajwade,
Maura McLaughlin,
Devansh Agarwal,
Wes Armour,
Dan Werthimer,
Jeff Cobb,
Andrew Siemion,
David MacMahon,
Deepthi Gorthi,
Xin Pei
Abstract:
Here, we present initial results from the ALFABURST radio transient survey, which is currently running in a commensal mode with the ALFA receiver at the Arecibo telescope. We observed for a total of 1400 hours and have detected single pulses from known pulsars but did not detect any FRBs. The non-detection of FRBs is consistent with the current FRB sky rates.
Submitted 24 October, 2017;
originally announced October 2017.
-
SETIBURST: A Robotic, Commensal, Realtime Multi-Science Backend for the Arecibo Telescope
Authors:
Jayanth Chennamangalam,
David MacMahon,
Jeff Cobb,
Aris Karastergiou,
Andrew P. V. Siemion,
Kaustubh Rajwade,
Wes Armour,
Vishal Gajjar,
Duncan R. Lorimer,
Maura A. McLaughlin,
Dan Werthimer,
Christopher Williams
Abstract:
Radio astronomy has traditionally depended on observatories allocating time to observers for exclusive use of their telescopes. The disadvantage of this scheme is that the data thus collected is rarely used for other astronomy applications, and in many cases, is unsuitable. For example, properly calibrated pulsar search data can, with some reduction, be used for spectral line surveys. A backend that supports plugging in multiple applications to a telescope to perform commensal data analysis will vastly increase the science throughput of the facility. In this paper, we present 'SETIBURST', a robotic, commensal, realtime multi-science backend for the 305-m Arecibo Telescope. The system uses the 1.4 GHz, seven-beam Arecibo L-band Feed Array (ALFA) receiver whenever it is operated. SETIBURST currently supports two applications: SERENDIP VI, a SETI spectrometer that is conducting a search for signs of technological life, and ALFABURST, a fast transient search system that is conducting a survey of fast radio bursts (FRBs). Based on the FRB event rate and the expected usage of ALFA, we expect 0-5 FRB detections over the coming year. SETIBURST also provides the option of plugging in more applications. We outline the motivation for our instrumentation scheme and the scientific motivation of the two surveys, along with their descriptions and related discussions.
Submitted 17 January, 2017;
originally announced January 2017.
-
A Real-time Single Pulse Detection Algorithm for GPUs
Authors:
Karel Adámek,
Wesley Armour
Abstract:
The detection of non-repeating events in the radio spectrum has become an important area of study in radio astronomy over the last decade due to the discovery of fast radio bursts (FRBs). We have implemented a single pulse detection algorithm for NVIDIA GPUs, which uses boxcar filters of varying widths. Our code performs the calculation of standard deviation, matched filtering by using boxcar filters, and thresholding based on the signal-to-noise ratio. We present our parallel implementation of our single pulse detection algorithm. Our GPU algorithm is approximately 17x faster than our current CPU OpenMP code (NVIDIA Titan XP vs Intel E5-2650v3). This code is part of the AstroAccelerate project, which is a many-core accelerated time-domain signal processing code for radio astronomy. This work allows our AstroAccelerate code to perform a single pulse search on SKA-like data 4.3x faster than real-time.
Submitted 29 November, 2016;
originally announced November 2016.
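The three steps named in the abstract above (standard deviation, boxcar matched filtering, signal-to-noise thresholding) can be illustrated with a minimal CPU sketch in NumPy. This is not the authors' GPU code: the function name `boxcar_single_pulse_search`, the default widths, and the threshold of 6 are assumptions chosen for illustration, and the noise is estimated naively from the whole series.

```python
import numpy as np

def boxcar_single_pulse_search(series, widths=(1, 2, 4, 8, 16), threshold=6.0):
    """Return (snr, start_sample, width) candidates, strongest first.

    Hypothetical helper for illustration only; names and defaults are
    assumptions, not taken from the AstroAccelerate code.
    """
    x = np.asarray(series, dtype=np.float64)
    mean = x.mean()
    std = x.std()  # naive noise estimate over the whole series
    # Prefix sums of the mean-subtracted data let each boxcar sum be
    # computed as a difference of two entries.
    csum = np.concatenate(([0.0], np.cumsum(x - mean)))
    candidates = []
    for w in widths:
        if w > x.size:
            continue
        sums = csum[w:] - csum[:-w]          # boxcar sums of width w
        snr = sums / (std * np.sqrt(w))      # noise grows as sqrt(w)
        for start in np.flatnonzero(snr > threshold):
            candidates.append((float(snr[start]), int(start), int(w)))
    candidates.sort(reverse=True)
    return candidates
```

Calling `boxcar_single_pulse_search(series)[0]` gives the strongest candidate, if any; the boxcar width that best matches the pulse width should yield the highest signal-to-noise ratio.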
-
Commissioning of ALFABURST: initial tests and results
Authors:
Kaustubh Rajwade,
Jayanth Chennamangalam,
Duncan Lorimer,
Aris Karastergiou,
Dan Werthimer,
Andrew Siemion,
David MacMahon,
Jeff Cobb,
Christopher Williams,
Wes Armour
Abstract:
Fast Radio Bursts (FRBs) are apparently one-time, relatively bright radio pulses that have been observed in recent years. The origin of FRBs is currently unknown and many instruments are being built to detect more of these bursts to better characterize their physical properties and identify the source population. ALFABURST is one such instrument. ALFABURST takes advantage of the 7-beam Arecibo L-band Feed Array (ALFA) receiver on the 305-m Arecibo Radio Telescope in Puerto Rico, to detect FRBs in real-time at L-band (1.4 GHz). We present the results of recent on-sky tests and observations undertaken during the commissioning phase of the instrument. ALFABURST is now available for commensal observations with other ALFA projects.
Submitted 13 February, 2016;
originally announced February 2016.
-
Pulsar Acceleration Searches on the GPU for the Square Kilometre Array
Authors:
Sofia Dimoudi,
Wesley Armour
Abstract:
Pulsar acceleration searches are methods for recovering signals from radio telescopes that may otherwise be lost due to the effect of orbital acceleration in binary systems. The vast amount of data that will be produced by next generation instruments such as the Square Kilometre Array (SKA) necessitates real-time acceleration searches, which in turn require the use of HPC platforms. We present our implementation of the Fourier Domain Acceleration Search (FDAS) algorithm on Graphics Processor Units (GPUs) in the context of the SKA, as part of the Astro-Accelerate real-time data processing library, currently under development at the Oxford e-Research Centre (OeRC), University of Oxford.
Submitted 24 November, 2015; v1 submitted 23 November, 2015;
originally announced November 2015.
-
ALFABURST: A realtime fast radio burst monitor for the Arecibo telescope
Authors:
Jayanth Chennamangalam,
Aris Karastergiou,
David MacMahon,
Wes Armour,
Jeff Cobb,
Duncan Lorimer,
Kaustubh Rajwade,
Andrew Siemion,
Dan Werthimer,
Christopher Williams
Abstract:
Fast radio bursts (FRBs) constitute an emerging class of fast radio transient whose origin continues to be a mystery. Realizing the importance of increasing coverage of the search parameter space, we have designed, built, and deployed a realtime monitor for FRBs at the 305-m Arecibo radio telescope. Named 'ALFABURST', it is a commensal instrument that is triggered whenever the 1.4 GHz seven-beam Arecibo $L$-Band Feed Array (ALFA) receiver commences operation. The ongoing commensal survey we are conducting using ALFABURST has an instantaneous field of view of 0.02 sq. deg. within the FWHM of the beams, with the realtime software configurable to use up to 300 MHz of bandwidth. We search for FRBs with dispersion measure up to 2560 cm$^{-3}$ pc and pulse widths ranging from 0.128 ms to 16.384 ms. Commissioning observations performed over the past few months have demonstrated the capability of the instrument in detecting single pulses from known pulsars. In this paper, I describe the instrument and the associated survey.
Submitted 12 November, 2015;
originally announced November 2015.
-
A polyphase filter for many-core architectures
Authors:
Karel Adámek,
Jan Novotný,
Wes Armour
Abstract:
In this article we discuss our implementation of a polyphase filter for real-time data processing in radio astronomy. We describe in detail our implementation of the polyphase filter algorithm and its behaviour on three generations of NVIDIA GPU cards, on dual Intel Xeon CPUs and the Intel Xeon Phi (Knights Corner) platforms. All of our implementations aim to exploit the potential for data reuse that the algorithm offers. Our GPU implementations explore two different methods for achieving this: the first makes use of L1/Texture cache, the second uses shared memory. We discuss the usability of each of our implementations along with their behaviours. We measure performance in execution time, which is a critical factor for real-time systems; we also present results in terms of bandwidth (GB/s), compute (GFlop/s) and type conversions (GTc/s). We include a presentation of our results in terms of the sample rate which can be processed in real-time by a chosen platform, which more intuitively describes the expected performance in a signal processing setting. Our findings show that, for the GPUs considered, the performance of our polyphase filter when using lower precision input data is limited by type conversions rather than device bandwidth. We compare these results to an implementation on the Xeon Phi. We show that our Xeon Phi implementation has a performance that is 1.47x to 1.95x greater than our CPU implementation; however, it is not sufficient to compete with the performance of GPUs. We conclude with a comparison of our best performing code to two other implementations of the polyphase filter, showing that our implementation is faster in nearly all cases. This work forms part of the Astro-Accelerate project, a many-core accelerated real-time data processing library for digital signal processing of time-domain radio astronomy data.
Submitted 21 April, 2016; v1 submitted 11 November, 2015;
originally announced November 2015.
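For readers unfamiliar with the algorithm discussed in the abstract above, a polyphase filter can be sketched in a few lines of NumPy: weight several consecutive blocks of samples with a prototype filter, sum them, and apply an FFT. This is a generic textbook-style sketch, not the implementation from the paper; the Hamming-windowed sinc prototype and the names `pfb_coefficients` and `polyphase_filterbank` are assumptions made for illustration.

```python
import numpy as np

def pfb_coefficients(n_chan, n_taps):
    """Hamming-windowed sinc prototype filter (an assumed, common choice)."""
    n = n_chan * n_taps
    t = (np.arange(n) - (n - 1) / 2) / n_chan
    return np.sinc(t) * np.hamming(n)

def polyphase_filterbank(x, n_chan, n_taps):
    """Channelise x into n_chan channels; returns (n_spectra, n_chan)."""
    x = np.asarray(x, dtype=np.complex128)
    coeffs = pfb_coefficients(n_chan, n_taps).reshape(n_taps, n_chan)
    n_blocks = x.size // n_chan
    blocks = x[: n_blocks * n_chan].reshape(n_blocks, n_chan)
    n_spectra = n_blocks - n_taps + 1
    spectra = np.empty((n_spectra, n_chan), dtype=np.complex128)
    for s in range(n_spectra):
        # Each output spectrum reuses n_taps consecutive input blocks,
        # which is the data reuse the abstract refers to.
        weighted = (blocks[s : s + n_taps] * coeffs).sum(axis=0)
        spectra[s] = np.fft.fft(weighted)
    return spectra
```

Consecutive spectra share `n_taps - 1` input blocks, so an optimised implementation benefits from keeping those blocks in fast memory (cache or shared memory) between outputs.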
-
Limits on Fast Radio Bursts at 145 MHz with ARTEMIS, a real-time software backend
Authors:
A. Karastergiou,
J. Chennamangalam,
W. Armour,
C. Williams,
B. Mort,
F. Dulwich,
S. Salvini,
A. Magro,
S. Roberts,
M. Serylak,
A. Doo,
A. V. Bilous,
R. P. Breton,
H. Falcke,
J. -M. Griessmeier,
J. W. T. Hessels,
E. F. Keane,
V. I. Kondratiev,
M. Kramer,
J. van Leeuwen,
A. Noutsos,
S. Oslowski,
C. Sobey,
B. W. Stappers,
P. Weltevrede
Abstract:
Fast Radio Bursts (FRBs) are millisecond radio signals that exhibit dispersion larger than what the Galactic electron density can account for. We have conducted a 1446 hour survey for FRBs at 145 MHz, covering a total of 4193 sq. deg on the sky. We used the UK station of the LOFAR radio telescope (the Rawlings Array), accompanied for a majority of the time by the LOFAR station at Nançay, observing the same fields at the same frequency. Our real-time search backend, ARTEMIS, utilizes graphics processing units to search for pulses with dispersion measures up to 320 cm$^{-3}$ pc. Previously derived FRB rates from surveys around 1.4 GHz, and favoured FRB interpretations, motivated this survey, despite all previous detections occurring at higher dispersion measures. We detected no new FRBs above a signal-to-noise threshold of 10, leading to the most stringent upper limit yet on the FRB event rate at these frequencies: 29 sky$^{-1}$ day$^{-1}$ for 5 ms-duration pulses above 62 Jy. The non-detection could be due to scatter-broadening, limitations on the volume and time searched, or the shape of FRB flux density spectra. Assuming the latter and that FRBs are standard candles, the non-detection is compatible with the published FRB sky rate, if their spectra follow a power law with frequency ($\propto ν^α$), with $α\gtrsim+0.1$, demonstrating a marked difference from pulsar spectra. Our results suggest that surveys at higher frequencies, including the low frequency component of the Square Kilometre Array, will have better chances to detect, estimate rates and understand the origin and properties of FRBs.
Submitted 10 June, 2015;
originally announced June 2015.
-
Observations of transients and pulsars with LOFAR international stations and the ARTEMIS backend
Authors:
Maciej Serylak,
Aris Karastergiou,
Chris Williams,
Wesley Armour,
Michael Giles,
the LOFAR Pulsar Working Group
Abstract:
The LOw Frequency ARray - LOFAR - is a new radio interferometer designed with emphasis on flexible digital hardware instead of mechanical solutions. The array elements, so-called stations, are located in the Netherlands and in neighbouring countries. The design of LOFAR allows independent use of its international stations, which, coupled with a dedicated backend, makes them very powerful telescopes in their own right. This backend is called the Advanced Radio Transient Event Monitor and Identification System (ARTEMIS). It is a combined software/hardware solution for both targeted observations and real-time searches for millisecond radio transients which uses Graphical Processing Unit (GPU) technology to remove interstellar dispersion and detect millisecond radio bursts from astronomical sources in real-time.
Submitted 30 October, 2012; v1 submitted 16 October, 2012;
originally announced October 2012.
-
Observations of transients and pulsars with LOFAR international stations
Authors:
Maciej Serylak,
Aris Karastergiou,
Chris Williams,
Wes Armour,
LOFAR Pulsar Working Group
Abstract:
The LOw FRequency ARray - LOFAR is a new radio telescope that is moving the science of radio pulsars and transients into a new phase. Its design places emphasis on digital hardware and flexible software instead of mechanical solutions. LOFAR observes at radio frequencies between 10 and 240 MHz where radio pulsars and many transients are expected to be brightest. Radio frequency signals emitted from these objects allow us to study the intrinsic pulsar emission and phenomena such as propagation effects through the interstellar medium. The design of LOFAR allows independent use of its stations to conduct observations of known bright objects, or wide field monitoring of transient events. One such combined software/hardware solution is called the Advanced Radio Transient Event Monitor and Identification System (ARTEMIS). It is a backend for both targeted observations and real-time searches for millisecond radio transients which uses Graphical Processing Unit (GPU) technology to remove interstellar dispersion and detect millisecond radio bursts from astronomical sources in real-time using a single LOFAR station.
Submitted 2 July, 2012;
originally announced July 2012.
-
A GPU-based survey for millisecond radio transients using ARTEMIS
Authors:
W. Armour,
A. Karastergiou,
M. Giles,
C. Williams,
A. Magro,
K. Zagkouris,
S. Roberts,
S. Salvini,
F. Dulwich,
B. Mort
Abstract:
Astrophysical radio transients are excellent probes of extreme physical processes originating from compact sources within our Galaxy and beyond. Radio frequency signals emitted from these objects provide a means to study the intervening medium through which they travel. Next generation radio telescopes are designed to explore the vast unexplored parameter space of high time resolution astronomy, but require High Performance Computing (HPC) solutions to process the enormous volumes of data that are produced by these telescopes. We have developed a combined software/hardware solution (code named ARTEMIS) for real-time searches for millisecond radio transients, which uses GPU technology to remove interstellar dispersion and detect millisecond radio bursts from astronomical sources in real-time. Here we present an introduction to ARTEMIS. We give a brief overview of the software pipeline, then focus specifically on the intricacies of performing incoherent de-dispersion. We present results from two brute-force algorithms. The first is a GPU based algorithm, designed to exploit the L1 cache of the NVIDIA Fermi GPU. Our second algorithm is CPU based and exploits the new AVX units in Intel Sandy Bridge CPUs.
Submitted 28 November, 2011;
originally announced November 2011.
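Brute-force incoherent de-dispersion, as mentioned in the abstract above, can be sketched as follows: for each trial dispersion measure (DM), every frequency channel is shifted by its cold-plasma dispersion delay relative to the highest frequency, and the channels are summed. This is a minimal NumPy illustration, not the ARTEMIS code; the function names and the use of a wrapping `np.roll` shift are assumptions made for brevity. The delay uses the standard approximation of about 4.148808 ms × DM × ν⁻² with ν in GHz and DM in pc cm⁻³.

```python
import numpy as np

K_DM_MS = 4.148808  # dispersion constant in ms GHz^2 cm^3 pc^-1

def dispersion_delay_ms(dm, freq_ghz, ref_freq_ghz):
    """Delay (ms) at freq_ghz relative to ref_freq_ghz for a given DM."""
    return K_DM_MS * dm * (freq_ghz ** -2.0 - ref_freq_ghz ** -2.0)

def dedisperse(data, freqs_ghz, dm_trials, tsamp_ms):
    """Brute-force incoherent de-dispersion.

    data: array of shape (n_chan, n_samp); returns (n_dm, n_samp).
    Hypothetical helper for illustration; samples wrap around at the
    edges because np.roll is used for the shift.
    """
    data = np.asarray(data, dtype=np.float64)
    freqs = np.asarray(freqs_ghz, dtype=np.float64)
    ref = freqs.max()
    out = np.zeros((len(dm_trials), data.shape[1]))
    for i, dm in enumerate(dm_trials):
        delays = dispersion_delay_ms(dm, freqs, ref)
        shifts = np.rint(delays / tsamp_ms).astype(int)
        for chan, shift in enumerate(shifts):
            # Advance each channel by its delay so the pulse lines up.
            out[i] += np.roll(data[chan], -shift)
    return out
```

The cost scales with the number of DM trials times channels times samples, and every DM trial rereads the same input data, which is why cache behaviour matters for implementations of this kind.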