# OzGrav 2020 Research Proposal

- 3429 words
- 18 min

- tags:
- none

# Introduction

## Background

In 1687, Isaac Newton published his book, “*Philosophiæ Naturalis
Principia Mathematica*”, containing his law of universal gravitation:

\[F = G\dfrac{m_1m_2}{r^2},\]

where \(F\) is the resultant gravitational force, \(G\) is the universal
gravitational constant, \(m\) is the mass of the objects and \(r\) is
the distance between the objects.

It has since been taught in high-schools across the world in introductory physics classes, due in part to its simplicity and general accuracy for predicting movement. Despite its general applicability, one major issue of the theory is that it infers that gravitational is instantaneously applied, without any apparent method through which it could be transmitted.

Roughly two centuries later, in 1905, Albert Einstein presented the
theory of special relativity in his paper “*Zur Elektrodynamik bewegter
Körper*” (English: *“On the Electrodynamics of Moving
Bodies”*). The theory introduced the concept of **spacetime** to
describe inertial reference frames as a four-dimensional coordinate
system, \((t, x, y, z)\), where \(t\) is time and \((x, y, z)\) are the
three spatial dimensions. He further stated two important axioms; that
the speed of light in a vacuum is the same for all observers regardless
of motion and that the laws of physics are invariant in all inertial
frames of reference. About ten years later, Albert Einstein incorporated
the effect of gravity with special relativity, forming the general
theory of relativity.

The general theory of relativity postulates that the effect of gravity can be characterised as each gravitational potential source changing the curvature of spacetime. The relationship of gravitational mass-energy and the shape of spacetime is given by Einstein’s field equations:

\[G_{\mu{}v} + \Lambda{}g_{\mu{}v} = \dfrac{8\pi{}G}{c^4}T_{\mu{}v},\]

where \(T_{\mu{}v}\) is the stress-energy tensor^{1}, \(G_{\mu{}v}\) is
the Einstein tensor, \(g_{\mu{}v}\) is the spacetime metric, \(\Lambda\)
is the cosmological constant, \(G\) is the universal gravitational
constant and \(c\) is the speed of light.

An implication of gravity curving spacetime is that massive accelerating
objects would cause ‘ripples’ in fabric of spacetime called
gravitational waves. The existence of gravitational waves remained a
theory until 1974, when Russell Hulse and Joseph Taylor discovered a
binary pair of neutron stars that were orbiting each other. After
several years of measurement, they found that the speed at which the
stars were orbiting each other was slowing in a manner consistent with
the predictions of the general theory of relativity, showing that
gravitational waves did indeed exist ^{2}.

There were several experiments performed in the 1960s and 1970s to
determine methods to detect gravitaional waves, resulting in several
large laser interferometric detectors (that is, detectors that use
interferometry the phenomena by which waves superpose on each other to
create a resultant wave for detection) being built throughout the early
2000s, including the American Laser Interferometric Gravitational-Wave
Observatory (LIGO) ^{3}, the Italian Virgo detector ^{4}, and the German GEO600 detector. The
initial observation runs between 2002 and 2011 by the various detectors
failed to directly detect any gravitational waves, and as such, the
majority of the detectors began work to increase their sensitivity
throughout the 2010s ^{5}. The increase in detector
sensitivity has brought success in the search for gravitational waves,
with the first direct detection occurring on the 14th of September, 2015
^{6}^{7}.

Due to their design, the detectors have a significant amount of noise
from sources that are not gravitational waves, in addition to the
gravitational waves themselves having very weak signals. As such, a
large amount of data processing needs to be done to the outputs produced
by the detectors in order to filter and extract any possible
gravitational waves. These data processors are known as ‘*pipelines*’,
and are mostly created by research groups that are a part of the LIGO
Scientific Collaboration ^{8}. These
pipelines are used throughout observation runs for real-time data
analysis.

One such pipeline is the Summed Parallel Infinite Impulse Response
(SPIIR) pipeline, created by Shaun Hooper in 2012 ^{9}.
The pipeline uses a number of IIR filters which are commonly used in
signal processing for bandpass filtering to approximate possible
gravitational wave signals for detection. The pipeline was further
developed by Qi Chu in 2017, by using GPU acceleration techniques to
increase the speed of analysis, as well implementing a method to use a
frequentist coherent search ^{10}. The pipeline is currently the
fastest of all existing pipelines, and has participated in every
observation run since November 2015, successfully detecting all events
that were seen in more than one detector.

The SPIIR pipeline uses GStreamer, a library for composing, filtering
and moving around signals, in addition to the GStreamer LIGO Algorithm
Library (`gstlal`

). After receiving data from the detectors, the
pipeline performs data conditioning and data whitening, followed by the
usage of the IIR filters. The data is then combined for post-processing,
where events are given sky localization and then inserted into the LIGO
event database ^{11}.

## Problem Description & Goal

As of 2020-04-20, the SPIIR pipeline supports the use of two or three detectors for gravitational wave searching the two American LIGO detectors and the Virgo detector. There are several issues with the current pipeline design that this research project aims to address.

Further detectors are likely to be coming online in the near future, with old detectors occasionally being removed from detection for maintainance. For example, the Japanese KAGRA detector is undergoing testing with the goal of being used in the next observation run, and LIGO India is currently being installed. With the current design of the pipeline, adding and removing detectors is a significant undertaking that takes a substantial amount of development time.

In addition, if a detector is indeed added to SPIIR, the detector
* must* be used for the coherent post-processing, and can’t be used
just for synchronization purposes. This presents an issue as additional
detectors are added, as each detector has its own sensitivity, reading
variations and range of observable gravitational wave frequencies,
resulting in some detectors being suitable for searching specific
frequency ranges whilst showing no discernable change in output for
other detectors, causing many false negatives.

An ideal architecture for the pipeline would be significantly more
composable, able to add and remove the usage of different detectors for
post-processing with minimal effort.

As such, this research project aims to complete a subsection of this
idealised architecture. The project aims to remove the requirement for
all detectors to be used for coherent post-processing with sky
localization, and instead aims to provide a generic interface that would
allow for any number of detectors to be used for coherent
post-processing, whilst still allowing the unused detectors to undergo
all other parts of the pipeline and remain synchronized with the used
detectors. The project shall explore a number of different possible
codebase refactorings as well as exploring new techniques for
efficiently combining \(N\) data sources for coherent search, and shall
measure the performance impact of the changes using a number of
benchmarks.

# Literature Review

This research project aims to refactor the SPIIR pipeline codebase and
explore techniques for combining some unknown \(N\) number of data
sources for coherent search. Whilst the refactoring part of the project
is somewhat tangential to the aims of the project, looking at the
literature for refactoring will help with the development of the
research methodology. The literature for both refactoring and coherent
search are developing and varied.

Refactoring is defined by Murphy-Hill as the
process of changing the structure of software without changing its
behaviour ^{12}. Of course, this isn’t the only way to modify source code to
address known issues. George Fairbanks offers the options of ignoring,
refactoring or rewriting code as potential methods to deal with problems
in (Fairbanks 2019), and makes several distinctions between the options.
According to Fairbanks, the major difference between refactoring and
rewriting is the process by which the code is modified. When
refactoring, incremental changes are made to the odebase, with the major
goal being to keep the newly written code integrated with the existing
codebase and tests, as the outputs of a module given some inputs should
still remain identical. In contrast, when rewriting, the new code is
written using none of the existing codebase, possibly resulting in
majorly different outputs and possibly even data flow architecture.
Unlike with refactoring, existing tests might not be able to be
leveraged, but it becomes much easier to make sweeping architectural
changes to the codebase.

This still leaves the third option ignoring the issues. Fairbanks points
out that ignoring issues simply means that they will have to be dealt
with at a later date, and contribute to ‘*techinical debt*’ a term
coined by Ward Cunningham in (Cunningham 1992) to help explain why
otherwise working code may need to be refactored or rewritten, and has
since turned into its own area of academic research, as well as a major
focus of industry. Some examples of activities that “accrue” technical
debt are; a lack of documentation, implementing sub-optimal algorithms
and a lack of testing. Allman notes that technical debt has a
number of similarities to financial debt, in that there can be advantages
and disadvantages to accruing the debt ^{10}. One such advantage, is that
the codebase can be shipped without being entirely complete, and
may indeed reach functional completeness in a shorter time than if the
technical debt was not accrued. Some potential disadvantages, however,
include faults in the system, increased maintainance and extensibility
effort, as well as increased time onboarding new members of staff.

Technical debt, then, is clearly an area that needs to be managed over
the course of a programming project, even when rewriting or refactoring.
(Fairley and Willshire 2017) notes that whilst rewriting or reworking an
existing codebase can reduce or eliminate technical debt, the project
also risks accumulating additional debt if not correctly managed. To
help with management, (Fairley and Willshire 2017) suggests adopting a
development process that includes regularly reviewing expected and
actual progress, whilst Allman encourages regular internal and
external documentation, as well as maintaining an index of prioritised
debts ^{10}.
The SPIIR pipeline uses a group of IIR (infinite impulse response)
filters with time delays to approximate a matched filter. Xiaoyang Guo states
in a 2018 paper that the output of the (i)th IIR filter can be expressed
with the equation:

\[y^i_k = a^i_1y^i_{k-1} + b^i_0x_{k-d_i},\]

where \(a^i_1\) and \(b^i_0\) are coefficients, \(k\) is time in a
discrete form and \(x_{k-d_i}\) denotes input with some time delay
\(d_i\) ^{11}. After summing the output of the filters, the resulting signal
undergoes coherent post-processing to determine the likelihood of an
event having occurred.

Coherent post-processing was introducted in Qi Chu's thesis as an alternative
to coincidence post-processing^{13}. Qi Chu states that the
multi-detector maximum log likelihood ratio to be equal to the coherent
signal to noise ratio \(\rho{}^2_c\), which can be expressed as:

\[\rho^2_c = \underset{max{A_{jk},\theta,t_{c},\alpha,\delta}}{\ln \mathcal{L}_{NW}},\]

where \(A_{jk}\) describes the relative inclination of the detector to
the source, \(\theta\) is the mass of the source, \(\alpha\) and
\(\beta\) are the sky directions of the source and \(\mathcal{L}_{NW}\)
is the network log likelihood network.

Whilst SPIIR’s coherent search currently only supports the use of two or three detectors, Qi Chu estimates the computational cost of the coherent search to be:

\[O(2N^3_dN_mN_p),\]

where \(N_d\) is the number of detectors, \(N_m\) is the number of sets
of IIR filters, and \(N_p\) is the number of sky locations^{13}. Xiaoyang Guo
discusses a number of optimizations made to the pipeline,
including in sections of the coherent post-processing, but only reduced
constant factors and not the computational cost of the overall process ^{11}.
As such, as more detectors are introduced, the computational time for
coherent search increases cubically.

An interesting parallel can, however, be made to sorting, a very widely
studied area of computer science. Comparison sorts are known to have a
lower bound of \(\Omega{}(n\log{n})\) number of comparison operations
(Cormen et al., n.d.) which can be increased depending on the
algorithmic inputs, however when implemented on a distributed network,
the number of comparators per thread can be reduced to \(O(\log^2{n})\)
using a sorting network, which uses a fixed set of comparisons on the
order of \(O(n\log^2{n})\) ^{14}. By measuring the total number of
comparitors, it would appear that the distributed algorithm is less efficient,
however GPUGems (^{14}) notes that the distributed algorithm can sort more keys
per second than an optimal \(O(n\log{n})\) algorithm running on a single
thread.

The SPIIR coherent search is distributed on a number of GPUs using CUDA
^{11}, however the computational cost in Qi Chu's thesis is
computed as being on the order of the number of computations, not the
number of computation per thread ^{13}. It is therefore possible that the
computational cost per thread is different to the overall computational
cost, and thus as detectors are added using the existing algorithm, the
growth of the computational runtime may not be cubic.

# Methods

The process for development will follow the suggested method in (Fairley
and Willshire 2017) to minimize technical debt. The gravitational wave
research group at the University of Western Australia meets once a week
to present progress reports and submit the planned progress for the next
week. By tracking the difference between planned progress and the actual
progress each week, the level of technical debt that the project is
incurring can be roughly determined. Another method by which technical
debt can be minimized is through correct testing^{10}. Unit
tests will be created for the coherent search function to determine the
correct outputs using the existing algorithm, and as components are
refactored, the tests will be used to ensure that there are no
regressions. Internal and external documentation shall also be created
and maintained throughout the project as per Allman's paper^{10}, with
internal documentation being comments on the function of code, and
external documentation being the methods stated in the final report.

Development for the new pipeline will be performed on the OzStar cluster
at Swinburne University of Technology ^{15}. The OzStar cluster already
contains a set of sample data to run the pipeline on, as well as the
tools for running the pipeline. As such, all testing will be performed
on the cluster using the existing sample data.

Validation of the correctness of the eventual coherent search algorithm will be an important factor to show that there are no behavioural regressions in the course of the project. Validation will be performed using a copy of the existing codebase and comparing the outputs of the coherent search function for equivalent inputs. If the new algorithm raises any false positives or false negatives that were not also seen in the existing codebase, then it can be stated that the new algorithm is not correct.

This project will also measure the performance and computational cost difference between the 2-3 detector specific coherent search function and the generalised coherent search function. The difference in computational cost will be measured by an analysis of the average case of the resulting algorithm and comparing it to the average case of the original coherent search function. The performance cost will be measured by determining the runtime of coherent search and comparing it across equivalent inputs for different sized inputs.

# Expected Findings

There are a number of expected findings from this research. First, it would be expected that the new coherent search algorithm would replicate the outputs of the existing search algorithm for the same inputs (i.e. it is valid as per section 3). Second, it is expected that no new coherent search algorithm is found, and that the generalisation of the existing algorithm results in a regression of less than 5% of the current coherent search runtime.

# Proposed timeline

Task |
Time |
Description |
---|---|---|

Proposal Writing (Due 2020-04-20) | 2020-03 to 2020-04 | Write research proposal. |

Literature Review | 2020-04 to 2020-06 | Investigate efficient coherent search methods |

Analyse Existing Codebase | 2020-04 to 2020-07 | Determine expected inputs and outputs of coherent search methods and determine per-thread computational cost |

Determine valid unit tests | 2020-05 | Determine tests that can be used to determine the correctness of coherent search algorithms |

Oral Progress Report (Due 2020-05-22) | 2020-05 | Give report detailing progress on research |

Refactor Existing Codebase | 2020-06 to 2020-08 | Perform changes on pipeline |

Validate New Pipeline | 2020-08 | Compare outputs from changed pipeline to the previous version to ensure there is not a regression of results |

Measure Performance Differences | 2020-08 | Determine performance cost of changes |

Abstract | 2020-08 to 2020-09 | Prepare and submit research abstract |

Seminar | 2020-09 to 2020-10 | Prepare and give seminar on research |

Paper Writing | 2020-09 to 2020-10 | Prepare and submit paper on research |

^{1}

Tensors are very similar to matrices of vectors and are typically used to describe mathematical geometric relationships

^{6}

Abbott, B. P., R. Abbott, T. D. Abbott, M. R. Abernathy, F. Acernese, K.
Ackley, C. Adams, et al. 2016. “Observation of Gravitational Waves from
a Binary Black Hole Merger.” *Phys. Rev. Lett.* 116 (6): 061102.
https://doi.org/10.1103/PhysRevLett.116.061102.

^{5}

“About ‘aLIGO’.” n.d. *LIGO Lab: Caltech*.
https://www.ligo.caltech.edu/page/about-aligo.

^{10}

Allman, Eric. 2012. “Managing Technical Debt.” *Commun. ACM* 55 (5):
50–55. https://doi.org/10.1145/2160718.2160733.

^{13}

Chu, Q. 2017. “Low-Latency Detection and Localization of Gravitational Waves from Compact Binary Coalescences.”

^{14}

“GPUGems Chapter 46. Improved Gpu Sorting.” n.d. *NVIDIA Developer*.
NVIDIA. https://developer.nvidia.com/gpugems/GPUGems2/gpugems2_chapter46.html.

^{11}

Guo, Xiaoyang, Qi Chu, Zhihui Du, and Linqing Went. 2018. “GPU-Optimised Low-Latency Online Search for Gravitational Waves from Binary Coalescences.” In, 2018-:2638–42. EURASIP. https://ieeexplore.ieee.org/document/8553574.

^{9}

Hooper, Shaun, Shin Kee Chung, Jing Luan, David Blair, Yanbei Chen, and
Linqing Wen. 2012. “Summed Parallel Infinite Impulse Response Filters
for Low-Latency Detection of Chirping Gravitational Waves.” *Physical
Review D - Particles, Fields, Gravitation and Cosmology* 86 (2).

^{7}

“LIGO Detected Gravitational Waves from Black Holes.” n.d. *Caltech*.
https://www.ligo.caltech.edu/detection.

^{3}

“LIGO Lab: Caltech: MIT.” n.d. *Caltech*. https://www.ligo.caltech.edu/.

^{8}

“LIGO Scientific Collaboration.” n.d. *LIGO Scientific Collaboration*.
https://ligo.org/.

^{12}

Murphy-Hill, E, C Parnin, and A. P Black. 2012. “How We Refactor, and
How We Know It.” *IEEE Transactions on Software Engineering* 38 (1):
5–18. https://ieeexplore.ieee.org/document/6112738.

^{15}

“OzStar – Supercomputing at Swinburne University of Technology.” n.d.
*Swinburne University of Technology*. https://supercomputing.swin.edu.au/.

^{4}

“Virgo Detector.” n.d. *Virgo*. http://www.virgo.infn.it/.

^{2}

Weisberg, Joel M., Joseph H. Taylor, and Lee A. Fowler. 1981. “Gravitational Waves from an Orbiting Pulsar” 245 (4): 74–83. https://www.jstor.org/stable/24964580.