Introduction

The past five years have witnessed a revolution in astronomy. The direct detection of gravitational waves (GWs) emitted from the binary black hole (BBH) merger GW150914 (Fig. 1) by the Advanced Laser Interferometer Gravitational-Wave Observatory (LIGO) detector1 on September 14, 2015 (ref.2) was a watershed event, not only in demonstrating that GWs could be directly detected but more fundamentally in revealing new insights into these exotic objects and the Universe itself. On August 17, 2017, the Advanced LIGO and Advanced Virgo3 detectors jointly detected GW170817, the merger of a binary neutron star (BNS) system4, an equally momentous event leading to the observation of electromagnetic (EM) radiation emitted across the entire spectrum through one of the most intense astronomical observing campaigns ever undertaken5.

Fig. 1: The detected gravitational-wave strain amplitude as a function of time for GW150914, the first signal detected nearly simultaneously by the LIGO Hanford and Livingston observatories on September 14, 2015.

The waveforms are shifted and inverted to compensate for the slightly different arrival times and different orientations of the detectors (red: LIGO Hanford, blue: LIGO Livingston). The upper inset is a simulation of the merger produced using numerical relativity to illustrate the evolution of the black hole event horizons as the system coalesces and merges. Figure adapted with permission from ref.2, CC BY 3.0 (https://creativecommons.org/licenses/by/3.0/).

Coming nearly 100 years after Albert Einstein first predicted their existence6, but doubted that they could ever be measured, the first direct GW detections have undoubtedly opened a new window on the Universe. The scientific insights emerging from these detections have already revolutionized multiple domains of physics and astrophysics, yet they are only ‘the tip of the iceberg’, representing a small fraction of the future potential of GW astronomy. As is the case for the Universe seen through EM waves, different classes of astrophysical sources emit GWs across a broad spectrum ranging over more than 20 orders of magnitude in frequency, and different detectors are required for each range of frequencies of interest (Fig. 2).

Fig. 2: The gravitational-wave spectrum probed by strain-sensitive gravitational-wave detectors, ranging from \(10^{-9}\) Hz to more than 1,000 Hz.

The source classes are shown above the spectrum and the detectors below. The portion of the gravitational-wave spectrum below \(10^{-9}\) Hz probed through measurements of the cosmic microwave background polarization is not shown.

In this Roadmap, we present the perspectives of the Gravitational Wave International Committee (GWIC, https://gwic.ligo.org) on the emerging field of GW astronomy and physics in the coming decades. The GWIC was formed in 1997 to facilitate international collaboration and cooperation in the construction, operation and use of the major GW detection facilities worldwide. Its primary goals are: to promote international cooperation in all phases of construction and scientific exploitation of GW detectors, to coordinate and support long-range planning for new instruments or existing instrument upgrades, and to promote the development of GW detection as an astronomical tool, exploiting especially the potential for multi-messenger astrophysics. Our intention in this Roadmap is to present a survey of the science opportunities and to highlight the future detectors that will be needed to realize those opportunities. The recent remarkable discoveries in GW astronomy have spurred the GWIC to re-examine and update the GWIC roadmap originally published a decade ago7.

We first present an overview of GWs, the methods used to detect them and some scientific highlights from the past five years. Next, we provide a detailed survey of some of the outstanding scientific questions that can be answered with planned or envisioned future GW detectors. We then discuss the future prospects for synergistic observations using GW and EM observatories. Finally, we highlight some of the technological challenges to be overcome to build future GW detectors before concluding.

GW fundamentals and detectors

Fundamentally different from and complementary to other astrophysical ‘messengers’ such as photons, neutrinos or cosmic rays, GWs provide unique information about the most energetic astrophysical processes in the Universe by carrying information about the dynamics of massive objects such as black holes and neutron stars moving at relativistic speeds. As predicted by general relativity (GR), GWs are transverse (oscillating perpendicular to the direction of propagation), travel at the speed of light and possess two polarizations.

GWs physically manifest themselves as time-dependent strains, h, in spacetime, or, more precisely, h = δL/L, where L is the distance between two reference points in space and δL is the induced displacement over the baseline L. GR predicts that the induced strain is perpendicular to the axis of GW propagation and is quadrupolar, that is, a wave travelling along the z-axis stretches (then compresses) the path along the x-axis while shrinking (then stretching) the y-axis (for one polarization; in the orthogonal polarization, the elongation/compression occurs along axes rotated 45° relative to the x-axis and y-axis). GW detectors rely on a measurement of the variations in the light travel time between separated reference points — or ‘test masses’ — caused by a passing GW. The test masses are configured such that each is in near-perfect free fall (and, as such, approximate a local inertial frame) and are separated over very long baselines. The light travel times between pairs of test masses are monitored and read out via a detector such that any changes in the spacetime curvature caused by passing GWs induce modulations in these light travel times. This concept is simply illustrated for ground-based detectors in Fig. 3, which shows a simple Michelson interferometer.

Fig. 3: The concept of a simple laser interferometer gravitational-wave detector.

A gravitational-wave (GW) strain shortens one arm while lengthening the other as it passes the detector, resulting in a slight difference in round-trip travel time for the laser light. This, in turn, leads to a phase shift of the light in one arm of the detector relative to the other, creating a change in light intensity at the photodetector. The time-dependent intensity recorded by the photodetector produces a reconstruction of the propagating GW.

Ground-based detectors

Current ground-based observatories probe the high-frequency portion of the GW spectrum from ~10 Hz to ~10 kHz dominated by stellar-mass compact sources, such as coalescing BBH and neutron star systems, and (yet to be observed) supernovae and isolated neutron stars. All ground-based detectors use enhanced Michelson interferometry with suspended mirrors to directly measure a GW’s phase and amplitude. The detection of audio-band GWs places extremely stringent demands on the isolation of the mirrors from local forces and disturbances. The two US-based Advanced LIGO detectors1 have L = 4 km arm lengths, whereas the European-based Advanced Virgo3 and the Japan-based KAGRA8,9 have L = 3 km arms. Typical strains from astrophysical sources are on the order of \(10^{-21}\) or less; thus, displacement sensitivities δL of less than ~\(10^{-18}\) m are required to detect GWs with sufficient signal-to-noise ratio (SNR). This is an incredibly small displacement; for comparison, the radius of a proton is ~\(8.5 \times 10^{-16}\) m.
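The scale of these displacements follows directly from the definition h = δL/L given earlier. The short sketch below is illustrative arithmetic only: it uses the arm lengths and proton radius quoted in the text, and the function name and the choice of h = 10^-21 as a representative strain are assumptions made for the example.

```python
# Illustrative arithmetic: convert a strain amplitude h into an arm-length
# change delta_L = h * L, using values quoted in the text.

PROTON_RADIUS_M = 8.5e-16  # proton radius quoted in the text, in metres

# Arm lengths quoted in the text, in metres.
ARM_LENGTHS_M = {
    "Advanced LIGO": 4000.0,
    "Advanced Virgo": 3000.0,
    "KAGRA": 3000.0,
}


def arm_length_change(strain: float, arm_length_m: float) -> float:
    """Return delta_L = h * L in metres for strain h over baseline L."""
    return strain * arm_length_m


h = 1e-21  # representative strain amplitude (assumed for illustration)
for name, length_m in ARM_LENGTHS_M.items():
    delta_l = arm_length_change(h, length_m)
    ratio = delta_l / PROTON_RADIUS_M
    print(f"{name}: delta_L = {delta_l:.1e} m = {ratio:.1e} proton radii")
```

For a 4 km arm, this gives a length change of a few times 10^-18 m, well under one per cent of the proton radius, which is why the text quotes a required sensitivity at the ~10^-18 m level.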

A schematic view of Advanced LIGO, shown in Fig. 4, illustrates the configuration of the current generation of ground-based detectors. The mirrors are suspended from multi-stage pendulum systems such that, above the resonant frequencies of the suspension system (typically around 1 Hz), they can be effectively treated as in free fall (that is, in a local inertial frame) in the direction of light propagation. These suspensions and accompanying seismic isolation systems reduce the undesired test-mass motion induced by ambient ground motion by about a factor of \(10^{12}\) from 1 Hz to 10 Hz (refs3,10). In addition to seismic noise, there are three primary noise sources that currently limit interferometer sensitivity: thermal noise produced by random displacements of the mirror surfaces that are produced by thermally fluctuating stresses in the mirror coatings, substrates and suspensions11; Newtonian (or dynamic gravity gradient) noise arising from earth (ground) and atmospheric density perturbations directly exerting dynamic forces on the mirrors12; and quantum noise resulting from both vacuum fluctuations of the EM field that limit phase resolution in the readout photodetector (so-called ‘shot noise’) and displacements of the mirrors via quantum radiation pressure noise (QRPN), which induce stochastic impulses (or ‘kicks’) on the mirrors due to the random arrival time of the momentum-carrying photons13.

Fig. 4: A schematic of a working laser interferometer gravitational-wave detector.

a | A simplified view of the Advanced LIGO interferometer, showing the optical configuration including the laser, input mode cleaner, power-recycling and signal-recycling cavities, the 4-km-long arm cavities and the output mode cleaner. The gravitational-wave (GW) signal is recorded on the output photodiode after the output mode cleaner. b | A schematic of one of the arm cavities. The input test mass (ITM) and end test mass (ETM) mirrors are held by quadruple suspensions to suppress ground motion disturbances. Panel a courtesy of LIGO/Fabrice Matichard. Panel b courtesy of LIGO/Daniel Sigg.

The effect of QRPN is diminished as the mirror mass increases, and both QRPN and shot noise can be reduced by injecting quantum-engineered squeezed vacuum states of light into the interferometer14. Thermal noise manifests itself in a variety of ways in mirror coatings, mirror substrates and suspensions15; it can be understood from a statistical mechanics perspective as infinitesimal internal motions of macroscopic objects at non-zero temperatures caused by intrinsic dissipation (or mechanical loss) in the system. In addition to these fundamental noise sources, a very large number of technical noises must be identified and overcome, which broadly group into laser frequency and intensity noises, acoustically and seismically driven scattered light noises, sensor and actuator noises, stochastic forces from electrical and magnetic fields, and, potentially, energy deposited by energetic particles. (More details about these noise sources are presented in the last section, where we discuss some of the challenges to building future ground-based detectors.)

To deliver the best science, a network of globally distributed interferometers functioning as a unified detector is required. The Advanced LIGO and Advanced Virgo detectors have actively searched the GW sky in a highly coordinated campaign during a series of observing runs carried out from 2015. Figure 5 shows the sensitivities of the LIGO and Virgo interferometers during the ‘O2’ observing run; in the latest ‘O3’ run, the detectors have achieved sensitivities sufficient to detect BBH mergers on a weekly basis16.

Fig. 5: The sensitivity of the LIGO-Virgo network for the ‘O2’ observing run plotted as a function of frequency.

The vertical axis presents the sensitivity as an amplitude spectral density, that is, the strain per unit square root of frequency. The stable, narrow spectral features are due to very high-Q mechanical resonances and electrical (50-Hz or 60-Hz) coupling. At low frequencies, scattered light, seismic and control noise dominate; in the mid-band, thermal noise in the mirror dielectric coatings is the leading term. At high frequencies, quantum noise is the limit to sensitivity. Figure adapted with permission from ref.26, CC BY 4.0 (https://creativecommons.org/licenses/by/4.0/).

The KAGRA detector recently joined LIGO and Virgo to form the LIGO-Virgo-KAGRA network; the LIGO-India17 interferometer will be joining later in this decade, dramatically improving the ability of the network to confidently detect and locate GW events18 and providing new methods for testing alternative theories of gravity through enhanced ability to resolve GW polarizations19.

The LIGO-Virgo observations have, in a few years, already produced revelations about some of the most energetic and cataclysmic processes in the Universe. From GW150914, and more recent BBH mergers observed by the LIGO Scientific and Virgo collaborations, it is now known that there is a population of black holes paired in orbitally bound binary systems that evolve through the emission of GWs and merge in less than a Hubble time (the age of the Universe); that black holes of many tens and even hundreds of solar masses exist in nature; and that the properties of the observed black holes are entirely consistent with GR to within current measurement limits16,20,21,22,23,24,25,26,27,28. The BNS detection GW170817 and subsequent observations in the EM domain collectively comprise the first demonstration of GW–EM multi-messenger astronomy, providing an astounding wealth of knowledge, including the first definitive link between BNS merger progenitors and short gamma-ray bursts29,30,31,32,33,34,35,36,37; the first definitive observation of a kilonova38,39,40,41,42,43,44,45,46; conclusive spectroscopic proof that BNS mergers produce heavy elements through r-process nucleosynthesis40,47,48,49,50,51,52; the first demonstration that GWs travel at the same speed as light to better than a few parts in \(10^{15}\) (ref.29); and an independent method for measuring the Hubble constant using detected GWs as a ‘standard siren’ for determining the absolute distance to the source53,54,55. Additionally, the Advanced LIGO and Advanced Virgo detections have enabled tests of GR in the strong gravity regime that were inaccessible to other experiments and astronomical observations56,57, motivating research on many fronts in fundamental physics and astrophysics. This represents only a brief overview of the recent discoveries and, as we discuss in detail below, captures only a fraction of the potential science afforded by future GW observations.

Space-based detectors

When launched in the mid-2030s, the Laser Interferometer Space Antenna (LISA)58 will possess a breathtaking scientific portfolio. LISA will explore much of the GW Universe in the frequency band from 100 μHz to 100 mHz. A constellation of three satellites separated by \(2.5 \times 10^{9}\) m in an Earth-trailing orbit, LISA will be capable of detecting the first seed black holes formed out to redshifts z ~ 20 or more59, and intermediate-mass and ‘light’ super-massive coalescing black hole systems in the \(10^{2}\)–\(10^{7}\,M_{\odot}\) (solar mass) range, thus tracing the evolution of black holes from the early Universe through the peak of the star formation era. Through detections of extreme mass ratio inspirals (EMRIs, binary systems with mass ratios as small as ~\(10^{-6}\))60, LISA will directly map the curvature of spacetime at the event horizons of massive black holes, yielding even more precise tests of GR in the strong gravitational field regime. LISA might also detect stellar-mass BBH systems years before they are detectable by ground-based detectors61, and provide very precise sky localization of such events for EM follow-up. By discovering new sources of galactic compact binaries composed of white dwarfs, neutron stars and stellar-mass black holes, LISA will survey the predominant population of binary compact objects and map the structure of the Milky Way62.

The LISA Pathfinder (LPF)63, launched in 2015 and operated until mid-2017, has paved much of the way for the full-scale LISA mission. LPF was a European Space Agency (ESA) mission, with contributions from a consortium of European national agencies, as well as NASA. It convincingly demonstrated some of the key performance requirements for the full LISA mission, most notably the displacement sensitivity and control of spurious acceleration noise required for LISA. More on LISA science is presented in the next section, whereas the LISA and LPF detector technology is discussed in detail in the last section.

PTAs

Pulsar timing arrays (PTAs)64,65,66,67 explore the nanohertz portion of the GW spectrum ranging from \(10^{-9}\) to \(10^{-6}\) Hz. Rather than using laser light to measure variations in detector length as ground-based and space-based detectors do, a PTA measures variations in the radio frequency pulse arrival times at the Earth from an array of millisecond pulsars68,69 (Fig. 6).

Fig. 6: Illustration of the radio pulsar timing process.

The pulsar beams sweep across the radio antenna and are detected using a radio receiver at ~GHz frequencies. The data are corrected for delays due to the interstellar medium (‘de-dispersion’) and then folded at the period of the pulsar to create an integrated profile. This profile is then cross-correlated with a noise-free template profile to calculate a time of arrival (TOA), using a high-precision reference clock at the observatory.

Pulsars are rotating neutron stars that act like cosmic lighthouses, appearing as periodic pulsating radio sources. Because millisecond pulsars, pulsars with periods between roughly 1.4 and 30 ms, possess rotational stabilities comparable with the best atomic clocks, they are ideal timing sources. Once effects such as rotational spin-down, astrometric position and motion, and orbital effects from binary companions are accounted for, the pulse arrival times can be precisely modelled and predicted to fractions of a microsecond for up to decades into the future70, and variations arising from GW perturbations can be measured. Distortions in the spacetime around Earth or the pulsars will produce systematics in timing residuals (deviations of the measured pulse arrival times relative to the predicted arrival times), and, crucially, spatially correlated systematics in the timing residuals of the array of pulsars across the sky71. A GW emitted from a single binary system passing the pulsar-Earth system will cause two frequency components in the time series of the timing residuals: one from the spacetime variations at the pulsar (‘pulsar term’), the other from variations at the Earth (‘Earth term’), with different frequencies resulting from changes in the orbital frequency of the emitting source during the time it takes for the radio pulses to travel to the Earth. The top panel of Fig. 7 shows the expected detection in the form of the Hellings and Downs curve, the correlated response of a pair of pulsar-Earth baselines to a stochastic GW background averaged over all sky positions and polarizations as a function of the angle between the pulsar pair-Earth baselines71.
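The shape of the Hellings and Downs curve can be sketched numerically. The closed-form expression used below is the standard one from the PTA literature rather than something quoted in this text, and the normalization (a correlation of 0.5 as the separation of two distinct pulsars tends to zero) is one common convention; the function name is chosen for illustration.

```python
import math


def hellings_downs(theta_rad: float) -> float:
    """Expected timing-residual correlation for two distinct pulsars
    separated by angle theta, for an isotropic stochastic GW background.

    Uses Gamma = (3/2) x ln(x) - x/4 + 1/2 with x = (1 - cos(theta)) / 2.
    """
    x = (1.0 - math.cos(theta_rad)) / 2.0
    if x == 0.0:
        # x * ln(x) tends to 0 as x tends to 0, leaving the constant term.
        return 0.5
    return 1.5 * x * math.log(x) - x / 4.0 + 0.5


# Tabulate the curve at a few angular separations.
for degrees in (0, 30, 60, 90, 120, 150, 180):
    gamma = hellings_downs(math.radians(degrees))
    print(f"{degrees:3d} deg -> correlation {gamma:+.3f}")
```

The curve starts positive at small separations, becomes negative near 90°, and recovers to a positive value for pulsars on opposite sides of the sky. This distinctive angular pattern is what separates a GW background from noise sources that affect all pulsars equally.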

Fig. 7: The pulsar timing array network.

The top plot represents the Hellings and Downs curve, showing the predicted correlation in pulsar timing residuals as a function of millisecond pulsar angular separation resulting from a stochastic gravitational-wave background. The lower map shows the telescopes that have been or will in the future be used for this experiment. Arecibo (Puerto Rico), the Green Bank Telescope (GBT; West Virginia), Lovell (UK), Effelsberg (Germany), Nançay (France), Westerbork Synthesis Radio Telescope (WSRT; Netherlands), Sardinia Radio Telescope (SRT; Italy) and Parkes (Australia) are currently contributing International Pulsar Timing Array (IPTA) data. The Very Large Array (VLA; New Mexico) will soon begin to contribute. Also shown are the Deep Space Station 43 (DSS-43; Australia), the Molonglo Observatory Synthesis Telescope (MOST; Australia), the Giant Metrewave Radio Telescope (GMRT; India) and Ooty (India), as well as three under-construction telescopes: CHIME (Canada), Five-hundred-meter Aperture Spherical Telescope (FAST; China) and MeerKAT (South Africa). Figure courtesy of Shami Chatterjee.

Pulsars are observed at monthly or more rapid cadences in order to sample and measure changing properties, such as the position of the pulsar (that is, proper motion) and varying dispersion due to the interstellar medium. In addition, they must be observed for roughly one half-hour per observation to average over enough of the pulses to mitigate the effects of jitter induced by astrophysical and receiver noise. The observations themselves cover very wide bandwidths (>GHz) or occur near-simultaneously at multiple radio frequencies in order to correct for the effects of interstellar dispersion. Pulsar timing instruments must have fine frequency resolution (~1 MHz) to correct for these effects, coupled with high time resolution in order to sufficiently sample the roughly millisecond-wide radio pulses.

As each pulsar needs to be timed for about a year (equivalent to one Earth orbit) to be properly localized and understood, PTA experiments must have years-long durations. In practice, the lower end of the frequency window is given by the length of the data set (currently about 1 nHz), whereas the upper end is given by the cadence of the timing observations (currently about 1 μHz). Timing residual amplitudes of about 100 ns or less are resolved for the best timed millisecond pulsars.
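The relationship between observing strategy and frequency window can be made concrete with a small calculation. This is a sketch under stated assumptions: the lower edge is taken as the inverse of the data span, the upper edge as the Nyquist frequency for evenly spaced observations, and the 25-year span and weekly cadence are hypothetical inputs rather than values from the text.

```python
SECONDS_PER_DAY = 86400.0
SECONDS_PER_YEAR = 365.25 * SECONDS_PER_DAY


def pta_frequency_window(span_years: float, cadence_days: float):
    """Return (f_low, f_high) in Hz for a pulsar timing data set.

    f_low  = 1 / T          (T = total data span)
    f_high = 1 / (2 * dt)   (Nyquist limit for sampling interval dt;
                             assumes evenly spaced observations)
    """
    span_s = span_years * SECONDS_PER_YEAR
    cadence_s = cadence_days * SECONDS_PER_DAY
    return 1.0 / span_s, 1.0 / (2.0 * cadence_s)


# Hypothetical data set: 25 years of weekly observations.
f_low, f_high = pta_frequency_window(span_years=25.0, cadence_days=7.0)
print(f"lower edge: {f_low:.2e} Hz")
print(f"upper edge: {f_high:.2e} Hz")
```

With these inputs, the window runs from roughly a nanohertz to just under a microhertz, matching the orders of magnitude quoted above. Extending the data span lowers the bottom edge only slowly, which is why PTA sensitivity at the lowest frequencies improves over decades.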

Today, there are three major PTAs: the Parkes PTA72 in Australia, the European PTA Consortium65 and the NANOGrav73 consortium in North America. Together, these arrays form the International Pulsar Timing Array74 (IPTA) and regularly achieve sub-microsecond timing on over 100 millisecond pulsars (MSPs). PTA science is often sensitivity-limited, and many of the MSPs being discovered in recent surveys have flux densities that often require hour-long observations with 100-m class (or larger) telescopes to achieve the requisite sub-microsecond timing. The Five-hundred-meter Aperture Spherical Telescope (FAST) (500 m diameter) and MeerKAT (64 antennas × 13.7 m diameter) telescopes have been commissioned and are now commencing regular MSP timing, joining many existing 64–100-m class facilities in the Northern Hemisphere and the Parkes 64-m telescope in the Southern Hemisphere. Figure 7 illustrates the radio telescopes used for pulsar timing experiments around the globe. NANOGrav has used two telescopes, the Arecibo Observatory (AO) in Puerto Rico and the Green Bank Telescope (GBT) in West Virginia, with each telescope providing roughly half of the sensitivity to GWs. NANOGrav currently observes almost 80 MSPs, about half at the AO and the other half at the GBT, and is seeing the first indications of a signal consistent with GWs75.

The recent loss of the AO poses significant challenges. In the short term, to minimize the loss of sensitivity to the stochastic background of GWs, NANOGrav is going to move most of the pulsars observed at the AO to the GBT, requiring roughly double the amount of time currently used at the GBT. Longer term, the US community will need a replacement for the AO (such as the DSA-2000 concept76). Legacy AO observations will anchor combined future data sets, allowing us to characterize the low-frequency GW universe and glean unique insights into galaxy evolution and cosmology.

The most promising GW sources in the nanohertz band are super-massive (\(10^{7}\)–\(10^{10}\,M_{\odot}\)) binary black holes (SMBBHs) that form via the collisions of massive galaxies. The astrophysical stochastic gravitational-wave background (ASGWB) produced by the cosmic population of slowly inspiralling SMBBHs across the Universe77,78,79 is the first signal likely to be detected, due to the very long lifetime in the detection band and the relatively small rate of systems in the final coalescence phase. As sensitivity improves, this may be followed by the observation of individual SMBBHs80,81,82; parallel EM observations can both help recover GW signals and allow for richer physics to be extracted. The detection of the ASGWB will reveal essential information about the formation of the large-scale structure of the Universe, determine the rates of galaxy mergers and definitively resolve the ‘final parsec’ problem83, that is, the theoretical difficulty of shrinking the orbit of a SMBBH by a factor of ~100 after its formation at a separation of ~1 pc via the scattering of stars. PTA measurements are currently probing the expected range of astrophysical signals84,85,86,87 and, based on recent results, a detection of the ASGWB may be imminent79. The detection of individual SMBBHs will allow for combined EM and GW multi-messenger observations and, although only a handful are expected, the scientific return of these discoveries will be immense88.

Cosmic microwave background polarization

The lowest frequencies of the GW spectrum, down to approximately \(10^{-18}\) Hz, are populated by a stochastic background of remnant primordial GWs produced during the Big Bang. Standard inflationary cosmology predicts a GW spectrum too feeble to be detectable by current ground-based detectors, LISA or PTAs, although some extensions of inflation and more exotic models, including first-order phase transitions and topological defects, predict primordial GW energy densities that can be detected across the frequency bands89,90. EM-based measurements of the cosmic microwave background (CMB) polarization may reveal signs of the remnant primordial GWs91. As CMB polarization measurements are based on a fundamentally different detection method than their higher frequency counterparts, this approach is not discussed further here.

Upcoming physics and astronomy with GWs

In the coming decades, the new observational window of GW astronomy promises to deliver data that will transform the landscape of physics, addressing some of the most pressing problems in fundamental physics, astrophysics and cosmology88,92,93,94,95 (see Box 1). The next generation of ground-based GW observatories planned for the 2030s, the Einstein Telescope (ET, ref.96) and Cosmic Explorer (CE, ref.97) (collectively referred to as 3G), as well as the LISA58 mission, will observe merging black holes and neutron stars when the Universe was still in its infancy. PTAs (ref.74) will continue to evolve to greater sensitivity. LIGO Voyager98, a major upgrade under consideration for the current LIGO observatories in the late 2020s, could test some of the key technologies needed for the ET and CE and, at the same time, provide a significant increase in sensitivity over the current generation of detectors. With all of these instruments, one can expect to witness extremely high SNR events that could reveal subtle signatures of new physics. The potential of GW science in the next two decades is illustrated in Fig. 8, which compares the reach of the current ground-based detectors Advanced LIGO and Advanced Virgo with that of planned 3G observatories for 1.4–1.4 \(M_{\odot}\) BNS and 30–30 \(M_{\odot}\) BBH mergers as a function of redshift and ‘lookback’ time towards the Big Bang.

Fig. 8: The conceptual reach of next-generation, ground-based detectors.

Shown is the range of the current ground-based detectors and future planned detectors Einstein Telescope (ET) and Cosmic Explorer (CE) to detect 1.4 + 1.4 solar mass (\(M_{\odot}\)) binary neutron star (BNS) (left) and 30 + 30 \(M_{\odot}\) binary black hole (BBH) (right) mergers as a function of ‘lookback’ time (to the Big Bang) and redshift. Yellow (white) circles represent populations of BNS (BBH) mergers as a function of redshift or lookback time. CE is envisioned to be implemented in two phases, designated CE1 and CE2. At their ultimate planned sensitivities, ET and CE will be able to observe all stellar-mass BNS and BBH mergers. BH, black hole; NS, neutron star. A+ refers to Advanced LIGO and O3 to the third LIGO-Virgo operating run. Figure courtesy of Evan Hall.

Fundamental physics

GW observations, because they explore the most extreme conditions of spacetime and of matter, can serve as unsurpassed probes of fundamental physics. In this section, we will look at the power of this new tool in exploring gravity and matter at their most extreme.

Testing GR and modified theories of gravity

GR has been a tremendously successful theory in explaining current astronomical observations and laboratory experiments99,100,101. Nevertheless, there is a general consensus that GR is, at best, incomplete, representing an approximation to a more complete theory that cures some or all of its problems102. These issues include the loss of information down a black hole103, which contradicts unitary evolution of physical states in quantum mechanics; the inevitability of spacetime singularities104,105, for example, at the centre of a black hole where physical quantities such as the density and curvature of spacetime become infinitely large; a cosmological constant that is responsible for the late-time accelerated expansion of the Universe106,107, whose value cannot be accounted for in the standard model of particle physics108; and the lack of a viable formulation of quantum gravity, which might resolve all of these problems but has, so far, been elusive. These difficulties led to increased interest in searching for GR violations in observations in the hope that they will provide clues to an alternative theory of gravity.

The spacetime curvature at the horizon of a black hole of mass M and radius \(R \sim 2GM/c^{2}\) goes as \(\kappa \sim \sqrt{GM\,/\,{c}^{2}{R}^{3}}={c}^{2}\,/\,\sqrt{8}GM\), where G is the gravitational constant and c is the speed of light. Because κ is larger for lighter black holes, binary coalescences of the lightest astrophysical black holes are the strongest regions of gravity that we know of and are ideal for testing strong field predictions of GR101,102. Sub-solar-mass black hole binaries, should they exist, would have even greater curvature. Although neutron stars are lighter than astrophysical black holes, they are not as compact and, hence, probe smaller curvature scales. Black holes also probe regions of greatest compactness (or dimensionless gravitational potential), defined as \(\Phi = GM/c^{2}R\), which is largest for black holes. Past experiments such as the Cassini spacecraft109 and the double pulsar orbital decay110 verified the validity of GR in regimes where fields are moderately strong and/or velocities are small compared with the speed of light (see Fig. 9). Current and future experiments, such as the Event Horizon Telescope (EHT)111 and the GRAVITY instrument112, explore the validity of GR near massive black holes and, hence, in the small curvature, but high compactness, regime. X-ray observations by the NICER experiment113 probe GR in the high curvature and large compactness regime of neutron stars114, whereas GW observations of stellar-mass black holes by ground-based detectors (area denoted by ‘GW ground’ in Fig. 9) and LISA probe curvature and compactness on a wide range of scales: stellar-mass black holes of up to ~5–100 \(M_{\odot}\) (mostly ground-based observatories, but also LISA for sources that are close by), intermediate-mass black holes of \(10^{2}\)–\(10^{4}\,M_{\odot}\) (ground-based observatories and LISA) and super-massive black holes (SMBHs) of \(10^{5}\)–\(10^{10}\,M_{\odot}\) (LISA at the lower end and PTAs at the higher end of the mass range), offering tests of GR over ten orders of magnitude in length scale and twenty orders of magnitude in curvature.

Fig. 9: Gravitational-wave observations offer a unique ability to test general relativity in regions of high curvature and strong gravity.

The plot shows the curvature scale κ (in \(\mathrm{km}^{-1}\)) due to the presence of mass M of size R defined as \(\kappa \equiv \sqrt{GM\,/\,{c}^{2}{R}^{3}}\), where G is the gravitational constant, and the compactness, or the dimensionless gravitational potential, \(\Phi \equiv GM/c^{2}R\) probed by different experiments. For example, observation of the double binary pulsar (total mass of 2.6 solar masses (\(M_{\odot}\)) and orbital separation of ~\(10^{6}\) km) probes the curvature scale of \(\kappa \sim 2 \times 10^{-9}\,\mathrm{km}^{-1}\) and compactness of \(\Phi \sim 4 \times 10^{-6}\). In contrast, a black hole binary of total mass 10 \(M_{\odot}\) at the time of merger probes a curvature scale of \(\kappa \sim 2.4 \times 10^{-2}\,\mathrm{km}^{-1}\) and compactness of Φ ~ 0.5. Filled circles represent past experiments that have verified general relativity at the corresponding curvature scale and compactness. Rectangles and the rhombus represent ongoing or future experiments that will probe ever greater scales of curvature and compactness. (LAGEOS stands for the Laser Geodynamics Satellite; NICER stands for Neutron star Interior Composition Explorer; EHT stands for Event Horizon Telescope and is represented by the blue vertical rectangle; S2 and IR Flare refer to monitoring of the star ‘S2’ and near-infrared flaring near Sagittarius A*; GW ground indicates gravitational-wave observations by ground-based detectors.) Figure adapted with permission from ref.256, APS.
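The two worked examples in this caption can be reproduced from the definitions of κ and Φ. The sketch below is a numerical check: the rounded SI constants, the function names and the choice of R = 2GM/c² for the merging binary (following the horizon radius used in the main text) are assumptions made for illustration.

```python
import math

# Rounded physical constants in SI units; adequate for order-of-magnitude checks.
G = 6.674e-11      # gravitational constant, m^3 kg^-1 s^-2
C = 2.998e8        # speed of light, m/s
M_SUN = 1.989e30   # solar mass, kg


def gravitational_radius_km(mass_msun: float) -> float:
    """Return GM/c^2 in kilometres for a mass given in solar masses."""
    return G * mass_msun * M_SUN / C**2 / 1.0e3


def curvature_scale(mass_msun: float, size_km: float) -> float:
    """Return kappa = sqrt(GM / (c^2 R^3)) in km^-1."""
    return math.sqrt(gravitational_radius_km(mass_msun) / size_km**3)


def compactness(mass_msun: float, size_km: float) -> float:
    """Return the dimensionless potential Phi = GM / (c^2 R)."""
    return gravitational_radius_km(mass_msun) / size_km


# Double pulsar: total mass 2.6 solar masses, orbital separation ~1e6 km.
kappa_dp = curvature_scale(2.6, 1.0e6)
phi_dp = compactness(2.6, 1.0e6)

# 10 solar mass black hole binary at merger, taking R = 2GM/c^2 (assumed).
r_bbh_km = 2.0 * gravitational_radius_km(10.0)
kappa_bbh = curvature_scale(10.0, r_bbh_km)
phi_bbh = compactness(10.0, r_bbh_km)

print(f"double pulsar: kappa = {kappa_dp:.1e} km^-1, Phi = {phi_dp:.1e}")
print(f"10 Msun BBH:   kappa = {kappa_bbh:.1e} km^-1, Phi = {phi_bbh:.2f}")
```

The two systems differ by about seven orders of magnitude in curvature and five in compactness, which is the gap between the filled circles and the ‘GW ground’ region of the figure.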

In addition to probing the strong field predictions of GR, the vast cosmological distances over which GWs travel (redshifts in excess of z ~ 20 both in the case of LISA and future ground-based detectors) will greatly constrain local Lorentz invariance and graviton mass99. Violations of Lorentz invariance or a non-zero graviton mass could cause dispersion in the observed waves and, hence, help to discover new physics predicted by certain quantum gravity theories. At the same time, propagation effects could also reveal the presence of large extra spatial dimensions that lead to different values for the luminosity distance to a source inferred by GW and EM observations (see refs115,116) or cause birefringence of the waves predicted in certain formulations of string theory, as discussed in refs117,118. In certain modified gravity theories, GWs have more than two polarizations (in contradiction with GR); the presence of such additional degrees of freedom could be explored by future detector networks99,119.

Equation of state of ultra-high density matter

Neutron star cores can reach densities many times that of nuclear saturation density (~2 × 10¹⁷ kg m⁻³), making them the regions of the highest matter density known in the Universe120. With colliding BNS (and neutron star–black hole binaries), one can probe the structure of cold, ultra-high-density matter121. Some 50 years after pulsars (neutron stars that beam pulsed radio signals towards Earth) were first discovered122, a good understanding of the physics of neutron stars is still lacking, especially regarding the composition of their inner core. It is possible that the core is simply a gigantic nucleus or, alternatively, a sea of free quarks and gluons. Nucleons in neutron star cores could undergo a phase transition to quark–gluon plasma123,124 at the super-high densities that might be found in the hyper-massive neutron stars125,126 that form in the aftermath of the merger of two neutron stars and live briefly before collapsing to a black hole. GWs emitted during the last few cycles of inspiral of neutron stars and the ringdown of the remnant carry the crucial signatures of the properties of their cores and hot dense matter equation of state.

A neutron star in a binary system can tidally deform its companion, which can cause the system to inspiral and merge more rapidly127. The tidal deformation is greater for neutron stars with larger radii or cores with stiffer equations of state, as in the case of hadronic matter. Conversely, neutron stars with cores composed of unbounded quark matter will have smaller radii and be harder to deform; thus, tidal effects would not alter the orbital evolution as much. The tidal effects are directly encoded in the emitted GWs, but the effect arises at fifth-post-Newtonian order (or at \(\mathcal{O}\left((v/c)^{10}\right)\) in the post-Newtonian expansion)127. Moreover, the post-merger phase could also produce GW emission from the (initially) highly deformed hyper-massive neutron star128; the spectral features of the emitted signal can be mapped directly to the equation of state of ‘hot’ dense matter, including possible phase transitions129,130,131.

GW170817 ruled out some of the stiffest equations of state and determined the radii of companions to be in the range 9–13 km (refs4,132). Third-generation ground-based observatories will constrain the radius to within a few hundred metres and help measure central densities and pressures in neutron stars92. They will also detect post-merger waveforms and provide constraints for ‘hot’ equations of state129,130,131 and possibly offer clues of quark deconfining phase transitions123,124.

Exploring dark matter properties with GW observations

Black hole and neutron star mergers could provide unique insight into the nature of dark matter133, another long-standing problem in astrophysics and cosmology. Much of what is known about dark matter comes from its gravitational influence on the dynamics of stars and galaxies and the CMB. Dark matter, in extensions of the standard model, is conceived to be composed of weakly interacting massive particles (WIMPs)134 or, possibly, extremely light particles, such as axions, which were proposed to solve the strong charge-parity (CP) problem in quantum chromodynamics (QCD)135.

Efforts are underway to make a direct detection of WIMP and axionic dark matter in laboratory experiments; however, these experiments have not been successful to date. GW observations could help infer the properties of dark matter in several ways, although any inference drawn will still be indirect evidence of their properties. For example, in the presence of a dark matter fluid, BBHs would experience a drag force that would alter their general relativistic orbital dynamics. This signature could be extracted from the frequency evolution of the observed GW136.

Dark matter could be composed, at least in part, of ultralight bosons such as QCD axions137, dark photons or other light particles138, spanning a wide mass range of 10⁻³³–10⁻¹⁰ eV (refs137,138). The Compton wavelength of ultralight bosons in the mass range 10⁻²⁰–10⁻¹⁰ eV corresponds to the horizon size of black holes of mass 10–10¹⁰ M⊙. Although these ultralight fields may not interact with other standard model particles, the equivalence principle implies that their gravitational interaction with, for instance, black holes could have observable consequences. For example, bosonic fields whose Compton wavelength matches the horizon scale of an astrophysical black hole could form bound states (often called ‘gravitational atoms’) around black holes and extract their rotational energy and angular momentum via the mechanism of superradiance139,140. This would result in a Bose–Einstein condensate that acts as a source of continuously emitted GWs. Ground-based detectors would explore the higher end of the mass range from 10⁻¹³ eV to 10⁻¹⁰ eV, which corresponds to QCD axions. LISA could explore the presence of even lighter bosonic fields.
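The correspondence between boson mass and black hole mass can be checked at the order-of-magnitude level. The sketch below is a rough illustration, not a superradiance calculation: it assumes the reduced Compton wavelength ħ/(mc) and equates it to the Schwarzschild radius 2GM/c², both of which are choices made here rather than definitions given in the text. The function names and the quoted constants (ħc ≈ 1.973 × 10⁻⁷ eV m, solar Schwarzschild radius ≈ 2.953 km) are likewise assumptions of the example.

```python
# hbar * c in eV m, and the Schwarzschild radius of the Sun in m.
HBAR_C_EV_M = 1.97327e-7
SCHWARZSCHILD_RADIUS_SUN_M = 2953.25


def reduced_compton_wavelength_m(boson_mass_ev):
    """Reduced Compton wavelength hbar / (m c) for a boson of rest energy m c^2 in eV."""
    return HBAR_C_EV_M / boson_mass_ev


def matching_black_hole_mass_msun(boson_mass_ev):
    """Black hole mass (in M_sun) whose Schwarzschild radius equals that wavelength."""
    return reduced_compton_wavelength_m(boson_mass_ev) / SCHWARZSCHILD_RADIUS_SUN_M


for boson_mass_ev in (1e-10, 1e-13, 1e-20):
    mass = matching_black_hole_mass_msun(boson_mass_ev)
    print(f"m = {boson_mass_ev:.0e} eV  ->  M ~ {mass:.1e} M_sun")
```

With the non-reduced wavelength h/(mc) the matching masses would be larger by a factor of 2π, so the estimate is only meant to show that 10⁻²⁰–10⁻¹⁰ eV maps onto the stellar-mass to super-massive black hole range.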

Primordial black holes have also been proposed to constitute dark matter and gained attention after LIGO’s first discovery of stellar black holes with unusually large masses141. Searches have also been performed for sub-solar-mass black holes, but no detection has been made so far, leading to some of the best upper limits on the fraction of dark matter in black holes of mass 0.2–1.0 M⊙ (ref.142). The existence of sub-solar-mass black holes would be considered to be a definitive proof that they were produced in the primordial Universe, as stellar evolution cannot produce black holes below about 3 M⊙.

Cosmology

From an observational cosmology point of view, the past few decades have witnessed a series of compounding problems that simply won’t go away, including the accelerated expansion of the Universe106,107, the tension between the local and early Universe measurements of the Hubble constant143 and the lack of direct observation of dark matter144. As we discuss below, these are each prime examples of outstanding puzzles in modern physics where GW observations are bound to lead to progress.

The reason why GW observations have the potential to make fundamental contributions to cosmology is that merging binary systems are ‘standard sirens’, that is, the signal contains direct information about the luminosity distance to the source53,145. This information can be extracted from the signal, provided the detector network can localize the sky position of the source and measure the polarization of the signal.

Hubble constant

LIGO and Virgo, together with EM observations, made their first measurement of the Hubble constant using the standard siren GW170817 (ref.54), and the measurement is continuously being refined146. Stand-alone measurements are also possible via a statistical association between a GW source and nearby galaxies147. The current ground-based network, recently augmented by KAGRA and later this decade by LIGO-India, is expected to improve the measurement of the Hubble constant to a precision of a few percent. This approach does not rely on astronomical distance ladders, so it provides an important check on systematic errors and other assumptions used in other methods148. GW measurements may help resolve the existing tension between the two principal Hubble constant measurement methods149,150, clarifying whether the tension is due to measurement issues or new physics.
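The logic of a standard-siren measurement can be sketched for a nearby source, where the Hubble law v = H₀d applies. Everything numerical below is hypothetical and chosen only for illustration: the distance, the recession velocity and the per-event uncertainty are round placeholder values, not measurements, and the 1/√N scaling assumes independent events of equal quality with no shared systematic errors.

```python
import math


def hubble_constant(recession_velocity_km_s, luminosity_distance_mpc):
    """Low-redshift Hubble law: H0 = v / d, in km/s/Mpc."""
    return recession_velocity_km_s / luminosity_distance_mpc


def events_needed(per_event_fractional_error, target_fractional_error):
    """Number of independent, equally precise events needed to reach a target
    precision, assuming the combined error scales as 1/sqrt(N)."""
    return math.ceil((per_event_fractional_error / target_fractional_error) ** 2)


# Hypothetical siren: GW amplitude gives d_L = 40 Mpc, the EM host gives v = 3000 km/s.
h0 = hubble_constant(3000.0, 40.0)
print(f"H0 = {h0:.0f} km/s/Mpc")

# Hypothetical: 15% uncertainty per event, 2% target.
print(f"events needed: {events_needed(0.15, 0.02)}")
```

The distance comes from the GW signal alone and the velocity from the EM counterpart, which is why no distance ladder enters the estimate.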

Dark energy

Big Bang cosmology is largely consistent with GR. However, the accelerated expansion of the Universe in its recent history cannot be explained by the theory, indicating that the theory fails, that our cosmological principles are too simple or that there exists an exotic form of matter–energy density, termed ‘dark energy’151. Many dedicated telescopes are being built to try to ascertain the nature of dark energy. GW observations offer an independent tool for understanding the acceleration and the nature of dark energy, via the ability of BNS and black holes to serve as GW ‘standard sirens’. LISA and 3G detectors will reach to higher redshifts, enabling them to measure the amount of dark energy and, possibly, the dark energy equation of state, even without counterpart identifications. The large population of binary mergers out to z = 3 or more that is expected to be detected by the 3G network will allow for a test of cosmological isotropy: do the distances to these events vary around the sky in a statistically significant way? On smaller angular scales, these distances will also allow for independent estimates of weak lensing, mapping the dark matter. The large variety of sources observed by LISA will provide different classes of standard sirens. Stellar black hole binaries at z ≲ 0.2 (ref.152), EMRIs at z ≲ 1 (ref.153) and SMBBHs up to z ≈ 10 (ref.154) will enable precision cosmology across the entire astrophysically relevant redshift range.

With a population of compact binary mergers observed with 3G detectors, and their redshifts obtained by follow-up EM observations, it will be possible to accurately measure cosmological parameters such as the dark matter and dark energy densities, and the equation of state of dark energy155, giving a completely independent and complementary measurement of the dynamics of the Universe.

Astrophysical and primordial stochastic backgrounds

The ASGWB that will be detected by PTAs also contains cosmological information. The properties of the ASGWB depend on the formation and evolution of cosmological source populations86. PTA measurements of the ASGWB produced by SMBBHs, the most promising GW source in that band, will constrain the evolution of the SMBHs that become quasi-stellar objects and active galactic nuclei (AGN). In addition, PTAs are sensitive to GWs produced by fundamental physical phenomena such as phase transitions in the early Universe, cosmic strings and inflation, all of which would provide unique windows into high-energy and early-Universe physics156,157,158. Finally, just as for ground-based and space-based detectors, EM counterparts to individual SMBBH systems will allow for new measurements of the Hubble constant.

GW astrophysics

Formation and evolution of compact stars

Binaries formed from pairs of compact stellar remnant objects such as white dwarfs, neutron stars or black holes are efficient emitters of GWs. As these binaries form through various channels and enter the GW-driven regime of their evolution, where GW radiation determines the orbital dynamics, they sweep up in frequency through first the mHz band of space-based detectors and, eventually, into the Hz to kHz band of ground-based interferometers159. The more massive systems (such as GW150914) among these binaries will first sweep through the LISA band, crossing to the ground-based frequency band a few years later. LISA will allow precise determination of the sky location and time of coalescence weeks in advance, making it possible to schedule massive and deep EM coverage of the sky at the time of merger. Such measurements can yield clues as to the likely progenitor systems and their evolution, providing an important constraint on models of stellar evolution. Detailed measurements of GWs from individual systems can also provide information about the internal structure of the objects involved, such as (discussed above) the equation of state of neutron stars involved in a neutron star–neutron star binary. Through interleaved operation and improvement of ground-based detectors, an ever larger statistical sample of black hole–black hole, neutron star–neutron star and neutron star–black hole binary coalescences will be observed, enabling the reconstruction of the formation and evolution of these systems along their cosmic history160,161.
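The statement that a heavy binary crosses from the LISA band to the ground-based band within a few years follows from the leading-order (quadrupole) expression for the time to coalescence of a circular binary, τ = (5/256) (GM_c/c³)^(−5/3) (π f)^(−8/3), where M_c is the chirp mass and f the GW frequency. The sketch below is illustrative: the formula is the standard textbook one rather than something given in this article, and the component masses (30 + 30 M⊙) and the 0.02 Hz starting frequency are assumed round values, not measured parameters of any event.

```python
import math

G_MSUN_OVER_C3_S = 4.925491e-6   # GM_sun / c^3 in seconds
SECONDS_PER_YEAR = 3.15576e7     # Julian year


def chirp_mass_msun(m1_msun, m2_msun):
    """Chirp mass M_c = (m1 m2)^(3/5) / (m1 + m2)^(1/5)."""
    return (m1_msun * m2_msun) ** 0.6 / (m1_msun + m2_msun) ** 0.2


def time_to_coalescence_s(chirp_mass, f_gw_hz):
    """Leading-order time to merger for a circular binary at GW frequency f_gw_hz."""
    chirp_time_s = G_MSUN_OVER_C3_S * chirp_mass
    return (5.0 / 256.0) * chirp_time_s ** (-5.0 / 3.0) * (math.pi * f_gw_hz) ** (-8.0 / 3.0)


# Assumed example: equal-mass 30 + 30 M_sun binary.
mc = chirp_mass_msun(30.0, 30.0)
years_from_lisa_band = time_to_coalescence_s(mc, 0.02) / SECONDS_PER_YEAR
seconds_from_10_hz = time_to_coalescence_s(mc, 10.0)

print(f"chirp mass: {mc:.1f} M_sun")
print(f"time to merger from 0.02 Hz: {years_from_lisa_band:.1f} yr")
print(f"time to merger from 10 Hz:   {seconds_from_10_hz:.1f} s")
```

The steep f^(−8/3) dependence is what separates the two regimes: years of warning in the mHz band against seconds of signal in the ground-based band.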

With the arrival of LISA in the 2030s, tens of thousands of BBH and BNS systems will be added to the existing catalogue162. The population of white dwarf–white dwarf binaries in the Milky Way will also be unveiled, enabling a range of astrophysical investigations, from the structure of our own galaxy to the connection between white dwarf–white dwarf binaries and type Ia supernovae. Beyond the Milky Way, hundreds of stellar-mass black hole binaries far from coalescence will be added to the event count, thus, providing precious complementary information to that gathered by ground-based detectors.

SMBH growth and evolution

SMBHs having masses in the range from 10⁶ to 10¹⁰ M⊙ and inhabiting the centres of galaxies also frequently form binaries by pairing with other compact objects163. This may happen in the aftermath of a galaxy–galaxy merger, when two SMBHs pair with each other, resulting in the formation of a SMBBH, or as a result of dynamical processes in dense stellar nuclei, in which the capture of a stellar remnant (black hole, neutron star or white dwarf) by a SMBH initiates an EMRI (ref.60). Both classes of sources are of capital importance in piecing together the puzzle of cosmic structure formation. Today, it is known that virtually every massive galaxy hosts a SMBH at its centre, that galaxies merge frequently, that protogalaxies were already in place at z > 10 and that quasars were already shining at z > 7. These pieces of evidence led to the framework of hierarchical structure formation, whereby galaxies grow by accreting gas cooling along the filaments of the cosmic web and by merging with other galaxies. LISA has the capability to detect mergers of black holes in the mass range 10³–10⁷ M⊙ out to their formation redshift, including for z > 20 for some mass ranges59. Only GW detectors can observe individual objects at such early times. SMBBH mergers trace the assembly of their hosts from the formation of the first protogalaxies following the dark ages, well beyond the epoch of reionization, up to now. By inferring the redshifts of these events from their luminosity distances, LISA can follow the evolution of large-scale structure over time and, by exploring the demographics of black hole seeds (including their masses and spins), LISA can test models of how early black holes grow into the massive black holes and SMBHs we see today in galaxies and quasars.

This information is complemented on the one hand by the observation of EMRIs up to z ≈ 1 and on the other hand by the PTA detection164 of the stochastic GW background produced by the most massive black hole binaries in the Universe. EMRIs can probe the population of inactive (thus, otherwise invisible) SMBHs, providing invaluable insights into the low-mass end of the SMBH mass function down to the mass scales of dwarf galaxies. The properties of individual inspirals (such as eccentricity and orbital inclination) carry information on the dynamical processes governing the evolution of dense relativistic systems, offering a unique laboratory for testing strong gravity.

At the super-massive end of the mass spectrum, PTAs are expected to reveal the cosmic population of inspiralling SMBBHs that inhabit the largest galaxies in the Universe82,164. These objects are in a frequency range inaccessible to LISA and ground-based detectors. Outstanding questions such as the precise occupation fraction of SMBHs in galaxies, the merger rate of galaxies, the relation between galaxy masses and the masses of the SMBHs they host, the efficiency of pairing of SMBHs and the nature of their dynamical interaction with the environments at the cores of galaxies will be answered by deciphering the information encoded in the amplitude and shape of the ASGWB spectrum88. The detection of the ASGWB will definitively resolve the ‘final parsec’ problem, proving that SMBHs can merge and possibly elucidating their dynamical interactions in the cores of galaxies. The dominant dynamical processes are expected to be the scattering of stars on orbits that intersect the galactic core or interactions with a circumbinary disk. PTAs probe frequencies at the interface between the environment-driven regimes (when the SMBHs are far apart) and GW-dominated regimes (when the SMBH separations are below a milliparsec). Each dynamical mechanism also predicts different inspiral timescales compared with estimates that assume GW-driven inspiral, so measuring the ASGWB spectrum can provide clear evidence of which dynamical processes dominate in these massive galaxy hosts. Individual SMBBH systems are expected to be detected after the detection of the ASGWB. Studies of individual systems, coupled with EM observations, will allow probing the astrophysical processes driving mergers even further, and determine how the importance of various processes depends on the properties of the galaxies88.

Multi-messenger astronomy with GWs

Dawn of a new multi-messenger era

The detection of GWs from the inspiral and merger of the first BNS system GW170817 (ref.4) marked the start of an era of multi-messenger astronomy incorporating GW observations5. The extensive multi-wavelength, multi-year follow-up campaign of GW170817 enabled the detection of counterparts in almost all the EM bands, confirming that the merger of a binary system of neutron stars powers high-energy transients, such as short gamma-ray bursts29,30,31,32,33,34,35,36,37 and kilonovae38,39,40,41,42,43,44,45,46. This unique multi-messenger detection5 showed the potential of multi-messenger astronomy impacting our knowledge of relativistic astrophysics36,37,165,166, radioactively powered transients, nucleosynthesis and heavy-element enrichment of the Universe47,48,49,50,51,52, and the physics of dense nuclear matter132,167,168,169,170,171. It also showed the importance of population studies required to disentangle the microphysics of the source and its interaction with the environment, from the source geometry and energetics. Increasing the number of joint detections will make it possible to determine the equation of state of neutron stars171, to probe the properties of different components of the mass ejected during and after the merger172,173,174, to understand if the BNS mergers are the primary channel of formation of heavy elements and the details of the nuclear physics relevant to nucleosynthesis175, and to understand the structure of the relativistic jets and the physics behind their formation176,177.

Multi-messenger facilities

During the current decade (2020–2030), the transient sky will be explored by new observatories and surveys, which will probe a range of frequency bands and timescales with better sensitivity than ever before. The improved sensitivity will be crucial to follow up GW signals coming from larger distances accessible by the upgrades of the LIGO, Virgo and KAGRA detectors, and the third generation of GW detectors. In the optical band, the Vera C. Rubin Observatory178 will commence operation in the early 2020s and serve as a unique resource for deep, multi-colour searches for optical counterparts over hundreds of square degrees. The James Webb Space Telescope (JWST)179 and a new generation of 30–40-m class telescopes, such as the European Southern Observatory Extremely Large Telescope (ESO-ELT)180, the Giant Magellan Telescope (GMT181) and the Thirty Meter Telescope (TMT)182, will allow characterization of the nature of the GW source following the temporal evolution and spectral properties of the faint emission through deep imaging and spectroscopy. The high angular resolution and sensitivity of these telescopes will enable probing the local environment of the source, the properties of the host galaxy and the possible presence of star clusters, providing insights into the formation and evolution of compact objects. The ultraviolet transient sky will soon be monitored by ULTRASAT183. Its unprecedented large field of view in the ultraviolet range and its rapid real-time response make it ideal for the follow-up of GW signals.

The high-energy sky is currently monitored by sensitive, large field-of-view survey instruments, including NASA’s Neil Gehrels Swift184 and Fermi185 satellites, the ESA’s INTErnational Gamma-Ray Astrophysics Laboratory (INTEGRAL) satellite186 and the Russian–German ‘Spectrum-Roentgen-Gamma’ mission eROSITA187. A number of missions are expected to be launched in the coming years, such as Einstein Probe188, eXTP189 and the Space-based multi-band astronomical Variable Objects Monitor ECLAIRs (SVOM-ECLAIRs)190, and some are envisioned, such as the mission concept All-sky Medium Energy Gamma-ray Observatory (AMEGO)191. The sensitivity of the Advanced Telescope for High ENergy Astrophysics (Athena) X-ray observatory192, which is expected to be launched around 2030, and the ambitious Lynx project193, proposed in the USA, will be of great value to detect fainter X-ray sources, such as the X-ray afterglow emission from relativistic jets observed off-axis. Mission concepts, such as the Transient High Energy Sky and Early Universe Surveyor (THESEUS194) and the Transient Astrophysics Probe (TAP195), are designed to have a unique combination of instruments for gamma-ray, X-ray and infrared transient detections to catch non-thermal and thermal emissions from GW sources.

In the radio band, the Square Kilometre Array (SKA)196 and the next-generation Very Large Array (ngVLA)197 will have unprecedented sensitivity, excellent angular resolution and faster survey speed, which will make them ideal for survey studies and transient detections. These new radio facilities will be capable of detecting possible prompt radio bursts198, as well as the signals produced by ultra-relativistic jets with timescales of weeks and by the sub-relativistic merger ejecta with timescales of a few years199.

Turning to particle detectors, the Cherenkov Telescope Array (CTA)200 will explore the GeV–TeV sky with a deeper sensitivity than previous instruments. The large field of view, the flexibility to map very large and arbitrary sky patches, and the rapid response time (within about 30 s) make CTA the ideal instrument to detect a possible very-high-energy gamma-ray counterpart of a GW signal. Joint GW–neutrino observations with IceCube and KM3NeT may reveal coincident emissions of high-energy neutrinos from BNS mergers or other energetic astrophysical phenomena201.

Probing SMBBH counterparts

Unlike stellar-mass black holes, SMBH coalescences resulting from the collision and merger of galaxies are expected to take place in environments with significant amounts of gas202. This leads to the exciting possibility of EM signals associated with LISA detections, although the exact nature of such signals is, as yet, unclear. Several counterparts, including precursors, prompt transients and afterglows have been proposed in the literature203, but pinning down the distinctive nature of the emission of SMBBH coalescences will require detailed general-relativistic magnetohydrodynamic simulations that include radiative transfer, and is currently an active area of theoretical and numerical research204,205. Detecting and identifying EM counterparts of SMBBH coalescences will be challenging. Because of its frequency response, the bulk of LISA events are expected to involve fairly light (<10⁶ M⊙), high-redshift (z > 3) systems, which will make it challenging to achieve a deep coverage on a typical deg² sky localization with LISA59. This challenge is coordinated by the ESA, which is planning a significant overlap between LISA and the upcoming X-ray satellite Athena206, with the goal of discovering X-ray signatures from merging SMBBHs207. The scientific payoffs of a coincident detection would be very valuable, enabling the study of the host environment, shedding light on the formation and evolution of SMBHs and their galaxies, allowing, for the first time, detailed studies of accretion physics on SMBHs of known masses and spins (extracted from the GW signal)208.

The large population of white dwarf–white dwarf binaries detected by LISA will provide a rich arena for multi-messenger studies209. Taking advantage of the complementary nature of EM and GW observations, it will be possible to reveal information about orbital geometries, object sizes and mass transfer. Unlike many other GW sources, the white dwarf binaries detected by LISA evolve slowly on human timescales, making them persistent multi-messenger sources. In fact, several dozen systems that LISA will observe with high SNR have already been observed electromagnetically210. The population of these known ‘verification binaries’ is growing through the work of the Zwicky Transient Facility211 and GAIA212, for example, and is expected to increase significantly with surveys such as the Vera C. Rubin Observatory213 before being greatly expanded by LISA itself.

At nanohertz frequencies, PTAs will enable the individual detection of several SMBBHs80,82 of M > 10⁹ M⊙ at z < 1 in their adiabatic inspiral phase. These are the most massive binaries in the low-redshift Universe, which can only be hosted in extremely massive galaxies214. It will be possible to rank the most likely hosts within the PTA localization area81 and use time-domain surveys (such as the Vera C. Rubin Observatory), as well as available spectroscopic observations, to look for periodic AGNs matching the period of the detected GW and search for other spectral signatures indicative of a possible binary215,216. The secure identification of counterparts will be critical to understand the distinctive signatures of SMBBHs, distinguishing them from regular AGNs. Those signatures can then be searched for in large AGN data sets to identify the expected much larger population of SMBBHs with periods larger than several years (emitting GWs below the frequency range probed by PTAs217). EM counterpart identification for SMBH coalescences will also enable the study of the host environments of SMBHs, shedding light on the formation and evolution of black holes and their galaxies.

With future radio telescopes such as the SKA and the ngVLA, improved timing precision will enable much more precise distance measurements of pulsars used in a PTA218. With distances known to within a GW wavelength, the so-called ‘pulsar term’ can be used to determine the position of a single GW source to arcsecond precision, hence, allowing multi-messenger follow-up observations219.

Future GW detectors

The ambitious GW science opportunities summarized in the previous section will be enabled in the coming decade by upgrades to the existing ground-based and PTA detectors, and in the next decade by completely new detectors capable of achieving significant sensitivity increases or, in the case of LISA, completely new observational bands. Below, we survey the needed advances in each detector waveband and discuss the prospects for addressing them.

Next-generation ground-based detectors

The present generation of observatories have arm lengths of 3 km (Advanced Virgo, KAGRA) and 4 km (Advanced LIGO). Given the fixed arm length L, any increase in strain sensitivity for these observatories will be accomplished through reducing displacement noise that limits the measurement precision of δL. Both Advanced LIGO and Advanced Virgo are implementing mid-scale upgrade programmes, designated as A+ and AdV+, that aim to improve the sensitivities of existing observatories by more than two times their current levels220,221. The primary planned upgrades include improved mirrors with lower thermal noise optical coatings, better squeezed light performance, including frequency-dependent optical squeezing222, and an improved GW readout method based on balanced homodyne detection223,224. LIGO-India, slated for operation late in this decade, is planned to come online in the A+ configuration.

Future ground-based detectors are targeting as much as a tenfold increase in sensitivity over the existing network. The key change will be an increase in the baseline arm length L. Two 3G detector concepts, ET in Europe and CE in the USA, are currently being pursued in parallel. ET is currently envisioned as an underground infrastructure in Europe housing three interferometers in a triangular configuration, each with 10-km arm lengths96. CE retains the ‘L’ interferometer configuration used in the current interferometers but increases the arm lengths to up to 40 km (ref.97). CE is planned for two-stage implementation: CE1 will primarily use tested Advanced LIGO technology (albeit with heavier mirrors and higher laser power), whereas CE2 may use newer technologies described in more detail below. An artist’s concept of ET is shown in Fig. 10a. The ET and CE configurations and lengths are shown for comparison in Fig. 10b. Additionally, external forces on the mirrors will be reduced, via better seismic filtering, reduced thermal noise and choice of site. Refinements to the sensing system will improve the ability to measure δL, allowing more precise measurements of the positions of the test mirrors and further improving the sensitivity of the detectors.

Fig. 10: Third-generation ground-based detectors.

a | Concept for the Einstein Telescope. A triangular underground installation with 10-km-long arms is planned. The influence of Newtonian noise is reduced with the underground location, and the multiple detectors in the triangular arrangement give a wide sensing bandwidth and polarization resolution. b | A comparison of the Einstein Telescope and Cosmic Explorer detector configurations. For comparison, the current Advanced LIGO detector is also shown. Panel a courtesy of the ET design study team.

As compellingly demonstrated by the detection of the BNS merger GW170817 (ref.4), multi-messenger astronomy is one of the driving scientific motivations for building 3G detectors. The angular response (or antenna pattern) of a single interferometer to a GW is essentially omnidirectional18. To sufficiently resolve the sky location of a GW source, a network of widely separated observatories is needed. As a standalone detector, ET has some capability to identify the sky positions of transient GW sources by virtue of its ability to resolve polarization (through its triangular geometry and multiple co-located interferometers), whereas a single CE has very little capability to determine position. To localize a large fraction of sources within z ~ 1.5 with ≤10 deg2 resolution, at least three interferometers operated as a single, global-scale detector are needed.
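The near-omnidirectional response of a single interferometer can be illustrated with the standard long-wavelength antenna pattern functions for an L-shaped detector with arms along the x and y axes. These expressions are the usual textbook ones and are not given in this article; the function names and grid size are choices made for the sketch. Here θ and φ are the polar and azimuthal angles of the source and ψ is the polarization angle.

```python
import math


def antenna_patterns(theta, phi, psi):
    """Plus and cross responses of an L-shaped interferometer (arms along x and y)."""
    cos_theta = math.cos(theta)
    a = 0.5 * (1.0 + cos_theta * cos_theta) * math.cos(2.0 * phi)
    b = cos_theta * math.sin(2.0 * phi)
    f_plus = a * math.cos(2.0 * psi) - b * math.sin(2.0 * psi)
    f_cross = a * math.sin(2.0 * psi) + b * math.cos(2.0 * psi)
    return f_plus, f_cross


def sky_averaged_power(n=200):
    """Average of F+^2 + Fx^2 over the sky, using a midpoint grid that is
    uniform in cos(theta) and phi (that is, uniform on the sphere)."""
    total = 0.0
    for i in range(n):
        theta = math.acos(-1.0 + (i + 0.5) * 2.0 / n)
        for j in range(n):
            phi = (j + 0.5) * 2.0 * math.pi / n
            f_plus, f_cross = antenna_patterns(theta, phi, 0.0)
            total += f_plus * f_plus + f_cross * f_cross
    return total / (n * n)


overhead = antenna_patterns(0.0, 0.0, 0.0)                     # source at zenith
null = antenna_patterns(math.pi / 2.0, math.pi / 4.0, 0.0)     # in-plane, along arm bisector
average_power = sky_averaged_power()

print(f"overhead: F+ = {overhead[0]:.2f}, Fx = {overhead[1]:.2f}")
print(f"null direction: F+ = {null[0]:.1e}, Fx = {null[1]:.1e}")
print(f"sky-averaged F+^2 + Fx^2 = {average_power:.3f}")
```

The combined power F₊² + F×² is independent of ψ, peaks overhead and vanishes only along four isolated in-plane directions, so a single detector hears most of the sky but cannot tell where a signal came from. This is why timing between widely separated sites is needed for localization.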

Designing 3G interferometers, housed in suitable observatories and capable of meeting the long-term ambitious science goals presented in the previous section, requires not only specifying a number of key detector design parameters but also understanding the interdependencies and design trade-offs to be made.

Interferometer arm length

Increasing the baseline L by lengthening the interferometer arms is, perhaps, as the strain equation h = δL/L implies, the most straightforward path to improved sensitivity. However, increasing the arm length places demands on other detector design aspects. For example, larger beam sizes (due to diffraction of the laser beam) require larger diameter mirrors, placing more stringent requirements on the mirror substrate material and the reflective coating in terms of homogeneity, uniformity and surface figure error. Moreover, when the arm length L approaches half the GW signal wavelength \(\lambda = c/f_{\mathrm{GW}}\), where \(f_{\mathrm{GW}}\) is the GW frequency, the sensitivity is reduced. For targeting the detection of the post-merger ringdown of a BNS collision, the sensitivity of the detector at 4 kHz is important; a 40-km-long arm length reduces the sensitivity for these signals (due to the presence of a null in sensitivity caused by the 3.75 kHz free spectral range in a 40-km cavity) and a shorter arm length would be better for this particular science goal. In addition, costs for the vacuum system and facility (on the Earth’s surface or underground) housing the detector scale approximately linearly with length and must be taken into consideration. Models that combine astrophysical goals and measurement limitations have been developed to determine optimal arm lengths97.
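Both arm-length effects described above reduce to simple arithmetic. The sketch below evaluates the free spectral range c/(2L), which is also the frequency at which L equals half the GW wavelength, and the arm-length change δL = hL for a given strain. The function names are hypothetical and the strain amplitude of 10⁻²¹ is an assumed illustrative value, not a number taken from the text.

```python
SPEED_OF_LIGHT_M_S = 299_792_458.0


def free_spectral_range_hz(arm_length_m):
    """Free spectral range of an arm cavity, c / (2 L)."""
    return SPEED_OF_LIGHT_M_S / (2.0 * arm_length_m)


def arm_length_change_m(strain, arm_length_m):
    """Arm-length change delta_L = h * L produced by a strain h."""
    return strain * arm_length_m


ASSUMED_STRAIN = 1.0e-21  # illustrative amplitude, not taken from the text

for arm_length_km in (4.0, 10.0, 40.0):
    arm_length_m = arm_length_km * 1.0e3
    fsr = free_spectral_range_hz(arm_length_m)
    delta_l = arm_length_change_m(ASSUMED_STRAIN, arm_length_m)
    print(f"L = {arm_length_km:4.0f} km: FSR = {fsr / 1e3:6.2f} kHz, delta_L = {delta_l:.1e} m")
```

For a 40-km arm the free spectral range comes out at about 3.75 kHz, matching the value quoted above and falling just below the 4 kHz region relevant to BNS post-merger signals, whereas for a 4-km arm it lies near 37 kHz, well outside the band of interest.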

Mirror substrate material and temperature

Lowering the mirror temperature can reduce thermal noise both through direct scaling with \(\sqrt{kT}\) (here, k is the Boltzmann constant and T the temperature of the mirror) and through the temperature dependence of the mechanical loss, especially in crystalline materials. Although room-temperature fused silica is used in most current GW detectors, its performance degrades when cooled. Potential optical substrate materials for future detectors include crystalline silicon and sapphire, with mirror masses of up to 320 kg currently envisioned to suppress QRPN. Growing single-crystal substrates that combine high homogeneity, low optical absorption and low internal stress birefringence will require significant research and development. Ultralow temperatures ≤5 K offer the highest possible reduction in thermal noise; however, the engineering challenges are formidable. As one example, a heat link will be required to extract heat (deposited by the laser interferometer sensing system) from the mirror without introducing excessive displacement noise or thermal noise, or otherwise compromising design requirements. Silicon may have an advantage in that its coefficient of thermal expansion exhibits a zero crossing at 123 K (ref.225), thus reducing thermo-elastic noise226 and, equally important from an operational standpoint, minimizing thermal lensing induced by heating from absorption of the laser light. In addition, it is feasible to extract heat via radiative cooling at 123 K (as the environment can be engineered to be significantly colder than the substrate), possibly alleviating the need for physically contacted heat links.
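To give a sense of scale for the direct temperature scaling, the sketch below compares thermal-noise amplitudes at different mirror temperatures. It assumes the noise amplitude scales as the square root of temperature times mechanical loss; the 295 K reference temperature and the `loss_ratio` parameter are assumptions introduced here for illustration, and real loss values depend strongly on the material.

```python
import math


def relative_thermal_noise(temp_k: float,
                           ref_temp_k: float = 295.0,
                           loss_ratio: float = 1.0) -> float:
    """Thermal-noise amplitude relative to a reference temperature.

    Assumes amplitude ~ sqrt(T * loss). `loss_ratio` is the mechanical loss
    at `temp_k` divided by the loss at `ref_temp_k` (1.0 = unchanged loss).
    """
    return math.sqrt((temp_k / ref_temp_k) * loss_ratio)


if __name__ == "__main__":
    for temp in (295.0, 123.0, 5.0):
        print(f"T = {temp:5.1f} K -> amplitude x {relative_thermal_noise(temp):.3f}")
```

With unchanged loss, cooling from 295 K to 123 K alone lowers the amplitude by roughly a third, so most of any larger gain has to come from lower mechanical loss at low temperature.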

Mirror coatings

Future interferometers will also require reduced coating thermal noise11 to achieve their ultimate sensitivity, with the aim of a tenfold reduction over the current state of the art. Recent research on ion beam sputtered amorphous coatings has produced reductions by factors of a few at the 1.06-μm wavelength used in today’s detectors227, with evidence that better performance may be achieved out to 2 μm. Small-aperture crystalline coatings have also shown promise228 but suffer from increased optical loss and challenges in scaling up to large apertures. Coating thermal noise appears to be the noise source least susceptible to a straightforward engineering solution. Furthermore, scaling the fabrication process up to large-aperture optics will require significant engineering development effort.

Laser wavelength

Silicon as a substrate material is being explored for both ET and CE. However, silicon is opaque at 1.06 μm (the wavelength used in current detectors), necessitating the use of wavelengths between 1.5 and 2.1 μm to be able to transmit light through Fabry–Perot input cavity mirrors, beamsplitters and auxiliary optics. Longer wavelengths have advantages and disadvantages. They are less susceptible to optical scatter and loss, which is important for harnessing and preserving the full impact of squeezed quantum states. There may also be evidence of lower coating mechanical loss at longer wavelengths. Conversely, longer wavelengths place tighter requirements on interferometry (the ability to ‘split a fringe’). In addition, the diffraction-limited beam size grows with the wavelength (as √λ for a fixed arm length), requiring larger aperture mirrors. Developing suitably large diameter (≈80 cm) and massive silicon substrates with low levels of absorption is a key technology goal for future generation instruments. Finally, frequency-dependent squeezing229 will be used to further minimize shot noise and QRPN. The requisite non-linear optical devices (such as phase modulators) exist at 1.06 μm but are in need of significant development at longer wavelengths.
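The effect of wavelength on beam size can be estimated with a standard Gaussian-beam result: in a symmetric confocal two-mirror cavity, the geometry that minimizes the spot on the mirrors, the beam radius at each mirror is w = sqrt(λL/π). The sketch below evaluates this lower bound; the confocal geometry and the 40-km arm length are assumptions made for illustration, and practical designs use other cavity geometries with larger spots.

```python
import math


def min_mirror_spot_radius_m(wavelength_m: float, arm_length_m: float) -> float:
    """Smallest 1/e^2 beam radius on the mirrors of a symmetric cavity.

    Uses the confocal-resonator result w = sqrt(lambda * L / pi).
    """
    return math.sqrt(wavelength_m * arm_length_m / math.pi)


if __name__ == "__main__":
    arm = 40e3  # assumed arm length in metres
    for wavelength in (1.064e-6, 1.55e-6, 2.0e-6):
        w_cm = min_mirror_spot_radius_m(wavelength, arm) * 100.0
        print(f"lambda = {wavelength * 1e6:.3f} um -> w >= {w_cm:.1f} cm")
```

Because the radius grows as the square root of the wavelength, moving from 1.064 μm to 2 μm enlarges the minimum spot by a factor of about 1.4, and the mirror diameter must be several times the beam radius to keep clipping losses small.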

Laser power

Shot noise is the dominant limit to interferometer sensitivity at high frequencies in a simple interferometer, and it can be reduced by increasing the laser power circulating in the arms. The choice of mirror substrate material dictates the choice of laser wavelength, which in turn constrains the laser technologies that can be used. Lasers operating at one micron can be scaled to 500 W with the required frequency/intensity stability and mode quality230; however, development work is needed to achieve the required power and stability for lasers operating in the 1.5–2.1 μm range.

Low-frequency performance

The low-frequency observing cut-off is a critical parameter for 3G detectors. Extending sensitivity below 10–20 Hz (the low-frequency limit of the current generation of detectors) will enable detections of intermediate-mass black hole mergers in the range of 10²–10⁴ M⊙ and shed light on how heavier black holes form. Mirror suspension systems currently have fundamental resonances in the 1-Hz range; future detectors’ suspensions must push to lower resonant frequencies for greater isolation in the 1–10-Hz range. Suspension stages may need to be made from silicon or sapphire, which can be cooled to reduce thermal noise. Minimizing Newtonian noise necessitates finding a site with low environmental noise. An underground location (as is planned for ET) should reduce the Newtonian noise, although it requires care in preserving the quiet environment through observatory design. In addition, Newtonian noise subtraction approaches231 need to be designed and tested to reach the planned factor of ten reduction to meet ET and CE performance goals.
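The benefit of a lower suspension resonance can be illustrated with the textbook response of an ideal, undamped single-stage pendulum, whose ground-to-mirror transmissibility is 1/|1 − (f/f0)²|. This is a minimal sketch under that assumption; real suspensions are damped, multi-stage systems, and the frequencies below are illustrative.

```python
def pendulum_transmissibility(f_hz: float, f0_hz: float) -> float:
    """Ground-to-mass motion ratio of an ideal undamped pendulum.

    Valid away from resonance (f_hz != f0_hz); above resonance it falls
    approximately as (f0 / f)^2.
    """
    ratio_sq = (f_hz / f0_hz) ** 2
    return 1.0 / abs(1.0 - ratio_sq)


if __name__ == "__main__":
    for f0 in (1.0, 0.5, 0.25):
        for f in (3.0, 10.0):
            t = pendulum_transmissibility(f, f0)
            print(f"f0 = {f0:4.2f} Hz, f = {f:4.1f} Hz -> transmissibility {t:.2e}")
```

Halving the resonant frequency improves the isolation well above resonance by roughly a factor of four per stage, which is the motivation for softer suspensions.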

Observatory network configuration

Multi-messenger astronomy requires accurate and relatively precise localization of GW events. The current network has four km-scale observatories in operation: LIGO Hanford, LIGO Livingston, Virgo and KAGRA. The addition of LIGO-India later this decade will further improve the sky localization capability18. Detailed studies of networks232,233 show that a third detector in the Southern Hemisphere to complement ET and CE (needed for detecting the majority of the events within z ~ 1.5 with error boxes less than 10 deg²) would form a powerful array, and studies are ongoing in Australia for possible implementation.

Interferometer vacuum systems

Observatory vacuum systems are a critical infrastructure component for 3G observatories. The laser light used to probe the arm lengths must travel in an ultrahigh vacuum to avoid path-length fluctuations due to the polarizability of residual gas, and the beam tube must not introduce scattered light. As it is currently envisioned, CE will require two 40-km beam tubes, of 1.2 m diameter, at pressures less than 10⁻⁹ torr, with stringent requirements on partial pressures of molecular hydrogen, water and select hydrocarbons. As the vacuum system comprises much of the cost of a 3G observatory, ‘value engineering’ these systems is a high priority. Efforts are already underway exploring the use of low-carbon steel and nested vacuum systems234.

Civil engineering

Whether the new observatories are nominally on the surface of the Earth (such as CE) or underground (such as ET), there will be significant costs associated with the civil engineering. Site location, acquisition and preparation can present significant practical challenges, and can impact the configuration of a worldwide array.

Addressing these primary challenges (and many other challenges not discussed here) will require a sustained and globally coordinated R&D programme to be undertaken before ET and CE conceptual designs can be finalized. To have operating 3G detectors in the 2030s, facility construction should commence in this decade, requiring R&D and prototyping efforts currently underway to be ramped up significantly. Several efforts are of sufficiently large scale that industrial partnerships will be essential to succeed. Examples include the development of mirror optical coatings and mirror substrates with the requisite optical and mechanical properties. Also essential in the near term is the development of a prototype interferometer test bed for interferometry and laser development at the laboratory scale.

Over the longer term, upgrading one or more of the existing LIGO facilities to a full km-scale interferometer using 3G technologies is a particularly appealing path to a full-scale 3G interferometer network, in that it will both test 3G technologies almost ‘at scale’ and deliver a detector with considerably more sensitivity than the current second-generation detectors. The LIGO Voyager detector design98 is being explored as a possible upgrade to the existing LIGO observatories late in this decade. Voyager can potentially achieve a twofold sensitivity increase when compared against Advanced LIGO Plus. In addition, the Neutron star Extreme Matter Observatory (NEMO, ref.235) has been proposed in Australia as a 4-km observatory targeting neutron star GW astrophysics, aiming to have sensitivity comparable with ET and CE at frequencies above 2 kHz.

Future space-based detectors

Space-based detectors measure differential strain using an approach that is similar to that of their terrestrial counterparts. The primary difference between space-based designs such as LISA (Fig. 11 and ref.58) and terrestrial interferometers is the size of the baselines: L is currently 3 or 4 km for existing ground-based designs, with plans for 10 or 40 km for future designs, versus L ~ 2.5 × 10⁶ km for the LISA design.

Fig. 11: The LISA mission.
figure 11

a | Three spacecraft, in a triangular stable orbit, are separated by 2.5 × 10⁶ km. Laser beams measure between each spacecraft in both directions; from these six legs, distortions due to passing gravitational waves in both polarizations can be resolved. b | A 1-AU orbit is chosen, and the constellation scans the sky once per year. Figure courtesy of the LISA Consortium.

The longer baselines have two effects. First, the displacement sensitivity that the interferometric metrology system needs in order to achieve an equivalent GW strain sensitivity is roughly a factor of a million less stringent for space-based interferometers than for ground-based interferometers. Second, the optimal response to GWs is shifted to the mHz frequency band, roughly the inverse of the light travel time across the ~10⁶-km baselines. These differences change the nature of the noise sources that limit the instrument’s response. Many of the fundamental limitations on measurement that must be aggressively addressed in ground-based interferometers, such as seismic noise and thermal noise in optics and coatings, are not an issue for space-based interferometers owing to the larger size of the displacements. However, the longer wave periods (hours to seconds) and increased duration of the signals (hours to years) demand that the designers of space-based interferometers be more concerned with the very-low-frequency gravitational and EM stochastic forces due to thermal drifts. Furthermore, the space-based instruments must operate in, and protect their test masses from, the harsh environment of space (caused by the exposure to various types of radiation, as well as large thermal gradients produced by asymmetric solar illumination of the spacecraft) and without direct human intervention.
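The factor of roughly a million follows from the strain relation h = δL/L applied to the two baselines. The sketch below evaluates the arm-length change for a common strain amplitude; the value h = 10⁻²¹ and the constant names are assumptions chosen only for illustration.

```python
GROUND_BASELINE_M = 4.0e3   # 4-km ground-based arm
LISA_BASELINE_M = 2.5e9     # 2.5 x 10^6 km LISA arm


def arm_length_change_m(strain: float, baseline_m: float) -> float:
    """Arm-length change produced by a strain h over a baseline L: dL = h * L."""
    return strain * baseline_m


if __name__ == "__main__":
    h = 1e-21  # illustrative strain amplitude (assumption)
    ground = arm_length_change_m(h, GROUND_BASELINE_M)
    space = arm_length_change_m(h, LISA_BASELINE_M)
    print(f"ground: {ground:.1e} m, space: {space:.1e} m, ratio: {space / ground:.2e}")
```

For this strain, the displacement is of order 10⁻¹⁸ m on the ground and 10⁻¹² m in space, which is the scale difference that relaxes the space-based metrology requirement.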

Aside from these differences between ground-based and space-based GW interferometers, the fundamental limitations on sensitivity are the same: stray forces on the test masses and a limited precision in the measurement of the separation of the test masses. The first limit is how well the test masses, which act as fiducial inertial test particles, approximate an inertial frame. This can be characterized by the spectrum of residual acceleration from non-gravitational forces on the test mass. For first-generation space-based interferometers, the required level of acceleration noise is on the order of femto-g (1 g = 9.81 m s⁻²). The specific requirement for LISA is for the residual acceleration of each test mass due to non-gravitational forces to be less than 3 × 10⁻¹⁵ m s⁻² Hz⁻¹ᐟ² in the mHz band, with relaxations at both higher and lower frequencies58. When expressed as an acceleration, this is similar to the stability of the test masses in the second-generation ground-based interferometers that made the historic first detections of GWs.
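The acceleration requirement can be put in more familiar units with two short conversions: into units of g, and into an equivalent displacement noise using x = a/(2πf)², which holds for a sinusoidal component at frequency f. This is a sketch; the choice of 1 mHz as the evaluation frequency is an assumption within the quoted mHz band.

```python
import math

G0 = 9.81                    # m s^-2, as defined in the text
LISA_ACCEL_ASD = 3.0e-15     # m s^-2 Hz^-1/2, requirement quoted in the text


def accel_in_femto_g(accel_asd: float) -> float:
    """Convert an acceleration noise density to femto-g per sqrt(Hz)."""
    return accel_asd / G0 / 1e-15


def equivalent_displacement_asd(accel_asd: float, f_hz: float) -> float:
    """Displacement noise density (m/sqrt(Hz)) equivalent to an acceleration
    noise density at frequency f: x = a / (2 pi f)^2."""
    return accel_asd / (2.0 * math.pi * f_hz) ** 2


if __name__ == "__main__":
    print(f"{accel_in_femto_g(LISA_ACCEL_ASD):.2f} femto-g per sqrt(Hz)")
    x = equivalent_displacement_asd(LISA_ACCEL_ASD, 1e-3)
    print(f"equivalent displacement at 1 mHz: {x:.1e} m per sqrt(Hz)")
```

The steep 1/f² conversion shows why a given acceleration noise matters most at the lowest frequencies of the measurement band.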

A practical consequence of working in space is that the strength of the local gravitational field, which is ~1 g for ground-based interferometers, is significantly lower, typically limited by the mass distribution of the spacecraft itself (and, perhaps, nearby celestial bodies, depending on the orbit). This allows a conceptually simple technique to be used to realize the low-disturbance test mass: simply let it go and let it drift. Forces from sunlight and residual particles in the space environment can be avoided by allowing the test mass to drift within a shielded housing inside the spacecraft. To prevent the spacecraft from running into the test mass, and to limit disturbances from electrostatic and gravitational couplings between the spacecraft and the test mass, a control system known as drag-free control will be used to precisely steer the spacecraft to follow the test mass236. Drag-free control was incorporated into early design concepts for space-based interferometers, and was demonstrated237,238 by the LPF spacecraft in 2016. At the core of the LPF were two gravitational reference sensors (GRS), each consisting of a test mass, an electrostatic sensing and control assembly, and a surrounding vacuum enclosure. The test masses were cubes of a gold-platinum alloy, approximately 4 cm on a side and with a mass of ~2 kg (Fig. 12). These were separated from their hollow cubic housing by gaps of a few millimetres. Electrodes on the inner faces of the housing allowed the position and orientation of the test mass to be sensed in all six kinematic degrees of freedom. These same electrodes could be used to apply forces and torques to the test mass electrostatically. The two GRS assemblies were placed approximately 38 cm apart, with an optical bench placed between them.
This optical bench was used to perform a series of interferometric measurements to determine the relative position of the two test masses and the position of one test mass relative to the spacecraft. During the LPF tests, the spacecraft operated in drag-free mode around one of the two test masses, with the second test mass electrostatically suspended as a witness. Although the design requirements for the LPF were deliberately relaxed from those of LISA, the LPF performed far better than the Pathfinder requirements (Fig. 12), and was able to demonstrate a performance that was significantly better than the LISA requirements63,237.

Fig. 12: LISA Pathfinder technology and performance.
figure 12

The left panels show the launch of the LISA Pathfinder (LPF) mission in December 2015 (panel a), the LPF core assembly (panel b), a close-up of the LPF optical bench used to read out the relative test mass displacement (panel c), and one of the two test masses in the core assembly (panel d). e | The measured acceleration noise of the LPF mission. The vertical scale is the amplitude spectral density of the relative acceleration noise between the two gravitational reference sensor test masses. The LPF demonstrator performed far better than its requirements and exceeded the LISA flight requirements in the relevant low-frequency regime. Panel a © ESA-Stephane Corvaja. Panels b and c courtesy of Airbus Defence and Space GmbH, Friedrichshafen, Germany © Airbus/Ruediger Gerndt. Panel d courtesy of OHB-Italia. Panel e adapted from ref.63, CC BY 4.0 (https://creativecommons.org/licenses/by/4.0/).

Relative to ground-based interferometers, the scale of the GW-induced displacements in space-based interferometers is significantly larger (~10⁻¹² m for space-based versus ~10⁻¹⁸ m for ground-based). As a result, the interferometric techniques for space are comparatively simple: there is no need for resonant cavities and complex interferometer topologies, such as the power-recycling and signal-recycling cavities shown in Fig. 4. However, the measurement must be made over long distances and between separate platforms with large relative motion. For LISA, the relative velocity between each pair of spacecraft is several m s⁻¹ over the separation distance of ~2.5 × 10⁶ km. The first challenge is simply getting enough light from one spacecraft to another. Using a 30-cm-aperture telescope to both transmit and receive the beam on each spacecraft, diffraction over these baselines results in a reduction in power by a factor of roughly 10¹⁰ at the collection point. Reaching a shot-noise-limited displacement noise in the picometre range requires that roughly 1 W of 1-μm laser light leaves the telescope. This is a moderately high level for space-qualified continuous-wave lasers and likely requires the use of a multi-stage laser system. Because of these large diffraction losses, there is no hope of reflecting the light back to the originating spacecraft, as would be done in a classical Michelson interferometer. Instead, a triangular constellation of three identical satellites is used. Each spacecraft simultaneously transmits its own laser signal while receiving signals from each of the other two in the constellation. The incoming and outgoing beams are combined to interfere on an optical bench, resulting in a fringe pattern that encodes information about each laser’s intrinsic frequency noise, the orbital motion of the spacecraft and small fluctuations from passing GWs. These fringe patterns are tracked and recorded on each of the three satellites and transmitted to the ground.
The set of measurements is then combined using a technique called time-delay interferometry (TDI), in which linear combinations of suitably time-shifted signals are formed, suppressing the intrinsic phase noise of the lasers while retaining the GW information239,240.
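The idea behind TDI can be illustrated with a toy model of an unequal-arm Michelson interferometer. The sketch below is a strong simplification and not the LISA processing chain: it assumes static arms, round-trip (rather than one-way) measurements, integer-sample delays and a single shared laser, and all names and parameter values are hypothetical. It shows that subtracting copies of each arm's signal delayed by the other arm's round-trip time removes the common laser noise that a plain arm difference leaves behind.

```python
import math
import random


def delay(x, n):
    """Delay a sampled series by n samples, zero-padding the start."""
    return [0.0] * n + x[: len(x) - n]


def rms(values, start):
    """Root-mean-square of values[start:]."""
    tail = values[start:]
    return math.sqrt(sum(v * v for v in tail) / len(tail))


def toy_tdi(n_samples=4000, d1=160, d2=167, seed=1):
    """Return (rms of plain arm difference, rms of TDI combination,
    rms of TDI combination minus the expected signal term).

    d1 and d2 are the unequal round-trip light times of the two arms,
    expressed in samples.
    """
    rng = random.Random(seed)
    # Laser phase noise: large, and common to both arms.
    p = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]
    # Tiny sinusoidal "GW" phase signal, placed in arm 1 only for simplicity.
    s = [1e-6 * math.sin(2.0 * math.pi * i / 500.0) for i in range(n_samples)]

    # Round-trip measurements: returning (delayed) light compared with the local laser.
    y1 = [pd - p0 + si for pd, p0, si in zip(delay(p, d1), p, s)]
    y2 = [pd - p0 for pd, p0 in zip(delay(p, d2), p)]

    # Plain difference: laser noise survives because d1 != d2.
    plain = [a - b for a, b in zip(y1, y2)]
    # TDI-style combination: subtract copies delayed by the other arm's round trip.
    x = [(a - b) - (ad - bd)
         for a, b, ad, bd in zip(y1, y2, delay(y1, d2), delay(y2, d1))]
    # Signal term expected to survive in the combination: s(t) - s(t - d2).
    expected = [si - sd for si, sd in zip(s, delay(s, d2))]
    residual = [xi - ei for xi, ei in zip(x, expected)]

    start = d1 + d2  # skip samples affected by zero-padding
    return rms(plain, start), rms(x, start), rms(residual, start)


if __name__ == "__main__":
    plain_rms, tdi_rms, residual_rms = toy_tdi()
    print(f"plain difference rms: {plain_rms:.3e}")
    print(f"TDI combination rms:  {tdi_rms:.3e}")
    print(f"TDI minus signal rms: {residual_rms:.3e}")
```

In this toy model the laser noise cancels to numerical precision, leaving only a differenced copy of the signal. In a real constellation the arm lengths change with time and the delays are not integer multiples of the sampling interval, so the cancellation is only approximate and requires more elaborate combinations.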

The basic principles of the LISA displacement measurement have been validated through a combination of numerical experiments241 and demonstrations using laboratory analogues242,243. More recently, an opportunity to validate several key components of the LISA metrology system in flight has been realized with the Laser Ranging Interferometer (LRI) on board the GRACE Follow-On (GRACE-FO) spacecraft, which measures the distance between two satellites via optical interferometry. Several of the LRI parameters, including the delivered light powers, Doppler rates and laser phase noise, are similar to those for LISA. The first LRI results demonstrate nanometre-level ranging over the 210-km link, meeting the design goals of the LRI and coming within a few orders of magnitude of the LISA requirements, despite the LRI operating with a single link and, therefore, being limited by laser frequency noise244.
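The LISA link budget described above (roughly 1 W of 1-μm light sent through a 30-cm telescope over 2.5 × 10⁶ km) can be checked to order of magnitude. The sketch below assumes identical, uniformly illuminated circular apertures in the far field, which gives an upper bound on the collected fraction, and uses the approximate shot-noise expression (λ/2π)·sqrt(photon energy / received power). Both are simplifying assumptions: a truncated Gaussian beam and optical losses reduce the received power further, and factors of order unity are ignored.

```python
import math

H_PLANCK = 6.62607015e-34  # J s
C = 299_792_458.0          # m/s


def received_fraction(aperture_m: float, wavelength_m: float, distance_m: float) -> float:
    """Far-field fraction of transmitted power collected by an identical
    aperture, for uniform illumination: (A / (lambda * L))^2."""
    area = math.pi * aperture_m ** 2 / 4.0
    return (area / (wavelength_m * distance_m)) ** 2


def shot_noise_displacement(received_power_w: float, wavelength_m: float) -> float:
    """Approximate shot-noise-limited displacement noise in m/sqrt(Hz)."""
    photon_energy = H_PLANCK * C / wavelength_m
    return wavelength_m / (2.0 * math.pi) * math.sqrt(photon_energy / received_power_w)


if __name__ == "__main__":
    wavelength = 1.064e-6   # m, assumed value for the "1-um" laser
    fraction = received_fraction(0.30, wavelength, 2.5e9)
    power_rx = 1.0 * fraction  # 1 W leaving the telescope
    dx = shot_noise_displacement(power_rx, wavelength)
    print(f"collected fraction: {fraction:.1e}")
    print(f"received power:     {power_rx:.1e} W")
    print(f"shot-noise limit:   {dx:.1e} m per sqrt(Hz)")
```

Under these idealized assumptions only a few parts in 10¹⁰ of the transmitted power are collected, and the resulting shot-noise limit lands in the picometre range, consistent with the figures quoted for LISA.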

The current planned launch date for LISA is 2034, to match the ESA’s schedule for its ‘L’ (Large) missions. The ESA and its European and US partners are currently engaged in a concerted effort to complete the development of technologies that were not demonstrated on the LPF or GRACE-FO or that require minor modifications. These developments will be completed by the time the mission enters its implementation phase, planned for the mid-2020s. Although an on-time launch of LISA appears feasible, LISA may not be the sole member of the first generation of space-based GW observatories. Several Chinese-led efforts are currently in development, including the TianQin (ref.245) concept for a geocentric constellation and the Taiji (ref.246) concept for a heliocentric constellation, both of which use an architecture similar to LISA, but with somewhat different instrumental parameters.

Reaching sensitivities beyond this first generation of observatories will require additional investments in basic research. Concepts for second-generation space-based GW observatories include ones targeting frequency bands above247 and below that of LISA. The DECi-hertz Interferometer Gravitational wave Observatory (DECIGO)248,249 is a future Japanese space mission with a frequency band of 0.1–10 Hz, mainly aimed at the detection of primordial GWs. DECIGO consists of three spacecraft, which form three Fabry–Perot Michelson interferometers with an arm length of 1,000 km. Concepts targeting frequencies below the LISA band250, to look at even more massive systems, have also been discussed, as well as detectors similar to LISA with increased sensitivity251. An alternative would be the deployment of networks of space-based observatories, which could greatly improve angular resolution and enable further multi-messenger science252. The specific technology required would depend on the science target and mission architecture. Examples include more powerful yet stable lasers; large but still dimensionally stable telescopes; inertial sensors with even better performance than those flown on the LPF; or, perhaps, more compact and affordable elements that would allow large networks of space-based interferometers to be deployed.

Future PTAs

In the near term, improvements to the PTA network will focus on receivers with much wider radio bandwidths, which are being developed and deployed on several telescopes253. These will increase sensitivity dramatically by allowing for better removal of interstellar propagation effects and by integrating more of the emitted radio signals. Wider bandwidth receivers will also allow for greater observing efficiency, obviating the need for observations at multiple frequencies. These new facilities and instruments will greatly enhance the sensitivity of the individual and global PTAs, but will also complicate the already difficult task of merging the heterogeneous data from all of the individual telescopes in the array; due to this difficulty, the current best IPTA limits on the stochastic GW background are less constraining than the best individual PTA limit86.

Significant human and computational resources will be needed to make the IPTA project a success. Great progress has been made in developing new GW detection algorithms that properly account for a number of sources of noise in PTA data. As an example, planetary ephemerides have been found to be too inaccurate for PTA experiments, but a new software package can properly model these ephemeris errors while performing the GW searches86,254.

Other issues that must be carefully considered include long-term pulse profile evolution, the ability to accurately model changes in the electron column density along the line of sight, and pulse jitter (pulse-to-pulse deviations from the average). Pulse jitter is pulsar-dependent and defines a minimum dwell time, regardless of telescope sensitivity, to achieve a given timing precision. For the celebrated bright MSP PSR J0437–4715, observations of less than an hour can never achieve residuals below 40 ns. Others, such as PSR J1909–3744, have much lower levels of jitter, nearer 10 ns. Timing arrays are starting to factor jitter limits into their observing plans so that large-aperture facilities are not ‘wasted’ on targets that do not benefit from increased SNRs.
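The link between jitter and dwell time can be made concrete if jitter is assumed to average down as the inverse square root of the number of pulses, and hence of the observing time. The sketch below uses the 40 ns in one hour quoted for PSR J0437–4715 as the reference point; the square-root averaging and the function names are assumptions introduced here, not a description of any PTA pipeline.

```python
def jitter_rms_ns(obs_time_s: float,
                  ref_rms_ns: float = 40.0,
                  ref_time_s: float = 3600.0) -> float:
    """Jitter-limited timing rms after obs_time_s, assuming 1/sqrt(T) averaging."""
    return ref_rms_ns * (ref_time_s / obs_time_s) ** 0.5


def dwell_time_s(target_rms_ns: float,
                 ref_rms_ns: float = 40.0,
                 ref_time_s: float = 3600.0) -> float:
    """Minimum observing time needed to reach a target jitter-limited rms."""
    return ref_time_s * (ref_rms_ns / target_rms_ns) ** 2


if __name__ == "__main__":
    for target in (40.0, 20.0, 10.0):
        hours = dwell_time_s(target) / 3600.0
        print(f"target {target:4.1f} ns -> at least {hours:.0f} h on source")
```

Under this assumption, halving the jitter-limited residual costs four times the observing time, independent of telescope aperture, which is why a larger dish does not help once a pulsar is jitter-limited.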

Looking further into the future (see, for example, ref.255), the SKA and the proposed ngVLA will further enhance detection and science prospects, and China has other large-aperture radio telescopes planned, such as the 110-m Xinjiang QTT. Existing timing limits are near where many models predicted the stochastic background might be, and there is a good chance that an individual source may be separable from the background within the next decade. The major threat to PTA science is the increasingly crowded radio spectrum: satellites, aircraft and ground-based transmitters now make heavy use of the once sparsely populated 300-MHz to 3-GHz band for navigation and communications.

Conclusions

In just a few years of using instruments capable of recording the waveforms of signals, ground-based GW observatories have made seminal contributions to the fields of GR, fundamental physics and astrophysics. The multi-messenger characterization of the first observed BNS coalescence dramatically enhanced our understanding of extreme states of nuclear matter and the astrophysics of kilonovae.

The scientific potential for the field of GW science in the next few decades is considerable, afforded by the prospects of upgrades to existing observatories in this decade and the construction or launch of new observatories in the 2030s. There are clear paths to improvements in both the sensitivity of the instruments and the range of frequencies. For ground-based detectors, sensitive to astrophysics of up to ~1,000-M⊙ compact objects, the network of detectors of 3-km and 4-km scale will both improve and grow in the coming decade, and the future planned ET and CE observatories offer the possibility of a quieter environment, implementation of detectors of greater complexity and longer arms. Hence, all stellar-mass coalescing BBH systems in the Universe will be within detection reach.

The LISA space-based detector will deliver sensitivity to signals from SMBHs, providing exquisite resolution of signal waveforms and completing the survey of the Universe for binaries up to some 10⁶ M⊙. The PTAs will continue to evolve with new antenna networks, more sensitive and wider-band receivers, and the discovery of additional pulsar ‘clocks’, providing unique information on the dynamics of the very largest galaxies in the Universe. Together with EM and particle detectors, these instruments will provide new quantitative and qualitative insights into physics, astrophysics, cosmology and astronomy. GW detectors have, indeed, opened a new window onto the Universe.