Skip to article frontmatterSkip to article content

Power spectrum recovery

Gaussian and lognormal fields

We begin by testing the consistency of our Gaussian and lognormal maps generation pipeline.

Comparison of a Gaussian map, on the left, with a lognormal map on the right. Both maps arise from the same random seed. The colorbar has been adjusted to enhance the differences between the two. The histogram plot shows the clear difference in the map distributions.

Figure 9:Comparison of a Gaussian map, on the left, with a lognormal map on the right. Both maps arise from the same random seed. The colorbar has been adjusted to enhance the differences between the two. The histogram plot shows the clear difference in the map distributions.

We show example realisations of the two fields in Fig. 9, Gaussian on left and lognormal on the right. There’s a visible difference between the two, as it can be seen clearly from the distribution plot. The main check to perform is for testing whether the generated fields recover the theoretical power spectrum. Fig. 10 shows that this is the case for the Gaussian fields. They recover the fiducial C(l)C(l) within a few percent error, with larger deviations 5%\sim 5\% at the low and high ends of the ll-range. Instead, lognormal fields present deviations 10%\gtrsim 10\%. As the lognormal transformations we use have been reported by different sources Zhou et al. (2023)Boruah et al. (2022), the issue must lie with our JAX implementation of the Hankel transformation. Resolving such issues could be achieved by future iterations of this work. In this work, we restrict ourselves to the use of Gaussian fields, as it is enough to prove our thesis and show that Gaussian processes can be applied to cosmological fields.

Power spectrum estimation from Gaussian and lognormal maps. Mean and standard deviation are calculated with 500 realisation of both fields.

Figure 10:Power spectrum estimation from Gaussian and lognormal maps. Mean and standard deviation are calculated with 500 realisation of both fields.

Gaussian process priors

First, we test the ability of the kernels we have built in Sec. ?? to recover the power spectrum of our cosmology.

Reconstructed power spectrum from prior sample of GP with the four proposed kernels: integration, FFTlog, full-range FFT, half-range FFT and sinc FFTlog. Mean and standard deviation are calculated with 500 prior samples from each GP.

Figure 11:Reconstructed power spectrum from prior sample of GP with the four proposed kernels: integration, FFTlog, full-range FFT, half-range FFT and sinc FFTlog. Mean and standard deviation are calculated with 500 prior samples from each GP.

We test five models in Fig. 11: integration, FFTlog, full-range FFT, half-range FFT and sinc FFTlog. The first four methods are described in the Gaussian process kernels Sec. ??, whereas the sinc FFTlog referes to a FFTlog model on which we applied smoothing, by multiplying the power spectrum by a factor of sinc4(lL2πN)sinc^4(l\frac{L}{2\pi N}). The recovered power spectra are plotted against the fiducial power spectrum, or smoothed power spectrum for the sinc FFTlog. Mean and standard deviation associated to the plots are calculated from 500 samples. As expected the integration, FFTlog and full-range FFT perform similarly, as they all contain the same ammount of information. As these models deviate so strongly from the fiducial power spectrum we tried applying smoothing, which helps to recover half of the ll-range at large scales. The only method that seems to be consistently recovering the fiducial power spectrum is the half-range FFT. One could argue that due to the inherent discreteness and boundedness of the fields we are working with, using FFTs is the most natural choice; also, half-range FFT uses the only grid that recovers a correlation function of the same shape as the field without having to perform binning.

Reconstructed power spectrum from prior samples of a GP, as a function of \{\sigma_8, S_8\}. Mean and standard deviation are calculated with 500 prior samples for each different cosmology.

Figure 12:Reconstructed power spectrum from prior samples of a GP, as a function of {σ8,S8}\{\sigma_8, S_8\}. Mean and standard deviation are calculated with 500 prior samples for each different cosmology.

We have also tested the efficacy of the half-range FFT model for different cosmologies of values {σ8,S8}\{\sigma_8, S_8\} equal to {0.4,0.2}\{0.4,0.2\}, {1.2,1.5}\{1.2,1.5\} and, our fiducial cosmology, {0.8,0.8}\{0.8,0.8\}. As Fig. 12 shows, the model is independent of the choice of cosmology. From here on the results will be presented assuming a kernel built with the half-range FFT model.

Gaussian process map reconstruction

Armed with a reliable kernel, let’s embark upon the journey of reconstructing a heavily masked cosmological field. What we will do is: create a noiseless GRF in the fiducial cosmology Tab. %s, True map; apply a mask to obtain the Data map; condition a Gaussian process which assumes the fiducial cosmology. Fig. 13 lists the result of this operation, showing the resulting mean μ\mu and standard deviation σ\sigma of the conditioned GP. We also plot the ratio between residuals Δ=μ\Delta=\mu-True and standard deviation squared, to test the goodness of fit of our model, the values of the map sum up to χ22495\chi^2 \sim 2495. With the mask covering ν=2353\nu=2353 pixels, we obtain χ2/ν=1.06\chi^2 / \nu = 1.06. Of course, this is just a noiseless application, which is unreasonable for a real application.

Summary of field reconstruction abilities of a Gaussian process conditioned on data. The left column shows the masked GRF, which is our data. The middle column shows the true GRF without masks and a posterior sample drawn from the conditioned GP. The right column shows maps of the mean, standard deviation and residuals over standard deviation squared of the conditioned GP. Regions of higher uncertainty correspond to the masked regions. The residuals over standard deviation map also shows how regions with low mask recover the data.

Figure 13:Summary of field reconstruction abilities of a Gaussian process conditioned on data. The left column shows the masked GRF, which is our data. The middle column shows the true GRF without masks and a posterior sample drawn from the conditioned GP. The right column shows maps of the mean, standard deviation and residuals over standard deviation squared of the conditioned GP. Regions of higher uncertainty correspond to the masked regions. The residuals over standard deviation map also shows how regions with low mask recover the data.

Inference of cosmological parameters

To test the ability of Gaussian processes to recover cosmological parameters without any prior knowledge except a noisy and masked map, we perform a MCMC simulation to infer the posterior distributions of σ8\sigma_8 and S8S_8. We use the convention

S8=σ8Ωm0.3,S_8 = \sigma_8 \sqrt{\frac{\Omega_m}{0.3}},

to infer deterministically a posterior for Ωm\Omega_m. Such a reparametrisation is needed due to the strong degeneracy between σ8\sigma_8 and Ωm\Omega_m. Eq. (30) breaks this degeneracy, changing the geometry of the sampling space and making the sampling more consistent. The model assumes uninformed flat priors for the cosmological parameters, as shown in Tab. %s, such prior bounds are also in accordance with the jaxcosmo release Campagne et al. (2023). The likelihood of the model is given by a Gaussian process distribution conditioned on Data, with a standard deviation equal to the noise applied to the map. The analysis is coded with numpyro Phan et al. (2019) Bingham et al. (2019), using a the No-U-Turn Sampler (NUTS) method with max_tree_depth=16, target_accept_prob=0.8. We simulate 8 chains for the 32×3232\times32 grid and 4 chains for the 64×6464\times64. Each chain performs 1000 warmup steps and 3000 samples.

One parameter

As a first step and for a consistency check, we run the inference model for one cosmological parameter, keeping all others fixed. Using a 64×6464\times64 grid with ng=10 galaxies/arcmin2n_g=10 \text{ galaxies}/\text{arcmin}^2. In Fig. 15 we show the inferred distribution for both σ8\sigma_8 and Ωm\Omega_m. We find that we are able to recover the true value for both parameters within two sigmas, σ8=0.776±0.015\sigma_8 = 0.776\pm0.015 and Ωm=0.284±0.010\Omega_m = 0.284\pm0.010. We notice a slight tendency of the inferred distribution to be biased low; a tendency we also observe next for both sampled parameters, S8S_8 and σ8\sigma_8.

Inferred posterior distribution of \sigma_8 on the left and \Omega_m on the right. Dotted lines indicate the 1\sigma level. Truth values corresponding to the fiducial cosmology are indicated in blue.

Figure 15:Inferred posterior distribution of σ8\sigma_8 on the left and Ωm\Omega_m on the right. Dotted lines indicate the 1σ1\sigma level. Truth values corresponding to the fiducial cosmology are indicated in blue.

Two parameters

Effect of noise

Table 1:List of inferred cosmological parameters inferred by the model with a small 32×3232\times32 grid and for a fixed true GRF realisation. We present the cosmological parameters inferred as we increase the noise level, corresponding to ng=4n_g = 4, 10, 30 and 100 galaxies/arcmin2100 \text{ galaxies}/\text{arcmin}^2.

σnoise\sigma_\text{noise}0.00690.00440.00250.0014
S8S_80.716±0.0430.716\pm0.0430.733±0.0400.733\pm0.0400.747±0.0410.747\pm0.0410.752±0.0420.752\pm0.042
σ8\sigma_80.645±0.1570.645\pm0.1570.628±0.1580.628\pm0.1580.632±0.1660.632\pm0.1660.631±0.1700.631\pm0.170
Ωm\Omega_m0.423±0.1640.423\pm0.1640.469±0.1750.469\pm0.1750.485±0.1860.485\pm0.1860.497±0.1930.497\pm0.193

We perform some tests on low resolution 32×3232\times32 grids to see the effect that the noise level has on the recovered parameters, see Tab. 1. Here we report the inferred cosmological parameters for one data realisation and different noise levels, corresponding respectively to ng=4n_g = 4, 10, 30 and 100 galaxies/arcmin2100 \text{ galaxies}/\text{arcmin}^2, see Eq. (22). The inferred value of S8S_8 can vary as much as a full σ\sigma between high and low noise runs. Keeping in mind that σ8\sigma_8 and Ωm\Omega_m are extremely unreliable due to relative uncertainties of 2530%\sim 25-30\% caused by the degeneracy: as a general trend we notice Ωm\Omega_m gets bigger when σ8\sigma_8 gets smaller with less noise.

Inferred cosmological parameters

Running the model for a larger 64×6464\times64 grid with ng=10 galaxies/arcmin2n_g=10 \text{ galaxies}/\text{arcmin}^2, gives much better constraints on the cosmological parameters. We present the values recovered by the posterior distributions, listed as follows in Tab. 2.

Table 2:Mean and sigma values recovered from the inferred distributions of the cosmological parameters.

S8S_8σ8\sigma_8Ωm\Omega_m
0.762±0.0280.762\pm0.0280.745±0.1510.745\pm0.1510.353±0.1430.353\pm0.143

Fig. 16 shows the inferred posterior distributions and contours for the three cosmological parameters σ8\sigma_8, Ωm\Omega_m and S8S_8. Looking at the contours, we obtain the well known banana-shaped degeneracy between σ8\sigma_8 and Ωm\Omega_m. The S8S_8 and Ωm\Omega_m contour presents sharp cuts for high and low Ωm\Omega_m, indicating an issue with the bounds of the uniform priors imposed. Unfortunately the jaxcosmo package does not allow for the choice of priors to be wider than what shown in Tab. %s, as the model then starts to have divergent samples.

Inferred posterior distributions of S_8, \sigma_8 and \Omega_m. For noise level \sigma_\text{noise}\sim 0.0088. Contours indicate the 1\sigma and 2\sigma credible interval respectively. Dotted lines indicate the 1\sigma level. Truth values corresponding to the fiducial cosmology are indicated in blue.

Figure 16:Inferred posterior distributions of S8S_8, σ8\sigma_8 and Ωm\Omega_m. For noise level σnoise0.0088\sigma_\text{noise}\sim 0.0088. Contours indicate the 1σ1\sigma and 2σ2\sigma credible interval respectively. Dotted lines indicate the 1σ1\sigma level. Truth values corresponding to the fiducial cosmology are indicated in blue.

Posterior checks

Following the two parameter inference model, we perform some posterior checks at the map level Porqueres et al. (2021). Fig. 18 sums up the ability of the model to recover the true map, noiseless and unmasked. Here we present the run with noise level σnoise0.0088\sigma_\text{noise}\sim 0.0088 and a 64×6464\times64 grid. We show the mean and standard deviation for the sample with highest likelihood out of the 12000. The mean field μ\mu is visibly different to the true field in the masked regions and it seems to be of overall lower amplitude. The sample map is comparable to the noisy data; which is to be expected, as the internal noise given to the Gaussian process is the same as the noise level of the data. The standard deviation map σ\sigma presents an overall amplitude comparable to the noise level 0.010\sim 0.010, with higher values for the masked regions. Summing up the map values of the residuals divided by standard deviation squared, we obtain a χ21297.4\chi^2\sim 1297.4. Compared to the number of free parameters ν\nu in our inference model, which for a 10%10\% mask and a 64×6464\times64 grid, is ν=3689\nu=3689. The value of χ2\chi^2 therefore seems to be low, indicating that the noise level assumed by the GP is overestimated. This is supported by the fact that the sample map looks just as noisy as the data, according to Fig. 17, its distribution is in fact just as wide as the noise.

Residual distributions of the mean and sample compared to noise. The mean is less spread, whereas the sample is wider.

Figure 17:Residual distributions of the mean and sample compared to noise. The mean is less spread, whereas the sample is wider.

Residuals

Summary of the two parameter inference at the map level. The left column shows the masked and noisy GRF realisation used, which is our data. The middle column shows the true GRF and a sample from the conditioned GP. The right column shows maps of the mean, standard deviation and residuals over standard deviation squared resulting from the numpyro model sample with highest likelihood. Regions of higher uncertainty correspond to the masked regions.

Figure 18:Summary of the two parameter inference at the map level. The left column shows the masked and noisy GRF realisation used, which is our data. The middle column shows the true GRF and a sample from the conditioned GP. The right column shows maps of the mean, standard deviation and residuals over standard deviation squared resulting from the numpyro model sample with highest likelihood. Regions of higher uncertainty correspond to the masked regions.

References
  1. Zhou, A. J., Li, X., Dodelson, S., & Mandelbaum, R. (2023). Accurate field-level weak lensing inference for precision cosmology.
  2. Boruah, S. S., Rozo, E., & Fiedorowicz, P. (2022). Map-based cosmology inference with lognormal cosmic shear maps.
  3. Campagne, J.-E., Lanusse, F., Zuntz, J., Boucaud, A., Casas, S., Karamanis, M., Kirkby, D., Lanzieri, D., Peel, A., & Li, Y. (2023). JAX-COSMO: An End-to-End Differentiable and GPU Accelerated Cosmology Library. The Open Journal of Astrophysics, 6. 10.21105/astro.2302.05163
  4. Phan, D., Pradhan, N., & Jankowiak, M. (2019). Composable Effects for Flexible and Accelerated Probabilistic Programming in NumPyro. arXiv E-Prints, arXiv:1912.11554. 10.48550/arXiv.1912.11554
  5. Bingham, E., Chen, J. P., Jankowiak, M., Obermeyer, F., Pradhan, N., Karaletsos, T., Singh, R., Szerlip, P. A., Horsfall, P., & Goodman, N. D. (2019). Pyro: Deep Universal Probabilistic Programming. J. Mach. Learn. Res., 20, 28:1-28:6. http://jmlr.org/papers/v20/18-403.html
  6. Porqueres, N., Heavens, A., Mortlock, D., & Lavaux, G. (2021). Bayesian forward modelling of cosmic shear data. Monthly Notices of the Royal Astronomical Society, 502(2), 3035–3044. 10.1093/mnras/stab204