SCUBIDO

Simulating Climate Using Bayesian Inference from proxy Data Observations

This vignette is used to explain in greater detail the work flow of the SCUBIDO R package.

The aim of this package is to create a reconstruction of palaeoclimate given multivariate μXRF-CS data from lake sediments using Bayesian inference. More detaols on the mathematics behind the model can be found in Boyall et al (in prep).

It is vital that each of the 3 functions are used for this package otherwise the package will fail. Therefore this vignette walks you each function.

This package depends on having the JAGS (just another gibbs sampler). It is integral that you download this software prior to installing SCUBIDO otherwise the package will not laod. Use the link here: https://sourceforge.net/projects/mcmc-jags/

Introduction

The first thing to do it load in the SCUBIDO package. If SCUBIDO is not already installed then do so by loading the devtools package and then installing from Github.

devtools::install_github("LauraBoyall/SCUBIDO") # if not previously installed
library(SCUBIDO)

Loading in your data

The SCUBIDO package requires two datasets, the first being some modern data modern_data with the most recent part of the μXRF-CS data which has an overlap with some instrumental climate data. For example it may be a section from 1950 to present. It is important for this data set that the age (in years CE) is in the first column (from the oldest to the youngest), the climate data is in the second and then the remaining columns are the individual elements. We recommend here that your μXRF-CS data is central-log ratio transformed.

Please not that your climate variable must be in anomalies as we use a grid of -3 to +3°C for the model to pick values from. This current version of SCUBIDO does not currently have an approach to reconstruct beyond these temperatures using the package functions. If users are reconstructing climate and believe that the values are going to be greater than this range then we refer them to the R file in the GitHub repository to change the grid manually.

An example of the structure data set is displayed below using a fake data set:

#>  Age_CE Temperature_Anomaly   S  Si   K  Ca  Ti   M   V  Rb  Sr
#>    1890         -0.28023782 239 785 986 354 237 845 471 924 274
#>    1891         -0.11508874 962   9 137 366 686 260 366 543 594
#>    1892          0.77935416 601 779 905 287 226  23 121 852 160
#>    1893          0.03525420 515 729 576  80 318 862  47 584 853
#>    1894          0.06464387 403 630 395 365 174 335 263 668 848
#>    1895          0.85753249 880 481 450 178 801 632 969 511 478

The second data set required is the fossil_data. This data is the remainder of your μXRF-CS data and follows the same structure as the modern but the second column which held the climate value is not there. Therefore you should have one less column in your table. Example below:

#>    Age_BP   S  Si   K  Ca  Ti   M   V  Rb  Sr
#>  1888.641 600 239 785 986 354 237 845 471 924
#>  1887.832 333 962   9 137 366 686 260 366 543
#>  1887.364 489 601 779 905 287 226  23 121 852
#>  1886.469 954 515 729 576  80 318 862  47 584
#>  1885.522 483 403 630 395 365 174 335 263 668
#>  1885.381 890 880 481 450 178 801 632 969 511

Once the data is set up we can begin by using the SCUBIDO package!

SCUBIDO_input

The SCUBDIO_input function is used to sort your data and split create a data list to be used later in the model. It is important that you save the outputs of the function as shown below as these are needed in the following function.

the summary print out shows what is saved within the sorted list. This function also re-scales the μXRF-CS data and creates a few time files which help with later application.

sorted_data <- SCUBIDO_input(modern_data = modern_data, fossil_data = fossil_data)
summary(sorted_data)
#>            Length Class  Mode   
#> time_m       75   -none- numeric
#> temp_m       75   -none- numeric
#> xrf_m_resc  975   -none- numeric
#> time_f      570   -none- numeric
#> xrf_f      7410   -none- numeric
#> xrf_f_resc 7410   -none- numeric
#> time_grid  2183   -none- numeric
#> time_fm     645   -none- numeric
#> time_all   2826   -none- numeric

SCUBIDO_cal

Once the data has been sorted we can now begin to model the data! This first function involves the calibration modern data. Within this stage we learn about the relationship between the individual elements and climate, but also about the covariance between them.

Here we use the saved list from the SCUBIDO_cal function. Once this function has run it will save a model file to your working directory. This uses the system date to allow the script to load previous models. If you would like to upload the model on another day then you should go to the saved file and change the date in the file name as the function will otherwise re-start the model and save with today’s date.

#{r calibration, message=FALSE,warning=FALSE}

calibrated_data <- SCUBIDO_cal(sorted = sorted_data, plot = TRUE, summary = FALSE)

Here we would like to ensure that the model has fully converged and a simple way of checking this is through looking at the R-hat values. You can read more about what R-hat values mean by checking out Gelman and Rubin (1992) and Brooks and Gelman (1998). However in summary your R-hat values should be <1.05. You can find this on the print out after this code has run on the far left side of the table if you choose summary = TRUE.

If you choose to also do plot = TRUE then you will get a series of graphs showing the quadratic relationship between the XRF elements and the climate time series. All of the outputs for this will be stored in your calibrated file, for our example it will be stored in calibrated_data and the sims.list list stored inside this will have all of the model outputs if you would like to plot differently / explore the model and then the sorted list stored in the calibrated_data contains the data from the previous function.

SCUBIDO_reconstruct

The final function in SCUBIDO is SCUBIDO_reconstruct and this will produce the final reconstruction of quantitative climate given μXRF-CS data. Like usual we we load in the previous function’s file (calibrated_data).

Note that this function does take a long time, it will ask you if you wish to continue but please be aware that for thousands of date layers (rows in your xrf data) then it is possible that it could take up to 24 hours (why not run it over the weekend!). This is based on running it on a single core so if you can use parallel cores then use this.


reconstruction <- SCUBIDO_reconstruct(calibration_data = calibrated_data, plot_graph = TRUE)

Whilst this is the default plot the function produces, you can easily extract the resulting data frame and plot in whichever way you wish.

final <- reconstruction$reconstruction # note that we have saved our reconstruction as 'reconstruction' however you may have chosen a different name!

# write.csv(final, "Final Climate Reconstruction.csv)

And there you have it! By this point you would have been able to reconstruct annual climate given μXRF-CS data! If you have any queries please contact Laura Boyall at Laura.boyall.2016@live.rhul.ac.uk

You must cite the original publication (Boyall et al., in prep) if you use this package to publish any work.

Happy modelling!!