Simulating Climate
Using Bayesian
Inference from proxy Data
Observations
This vignette is used to explain in greater detail the work flow of
the SCUBIDO
R package.
The aim of this package is to create a reconstruction of palaeoclimate given multivariate μXRF-CS data from lake sediments using Bayesian inference. More detaols on the mathematics behind the model can be found in Boyall et al (in prep).
It is vital that each of the 3 functions are used for this package otherwise the package will fail. Therefore this vignette walks you each function.
This package depends on having the JAGS (just another gibbs sampler).
It is integral that you download this software prior to
installing SCUBIDO
otherwise the package will not laod. Use
the link here: https://sourceforge.net/projects/mcmc-jags/
The first thing to do it load in the SCUBIDO
package. If
SCUBIDO
is not already installed then do so by loading the
devtools
package and then installing from Github.
The SCUBIDO
package requires two datasets, the first
being some modern data modern_data
with the most recent
part of the μXRF-CS data which has an overlap with some instrumental
climate data. For example it may be a section from 1950 to present. It
is important for this data set that the age (in years
CE) is in the first column (from the oldest to the
youngest), the climate data is in the second and then the
remaining columns are the individual elements. We recommend here that
your μXRF-CS data is central-log ratio transformed.
Please not that your climate variable must be in
anomalies as we use a grid of -3 to +3°C for the model to pick
values from. This current version of SCUBIDO
does not
currently have an approach to reconstruct beyond these temperatures
using the package functions. If users are reconstructing climate and
believe that the values are going to be greater than this range then we
refer them to the R file in the GitHub repository to change the grid
manually.
An example of the structure data set is displayed below using a fake data set:
#> Age_CE Temperature_Anomaly S Si K Ca Ti M V Rb Sr
#> 1890 -0.28023782 239 785 986 354 237 845 471 924 274
#> 1891 -0.11508874 962 9 137 366 686 260 366 543 594
#> 1892 0.77935416 601 779 905 287 226 23 121 852 160
#> 1893 0.03525420 515 729 576 80 318 862 47 584 853
#> 1894 0.06464387 403 630 395 365 174 335 263 668 848
#> 1895 0.85753249 880 481 450 178 801 632 969 511 478
The second data set required is the fossil_data
. This
data is the remainder of your μXRF-CS data and follows the same
structure as the modern but the second column which held the climate
value is not there. Therefore you should have one less column in your
table. Example below:
#> Age_BP S Si K Ca Ti M V Rb Sr
#> 1888.641 600 239 785 986 354 237 845 471 924
#> 1887.832 333 962 9 137 366 686 260 366 543
#> 1887.364 489 601 779 905 287 226 23 121 852
#> 1886.469 954 515 729 576 80 318 862 47 584
#> 1885.522 483 403 630 395 365 174 335 263 668
#> 1885.381 890 880 481 450 178 801 632 969 511
Once the data is set up we can begin by using the
SCUBIDO
package!
The SCUBDIO_input
function is used to sort your data and
split create a data list to be used later in the model. It is important
that you save the outputs of the function as shown below as these are
needed in the following function.
the summary print out shows what is saved within the sorted list. This function also re-scales the μXRF-CS data and creates a few time files which help with later application.
sorted_data <- SCUBIDO_input(modern_data = modern_data, fossil_data = fossil_data)
summary(sorted_data)
#> Length Class Mode
#> time_m 75 -none- numeric
#> temp_m 75 -none- numeric
#> xrf_m_resc 975 -none- numeric
#> time_f 570 -none- numeric
#> xrf_f 7410 -none- numeric
#> xrf_f_resc 7410 -none- numeric
#> time_grid 2183 -none- numeric
#> time_fm 645 -none- numeric
#> time_all 2826 -none- numeric
Once the data has been sorted we can now begin to model the data! This first function involves the calibration modern data. Within this stage we learn about the relationship between the individual elements and climate, but also about the covariance between them.
Here we use the saved list from the SCUBIDO_cal
function. Once this function has run it will save a model file to your
working directory. This uses the system date to allow the script to load
previous models. If you would like to upload the model on another day
then you should go to the saved file and change the date in the file
name as the function will otherwise re-start the model and save with
today’s date.
#{r calibration, message=FALSE,warning=FALSE}
calibrated_data <- SCUBIDO_cal(sorted = sorted_data, plot = TRUE, summary = FALSE)
Here we would like to ensure that the model has fully converged and a
simple way of checking this is through looking at the R-hat values. You
can read more about what R-hat values mean by checking out Gelman and
Rubin (1992) and Brooks and Gelman (1998). However in summary your R-hat
values should be <1.05. You can find this on the print out after this
code has run on the far left side of the table if you choose
summary = TRUE
.
If you choose to also do plot = TRUE
then you will get a
series of graphs showing the quadratic relationship between the XRF
elements and the climate time series. All of the outputs for this will
be stored in your calibrated file, for our example it will be stored in
calibrated_data
and the sims.list
list stored
inside this will have all of the model outputs if you would like to plot
differently / explore the model and then the sorted
list
stored in the calibrated_data
contains the data from the
previous function.
The final function in SCUBIDO is SCUBIDO_reconstruct
and
this will produce the final reconstruction of quantitative climate given
μXRF-CS data. Like usual we we load in the previous function’s file
(calibrated_data
).
Note that this function does take a long time, it will ask you if you wish to continue but please be aware that for thousands of date layers (rows in your xrf data) then it is possible that it could take up to 24 hours (why not run it over the weekend!). This is based on running it on a single core so if you can use parallel cores then use this.
Whilst this is the default plot the function produces, you can easily extract the resulting data frame and plot in whichever way you wish.
final <- reconstruction$reconstruction # note that we have saved our reconstruction as 'reconstruction' however you may have chosen a different name!
# write.csv(final, "Final Climate Reconstruction.csv)
And there you have it! By this point you would have been able to reconstruct annual climate given μXRF-CS data! If you have any queries please contact Laura Boyall at Laura.boyall.2016@live.rhul.ac.uk
You must cite the original publication (Boyall et al., in prep) if you use this package to publish any work.
Happy modelling!!